Thesis defense: Célia Boyer Walther
Ms. Célia Boyer Walther will defend her thesis, submitted for the degree of Doctor in Information Systems at the Faculty of Economics and Management (GSEM), entitled:
Methods and tools to retrieve reliable health information on the internet
The thesis jury is composed of:
- Prof. Dimitri KONSTANTAS, President of the jury, GSEM, University of Geneva
- Prof. Gilles FALQUET, thesis supervisor, GSEM, University of Geneva
- Prof. Antoine GEISSBUHLER, Faculty of Medicine, University of Geneva
- Prof. Marie-Christine JAULENT, Director of LIMICS, Research Director at INSERM
The Internet makes extensive medical and healthcare knowledge available to everyone. As a consequence, it has become the starting point for health information searches. To make informed decisions about their health, users need to be able to search efficiently for health information on the Web, retrieve trustworthy Web pages and understand the information they read, not least because such information can have a direct impact on a person’s health status.
The goal of the research activities conducted and presented in this thesis is to propose combined tools to improve access to quality health content online. The thesis presents the three pillars of retrieving quality health information: query formulation, readability and trust level. The research conducted on these pillars uses the quality criteria defined for health and medical Web publishers by the Health On the Net (HON) Foundation and used in its HONcode certification process. The HONcode is a set of ethical, honesty, transparency and quality standards covering various aspects of health website content production. It is the most widely used model for identifying health sites that are trustworthy and respect quality criteria.
This thesis first explores the automated detection of HONcode principles for health websites. Specifically, it considers the feasibility of such automated detection, the benchmarking and assessment of natural language processing methods, the multilingual automated detection of the HONcode principles, the identification of documents in English and French according to their complexity level, and how to help refine search results.
It then studies the use of HONcode principle classifiers that can be applied to health Web pages. It also investigates the development of an automated system to detect the reliability level of a document and to classify health documents online.
Thirdly, it examines directions for the integration of the tools into a user-centered health domain search engine dedicated to trustworthy information and the testing of the feasibility of integrating the automated detection system into a dedicated search engine. This includes the development of a generic solution to enable integration in different contexts, such as through a Web browser extension.
Lastly, it examines the results of usability testing of the integration of the tools into a health search engine, evaluating the benefits of a search engine that provides access to trustworthy websites and the implementation of the filtering of trusted sources via the automated tool developed within the research activities.
This thesis brings together the main findings of a collection of scientific publications that appeared in peer-reviewed journals and conferences. In sum, it supports the author’s hypothesis that an automated system can predict the reliability of health Web pages by identifying elements that correlate with the quality of a website. However, the results obtained differ from a certification process where a human interprets text meanings and where health editors have been trained in the ethical issues and responsibilities surrounding health and medical information on the Web. Additionally, automated systems could support the monitoring of websites and the finding of pages related to a given principle. As this is recurring work for the reviewers, automating it would give them more time to focus on critical tasks.
Keywords: HONcode, trustworthiness, quality standards, automated detection, classification, health Internet, machine learning, named entity recognition, natural language processing.
Date: Tuesday, 4 October 2016 at 14:00
Location: Battelle, Building A - Lecture room 404-407 (3rd floor)
21 September 2016