In the domain of biology, I have worked extensively on the construction of diagnostic and prognostic models, with a focus on proteomics and mass spectrometry, within the context of a European COST Action, and have dealt extensively with quality assurance and control issues. The main part of the work focused on preprocessing and feature extraction, followed by the application of machine learning to model how the extracted features determine the final patient outcome. The preprocessing issues I dealt with included baseline removal, denoising, smoothing, peak extraction, and alignment. I have developed methods to select among different preprocessing methods and fine-tune their parameters in order to retain the highest possible information content in the extracted features. Moreover, I have participated in the definition of protocols to control the reproducibility of different sample preparation methods, and of procedures to control the within-laboratory reproducibility of measurements and data analysis results.
In a parallel direction, I have worked on the development of learning methods, based on kernel tools, that are adapted to the idiosyncrasies of the data typically found in mass spectrometry, namely high dimensionality and redundancy. The resulting tools have been applied successfully to a variety of problems that share similar characteristics, such as microarray classification and text mining; the results have been published at the SIAM Conference on Data Mining. In a similar context, within the European project DropTop, I am working on multisource learning from proteomics, genomics, and transcriptomics data for the construction of prognostic models, using survival analysis and exploiting learning methods from the area of kernel combination, weighting, and learning.
In the domain of health informatics, I work within the European project DebugIT, whose major goal is to extract new knowledge about problematic patient-safety patterns and to incorporate it into the hospital's monitoring and decision support tool. There I am investigating the development of data mining methods that are appropriate for the multimodal and spatiotemporal nature of the sequence of events that take place during a patient's stay in the hospital. A number of challenges have to be addressed; foremost among them is the semantic integration of data, information, and knowledge from a variety of different sources, e.g. the information systems and data repositories of different hospitals. The results will be integrated into the Clinical Information Systems of participating European hospitals, industry (AGFA Health Care), and their clients, and will become available globally.