Thesis defense: Soheil Rayatdoost
Mr. Soheil Rayatdoost will defend, in English, his thesis for the degree of Doctor of Science (computer science), entitled:
Computational Understanding of the Interaction between Facial Expressions and Electroencephalogram (EEG) Signals for Emotion Recognition
Date: Tuesday, 9 February 2021, at 4:00 p.m.
Supervisor: Professor D. RUDRAUF (Faculty of Psychology and Educational Sciences, Centre Interfacultaire en Sciences Affectives)
Co-supervisor: Professor B. CHOPARD (Faculty of Science, Department of Computer Science)
Machine recognition of emotions is an essential prerequisite for natural and intelligent interactions between humans and computers. Existing research has demonstrated the potential of electroencephalogram (EEG)-based emotion recognition. EEG signals are weak electric potentials generated by cortical activity and recorded by electrodes on the scalp. They contain a mixture of signals originating from sensorimotor, cognitive and affective activities, together with artifacts. Facial expressions and eye and head movements generate electrical activity with a higher amplitude than the EEG signals and are the main source of artifacts. Even though artifacts are considered a nuisance in brain-computer interfaces, in affective brain-computer interfaces (aBCI) behavioral artifacts can be valuable for emotion recognition.
Our research aims to identify the inter-modality interaction between EEG signals and facial behaviors in order to improve EEG-based emotion recognition performance. The interaction between these two modalities can be explained by signal interference and joint emotional variations. For the first time, we designed a specific protocol and collected data to isolate the effects of expressions, subjective feelings and stimuli in these signals. We recorded a precisely synchronized multimodal database with visual, olfactory and mimicry stimuli, including genuine and acted expressions of emotions. This dataset enabled us to design and evaluate emotion recognition methods in different scenarios and in the presence of between-participant variance. We used our novel dataset together with two existing datasets (DEAP and MAHNOB) for within- and cross-corpus evaluation.
We examined the lack of generalization as a common problem in EEG-based emotion recognition. We explored the suitability of the existing EEG features for emotion recognition and investigated the performance of emotion recognition methods across different corpora. We demonstrated that the performance of the existing methods significantly decreases when evaluated across different corpora. We also developed a convolutional neural network fed by spectral topography maps from different bands for emotion recognition. We showed that the proposed network improves emotion recognition performance in within- and between-dataset evaluation.
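As a rough illustration of what "spectral topography maps from different bands" can look like as network input, the sketch below places per-electrode band power on a small 2D grid, one map per band. The grid layout, the band list, and the names GRID_POS, BANDS and topography_maps are assumptions made for this example; the announcement does not specify the montage, the interpolation, or the map resolution used in the thesis.

```python
# Hypothetical 2D grid positions (row, col) for a few 10-20 electrodes;
# a real montage would cover all recorded channels.
GRID_POS = {
    "F3": (0, 0), "Fz": (0, 1), "F4": (0, 2),
    "C3": (1, 0), "Cz": (1, 1), "C4": (1, 2),
    "P3": (2, 0), "Pz": (2, 1), "P4": (2, 2),
}
# Assumed band set; the thesis may use different bands.
BANDS = ("theta", "alpha", "beta", "gamma")


def topography_maps(band_power, grid_pos=GRID_POS, bands=BANDS):
    """Arrange per-electrode band power into one 2D map per frequency band.

    band_power: dict mapping electrode name -> dict mapping band name -> power.
    Returns a nested list of shape [n_bands][n_rows][n_cols]; grid cells
    without an electrode (or without a value) stay at 0.0.
    """
    n_rows = max(r for r, _ in grid_pos.values()) + 1
    n_cols = max(c for _, c in grid_pos.values()) + 1
    maps = [[[0.0] * n_cols for _ in range(n_rows)] for _ in bands]
    for electrode, powers in band_power.items():
        if electrode not in grid_pos:
            continue  # skip channels with no assigned grid cell
        r, c = grid_pos[electrode]
        for b, band in enumerate(bands):
            maps[b][r][c] = float(powers.get(band, 0.0))
    return maps
```

The resulting bands-by-rows-by-columns structure plays the role of a multi-channel image, which is the kind of input a convolutional network expects.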
To analyze the interaction between neural and behavioral activities, we calculated the correlations between features from various modalities. The correlation analysis showed that both behavioral and neural responses associated with emotions are present in EEG signals. This was evident when we found that a classifier trained on spontaneous emotions was able to learn posed expressions, even in cases where participants reported no felt emotions while mimicking expressions.
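A minimal sketch of such a cross-modal correlation analysis is given below, assuming Pearson correlation computed across trials. The announcement does not state which correlation measure or features were used, and the function names and data layout here are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length feature sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def cross_modal_correlations(eeg_features, face_features):
    """Correlate every EEG feature with every facial feature across trials.

    Both arguments map a feature name to a list with one value per trial.
    Returns a dict keyed by (eeg_feature_name, face_feature_name).
    """
    return {
        (eeg_name, face_name): pearson_r(eeg_vals, face_vals)
        for eeg_name, eeg_vals in eeg_features.items()
        for face_name, face_vals in face_features.items()
    }
```

In practice, a large absolute correlation between an EEG feature and a facial feature would suggest that the EEG feature partly reflects behavioral activity rather than purely neural activity.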
To better understand the contribution from different sources, we used adversarial learning to train a network invariant to the information from behavioral activities. The results from the learned representations demonstrate that emotion recognition performance was very close to chance level after unlearning behavior-related information. The sensory responses to stimuli also include powerful emotion-specific activities. Our analyses demonstrate that behavioral and sensory activities are likely the leading features in EEG emotion recognition in the presence of behavioral activities.
We also proposed two novel deep representation learning approaches to learn joint and coordinated representations between different modalities to improve emotion recognition. The first representation learning method captures emotional features from EEG signals, guided by facial electromyogram (EMG) and electrooculogram (EOG) signals. We showed that the learned representation could be transferred to a different database without EMG and EOG and achieved superior performance.
The second proposed representation learning approach involves joint learning of a unimodal representation aligned with the other modality through cosine similarity and a gated information fusion for modality fusion. The results show that our deep representation can learn mutual and complementary information between EEG signals and face behavior, captured by action units, head and eye movements from face videos, in a manner that generalizes across databases. As a result, it outperforms other fusion methods for bimodal emotion recognition.
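The two ingredients named above, cosine-similarity alignment and gated fusion, can be sketched as follows. This is a simplified illustration under stated assumptions: embeddings are plain lists of equal length, the gate is a single scalar with fixed weights, and all function and parameter names are hypothetical. The thesis's actual gate is presumably learned and may be vector-valued.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def alignment_loss(eeg_embedding, face_embedding):
    """0 when the embeddings point the same way, 2 when they are opposite."""
    return 1.0 - cosine_similarity(eeg_embedding, face_embedding)


def gated_fusion(eeg_embedding, face_embedding, gate_weights, gate_bias):
    """Fuse two equal-length embeddings with one scalar sigmoid gate.

    gate_weights holds one weight per entry of the concatenated embeddings.
    The gate decides how much of each modality enters the fused vector.
    """
    concat = list(eeg_embedding) + list(face_embedding)
    logit = sum(w * x for w, x in zip(gate_weights, concat)) + gate_bias
    gate = 1.0 / (1.0 + math.exp(-logit))
    return [gate * e + (1.0 - gate) * f
            for e, f in zip(eeg_embedding, face_embedding)]
```

Minimizing the alignment loss pulls the unimodal embedding toward the other modality, while the gate lets the fused representation lean on whichever modality is more informative for a given sample.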
Discrepancies between the training and test data distributions, also known as domain shift, reduce the generalization of emotion recognition methods. One of the main factors contributing to these discrepancies is human variability. To address the human variability problem, we developed an adversarial deep domain adaptation approach for emotion recognition from EEG signals. The method jointly learns a new representation that minimizes emotion recognition loss and maximizes subject confusion loss. We demonstrated that the proposed subject-independent representation can improve emotion recognition performance within and across databases.
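One common way to express this kind of joint objective is the gradient-reversal style formulation sketched below, in which the shared encoder is rewarded for a low emotion loss and for a high subject-classification loss. This is an assumption about the general technique, not the thesis's exact loss; the names cross_entropy, encoder_objective and the trade-off weight lam are hypothetical.

```python
import math


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the target class under `probs`."""
    return -math.log(max(probs[target_index], 1e-12))


def encoder_objective(emotion_probs, emotion_label,
                      subject_probs, subject_label, lam=1.0):
    """Loss seen by the shared encoder in an adversarial setup.

    The encoder lowers the emotion loss while raising the subject
    classifier's loss, so the representation carries less subject identity.
    `lam` (hypothetical name) trades off the two terms. The subject
    classifier itself is trained separately to minimize its own loss.
    """
    emotion_loss = cross_entropy(emotion_probs, emotion_label)
    subject_loss = cross_entropy(subject_probs, subject_label)
    return emotion_loss - lam * subject_loss
```

With the same emotion prediction, the objective is lower when the subject classifier is confused than when it identifies the subject confidently, which is the behavior a subject-independent representation should encourage.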
This thesis studies the contribution of affective brain activities to EEG-based emotion recognition in the presence of behavioral activities. It also presents novel machine learning methods for EEG-based emotion recognition that consider inter-modality interaction. It addresses the lack of generalization of emotion recognition methods and presents an adaptation method to reduce this gap. This thesis contributes to a better understanding of the limitations and real potential of EEG-based emotion recognition. In the future, cross-modal representation learning methods can be extended with self-attention and cross-modal attention mechanisms in order to learn the most important pieces of information across space (electrodes) and time. Such methods can be used to fuse behavioral and neural responses and can be deployed in wearable emotion recognition solutions, which are practical in situations where computer-vision-based expression recognition is not feasible, such as virtual reality applications.