Abstract:
Emotion recognition plays a crucial role in human life, as it affects interpersonal communication and social behavior. This paper presents a dual optimization of hardware and algorithms that improves emotion recognition performance using a multimodal acquisition helmet. The hardware acquisition system combines electroencephalogram (EEG) and eye movement signals to ensure efficient and precise data acquisition. On the algorithm side, a multimodal contrastive learning gated network (MCGNet) is developed to improve recognition accuracy. To address the insufficient extraction of EEG features, a multi-domain EEG feature extractor is designed. To better capture complementary information between modalities, the model uses contrastive representation learning and contrastive feature decomposition. Multimodal fusion can introduce noise that lowers accuracy below that of single-modal methods; to mitigate this, a gating structure selects modality combinations and the corresponding expert networks based on sample features. Experimental results demonstrate that the proposed multimodal acquisition helmet performs well, providing new insights into the application of emotion recognition.
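
The gating idea described above, routing each sample to a modality combination and its expert network, can be illustrated with a minimal sketch. This is not the MCGNet implementation: the abstract does not specify the gate, the experts, or how they are trained. The class `GatedExperts`, the three routes in `ROUTES`, the single-linear-layer experts, the hard (argmax) routing, and the random untrained weights are all assumptions made purely for illustration.

```python
import math
import random

# Hypothetical modality combinations the gate can choose between.
ROUTES = ("eeg", "eye", "eeg+eye")


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Apply a weight matrix (list of rows) to the vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def random_matrix(rows, cols, rng):
    """Random weights standing in for trained parameters (assumption)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


class GatedExperts:
    """Illustrative gate that routes a sample to one expert per route."""

    def __init__(self, eeg_dim, eye_dim, n_classes, seed=0):
        rng = random.Random(seed)
        in_dims = {"eeg": eeg_dim, "eye": eye_dim, "eeg+eye": eeg_dim + eye_dim}
        # The gate scores every route from the concatenated sample features.
        self.gate = random_matrix(len(ROUTES), eeg_dim + eye_dim, rng)
        # One expert per route; each is a single linear layer here.
        self.experts = {r: random_matrix(n_classes, d, rng) for r, d in in_dims.items()}

    def forward(self, eeg, eye):
        inputs = {"eeg": eeg, "eye": eye, "eeg+eye": eeg + eye}
        gate_probs = softmax(linear(self.gate, eeg + eye))
        # Hard routing: keep only the highest-scoring modality combination.
        route = ROUTES[gate_probs.index(max(gate_probs))]
        class_probs = softmax(linear(self.experts[route], inputs[route]))
        return route, gate_probs, class_probs


model = GatedExperts(eeg_dim=4, eye_dim=3, n_classes=3)
route, gate_probs, class_probs = model.forward([0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7])
print(route, [round(p, 3) for p in class_probs])
```

Under these assumptions, a sample whose eye movement features are noisy could be routed to the EEG-only expert, so the noisy modality never enters the prediction. A trained gate might instead weight several experts softly rather than picking one.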