Improving speech emotion recognition by fusing self-supervised learning and spectral features via mixture of experts

DC Field | Value | Language
dc.contributor.author | Hyeon, Jonghwan | ko
dc.contributor.author | Oh, Yung-Hwan | ko
dc.contributor.author | Lee, Young-Jun | ko
dc.contributor.author | Choi, Ho-Jin | ko
dc.date.accessioned | 2024-07-01T09:00:09Z | -
dc.date.available | 2024-07-01T09:00:09Z | -
dc.date.created | 2024-06-25 | -
dc.date.issued | 2024-03 | -
dc.identifier.citation | DATA & KNOWLEDGE ENGINEERING, v.150 | -
dc.identifier.issn | 0169-023X | -
dc.identifier.uri | http://hdl.handle.net/10203/320087 | -
dc.description.abstract | Speech Emotion Recognition (SER) is an important area of research in speech processing that aims to identify and classify emotional states conveyed through speech signals. Recent studies have shown considerable performance in SER by exploiting deep contextualized speech representations from self-supervised learning (SSL) models. However, SSL models pre-trained on clean speech data may not perform well on emotional speech data due to the domain shift problem. To address this problem, this paper proposes a novel approach that simultaneously exploits an SSL model and a domain-agnostic spectral feature (SF) through the Mixture of Experts (MoE) technique. The proposed approach achieves the state-of-the-art performance on weighted accuracy compared to other methods in the IEMOCAP dataset. Moreover, this paper demonstrates the existence of the domain shift problem of SSL models in the SER task. | -
dc.language | English | -
dc.publisher | ELSEVIER | -
dc.title | Improving speech emotion recognition by fusing self-supervised learning and spectral features via mixture of experts | -
dc.type | Article | -
dc.identifier.wosid | 001146036900001 | -
dc.identifier.scopusid | 2-s2.0-85185881711 | -
dc.type.rims | ART | -
dc.citation.volume | 150 | -
dc.citation.publicationname | DATA & KNOWLEDGE ENGINEERING | -
dc.identifier.doi | 10.1016/j.datak.2023.102262 | -
dc.contributor.localauthor | Oh, Yung-Hwan | -
dc.contributor.localauthor | Choi, Ho-Jin | -
dc.description.isOpenAccess | N | -
dc.type.journalArticle | Article | -
dc.subject.keywordAuthor | Speech emotion recognition | -
dc.subject.keywordAuthor | Self-supervised learning | -
dc.subject.keywordAuthor | Domain shift | -
dc.subject.keywordAuthor | Spectral feature | -
Appears in Collection
CS-Journal Papers
Files in This Item
There are no files associated with this item.
