Multi-domain Knowledge Distillation via Uncertainty-Matching for End-to-End ASR Models

DC Field | Value | Language
dc.contributor.author | Kim, Ho-Gyeong | ko
dc.contributor.author | Lee, Min-joong | ko
dc.contributor.author | Lee, Hoshik | ko
dc.contributor.author | Kang, Tae Gyoon | ko
dc.contributor.author | Lee, Jihyun | ko
dc.contributor.author | Yang, Eunho | ko
dc.contributor.author | Hwang, Sung Ju | ko
dc.date.accessioned | 2022-01-14T06:54:48Z | -
dc.date.available | 2022-01-14T06:54:48Z | -
dc.date.created | 2021-11-30 | -
dc.date.issued | 2021-09-01 | -
dc.identifier.citation | INTERSPEECH 2021, pp.1311 - 1315 | -
dc.identifier.issn | 2308-457X | -
dc.identifier.uri | http://hdl.handle.net/10203/291819 | -
dc.description.abstract | Knowledge distillation matches the predictive distributions of student and teacher networks to improve performance in environments with model-capacity and/or data constraints. However, it is well known that the predictive distribution of a neural network not only tends to be overconfident but also cannot properly model the various factors that contribute to uncertainty. Recently, uncertainty-aware deep learning has been successful in many fields, especially in several computer vision tasks. The prediction probability only implicitly reflects how confident the network is; by explicitly modeling the network's uncertainty, we can make direct use of that confidence. In this paper, we propose a novel knowledge distillation method for automatic speech recognition that directly models and transfers the uncertainty inherent in data observation, such as speaker variation or confusing pronunciations. Moreover, we investigate how to transfer knowledge more effectively using multiple teachers trained on different domains. Evaluated on WSJ, a standard benchmark dataset with a limited number of training instances, the proposed knowledge distillation method achieves significant improvements over student baseline models. (An illustrative sketch of uncertainty-weighted distillation follows the record table below.) | -
dc.language | English | -
dc.publisher | International Speech Communication Association | -
dc.title | Multi-domain Knowledge Distillation via Uncertainty-Matching for End-to-End ASR Models | -
dc.type | Conference | -
dc.identifier.scopusid | 2-s2.0-85119211956 | -
dc.type.rims | CONF | -
dc.citation.beginningpage | 1311 | -
dc.citation.endingpage | 1315 | -
dc.citation.publicationname | INTERSPEECH 2021 | -
dc.identifier.conferencecountry | CS | -
dc.identifier.conferencelocation | Brno | -
dc.identifier.doi | 10.21437/Interspeech.2021-1169 | -
dc.contributor.localauthor | Yang, Eunho | -
dc.contributor.localauthor | Hwang, Sung Ju | -
dc.contributor.nonIdAuthor | Kim, Ho-Gyeong | -
dc.contributor.nonIdAuthor | Lee, Min-joong | -
dc.contributor.nonIdAuthor | Lee, Hoshik | -
dc.contributor.nonIdAuthor | Kang, Tae Gyoon | -
dc.contributor.nonIdAuthor | Lee, Jihyun | -
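
The abstract describes distillation in which a teacher's uncertainty modulates how strongly each prediction is transferred to the student. The sketch below is a minimal illustration of that general idea, not the authors' implementation: the loss form, tensor shapes, function names (uncertainty_weighted_kd_loss, multi_teacher_kd_loss), and the use of a predicted per-frame log-variance to down-weight uncertain teacher frames are all assumptions made for this example.

```python
# Minimal sketch of uncertainty-weighted knowledge distillation with multiple
# domain teachers. Illustration only; not the method from the paper.
import torch
import torch.nn.functional as F

def uncertainty_weighted_kd_loss(student_logits, teacher_logits, teacher_log_var, T=2.0):
    """KL(teacher || student) per frame, down-weighted where the teacher is uncertain.

    student_logits, teacher_logits: (batch, time, vocab)
    teacher_log_var: (batch, time) predicted log-variance (aleatoric uncertainty)
    """
    student_log_probs = F.log_softmax(student_logits / T, dim=-1)
    teacher_probs = F.softmax(teacher_logits / T, dim=-1)
    # Per-frame KL divergence between teacher and student distributions.
    kl = F.kl_div(student_log_probs, teacher_probs, reduction="none").sum(dim=-1)
    # Frames the teacher is unsure about contribute less to the distillation loss.
    weight = torch.exp(-teacher_log_var)
    return (weight * kl).mean() * (T ** 2)

def multi_teacher_kd_loss(student_logits, teacher_outputs, T=2.0):
    """Average the uncertainty-weighted loss over teachers trained on different domains."""
    losses = [
        uncertainty_weighted_kd_loss(student_logits, logits, log_var, T)
        for logits, log_var in teacher_outputs
    ]
    return torch.stack(losses).mean()

# Example usage with random tensors (batch=2, time=5, vocab=30) and three teachers.
student = torch.randn(2, 5, 30)
teachers = [(torch.randn(2, 5, 30), torch.randn(2, 5)) for _ in range(3)]
loss = multi_teacher_kd_loss(student, teachers)
```

The exp(-log_var) weighting follows the common heteroscedastic-uncertainty recipe: frames where a domain teacher reports high observation noise (e.g., confusing pronunciations) contribute less to the distilled signal, and averaging over teachers is one simple way to combine knowledge from several domains.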
Appears in Collection
RIMS Conference Papers
Files in This Item
There are no files associated with this item.
