Regularized Speaker Adaptation of KL-HMM for Dysarthric Speech Recognition

DC Field | Value | Language
dc.contributor.author | Kim, Myung Jong | ko
dc.contributor.author | Kim, Younggwan | ko
dc.contributor.author | Yoo, Joohong | ko
dc.contributor.author | Wang, Jun | ko
dc.contributor.author | Kim, Hoi-Rin | ko
dc.date.accessioned | 2017-10-23T02:00:25Z | -
dc.date.available | 2017-10-23T02:00:25Z | -
dc.date.created | 2017-06-25 | -
dc.date.issued | 2017-09 | -
dc.identifier.citation | IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, v.25, no.9, pp.1581 - 1591 | -
dc.identifier.issn | 1534-4320 | -
dc.identifier.uri | http://hdl.handle.net/10203/226455 | -
dc.description.abstract | This paper addresses the problem of recognizing speech uttered by patients with dysarthria, a motor speech disorder that impedes the physical production of speech. Patients with dysarthria have articulatory limitations and therefore often have trouble pronouncing certain sounds, resulting in undesirable phonetic variation. Modern automatic speech recognition systems designed for regular speakers are ineffective for speakers with dysarthria because of this phonetic variation. To capture the phonetic variation, a Kullback-Leibler divergence-based hidden Markov model (KL-HMM) is adopted, in which the emission probability of each state is parameterized by a categorical distribution using phoneme posterior probabilities obtained from a deep neural network-based acoustic model. To further reflect speaker-specific phonetic variation patterns, a speaker adaptation method is proposed that combines L2 regularization with confusion-reducing regularization, which can enhance discriminability between the categorical distributions of the KL-HMM states while preserving speaker-specific information. Evaluation of the proposed speaker adaptation method on a database of several hundred words for 30 speakers (12 mildly dysarthric, 8 moderately dysarthric, and 10 non-dysarthric control speakers) showed that the proposed approach significantly outperformed the conventional deep neural network-based speaker-adapted system on dysarthric as well as non-dysarthric speech. | -
dc.language | English | -
dc.publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC | -
dc.subject | ACOUSTIC MODEL | -
dc.subject | NEURAL-NETWORKS | -
dc.title | Regularized Speaker Adaptation of KL-HMM for Dysarthric Speech Recognition | -
dc.type | Article | -
dc.identifier.wosid | 000410192400022 | -
dc.identifier.scopusid | 2-s2.0-85029604485 | -
dc.type.rims | ART | -
dc.citation.volume | 25 | -
dc.citation.issue | 9 | -
dc.citation.beginningpage | 1581 | -
dc.citation.endingpage | 1591 | -
dc.citation.publicationname | IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING | -
dc.identifier.doi | 10.1109/TNSRE.2017.2681691 | -
dc.contributor.localauthor | Kim, Hoi-Rin | -
dc.contributor.nonIdAuthor | Wang, Jun | -
dc.description.isOpenAccess | N | -
dc.type.journalArticle | Article | -
dc.subject.keywordAuthor | Dysarthria | -
dc.subject.keywordAuthor | speech recognition | -
dc.subject.keywordAuthor | speaker adaptation | -
dc.subject.keywordAuthor | KL-HMM | -
dc.subject.keywordAuthor | regularization | -
dc.subject.keywordPlus | ACOUSTIC MODEL | -
dc.subject.keywordPlus | NEURAL-NETWORKS | -
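
The abstract describes KL-HMM states as categorical distributions over phonemes that are compared against phoneme posteriors from a neural network. As a rough illustration only, the sketch below scores one frame against two states using KL divergence. It is not the authors' implementation: the divergence direction (state distribution first), the function names `kl_divergence` and `local_score`, and the three-phoneme numbers are all assumptions made for this example, and the paper's L2 and confusion-reducing regularizers are not shown because the record does not define them.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two categorical distributions given as equal-length lists.

    eps is a small constant (an assumption of this sketch) that avoids log(0).
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


def local_score(state_dist, posterior):
    """Frame-level cost of a state: lower means the state fits the frame better.

    The direction KL(state || posterior) is one possible choice, assumed here.
    """
    return kl_divergence(state_dist, posterior)


# Hypothetical categorical distributions over three phonemes for two states.
state_a = [0.8, 0.1, 0.1]
state_b = [0.1, 0.8, 0.1]

# Hypothetical phoneme posterior for one frame, as an acoustic model might emit.
frame_posterior = [0.7, 0.2, 0.1]

# Pick the state whose distribution is closest to the frame posterior.
best_name, _ = min(
    [("a", state_a), ("b", state_b)],
    key=lambda item: local_score(item[1], frame_posterior),
)
```

In this toy setting `best_name` selects state `a`, whose distribution puts most of its mass on the same phoneme as the frame posterior. A state whose distribution spreads mass over several phonemes could, under the same scoring, absorb a speaker's habitual substitutions, which is the kind of phonetic variation the abstract refers to.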
Appears in Collection
EE-Journal Papers (Journal Papers)