Confusion-based confidence measures for utterance verification and speaker recognition in noisy environment (잡음환경에서의 발화검증과 화자인식을 위한 혼돈기반 신뢰도 측정법)

In recent years, speech recognition technology has advanced considerably thanks to more powerful computing devices and to progress in pattern recognition, signal processing, and related techniques, but its performance is still not perfect. The main reasons are the lack of explicit solutions to unexpected noise corruption and our incomplete knowledge of how the human brain understands speech, and neither problem will be solved in the near future. Thus, to apply state-of-the-art speech recognition technology to commercial products, we have to deal with recognition errors. In particular, for a speech-based man-machine interface (MMI) to feel natural, it is important to estimate in advance the degree of recognition confidence, that is, how credible a recognition result is. If a confidence score indicates that a recognition result is incorrect, it is better to ask the user to speak again, or to ignore the result and do nothing, than to execute an unintended action. To realize this functionality, the recognizer must be able to determine whether a recognition result is correct by measuring a confidence score. This is called utterance verification. In this thesis, we propose confusion-based confidence measures for utterance verification. Most conventional confidence measures are based on the likelihood ratio test (LRT). The drawback of LRT-based confidence measures is that they are not robust to noise corruption and require a large amount of computation to calculate the likelihood of an alternative model. The proposed method finds the momentary best-scored state (MBS) frame by frame during the Viterbi search, and the MBSs are compared with the state sequence of the recognition result from Viterbi decoding to measure the recognition confidence, which is called confusion-based con...
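The abstract is truncated before the confidence measure is defined, so the following is only a minimal sketch of the general idea it describes, not the thesis's actual formulation. It assumes that the momentary best-scored state at a frame is the state with the highest accumulated Viterbi score at that frame, and that confidence is the fraction of frames on which that state agrees with the final decoded path. Both choices, and the names `viterbi_with_mbs` and `agreement_confidence`, are hypothetical.

```python
def viterbi_with_mbs(log_init, log_trans, log_emit):
    """Run Viterbi decoding and record a per-frame best-scored state.

    log_init[j]     : log initial score of state j
    log_trans[i][j] : log transition score from state i to state j
    log_emit[t][j]  : log-likelihood of frame t under state j
    Returns (path, mbs): the decoded state sequence and, for each frame,
    the state with the highest accumulated score at that frame
    (an ASSUMED reading of "momentary best-scored state").
    """
    n_states = len(log_init)
    states = range(n_states)

    # Accumulated scores after the first frame.
    delta = [log_init[j] + log_emit[0][j] for j in states]
    mbs = [max(states, key=lambda j: delta[j])]
    backptrs = []

    for t in range(1, len(log_emit)):
        new_delta, ptr = [], []
        for j in states:
            # Best predecessor of state j at this frame.
            best_i = max(states, key=lambda i: delta[i] + log_trans[i][j])
            ptr.append(best_i)
            new_delta.append(delta[best_i] + log_trans[best_i][j] + log_emit[t][j])
        delta = new_delta
        backptrs.append(ptr)
        mbs.append(max(states, key=lambda j: delta[j]))

    # Backtrack from the best final state to recover the decoded path.
    state = max(states, key=lambda j: delta[j])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, mbs


def agreement_confidence(path, mbs):
    """Fraction of frames where the decoded state equals the per-frame best state."""
    matches = sum(1 for p, m in zip(path, mbs) if p == m)
    return matches / len(path)


# Toy example: two states, switching costs -5, the first two frames
# slightly favor state 0 and the last two strongly favor state 1.
path, mbs = viterbi_with_mbs(
    [0.0, 0.0],
    [[0.0, -5.0], [-5.0, 0.0]],
    [[-1.0, -2.0], [-1.0, -2.0], [-10.0, -1.0], [-10.0, -1.0]],
)
print(path, mbs, agreement_confidence(path, mbs))
# prints: [1, 1, 1, 1] [0, 0, 1, 1] 0.5
```

In the toy example the decoded path stays in state 1 throughout, while the per-frame best state is 0 for the first two frames, so the agreement score drops to 0.5. Unlike an LRT-based measure, this kind of score needs no alternative model, since everything it uses is already computed during the search.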
Advisors
Kim, Hoi-Rin (김회린)
Description
Information and Communications University (한국정보통신대학교) : School of Engineering
Publisher
Information and Communications University (한국정보통신대학교)
Issue Date
2006
Identifier
392587/225023 / 000995415
Language
eng
Description

Doctoral thesis (Ph.D.) - Information and Communications University (한국정보통신대학교) : School of Engineering, 2006, [ xii, 99 p. ]

Keywords

Utterance Verification; Speaker Recognition; Speech Recognition; Confidence Measure

URI
http://hdl.handle.net/10203/54552
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=392587&flag=dissertation
Appears in Collection
School of Engineering-Theses_Ph.D(공학부 박사논문)
Files in This Item
There are no files associated with this item.