Robust speech recognition based on partial information technique (부분 정보 기법에 기반한 강인한 음성인식)

DC Field | Value | Language
dc.contributor.advisor | Oh, Yung-Hwan | -
dc.contributor.advisor | 오영환 | -
dc.contributor.author | Cho, Hoon-Young | -
dc.contributor.author | 조훈영 | -
dc.date.accessioned | 2011-12-13T05:20:23Z | -
dc.date.available | 2011-12-13T05:20:23Z | -
dc.date.issued | 2003 | -
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=181185&flag=dissertation | -
dc.identifier.uri | http://hdl.handle.net/10203/32838 | -
dc.description | Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology : Computer Science, 2003.2, [ ix, 88 p. ] | -
dc.description.abstract | Automatic speech recognition (ASR) systems in real environments may have to cope with various noise signals that corrupt some time-frequency regions of speech more severely than others. Although great progress has been made in robust ASR, most techniques have focused on reasonably stationary wide-band noise and are therefore limited in their ability to achieve robustness in real noisy environments. The partial information technique, a developing area of research, models the human ability to emphasize reliable partial information in time-frequency regions. As one of its main approaches, the multi-band ASR scheme splits the whole frequency range into several sub-bands. The sub-band recognition results are then recombined, using sub-band reliabilities, to make a final decision. This dissertation aims to improve ASR performance on partially corrupted speech based on the partial information technique. In the frequency domain, three limitations of the multi-band recognition system are addressed. First, the multi-band scheme cannot make maximal use of the uncorrupted parts because the sub-band boundaries are fixed. The boundaries should be adaptive in order to localize the noise and make better use of partial information. Second, because sub-band feature vectors are processed independently in this method, the information contained in the global spectral structure may be lost. Finally, the whole ASR system must be rebuilt because of the architectural differences from a full-band ASR system. This study proposes a weighted filter bank analysis and model adaptation (WFBA-MA) method to resolve these problems. The proposed scheme estimates reliability weights for the Mel filter bank channels and extracts weighted Mel frequency cepstral coefficients by suppressing unreliable log filter bank energies. The same weights are also applied to the entire set of HMM parameters. An environment selective processing (ESP) method is also proposed, which determines whether a... | eng
dc.language | eng | -
dc.publisher | Korea Advanced Institute of Science and Technology (한국과학기술원) | -
dc.subject | Weighted filter bank analysis | -
dc.subject | Partially corrupted speech | -
dc.subject | Multi-band speech recognition | -
dc.subject | Partial Information | -
dc.subject | Model adaptation | -
dc.subject | 모델 적응 (Model adaptation) | -
dc.subject | 가중 필터뱅크 분석 (Weighted filter bank analysis) | -
dc.subject | 부분 손상 음성 (Partially corrupted speech) | -
dc.subject | 다중대역 음성인식 (Multi-band speech recognition) | -
dc.subject | 부분 정보 (Partial information) | -
dc.title | Robust speech recognition based on partial information technique | -
dc.title.alternative | 부분 정보 기법에 기반한 강인한 음성인식 | -
dc.type | Thesis(Ph.D) | -
dc.identifier.CNRN | 181185/325007 | -
dc.description.department | Korea Advanced Institute of Science and Technology : Computer Science | -
dc.identifier.uid | 000985369 | -
dc.contributor.localauthor | Oh, Yung-Hwan | -
dc.contributor.localauthor | 오영환 | -
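
The weighted filter bank analysis mentioned in the abstract can be illustrated with a small sketch. This is not the dissertation's implementation: the abstract does not say how the reliability weights are estimated or exactly how unreliable energies are suppressed, so the code below assumes the weights are already given (one value in [0, 1] per Mel channel) and assumes suppression is a simple multiplication of each log filter bank energy by its weight before the cepstral transform (a DCT-II). The function name `weighted_cepstrum` and its parameters are hypothetical.

```python
import math


def weighted_cepstrum(log_energies, weights, num_ceps):
    """Cepstral coefficients from reliability-weighted log filter bank energies.

    Assumption (not from the source): suppression is modelled as
    weight * log-energy, with weight 0 meaning "fully unreliable".
    """
    if len(log_energies) != len(weights):
        raise ValueError("need exactly one weight per filter bank channel")
    num_channels = len(log_energies)
    # Scale each channel by its reliability weight.
    weighted = [w * e for w, e in zip(weights, log_energies)]
    # Unnormalized DCT-II over the weighted channels.
    return [
        sum(
            x * math.cos(math.pi * n * (k + 0.5) / num_channels)
            for k, x in enumerate(weighted)
        )
        for n in range(num_ceps)
    ]


if __name__ == "__main__":
    clean = [1.0, 2.0, 3.0, 4.0]
    noisy = [1.0, 2.0, 9.0, 4.0]    # third channel corrupted by noise
    weights = [1.0, 1.0, 0.0, 1.0]  # third channel marked unreliable
    print(weighted_cepstrum(clean, weights, 3))
    print(weighted_cepstrum(noisy, weights, 3))
```

With the third channel's weight set to zero, both calls produce the same coefficients, which shows the intended effect in this simplified model: noise confined to an unreliable channel no longer changes the feature vector. Applying the same weights to the HMM parameters, as the abstract describes, is not covered by this sketch.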
Appears in Collection
CS-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
