Robust speech recognition using missing data theory (Korean title: 손실 데이터 이론을 이용한 강인한 음성 인식)

For several decades, researchers have proposed algorithms for robust automatic speech recognition so that recognition systems can be used not only in laboratory environments but also in real, noisy ones. Robustness is generally defined as the property that a recognition system remains relatively insensitive to adverse conditions even when it has been trained on clean data. Speaker variation, speaking rate, and the mismatch between training and testing environments make such systems difficult to commercialize. Because of communication channel characteristics or the mismatch between training and testing environments, linear filtering causes two problems: missing data in frequency, and the masking effect, in which strong noisy signals make weaker ones inaudible. When the feature distributions of the trained acoustic models and the test features are mismatched, recognition performance deteriorates rapidly. The goal of our work is to make recognition systems less sensitive to noisy environments. To this end, we adopt missing data theory, which is widely used in statistics. Missing data theory has the advantage that it can easily be applied to continuous density hidden Markov models. A marginalization method is used for processing missing data because it can be implemented with low complexity in a recognition system. Spectral subtraction may be used for missing data detection: if the difference between the energy of speech and that of background noise is lower than a threshold, we assume that the data are missing. Because we adopt a marginalization method, incorrect detection of missing data directly affects the recognizer's performance. To solve this problem, we propose a novel method that uses voicing probability as the degree of reliability of the detected missing data. Since consonants are more likely to be masked by background noise than vowels, the subbands in consonants are more pro...
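As an illustration of the two steps the abstract names (threshold-based missing data detection and marginalization), here is a minimal sketch. It is not the thesis implementation. The function names, the dB-domain comparison, the `threshold_db` parameter, and the diagonal-covariance Gaussian mixture are all assumptions made for this example, and the proposed voicing-probability weighting is not shown.

```python
import math


def detect_reliable_subbands(noisy_energy, noise_energy, threshold_db=0.0):
    """Return one flag per subband: True = reliable, False = missing.

    Assumption: speech energy is estimated by spectral subtraction
    (noisy minus noise estimate) and compared with the noise energy in dB.
    """
    floor = 1e-10  # avoids log of zero or of a negative difference
    mask = []
    for y, n in zip(noisy_energy, noise_energy):
        speech = max(y - n, floor)
        diff_db = 10.0 * math.log10(speech / max(n, floor))
        mask.append(diff_db > threshold_db)
    return mask


def marginal_log_likelihood(x, mask, weights, means, variances):
    """Log-likelihood of x under a diagonal-covariance Gaussian mixture,
    integrating out (i.e. skipping) the dimensions flagged as missing."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        lp = math.log(w)
        for xi, reliable, m, v in zip(x, mask, mu, var):
            if reliable:
                lp += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        log_terms.append(lp)
    # log-sum-exp over mixture components for numerical stability
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


if __name__ == "__main__":
    mask = detect_reliable_subbands([10.0, 1.1], [1.0, 1.0])
    score = marginal_log_likelihood(
        [0.0, 100.0], mask, weights=[1.0], means=[[0.0, 0.0]], variances=[[1.0, 1.0]]
    )
    print(mask, score)
```

With a diagonal covariance, marginalizing a missing dimension amounts to dropping its factor from the product. This is why the method adds little complexity to a continuous density HMM state likelihood, and also why a wrong mask decision changes the score directly.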
Advisors
Oh, Yung-Hwan (오영환)
Description
Korea Advanced Institute of Science and Technology (KAIST) : Computer Science major
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2002
Identifier
174641/325007 / 000985048
Language
eng
Description

Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST) : Computer Science major, 2002.2, [ x, 110 p. ]

Keywords

voicing probability; noise robustness; missing data theory; speech recognition; noise masking threshold

URI
http://hdl.handle.net/10203/33195
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=174641&flag=dissertation
Appears in Collection
CS-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
