An out-of-vocabulary rejection technique using anti-phoneme model and on-line garbage model in hidden Markov model
(Korean title: 은닉 마르코프 모델에서의 반음소 모델과 온라인 가비지 모델을 이용한 비인식대상 어휘 제거 기법)
Automatic Speech Recognition (ASR) has made great advances in the last 10 years. The application of ASR in everyday situations opens a new possibility for human-machine interfaces. For this to become a reality, a recognizer must be able to distinguish in-vocabulary words from out-of-vocabulary words.
To reject out-of-vocabulary words effectively while accepting in-vocabulary words, conventional approaches such as the filler model approach and the on-line garbage model approach have been proposed. These approaches either require extraneous data to train the filler models or require readjustment whenever the set of in-vocabulary words changes. In this thesis, a novel approach based on the probabilistic characteristics of the models is proposed to reduce the confusion between a claimed model and the other models. To reflect these characteristics, the anti-model for the claimed model is constructed by weighting the observation probabilities of the other models, with weights inversely proportional to their distances to the claimed model. In addition, a hybrid of the proposed model and the on-line garbage model is suggested to improve performance further. The proposed method is evaluated on a speech corpus of 455 isolated Korean words. For the simulation, 90 words are selected as in-vocabulary words and another 90 non-overlapping words are selected as out-of-vocabulary words. The proposed method achieves an equal error rate of 8.33%, which corresponds to an error rate reduction of 63.16% over the filler model approach and 39.77% over the on-line garbage model.
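The anti-model construction described above can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not specify the distance measure, the weight normalization, or the decision rule, so the sketch assumes positive distances supplied by the caller, weights normalized to sum to one, and a log-likelihood ratio test against a threshold. For simplicity it also works with one log-likelihood score per model rather than per-state observation probabilities. All function names are hypothetical.

```python
import math


def inverse_distance_weights(distances):
    """Weights inversely proportional to each model's distance to the claimed model.

    Assumption: distances are strictly positive; weights are normalized to sum to 1.
    """
    inverse = [1.0 / d for d in distances]
    total = sum(inverse)
    return [v / total for v in inverse]


def anti_model_log_likelihood(other_log_likelihoods, distances):
    """Log of the weighted sum of the other models' likelihoods.

    Computed with the log-sum-exp trick for numerical stability.
    """
    weights = inverse_distance_weights(distances)
    terms = [math.log(w) + ll for w, ll in zip(weights, other_log_likelihoods)]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def verify(claimed_log_likelihood, other_log_likelihoods, distances, threshold=0.0):
    """Accept the claimed model if its log-likelihood ratio against the
    anti-model reaches the threshold (assumed decision rule)."""
    score = claimed_log_likelihood - anti_model_log_likelihood(
        other_log_likelihoods, distances
    )
    return score >= threshold, score


if __name__ == "__main__":
    # Illustrative numbers only, not data from the thesis.
    accepted, score = verify(-3.0, [-5.0, -6.0, -9.0], [1.0, 2.0, 4.0])
    print(accepted, round(score, 3))
```

Models close to the claimed model receive the largest weights, so the anti-model is dominated by the most confusable competitors. In an evaluation, the threshold would be swept over in-vocabulary and out-of-vocabulary scores, and the equal error rate is the operating point at which the false acceptance and false rejection rates coincide.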