(The) noise robust voice activity detection잡음에 강건한 음성 검출기

Cited 0 time in webofscience Cited 0 time in scopus
  • Hit : 503
  • Download : 0
Voice activity detection (VAD) is a key technique in numerous speech-related application such as speech recognition, speech enhancement and speech coding. In these applications, VAD discriminates the speech from the incoming signal, so that subsequent process steps can aim to speech signal rather than silence or noise. Therefore, VAD must have a robust accuracy in severe, various noise environment. Furthermore, VAD should have a low complexity to be adapted in real-time applications. The most important thing to construct the robust VAD is the feature that system found from the speech signal. Thus, the VAD design procedure can be mapped to feature extraction problem from speech signal. In this paper, we proposed two-direction to extract the robust feature from speech signal. First, unsupervised learning based feature that used the intrinsic harmonicity in the vowel sound. In this procedure, the new approach is proposed to verify the harmonicity and it was applied to VAD system. Our experiments show that the computation cost was extraordinarily reduced compared to previ-ous harmonicity based approach even though the accuracy is slightly improved in severe noise environment. Second, supervised learning based feature which use the discriminative pre-training (DPT). In this approach, we assume that various speech-related features have dissimilar robustness according to different noise types so that, if we fuse these features well, the fused one become a robust feature regardless of the noise type. In order to veri-fy this assumption, well-known speech-related features are fused by DPT. The training step was conducted with various SNR and noise type signal different from previous approach. The result show that the accuracy was out-standing compared to other state-of-the-art approaches.
Advisors
Lee, Hwang Sooresearcher이황수researcherCho, Kwang Hyunresearcher조광현researcher
Description
한국과학기술원 :전기및전자공학부,
Publisher
한국과학기술원
Issue Date
2016
Identifier
325007
Language
eng
Description

학위논문(석사) - 한국과학기술원 : 전기및전자공학부, 2016.2 ,[ii, 30 p. :]

Keywords

Voice activity detection; speech signal processing; Machine learning; Vowel processing; Speech analysis; 음성 검출기; 음성 신호처리; 기계 학습; 모음처리; 음성 분석

URI
http://hdl.handle.net/10203/221791
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=649595&flag=dissertation
Appears in Collection
EE-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.

qr_code

  • mendeley

    citeulike


rss_1.0 rss_2.0 atom_1.0