(The) noise robust voice activity detection (잡음에 강건한 음성 검출기)

DC Field | Value | Language
dc.contributor.advisor | Lee, Hwang Soo | -
dc.contributor.advisor | 이황수 | -
dc.contributor.advisor | Cho, Kwang Hyun | -
dc.contributor.advisor | 조광현 | -
dc.contributor.author | Kim, Jun-Tae | -
dc.contributor.author | 김준태 | -
dc.date.accessioned | 2017-03-29T02:38:49Z | -
dc.date.available | 2017-03-29T02:38:49Z | -
dc.date.issued | 2016 | -
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=649595&flag=dissertation | en_US
dc.identifier.uri | http://hdl.handle.net/10203/221791 | -
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2016.2, [ii, 30 p.] | -
dc.description.abstract | Voice activity detection (VAD) is a key technique in numerous speech-related applications such as speech recognition, speech enhancement, and speech coding. In these applications, VAD separates speech from the incoming signal so that subsequent processing steps can focus on the speech signal rather than on silence or noise. VAD must therefore remain accurate across severe and varied noise environments. It should also have low complexity so that it can be used in real-time applications. The most important element in building a robust VAD is the feature that the system extracts from the speech signal, so VAD design can be framed as a feature extraction problem. In this thesis, we propose two directions for extracting robust features from the speech signal. The first is an unsupervised-learning-based feature that exploits the intrinsic harmonicity of vowel sounds. A new approach to verifying harmonicity is proposed and applied to a VAD system. Our experiments show that the computational cost is greatly reduced compared to the previous harmonicity-based approach, while accuracy in severe noise environments is slightly improved. The second is a supervised-learning-based feature that uses discriminative pre-training (DPT). Here we assume that different speech-related features differ in their robustness to different noise types, so that a well-chosen fusion of these features is robust regardless of noise type. To verify this assumption, well-known speech-related features are fused by DPT. Unlike previous approaches, training was conducted on signals with various SNRs and noise types. The results show that the accuracy is outstanding compared to other state-of-the-art approaches. | -
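dc.language | eng | -
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | -
dc.subject | Voice activity detection | -
dc.subject | speech signal processing | -
dc.subject | Machine learning | -
dc.subject | Vowel processing | -
dc.subject | Speech analysis | -
dc.subject | 음성 검출기 (voice detector) | -
dc.subject | 음성 신호처리 (speech signal processing) | -
dc.subject | 기계 학습 (machine learning) | -
dc.subject | 모음처리 (vowel processing) | -
dc.subject | 음성 분석 (speech analysis) | -
dc.title | (The) noise robust voice activity detection | -
dc.title.alternative | 잡음에 강건한 음성 검출기 (Noise-robust voice detector) | -
dc.type | Thesis(Master) | -
dc.identifier.CNRN | 325007 | -
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering | -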
Appears in Collection
EE-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
