Zero-crossing-based sound source localization, segregation and recognition = 영교차점에 기초한 음원의 방향 탐지, 분리 및 인식

DC Field | Value | Language
dc.contributor.advisor | Kim, Sung-Ho | -
dc.contributor.advisor | 김성호 | -
dc.contributor.advisor | Kil, Rhee-Man | -
dc.contributor.advisor | 길이만 | -
dc.contributor.author | An, Sung-Jun | -
dc.contributor.author | 안성준 | -
dc.date.accessioned | 2011-12-14T04:40:47Z | -
dc.date.available | 2011-12-14T04:40:47Z | -
dc.date.issued | 2010 | -
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=418773&flag=dissertation | -
dc.identifier.uri | http://hdl.handle.net/10203/41935 | -
dc.description | Doctoral dissertation (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST) : Department of Mathematical Sciences, 2010.2, [ ix, 87 p. ] | -
dc.description.abstract | This thesis presents new methods for spatial hearing algorithms. The first is zero-crossing-based sound source localization that exploits the precedence effect in severely reverberant conditions. The second is binaural mask estimation for sound segregation and recognition when multiple sound sources are present simultaneously. The precedence effect is a psychoacoustic effect related to a group of auditory phenomena. Under reverberant conditions in particular, when similar sounds from one or more sources at different locations reach the listener, the direct sound arrives first and is also heard first. This gives the listener the impression that the sound comes from that location alone, and the perception of later arrivals is suppressed. By adapting the precedence effect to our sound source localization algorithm, we obtain very good simulation results for sound localization under severely reverberant conditions. For sound segregation and recognition, we use a ratio masking method. The mask is determined from the estimated sound source directions using spatial cues such as inter-aural time differences (ITDs) and inter-aural intensity differences (IIDs). In the suggested method, ITDs are estimated using the statistical properties of zero-crossings detected from binaural filter-bank outputs. We also consider estimating ITDs with the aid of IID samples to cope with the phase ambiguities of ITD estimates at high frequencies. For the masking method, we consider using the power ratio of the target to the interference sources. We show that this power ratio is optimal from the viewpoint of reconstructing the target speech signal and is effective in missing-data speech recognition. To estimate the power ratio, the expectation-maximization (EM) method is applied to the ITD estimates. As a result, the proposed method is able to provide a better masking scheme for speech segregation and... | eng
dc.language | eng | -
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | -
dc.subject | 반향 (reverberation) | -
dc.subject | 음성 인식 (speech recognition) | -
dc.subject | 음원 방향 탐지 (sound source localization) | -
dc.subject | 음성 분리 (speech segregation) | -
dc.subject | 영교차점 (zero-crossing) | -
dc.subject | Reverberation | -
dc.subject | Speech Recognition | -
dc.subject | Speech Segregation | -
dc.subject | Sound Source Localization | -
dc.subject | Zero-Crossing | -
dc.title | Zero-crossing-based sound source localization, segregation and recognition = 영교차점에 기초한 음원의 방향 탐지, 분리 및 인식 | -
dc.type | Thesis(Ph.D) | -
dc.identifier.CNRN | 418773/325007 | -
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : Department of Mathematical Sciences | -
dc.identifier.uid | 020045146 | -
dc.contributor.localauthor | Kim, Sung-Ho | -
dc.contributor.localauthor | 김성호 | -
dc.contributor.localauthor | Kil, Rhee-Man | -
dc.contributor.localauthor | 길이만 | -
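
Illustrative note: the abstract mentions estimating ITDs from zero-crossings and masking by a target-to-interference power ratio. The Python sketch below shows one simple way these two ideas could look. It is not the thesis's implementation. It assumes a single frequency band (no filter bank), pairs each left-ear zero-crossing with the nearest right-ear one, and takes the median, whereas the thesis uses statistical properties of the crossings, the precedence effect, IID cues, and EM, none of which are modelled here. The function names, the `max_itd` limit, and the mask form P_target / (P_target + P_interference) are all assumptions made for illustration.

```python
import numpy as np

def upward_zero_crossings(x, fs):
    """Times (s) of negative-to-nonnegative crossings, linearly interpolated."""
    x = np.asarray(x, dtype=float)
    idx = np.where((x[:-1] < 0) & (x[1:] >= 0))[0]
    frac = -x[idx] / (x[idx + 1] - x[idx])  # sub-sample position of the zero
    return (idx + frac) / fs

def estimate_itd(left, right, fs, max_itd=1e-3):
    """Median time offset (s) between nearest left/right zero-crossings.

    Positive values mean the right channel lags the left one.
    max_itd is a hypothetical plausibility limit, not a value from the thesis.
    """
    zl = upward_zero_crossings(left, fs)
    zr = upward_zero_crossings(right, fs)
    if zl.size == 0 or zr.size == 0:
        return float("nan")
    d = zr[None, :] - zl[:, None]            # all pairwise offsets
    nearest = d[np.arange(zl.size), np.abs(d).argmin(axis=1)]
    samples = nearest[np.abs(nearest) <= max_itd]
    return float(np.median(samples)) if samples.size else float("nan")

def ratio_mask(target_power, interference_power, eps=1e-12):
    """Assumed ratio-mask form: P_target / (P_target + P_interference)."""
    t = np.asarray(target_power, dtype=float)
    i = np.asarray(interference_power, dtype=float)
    return t / (t + i + eps)                 # eps avoids division by zero

# Usage: a 500 Hz tone that reaches the right ear 0.25 ms after the left ear.
fs = 16000
t = np.arange(0, 0.05, 1 / fs)
delay = 0.25e-3
left = np.sin(2 * np.pi * 500 * (t + 0.0003))
right = np.sin(2 * np.pi * 500 * (t + 0.0003 - delay))
itd = estimate_itd(left, right, fs)
```

In this toy setting the tone's half period (1 ms) is longer than the delay, so the nearest-crossing pairing is unambiguous. At higher frequencies that no longer holds, which is the phase ambiguity the abstract says is addressed with the aid of IID samples.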
Appears in Collection
MA-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
