Robust speaker recognition based on filtering in autocorrelation domain and sub-band feature recombination

DC Field | Value | Language
dc.contributor.author | Kim, Sungtak | ko
dc.contributor.author | Ji, Miyoung | ko
dc.contributor.author | Kim, HoiRin | ko
dc.date.accessioned | 2010-12-03T08:31:17Z | -
dc.date.available | 2010-12-03T08:31:17Z | -
dc.date.created | 2012-02-06 | -
dc.date.issued | 2010-05 | -
dc.identifier.citation | PATTERN RECOGNITION LETTERS, v.31, no.7, pp.593 - 599 | -
dc.identifier.issn | 0167-8655 | -
dc.identifier.uri | http://hdl.handle.net/10203/20697 | -
dc.description.abstract | This paper presents a new method to improve features derived from filtering in autocorrelation domain, which are called relative autocorrelation sequence mel-frequency cepstral coefficients (RAS-MFCCs), one of the successful features in autocorrelation domain for noise-robust speaker recognition. The RAS-MFCCs are derived by applying temporal filtering to autocorrelation sequences under the assumption that corrupting noise is stationary. However, the use of only the filtered sequences could cause performance degradation due to the use of restricted information, and the assumption that noise is stationary might result in leaving non-stationary noise components in filtered autocorrelation sequences in real environments. To compensate for the restricted information, we propose a multi-streaming feature extraction that uses autocorrelation sequences as well as temporally filtered autocorrelation sequences for feature extraction. Furthermore, a hybrid feature representation, in which the multi-streaming feature extraction and the sub-band feature recombination are combined, is proposed to reduce the noise effects of autocorrelation sequences and the residual-noise effects of temporally filtered autocorrelation sequences. To evaluate the effectiveness of the proposed hybrid speaker recognition system in noisy conditions, we use the TIMIT database and the NTIMIT database. Experiments on the TIMIT database prove the effectiveness of the proposed hybrid method by reducing errors up to 26% and 14% over the conventional RAS-MFCCs in speaker identification and verification, respectively. On the NTIMIT database, the proposed hybrid feature representation provides error reduction of 24% and 18% over the conventional RAS-MFCCs for speaker identification and verification. (C) 2009 Elsevier B.V. All rights reserved. | -
dc.language | English | -
dc.language.iso | en_US | en
dc.publisher | ELSEVIER SCIENCE BV | -
dc.subject | SCORE NORMALIZATION | -
dc.subject | IDENTIFICATION | -
dc.subject | SPEECH | -
dc.subject | MODELS | -
dc.title | Robust speaker recognition based on filtering in autocorrelation domain and sub-band feature recombination | -
dc.type | Article | -
dc.identifier.wosid | 000276700500008 | -
dc.identifier.scopusid | 2-s2.0-77949271463 | -
dc.type.rims | ART | -
dc.citation.volume | 31 | -
dc.citation.issue | 7 | -
dc.citation.beginningpage | 593 | -
dc.citation.endingpage | 599 | -
dc.citation.publicationname | PATTERN RECOGNITION LETTERS | -
dc.embargo.liftdate | 9999-12-31 | -
dc.embargo.terms | 9999-12-31 | -
dc.contributor.localauthor | Kim, HoiRin | -
dc.type.journalArticle | Article | -
dc.subject.keywordAuthor | Relative autocorrelation sequence | -
dc.subject.keywordAuthor | mel-frequency cepstral coefficients | -
dc.subject.keywordAuthor | Temporal filtering | -
dc.subject.keywordAuthor | Multi-streaming feature extraction | -
dc.subject.keywordAuthor | Hybrid feature representation | -
dc.subject.keywordAuthor | Sub-band feature recombination | -
dc.subject.keywordPlus | SCORE NORMALIZATION | -
dc.subject.keywordPlus | IDENTIFICATION | -
dc.subject.keywordPlus | SPEECH | -
dc.subject.keywordPlus | MODELS | -
Appears in Collection
EE-Journal Papers
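
The abstract states that RAS-MFCCs come from temporal filtering of autocorrelation sequences under a stationary-noise assumption. A minimal sketch of that one step may help illustrate why the assumption matters. It is not the authors' implementation: the function names, the biased autocorrelation estimate, the delta-style filter, its half-width of 2 frames, and the edge clamping are all assumptions made for illustration. The mel-cepstral stage, the multi-streaming extraction, and the sub-band recombination are not shown.

```python
def frame_autocorrelation(frame, max_lag):
    """One-sided biased autocorrelation of a single frame, lags 0..max_lag."""
    n = len(frame)
    return [
        sum(frame[t] * frame[t + k] for t in range(n - k)) / n
        for k in range(max_lag + 1)
    ]


def temporal_filter(sequences, half_width=2):
    """Filter autocorrelation sequences across frames, one lag at a time.

    sequences[m][k] is the autocorrelation of frame m at lag k.
    The filter is a delta-style FIR (an assumed choice):
        out[m][k] = sum_{t=-L..L} t * r[m+t][k] / sum_{t=-L..L} t^2
    Frame indices outside the utterance are clamped to the nearest frame.
    """
    num_frames = len(sequences)
    taps = range(-half_width, half_width + 1)
    norm = sum(t * t for t in taps)
    filtered = []
    for m in range(num_frames):
        row = []
        for k in range(len(sequences[m])):
            acc = 0.0
            for t in taps:
                idx = min(max(m + t, 0), num_frames - 1)
                acc += t * sequences[idx][k]
            row.append(acc / norm)
        filtered.append(row)
    return filtered


if __name__ == "__main__":
    # Toy "clean" autocorrelation sequences: 6 frames, 4 lags each.
    clean = [[float(m * m + k) for k in range(4)] for m in range(6)]
    # Stand-in for stationary noise: the same per-lag offset in every frame.
    noise = [3.0, -1.5, 0.25, 7.0]
    noisy = [[v + c for v, c in zip(row, noise)] for row in clean]

    diff = max(
        abs(a - b)
        for ra, rb in zip(temporal_filter(clean), temporal_filter(noisy))
        for a, b in zip(ra, rb)
    )
    print("largest difference after filtering:", diff)
```

Because the filter taps sum to zero, any component that is identical in every frame is cancelled, which is the sense in which a stationary-noise term drops out. A component that changes from frame to frame would pass through in part, consistent with the residual non-stationary noise the abstract mentions.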