Improvement of speaker identification systems using candidate selection and likelihood ratio normalisation = 후보선정과 우도비 정규화를 이용한 화자식별 시스템의 성능향상

Speaker identification selects, from among the enrolled speakers, the speaker who best matches the input speech. It is mainly used in telephone services because it requires only speech as input. In real environments, correct speaker identification is difficult for two main reasons. First, the number of enrolled speakers is large, so the subspace represented by one speaker model can overlap with the subspaces of other speaker models. Second, mismatches occur between speaker models and input speech because of insufficient training data, differences between training and testing environments, and the effects of noise. Normalisation and scoring methods that reduce the number of mismatches are therefore needed.

To address the overlap of speaker subspaces, this thesis proposes a confidence measure based on significance testing, which is used to select candidates for the identification result. If the confidence value obtained for an input is greater than a predefined threshold, the identification system accepts the identification result. If the confidence value is less than the threshold for the client set, the system rejects the identification result and selects the proper candidates. To address mismatches between speaker models and input speech, this thesis also proposes a scoring method that, after candidate selection, eliminates the frames in which the selected candidates have a lower average rank. As a result, every speaker is scored on the same selected frames when the normalised score is calculated.

To verify whether the proposed confidence measure accepts and rejects correctly, the identification rate over all inputs is compared with the rate over those inputs that exceed the predefined confidence level. Inputs exceeding the predefined confidence level (0.95) show identification rates that are, on average, 28.71 percent higher than those of all inputs. In order to verify the candidate selection method, identificat...
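The accept-or-select step and the frame elimination described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and everything specific in it is an assumption made for illustration: the function names (`select_candidates`, `prune_frames`, `rescore`), the softmax-style confidence value standing in for the significance-testing measure, the `max_candidates` and `keep_ratio` parameters, the reading of "lower average rank" as "the candidates rank poorly in that frame", and the normalisation as a log-likelihood ratio against the mean score of the competing candidates.

```python
import math

def select_candidates(frame_loglik, threshold=0.95, max_candidates=3):
    """frame_loglik maps speaker -> list of per-frame log-likelihoods."""
    # Average log-likelihood of each speaker over all frames.
    avg = {s: sum(v) / len(v) for s, v in frame_loglik.items()}
    ranked = sorted(avg, key=avg.get, reverse=True)
    # Stand-in confidence (assumption): softmax share of the best speaker.
    peak = avg[ranked[0]]
    weights = [math.exp(avg[s] - peak) for s in ranked]
    confidence = weights[0] / sum(weights)
    if confidence >= threshold:
        return [ranked[0]], confidence          # accept the top result
    return ranked[:max_candidates], confidence  # reject, keep candidates

def prune_frames(frame_loglik, candidates, keep_ratio=0.5):
    """Return indices of frames where the candidates rank best on average."""
    speakers = list(frame_loglik)
    n_frames = len(frame_loglik[speakers[0]])
    avg_rank = []
    for t in range(n_frames):
        # Rank 1 is the speaker with the highest likelihood in frame t.
        order = sorted(speakers, key=lambda s: frame_loglik[s][t], reverse=True)
        rank = {s: i + 1 for i, s in enumerate(order)}
        avg_rank.append(sum(rank[c] for c in candidates) / len(candidates))
    n_keep = max(1, int(round(keep_ratio * n_frames)))
    best = sorted(range(n_frames), key=lambda t: avg_rank[t])[:n_keep]
    return sorted(best)

def rescore(frame_loglik, candidates, kept):
    """Score every candidate on the same kept frames, then normalise."""
    raw = {c: sum(frame_loglik[c][t] for t in kept) / len(kept)
           for c in candidates}
    scores = {}
    for c in candidates:
        others = [raw[o] for o in candidates if o != c]
        # Log-likelihood ratio against the mean competing score (assumption).
        scores[c] = raw[c] - sum(others) / len(others) if others else raw[c]
    return scores

# Toy data: three enrolled speakers, four frames of log-likelihoods.
toy = {
    "A": [-1.0, -2.0, -1.0, -5.0],
    "B": [-1.2, -1.5, -1.1, -1.0],
    "C": [-3.0, -3.0, -3.0, -0.5],
}
candidates, confidence = select_candidates(toy, max_candidates=2)
kept = prune_frames(toy, candidates, keep_ratio=0.75)
scores = rescore(toy, candidates, kept)
```

Because every candidate is rescored on the same `kept` frames, the normalised scores are directly comparable, which is the property the abstract attributes to the proposed scoring method.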
Advisors
Oh, Yung-Hwan (오영환)
Description
Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
1998
Identifier
134009/325007 / 000963163
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science, 1998.2, [ v, [58] p. ]

Keywords

Normalisation; Speaker identification; Candidate selection; 후보선정; 정규화; 화자식별

URI
http://hdl.handle.net/10203/34257
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=134009&flag=dissertation
Appears in Collection
CS-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.
