Hangul keyword spotting using dynamically synthesized pseudo 2D hidden Markov models = 실시간 합성 의사 2차원 은닉 마르코프 모델을 이용한 한글 핵심어 검출

Latin text runs linearly from left to right and is written or printed as such. Hangul characters, by contrast, consist of two or three graphemes composed nonlinearly inside a 2D rectangular box according to the Hangul character composition rule. The Hangul character set is too large to be modeled with conventional methods. This thesis therefore proposes a novel method for effective character modeling that reduces the number of unit models. The key idea is to synthesize character images in real time and convert them into efficient statistical models, namely HMMs. Traditional HMM methods, although highly successful in 1-D time series analysis, have not yet been successfully extended to 2-D image analysis in a way that fully exploits the hierarchical design and extension of HMM networks for complex structured signals. Instead of the traditional off-line training with the Baum-Welch algorithm, we propose a new method that creates word or composite-character HMMs for 2-D word/character patterns in real time. The proposed method proceeds as follows. First, we manually prepare a set of location-preserving grapheme image samples for each grapheme and average them to obtain a grapheme template. Then, by superposing two or three appropriate grapheme templates, we compose a character image template. Finally, we convert this character template to a pseudo 2D HMM (P2DHMM) in a systematic way. The idea of character composition is not new, but its application to strictly 2-D model design is, especially in the 2-D HMM framework. Another feature of the proposed method is the conversion of the grayscale template into a P2DHMM, which is theoretically correct in the sense of maximum likelihood estimation. An additional noteworthy feature is model size reduction that exploits the information redundancy in the templates: successive HMM states are merged based on the similarity between their output probability distributions (PDs). The resulting models are often much smaller than the original and thus sp...
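The pipeline described in the abstract (average grapheme samples into a template, superpose templates into a character, convert the template into a column-wise model, merge similar successive states) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the thesis's actual procedure: the function names are hypothetical, superposition is taken as a pixelwise maximum, each gray value is treated as a Bernoulli ink probability, state similarity is measured by absolute difference against a tolerance `tol`, and the self-loop probability is derived from the merged run length.

```python
def average_template(samples):
    """Pixelwise mean of equally sized grayscale images (values in [0, 1])."""
    rows, cols = len(samples[0]), len(samples[0][0])
    return [[sum(s[r][c] for s in samples) / len(samples) for c in range(cols)]
            for r in range(rows)]


def compose_character(templates):
    """Superpose grapheme templates; pixelwise max is an assumed operator."""
    rows, cols = len(templates[0]), len(templates[0][0])
    return [[max(t[r][c] for t in templates) for c in range(cols)]
            for r in range(rows)]


def merge_states(probs, tol=0.1):
    """Merge successive states whose output probabilities differ by <= tol.

    Returns [mean_probability, duration] pairs (assumed similarity measure).
    """
    merged = []
    for p in probs:
        if merged and abs(merged[-1][0] - p) <= tol:
            mean, n = merged[-1]
            merged[-1] = [(mean * n + p) / (n + 1), n + 1]
        else:
            merged.append([p, 1])
    return merged


def template_to_p2dhmm(template, tol=0.1):
    """One super-state per column, each holding a chain of merged states."""
    model = []
    for c in range(len(template[0])):
        column = [row[c] for row in template]
        # Assumed duration model: a state covering n pixels loops with 1 - 1/n.
        model.append([{"ink_prob": p, "self_loop": 1 - 1 / n}
                      for p, n in merge_states(column, tol)])
    return model


# Toy 4x2 images: one grapheme in the top half, one in the bottom half.
top_samples = [
    [[1, 1], [1, 0], [0, 0], [0, 0]],
    [[1, 1], [1, 1], [0, 0], [0, 0]],
]
bottom_samples = [
    [[0, 0], [0, 0], [1, 1], [1, 1]],
]

top = average_template(top_samples)
bottom = average_template(bottom_samples)
character = compose_character([top, bottom])
model = template_to_p2dhmm(character)

total_states = sum(len(super_state) for super_state in model)
print(total_states, "states instead of", len(character) * len(character[0]))
```

In this toy case the first column collapses to a single state, which shows how redundancy in a template can shrink the model. A real system would also need transition probabilities between super-states and a decoder, both of which are omitted here.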
Advisors
Kim, Jin-Hyung (김진형)
Description
Korea Advanced Institute of Science and Technology : Computer Science
Publisher
Korea Advanced Institute of Science and Technology (한국과학기술원)
Issue Date
2004
Identifier
237672/325007  / 000935344
Language
eng
Description

Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology : Computer Science, 2004.2, [ x, 83 p. ]

Keywords

KEYWORD SPOTTING; CHARACTER MODELLING; PSEUDO 2D HMM; DOCUMENT RETRIEVAL; 문서 검색; 핵심어 검출; 문자 모델링; 의사 2차원 은닉 마르코프 모델

URI
http://hdl.handle.net/10203/32867
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=237672&flag=dissertation
Appears in Collection
CS-Theses_Ph.D.(박사논문)
