Probabilistic parsing of Korean based on language-specific properties (언어 특성에 기반한 한국어의 확률적 구문 분석)

Natural language parsing is a central component of many natural language processing tasks. Because natural language is inherently structurally ambiguous, one of the main difficulties of parsing is resolving these ambiguities, which arise whenever a sentence can be interpreted in more than one way. Recently, probabilistic approaches to this disambiguation problem have received considerable attention because they offer automatic learning, wide coverage, and robustness. Many probabilistic parsing models have been developed over the past few years, but they mainly target English rather than Korean. In this thesis, we focus on Korean syntax and a probabilistic parsing model for Korean. We investigate two problems: representing Korean syntax, and building a language model for Korean syntax. A representation of a language describes the structure of the language and directly reflects its features. A language model is a probability distribution $P(S)$ over strings $S$ that attempts to reflect how frequently a string $S$ occurs as a sentence. The claim of this thesis is that the syntax of Korean can be represented more efficiently using a grammar representation scheme that exploits the characteristics of Korean. We also argue that taking language-specific features into account can produce a more accurate natural language parser than one built without such consideration. These claims are justified by constructing a parser for Korean based on the specific properties of Korean structures and comparing its performance to a state-of-the-art parser for English on a common task.
Advisors
Kim, Gil-Chang (김길창)
Description
Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
1998
Identifier
134783/325007 / 000945817
Language
eng
Description

Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science, 1998.2, [ vii, 91 p. ]

Keywords

Corpus; Probabilistic parsing; Korean; Syntactic analysis; Language properties; 언어 특성; 코퍼스; 확률 파싱; 한국어; 구문 분석

URI
http://hdl.handle.net/10203/33105
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=134783&flag=dissertation
Appears in Collection
CS-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
