Building word knowledge for information retrieval using statistical information (Korean title: 정보검색을 위한 단어지식의 통계적 구축)

Information retrieval (IR) is the subfield of computer science that deals with the automated storage and retrieval of documents. In an IR system, a user submits a query to find documents that satisfy an information need. A query is the means by which the user conveys that need to the IR system. Queries, however, vary from user to user according to each user's knowledge level and information need. User queries can be characterized by three properties: subjectivity, incompleteness, and variety. Subjectivity means that a query is generated from the individual user's subjective level of knowledge about the information need. Incompleteness means that a user's knowledge is never complete; the degree of incompleteness differs from one user to another. Variety means that users generally do not use exactly the same terms for a single concept. Because of these characteristics, an IR system should use knowledge when processing a user's query to satisfy the information need. The use of knowledge can reduce the knowledge gap between a user and the IR system. There are two main categories of word knowledge in IR: domain knowledge and lexical knowledge. Domain knowledge represents the knowledge of a domain expert in terms of the similarity between terms. While domain knowledge addresses the semantic coherence among terms, lexical knowledge concerns an individual term itself; namely, it focuses on the variety of forms in which a specific term appears in documents. This thesis shows that both kinds of knowledge, domain knowledge and lexical knowledge, can be built by statistical analysis. For domain knowledge, a Bayesian network is used to encode the statistical behavior of terms. The Collocation map, a particular instance of a Bayesian network that encodes term dependency relations, is shown to be useful for automatic domain knowledge construction. The proposed similarity m...
Advisors
Choi, Key-Sun (최기선)
Description
Korea Advanced Institute of Science and Technology (KAIST): Department of Computer Science
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
1997
Identifier
128063/325007 / 000945172
Language
eng
Description

Doctoral dissertation (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Department of Computer Science, 1997.8, [vi, 70 p.]

Keywords

Statistical analysis (통계적 분석); Word knowledge (단어지식); Information retrieval (정보검색); Thesaurus (시소러스); Compound noun (복합명사)

URI
http://hdl.handle.net/10203/33090
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=128063&flag=dissertation
Appears in Collection
CS-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
