(A) novel term weighting scheme based on discrimination power (Korean title: A study on a term weighting method based on the inherent discrimination power of query terms)

Term weighting for document ranking and retrieval has been an important research topic in Information Retrieval for decades. We propose a novel term weighting method that exploits past retrieval results: the queries that contain a particular term, the documents retrieved for them, and their relevance judgments. The evidential weight of a term proposed in this paper, DP (Discrimination Power), depends on how much the mean weighting scores of the relevant and non-relevant document distributions differ in the relevance-judged past document collection. It also takes into account the rankings and similarity values of the relevant and non-relevant documents to compensate for incorrect positions or scores in the retrieved document list. The experiments were performed using two well-known open-source search engines, Terrier and Indri, and four different ranking models: TFIDF, DFR (Divergence From Randomness) BM25, Hiemstra Language Model, and Indri Language Model. Our experimental results on standard test collections (TREC-3, 4, and 5) show that a term weighting scheme that incorporates the notion of evidential weights outperforms the four baseline schemes. Notably, the performance increase was obtained with only a small number of terms found in a relatively small number of past queries. An additional analysis of how effectiveness changes as the number of terms having a DP value increases shows that DP has strong applicability given a large set of queries, because the effect of DP grows in proportion to the number of DP terms. Further analysis shows that the notion of evidential weight, which is based on the relevance-judged documents rather than on the entire collection, is clearly distinct from IDF. In addition, an experiment on the TREC Web Blogs collection produced significant results, indicating that the proposed method can feasibly be applied to general Web search. As a result, we designed a new te...
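The core idea in the abstract, that a term's DP reflects how far apart the mean weighting scores are in relevant versus non-relevant judged documents, can be illustrated with a minimal sketch. The exact DP formula is not given in this record, so everything below is an assumption made for illustration: the normalized difference of means, the function names `discrimination_power` and `reweight_query`, and the `alpha` mixing parameter are all hypothetical. The rank and similarity compensation mentioned in the abstract is not modeled.

```python
from statistics import mean


def discrimination_power(rel_scores, nonrel_scores):
    """Hypothetical DP: normalized gap between mean term weights.

    rel_scores / nonrel_scores are the (non-negative) weighting scores of
    one term in past documents judged relevant / non-relevant.
    This formula is an illustrative assumption, not the thesis's definition.
    """
    if not rel_scores or not nonrel_scores:
        return 0.0  # no evidence for this term in past judgments
    mu_rel = mean(rel_scores)
    mu_non = mean(nonrel_scores)
    denom = mu_rel + mu_non
    if denom == 0:
        return 0.0
    # Positive when the term scores higher in relevant documents.
    return (mu_rel - mu_non) / denom


def reweight_query(query_terms, dp_table, alpha=1.0):
    """Scale each query term's weight by its DP, if one is known.

    Terms absent from dp_table keep the neutral weight 1.0.
    alpha is a hypothetical knob controlling how strongly DP is applied.
    """
    return {t: 1.0 + alpha * dp_table.get(t, 0.0) for t in query_terms}


if __name__ == "__main__":
    dp = {"retrieval": discrimination_power([0.8, 0.6], [0.2, 0.4])}
    print(reweight_query(["retrieval", "novel"], dp))
```

In this sketch only terms seen in past judged queries receive a non-neutral weight, which mirrors the abstract's observation that the effect depends on how many query terms have a DP value.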
Advisors
Myaeng, Sung-Hyon (맹성현)
Description
Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2011
Identifier
466475/325007  / 020035904
Language
eng
Description

Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science, 2011.2, [ 88 p. ]

Keywords

Evidential Weight; Language Model; Information Retrieval; Term Weighting; Discrimination Power; Weighting Method; Term Discrimination Power; Empirical Weight; Ranking Model

URI
http://hdl.handle.net/10203/33335
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=466475&flag=dissertation
Appears in Collection
CS-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
