(A) novel term weighting scheme based on discrimination power
A study on a term weighting method based on the inherent discrimination power of query terms

DC Field | Value | Language
dc.contributor.advisor | Myaeng, Sung-Hyon | -
dc.contributor.advisor | 맹성현 | -
dc.contributor.author | Song, Sa-Kwang | -
dc.contributor.author | 송사광 | -
dc.date.accessioned | 2011-12-13T05:27:59Z | -
dc.date.available | 2011-12-13T05:27:59Z | -
dc.date.issued | 2011 | -
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=466475&flag=dissertation | -
dc.identifier.uri | http://hdl.handle.net/10203/33335 | -
dc.description | Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science, 2011.2, [ 88 p. ] | -
dc.description.abstract | Term weighting for document ranking and retrieval has been an important research topic in Information Retrieval for decades. We propose a novel term weighting method that exploits past retrieval results, consisting of the queries that contain a particular term, the retrieved documents, and their relevance judgments. A term’s evidential weight, DP (Discrimination Power), which we propose in this paper, depends on the degree to which the mean weighting scores of the relevant and non-relevant document distributions differ in the relevance-judged past document collection. It also takes into account the rankings and similarity values of the relevant and non-relevant documents to compensate for incorrect positions or scores in the retrieved document list. The experiments were performed using two well-known open-source search engines, Terrier and Indri, and four different ranking models: TFIDF, DFR (Divergence From Randomness) BM25, the Hiemstra Language Model, and the Indri Language Model. Our experimental results using a standard test collection (TREC-3, 4, and 5) show that a term weighting scheme that incorporates the notion of evidential weights outperforms the four baseline schemes. It is interesting to note that we obtained the performance increase with only a small number of terms found in a relatively small number of past queries. An additional analysis of how effectiveness changes as the number of terms having a DP value increases shows that DP has strong applicability given a large set of queries, because the effect of DP is proportional to the number of DP terms. Further analysis shows that the notion of evidential weight, based not on the entire collection but on the relevance-judged documents, is clearly distinct from IDF. In addition, an experiment on the TREC Web Blogs collection showed a significant result, indicating that the proposed method is feasible to apply to general Web search.
As a result, we designed a new te... | eng
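dc.language | eng | -
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | -
dc.subject | Evidential Weight | -
dc.subject | Language Model | -
dc.subject | Information Retrieval | -
dc.subject | Weighting method | -
dc.subject | Term discrimination power | -
dc.subject | Empirical weight | -
dc.subject | Ranking model | -
dc.subject | Information retrieval | -
dc.subject | Term Weighting | -
dc.subject | Discrimination Power | -
dc.title | (A) novel term weighting scheme based on discrimination power | -
dc.title.alternative | A study on a term weighting method based on the inherent discrimination power of query terms | -
dc.type | Thesis(Ph.D) | -
dc.identifier.CNRN | 466475/325007 | -
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science | -
dc.identifier.uid | 020035904 | -
dc.contributor.localauthor | Myaeng, Sung-Hyon | -
dc.contributor.localauthor | 맹성현 | -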
Appears in Collection
CS-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
