DSpace at KOASAS: Utilizing non-local information to large-scale hierarchical text classification

Utilizing non-local information to large-scale hierarchical text classification (Korean title: 비국소적 정보를 이용한 대규모 계층적 문서 분류)

Oh, Heung-Seon / 오흥선

Hierarchical text classification into a web taxonomy is challenging because it is a very large-scale problem, with hundreds of thousands of categories and associated documents. Furthermore, the conceptual levels of the categories and the availability of training data for them vary widely. In contrast to earlier work that relied solely on machine learning, the state-of-the-art narrow-down approach uses a search engine to generate candidate categories from the taxonomy and then builds a classifier for the final category selection. However, we observed that previous work uses only the local information associated with the candidate categories to train the classifier. In this thesis, we take the same approach but address the use of non-local information, i.e., global and path information, to improve classification effectiveness. Motivated by the necessity of non-local information, this thesis proposes methods that incorporate it within the statistical language modeling framework, which is well developed in the information retrieval area. For evaluation, we constructed a document collection from web pages in the Open Directory Project (ODP). A series of exhaustive experiments shows the superiority of our methods and reveals the role of non-local information in hierarchical text classification.
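The abstract does not state the thesis's actual formulas, so the following is only a minimal sketch of the general idea under stated assumptions: a term-overlap ranking stands in for the search engine, and each candidate category is scored with a unigram language model that linearly interpolates local (the category's own text), path (text pooled under the same top-level ancestor), and global (the whole collection) estimates. The toy taxonomy, the function names, and the interpolation weights are all hypothetical.

```python
import math
from collections import Counter

# Toy taxonomy: category path -> training text (hypothetical data).
TAXONOMY = {
    "Sports/Soccer": "soccer goal league striker match",
    "Sports/Tennis": "tennis serve racket match set",
    "Science/Physics": "quantum particle energy physics experiment",
}


def tokenize(text):
    return text.lower().split()


# Local information: term counts of each category's own text.
LOCAL = {cat: Counter(tokenize(txt)) for cat, txt in TAXONOMY.items()}

# Path information (assumption): counts pooled over all categories
# that share the same top-level ancestor.
PATH = {}
for cat, counts in LOCAL.items():
    PATH.setdefault(cat.split("/")[0], Counter()).update(counts)

# Global information: counts over the entire collection.
GLOBAL = Counter()
for counts in LOCAL.values():
    GLOBAL.update(counts)


def mle(counts, term):
    """Maximum-likelihood unigram probability of a term."""
    total = sum(counts.values())
    return counts[term] / total if total else 0.0


def candidates(doc_terms, k=2):
    """Stand-in for the search engine: top-k categories by term overlap."""
    overlap = {
        cat: sum(1 for t in doc_terms if t in counts)
        for cat, counts in LOCAL.items()
    }
    return sorted(overlap, key=lambda c: (-overlap[c], c))[:k]


def log_likelihood(doc_terms, cat, w_local=0.6, w_path=0.3, w_global=0.1):
    """Document log-likelihood under an interpolated category model.

    The weights are illustrative assumptions, not values from the thesis.
    """
    parent = cat.split("/")[0]
    global_total = sum(GLOBAL.values())
    vocab_size = len(GLOBAL)
    score = 0.0
    for t in doc_terms:
        # Add-one smoothing keeps unseen terms at a nonzero probability.
        p_global = (GLOBAL[t] + 1) / (global_total + vocab_size + 1)
        p = (w_local * mle(LOCAL[cat], t)
             + w_path * mle(PATH[parent], t)
             + w_global * p_global)
        score += math.log(p)
    return score


def classify(text, k=2):
    """Narrow down to k candidates, then pick the most likely one."""
    terms = tokenize(text)
    return max(candidates(terms, k), key=lambda c: log_likelihood(terms, c))
```

In this sketch the global component guarantees a nonzero probability for every term, while the path component lets a sparsely populated category borrow evidence from its siblings; how the thesis actually combines these sources is not described in the abstract.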

Advisors: Myaeng, Sung-Hyon / 맹성현

Description: Korea Advanced Institute of Science and Technology (KAIST): Department of Computer Science

Publisher: Korea Advanced Institute of Science and Technology (KAIST)

Issue Date: 2014

Identifier: 568609/325007 / 020095234

Language: eng

Description: Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Department of Computer Science, 2014.2, [ vi, 86 p. ]

Keywords: language modeling; web taxonomy; hierarchical text classification

URI: http://hdl.handle.net/10203/197821

Link: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=568609&flag=dissertation

Appears in Collection: CS-Theses_Ph.D. (Doctoral Theses)

Files in This Item: There are no files associated with this item.

KOASAS

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
