Text categorization of nuclear system documents using semi-automatic approach
(Korean title: 반자동화된 방법을 이용한 원자력 계통 문서의 분류)

The current practice of making nuclear material export decisions is neither time-efficient nor cost-effective. Under the current system, human experts manually evaluate the submitted documents and the export details described therein in order to decide on export permission, and it is becoming increasingly difficult for this system to cope with the explosion of export review requests. To improve the situation, this research proposes a new text categorization technique for nuclear system documents. The knowledge acquisition bottleneck arises when a small number of experts must work on a large body of information, and there has been a continuous stream of studies addressing this issue through automatic or machine approaches. However, the automated machine approach has gained speed at the expense of the quality of work. In this research, we suggest and demonstrate a new evaluation system for nuclear export control that categorizes documents using keyterms and preexisting information about the categories. In particular, for the extraction of keyterms, three alternative approaches were compared: (1) an automatic keyword extraction approach using TF-IDF, (2) a semi-automatic approach, in which the automatic keyword extraction results were reviewed and adjusted by student experts who majored in nuclear engineering, and (3) a fully manual approach, in which a very experienced senior field expert extracted keywords without any machine support. The study results show that the semi-automatic approach is the most efficient in categorizing keywords, even though it relies on the work of student experts, suggesting that the text categorization results would be even better if field experts were involved. The combination of machine and human appears to be a promising solution that can successfully reduce the knowledge acquisition bottleneck with reduced time and cost and improved accuracy.
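The abstract names TF-IDF keyword extraction followed by human review but gives no implementation details. The following is a minimal sketch of that general idea, not the thesis's actual method. The whitespace tokenizer, the weighting variant (relative term frequency times ln(N/df)), the sample documents, and the function names `tfidf_keywords` and `apply_review` are all assumptions made for illustration.

```python
import math
from collections import Counter

def tfidf_keywords(docs, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms for each document."""
    # Assumption: naive lowercase whitespace tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    results = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Sort by descending score, then alphabetically so ties are stable.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_k])
    return results

def apply_review(candidates, rejected=(), added=()):
    """Apply a reviewer's adjustments to machine-proposed keywords."""
    rejected = set(rejected)
    kept = [t for t in candidates if t not in rejected]
    return kept + [t for t in added if t not in kept]

# Hypothetical sample documents, not taken from the thesis.
docs = [
    "reactor coolant pump seal design",
    "reactor fuel assembly design",
    "export license application form",
]
keywords = tfidf_keywords(docs, top_k=2)
# Semi-automatic step: a reviewer drops one candidate and adds another.
reviewed = apply_review(keywords[0], rejected=["pump"], added=["seal"])
```

Terms shared across documents, such as "reactor" and "design" here, receive a lower weight than terms unique to one document, which is why they are not proposed as keywords. The review function stands in for the human adjustment stage described in approach (2).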
Advisors
Yi, Mun-Yong (이문용)
Description
Korea Advanced Institute of Science and Technology (KAIST) : Department of Knowledge Service Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2014
Identifier
569590/325007  / 020124381
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : Department of Knowledge Service Engineering, 2014.2, [ vi, 44 p. ]

Keywords

Nuclear; Text categorization; Keyword extraction; Semi-automatic; Knowledge acquisition bottleneck

URI
http://hdl.handle.net/10203/197096
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=569590&flag=dissertation
Appears in Collection
IE-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
