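A term weighting approach exploiting external data for cancer clause classification from free-text radiology reports
방사선과 보고서의 암 절 분류에 외부 데이터를 활용한 자질 가중치에 관한 연구 (A study on feature weighting using external data for cancer clause classification in radiology reports)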

DC Field | Value | Language
dc.contributor.advisor | Myaeng, Sung-Hyon | -
dc.contributor.advisor | 맹성현 | -
dc.contributor.author | Nam, Sang-Soo | -
dc.contributor.author | 남상수 | -
dc.date.accessioned | 2013-09-12T01:48:57Z | -
dc.date.available | 2013-09-12T01:48:57Z | -
dc.date.issued | 2013 | -
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=515123&flag=dissertation | -
dc.identifier.uri | http://hdl.handle.net/10203/180446 | -
dc.description | Master's thesis - Korea Advanced Institute of Science and Technology (KAIST): Department of Computer Science, 2013.2, [ v, 39 p. ] | -
dc.description.abstract | Radiology reports are written by medical experts who analyse radiology images such as CT and MRI. Each report consists of cancer clauses and non-cancer clauses. We focus on classifying clauses into the cancer and non-cancer classes. This data has two distinctive characteristics. First, the number of cancer clauses is much smaller than the number of non-cancer clauses. Second, terms that are important for cancer also occur in the non-cancer class: since it is often difficult to determine cancer from radiology images, some clauses are labelled as non-cancer despite containing important cancer terms. Recently, term weighting approaches have been proposed to address the data imbalance problem. However, we argue that they sometimes assign weights incorrectly because of duplicate terms, that is, terms occurring in both classes. Consequently, we use cancer-related external data to calculate term weights. Since the external data is highly related to cancer, we can identify important cancer terms and calculate their weights. Based on the weights calculated from the external data, term weights in the cancer class are increased and term weights in the non-cancer class are decreased. In our experiments, the proposed method outperformed term weighting methods that use the training data. | eng
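dc.language | eng | -
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | -
dc.subject | Text classification | -
dc.subject | Imbalanced data | -
dc.subject | Radiology report | -
dc.subject | Term weighting scheme | -
dc.subject | 방사선과 보고서 (radiology report) | -
dc.subject | 용어 가중치 (term weighting) | -
dc.subject | 문서 분류 (document classification) | -
dc.subject | 외부 데이터 (external data) | -
dc.subject | External data | -
dc.title | A term weighting approach exploiting external data for cancer clause classification from free-text radiology reports | -
dc.title.alternative | 방사선과 보고서의 암 절 분류에 외부 데이터를 활용한 자질 가중치에 관한 연구 (A study on feature weighting using external data for cancer clause classification in radiology reports) | -
dc.type | Thesis(Master) | -
dc.identifier.CNRN | 515123/325007 | -
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST): Department of Computer Science | -
dc.identifier.uid | 020113190 | -
dc.contributor.localauthor | Myaeng, Sung-Hyon | -
dc.contributor.localauthor | 맹성현 | -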
Appears in Collection
CS-Theses_Master(석사논문)
