DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Myaeng, Sung-Hyon | - |
dc.contributor.advisor | 맹성현 | - |
dc.contributor.author | Nam, Sang-Soo | - |
dc.contributor.author | 남상수 | - |
dc.date.accessioned | 2013-09-12T01:48:57Z | - |
dc.date.available | 2013-09-12T01:48:57Z | - |
dc.date.issued | 2013 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=515123&flag=dissertation | - |
dc.identifier.uri | http://hdl.handle.net/10203/180446 | - |
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science, 2013.2, [ v, 39 p. ] | - |
dc.description.abstract | Radiology reports are written by medical experts who analyse radiology images such as CT and MRI scans. Each report consists of cancer clauses and non-cancer clauses. We focus on classifying clauses into the cancer and non-cancer classes. The data has two distinctive characteristics. First, the number of cancer clauses is much smaller than the number of non-cancer clauses. Second, terms that are important for the cancer class also occur in the non-cancer class: since it is often difficult to determine the presence of cancer from radiology images, some clauses are labelled as non-cancer even though they contain important cancer terms. Term weighting approaches have recently been proposed to address the data imbalance problem. However, we argue that they sometimes assign incorrect weights because of these terms shared by both classes. We therefore use cancer-related external data to calculate term weights. Because the external data is highly related to cancer, it lets us identify important cancer terms and calculate their weights. Based on the weights calculated from the external data, term weights in the cancer class are increased and term weights in the non-cancer class are decreased. In our experiments, the proposed method outperformed term weighting methods that rely on the training data. | eng |
dc.language | eng | - |
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | - |
dc.subject | Text classification | - |
dc.subject | Imbalanced data | - |
dc.subject | Radiology report | - |
dc.subject | Term weighting scheme | - |
dc.subject | Radiology report | - |
dc.subject | Term weight | - |
dc.subject | Document classification | - |
dc.subject | External data | - |
dc.subject | External data | - |
dc.title | A term weighting approach exploiting external data for cancer clause classification from free-text radiology reports | - |
dc.title.alternative | A study on feature weighting using external data for cancer clause classification in radiology reports | - |
dc.type | Thesis(Master) | - |
dc.identifier.CNRN | 515123/325007 | - |
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science | - |
dc.identifier.uid | 020113190 | - |
dc.contributor.localauthor | Myaeng, Sung-Hyon | - |
dc.contributor.localauthor | 맹성현 | - |