Cost-sensitive case-based reasoning using a genetic algorithm: Application to medical diagnosis

Objective: This paper studies a new learning technique, cost-sensitive case-based reasoning (CSCBR), which incorporates unequal misclassification costs into the CBR model. Conventional CBR is now considered a suitable technique for diagnosis, prognosis and prescription in medicine. However, it cannot reflect asymmetric misclassification costs and often assumes that the cost of misclassifying a positive case (illness) as a negative one (no illness) is the same as the cost of the opposite error. The objective of this research is therefore to overcome this limitation of conventional CBR and to encourage the application of CBR to the many real-world medical problems in which misclassification errors carry asymmetric costs.

Methods: The main idea is to adjust two parameters: the optimal cut-off classification point for deciding the absence or presence of disease, and the cut-off distance point for selecting the optimal neighbors within the search space, based on the similarity distribution. Both are dynamically adapted to each new target case using a genetic algorithm (GA). We apply the proposed method to five real medical datasets and compare the results with two other cost-sensitive learning methods, C5.0 and CART.

Results: Our findings show that the total misclassification cost of CSCBR is lower than that of the other cost-sensitive methods in many cases. Even though the genetic algorithm has limitations, namely unstable results and over-fitting to the training data, the CSCBR results with GA are better overall than those of the other methods. The paired t-test results also indicate that the total misclassification cost of CSCBR is significantly lower than that of C5.0 and CART for several datasets.

Conclusion: We have proposed a new CBR method, cost-sensitive case-based reasoning (CSCBR), that incorporates unequal misclassification costs into CBR and optimizes the number of neighbors dynamically using a genetic algorithm. It is meaningful not only for introducing the concept of cost-sensitive learning to CBR, but also for encouraging the use of CBR in the medical area. The results show that the total misclassification cost of CSCBR does not increase in arithmetic progression as the cost of false absence increases arithmetically; the method is therefore cost-sensitive. We also show that the total misclassification cost of CSCBR is the lowest among all methods in four of the five datasets, and that the result is statistically significant in many cases. A limitation is that CSCBR, as originally designed, minimizes misclassification cost only for binary classification. Our future work will extend the method to multi-class problems with more than two groups. (C) 2010 Elsevier B.V. All rights reserved.
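The Methods description can be made concrete with a small sketch. The Python below is not the authors' implementation. It assumes a Euclidean distance between cases, scores a target case by the fraction of positive (disease present) neighbors inside the distance cut-off, and replaces the genetic algorithm with an exhaustive grid search over the two cut-offs, evaluated by leave-one-out. All function and parameter names (`total_cost`, `predict`, `search_cutoffs`, `cost_fn`, `cost_fp`) are hypothetical.

```python
from math import dist


def total_cost(y_true, y_pred, cost_fn, cost_fp):
    """Total misclassification cost with unequal error costs.

    cost_fn: cost of a false absence (disease present, predicted absent).
    cost_fp: cost of a false presence (disease absent, predicted present).
    """
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return fn * cost_fn + fp * cost_fp


def predict(case, cases, labels, max_distance, cutoff):
    """Classify one case from the stored cases within max_distance.

    Assumption: the score is the fraction of positive neighbors, and the
    case is labeled positive when that score reaches the cut-off.
    """
    neighbors = [lab for c, lab in zip(cases, labels)
                 if dist(case, c) <= max_distance]
    if not neighbors:
        # No case inside the distance cut-off: fall back to the nearest one.
        nearest = min(range(len(cases)), key=lambda i: dist(case, cases[i]))
        neighbors = [labels[nearest]]
    score = sum(neighbors) / len(neighbors)
    return 1 if score >= cutoff else 0


def search_cutoffs(cases, labels, cost_fn, cost_fp, distances, cutoffs):
    """Grid search (a stand-in for the GA) over both cut-offs.

    Returns (lowest leave-one-out cost, distance cut-off, classification cut-off).
    """
    best = None
    for d in distances:
        for c in cutoffs:
            preds = []
            for i in range(len(cases)):
                rest_cases = cases[:i] + cases[i + 1:]
                rest_labels = labels[:i] + labels[i + 1:]
                preds.append(predict(cases[i], rest_cases, rest_labels, d, c))
            cost = total_cost(labels, preds, cost_fn, cost_fp)
            if best is None or cost < best[0]:
                best = (cost, d, c)
    return best
```

When `cost_fn` is raised relative to `cost_fp`, a search of this kind tends to choose a lower classification cut-off, so more borderline cases are labeled positive. That trade-off is what the abstract means by reflecting asymmetric misclassification costs.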
Publisher
ELSEVIER SCIENCE BV
Issue Date
2011-02
Language
English
Article Type
Article; Proceedings Paper
Keywords

SYSTEM; MACHINE; CANCER; TREES; CBR

Citation

ARTIFICIAL INTELLIGENCE IN MEDICINE, v.51, no.2, pp.133 - 145

ISSN
0933-3657
URI
http://hdl.handle.net/10203/98593
Appears in Collection
MT-Journal Papers (Journal Papers)