Knowledge discovery based on enhanced feature handling methods - mixed features and local feature weighting - and their application to CRM = 일반적인 형태의 자료 처리 및 질의 특성에 따른 가중치법의 지능적인 자료 처리 방법에 기반한 지식 추출 과정 및 고객관계관리에의 적용
The flood of data produced by the automation of business activities, together with a rapidly changing business environment, has given knowledge discovery in databases (KDD), that is, methodologies for extracting useful knowledge from databases, a very important role in business. Data mining, the heart of the KDD process, has attracted particular attention, and many researchers have tried to develop efficient data mining methodologies and algorithms. Although many algorithms have been developed, many problems remain unsolved. Among them, we focus on feature handling, especially the feature weighting and mixed feature problems. We developed three algorithms for these problems: MBNR, the k-representatives algorithm, and the k-GR algorithm. The performance of each algorithm is demonstrated on datasets from the UCI Machine Learning Repository.
MBNR (Memory-Based Neural Reasoning) is a hybrid system of Case-Based Reasoning (CBR) and a neural network. CBR has a very simple and comprehensible reasoning process, but its prediction accuracy is relatively low. In contrast, a neural network predicts very accurately in many areas but takes a black-box approach, which means that it does not provide comprehensible knowledge to users. The basic reasoning process of MBNR is that of CBR; the integrated neural network provides feature weights according to the query presented to the system. The proposed hybrid system thus takes the strengths of both CBR and neural networks and provides accurate and comprehensible prediction results.
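The idea of query-dependent (local) feature weighting inside a case-based reasoner can be illustrated with a minimal sketch. It is not the MBNR implementation: the trained neural network is replaced by a hypothetical stand-in, `toy_weight_fn`, and the names `weighted_distance`, `retrieve_and_predict`, and the toy case base are assumptions made only for illustration.

```python
from collections import Counter

def weighted_distance(query, case_features, weights):
    """Weighted Euclidean distance between a query and a stored case."""
    return sum(w * (q - c) ** 2
               for w, q, c in zip(weights, query, case_features)) ** 0.5

def retrieve_and_predict(query, case_base, weight_fn, k=3):
    """CBR-style reasoning: retrieve the k nearest cases, then vote.

    `weight_fn` maps the query to one weight per feature, so the distance
    measure changes from query to query (local feature weighting).
    """
    weights = weight_fn(query)
    ranked = sorted(case_base,
                    key=lambda case: weighted_distance(query, case[0], weights))
    neighbors = ranked[:k]
    votes = Counter(label for _, label in neighbors)
    # The retrieved cases are returned too, so the answer stays explainable.
    return votes.most_common(1)[0][0], neighbors

def toy_weight_fn(query):
    # Hypothetical stand-in for a trained network: emphasise feature 0
    # when it is small, feature 1 otherwise.
    return [1.0, 0.1] if query[0] < 0.5 else [0.1, 1.0]

case_base = [
    ([0.1, 0.9], "A"),
    ([0.2, 0.1], "A"),
    ([0.9, 0.9], "B"),
    ([0.8, 0.2], "B"),
]

prediction, neighbors = retrieve_and_predict([0.3, 0.8], case_base, toy_weight_fn)
print(prediction, neighbors)
```

Because the prediction is a vote over retrieved cases, the user can inspect which stored cases supported it, while the weighting function alone decides which features matter for the given query.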
The k-representatives algorithm is an efficient algorithm for clustering nominal data. Most previous clustering algorithms for nominal data use the number of mismatching nominal features as the difference measure. They do not take the similarity between values into account and consider only whether two values are the same or not. The k-representatives algorithm provides a new iterative refinement clustering approach that takes the similarity between nominal values into account.
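The abstract does not specify how clusters are represented or how value similarity is measured, so the following is only a sketch under stated assumptions: each cluster is summarised by the relative frequencies of its values per feature, and a hypothetical `similarity` table supplies similarities in [0, 1] between nominal values, with missing pairs falling back to exact matching. The function names and toy data are illustrative, not taken from the thesis.

```python
from collections import Counter

def build_representative(members):
    """Per-feature relative frequencies of the values seen in a cluster."""
    n_features = len(members[0])
    return [
        {value: count / len(members)
         for value, count in Counter(row[j] for row in members).items()}
        for j in range(n_features)
    ]

def dissimilarity(row, representative, similarity=None):
    """Frequency-weighted dissimilarity between a row and a cluster.

    `similarity` is an assumed lookup {(a, b): s}; pairs that are missing
    fall back to exact matching (1 if equal, else 0).
    """
    similarity = similarity or {}
    total = 0.0
    for value, freqs in zip(row, representative):
        for other, freq in freqs.items():
            default = 1.0 if value == other else 0.0
            s = similarity.get((value, other),
                               similarity.get((other, value), default))
            total += freq * (1.0 - s)
    return total

def k_representatives(rows, k, similarity=None, max_iter=20):
    """Iterative refinement: assign rows, rebuild representatives, repeat."""
    # Deterministic start for this sketch: the first k rows seed the clusters.
    representatives = [build_representative([rows[i]]) for i in range(k)]
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(k),
                key=lambda c: dissimilarity(row, representatives[c], similarity))
            for row in rows
        ]
        if new_assignment == assignment:
            break  # no row changed cluster, so the refinement has converged
        assignment = new_assignment
        for c in range(k):
            members = [row for row, a in zip(rows, assignment) if a == c]
            if members:  # keep the old representative if a cluster empties
                representatives[c] = build_representative(members)
    return assignment, representatives

rows = [("red", "small"), ("blue", "small"),
        ("crimson", "large"), ("navy", "large")]
value_similarity = {("red", "crimson"): 0.8, ("blue", "navy"): 0.8}

labels, _ = k_representatives(rows, k=2, similarity=value_similarity)
plain_labels, _ = k_representatives(rows, k=2)  # mismatch counting only
print(labels, plain_labels)
```

With the similarity table, rows whose colours are close ("red" and "crimson", "blue" and "navy") end up together; without it, every pair of distinct values is treated as equally different, which is the limitation of mismatch counting described above.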