DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 안정연 | - |
dc.contributor.author | Yoon, Changwon | - |
dc.contributor.author | 윤창원 | - |
dc.date.accessioned | 2024-07-25T19:31:03Z | - |
dc.date.available | 2024-07-25T19:31:03Z | - |
dc.date.issued | 2023 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1045807&flag=dissertation | en_US |
dc.identifier.uri | http://hdl.handle.net/10203/320619 | - |
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : Department of Industrial and Systems Engineering, 2023.8, [iii, 20 p.] | - |
dc.description.abstract | Mixed data are tabular data that include both numerical and categorical features, and they have become prevalent in fields such as finance and medical studies. In our study, we propose a simple yet powerful sparse clustering technique for mixed data. Our approach combines a model-based Gaussian-multinomial mixture model with a partitioning method, leveraging the advantages of both. We also use the difference in log-likelihoods between cluster assignment and non-assignment of each feature to induce sparsity and perform feature selection tailored to the practitioner's needs. The proposed method performs well in high-dimensional settings, where the number of features exceeds the number of observations, owing to its straightforward structure and capacity to induce sparsity. Furthermore, our model can select different features for each cluster and offers feature importance rankings, which greatly enhance the interpretability of the clustering result compared to other sparse clustering techniques for mixed data. We demonstrate our method's performance using synthetic and real data and observe that it is competitive with some state-of-the-art mixed data clustering methods. | - |
dc.language | eng | - |
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | - |
dc.subject | 혼합형 데이터 (mixed data); 군집화 (clustering); 로그-우도 (log-likelihood); 변수 정렬 (feature ranking); 변수 선택 (feature selection) | - |
dc.subject | Mixed data; Clustering; Log-likelihood; Feature Ranking; Feature Selection | - |
dc.title | Sparse clustering of mixed data with likelihood based feature ranking | - |
dc.title.alternative | 우도 기반 변수 정렬을 통한 혼합형 데이터의 희소 군집화 | - |
dc.type | Thesis(Master) | - |
dc.identifier.CNRN | 325007 | - |
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : Department of Industrial and Systems Engineering | - |
dc.contributor.alternativeauthor | Ahn, Jeongyoun | - |