Recovering valuable information from large dataset
(빅데이터로부터 가치 있는 정보를 복구하는 기법: Techniques for recovering valuable information from big data)

DC Field | Value | Language
dc.contributor.advisor | Chung, Hye Won | -
dc.contributor.advisor | 정혜원 | -
dc.contributor.author | Kim, Daesung | -
dc.date.accessioned | 2023-06-23T19:34:09Z | -
dc.date.available | 2023-06-23T19:34:09Z | -
dc.date.issued | 2023 | -
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1030550&flag=dissertation | en_US
dc.identifier.uri | http://hdl.handle.net/10203/309180 | -
dc.description | Doctoral dissertation (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST) : School of Electrical Engineering, 2023.2, [v, 133 p.] | -
dc.description.abstract | Recovering valuable information from large datasets has become a major challenge in this era of data explosion. Large datasets seem random at first glance, but they can usually be explained by a much smaller number of features. The goal of this thesis is to propose computationally efficient, mathematically grounded algorithms that recover the underlying signals or features. We analyze the algorithms theoretically under probabilistic models and verify our findings with simulations. Three problems are considered in this thesis: binary classification with XOR queries, hypergraph clustering, and matrix completion. For the first problem, we provide a sharp threshold on the number of queries required to recover the binary labels, and we propose an algorithm based on belief propagation that achieves this threshold. We assume a practical scenario in which the error probability varies from worker to worker, and the proposed algorithm achieves the bound even without knowledge of the worker reliability. For hypergraph clustering, we first estimate the adjacency relation between nodes through convex optimization, and we then recover the communities by clustering the rows of the estimated adjacency matrix. Lastly, we study nonconvex optimization and gradient descent in matrix completion. We prove that gradient descent converges to a global minimum even in the presence of local minima or saddle points, provided it starts from a small random initialization. | -
dc.language | eng | -
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | -
dc.subject | XOR query; Hypergraph clustering; Matrix completion; Belief propagation; Convex optimization; Nonconvex optimization; Gradient descent | -
dc.subject | XOR 질문 (XOR query); 하이퍼그래프 군집화 (hypergraph clustering); 행렬 채우기 (matrix completion); 신뢰 전파 (belief propagation); 볼록 최적화 (convex optimization); 비볼록 최적화 (nonconvex optimization); 경사 하강법 (gradient descent) | -
dc.title | Recovering valuable information from large dataset | -
dc.title.alternative | 빅데이터로부터 가치 있는 정보를 복구하는 기법 (Techniques for recovering valuable information from big data) | -
dc.type | Thesis(Ph.D) | -
dc.identifier.CNRN | 325007 | -
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : School of Electrical Engineering | -
dc.contributor.alternativeauthor | 김대성 | -
Appears in Collection
EE-Theses_Ph.D. (Doctoral Theses)
Files in This Item
There are no files associated with this item.
