Theory and application of ultra high dimensional sparse representations for efficient and interpretable semantic search

Deep learning-based models generally use low-dimensional dense representations to express data samples. Although compact and powerful, such representations have several shortcomings that make them unsuitable for tasks that require processing a large number of samples (e.g., searching documents in a web-scale corpus). More specifically, because the limited number of available dimensions leaves each dimension highly entangled, dense representations are susceptible to false matches when the number of samples is large. Also, all dimensions must participate in representing and comparing samples regardless of each sample's characteristics, which is inefficient. Lastly, the entangled dimensions of dense representations are usually hard to interpret. This thesis shows how high-dimensional sparse representations can cope with these problems in the field of natural language processing (NLP). We first explain the theoretical background and properties of high-dimensional sparse representations. We then show how high dimensionality and sparseness together improve both performance and efficiency when applied to information retrieval (IR) and question answering (QA), NLP tasks that require accurately finding relevant documents or answers in a very large corpus with low latency. Finally, we introduce a method for interpreting the model's outcomes both quantitatively and qualitatively.
Advisors
Myaeng, Sung-Hyon (맹성현)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Computing
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2022
Identifier
325007
Language
eng
Description

Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): School of Computing, 2022.2, [v, 62 p.]

URI
http://hdl.handle.net/10203/309237
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=996352&flag=dissertation
Appears in Collection
CS-Theses_Ph.D. (Doctoral theses)
Files in This Item
There are no files associated with this item.
