Context-aware model with generalized structured gate and attention (Korean title: Context understanding based on generalized and structured gates and attention)

To understand data clearly, a model must capture the relationships between data points as well as each data point itself. In particular, the context, that is, the meaning of the surrounding elements, helps in understanding the data. Gates and attention have become widely used in context-aware models. Both compute the importance of the given features and construct the context by combining those features according to the computed importance. The importance is typically computed with a sigmoid function for gates and a softmax function for attention, and it takes a value between 0 and 1. This thesis focuses on gates and attention as a means for a model to understand data well. Generalized and structured modeling is necessary to capture accurate and diverse contextual representations. However, we find that gates and attention lack these properties, and we propose more generalized and structured models based on these findings. First, we find that the gate in RNNs and their variants cannot flexibly represent values between zero and one because of its sigmoid function. Furthermore, traditional gates are formulated independently, so they lack correlation between gates. To address these limitations, we propose a more generalized and structured gate based on the bivariate Beta distribution. Second, we find that the attention in the Transformer and GAT can be decomposed into similarity and magnitude terms. Furthermore, we find that traditional unstructured multi-head attention (MHA) has difficulty capturing important and diverse features. Based on this new interpretation, we propose a more generalized and structured multi-head implicit kernel attention (MIKAN). We validate the proposed models on text, image, music, time-series, and graph-structured datasets.
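As a rough illustration of the mechanisms the abstract describes, the following Python sketch shows a sigmoid gate, softmax attention weights, and one way a dot-product attention score can be split into a similarity term and a magnitude term. It relies on the standard identity q·k = -½‖q-k‖² + ½‖q‖² + ½‖k‖². This record does not give the thesis's own formulation, so the split shown here and all function names are assumptions made for illustration only.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def sigmoid_gate(x):
    """Conventional gate activation: maps a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    """Numerically stable softmax: positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query over all keys."""
    scale = math.sqrt(len(query))
    return softmax([dot(query, k) / scale for k in keys])


def decomposed_score(query, key):
    """Split q.k into (similarity, magnitude) terms.

    Assumed illustrative split, based on the identity
    q.k = -0.5*||q - k||^2 + 0.5*||q||^2 + 0.5*||k||^2.
    """
    diff = [a - b for a, b in zip(query, key)]
    similarity = -0.5 * dot(diff, diff)
    magnitude = 0.5 * dot(query, query) + 0.5 * dot(key, key)
    return similarity, magnitude


def build_context(weights, values):
    """Context vector: importance-weighted sum of the value vectors."""
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


# Hypothetical toy inputs, chosen only to exercise the functions.
query = [1.0, 2.0]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = attention_weights(query, keys)
context = build_context(weights, keys)
```

In this sketch the similarity term depends only on the distance between the query and the key, while the magnitude term depends only on their norms; adding the two recovers the usual dot-product score.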
Advisors
Moon, Il-Chul (문일철)
Description
Korea Advanced Institute of Science and Technology (KAIST): Department of Industrial and Systems Engineering
Country
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2021
Identifier
325007
Language
eng
Article Type
Thesis(Ph.D)
URI
http://hdl.handle.net/10203/294596
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=956467&flag=dissertation
Appears in Collection
IE-Theses_Ph.D. (Doctoral Theses)
Files in This Item
There are no files associated with this item.
