Distant-supervision for question answering (Korean title: 질의응답 시스템을 위한 원격 지도학습)

Question answering (QA) aims to build a machine that answers natural language questions. Recent approaches in QA have focused on semantic alignment between a question and the context that contains the answer, such as a sentence or a passage. Finding the most relevant context requires an intensive reasoning process, and QA models need large-scale training data to learn it. However, constructing training data for QA is costly. In this thesis, I investigate a long-standing problem in QA, the lack of supervision signals, through three sub-topics. The first is machine reading comprehension (MRC). MRC aims to find the answer in a given passage by matching the semantics of the question with the context surrounding the answer. However, MRC models sometimes select irrelevant context. I enhance the context modeling capability of extractive QA models with a distant supervision method that weakly annotates word-level semantic similarity between the question and the words in the context. The second is multi-hop QA. In multi-hop QA, a question consists of multiple sub-questions, and one of the goals is to find a set of passages that contains all the information needed to answer it. Recent multi-hop QA models iteratively retrieve one passage at a time and return the resulting passage set. In this iterative retrieval process, the question encoder must match the semantics of the given question with the context of the passages. However, this task requires a complex reasoning process, which makes building training data for multi-hop QA hard to scale. I propose a weakly-supervised pre-training method and a synthetic data generation method to increase the robustness of multi-hop retrievers when training data is insufficient. The third is question retrieval, a recently proposed real-time QA approach. It answers a given question by searching a pre-indexed question-answer database for the most similar question. However, training data for question retrieval is unavailable. I propose a distant-supervision method that leverages the answers to the questions.
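
The question retrieval setting described in the abstract (answering a new question by finding the most similar question in a pre-indexed question-answer database) can be illustrated with a minimal sketch. This is not the method of the thesis: the bag-of-words cosine similarity, the `QuestionIndex` class, and the sample question-answer pairs are all assumptions chosen for illustration, whereas a real system would presumably use a trained question encoder.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class QuestionIndex:
    """Hypothetical pre-indexed question-answer database (illustrative only)."""

    def __init__(self, qa_pairs):
        if not qa_pairs:
            raise ValueError("the index needs at least one question-answer pair")
        # Pre-compute one vector per stored question at indexing time.
        self.entries = [(Counter(tokenize(q)), q, a) for q, a in qa_pairs]

    def answer(self, question):
        """Return (answer, matched_question, score) for the most similar stored question."""
        query = Counter(tokenize(question))
        best = max(self.entries, key=lambda entry: cosine(query, entry[0]))
        vector, matched_question, stored_answer = best
        return stored_answer, matched_question, cosine(query, vector)


# Assumed toy data, not taken from the thesis.
index = QuestionIndex([
    ("Who is the author of Hamlet?", "William Shakespeare"),
    ("What is the capital of France?", "Paris"),
])
answer, matched, score = index.answer("Who wrote Hamlet?")
print(answer, "| matched:", matched)
```

Because the stored questions are vectorized once when the index is built, answering a query only requires scoring it against the stored vectors, which is what makes this style of QA suitable for real-time use. The sketch scans every entry; a larger database would need an approximate nearest-neighbor index instead.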
Advisors
Oh, Haeyun (오혜연)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Computing
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2022
Identifier
325007
Language
eng
Description

Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): School of Computing, 2022.8, [v, 45 p.]

Keywords

Question answering; Distant-supervision; Open-domain QA; Multi-hop QA; Question retrieval; Machine reading comprehension; Deep learning; Document retrieval

URI
http://hdl.handle.net/10203/309253
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1007882&flag=dissertation
Appears in Collection
CS-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.