Enhancing the design of retriever for open-domain question answering

The state of the art in open-domain question answering relies on a retrieve & read approach: an efficient bi-encoder retriever first retrieves the documents relevant to the question from a large knowledge source, and a cross-encoder reader is then applied to the retrieved documents to find the answer. This thesis covers several ways to enhance the design of retrievers for open-domain question answering systems. The main part of the thesis addresses how to reduce the size of a retriever-and-reader system for open-domain question answering while improving its accuracy. We propose a combination of approaches to size down a conventional retrieve & read system and explore the trade-off between storage budget and accuracy. Applying our strategies to DPR, a recent extractive retrieve & read system, we reduce its size by 160x with little loss of accuracy; the resulting system still outperforms a purely parametric T5 baseline with a comparable Docker-level storage footprint. The thesis also contains two shorter chapters, which describe a knowledge distillation-based method for improving performance and a new retrieval approach that competes with the bi-encoder retriever. First, we discuss how to distill knowledge from a cross-encoder reader into a bi-encoder retriever to overcome the performance limitations of the bi-encoder architecture. Second, we introduce a generative retrieval approach that solves search tasks by generating relevant documents from the model parameters given an input query. This gives a smaller system footprint than existing bi-encoder retrievers, which select related documents from an index of the text corpus that is often large.
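As a rough illustration of the bi-encoder retrieval step described in the abstract, the sketch below scores passages by the inner product between a query vector and precomputed passage vectors, then keeps the top k. It is a minimal sketch and not the thesis implementation: the function names `dot` and `retrieve_top_k` and the hand-written 3-dimensional vectors are hypothetical, and in a real system the vectors would come from trained question and passage encoders.

```python
from typing import List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def retrieve_top_k(
    query_vec: Sequence[float],
    passage_vecs: Sequence[Sequence[float]],
    k: int,
) -> List[Tuple[int, float]]:
    """Return (passage_index, score) pairs for the k highest-scoring passages.

    Passages are encoded independently of the query (the bi-encoder idea),
    so passage_vecs can be computed once and stored as an index.
    """
    scored = [(dot(query_vec, p), i) for i, p in enumerate(passage_vecs)]
    # Sort by descending score; break ties by ascending passage index.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [(i, score) for score, i in scored[:k]]


# Toy vectors standing in for encoder outputs (assumption, for illustration only).
passage_vecs = [
    [0.9, 0.1, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.9],
]
query_vec = [0.2, 0.9, 0.0]

top = retrieve_top_k(query_vec, passage_vecs, k=2)
print([index for index, _ in top])
```

Because every passage vector has to be stored, the index grows with the size of the corpus, which is the storage cost that the abstract's size-reduction work targets.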
Advisors
Seo, Minjoon (서민준)
Description
Korea Advanced Institute of Science and Technology (KAIST): Kim Jaechul Graduate School of AI
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2023
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): Kim Jaechul Graduate School of AI, 2023.2, [v, 27 p.]

Keywords

Deep learning; Natural language processing; Open-domain question answering; Information retrieval; Retriever

URI
http://hdl.handle.net/10203/308181
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1032336&flag=dissertation
Appears in Collection
AI-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
