DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Seo, Minjoon | - |
dc.contributor.advisor | 서민준 | - |
dc.contributor.author | Yang, Sohee | - |
dc.date.accessioned | 2023-06-22T19:31:12Z | - |
dc.date.available | 2023-06-22T19:31:12Z | - |
dc.date.issued | 2023 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1032336&flag=dissertation | en_US |
dc.identifier.uri | http://hdl.handle.net/10203/308181 | - |
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): Kim Jaechul Graduate School of AI, 2023.2, [v, 27 p.] | - |
dc.description.abstract | The state of the art in open-domain question answering relies on a retrieve & read approach, which uses an efficient bi-encoder retriever to retrieve the documents relevant to the question from a large knowledge source and then applies a cross-encoder reader to the retrieved documents to find the answer. This thesis covers various ways to enhance the design of retrievers for open-domain question answering systems. The main part of the thesis addresses how to reduce the size of a retrieve & read system for open-domain question answering and improve its accuracy. Here, we propose a combination of approaches to size down a conventional retrieve & read system and explore the trade-off between the storage budget and the accuracy. By applying our strategies to a recent extractive retrieve & read system, DPR, we reduce its size by 160x with little loss of accuracy; the resulting accuracy is still higher than that of a purely parametric T5 baseline with a comparable docker-level storage footprint. The thesis also contains two additional short chapters, which describe a knowledge distillation-based method for performance improvement and a new retrieval approach that competes with the bi-encoder retriever. First, we describe how to perform knowledge distillation from a cross-encoder reader to a bi-encoder retriever to overcome the performance limitations of the bi-encoder architecture. Second, we introduce a generative retrieval approach that solves search tasks by generating relevant documents from the model parameters based on input queries. It takes up a smaller system footprint than existing bi-encoder retrievers, which select relevant documents from an index of the text corpus that is often large. | - |
dc.language | eng | - |
dc.publisher | 한국과학기술원 | - |
dc.subject | Deep learning; Natural language processing; Open-domain question answering; Information retrieval; Retriever | - |
dc.subject | 딥러닝; 자연어 처리; 오픈 도메인 질의 응답; 정보 검색; 검색기 | - |
dc.title | Enhancing the design of retriever for open-domain question answering | - |
dc.title.alternative | 오픈 도메인 질의 응답을 위한 검색기 설계 개선 | - |
dc.type | Thesis(Master) | - |
dc.identifier.CNRN | 325007 | - |
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST): Kim Jaechul Graduate School of AI | - |
dc.contributor.alternativeauthor | 양소희 | - |