DSpace at KOASAS: Interpretation of natural language queries for effective data exploration over heterogeneous databases: applications to biomedical domain

Interpretation of natural language queries for effective data exploration over heterogeneous databases: applications to biomedical domain (Korean title: 이질적인 데이터베이스에서의 효과적 데이터 탐색을 위한 자연언어질의 해석: 생물의료 분야에의 적용)

Lee, Ho-Dong / 이호동

Data exploration is an essential process for discovering novel knowledge in scientific research. However, it is difficult for field experts to find the target data by exploration alone, especially when the data are scattered over multiple heterogeneous databases. Since such data are usually associated with one another, there may be appropriate sequences of searches that field experts can use as queries to reach the target data. To support such data exploration, conventional database interfaces provide useful tools for querying with keywords or structured forms. However, we argue that these tools are inadequate for expressing queries that involve sequences of searches across multiple databases embodying diverse relations among their data. To describe such queries in a convenient and expressive manner, we propose using natural language queries (NLQs) to interact with the databases. Such a database interface must automatically interpret NLQs into formal language queries (FLQs), which are in turn composed of smaller FLQs for the different databases. This task requires us to address the problem of database heterogeneity, which arises from differences in formal query languages, database structures, and data contents. The dissertation addresses this problem by treating NLQs as terms and syntactic relations, which correspond respectively to data objects and the operations on them. We use SQL-like expressions to coordinate such terms and syntactic relations, so that FLQs result from a straightforward mapping process. In this work, we present a method that derives the SQL-like expressions from NLQs in a Combinatory Categorial Grammar (CCG) framework and then translates the expressions into the locations of data objects accessible in our target databases. The method then constructs FLQs for these locations in possible sequences, taking data associations into account. Our method thus provides a fully automated way to locate and retrieve available data from databases. We also...
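
As a rough illustration of the last steps described in the abstract (locating data objects and ordering small per-database queries along data associations), the sketch below may help. It is not the dissertation's method: the CCG parsing step is omitted, and the database names, the `LOCATION` and `ASSOCIATIONS` tables, and the functions `search_sequence` and `plan` are all hypothetical names introduced only for this example. It assumes a SQL-like expression of the form SELECT x WHERE y = value has already been derived from the NLQ.

```python
from collections import deque

# Hypothetical locations: which database holds each kind of data object.
LOCATION = {"disease": "DiseaseDB", "gene": "GeneDB", "protein": "ProteinDB"}

# Hypothetical associations: databases whose records cross-reference each other.
ASSOCIATIONS = {
    "DiseaseDB": ["GeneDB"],
    "GeneDB": ["DiseaseDB", "ProteinDB"],
    "ProteinDB": ["GeneDB"],
}


def search_sequence(source_db, target_db):
    """Breadth-first search for the shortest chain of associated databases."""
    queue = deque([[source_db]])
    seen = {source_db}
    while queue:
        path = queue.popleft()
        if path[-1] == target_db:
            return path
        for nxt in ASSOCIATIONS.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no chain of associations connects the two databases


def plan(select_kind, where_kind, where_value):
    """Turn SELECT select_kind WHERE where_kind = where_value into an
    ordered list of small per-database query descriptions."""
    path = search_sequence(LOCATION[where_kind], LOCATION[select_kind])
    if path is None:
        return []
    steps = [f"{path[0]}: find {where_kind} = {where_value!r}"]
    for prev, db in zip(path, path[1:]):
        steps.append(f"{db}: follow links from {prev}")
    return steps


# Example: which proteins are related to a given disease?
for step in plan("protein", "disease", "asthma"):
    print(step)
```

Breadth-first search is used here only because it yields one shortest sequence; the abstract speaks of "possible sequences", so a fuller treatment would presumably enumerate several candidate chains rather than stop at the first.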

Advisors: Park, Jong-C. / 박종철

Description: Korea Advanced Institute of Science and Technology (KAIST) : Computer Science major

Publisher: Korea Advanced Institute of Science and Technology (KAIST)

Issue Date: 2008

Identifier: 304915/325007 / 000995309

Language: eng

Description: Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST) : Computer Science major, August 2008, [ix, 136 p.]

Keywords: Natural Language Processing; Natural Language Interface; Natural Language Query; Bioinformatics; Text Mining

URI: http://hdl.handle.net/10203/33261

Link: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=304915&flag=dissertation

Appears in Collection: CS-Theses_Ph.D.(박사논문)

Files in This Item: There are no files associated with this item.

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
