SRT-Rank: Ranking Keyword Query Results in Relational Databases Using the Strongly Related Tree

A top-k keyword query in relational databases returns k trees of tuples, in order of relevance to the query, in which the tuples containing the query keywords are connected via primary key-foreign key relationships. Existing works are classified into two categories: 1) the schema-based approach and 2) the schema-free approach. We focus on the former, which utilizes database schema information for more effective ranking of the query results. Ranking measures used in existing works can be classified into two categories: 1) the size of the tree (i.e., the syntactic score) and 2) ranking measures, such as TF-IDF, borrowed from the information retrieval field. However, these measures do not take into account semantic relevancy among the relations containing the tuples in the query results. In this paper, we propose a new ranking method that ranks the query results by utilizing semantic relevancy among relations containing the tuples at the schema level. First, we propose a structure of semantically strongly related relations, which we call the strongly related tree (SRT). An SRT is a tree that maximally connects relations based on the lossless join property. Next, we propose a new ranking method, SRT-Rank, that ranks the query results by a new scoring function augmenting existing ones with the concept of the SRT. SRT-Rank is the first research effort that applies semantic relevancy among relations to ranking the results of keyword queries. To show the effectiveness of SRT-Rank, we perform experiments on synthetic and real datasets by augmenting the representative existing methods with SRT-Rank. Experimental results show that, compared with existing methods, SRT-Rank improves performance in terms of four quality measures: the mean normalized discounted cumulative gain (nDCG), the number of queries whose top-1 result is relevant to the query, the mean reciprocal rank, and the mean average precision, by up to 46.9%, 160.0%, 61.7%, and 63.8%, respectively. In addition, we show that the query performance of SRT-Rank is comparable to or better than that of existing methods.
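The four quality measures named in the abstract are standard information retrieval metrics. The sketch below shows one common way to compute them for a single query; it is an illustration only, not the evaluation code used in the paper. The function names are hypothetical, and two choices are assumptions: nDCG is normalized against the ideal reordering of the returned list itself, and average precision is normalized by the number of relevant results that were actually returned.

```python
import math

def dcg(gains):
    """Discounted cumulative gain of a ranked list of graded relevance values."""
    # Position i (0-based) is discounted by log2(rank + 1) = log2(i + 2).
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg(gains):
    """DCG divided by the DCG of the same gains sorted in descending order.

    Assumption: the ideal ranking is built from the returned list only.
    """
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0

def top1_relevant(relevant):
    """True if the first result of a non-empty ranked list is relevant."""
    return bool(relevant) and bool(relevant[0])

def reciprocal_rank(relevant):
    """1 / rank of the first relevant result, or 0.0 if none is relevant."""
    for i, rel in enumerate(relevant):
        if rel:
            return 1.0 / (i + 1)
    return 0.0

def average_precision(relevant):
    """Mean of precision@rank over the ranks that hold a relevant result.

    Assumption: normalized by the number of relevant results returned.
    """
    hits, total = 0, 0.0
    for i, rel in enumerate(relevant):
        if rel:
            hits += 1
            total += hits / (i + 1)
    return total / hits if hits else 0.0

def mean(values):
    """Arithmetic mean; used to aggregate per-query scores over a query set."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0
```

Averaging `ndcg`, `reciprocal_rank`, and `average_precision` over all queries with `mean` gives the mean nDCG, the mean reciprocal rank, and the mean average precision, while summing `top1_relevant` over all queries gives the number of queries whose top-1 result is relevant.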
Publisher
IEICE-INST ELECTRONICS INFORMATION COMMUNICATIONS ENG
Issue Date
2014-09
Language
English
Article Type
Article
Citation

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, v.E97D, no.9, pp.2398 - 2414

ISSN
1745-1361
DOI
10.1587/transinf.2014EDP7040
URI
http://hdl.handle.net/10203/193034
Appears in Collection
CS-Journal Papers(저널논문)