DSpace at KOASAS: Efficient parallel processing of skyline queries in MapReduce

Efficient parallel processing of skyline queries in MapReduce (Korean title: 맵리듀스를 이용한 스카이라인 질의의 효율적인 병렬처리)

Kim, Junsu

Skyline queries find only the interesting tuples in a multi-dimensional dataset, which makes them useful for multi-criteria decision making. To process skyline queries efficiently over large-scale data, it is necessary to use a parallel and distributed framework such as MapReduce, which has been widely used in recent years. In this dissertation, we propose an efficient method for processing skyline queries in a distributed and parallel manner using MapReduce. Several existing approaches process skyline queries on a MapReduce framework. Some perform part of the skyline computation serially, while others perform all of it in parallel. However, each of them suffers from at least one of two drawbacks: (1) the serial computation may prevent the method from fully utilizing the parallelism of the MapReduce framework; (2) when skyline queries are processed in a parallel and distributed manner, the additional overhead of parallel processing may outweigh the benefit gained from parallelization.
To process skyline queries over large data efficiently in parallel, we propose a novel two-phase approach in the MapReduce framework, called SKY-IOC. In the first phase, we divide the input dataset into a number of subsets (called cells) and compute local skylines only for the qualified cells. The outer-cell filter used in this phase considerably improves performance by eliminating the large number of tuples that fall in unqualified cells. In the second phase, the global skyline is computed from the local skylines. To determine the global skyline tuples of each local skyline separately and in parallel, we design the inner-cell filter and propose efficient methods to reduce the overhead of computing and using the inner-cell filters.
The primary advantage of our approach is that, with the two filtering techniques, it processes skyline queries quickly and in a fully parallelized manner in all stages of the MapReduce framework. Through extensive experiments, we demonstrate that the proposed approach substantially improves the overall performance of skyline query processing compared with state-of-the-art skyline processing methods. In particular, the proposed method achieves good performance and scalability with respect to both dataset size and dimensionality. Our approach offers significant benefits for large-scale skyline query processing in distributed and parallel computing environments.
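
The two-phase idea described in the abstract can be illustrated with a minimal single-process Python sketch. It is not the SKY-IOC implementation: the abstract does not define the outer-cell or inner-cell filters, so the grid rule used here (a cell is unqualified when another non-empty cell is strictly smaller on every dimension) and all function names are assumptions made for illustration. The sketch also assumes that smaller values are better and that every coordinate lies in [0, 1]. Plain dictionaries stand in for the map and reduce steps of MapReduce.

```python
from collections import defaultdict


def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points):
    """Naive skyline: keep every point that no other point dominates."""
    result = []
    for p in points:
        if any(dominates(q, p) for q in result):
            continue
        result = [q for q in result if not dominates(p, q)]
        result.append(p)
    return result


def cell_of(point, k):
    """Grid cell index of a point, with k cells per dimension over [0, 1]."""
    return tuple(min(int(x * k), k - 1) for x in point)


def cell_dominates(c1, c2):
    """Every point in c1 dominates every point in c2 (strictly lower index everywhere)."""
    return all(i < j for i, j in zip(c1, c2))


def local_skylines(points, k):
    """Phase 1 (assumed rule): drop dominated cells, then compute one skyline per cell."""
    cells = defaultdict(list)
    for p in points:  # "map": assign each tuple to its cell
        cells[cell_of(p, k)].append(p)
    qualified = [c for c in cells if not any(cell_dominates(o, c) for o in cells)]
    return {c: skyline(cells[c]) for c in qualified}  # "reduce": one task per cell


def global_skyline(local):
    """Phase 2 (simplified): check each local skyline only against cells that could dominate it."""
    result = []
    for c, pts in local.items():
        # Only cells at or below c on every dimension can contain a dominating tuple.
        rivals = [q for o, qs in local.items()
                  if o != c and all(i <= j for i, j in zip(o, c)) for q in qs]
        result.extend(p for p in pts if not any(dominates(q, p) for q in rivals))
    return result


points = [(0.1, 0.9), (0.2, 0.3), (0.3, 0.2), (0.6, 0.6),
          (0.9, 0.1), (0.8, 0.8), (0.25, 0.35)]
print(sorted(global_skyline(local_skylines(points, k=2))))
```

With `k=2`, the cell holding `(0.6, 0.6)` and `(0.8, 0.8)` is discarded without any tuple-level comparison, and each remaining cell's local skyline is checked independently of the others, which is the property that lets the second phase run in parallel. For this input the result matches the naive `skyline(points)`: `(0.1, 0.9)`, `(0.2, 0.3)`, `(0.3, 0.2)`, and `(0.9, 0.1)`.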

Advisors: Kim, Myoung Ho (김명호)

Description: Korea Advanced Institute of Science and Technology (한국과학기술원): School of Computing (전산학부)

Publisher: Korea Advanced Institute of Science and Technology (한국과학기술원)

Issue Date: 2018

Identifier: 325007

Language: eng

Description: Doctoral dissertation (Ph.D.) - Korea Advanced Institute of Science and Technology: School of Computing, 2018.2, [iv, 59 p.]

Keywords: skyline query processing; parallel processing; distributed processing; big data; distributed systems; MapReduce (Korean: 스카이라인 질의 처리; 병렬 처리; 분산 처리; 빅데이터; 분산시스템; 맵리듀스)

URI: http://hdl.handle.net/10203/265319

Link: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=734415&flag=dissertation

Appears in Collection: CS-Theses_Ph.D.(박사논문)

Files in This Item: There are no files associated with this item.

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
