DSpace at KOASAS: Parallelization of Multi-query Processing for Hierarchical Data Streams


Parallelization of Multi-query Processing for Hierarchical Data Streams (계층 구조 스트림 데이터를 위한 다중 질의 병렬 처리 기법)


Kim, Soo-Hyung / 김수형

As increasing amounts of information are stored, exchanged, and presented using eXtensible Markup Language (XML), processing XML streams adequately has become increasingly important. Meanwhile, multicore architectures have become the norm for computing systems in recent years because they provide CPU-level support for parallelism. However, existing algorithms for processing XML streams do not take full advantage of this capability because they were not designed to run in parallel. Their processing performance also degrades as the number of user queries increases. In this thesis, we propose several methods for efficiently parallelizing finite state automaton (FSA)-based XML stream processing. We transform a large collection of XPath expressions into multiple FSA-based query indexes and then process XML streams in parallel by exploiting index-level parallelism. Each core works only with its own query index, so no synchronization issues arise while filtering XML streams against the multiple path patterns supplied by users. Moreover, the proposed algorithm lets query processing share input scans and path solutions, which reduces redundant processing and saves computation and I/O. We also present an in-memory MapReduce model that makes it possible to process a large collection of twig pattern joins over XML streams simultaneously. In our approach, twig pattern joins are performed by multiple hardware threads in a shared and balanced way. In addition, we address performance issues in the in-memory MapReduce model with a run-time workload balancing scheme that computes the cost of each twig pattern join operation before the join is actually performed. Extensive experiments show that our algorithm outperforms conventional algorithms by up to ten times on an 8-core CPU when processing 10 million XPath expressions over XML streams. Experiments with a synthetic XML dataset further demonstrate that our parallel algorithms are efficient and scalable.
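
The index-level parallelism described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes queries are simple child-axis paths such as /a/b, uses a trie as a stand-in for an FSA-based query index, and represents the XML stream as an in-memory list of ('start', tag) and ('end', tag) events. The names build_index, filter_stream, and parallel_filter are hypothetical. Each worker owns one index and never touches another worker's state, so no locking is needed; here every worker rescans the shared event list, and Python threads are used only to show the structure (CPython's global interpreter lock prevents real CPU-bound speedup).

```python
from concurrent.futures import ThreadPoolExecutor


def build_index(paths):
    """Build a trie over simple paths like '/a/b' (stand-in for an FSA query index)."""
    root = {"children": {}, "queries": []}
    for path in paths:
        node = root
        for step in path.strip("/").split("/"):
            node = node["children"].setdefault(
                step, {"children": {}, "queries": []}
            )
        node["queries"].append(path)  # accepting state for this query
    return root


def filter_stream(index, events):
    """Run one index over ('start', tag) / ('end', tag) events; return matched paths."""
    matched = set()
    stack = [index]  # one trie state per open element; None marks a dead state
    for kind, tag in events:
        if kind == "start":
            top = stack[-1]
            nxt = top["children"].get(tag) if top is not None else None
            if nxt is not None:
                matched.update(nxt["queries"])
            stack.append(nxt)
        else:  # 'end' event: return to the parent element's state
            stack.pop()
    return matched


def parallel_filter(paths, events, workers=4):
    """Partition queries round-robin, build one index per worker, filter independently."""
    partitions = [paths[i::workers] for i in range(workers)]
    indexes = [build_index(p) for p in partitions if p]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda idx: filter_stream(idx, events), indexes))
    matched = set()
    for result in results:
        matched |= result  # merging happens only after all workers finish
    return matched


# Events for the document <a><b><c/></b><d/></a>
events = [
    ("start", "a"), ("start", "b"), ("start", "c"), ("end", "c"),
    ("end", "b"), ("start", "d"), ("end", "d"), ("end", "a"),
]
paths = ["/a/b", "/a/b/c", "/a/d", "/a/x", "/b"]
print(sorted(parallel_filter(paths, events, workers=2)))
```

Because the partial results are combined only after every worker has finished, the per-worker state stays private throughout the scan, which is the property the abstract attributes to giving each core its own query index.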

Advisors: Lee, Yoon-Joon (이윤준)

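Description: Korea Advanced Institute of Science and Technology (KAIST): School of Computing

Publisher: Korea Advanced Institute of Science and Technology (KAIST)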

Issue Date: 2017

Identifier: 325007

Language: eng

Description: Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): School of Computing, 2017.2, [iv, 59 p.]

Keywords: data streams; XML; query processing; parallel processing; multicore architecture; intra-node parallelism

URI: http://hdl.handle.net/10203/242081

Link: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=675850&flag=dissertation

Appears in Collection: CS-Theses_Ph.D.(박사논문)

Files in This Item: There are no files associated with this item.


Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
