(An) efficient approach to improve the performance of concurrent read streams in distributed file systems with various running environments = 다양한 실행 환경에서 분산 파일 시스템의 다중 읽기 스트림에 대한 효과적인 성능 향상 기법

Distributed file systems are widely used in many areas. A key issue for their applications is the performance of concurrent read streams (i.e., multiple series of sequential reads issued by concurrent processes) rather than that of a single stream (i.e., one series of sequential reads issued by one process). Concurrent read streams matter more because multiple clients that serve cloud data, analyze big data, and process scientific data frequently issue them to an individual storage server. Despite many studies on local file systems, concurrent read streams have seldom been studied in distributed file systems under different running environments (i.e., different types of storage devices at the storage servers and various network delays between clients and storage servers). Furthermore, most existing distributed file systems (e.g., Gluster, HDFS (Hadoop Distributed File System), and Lustre) show sharply lower performance than a local file system (i.e., EXT4). To achieve high performance for concurrent read streams, we do the following. First, for concurrent read streams, we dedicate each individual read stream to a specific I/O worker at a storage server. Second, for each individual read stream, we introduce a populating effect that keeps sending subsequent reads to a storage server (Population of Networked Reads (PNR)), and then propose an adaptable prefetching scheme (APS) that obtains this effect even in different running environments. As a result, our APS resolves all the problems that we identified as dramatically degrading performance in existing distributed file systems.
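As a rough illustration of the first step (dedicating each read stream to one I/O worker), the sketch below pins a stream to a worker on its first request and reuses that worker for every later request of the same stream. The class name `StreamDispatcher`, the `(client_id, file_id)` stream key, and the least-loaded assignment policy are assumptions made for this example; the abstract does not state how the thesis identifies streams or selects workers.

```python
class StreamDispatcher:
    """Pins each read stream to a single I/O worker (hypothetical sketch)."""

    def __init__(self, num_workers):
        if num_workers <= 0:
            raise ValueError("num_workers must be positive")
        self.num_workers = num_workers
        self.assignment = {}           # stream key -> worker index
        self.load = [0] * num_workers  # number of streams pinned to each worker

    def worker_for(self, client_id, file_id):
        """Return the worker dedicated to this stream, assigning one if needed."""
        key = (client_id, file_id)  # assumed stream identity
        if key not in self.assignment:
            # Assumed policy: pick the worker with the fewest pinned streams.
            worker = min(range(self.num_workers), key=lambda w: self.load[w])
            self.assignment[key] = worker
            self.load[worker] += 1
        return self.assignment[key]

    def close(self, client_id, file_id):
        """Release the worker when the stream ends."""
        worker = self.assignment.pop((client_id, file_id), None)
        if worker is not None:
            self.load[worker] -= 1
```

Because a stream always maps to the same worker, its sequential reads are handled in order by one worker instead of being interleaved across several.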
Across three different types of storage devices and various network delays, the evaluation results show that our APS (1) achieves almost the same performance as a local file system from an individual server and (2) minimizes the performance degradation of random reads.

On the other hand, by adopting a striped RAID (Redundant Array of Independent Disks) (e.g., RAID-0 and RAID-5), which consists of multiple disks and spreads data across them in parallel, distributed file systems can easily enhance the performance of a single read stream and increase storage capacity. In most existing distributed file systems, however, performance degrades further as the number of concurrent read streams increases, for all configurations of striped RAIDs (i.e., the number of striped disks and the strip size). In this thesis, we do the following for different configurations of striped RAIDs. First, for concurrent read streams, we define all the problems that degrade performance and then resolve them by allocating network bandwidth to each individual stream in a fair way (FANB). Second, for each individual read stream, we identify why the existing prefetching method fails to achieve the expected performance (i.e., it fails to achieve the PNR effect on a striped RAID). We then propose strip-aware prefetching (SAP) to obtain the effect efficiently from different configurations of striped RAIDs. As a result, our FANB+SAP outperforms all the existing distributed file systems by at least a factor of two for all kinds and configurations of striped RAIDs. Furthermore, the performance gap between our proposal and the existing distributed file systems widens as the number of striped disks increases.
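To make the idea of prefetching that is aware of the RAID layout more concrete, the sketch below sizes a prefetch window so that it ends on a full-stripe boundary and spans at least one full stripe, which means every data disk contributes to it. This is only one plausible reading of "strip-aware"; the abstract does not describe the actual SAP algorithm, and the function name `stripe_aligned_prefetch` and its parameters are hypothetical.

```python
def stripe_aligned_prefetch(offset, length, strip_size, data_disks, stripes_ahead=1):
    """Return (start, size) of a prefetch window that follows a read.

    Assumptions for this sketch: a full stripe is strip_size * data_disks
    bytes (for RAID-5, data_disks excludes the parity strip), and the window
    should end on a full-stripe boundary.
    """
    if strip_size <= 0 or data_disks <= 0 or stripes_ahead <= 0:
        raise ValueError("strip_size, data_disks, stripes_ahead must be positive")
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")

    stripe = strip_size * data_disks
    start = offset + length  # prefetch begins where the read ended
    # Round start up to the next stripe boundary, then add whole stripes.
    boundary = -(-start // stripe) * stripe
    end = boundary + stripes_ahead * stripe
    return start, end - start


# Example: 64 KiB strips on 4 data disks, after a 4 KiB read at offset 0.
start, size = stripe_aligned_prefetch(0, 4096, 64 * 1024, 4)
```

The window covers the remainder of the current stripe plus `stripes_ahead` full stripes, so its size adapts to both the strip size and the number of disks.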
Advisors
Hyun, Soon Joo (현순주)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Computing
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2019
Identifier
325007
Language
eng
Description

Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): School of Computing, 2019.2, [vi, 85 p.]

Keywords

Distributed file system; concurrent read streams; data prefetching; adaptable prefetching; device type; network delay; RAID; fair bandwidth allocation; strip-aware prefetching

URI
http://hdl.handle.net/10203/265357
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=842404&flag=dissertation
Appears in Collection
CS-Theses_Ph.D. (Doctoral Theses)
Files in This Item
There are no files associated with this item.
