Separating the ordering guarantee and durability guarantee in modern IO stack

Write order is essential to guarantee the crash consistency of user data. There are two examples of essential write orders for consistency: the write order between data blocks and the metadata that points to those data blocks, and the write order between a data structure and the commit block of that data structure. The modern IO stack does not support an ordering primitive, so modern software guarantees the write order by using a durability primitive such as fsync() or flush. However, this prevents software from saturating the performance of state-of-the-art hardware with a large writeback cache, a parallel flash chip architecture, and a deep command queue.

This dissertation separates the ordering guarantee from the durability guarantee. In the first part of this dissertation, we develop Order-Preserving Block Device. There are three write order reversions in the block device: the reversion between Issue order and Dispatch order by the IO scheduler, the reversion between Dispatch order and Transfer order by the DMA controller, and the reversion between Transfer order and Flush order by the storage firmware. To resolve these reversions, Order-Preserving Block Device adopts three techniques: Epoch-based IO Scheduler, Order-Preserving Dispatch, and the cache_barrier command.

In the second part of this dissertation, we redesign the consistency guaranteeing mechanism in the journaling filesystem (EXT4) and the log-structured filesystem (F2FS) by using Order-Preserving Block Device. For the journaling filesystem, we develop Concurrent Journaling Filesystem, CJFS for short. The existing journaling filesystem guarantees the write order by using durability primitives, which serializes every IO request for the transaction commit. CJFS resolves the serial transaction commit by removing the durability primitives called for the write order, thanks to Order-Preserving Block Device. CJFS faces several challenges: the conflict between transactions, the low coalescing degree of the transaction, and the excessive flush calls for the durability of transactions. To overcome these challenges, this dissertation adopts four novel techniques: Dual-Thread Journal Design, Multi-Version Shadow Paging, Opportunistic Coalescing, and Compound Flush. For the log-structured filesystem, we develop Lightweight Checkpoint. The checkpoint in the existing log-structured filesystem blocks all file operations for the consistency of the checkpoint, and wakes up the blocked file operations only after its completion. The checkpoint uses the durability primitive to guarantee two write orders: the write order between the checkpoint contents and the checkpoint commit block, and the write order between the checkpoint and the file operations that are called after the checkpoint. Lightweight Checkpoint removes these durability primitives thanks to Order-Preserving Block Device, and returns after dispatching the cache_barrier command. It improves filesystem performance significantly by minimizing the time file operations are blocked during the checkpoint.

In the third part of this dissertation, we optimize the transaction mechanism of applications such as databases by using Order-Preserving Block Device and a filesystem level transaction. Modern applications implement their transactions themselves and call excessive fsync()s to enforce the write order. Order-Preserving Block Device provides ordering-guaranteeing system calls, fbarrier() and fdatabarrier(), which return just after the write requests for dirty pages and cache_barrier are dispatched. However, the transaction mechanisms still suffer from severe write amplification and excessive context switch overhead because they are implemented as a sophisticated combination of system calls. In this dissertation, we expose the filesystem level transaction to applications. Applications can protect their data with a lightweight filesystem level transaction instead of a sophisticated and heavyweight application level transaction. We implement our transaction model in a log-structured filesystem (F2FS), and we call the new transactional filesystem exF2FS. We design a new multi-file transaction model, Membership-Oriented Transaction, and we add system calls to use it. To overcome the memory bloat problem and the conflict between transactions and garbage collection, exF2FS adopts two technical ingredients, Stealing and Shadow Garbage Collection.
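
As an illustration of the pattern the abstract criticizes, the following is a minimal sketch (not taken from the dissertation) of an application that enforces "payload before commit record" with a durability primitive. The function names ordered_commit and write_all, the file layout, and the commit marker are assumptions made for this example. The first fsync() is needed only for ordering, which is the call an ordering-only primitive such as the fdatabarrier() described above would stand in for; fdatabarrier() is assumed not to exist on a stock kernel, so it appears only in a comment.

```python
import os


def write_all(fd: int, data: bytes) -> None:
    """Write the whole buffer, looping over short writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def ordered_commit(path: str, payload: bytes, commit_mark: bytes = b"COMMIT\n") -> int:
    """Append payload, then a commit marker, using fsync() to force the order.

    Returns the number of fsync() calls issued (hypothetical helper).
    """
    flushes = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        write_all(fd, payload)
        # Ordering point: the application only needs the payload to reach
        # storage before the commit marker, but fsync() also blocks until the
        # payload is durable. An ordering-only call (fdatabarrier() in the
        # dissertation's IO stack) would return after dispatch instead.
        os.fsync(fd)
        flushes += 1

        write_all(fd, commit_mark)
        # Durability point: here the caller does want the commit to be durable.
        os.fsync(fd)
        flushes += 1
    finally:
        os.close(fd)
    return flushes
```

Calling ordered_commit("journal.log", b"record\n") issues two fsync() calls, of which only the second is required for durability.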
Advisors
Won, Youjip (원유집)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2024
Identifier
325007
Language
eng
Description

Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology: School of Electrical Engineering, 2024.2, [ix, 145 p.]

Keywords

IO stack; Kernel; Storage; Block device driver; Filesystem

URI
http://hdl.handle.net/10203/322167
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1100073&flag=dissertation
Appears in Collection
EE-Theses_Ph.D. (Doctoral Theses)
