Fault-based distributed recovery block (FDRB) for switching systems (Korean title: 교환시스템을 위한 결함 기반의 분산 복구 블록)

Distributed Recovery Block (DRB) is a widely used fault-tolerant technique for real-time systems that provides a forward recovery scheme by treating hardware and software faults uniformly. However, DRB has certain limitations in practical use. When a fault is encountered during program execution, DRB immediately switches to another version of the program; it does not first attempt to recover from the fault with a recovery scheme selected according to the type and effects of the fault. Moreover, since DRB treats all the application functions on a computing station as a single program unit, it is not applicable to large real-time systems. We therefore propose a new fault-tolerant technique designed specifically for large real-time systems, especially switching systems. Our technique is tuned to the types and effects of faults observed while running switching systems and is based on the testing experience accumulated during software development of switching systems. We have added a self-checking mechanism and a selective recovery mechanism to the ordinary DRB to design a fault-based DRB (FDRB). We have then extended FDRB to a hierarchical scheme for large real-time systems. Hierarchical FDRB (H-FDRB) includes multiple FDRB modules and a monitor for synchronizing the operation of duplicate nodes. We have compared the performance of our approach with that of the ordinary DRB using model analysis and simulation. We have evaluated the probability of failure-free operation and the fault recovery time of ordinary DRB, FDRB, and H-FDRB. We have also conducted an empirical evaluation by implementing three versions of different algorithms and executing them with injected faults. The reliability achieved with FDRB is comparable to or better than that achieved with DRB, thanks to the fault recovery performed by the recovery-handling programs, and the fault recovery time is reduced without sacrificing the software reliability of the system. We have also found that FDRB and H...
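
The contrast drawn in the abstract, between switching versions immediately and first attempting a recovery chosen by fault type, can be illustrated with a small sketch. This is not the thesis's implementation: it runs in a single process and omits the duplicate nodes and the H-FDRB monitor. The names `Fault`, `run_with_selective_recovery`, the fault labels "transient" and "permanent", and the integer square root task are all assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List, Tuple


class Fault(Exception):
    """Hypothetical fault signal; `kind` is the label used to pick a handler."""

    def __init__(self, kind: str) -> None:
        super().__init__(kind)
        self.kind = kind


Version = Callable[[int], int]


def run_with_selective_recovery(
    versions: List[Version],
    acceptance_test: Callable[[int, int], bool],
    handlers: Dict[str, Version],
    x: int,
) -> Tuple[int, List[str]]:
    """Try each version in order. On a fault, first try the handler registered
    for that fault kind; fall through to the next version only if that fails."""
    log: List[str] = []
    for i, version in enumerate(versions):
        try:
            result = version(x)
            if acceptance_test(x, result):
                log.append(f"v{i}:accepted")
                return result, log
            log.append(f"v{i}:rejected")
        except Fault as fault:
            log.append(f"v{i}:fault:{fault.kind}")
            handler = handlers.get(fault.kind)
            if handler is None:
                continue  # no handler for this fault kind: switch versions
            try:
                recovered = handler(x)
            except Fault:
                continue  # the recovery attempt itself faulted: switch versions
            if acceptance_test(x, recovered):
                log.append(f"v{i}:recovered")
                return recovered, log
    raise RuntimeError("all versions and recovery handlers failed")


def is_integer_sqrt(x: int, r: int) -> bool:
    """Acceptance test: r is the floor of the square root of x."""
    return r >= 0 and r * r <= x < (r + 1) * (r + 1)


def make_primary(fault_kind: str) -> Version:
    """Primary version that raises an injected fault on its first call only."""
    state = {"calls": 0}

    def primary(x: int) -> int:
        state["calls"] += 1
        if state["calls"] == 1:
            raise Fault(fault_kind)
        return math.isqrt(x)

    return primary


def alternate(x: int) -> int:
    """Independent alternate version: linear search for the integer square root."""
    r = 0
    while (r + 1) * (r + 1) <= x:
        r += 1
    return r


# Case A: a "transient" fault has a handler (retry the primary), so no switch occurs.
flaky = make_primary("transient")
result_a, log_a = run_with_selective_recovery(
    [flaky, alternate], is_integer_sqrt, {"transient": flaky}, 17
)

# Case B: a "permanent" fault has no handler, so execution switches to the alternate.
broken = make_primary("permanent")
result_b, log_b = run_with_selective_recovery(
    [broken, alternate], is_integer_sqrt, {"transient": broken}, 17
)
```

In case A the log records a fault followed by a successful recovery on the same version; in case B it records the fault and then acceptance of the alternate version, which is the only path an ordinary recovery block would take.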
Advisors
Kwon, Yong-Rae (권용래)
Description
Korea Advanced Institute of Science and Technology (KAIST) : Computer Science
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2003
Identifier
181182/325007 / 000925256
Language
eng
Description

Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST) : Computer Science, 2003.2, [ vii, 88 p. ]

Keywords

Self-Checking Program; Distributed Recovery Block; Software Fault Tolerance; Recovery-Handling Program

URI
http://hdl.handle.net/10203/32835
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=181182&flag=dissertation
Appears in Collection
CS-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
