Fault-based distributed recovery block (FDRB) for switching systems = 교환시스템을 위한 결함 기반의 분산 복구 블록

Distributed Recovery Block (DRB) is a widely used fault-tolerant technique for real-time systems that provides a forward recovery scheme by treating hardware and software faults uniformly. However, DRB has limitations in practical use. When a fault is encountered during program execution, DRB immediately switches to another version of the program without first attempting to recover from the fault using a recovery scheme selected according to the type and effects of that fault. Moreover, since DRB treats all the application functions on a computing station as a single program unit, it is not applicable to large real-time systems. We therefore propose a new fault-tolerant technique designed for large real-time systems, especially switching systems. Our technique is tuned to the types and effects of faults observed while running switching systems, and it is based on the testing experience accumulated during the software development of switching systems. We have added a self-checking mechanism and a selective recovery mechanism to the ordinary DRB to design a fault-based DRB (FDRB). We have then extended FDRB to a hierarchical scheme for large real-time systems. Hierarchical FDRB (H-FDRB) includes multiple FDRB modules and a monitor that synchronizes the operation of duplicate nodes. We have compared the performance of our approach with that of the ordinary DRB using model analysis and simulation, evaluating the probability of failure-free operation and the fault recovery time of ordinary DRB, FDRB, and H-FDRB. We have also conducted an empirical evaluation by implementing three versions of different algorithms and executing them with injected faults. The reliability achieved with FDRB is comparable to or better than that of DRB, thanks to the fault recovery performed by the recovery-handling programs, and the fault recovery time is reduced without sacrificing the software reliability of the system. We have also found that FDRB and H...
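The difference the abstract draws between ordinary DRB (switch versions immediately on a fault) and FDRB (self-check, then try a recovery scheme chosen by fault type before switching) can be illustrated with a small sketch. This is not the thesis's implementation: it is a single-process toy, and every name in it (`FaultDetected`, `run_fdrb`, the fault labels `invalid_input` and `overload`, and the example versions) is a hypothetical stand-in chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple


class FaultDetected(Exception):
    """Raised by a version's self-check; fault_type is a hypothetical label."""

    def __init__(self, fault_type: str):
        super().__init__(fault_type)
        self.fault_type = fault_type


def run_fdrb(
    versions: List[Callable[[int], int]],
    acceptance_test: Callable[[int, int], bool],
    handlers: Dict[str, Callable[[int], int]],
    arg: int,
) -> Tuple[int, List[str]]:
    """Try each version in order; on a self-detected fault, attempt a
    recovery handler selected by fault type before switching versions."""
    log: List[str] = []
    for index, version in enumerate(versions):
        try:
            result = version(arg)
        except FaultDetected as fault:
            handler = handlers.get(fault.fault_type)
            if handler is None:
                # No selective recovery available: fall through to the
                # next version, as an ordinary recovery block would.
                log.append(f"version {index}: no handler for {fault.fault_type}")
                continue
            result = handler(arg)
            log.append(f"version {index}: recovered from {fault.fault_type}")
        if acceptance_test(arg, result):
            log.append(f"version {index}: accepted")
            return result, log
        log.append(f"version {index}: rejected by acceptance test")
    raise RuntimeError("all versions failed")


# Hypothetical example versions: both double a magnitude.
def primary(x: int) -> int:
    if x < 0:
        raise FaultDetected("invalid_input")  # self-check
    if x > 1000:
        raise FaultDetected("overload")  # self-check
    return x * 2


def alternate(x: int) -> int:
    return abs(x) + abs(x)


def accept(x: int, result: int) -> bool:
    return result >= 0 and result % 2 == 0


handlers = {"invalid_input": lambda x: abs(x) * 2}
```

With these assumptions, `run_fdrb([primary, alternate], accept, handlers, -3)` is recovered by the `invalid_input` handler without leaving the primary version, whereas an input of `2000` raises `overload`, for which no handler is registered, so execution switches to `alternate`. The distributed aspects (duplicate nodes, the H-FDRB monitor) are deliberately left out of this sketch.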
Kwon, Yong-Rae (권용래)
Korea Advanced Institute of Science and Technology (KAIST) : Division of Computer Science
Issue Date
181182/325007 / 000925256

Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST) : Division of Computer Science, 2003.2, [ vii, 88 p. ]


Self-Checking Program; Distributed Recovery Block; Software Fault Tolerance; Recovery-Handling Program; 복구 처리 프로그램; 자기 검사 프로그램; 분산 복구 블록; 소프트웨어 결함 포용
