An optimal checkpointing-strategy for real-time control systems under transient faults

Cited 53 times in Web of Science; cited 0 times in Scopus
DC Field | Value | Language
dc.contributor.author | Kwak, SW | ko
dc.contributor.author | Choi, BJ | ko
dc.contributor.author | Kim, Byung Kook | ko
dc.date.accessioned | 2007-09-07T02:45:37Z | -
dc.date.available | 2007-09-07T02:45:37Z | -
dc.date.created | 2012-02-06 | -
dc.date.issued | 2001-09 | -
dc.identifier.citation | IEEE TRANSACTIONS ON RELIABILITY, v.50, no.3, pp.293 - 301 | -
dc.identifier.issn | 0018-9529 | -
dc.identifier.uri | http://hdl.handle.net/10203/1349 | -
dc.description.abstract | Real-time computer systems are often used in harsh environments, such as aerospace, and in industry. Such systems are subject to many transient faults while in operation. Checkpointing enables a reduction in the recovery time from a transient fault by saving intermediate states of a task in a reliable storage facility, and then, on detection of a fault, restoring from a previously stored state. The interval between checkpoints affects the execution time of the task. Whereas inserting more checkpoints and reducing the interval between them reduces the reprocessing time after faults, checkpoints have associated execution costs, and inserting extra checkpoints increases the overall task execution time. Thus, a trade-off between the reprocessing time and the checkpointing overhead leads to an optimal checkpoint placement strategy that optimizes certain performance measures. Real-time control systems are characterized by a timely, and correct, execution of iterative tasks within deadlines. The reliability is the probability that a system functions according to its specification over a period of time. This paper reports on the reliability of a checkpointed real-time control system, where any errors are detected at the checkpointing time. The reliability is used as a performance measure to find the optimal checkpointing strategy. For a single-task control system, the reliability equation over a mission time is derived using the Markov model. Detecting errors at the checkpointing time makes reliability jitter with the number of checkpoints. This forces the need to apply other search algorithms to find the optimal number of checkpoints. By considering the properties of the reliability jittering, a simple algorithm is provided to find the optimal checkpoints effectively. Finally, the reliability model is extended to include multiple tasks by a task allocation algorithm. | -
dc.language | English | -
dc.language.iso | en_US | en
dc.publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC | -
dc.title | An optimal checkpointing-strategy for real-time control systems under transient faults | -
dc.type | Article | -
dc.identifier.wosid | 000172985400009 | -
dc.identifier.scopusid | 2-s2.0-0035466298 | -
dc.type.rims | ART | -
dc.citation.volume | 50 | -
dc.citation.issue | 3 | -
dc.citation.beginningpage | 293 | -
dc.citation.endingpage | 301 | -
dc.citation.publicationname | IEEE TRANSACTIONS ON RELIABILITY | -
dc.embargo.liftdate | 9999-12-31 | -
dc.embargo.terms | 9999-12-31 | -
dc.contributor.localauthor | Kim, Byung Kook | -
dc.contributor.nonIdAuthor | Choi, BJ | -
dc.type.journalArticle | Article | -
dc.subject.keywordAuthor | fault tolerance | -
dc.subject.keywordAuthor | optimal checkpointing | -
dc.subject.keywordAuthor | real-time control systems | -
dc.subject.keywordAuthor | reliability analysis | -
dc.subject.keywordAuthor | rollback recovery | -
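
Illustrative note (not part of the record): the trade-off described in the abstract can be made concrete with a small sketch. The model below is an assumption chosen for illustration and is not the paper's Markov model: a task is split into n equal segments, each ending in a checkpoint with a fixed overhead; faults arrive as a Poisson process; a fault is noticed only at the next checkpoint and forces that segment to be re-run; the task succeeds if all n segments complete before the deadline. All function names and parameter values are hypothetical.

```python
import math


def task_success_probability(n, exec_time, overhead, deadline, fault_rate):
    """Probability that one task instance meets its deadline with n segments.

    Assumed toy model: equal segments, errors detected only at checkpoints,
    a faulty segment is rolled back and re-executed from its start.
    """
    segment = exec_time / n + overhead  # useful work plus one checkpoint
    # Number of whole segment executions that fit before the deadline.
    max_attempts = math.floor(deadline / segment + 1e-9)
    if max_attempts < n:
        return 0.0  # even a fault-free run misses the deadline
    p = math.exp(-fault_rate * segment)  # segment completes fault-free
    q = 1.0 - p
    # Negative-binomial sum: the n-th successful segment must occur within
    # max_attempts executions, so up to (max_attempts - n) re-runs are allowed.
    return sum(
        math.comb(n - 1 + k, k) * p**n * q**k
        for k in range(max_attempts - n + 1)
    )


def best_checkpoint_count(exec_time, overhead, deadline, fault_rate, n_max=50):
    """Exhaustively score n = 1..n_max and return (best_n, scores)."""
    scores = {
        n: task_success_probability(n, exec_time, overhead, deadline, fault_rate)
        for n in range(1, n_max + 1)
    }
    best_n = max(scores, key=scores.get)
    return best_n, scores


if __name__ == "__main__":
    # Hypothetical parameters, in arbitrary time units.
    best_n, scores = best_checkpoint_count(
        exec_time=10.0, overhead=0.5, deadline=20.0, fault_rate=0.05
    )
    for n in sorted(scores):
        print(n, round(scores[n], 6))
    print("best n:", best_n)
```

Because the number of tolerated re-runs comes from a floor operation, the success probability in this sketch need not vary smoothly with n. That is one simple way to see why detecting errors only at checkpoints can make reliability jitter with the number of checkpoints; the exhaustive scan here stands in for the paper's search algorithm, which the record does not describe in detail.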
Appears in Collection
EE-Journal Papers