Multihybrid job scheduling for fault-tolerant distributed computing in policy-constrained resource networks

Unpredictable fluctuations in resource availability often force rescheduling decisions that sacrifice the success rate of job completion in batch job scheduling. To overcome this limitation, we consider the problem of assigning a set of sequential batch jobs with demands to a set of resources with constraints such as heterogeneous rescheduling policies and capabilities. The ultimate goal is to find an optimal allocation that maximizes performance benefits in terms of makespan and utilization according to the principle of Pareto optimality, while keeping the job failure rate close to an acceptably low bound. To this end, we formulate a multihybrid policy decision problem (MPDP) on the primary-backup fault tolerance model and theoretically show its NP-completeness. The main contribution is to prove that our multihybrid job scheduling (MJS) scheme guarantees fault-tolerant performance by adaptively combining jobs and resources with different rescheduling policies in MPDP. Furthermore, through extensive simulations under various scheduling conditions, we demonstrate that the proposed MJS scheme outperforms five rescheduling heuristics in solution quality, search adaptability, and time efficiency.
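The selection criterion described in the abstract (Pareto optimality over makespan and utilization, subject to a failure-rate bound) can be illustrated with a minimal sketch. This is not the MJS scheme from the paper; the `Allocation` class, the `pareto_front` helper, the sample values, and the treatment of the failure-rate bound as a hard cutoff are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allocation:
    """Hypothetical summary of one candidate job-to-resource allocation."""
    name: str
    makespan: float      # total completion time, lower is better
    utilization: float   # fraction of resource capacity used, higher is better
    failure_rate: float  # fraction of jobs that fail


def dominates(a: Allocation, b: Allocation) -> bool:
    """True if a is at least as good as b on both objectives and better on one."""
    no_worse = a.makespan <= b.makespan and a.utilization >= b.utilization
    strictly_better = a.makespan < b.makespan or a.utilization > b.utilization
    return no_worse and strictly_better


def pareto_front(candidates, max_failure_rate):
    """Keep feasible candidates that no other feasible candidate dominates.

    Assumption: the failure-rate bound is applied as a hard constraint.
    """
    feasible = [c for c in candidates if c.failure_rate <= max_failure_rate]
    return [c for c in feasible
            if not any(dominates(other, c) for other in feasible)]


# Made-up candidates, for illustration only.
candidates = [
    Allocation("A", makespan=100.0, utilization=0.80, failure_rate=0.02),
    Allocation("B", makespan=90.0, utilization=0.70, failure_rate=0.03),
    Allocation("C", makespan=110.0, utilization=0.75, failure_rate=0.01),
    Allocation("D", makespan=80.0, utilization=0.90, failure_rate=0.20),
]
front = pareto_front(candidates, max_failure_rate=0.05)
front_names = [c.name for c in front]
```

In this sketch, D is excluded for exceeding the failure-rate bound and C is dominated by A, so A and B remain as incomparable trade-offs between makespan and utilization.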
Publisher
ELSEVIER SCIENCE BV
Issue Date
2015-05
Language
English
Article Type
Article
Keywords
INTEGRATED APPROACH; MIGRATION; SYSTEMS; FAILURES
Citation
COMPUTER NETWORKS, v.82, pp.81 - 95
ISSN
1389-1286
DOI
10.1016/j.comnet.2015.02.030
URI
http://hdl.handle.net/10203/200223
Appears in Collection
EE-Journal Papers (Journal Papers)