
Checkpoint Size Reduction in Application-level Fault Tolerant Solutions

Iván Cores, Gabriel Rodríguez, María Martín and Patricia González
Computer Architecture Group, University of A Coruña, Spain
E-mail: {ivan.coresg,grodriguez,mariam,pglez}@udc.es
Abstract: Systems intended for the execution of long-running parallel applications should provide fault-tolerant capabilities, since the probability of failure increases with the execution time and the number of nodes. Checkpointing and rollback recovery is one of the most popular techniques to provide fault-tolerance support. However, in order to be useful for large-scale systems, current checkpoint-recovery techniques should tackle the problem of reducing the checkpointing cost. This paper addresses this issue through the reduction of the checkpoint file sizes. Different solutions to reduce the size of the checkpoints generated at application level are proposed and implemented in a checkpointing tool. Detailed experimental results on a multicore cluster show the effectiveness of the proposed methods.

Keywords: Fault Tolerance; Checkpointing; Parallel Programming

I. Introduction

Current trends in high-performance computing (HPC) systems show that future improvements in performance will be achieved through increases in system scale. Today's large computational problems can be solved in clustered multi-core systems with core counts in the range of one hundred thousand to one million and beyond. However, as parallel platforms increase their number of resources, so does the failure rate of the global system [1]. Thus, programmers need a way to ensure that not all the computation done is lost on machine failures. Many fault tolerance methods for parallel applications exist in the literature, checkpoint-recovery [2] being one of the most popular. It periodically saves the computation state to stable storage, so that the application execution can be resumed by restoring such state. The overhead of saving checkpoints to disk is the main performance cost in checkpoint-recovery methods.

Most existing checkpoint systems are implemented at the operating system level. In system-level checkpointing (SLC) the whole state of each process (program counter, registers and memory) is saved to stable storage. The most important advantage of this approach is its transparency. However, checkpointing to a parallel file system is expensive at large scale [1], [3]. Moreover, the I/O bandwidth of large-scale facilities does not increase as quickly as their computational capability [4]. Therefore, complete SLC of large parallel machines could become impracticable. A more attractive alternative for current and future HPC systems is application-level checkpointing (ALC), where the application program saves and restores its own state, which allows storing only user-space data.

The aim of this paper is to evaluate different techniques to reduce the checkpoint sizes and, thus, the computational and/or I/O cost of checkpointing in ALC approaches. The rest of the paper is organized as follows. Section II proposes different techniques to optimize the checkpoint sizes in ALC solutions. Section III explains the implementation details of those techniques on an ALC tool. Section IV evaluates the performance of the proposed methods. Section V describes related work. Finally, Section VI concludes the paper.

II. Checkpoint Size Optimization in Application-Level Checkpointing

The majority of the checkpointing tools proposed in the literature work at the system level. The basic difference between SLC and ALC, in terms of state file size optimizations, arises from the fact that SLC sees the application memory as a single continuum, while ALC distinguishes a disperse set of contiguous memory blocks, each containing memory allocated to one or more variables. The exact number depends on the aliasing relationships of the application data. The following sections deal with the use of different checkpoint size optimization solutions in an application-level approach, focusing on their application to array variables. In the context of ALC, more noticeable gains can be achieved by applying the optimization techniques only to array variables. Applying them also to scalar variables results in minimal differences in state file sizes and adds performance overhead derived from the analyses required to instrument the target optimizations.

A. Live variable analysis

The knowledge of application memory in ALC can be used to select those variables that are live during the creation of state files, avoiding the storage of dead variables. Depending on the considered application, applying this technique can significantly reduce checkpoint file sizes. The identification of these variables can be performed at compile time through a standard live variable analysis. A variable x is said to be live at a given statement s in a program if there is a control flow path from s to a use of x that contains no definition of x prior to its use. The set LV_in of live variables at a statement s can be calculated using the following expression:

LV_in(s) = (LV_out(s) − DEF(s)) ∪ USE(s)

where LV_out(s) is the set of live variables after executing statement s, and USE(s) and DEF(s) are the sets of variables used and defined by s, respectively.
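To make the dataflow equation concrete, the following is a minimal sketch (not part of CPPC) of the standard backward fixed-point computation of live variable sets over a control flow graph; the statement representation and the use of strings for variable names are illustrative assumptions.

```cpp
// Minimal sketch of iterative live variable analysis (backward dataflow).
// The Stmt structure and CFG encoding are illustrative only.
#include <set>
#include <string>
#include <utility>
#include <vector>

struct Stmt {
    std::set<std::string> use, def;   // USE(s) and DEF(s)
    std::vector<int> succ;            // indices of successor statements
};

// Computes LV_in(s) = (LV_out(s) - DEF(s)) U USE(s) for every statement,
// where LV_out(s) is the union of LV_in over the successors of s.
std::vector<std::set<std::string>> liveIn(const std::vector<Stmt>& stmts) {
    std::vector<std::set<std::string>> in(stmts.size()), out(stmts.size());
    bool changed = true;
    while (changed) {                 // iterate until a fixed point is reached
        changed = false;
        for (int s = static_cast<int>(stmts.size()) - 1; s >= 0; --s) {
            out[s].clear();
            for (int succ : stmts[s].succ)
                out[s].insert(in[succ].begin(), in[succ].end());
            std::set<std::string> newIn = stmts[s].use;
            for (const auto& v : out[s])
                if (!stmts[s].def.count(v)) newIn.insert(v);
            if (newIn != in[s]) { in[s] = std::move(newIn); changed = true; }
        }
    }
    return in;
}
```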

The live variable analysis should take into account interprocedural data flow. Checkpoints in application-level approaches are usually triggered by an explicit call to a checkpoint function in the application code. This guarantees that checkpoints are not performed during a system call, which may have internal state unknown to the checkpointer, but rather inside user-level code. In this way, checkpoint callsites are limited and known at compile time, which allows the live variable analysis to be bounded instead of spanning the whole application code. For each checkpoint callsite c_i, it is only necessary to store the set of variables that are live when the control flow enters the callsite, LV_in(c_i). Live variables should be marked for inclusion in future checkpoints right before the checkpoint callsite. Variables that become dead following the control flow may be unmarked, further reducing the sizes of future checkpoint files.

B. Incremental checkpointing and zero-blocks exclusion

The most popular technique for checkpoint file size reduction in SLC approaches is incremental checkpointing [5], [6], [7]. This technique involves creating two different types of checkpoints: full and incremental. Full checkpoints contain all the application data. Incremental checkpoints only contain the data that have changed since the last checkpoint. Usually, a fixed number of incremental checkpoints is created in between two full ones. During a restart, the state is restored by using the most recent full checkpoint file and applying all the differences in order before resuming the execution.

Different solutions exist in the literature to implement incremental checkpointing in SLC approaches. One of them is to use the virtual memory page protection mechanisms [5]: upon starting a checkpoint, the pages to be saved are marked as read-only, and when a page is effectively saved into the checkpoint its original status is recovered. When the application tries to write to a read-only page, the race condition is resolved by the fault handler. Another option is to use a kernel-level memory management module that employs a page-table dirty bit scheme [7]. The third classical choice is the hash-based checkpoint [6], which uses a secure hash function to obtain a unique identifier for each block of application memory to be written into the state files. This value is stored and compared against the value calculated for the same block upon creating a new checkpoint. If the two hash values differ, the block contents have changed and the block is stored again in the new checkpoint file.
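As an illustration of the first of these SLC mechanisms, the following simplified sketch (not taken from any of the cited tools) shows how write-protection and a fault handler can be combined to record which pages are dirtied between checkpoints; error handling and async-signal-safety concerns are deliberately omitted.

```cpp
// Simplified sketch of page-protection-based dirty tracking, as used by
// SLC incremental checkpointers. Not production code: calling mprotect()
// inside the signal handler and using a global std::set are for illustration.
#include <csignal>
#include <cstdint>
#include <cstring>
#include <set>
#include <sys/mman.h>
#include <unistd.h>

static std::set<void*> dirty_pages;            // pages written since the last checkpoint
static long page_size = sysconf(_SC_PAGESIZE);

static void write_fault_handler(int, siginfo_t* si, void*) {
    // Identify the faulting page, record it as dirty and make it writable
    // again so that the application can proceed.
    auto addr = reinterpret_cast<uintptr_t>(si->si_addr);
    void* page = reinterpret_cast<void*>(addr & ~static_cast<uintptr_t>(page_size - 1));
    dirty_pages.insert(page);
    mprotect(page, page_size, PROT_READ | PROT_WRITE);
}

void install_fault_handler() {
    struct sigaction sa;
    std::memset(&sa, 0, sizeof(sa));
    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = write_fault_handler;
    sigaction(SIGSEGV, &sa, nullptr);
}

void begin_checkpoint_interval(void* region, size_t length) {
    // Called when a checkpoint starts: forget previously dirtied pages and
    // write-protect the monitored region; subsequent writes will fault.
    dirty_pages.clear();
    mprotect(region, length, PROT_READ);
}
```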


Fig. 1. Construction of an incremental checkpoint

In ALC it is not advisable to track changes to memory blocks using the virtual memory page protection mechanism or dirty bits, since array variables do not necessarily start at page boundaries. Evaluating memory changes for each array as a whole is also not advisable, given the locality principle. The best compromise is to divide array variables into chunks of memory of a previously specified size, assumed to be constant, and to track changes to these chunks using a secure hash function. The hash value calculated for each chunk of memory is kept in memory and used for comparison when creating incremental checkpoints.

When working with real scientific applications it is well known that quite often many elements of the arrays are null, resulting in memory blocks that contain only zeros. Therefore, a possible optimization to further reduce the checkpoint file size is to avoid the storage of those zero-blocks. In addition to tracking the changes in memory blocks, the hash function may be used to detect zero-blocks. Once a zero-block is detected, instead of dumping it into the checkpoint file, a small marker is saved indicating that this block is zero. During restart this marker is identified and the target memory is filled with zeros, which recovers the original state at a negligible cost in terms of both performance and disk usage. The construction of an incremental checkpoint is depicted in Figure 1. The process of restarting from incremental checkpoints is shown in Figure 2: the last available full checkpoint is restored first, and the updates contained in each incremental checkpoint are then applied in order.
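The following is a minimal sketch of the block-level construction just described, assuming a fixed block size and a placeholder digest() standing in for the secure hash (MD5/SHA, see Section III); the on-disk record layout is an illustrative assumption, not the actual CPPC/HDF-5 format.

```cpp
// Sketch of hash-based incremental checkpointing with zero-block exclusion.
#include <algorithm>
#include <cstdint>
#include <ostream>
#include <vector>

using Digest = uint64_t;                       // stand-in for a real hash value

// Placeholder digest (FNV-1a). A real implementation would use a secure
// hash such as MD5 or SHA so that collisions are negligible (see Section III).
Digest digest(const char* data, size_t len) {
    Digest h = 1469598103934665603ull;
    for (size_t i = 0; i < len; ++i) {
        h ^= static_cast<unsigned char>(data[i]);
        h *= 1099511628211ull;
    }
    return h;
}

struct BlockTracker {
    size_t block_size;                         // user-configurable chunk size
    std::vector<Digest> last_hash;             // per-block hashes from the previous checkpoint

    // Writes one (full or incremental) checkpoint of 'array' to 'out'.
    void checkpoint(const char* array, size_t bytes, bool full, std::ostream& out) {
        size_t nblocks = (bytes + block_size - 1) / block_size;
        last_hash.resize(nblocks);
        std::vector<char> zeros(block_size, 0);
        const Digest zero_hash = digest(zeros.data(), block_size);

        for (size_t b = 0; b < nblocks; ++b) {
            size_t off = b * block_size;
            size_t len = std::min(block_size, bytes - off);
            Digest h = digest(array + off, len);
            bool changed = full || h != last_hash[b];
            last_hash[b] = h;                  // hashes are kept in main memory, not on disk
            if (!changed) continue;            // unchanged block: omitted from incremental ckpt
            uint64_t id = b;                   // block position relative to the array start
            if (h == zero_hash)
                id |= 1ull << 63;              // high-order bit marks an excluded zero-block
            out.write(reinterpret_cast<const char*>(&id), sizeof(id));
            if (!(id >> 63))
                out.write(array + off, len);   // only modified, non-zero blocks are stored
        }
    }
};
```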

III. Implementation

The techniques described in Section II have been implemented in CPPC [8], an open-source checkpointing tool available at http://cppc.des.udc.es under GPL license.

A. CPPC overview

CPPC is an application-level checkpointing tool focused on the insertion of fault tolerance into long-running message-passing applications. It is designed with special focus on portability: it uses portable code and protocols, and generates portable checkpoint files, allowing for execution restart on different architectures and/or operating systems.
Fig. 2. Restart from an incremental checkpoint

Fig. 3. Integration of a parallel application with the CPPC framework

CPPC appears to the user as a compiler tool and a runtime library. The integration between the application and the CPPC framework is automatically performed by the CPPC compiler, a source-to-source tool that converts an application code into an equivalent version with added checkpointing capabilities. The global process is depicted in Figure 3. At compile time, the CPPC compiler instruments the code by inserting calls to the CPPC library. At runtime, the application issues requests to the CPPC controller. From the structural point of view, the controller consists of three basic layers: a facade, which keeps track of the state to be stored when the next checkpoint is reached; the checkpointing layer, which gathers, manages and puts together all the data to be stored into the state files; and a writing layer, which decouples the other two layers from the specific file format used for state storage. Currently CPPC provides a writing plugin based on the Hierarchical Data Format 5 (HDF-5) [9], a hierarchical data format and associated library for the portable transfer of graphical and numerical data between computers.

B. Live variable analysis

The live variable analysis explained in Section II-A constitutes one of the passes of the CPPC compiler. It does not currently perform optimal bounds checks for pointer and array variables. This means that some arrays and pointers are registered in a conservative way: they are entirely stored if they are used at any point in the re-executed code. When dealing with calls to precompiled procedures located in external libraries, the default behavior is to assume all parameters to be of input type. Therefore, all variables not previously included that are contained in the set LV_in(s_p), s_p being the analyzed procedure call, will be marked for inclusion.
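As a purely illustrative example of what this instrumentation amounts to, a checkpoint callsite preceded by the registration of the variables in LV_in might look like the sketch below; the function names are hypothetical stand-ins, not the actual CPPC runtime API.

```cpp
// Hypothetical instrumented code; the registration/checkpoint helpers are
// illustrative stubs, not the real CPPC calls.
#include <cstddef>
#include <cstdio>

void register_for_checkpoint(void* addr, size_t bytes, const char* name) {
    std::printf("registering %s (%zu bytes at %p)\n", name, bytes, addr);   // stub
}
void checkpoint_if_due() { /* stub: would dump the registered variables if a checkpoint is due */ }

void solver_step(double* a, double* b, size_t n, int iter) {
    // At this checkpoint callsite the live variable analysis keeps 'a' and
    // 'iter' (both in LV_in: needed after a restart), while 'b' is dead here
    // because it is fully recomputed below, so it is never registered and
    // never reaches the state file.
    register_for_checkpoint(a, n * sizeof(double), "a");
    register_for_checkpoint(&iter, sizeof(iter), "iter");
    checkpoint_if_due();

    for (size_t i = 0; i < n; ++i)
        b[i] = 0.5 * (a[i] + (i + 1 < n ? a[i + 1] : a[i]));
}
```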

C. Incremental checkpointing and zero-blocks exclusion

For the implementation of incremental checkpointing, the HDF-5 writer was modified to divide array variables into blocks of memory. The size of these memory blocks may have a great impact on the performance of the incremental checkpointing technique. CPPC allows the user to choose the size to be applied to each particular application. The new HDF-5 writer also calculates the hash values of the memory blocks. The choice of the hash function impacts correctness, since many hash functions present a significant probability of collisions, that is, situations where a memory block changes from one checkpoint to the next but its hash value remains the same. Secure hash functions should be used to implement reliable incremental checkpointing techniques [10]. The implementation of incremental checkpointing in CPPC allows the user to choose between different secure hash functions, such as MD5 or SHA.

The calculated hash values are used to detect both zero-blocks that can be excluded from the next checkpoint and changes in the memory blocks with respect to previous checkpoints. In order to detect zero-blocks, the calculated hash values are compared to the known hash value of a zero-block. To detect changes in the memory blocks, the hash values calculated in previous checkpoints have to be stored so they can be compared with the new ones. In our implementation, the hash codes are stored in main memory rather than on disk, to improve the performance of the technique. Only the modified blocks with non-zero elements will be stored in the checkpoint file. In order to enable full data recovery during restart, an identifier is stored in the checkpoint file for each modified memory block, including zero-blocks (see Figure 1 and Figure 2). This identifier indicates the original position of the block in memory relative to the start of the array. CPPC uses the high-order bit of the identifier to mark the zero-blocks that are not included in the checkpoint file but should be restored during this step.

In addition to the checkpointing mechanism, the restart mechanism has also been modified to comply with incremental checkpointing as described in Section II-B: the last available full checkpoint is restored first and the modifications contained in the associated incremental checkpoints are then applied in order.
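A minimal sketch of this restart path is given below; as before, the record layout and helper names are illustrative assumptions rather than the actual CPPC/HDF-5 format, but the logic mirrors the description above: the full checkpoint is applied first, the incremental records are replayed in order, and identifiers with the high-order bit set cause the corresponding block to be zero-filled.

```cpp
// Illustrative restart logic for incremental checkpoints (not the real
// CPPC/HDF-5 reader).
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <istream>
#include <vector>

void apply_checkpoint(std::istream& ckpt, char* array, size_t bytes, size_t block_size) {
    uint64_t id;
    while (ckpt.read(reinterpret_cast<char*>(&id), sizeof(id))) {
        bool zero_block = (id >> 63) & 1;        // marker for an excluded zero-block
        size_t block = id & ~(1ull << 63);       // position relative to the array start
        size_t off = block * block_size;
        size_t len = std::min(block_size, bytes - off);
        if (zero_block)
            std::memset(array + off, 0, len);    // restore the zero-block at negligible cost
        else
            ckpt.read(array + off, len);         // overwrite with the stored block contents
    }
}

void restart(std::istream& full, const std::vector<std::istream*>& incrementals,
             char* array, size_t bytes, size_t block_size) {
    apply_checkpoint(full, array, bytes, block_size);      // step 1: last full checkpoint
    for (std::istream* inc : incrementals)                 // step 2: incrementals, in order
        apply_checkpoint(*inc, array, bytes, block_size);
}
```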

TABLE I
Checkpoint sizes per process (in MB)

NPB        SLC      ALC Base   LiveVar   Full      Incr. 1   Incr. 2
--------   ------   --------   -------   ------    -------   -------
BT (B.4)   175.98   109.95     109.76    106.48    95.02     95.02
CG (C.8)   219.85   151.65      66.33     58.74     2.48      2.48
EP (C.8)    67.42     1.18       1.04      1.04     1.04      1.04
FT (C.8)   962.93   768.14     640.10    640.18    512.15    512.15
IS (C.8)   354.52   288.12     288.10    192.18    46.48     47.95
LU (B.8)    93.51    27.16      26.69     26.62     20.03     20.03
MG (C.8)   502.15   435.89     435.85    303.25    303.04    303.04
SP (B.4)   168.55   102.35     102.25     96.36     77.33     77.33

(The last three columns correspond to the incremental checkpointing technique: one full checkpoint followed by two incremental ones.)

IV. Experimental Results

This section assesses the impact of the described optimization techniques on the size of the checkpoint files and on the execution time overheads. A multicore cluster was used to evaluate our proposal. It consists of 16 nodes, each of them powered by two Intel Xeon E5620 quad-core CPUs with 16 GB of RAM. The cluster nodes are connected through an InfiniBand network. The front-end is powered by one Intel Xeon E5502 quad-core CPU with 4 GB of RAM. The connection between the front-end and the execution nodes is also an InfiniBand network. Finally, the working directory is mounted via NFS and is connected to the cluster by a Gigabit Ethernet network.

The application testbed comprised the eight applications in the MPI version of the NAS Parallel Benchmarks v3.1 [11] (NPB from now on). These are well-known and widespread applications that provide a de facto test suite. Out of the NPB suite, the BT, LU and SP benchmarks were run using class B, while the rest were run using class C due to memory constraints. The experiments can be divided into two blocks. The first block analyzes the checkpoint size optimization obtained using the proposed techniques. The second block evaluates the execution overhead caused by the computation of the hash functions, and the restart overhead caused by the restart mechanism of the incremental technique. All the experiments were run on a single node using all 8 cores, except for BT and SP, which ran over 4 cores, as they require a square number of processes. All the checkpoint files were stored in the working directory mounted via NFS.

A. Checkpoint file sizes

The reduction of the checkpoint file size is the main goal of the techniques described in this work. Table I compares the checkpoint file size reductions obtained by the different techniques. The first column shows results for an SLC approach. The second column shows results for an ALC approach without applying any optimization technique (base case), that is, all the user variables are stored in the checkpoint file. The remaining columns show the checkpoint file sizes obtained when using the live variable analysis and the incremental checkpointing technique proposed in this paper. Two incremental checkpoints were created after each full checkpoint.

As can be seen, ALC obtains better results than the SLC approach, and its checkpoint files can be further reduced using the optimization techniques proposed in this paper. Live variable analysis significantly reduces checkpoint file sizes for CG (a 56.26% reduction) and obtains a considerable gain in FT (16.7%). It can be concluded that this technique may have a great influence on reducing file sizes for certain applications and, as it introduces overhead only at compile time, no application can be adversely affected by its use. The incremental checkpointing proposed in this paper achieves important file size reductions for almost all the applications. Note that the reductions achieved in the full checkpoint relative to the live variable technique are due to the elimination of zero-blocks. These reductions may vary with the size of the memory block. Table I shows results for memory blocks of 8K elements, where reductions with respect to the base case range from 3% (BT) to 60% (CG) for the full checkpoint and from 12% (BT) to 98% (CG) for the incremental checkpoints.
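For instance, the 56.26% figure for CG follows directly from Table I: the live variable analysis reduces the per-process checkpoint from 151.65 MB (ALC base) to 66.33 MB, and (151.65 − 66.33)/151.65 ≈ 0.5626. The 98% reduction quoted for the CG incremental checkpoints is obtained in the same way, comparing their 2.48 MB against the 151.65 MB base case.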

TABLE II
Checkpoint latency (in seconds)

NPB   ALC Base   LiveVar   Full    Incr. 1
---   --------   -------   -----   -------
BT     5.65       5.42      5.55    5.26
CG    14.24       6.52      5.42    0.39
EP     0.05       0.07      0.05    0.03
FT    89.56      62.25     65.40   50.29
IS    26.48      26.81     17.10    4.95
LU     2.68       2.78      2.50    1.87
MG    39.36      37.89     28.24   27.33
SP     5.05       5.37      5.21    4.30

(Full and Incr. 1 correspond to the incremental checkpointing technique.)

B. Checkpoint latency

Table II shows the checkpoint latency obtained for the different NPB applications. The checkpoint latency is defined as the elapsed time between the call to the checkpointing function and the return of control to the application. The experiments were repeated 10 times for each application and the minimum time obtained is reported. The main goal of the experiment is to measure the overhead introduced by the computation of the hash functions and the inspections needed to create the incremental checkpoint. The hash function selected in CPPC for these experiments was MD5. From the results, it can be observed that the overhead introduced by the incremental checkpointing technique is hidden by the gain obtained from the reduction in checkpoint size. The results for the creation of the full checkpoint in the incremental technique also allow assessing the gain obtained when solely applying the zero-blocks exclusion. In general, both the live variable analysis, whose overhead is moved to compile time, and the incremental checkpointing technique with zero-blocks exclusion perform better than the base approach. In some cases the reduction in checkpoint overhead can be as high as 30-40% (CG or IS).

CPPC can be configured so that the checkpoint is built in parallel with the execution of the application by creating new threads. Thus, the application execution does not have to be stalled until the checkpoints are created, and the above latencies may be hidden.
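The following sketch illustrates the general idea of overlapping checkpoint creation with computation by writing a copy of the registered state from a separate thread; the helper names and the file handling are illustrative assumptions, not the CPPC interface.

```cpp
// Illustrative overlap of checkpoint writing with computation using a
// detached writer thread; write_checkpoint_file() is an assumed helper.
#include <fstream>
#include <thread>
#include <vector>

void write_checkpoint_file(std::vector<char> snapshot) {
    std::ofstream out("checkpoint.bin", std::ios::binary);   // assumed output file
    out.write(snapshot.data(), static_cast<std::streamsize>(snapshot.size()));
}

void checkpoint_async(const char* state, size_t bytes) {
    // Copy the registered state so the application can keep modifying it,
    // then let a background thread pay the I/O cost of writing it out.
    std::vector<char> snapshot(state, state + bytes);
    std::thread writer(write_checkpoint_file, std::move(snapshot));
    writer.detach();                   // the application continues immediately
}
```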

TABLE III
Restart times (in seconds)

NPB   ALC Base   LiveVar   Zero    Incr.
---   --------   -------   -----   ------
BT     4.44       5.22      4.55    11.15
CG    11.71       6.02      4.56     4.54
EP     0.10       0.09      0.09     0.24
FT    58.36      48.80     49.92   156.46
IS    22.03      22.05     15.01    20.83
LU     2.08       2.04      2.08     4.90
MG    33.12      33.12     23.81    66.14
SP     4.31       4.32      4.08     9.42

C. Restart overhead

Restart times are shown in Table III. The restart time includes reading the checkpoint files and restarting the application to the point where the checkpoint was dumped. Again, at least 10 executions were performed for each application and the minimum time obtained is reported. The memory was freed before each execution to avoid the effect of the page cache and to guarantee that the checkpoint files are read from disk. The column labeled Zero shows the restart overhead when there is no incremental checkpoint file, only the full one. These results allow evaluating the overhead when applying only the zero-blocks exclusion, which is always lower than the overhead of the base approach. The incremental checkpointing technique presents a high overhead at restart compared to the others. This is due to the larger volume of data to be moved and read in the case of incremental checkpointing (which can be calculated as the sum of the last three columns of Table I). A possible approach to reduce this overhead would be to merge the full checkpoint file and the incremental ones into a single file at the checkpoint server before a restart is required. The number of incremental checkpoints will have a great influence on the restart overhead. Studies exist [12] that provide a model to determine the optimal number of incremental checkpoints between two consecutive full checkpoints.

V. Related Work

The live variable analysis presented in Section II-A can be seen as a complementary approach to the memory exclusion techniques proposed by Plank et al. [13]. As regards incremental checkpointing, as mentioned in Section II-B, there exist in the literature a number of techniques to implement it in SLC [5], [6], [7], [14], [15], [16]. The implementation proposed in this paper is inspired by the hash-based approaches [6], [16], but it is intended for ALC. Using an application-level approach drastically reduces the number of memory blocks to be checked at runtime and, thus, the overhead of the approach. The reduction in the number of analyzed blocks also implies a reduction in the size of the hash tables to be stored. This reduction allows these tables to be kept in main memory instead of on disk, further reducing the overhead of the technique. Additionally, the size of the generated checkpoint files is reduced through the detection and elimination of zero-blocks. The idea of not storing zero-blocks has a certain similarity to the technique used in the SLC tool Berkeley Lab Checkpoint/Restart (BLCR) [17] to exclude zero pages, that is, those that have never been touched and logically contain all zeros.

Another alternative present in the literature to reduce the checkpoint size is data compression. It was implemented, for instance, in the CATCH compiler [18] and the ickp checkpointer [19]. Experimental results show that compression significantly reduces checkpoint sizes. However, the potential benefits of compression for reducing the overhead of checkpointing depend on the time required to compress the data, the compression rate and the ratio of processor speed to disk speed. CATCH also implements adaptive checkpointing, that is, it uses a heuristic algorithm to determine the optimal places, in terms of checkpoint size, to insert checkpoints. This technique could be useful for programs with large variations in memory usage.

All the techniques mentioned so far focus on the reduction of the checkpoint file sizes. Another way to reduce the computational and I/O cost of checkpointing is to avoid the storage of checkpoint files on the parallel file system. In [20] Plank et al. proposed to replace stable storage with memory and processor redundancy. Some recent works [21], [22], [23] have adapted this technique, known as diskless checkpointing, to contemporary architectures. The main drawback of diskless checkpointing is its large memory requirements.

VI. Concluding Remarks

This work has analyzed different alternatives to reduce the size of the checkpoint files generated in ALC approaches: live variable analysis, zero-blocks elimination and incremental checkpointing. The techniques have been implemented in an ALC tool, CPPC, obtaining important file size reductions. The reduction of the checkpoint sizes will be particularly useful for parallel applications with a large number of parallel processes, where the transfer of a large amount of checkpoint data to stable storage can saturate the network and cause a drop in application performance.

The results have shown that incremental checkpointing is the most effective of the three explored methods in terms of checkpoint size reduction. However, global storage requirements increase for this technique, as it is necessary to keep stored at least one full checkpoint and all the incremental ones associated with it. Additionally, it complicates the restart, resulting in an overhead that may become important depending on the number of incremental checkpoints and the characteristics of the network. The results indicate that merging the checkpoint files before restarting could significantly reduce restart times. As regards the live variable analysis and zero-block elimination techniques, the checkpoint size reductions obtained are not as significant; however, they decrease the global storage demand and are able to reduce the overhead of both the checkpoint file writing and the restart phase.

At present, our implementation of the live variable analysis does not perform optimal bounds checks for pointer and array variables. This means that they are entirely stored if they are used at any point in the re-executed code. Thus, there is still room for future optimizations in this compilation analysis.

Acknowledgment

This research was supported by the Ministry of Science and Innovation of Spain (Project TIN2010-16735) and by the Galician Government (Project 10PXIB105180PR).

References
[1] Bianca Schroeder and Garth A. Gibson, "Understanding failures in petascale computers," Journal of Physics: Conference Series, vol. 78, no. 1, p. 012022, 2007.
[2] E.N. Elnozahy, L. Alvisi, Y.-M. Wang, and D.B. Johnson, "A survey of rollback-recovery protocols in message-passing systems," ACM Computing Surveys, vol. 34, no. 3, pp. 375-408, 2002.
[3] Franck Cappello, "Fault tolerance in petascale/exascale systems: Current knowledge, challenges and research opportunities," International Journal of High Performance Computing Applications, vol. 23, no. 3, pp. 212-226, 2009.
[4] Kamil Iskra, John W. Romein, Kazutomo Yoshii, and Pete Beckman, "ZOID: I/O-forwarding infrastructure for petascale architectures," in Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'08), 2008, pp. 153-162, ACM.
[5] J. S. Plank, J. Xu, and R. H. B. Netzer, "Compressed differences: an algorithm for fast incremental checkpointing," Tech. Rep. CS-95-302, University of Tennessee, Department of Computer Science, Aug. 1995.
[6] S. Agarwal, R. Garg, and M. S. Gupta, "Adaptive incremental checkpointing for massively parallel systems," in Proceedings of the 18th Annual International Conference on Supercomputing (ICS'04), Saint Malo, France, 26 June-01 July 2004, pp. 277-286, ACM, New York.
[7] R. Gioiosa, J. C. Sancho, S. Jiang, and F. Petrini, "Transparent, incremental checkpointing at kernel level: a foundation for fault tolerance for parallel computers," in Proceedings of the ACM/IEEE SC2005 Conference on High Performance Networking and Computing (SC'05), Seattle, WA, USA, 12-18 November 2005, IEEE Computer Society Press, Los Alamitos.
[8] G. Rodríguez, M.J. Martín, P. González, J. Touriño, and R. Doallo, "CPPC: A compiler-assisted tool for portable checkpointing of message-passing applications," Concurrency and Computation: Practice and Experience, vol. 22, no. 6, pp. 749-766, 2010.
[9] The HDF5 Group, "HDF-5: Hierarchical Data Format," http://www.hdfgroup.org/HDF5/. Last accessed June 2010.
[10] Hyo-Chang Nam, Jong Kim, SungJe Hong, and Sunggu Lee, "Secure checkpointing," Journal of Systems Architecture, vol. 48, pp. 237-254, 2003.
[11] National Aeronautics and Space Administration, "The NAS Parallel Benchmarks," http://www.nas.nasa.gov/Software/NPB. Last accessed June 2010.
[12] Nichamon Naksinehaboon, Yudan Liu, Chokchai (Box) Leangsuksun, Raja Nassar, Mihaela Paun, and Stephen L. Scott, "Reliability-aware approach: An incremental checkpoint/restart model in HPC environments," in Proceedings of the 2008 Eighth IEEE International Symposium on Cluster Computing and the Grid, 2008, pp. 783-788.
[13] J.S. Plank, M. Beck, and G. Kingsley, "Compiler-assisted memory exclusion for fast checkpointing," IEEE Technical Committee on Operating Systems and Application Environments, vol. 7, no. 4, pp. 10-14, 1995.
[14] E.N. Elnozahy, D.B. Johnson, and W. Zwaenepoel, "The performance of consistent checkpointing," in Proceedings of the 11th Symposium on Reliable Distributed Systems, Oct. 1992, pp. 39-47.
[15] J. S. Plank, M. Beck, G. Kingsley, and K. Li, "Libckpt: Transparent checkpointing under Unix," in Usenix Winter Technical Conference, January 1995, pp. 213-223.
[16] Hyo-Chang Nam, Jong Kim, SungJe Hong, and Sunggu Lee, "Probabilistic checkpointing," in Digest of Papers, Twenty-Seventh Annual International Symposium on Fault-Tolerant Computing (FTCS-27), June 1997, pp. 48-57.
[17] Lawrence Berkeley National Laboratory, "Berkeley Lab Checkpoint/Restart," https://ftg.lbl.gov/CheckpointRestart/. Last accessed December 2010.
[18] C.-C.J. Li and W.K. Fuchs, "CATCH: compiler-assisted techniques for checkpointing," in Digest of Papers, 20th International Symposium on Fault-Tolerant Computing (FTCS-20), June 1990, pp. 74-81.
[19] James S. Plank and Kai Li, "ickp: A consistent checkpointer for multicomputers," IEEE Parallel & Distributed Technology, vol. 2, pp. 62-67, June 1994.
[20] J. S. Plank, K. Li, and M. A. Puening, "Diskless checkpointing," IEEE Transactions on Parallel and Distributed Systems, vol. 9, no. 10, pp. 972-986, October 1998.
[21] Zizhong Chen, Graham E. Fagg, Edgar Gabriel, Julien Langou, Thara Angskun, George Bosilca, and Jack Dongarra, "Fault tolerant high performance computing by a coding approach," in Proceedings of the Tenth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'05), New York, NY, USA, 2005, pp. 213-223, ACM.
[22] Leonardo Arturo Bautista Gomez, Naoya Maruyama, Franck Cappello, and Satoshi Matsuoka, "Distributed diskless checkpoint for large scale systems," in Cluster Computing and the Grid, IEEE International Symposium on, 2010, pp. 63-72.
[23] Ge-Ming Chiu and Jane-Ferng Chiu, "A new diskless checkpointing approach for multiple processor failures," IEEE Transactions on Dependable and Secure Computing, vol. 99, no. PrePrints, 2010.
