The CPPC Project

Controller/comPiler for Portable Checkpointing

CPPC is a checkpointing tool for adding fault tolerance to long-running message-passing applications. It is designed to allow execution to restart on different architectures and/or operating systems, and it also supports checkpointing over heterogeneous systems such as the Grid. It uses portable code and protocols and generates portable checkpoint files, while avoiding traditional solutions that add unscalable overhead, such as process coordination or message logging.

CPPC is made up of a library and a compiler. The library contains routines for variable-level checkpointing. The compiler helps achieve transparency by relieving the user of time-consuming tasks, such as performing data flow and communication analyses and adding instrumentation code. The result is a checkpointer that is fully automatic yet still optimizes performance by taking the details of the code into account.
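
To make the idea of variable-level checkpointing concrete, here is a minimal sketch in Python. It is not the CPPC library or its API: the `VariableCheckpointer` class, its `register`, `checkpoint`, and `restore` methods, and the `run` driver are hypothetical names chosen for illustration, and JSON is used only as a stand-in for a portable checkpoint format. The point is that only the variables registered as needed for restart are saved, rather than the whole process image.

```python
import json
import os
import tempfile


class VariableCheckpointer:
    """Hypothetical variable-level checkpointer (illustration only, not the CPPC API)."""

    def __init__(self, path):
        self.path = path
        self._getters = {}  # variable name -> callable returning its current value

    def register(self, name, getter):
        """Mark a variable as needed for restart."""
        self._getters[name] = getter

    def checkpoint(self):
        """Save only the registered variables in a text format (JSON as a stand-in)."""
        state = {name: get() for name, get in self._getters.items()}
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
        # Rename over the old file so a crash never leaves a half-written checkpoint.
        os.replace(tmp, self.path)

    def restore(self):
        """Return the saved variables, or None if no checkpoint file exists."""
        if not os.path.exists(self.path):
            return None
        with open(self.path) as f:
            return json.load(f)


def run(path, steps, fail_at=None):
    """Toy long-running loop: sums 0..steps-1, checkpointing after every iteration."""
    ckpt = VariableCheckpointer(path)
    # Resume from the checkpoint if there is one, otherwise start from scratch.
    state = ckpt.restore() or {"i": 0, "total": 0}
    ckpt.register("i", lambda: state["i"])
    ckpt.register("total", lambda: state["total"])
    while state["i"] < steps:
        if fail_at is not None and state["i"] == fail_at:
            raise RuntimeError("simulated failure")
        state["total"] += state["i"]
        state["i"] += 1
        ckpt.checkpoint()
    return state["total"]
```

In this sketch the registration calls are written by hand. In the description above, that kind of instrumentation is what the compiler is meant to insert automatically after analysing the code.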