Title | Compiler-Assisted Checkpointing for MPI Programs |
Speaker | Alison N. Smith |
From | Department of Computer Science, University of Texas at Austin |
Date | Tuesday, April 29, 2003 |
Time | 1-2pm (PST), 2-3pm (MNT) |
Location | Bldg. 921, Room 137 (Sandia - CA); Bldg. 980, Room 95 (Sandia - NM) |
Abstract | This talk addresses issues of fault tolerance in large-scale supercomputing clusters. In particular, we discuss ways that compiler technology can transparently support efficient checkpointing of MPI programs. Our main goal is to place checkpoints strategically, both to reduce bandwidth contention among processes and to reduce checkpoint size within each process. In contrast to other proposed techniques, our solution introduces no extra messages and no useless checkpoints, which keeps failure-free overhead low. This talk discusses both our ongoing efforts and our future plans. |
About the Speaker | Alison is a third-year Ph.D. student in Computer Science at The University of Texas at Austin. Her graduate advisor is Professor Calvin Lin. |
Host | Patty Hough, 08962, (925) 294-1518 |