Title Compiler-Assisted Checkpointing for MPI Programs
Speaker Alison N. Smith
E-mail
From Department of Computer Science, University of Texas at Austin
Date Tuesday, April 29, 2003
Time 1-2pm (PST), 2-3pm (MNT)
Location Bldg. 921, Room 137 (Sandia - CA); Bldg. 980, Room 95 (Sandia - NM)

Abstract This talk addresses fault tolerance in large-scale supercomputing clusters. In particular, we discuss how compiler technology can transparently support efficient checkpointing of MPI programs. Our main goal is to place checkpoints strategically, both to reduce bandwidth contention among processes and to reduce checkpoint size within each process. In contrast to other proposed techniques, our solution sends no extra messages and takes no useless checkpoints, which keeps failure-free overhead low. This talk covers both our ongoing efforts and our future plans.
About the Speaker Alison is a third-year Ph.D. student in Computer Science at The University of Texas at Austin. Her graduate advisor is Professor Calvin Lin.
Host Patty Hough, 08962, (925) 294-1518