[BANANA] ASC ITS Lecture Series announces Prof. Jack Dongarra

Linda Becker becker22 at llnl.gov
Mon May 2 13:52:21 PDT 2005


ASC Institute for Terascale Simulation
Lecture Series 2005

"An Overview of High Performance Computing and
Self-Adapting Numerical Software"

Jack Dongarra
University Distinguished Professor of Computer Science
University of Tennessee

Tuesday, May 17, 2005
2:00pm
B453 R1001 (Armadillo Room)
(unclassified, property protected area)


Abstract:

Professor Dongarra will look at how High Performance Computing (HPC) has 
changed over the last 10 years and examine future trends. A new generation 
of software libraries and algorithms is needed for the effective and 
reliable use of (wide-area) dynamic, distributed and parallel environments. 
Some of the software and algorithm challenges have already been 
encountered, such as management of communication and memory hierarchies 
through a combination of compile-time and run-time techniques. However, the 
increased scale of computation, depth-of-memory hierarchies, range of 
latencies, and increased run-time environment variability will make these 
problems much harder.

As the number of processors in today's high-performance computers continues 
to grow, the mean time to failure (MTTF) of these computers is becoming 
significantly shorter than the execution time of many current 
high-performance computing applications. Although today's architectures are 
usually robust enough to survive node failures without suffering a complete 
system breakdown, most of today's high-performance computing applications 
are not. Whenever there is a node failure, these applications have to abort 
and restart from either the beginning or a stable, storage-based checkpoint.

Along these lines, we will discuss the development of fault-tolerant linear 
algebra algorithms and present an approach to building fault-survivable, 
high-performance computing applications using diskless checkpointing with 
FT-MPI. We will also give a detailed presentation on how to write a 
fault-survivable application with FT-MPI and evaluate the performance 
overhead of our fault-tolerance approach by using a preconditioned 
conjugate gradient equation solver as an example. Experimental results 
demonstrate that we can survive a small number of simultaneous processor 
failures with low performance overhead and little numerical impact.
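
As a rough illustration of the diskless checkpointing idea mentioned above 
(not the speaker's implementation and not the FT-MPI API), the sketch below 
simulates four processes in plain Python. It assumes a single extra process 
keeps a floating-point sum of every process's local state in memory, so one 
lost state can be rebuilt without touching disk. All function and variable 
names are hypothetical.

```python
# Illustrative sketch only: in-memory (diskless) checkpointing with one
# checksum process. Names are hypothetical; this is not the FT-MPI API.

def encode_checksum(states):
    """Return the element-wise sum of every process's local state vector."""
    return [sum(column) for column in zip(*states)]


def recover(states, checksum, failed_rank):
    """Rebuild the failed process's state from the checksum and survivors."""
    survivors = [s for rank, s in enumerate(states) if rank != failed_rank]
    survivor_sum = [sum(column) for column in zip(*survivors)]
    return [c - s for c, s in zip(checksum, survivor_sum)]


# Four simulated processes, each holding a local piece of the solver state.
states = [[1.0, 2.0], [3.5, -1.0], [0.25, 4.0], [2.0, 2.0]]
checksum = encode_checksum(states)  # held in the memory of a spare process

lost = states[2]
states[2] = None                    # simulate the failure of process 2
restored = recover(states, checksum, failed_rank=2)
```

Because the checksum is a floating-point sum, the recovered values can 
differ from the originals by rounding error in general. A single checksum 
of this kind covers only one failure at a time; surviving several 
simultaneous failures would need additional, independent checksums.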


Speaker Bio:

Jack Dongarra received a Bachelor of Science in Mathematics from Chicago 
State University in 1972 and a Master of Science in Computer Science from 
the Illinois Institute of Technology in 1973. He received his Ph.D. in 
Applied Mathematics from the University of New Mexico in 1980. He worked at 
the Argonne National Laboratory until 1989, becoming a senior scientist. He 
now holds an appointment as University Distinguished Professor of Computer 
Science in the Computer Science Department at the University of Tennessee, 
holds the title of Distinguished Research Staff in the Computer Science 
and Mathematics Division at Oak Ridge National Laboratory (ORNL), and is an 
Adjunct Professor in the Computer Science Department at Rice University. He 
is the director of the Innovative Computing Laboratory at the University of 
Tennessee, which has a staff of 50 people doing research in the area of 
high-performance computing. He is also the director of the Center for 
Information Technology Research at the University of Tennessee, which 
coordinates and facilitates IT research efforts at the University.

Dongarra specializes in numerical algorithms in linear algebra, parallel 
computing, the use of advanced computer architectures, programming 
methodology, and tools for parallel computers. His research includes the 
development, testing and documentation of high-quality mathematical 
software. He has contributed to the design and implementation of the 
following open source software packages and systems: EISPACK, LINPACK, the 
BLAS, LAPACK, ScaLAPACK, Netlib, PVM, MPI, NetSolve, Top500, ATLAS, and 
PAPI. He has published approximately 200 articles, papers, reports and 
technical memoranda, and he is a coauthor of several books. He was awarded the 
IEEE Sid Fernbach Award in 2004 for his contributions in the application of 
high performance computers using innovative approaches. He is a Fellow of 
the AAAS, ACM, and the IEEE and a member of the National Academy of 
Engineering.


Technical Contact:
Steven Lee
(925) 424-5989
lee117 at llnl.gov

Administrative Contact:
Linda Becker
(925) 423-0421
becker22 at llnl.gov


**********************************************************
Linda Becker, Division Administrator
Institute for Scientific Computing Research (ISCR)
Lawrence Livermore National Laboratory
P.O. Box 808, L-419
Livermore, CA 94551

phone:  (925) 423-0421
fax:  (925) 422-7819
email: becker22 at llnl.gov
URL:  http://www.llnl.gov/iscr/people/becker/

**********************************************************