
    FY03 Performance Expedition Milestones

    Project Start Date: Tue 10/1/02
    Project End Date: Tue 9/30/03

    Performance Engineering for Clusters and Grids

    Tasks (sorted by ID)

    Each entry below gives the task ID and Task Name, the Start and Finish dates,
    the Status and Task Dependency (where reported), and the Resource Names (where
    assigned).

    1. Performance Engineering for Clusters and Grids (Tue 10/1/02 - Tue 9/30/03)
       Resources: Expedition co-leaders: William Gropp (Argonne) and Valerie Taylor (Texas A&M)

    2. Performance Analysis (Tue 10/1/02 - Tue 9/30/03)
       Status: Working with the MEAD expedition.
       Resources: Taylor (Texas A&M)

    3. Make PAIDE available to the Alliance community (Tue 10/1/02 - Tue 12/31/02)
       Status: Completed. See the Prophesy link on the Tools webpage (http://prophesy.cs.tamu.edu).

    4. Begin analyzing the MEAD codes; explore the use of data partitioning for distributed systems with the MEAD codes (Tue 10/1/02 - Tue 12/31/02)
       Status: Started analyzing the MEAD codes; exploring the use of MEAD data partitioning on distributed systems.

    5. Make the Prophesy database available to the community; continue work with the MEAD codes (Wed 1/1/03 - Mon 3/31/03)
       Status: Completed. See the Prophesy link on the Tools webpage.

    6. Release the automated modeling component of Prophesy to the community (Tue 4/1/03 - Mon 6/30/03)
       Status: Completed. See the Prophesy link on the Tools webpage.

    7. Obtain initial performance analysis and static load balancing results from the MEAD codes (Tue 4/1/03 - Mon 6/30/03)

    8. Continue work with different modeling techniques to be incorporated into Prophesy; continue work with the MEAD codes (Tue 7/1/03 - Tue 9/30/03)

    9. PAPI (Tue 10/1/02 - Tue 9/30/03)
       Status: Collaborating with NCSA to support use of PAPI and PAPI-related tools by the Community Application Codes Expedition (see the usage sketch after this task group).
       Resources: Dongarra (Tennessee), Moore (Tennessee)

    10. Port PAPI to the Intel Itanium2 (McKinley) processor (Tue 10/1/02 - Tue 12/31/02)
        Status: Initial port has been completed and was included in PAPI 2.3.1, released in November 2002. The Itanium2 port will be further refined for the PAPI 3.0 release.

    11. Complete the PAPI User's Guide (Tue 10/1/02 - Tue 12/31/02)
        Status: Completed and released with PAPI 2.3.1.

    12. Release PAPI version 3 (Tue 12/31/02 - Tue 12/31/02)
        Status: Still under development. PAPI version 3 is a complete rewrite that will be more flexible and have lower overheads than the current version. We backed off to the 2.3.1 release in November 2002. We hope to release a version 3 beta by May 2003.

    13. Add multiprocess capability to the Dynaprof dynamic instrumentation tool (Wed 1/1/03 - Mon 3/31/03)

    14. Compare counting vs. sampling modes of using the Itanium processor performance monitoring unit (Wed 1/1/03 - Mon 3/31/03)

    15. Investigate the use of Event Address Registers and event qualification (Wed 1/1/03 - Mon 3/31/03)

    16. Design and implement a papirun utility (Tue 4/1/03 - Mon 6/30/03)
        Status: Started.

    17. Prototype implementation of new PAPI features (Tue 4/1/03 - Mon 6/30/03)

    18. Release PAPI 3.5 with additional features; document the use of PAPI and related tools with applications (Tue 9/30/03 - Tue 9/30/03)
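
    For reference, the sketch below shows how an application code might use the
    PAPI C library to count hardware events around a compute kernel, assuming the
    PAPI 3.x C interface; the chosen events and the loop are illustrative only and
    are not taken from the expedition's codes.

        #include <stdio.h>
        #include <stdlib.h>
        #include <papi.h>

        int main(void)
        {
            int eventset = PAPI_NULL;
            int i;
            long_long counts[2];     /* PAPI's 64-bit counter type */
            double a = 0.0;

            /* Initialize the library and build an event set with two preset events. */
            if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
                exit(1);
            if (PAPI_create_eventset(&eventset) != PAPI_OK ||
                PAPI_add_event(eventset, PAPI_TOT_CYC) != PAPI_OK ||
                PAPI_add_event(eventset, PAPI_TOT_INS) != PAPI_OK)
                exit(1);

            PAPI_start(eventset);
            for (i = 0; i < 1000000; i++)     /* illustrative compute kernel */
                a += 1.0e-6 * (double)i;
            PAPI_stop(eventset, counts);

            printf("cycles = %lld, instructions = %lld (a = %g)\n",
                   counts[0], counts[1], a);
            return 0;
        }

    Compile and link against the PAPI library (typically -lpapi); tools such as
    Dynaprof and papirun build on the same counter interface.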

    19. Performance Tools for Parallel and Grid Computing (Tue 10/1/02 - Tue 9/30/03)
        Resources: Reed (UIUC/NCSA)

    20. Deploy the latest version of SvPablo on IA-64 clusters (Tue 12/31/02 - Tue 12/31/02)
        Status: Completed.

    21. Port the PCF tool to the IA-64 environment (Tue 10/1/02 - Tue 12/31/02)
        Status: Completed.

    22. Present mini-tutorials on SvPablo and PCF to Alliance users (Tue 10/1/02 - Tue 12/31/02)
        Status: Scheduled for Spring 2003.

    23. Perform SvPablo instrumentation and characterization of the TPM code (Wed 1/1/03 - Mon 3/31/03)
        Status: In progress.

    24. Install PCF on NCSA's IA-64 cluster (Wed 1/1/03 - Mon 3/31/03)
        Status: Completed.

    25. Update the SvPablo installation to be compatible with PAPI 3.0 (Wed 1/1/03 - Mon 3/31/03)
        Status: Not started (waiting for PAPI 3.0).

    26. Improve TPM code performance (Tue 4/1/03 - Mon 6/30/03)
        Status: In progress.

    27. Perform SvPablo instrumentation and characterization of the nano-biology code (Tue 4/1/03 - Mon 6/30/03)

    28. Instrument PCF and conduct I/O characterization of AIPS++ (Tue 4/1/03 - Mon 6/30/03)
        Status: Started ahead of the planned start date.

    29. Update SvPablo and PCF installations on NCSA's IA-64 cluster (Tue 4/1/03 - Mon 6/30/03)

    30. Improve nano-biology code performance; improve AIPS++ I/O performance; update SvPablo and PCF installations on NCSA's IA-64 cluster (Tue 7/1/03 - Tue 9/30/03)

    31. Compiler and Programming System Technologies for DTF (Tue 10/1/02 - Tue 9/30/03)
        Status: Working with the MEAD expedition on performance tuning of climate codes.
        Resources: Kennedy (Rice), Fowler (Rice)

    32. Demonstrate NCOMMAS and tools in the Alliance booth at Supercomputing 2002; demonstrate performance tools and transformation technology in the Rice booth (Mon 11/18/02 - Fri 11/22/02)
        Status: Completed.

    33. Release a completely open source version of the HPCView suite; release an open source replacement for bloop that runs on Itanium2 (Tue 12/31/02 - Tue 12/31/02)
        Status: Completed. See the HPCView link on the Tools webpage.

    34. Extend documentation of the Open64-based source-to-source transformation infrastructure (Tue 10/1/02 - Tue 12/31/02)
        Status: A draft of the documentation is included in the distribution. We are working on reorganizing the distribution and installation processes; a revision of the documentation will be part of this.

    35. Hold a workshop/tutorial in conjunction with the Alliance All Hands Meeting (Mon 5/26/03 - Fri 5/30/03)
        Status: Being planned.

    36. Demonstrate source-to-source transformation technology to generate code compatible with the SHMOD library (Tue 4/1/03 - Mon 6/30/03)

    37. Port source-to-source transformation tools for use with NCOMMAS and other applications to the Open64 infrastructure; release on the Web site (Tue 9/30/03 - Tue 9/30/03)

    38. SHMOD (Tue 10/1/02 - Tue 9/30/03)
        Status: Working with the MEAD expedition and the team at Rice University.
        Resources: Woodward (Minnesota)

    39. Demonstrate fault-tolerant execution of NCOMMAS under the SHMOD framework on the Titan cluster (Tue 10/1/02 - Tue 12/31/02)
        Status: Completed.

    40. Make available on the Web the SHMOD library for the Titan cluster, the sPPM example code with SHMOD, and documentation (Wed 1/1/03 - Mon 3/31/03)
        Status: Expected completion 3/31/03.

    41. Demonstrate and document high parallel performance of the PPM and NCOMMAS codes on an IA-64 cluster (Tue 4/1/03 - Wed 4/30/03)
        Status: Expected completion 4/30/03.

    42. Demonstrate on-demand computing capability using PPM and NCOMMAS in the SHMOD framework (Thu 5/1/03 - Tue 9/30/03)
        Status: Expected completion 9/30/03.

    43. Demonstrate backfill time skewing using either PPM or NCOMMAS on an IA-64 cluster (Tue 9/30/03 - Tue 9/30/03)
        Status: Expected completion 2/28/04.

    44. Demonstrate and document high parallel performance of the PPM and NCOMMAS codes using SHMOD on the TeraGrid (Tue 9/30/03 - Tue 9/30/03)
        Status: Expected completion 8/31/04. A refined SHMOD library, a simplified example code, and a precompiler tool to ease the programming burden will be available on the Web by 9/30/04.

    45. Parallel Performance (Tue 10/1/02 - Tue 9/30/03)
        Resources: Gropp (ANL)

    46. (1) Parallel Numerical Libraries (Tue 10/1/02 - Tue 9/30/03)

    47. Measure PETSc performance on IA-64 platforms; identify key under-performing operations (Wed 1/1/03 - Mon 3/31/03)
        Status: Completed.

    48. Develop new viewers for PETSc vectors using the NetCDF format (Tue 10/1/02 - Tue 12/31/02)
        Status: Completed. (A minimal NetCDF write sketch follows this subsection.)

    49. Optimize vector scatter operations in PETSc (Tue 4/1/03 - Mon 6/30/03)
        Status: In progress.

    50. Develop IA-64-optimized implementations of key PETSc operations; integrate these into the PETSc source tree and test against applications (Tue 7/1/03 - Tue 9/30/03)
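
    Task 48 concerns viewers that write PETSc vectors in NetCDF format. As a point
    of reference only, the sketch below uses the classic NetCDF C interface to
    write a one-dimensional double array to a file; the file, variable, and
    dimension names are made up for illustration and are not taken from the PETSc
    viewer itself.

        #include <stdio.h>
        #include <stdlib.h>
        #include <netcdf.h>

        /* Abort with the library's error string if a NetCDF call fails. */
        static void check(int status)
        {
            if (status != NC_NOERR) {
                fprintf(stderr, "NetCDF error: %s\n", nc_strerror(status));
                exit(1);
            }
        }

        int main(void)
        {
            double data[8] = {0, 1, 2, 3, 4, 5, 6, 7};  /* stand-in for local vector values */
            int ncid, dimid, varid;

            check(nc_create("vector.nc", NC_CLOBBER, &ncid));        /* create/overwrite the file */
            check(nc_def_dim(ncid, "n", 8, &dimid));                 /* one dimension of length 8 */
            check(nc_def_var(ncid, "vec", NC_DOUBLE, 1, &dimid, &varid));
            check(nc_enddef(ncid));                                  /* leave define mode */
            check(nc_put_var_double(ncid, varid, data));             /* write the whole variable */
            check(nc_close(ncid));
            return 0;
        }

    Compile with -lnetcdf. The actual viewer operates on distributed PETSc
    vectors, which this serial sketch does not attempt to show.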

    51. (2) Parallel I/O and PVFS (Tue 10/1/02 - Tue 9/30/03)

    52. Port available performance benchmarks from the Parallel I/O Benchmarking Consortium (PIOBench) to the IA-64 platform; evaluate PVFS and ROMIO performance on the IA-64 platform using these benchmarks; identify potential improvements (Tue 10/1/02 - Mon 3/31/03)
        Status: The "Tile I/O" test now runs on Linux and IBM AIX platforms. (A minimal MPI-IO sketch of the kind of access pattern these benchmarks exercise follows this subsection.)

    53. Improve PVFS and ROMIO implementations based on Q2 findings; identify applications that might benefit from PVFS (Tue 4/1/03 - Mon 6/30/03)
        Status: So far, single-node performance of PVFS with the iotest benchmark has not been particularly good.

    54. Expand the collection of available PIOBench benchmarks; evaluate application performance on top of PVFS; identify potential improvements in PVFS and possible improvements to applications (Tue 7/1/03 - Tue 9/30/03)
        Status: Additional effort is being placed on the Parallel I/O Benchmarking Consortium.
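
    The PVFS and ROMIO evaluations in tasks 52-54 exercise MPI-IO access patterns.
    The sketch below is a minimal, self-contained example of each process writing
    its own contiguous block of a shared file through MPI-IO (the interface ROMIO
    implements); the file name and block size are illustrative, not taken from the
    benchmarks themselves.

        #include <stdio.h>
        #include <mpi.h>

        #define BLOCK 1024   /* illustrative number of ints written per process */

        int main(int argc, char *argv[])
        {
            int rank, i, buf[BLOCK];
            MPI_File fh;
            MPI_Offset offset;

            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            for (i = 0; i < BLOCK; i++)
                buf[i] = rank;                 /* fill the local block with the rank */

            /* Each rank writes its block at a rank-dependent offset in one shared file. */
            offset = (MPI_Offset)rank * BLOCK * sizeof(int);
            MPI_File_open(MPI_COMM_WORLD, "testfile.out",
                          MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
            MPI_File_write_at(fh, offset, buf, BLOCK, MPI_INT, MPI_STATUS_IGNORE);
            MPI_File_close(&fh);

            MPI_Finalize();
            return 0;
        }

    Run under mpiexec with the file placed on PVFS-backed storage to observe the
    kind of concurrent-access behavior the PIOBench-style tests measure.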

    55. Parallel Virtual File System V2 Development (Tue 10/1/02 - Tue 9/30/03)
        Resources: Ligon (Clemson)

    56. Implementation and coding of the server request engine and system interface; refinement of the transport modules (BMI, Trove, and Flows) (Tue 10/1/02 - Wed 1/1/03)
        Status: Completed.

    57. Testing and integration of transport modules, including data distribution and request processing modules; testing and integration of the server request engine with the job and transport modules; testing and integration of the system interface with the job and transport modules; code reviews (Wed 1/1/03 - Mon 3/31/03)
        Status: 75% complete. Successful tests of data transfer requests on a server; successful testing of most of the server request processing code; updates to the job system and the Trove storage system.

    58. First end-to-end system; extensive testing, debugging, and refinement; adding features (Tue 4/1/03 - Mon 6/30/03)
        Status: 15% complete. Successful end-to-end test of a request from the client interface to the servers and back again; some testing of the client interface systems; extensive testing and refinement under way.

    59. Target date for first release; performance studies and tuning; user documentation development and testing (Tue 7/1/03 - Tue 9/30/03)

    60. Monitoring and Discovery Service (MDS) extensions for the Alliance and TeraGrid (Tue 10/1/02 - Tue 9/30/03)
        Resources: Schopf (ANL), Evard (ANL)

    61. Implement a "software" information provider to allow listing in the MDS basic information (Tue 10/1/02 - Tue 12/31/02)
        Status: Completed. See http://www.thecodefactory.org/mds/info-providers/.

    62. Supply flexible PHP scripts to the TeraGrid for ease of access to MDS data (Wed 1/1/03 - Mon 3/31/03)
        Status: Completed. Available from www.mcs.anl.gov/~jms/new.html. (A minimal MDS query sketch follows this task group.)

    63. Define any needed new information providers for use by the TeraGrid (Tue 4/1/03 - Mon 6/30/03)
        Status: Under evaluation.

    64. Implement another identified information provider, as prioritized by the operations group (Tue 7/1/03 - Tue 9/30/03)
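
    Tasks 61-64 concern MDS information providers and access to MDS data. MDS 2.x
    is LDAP-based, so as a rough illustration only, the sketch below queries an MDS
    server anonymously with the OpenLDAP C API; the host name is a placeholder, and
    port 2135 and the base DN follow the customary MDS 2.x conventions rather than
    any TeraGrid-specific configuration.

        #define LDAP_DEPRECATED 1   /* expose ldap_init/ldap_simple_bind_s in newer OpenLDAP headers */
        #include <stdio.h>
        #include <ldap.h>           /* OpenLDAP client API; link with -lldap -llber */

        int main(void)
        {
            /* Placeholder host; 2135 and the base DN are the usual MDS 2.x defaults. */
            char *host = "grid-info.example.org";
            char *base = "mds-vo-name=local, o=grid";
            LDAP *ld;
            LDAPMessage *result, *entry;
            char *dn;

            ld = ldap_init(host, 2135);
            if (ld == NULL || ldap_simple_bind_s(ld, NULL, NULL) != LDAP_SUCCESS) {
                fprintf(stderr, "could not connect to MDS server\n");
                return 1;
            }

            /* Print the DN of every entry under the local VO. */
            if (ldap_search_s(ld, base, LDAP_SCOPE_SUBTREE, "(objectclass=*)",
                              NULL, 0, &result) == LDAP_SUCCESS) {
                for (entry = ldap_first_entry(ld, result); entry != NULL;
                     entry = ldap_next_entry(ld, entry)) {
                    dn = ldap_get_dn(ld, entry);
                    printf("%s\n", dn);
                    ldap_memfree(dn);
                }
                ldap_msgfree(result);
            }

            ldap_unbind(ld);
            return 0;
        }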

    65. The Design of TeraGrid Applications with Optimal Performance and High Performance (Tue 10/1/02 - Tue 9/30/03)
        Resources: Vernon (Wisconsin)

    66. Transfer results for scheduling with more accurate requested runtimes to the NCSA cluster scheduling team (Tue 10/1/02 - Tue 12/31/02)

    67. Complete the optimization of Cen/Ostriker's cosmology code on the O2000 (Wed 1/1/03 - Mon 3/31/03)

    68. Complete the second phase of scheduling results for the TeraGrid; complete the specification of an optimized MPI implementation of the cosmology code (Tue 4/1/03 - Mon 6/30/03)

    69. Complete the optimized on-demand scheduling for the SHMOD applications; explore the optimization of one other significant Performance Expedition code (Tue 7/1/03 - Tue 9/30/03)