Performance Analysis Tools

Table of Contents

  1. Scope and Motivation
  2. Performance Considerations and Strategies
  3. Timers
    1. time
    2. timex
    3. MPI Timing Routines
    4. gettimeofday()
    5. read_real_time()
    6. xlf Fortran Timing Routines
      1. rtc()
      2. irtc()
      3. dtime_()
      4. etime_()
      5. mclock()
      6. timef()
    7. AIX Trace Facility
  4. AIX System Commands
    1. Hardware Configuration Commands
      1. lscfg
      2. lsdev
      3. lsattr
      4. lsps
      5. qcpu
    2. vmstat
    3. netstat
    4. iostat
    5. ps
  5. Profilers
    1. prof
    2. gprof
    3. tprof
    4. monitor
    5. xprofiler
    6. mpiP
  6. Performance Analysis Tools
    1. HPM Toolkit
    2. PE Benchmarker Toolset
    3. VampirGuideView (VGV)
    4. Paraver and Dimemas
    5. Performance Toolbox
    6. Dynamic Probe Class Library (DPCL)
    7. Other Multi-Platform Parallel Performance Analysis Tools
  7. Miscellaneous Tools
  8. References and More Information
  9. Exercise


Scope and Motivation


Scope of This Tutorial:

Motivation:

Performance Considerations and Strategies



Timers


time


timex

Note: much of the timex command's output is not described in the timex man page. Some of it can be understood by reading the sar command man page.


MPI Timing Routines


gettimeofday()


read_real_time()
time_base_to_time()


xlf Fortran Timing Routines

The routines described here are included as part of the IBM xlf compiler's service and utilities procedures. They will probably not be found in non-IBM Fortran environments.

rtc()

irtc()

dtime_()

etime_()

mclock()

timef()


AIX Trace Facility

Example:

  1. Get event tags from /usr/include/sys/trchkid.h for desired events.

    Thread create=465, thread terminate=467, wait lock=46d.

  2. Remove any existing tracefile (can cause problems):

    rm trcfile

  3. Execute program, creating a tracefile with specified events:

    trace -a -o trcfile -j 465,467,46d; dotprod; trcstop

  4. Produce a readable trace report:

    trcrpt -o trace.report -O pid=on trcfile

  5. View report using your favorite editor.

    Sample AIX Trace Facility output
    Fri Jul 19 00:02:05 2002
    System: AIX frost067 Node: 5
    Machine: 006008594C00
    Internet Address: 86092F43 134.9.47.67
    The system contains 16 cpus, of which 16 were traced.
    Buffering: Kernel Heap
    This is from a 32-bit kernel.
    Tracing only these hooks, 465,467,46d
    
    trace -a -o trcfile -j 465,467,46d
    
    
    ID  PID      I    ELAPSED_SEC     DELTA_MSEC   APPL    SYSCALL KERNEL  INTERRUPT

    001 70640         0.000000000       0.000000                   TRACE ON channel 0
                                                                   Fri Jul 19 00:02:05 2002
    465 -1            0.000482511       0.482511           thread_create:   pid=78250 tid=226787 priority=60 policy=0
    46D 85616         0.000495484       0.012973           wait_on_lock:   pid=85616 tid=238427 lockaddr=5819F0
    465 -1            0.027897260      27.401776           thread_create:   pid=77318 tid=146769 priority=61 policy=0
    467 77318         0.028225717       0.328457           thread_terminate:   pid=77318 tid=146769
    465 -1            0.028235827       0.010110           thread_create:   pid=77318 tid=206215 priority=61 policy=0
    46D -1            0.028251016       0.015189           wait_on_lock:   pid=77318 tid=238429 lockaddr=5819F0
    465 -1            0.028633968       0.382952           thread_create:   pid=77318 tid=151151 priority=61 policy=0
    467 77318         0.028795141       0.161173           thread_terminate:   pid=77318 tid=206215
    467 77318         0.028861449       0.066308           thread_terminate:   pid=77318 tid=151151
    465 -1            0.028876390       0.014941           thread_create:   pid=77318 tid=233391 priority=61 policy=0
    46D -1            0.028886636       0.010246           wait_on_lock:   pid=77318 tid=238429 lockaddr=5819F0
    467 77318         0.029303598       0.416962           thread_terminate:   pid=77318 tid=233391
    002 -1            0.055766970      26.463372                   TRACE OFF channel 0000 Fri Jul 19 00:02:05 2002
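
Instead of reading the report in an editor, a short script can tally how often each traced hook fired. The following is a minimal sketch, not part of the AIX Trace Facility: the names EVENT_RE and summarize are hypothetical, and the pattern assumes only the column layout seen in the sample report above (hook ID, PID, elapsed seconds, delta milliseconds, event text). Wrapped continuation lines and header lines are skipped because they do not start with that layout.

```python
import re
from collections import Counter

# Hypothetical pattern, inferred from the sample report layout only:
# hook ID (3 hex digits), PID, elapsed seconds, delta msec, event text.
EVENT_RE = re.compile(
    r"^\s*([0-9A-Fa-f]{3})\s+(-?\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\S.*)$"
)

def summarize(report_text):
    """Count event records per hook ID and return the last elapsed time seen."""
    counts = Counter()
    last_elapsed = 0.0
    for line in report_text.splitlines():
        m = EVENT_RE.match(line)
        if m is None:
            continue  # header, blank, or wrapped continuation line
        hook, _pid, elapsed, _delta, _text = m.groups()
        counts[hook.upper()] += 1
        last_elapsed = float(elapsed)
    return counts, last_elapsed

# A few records copied from the sample report, wrapped as trcrpt printed them.
SAMPLE = """\
465 -1            0.000482511       0.482511           thread_create:   pid=7825
0 tid=226787 priority=60 policy=0
46D 85616         0.000495484       0.012973           wait_on_lock:   pid=85616
 tid=238427 lockaddr=5819F0
467 77318         0.028225717       0.328457           thread_terminate:   pid=7
7318 tid=146769
"""

if __name__ == "__main__":
    counts, last_elapsed = summarize(SAMPLE)
    for hook, n in sorted(counts.items()):
        print(hook, n)
    print("last event at", last_elapsed, "seconds")
```

To use it on a real report, read the file written by trcrpt (trace.report in the example above) and pass its contents to summarize.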



AIX System Commands

Hardware Configuration Commands

AIX provides a number of commands that can be used to determine a machine's configuration. Some potentially useful ones are listed below; see each command's man page for details.

lscfg

lsdev -C

lsattr

lsps -a

qcpu


vmstat - Virtual Memory Statistics


netstat - Network Statistics


iostat - I/O Statistics


ps - Process Status



Profilers

prof


gprof

Additional Notes About prof and gprof:


tprof


monitor


xprofiler


Overview:

Example Displays and Reports:

Using xprofiler:

  1. Compile and link your program with both the -g and -pg options. The -g option enables source statement profiling and -pg turns profiling on.

    Note: when you compile and link separately, you must use the -pg option with both the compile and link commands.

  2. Run your serial or parallel code as usual. When it has completed, you will find one statistics file for each task. Serial codes produce gmon.out. For parallel jobs, the files will be called gmon.out.0, gmon.out.1, gmon.out.2 and so on.

  3. Invoke xprofiler. This can be done several ways.

    xprofiler
        Starts without a file loaded. Load a file from xprofiler's File pull-down menu.
    xprofiler myprog gmon.out
        Loads the serial program with its stat file.
    xprofiler myprog gmon.out.N
        Loads the parallel program with the selected stat file.
    xprofiler myprog gmon.out.*
        Loads the parallel program with combined/merged stat files.

    Note that there are also several command line flags available to define certain xprofiler characteristics and behaviors.

  4. Use xprofiler's pull-down menus and hidden menus (press the right mouse button on an object such as an arc or function box) to accomplish desired actions, such as:
    • Zooming in and out
    • Examining arc information
    • Examining function statistics
    • Producing reports
    • Loading new files
    • Setting configuration options
    • Saving/producing screen dumps
    • Unclustering functions from their library group
    • Collapsing/hiding library information
    • and more....
    xprofiler hidden function menu
    xprofiler hidden arc menu

    Important note: it is often necessary to uncluster functions and zoom in to get to important detailed information. It is also usually useful to collapse/hide library information that isn't needed (such as system libraries).
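
For parallel jobs, step 3 needs the per-task statistics files on the command line. The sketch below collects the gmon.out.N files and builds the xprofiler argument list in task order, so that gmon.out.10 sorts after gmon.out.2. It assumes only the file naming described in step 2; the helper names task_profiles and xprofiler_args are hypothetical and are not part of xprofiler.

```python
import glob
import os
import re

def task_profiles(directory="."):
    """Return the gmon.out.N files in directory, ordered by task number N."""
    found = []
    for path in glob.glob(os.path.join(directory, "gmon.out.*")):
        m = re.fullmatch(r"gmon\.out\.(\d+)", os.path.basename(path))
        if m:  # ignore files whose suffix is not a task number
            found.append((int(m.group(1)), path))
    return [path for _, path in sorted(found)]

def xprofiler_args(program, profiles):
    """Build the argument list: xprofiler <program> <stat files...>."""
    return ["xprofiler", program] + list(profiles)

if __name__ == "__main__":
    # myprog is the program name used in the examples above.
    print(" ".join(xprofiler_args("myprog", task_profiles())))
```

The script only prints the command line; pass the list to subprocess.run if you want to launch xprofiler directly.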

Documentation:


mpiP

Overview:

Using mpiP:

Understanding mpiP Output:

Performance Analysis Tools

HPM Toolkit

Overview:

Using hpmcount:

Using libhpm:

Using hpmviz:


PE Benchmarker Toolset

Overview:

Using the Performance Collection Tool (PCT):

Using the UTE Utilities:

Using the Profile Visualization Tool (PVT):

Documentation: (you're going to need this)


VampirGuideView (VGV)


Paraver and Dimemas


Performance Toolbox


Overview:

Example Displays:


Dynamic Probe Class Library (DPCL)

Overview:

Using DPCL:

Documentation:


Other Multi-Platform Parallel Performance Analysis Tools:


Miscellaneous Tools

Several tools which fall into the "other" category are available for the SP environment. Note that some of these tools are installed at LLNL, under development and/or unsupported. Some may even be extinct.

mpi_trace:

MPIMap:

And More...


This completes the tutorial.




References and More Information