MPMM Performance Data, May 1995

The following MPMM performance data was collected on May 17, 1995, using between 4 and 64 "wide" nodes (each equivalent to an RS/6000 580) on the NAS SP2.

Three MM5 problems were run for the performance benchmarks.

  • Single domain. 61x61x23, 100km resolution. Standard math library.
  • Single domain. 61x61x23, 100km resolution. With the IBM MASS numerical library, which contains tuned versions of the standard AIX math library functions for exponentiation, square roots, logarithms, etc.
  • Nested. Parent domain: 61x61x23, 100km resolution; nested domain: 61x61x23, 33km resolution.
The parallel MM5 was run for a small number of time steps and the average time per time step was calculated, as sketched below. Non-hydrostatic dynamics, explicit moisture, mixed-phase ice physics, and other options were enabled. Long-wave and short-wave radiation computations were performed every 30 minutes, and the proportion of these more expensive radiation time steps is reflected in the average.
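
The radiation weighting deserves a brief illustration. Here is a minimal sketch of the averaging, assuming a 300-second coarse time step (the common 3 x dx-in-km rule of thumb for MM5 at 100km; the step actually used is not stated here), so that one step in six carries the radiation computation. The per-step costs are hypothetical placeholders, not measured SP2 values.

    # Weighted average cost per time step when every Nth step also
    # performs the (more expensive) radiation computation.
    # The 300 s time step is an assumption (3*dx rule of thumb at
    # dx = 100 km); the step costs below are hypothetical.

    DT = 300.0                   # assumed coarse-domain time step, seconds
    RAD_INTERVAL = 30 * 60.0     # radiation computed every 30 minutes

    steps_per_rad = int(RAD_INTERVAL / DT)   # -> 6

    def avg_step_time(t_plain, t_radiation):
        """Average wall-clock seconds per step, radiation folded in."""
        return ((steps_per_rad - 1) * t_plain + t_radiation) / steps_per_rad

    print(avg_step_time(2.0, 5.0))  # hypothetical costs -> 2.5 s/step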

The first graph shows the results as millions of floating point operations per second (Mflop/second). This was computed by dividing the number of floating point operations per time step by the time, in seconds, of an average time step for a given run. One single-domain time step entailed 721 Mflop. In the nested scenario the number was roughly four times that, 2882 Mflop: the cost of one coarse-domain step plus three steps on the nest (at a 3:1 resolution ratio, the nest takes three steps for each parent step). The floating point operation counts were extrapolated from Cray runs of the code.
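
As a concrete illustration of the rate calculation, here is a minimal sketch. The operation counts are those given above; the average step times are hypothetical, not the measured values behind the graph.

    # Sustained Mflop/s: work per time step divided by the average
    # wall-clock time per step. Operation counts are from the text;
    # the step times are hypothetical placeholders.

    MFLOP_SINGLE = 721.0    # Mflop per single-domain time step
    MFLOP_NESTED = 2882.0   # one coarse-domain step + three nest steps

    def mflops(mflop_per_step, avg_step_seconds):
        """Sustained rate in Mflop/s for one run."""
        return mflop_per_step / avg_step_seconds

    # Hypothetical average step times, for illustration only.
    print(mflops(MFLOP_SINGLE, 2.5))   # -> ~288 Mflop/s
    print(mflops(MFLOP_NESTED, 9.0))   # -> ~320 Mflop/s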

Accurately estimating operation counts in this way is somewhat problematic, so less emphasis should be placed on the absolute floating point rates than on the relative rates between the different cases. Mflop/sec is a good relative measure because it adjusts for differences in workload between cases, which is useful for gauging the effect on parallel efficiency of adding a nested domain (and the attendant cost of exchanging forcing and feedback data between the domains).
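
One way to read such relative rates, sketched below with entirely hypothetical numbers: scale the sustained rate at each node count against a small-node baseline to obtain a parallel efficiency.

    # Parallel efficiency from sustained Mflop/s at two node counts,
    # using the smallest run as the baseline. All rates here are
    # hypothetical illustrations, not the measured SP2 results.

    def parallel_efficiency(nodes, rate, base_nodes, base_rate):
        """Fraction of ideal (linear) scaling relative to the baseline."""
        ideal = base_rate * (nodes / base_nodes)
        return rate / ideal

    # e.g. a 4-node baseline at 300 Mflop/s; 64 nodes at 3600 Mflop/s
    print(parallel_efficiency(64, 3600.0, 4, 300.0))  # -> 0.75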

The second graph presents the data in a different manner, with an emphasis on time-to-solution. The measure is the number of seconds of simulation computed per second of elapsed time; in effect, it measures how much faster the simulation runs than the weather itself. Here the additional work involved in computing a higher-resolution nested domain is apparent. Higher resolution can improve the accuracy of a forecast, but it entails a smaller time step and more grid cells for a given area, and therefore additional computation. Nevertheless, computing a 100km coarse domain with a 33km nest is considerably less expensive than computing the entire 6000km by 6000km coarse domain at 33km resolution.
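
A sketch of this metric, and of the coarse-plus-nest versus all-33km comparison, again assuming a 300-second coarse step (3*dx rule of thumb) and hypothetical wall-clock step times:

    # Simulated seconds computed per second of wall-clock time.
    # DT is an assumed 300 s coarse step (3*dx rule at dx = 100 km);
    # the wall-clock step times are hypothetical.

    DT = 300.0

    def sim_speed(dt_sim, wall_seconds_per_step):
        """How much faster than real time the simulation advances."""
        return dt_sim / wall_seconds_per_step

    print(sim_speed(DT, 2.5))   # single domain: ~120x real time
    print(sim_speed(DT, 9.0))   # coarse + nest: each 300 s of simulation
                                # costs one coarse step plus 3 nest steps

    # Back-of-envelope: the whole 6000 km domain at 33 km would need
    # roughly (100/33)**2 times the cells and 3x the steps of the
    # 100 km domain -- about 27x its cost, versus ~4x for the nest.
    print((100 / 33) ** 2 * 3)  # -> ~27.5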

The runs show a 10-15 percent improvement in the single-domain times from using the IBM MASS math library. Profiling will provide additional information about how the library is performing.
