PSTSWM AlphaSC-500 Point-to-Point Communication Performance

Performance Studies using PSTSWM


Compaq AlphaServer SC SENDRECV Performance

(ordered sendrecv of 128KB message using MPI between two nodes)

(performance measured per processor when all processors send right, read left in a logical ring)

Date/Person: January 26, 2000 / P. Worley
Platform: Compaq AlphaServer SC at Oak Ridge National Laboratory (colt.ccs.ornl.gov):
     16 ES40 4-way SMP nodes (500 MHz Alpha 21264 with 4MB L2 cache)
Environment: Digital UNIX V5.0;   RMS 2.36
Communication Library: MPI
SWAP size: 16384 REAL*8 floating point values each direction
Message size: Largest - 16384 REAL*8 floating point values
              Smallest - 16 REAL*8 floating point values
Processors: 0 -> 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 0
Latency Definition: (T1024 - T512)/512
Model Error Range: [1,1024]
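
The latency definition above can be read as a difference quotient. The Python sketch below assumes (an interpretation, not something this page states) that T1024 and T512 are the times to move the same total swap using 1024 and 512 messages respectively, so that the time difference is attributed to the 512 additional messages. The function name and the sample timings are hypothetical.

```python
def estimated_latency_usec(t1024_sec, t512_sec):
    """Latency per message in microseconds: (T1024 - T512) / 512."""
    # Assumption: both timings move the same total amount of data, so the
    # difference is attributed to the 512 additional message startups.
    return (t1024_sec - t512_sec) / 512 * 1e6


# Hypothetical timings in seconds, not taken from the tables on this page.
print(f"{estimated_latency_usec(0.030, 0.024):.2f} usec/msg")
```

Differencing two runs that carry the same payload cancels the bandwidth-dependent part of the time, which is presumably why the estimate is defined this way rather than from a single small-message timing.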
Results:

ordered simple swap
Data                Statistics
                    unidirectional bandwidth  estimated latency  model error
                    (peak MByte/sec)          (usec/msg)         (max. rel. error)
1 iter.             115.89                    11.78              65.0%
10 iter.            134.54                    16.83              54.4%

ordered swap using nonblocking send
Data                Statistics
                    unidirectional bandwidth  estimated latency  model error
                    (peak MByte/sec)          (usec/msg)         (max. rel. error)
1 iter.             113.91                    13.99              59.2%
10 iter.            139.14                    16.59              54.1%
1 iter. w/overlap   113.94                    14.69              58.7%
10 iter. w/overlap  141.76                    17.71              54.7%

ordered swap using nonblocking receive
Data                Statistics
                    unidirectional bandwidth  estimated latency  model error
                    (peak MByte/sec)          (usec/msg)         (max. rel. error)
1 iter.             112.82                    19.91              47.6%
10 iter.            140.45                    21.24              45.0%
1 iter. w/overlap   119.18                    13.99              62.2%
10 iter. w/overlap  141.68                    16.31              59.4%

ordered swap using nonblocking send and receive
Data                Statistics
                    unidirectional bandwidth  estimated latency  model error
                    (peak MByte/sec)          (usec/msg)         (max. rel. error)
1 iter.             115.85                    26.00              35.5%
10 iter.            139.53                    23.06              42.0%
1 iter. w/overlap   121.93                    14.96              60.4%
10 iter. w/overlap  144.16                    16.30              58.9%

ordered swap using ready send
Data                Statistics
                    unidirectional bandwidth  estimated latency  model error
                    (peak MByte/sec)          (usec/msg)         (max. rel. error)
1 iter.             112.83                    36.84              40.0%
10 iter.            133.00                    36.65              40.7%
1 iter. w/overlap   119.11                    39.49              38.9%
10 iter. w/overlap  140.59                    55.77              30.7%

ordered swap using nonblocking ready send
Data                Statistics
                    unidirectional bandwidth  estimated latency  model error
                    (peak MByte/sec)          (usec/msg)         (max. rel. error)
1 iter.             112.03                    36.24              38.9%
10 iter.            137.54                    35.92              39.5%
1 iter. w/overlap   118.11                    38.39              39.2%
10 iter. w/overlap  138.05                    55.13              30.7%

synchronous
Data                Statistics
                    unidirectional bandwidth  estimated latency  model error
                    (peak MByte/sec)          (usec/msg)         (max. rel. error)
1 iter.             112.62                    38.92              35.8%
10 iter.            138.84                    38.58              37.3%

ordered swap using nonblocking sync. send
Data                Statistics
                    unidirectional bandwidth  estimated latency  model error
                    (peak MByte/sec)          (usec/msg)         (max. rel. error)
1 iter.             118.47                    23.86              45.5%
10 iter.            149.96                    23.55              46.9%
1 iter. w/overlap   121.17                    24.33              40.2%
10 iter. w/overlap  145.43                    22.92              46.2%

ordered swap using nonblocking receive with sync. send
Data                Statistics
                    unidirectional bandwidth  estimated latency  model error
                    (peak MByte/sec)          (usec/msg)         (max. rel. error)
1 iter.             117.21                    27.59              37.8%
10 iter.            145.18                    30.46              33.0%
1 iter. w/overlap   119.73                    19.13              53.4%
10 iter. w/overlap  141.75                    22.27              50.8%

ordered swap using nonblocking sync. send and receive
Data                Statistics
                    unidirectional bandwidth  estimated latency  model error
                    (peak MByte/sec)          (usec/msg)         (max. rel. error)
1 iter.             122.10                    30.31              33.2%
10 iter.            149.23                    29.73              34.0%
1 iter. w/overlap   119.86                    17.94              55.3%
10 iter. w/overlap  142.85                    21.63              50.0%

ordered simple swap using sync. send
Data                Statistics
                    unidirectional bandwidth  estimated latency  model error
                    (peak MByte/sec)          (usec/msg)         (max. rel. error)
1 iter.             115.11                    22.91              49.6%
10 iter.            138.24                    21.00              54.2%


Protocol Sensitivity Summary for Unidirectional Swap of 131072 Bytes (1 iteration/no overlap)
Runtime Statistics
Msg Size  min Sec     min Sec/Msg  max MBytes/Sec  (mean-min)/min  (median-min)/min  (max-min)/min
128       3.6892e-02  3.6027e-05   7.11            0.71            0.65              1.45
256       2.3988e-02  4.6851e-05   10.93           0.51            0.52              1.11
512       2.3187e-02  9.0575e-05   11.31           0.20            0.14              0.52
1024      1.2752e-02  9.9625e-05   20.56           0.18            0.10              0.44
2048      7.4024e-03  1.1566e-04   35.41           0.12            0.05              0.35
4096      4.6302e-03  1.4469e-04   56.62           0.10            0.05              0.28
8192      3.2290e-03  2.0181e-04   81.18           0.08            0.05              0.22
16384     2.5710e-03  3.2137e-04   101.96          0.06            0.05              0.17
32768     2.2778e-03  5.6945e-04   115.09          0.06            0.03              0.17
65536     2.1470e-03  1.0735e-03   122.10          0.07            0.07              0.16
131072    2.2628e-03  2.2628e-03   115.85          0.02            0.02              0.03
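
The columns of the Runtime Statistics table appear to be related by simple arithmetic. The Python sketch below reproduces two of the tabulated rates under assumptions inferred from the numbers rather than stated on this page: each row splits the 131072-byte swap into 131072/size messages, and the rate counts the bytes moved in both directions with 1 MByte = 10^6 bytes. The function names are illustrative.

```python
TOTAL_BYTES = 131072  # swap size per direction, from the summary title


def messages_per_swap(msg_bytes):
    """Number of messages needed to move the whole swap at this message size."""
    return TOTAL_BYTES // msg_bytes


def sec_per_msg(min_sec, msg_bytes):
    """Time per message, i.e. the 'min Sec/Msg' column."""
    return min_sec / messages_per_swap(msg_bytes)


def mbytes_per_sec(min_sec, msg_bytes):
    """Rate in MByte/sec, i.e. the 'max MBytes/Sec' column."""
    # Assumption inferred from the table: the factor 2 counts both directions
    # of the swap, and 1 MByte is taken as 1e6 bytes.
    return 2 * msg_bytes / sec_per_msg(min_sec, msg_bytes) / 1e6


# First and last rows of the table above (min Sec, Msg Size).
print(round(mbytes_per_sec(3.6892e-02, 128), 2))
print(round(mbytes_per_sec(2.2628e-03, 131072), 2))
```

If the inferred factor of 2 is wrong for some other table, only `mbytes_per_sec` needs to change; the message count and per-message time follow directly from the first three columns.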
Five Fastest Protocols
Msg Size  1st  2nd  3rd  4th  5th
128       0    1    2    3    10
256       1    0    3    2    10
512       1    0    2    3    8
1024      0    1    2    3    9
2048      0    1    7    9    3
4096      3    2    0    1    9
8192      1    3    2    7    9
16384     8    9    7    3    1
32768     8    7    3    1    2
65536     9    7    8    0    10
131072    3    9    10   0    7

Number of Protocols With Runtimes Within X% of Min
Msg Size  1%   5%   25%
128       1    2    2
256       1    2    4
512       1    4    8
1024      1    3    8
2048      2    5    8
4096      3    4    9
8192      1    4    11
16384     3    5    11
32768     1    7    11
65536     1    3    11
131072    4    11   11
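
The three summary tables above can all be derived from one set of per-protocol runtimes at a given message size. The Python sketch below shows one way to compute the spread columns, the five fastest protocols, and the within-X% counts. It assumes that "within X% of min" means a runtime no larger than min × (1 + X/100); the runtimes are hypothetical values for protocols 0 through 10, not measurements from this page.

```python
from statistics import mean, median


def spread_stats(runtimes):
    """Relative spread of per-protocol runtimes around the fastest one."""
    times = list(runtimes.values())
    fastest = min(times)
    return {
        "(mean-min)/min": (mean(times) - fastest) / fastest,
        "(median-min)/min": (median(times) - fastest) / fastest,
        "(max-min)/min": (max(times) - fastest) / fastest,
    }


def five_fastest(runtimes):
    """Protocol ids ordered by increasing runtime, first five only."""
    return sorted(runtimes, key=runtimes.get)[:5]


def count_within(runtimes, percent):
    """Number of protocols whose runtime is at most `percent` above the minimum."""
    # Assumption: the threshold is inclusive and relative to the minimum.
    limit = min(runtimes.values()) * (1 + percent / 100)
    return sum(1 for t in runtimes.values() if t <= limit)


# Hypothetical runtimes in seconds for protocols 0..10 (not measured data).
runtimes = {0: 1.00, 1: 1.004, 2: 1.03, 3: 1.04, 4: 1.30, 5: 1.20,
            6: 1.50, 7: 1.10, 8: 1.15, 9: 1.06, 10: 1.24}

print(spread_stats(runtimes))
print(five_fastest(runtimes))
print([count_within(runtimes, p) for p in (1, 5, 25)])
```

A count of 11 in the 25% column then means that every protocol is within a quarter of the best runtime, which is how the large-message rows of the tables read.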


Protocol Sensitivity Summary for Unidirectional Swap of 131072 Bytes (10 iterations/no overlap)
Runtime Statistics
Msg Size  min Sec     min Sec/Msg  max MBytes/Sec  (mean-min)/min  (median-min)/min  (max-min)/min
128       4.1314e-02  4.0346e-05   6.35            0.55            0.48              1.19
256       2.4331e-02  4.7521e-05   10.77           0.51            0.53              1.10
512       2.2617e-02  8.8350e-05   11.59           0.21            0.15              0.54
1024      1.2406e-02  9.6920e-05   21.13           0.17            0.10              0.45
2048      7.2260e-03  1.1291e-04   36.28           0.13            0.09              0.35
4096      4.5680e-03  1.4275e-04   57.39           0.09            0.06              0.25
8192      3.1714e-03  1.9821e-04   82.66           0.07            0.04              0.21
16384     2.4900e-03  3.1125e-04   105.28          0.06            0.04              0.15
32768     2.1793e-03  5.4482e-04   120.29          0.03            0.01              0.08
65536     1.8666e-03  9.3332e-04   140.44          0.05            0.06              0.11
131072    1.7481e-03  1.7481e-03   149.96          0.07            0.08              0.13

Five Fastest Protocols
Msg Size  1st  2nd  3rd  4th  5th
128       1    0    2    3    10
256       1    0    2    3    10
512       1    2    0    3    9
1024      1    0    3    2    9
2048      0    1    3    2    8
4096      2    1    0    3    9
8192      1    0    3    2    7
16384     7    8    9    10   1
32768     9    7    1    2    8
65536     8    7    9    10   1
131072    7    9    8    2    3

Number of Protocols With Runtimes Within X% of Min
Msg Size  1%   5%   25%
128       1    1    3
256       1    1    4
512       1    4    8
1024      2    4    8
2048      1    4    8
4096      2    4    10
8192      1    8    11
16384     2    8    11
32768     4    7    11
65536     1    4    11
131072    2    3    11


Protocol Sensitivity Summary for Unidirectional Swap of 131072 Bytes (1 iteration with overlap)
Runtime Statistics
Msg Size  min Sec     min Sec/Msg  max MBytes/Sec  (mean-min)/min  (median-min)/min  (max-min)/min
128       3.7843e-02  3.6956e-05   6.93            0.52            0.27              1.38
256       2.2799e-02  4.4529e-05   11.50           0.46            0.24              1.19
512       2.3624e-02  9.2280e-05   11.10           0.19            0.08              0.55
1024      1.3105e-02  1.0238e-04   20.00           0.15            0.05              0.45
2048      7.4618e-03  1.1659e-04   35.13           0.14            0.08              0.41
4096      4.6018e-03  1.4381e-04   56.97           0.12            0.10              0.32
8192      3.3070e-03  2.0669e-04   79.27           0.06            0.02              0.21
16384     2.5382e-03  3.1728e-04   103.28          0.08            0.07              0.17
32768     2.2532e-03  5.6330e-04   116.34          0.07            0.05              0.33
65536     2.1500e-03  1.0750e-03   121.93          0.06            0.02              0.33
131072    2.2602e-03  2.2602e-03   115.98          0.02            0.02              0.05

Five Fastest Protocols
Msg Size  1st  2nd  3rd  4th  5th
128       1    2    3    0    9
256       1    2    3    0    9
512       0    1    7    3    2
1024      1    2    0    3    7
2048      1    2    3    0    7
4096      1    2    0    9    7
8192      2    1    8    7    0
16384     3    7    2    0    1
32768     3    2    9    1    7
65536     3    7    9    8    2
131072    3    9    7    8    4

Number of Protocols With Runtimes Within X% of Min
Msg Size  1%   5%   25%
128       1    3    5
256       1    2    6
512       2    5    8
1024      4    6    8
2048      1    5    8
4096      1    4    8
8192      3    8    11
16384     2    2    11
32768     3    6    10
65536     2    8    10
131072    4    11   11


Protocol Sensitivity Summary for Unidirectional Swap of 131072 Bytes (10 iterations with overlap)
Runtime Statistics
Msg Size  min Sec     min Sec/Msg  max MBytes/Sec  (mean-min)/min  (median-min)/min  (max-min)/min
128       4.1073e-02  4.0110e-05   6.38            0.60            0.31              1.96
256       2.4370e-02  4.7599e-05   10.76           0.54            0.27              1.67
512       2.3771e-02  9.2854e-05   11.03           0.25            0.09              0.85
1024      1.2689e-02  9.9133e-05   20.66           0.23            0.08              0.78
2048      7.3632e-03  1.1505e-04   35.60           0.18            0.05              0.66
4096      4.6308e-03  1.4471e-04   56.61           0.13            0.03              0.47
8192      3.2253e-03  2.0158e-04   81.28           0.09            0.03              0.32
16384     2.5526e-03  3.1907e-04   102.70          0.05            0.01              0.17
32768     2.1577e-03  5.3942e-04   121.49          0.03            0.01              0.09
65536     1.9024e-03  9.5121e-04   137.80          0.03            0.02              0.10
131072    1.8025e-03  1.8025e-03   145.43          0.04            0.03              0.16

Five Fastest Protocols
Msg Size  1st  2nd  3rd  4th  5th
128       2    3    0    1    9
256       2    3    1    0    9
512       0    1    3    2    7
1024      1    0    2    3    7
2048      0    1    2    3    7
4096      0    2    1    8    7
8192      0    1    2    7    3
16384     2    1    7    3    9
32768     9    3    2    7    1
65536     7    3    8    2    9
131072    7    10   3    9    1

Number of Protocols With Runtimes Within X% of Min
Msg Size  1%   5%   25%
128       1    2    4
256       1    2    5
512       1    3    8
1024      2    3    8
2048      1    6    8
4096      2    7    9
8192      2    7    9
16384     4    8    11
32768     5    7    11
65536     5    8    11
131072    3    8    11

DISCUSSION


Patrick H. Worley (worleyph@ornl.gov)
Last Modified Monday, 15-Jul-2002 10:02:31 EDT.