PSTSWM AlphaSC-500 Point-to-Point Communication Performance

Performance Studies using PSTSWM

Compaq AlphaServer SC SWAP Performance

(ordered swap of 128KB message using MPI within a node)

Date/Person: January 26, 2000 / P. Worley
Platform: Compaq AlphaServer SC at Oak Ridge National Laboratory (colt.ccs.ornl.gov):
     16 ES40 4-way SMP nodes (500 MHz Alpha 21264 with 4MB L2 cache)
Environment: Digital UNIX V5.0;   RMS 2.36
Communication Library: MPI
SWAP size: 16384 REAL*8 floating point values each direction
Message size: Largest - 16384 REAL*8 floating point values
     Smallest - 16 REAL*8 floating point values
Processors: 0 and 1
Latency Definition: (T1024-T512)/512
Model Error Range: [1,1024]
Results:

ordered simple swap
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               299.39                     6.94                72.8%
10 iter.              337.39                     6.78                73.2%

ordered swap using nonblocking send
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               279.77                     7.27                71.7%
10 iter.              322.92                     7.19                71.9%
1 iter. w/overlap     267.88                     7.36                71.1%
10 iter. w/overlap    322.73                     7.61                71.3%

ordered swap using nonblocking receive
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               290.75                     8.44                69.6%
10 iter.              330.71                     8.33                69.9%
1 iter. w/overlap     278.76                     8.55                69.8%
10 iter. w/overlap    336.57                     9.66                67.0%

ordered swap using nonblocking send and receive
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               296.48                     8.86                68.7%
10 iter.              333.41                     8.82                68.9%
1 iter. w/overlap     301.31                     8.84                69.0%
10 iter. w/overlap    328.82                     10.13               67.3%

ordered swap using ready send
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               295.94                     15.86               54.5%
10 iter.              329.43                     15.90               53.9%
1 iter. w/overlap     295.94                     8.66                69.0%
10 iter. w/overlap    335.13                     9.78                66.7%

ordered swap using nonblocking ready send
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               290.56                     16.27               53.6%
10 iter.              328.48                     15.81               54.5%
1 iter. w/overlap     297.15                     9.26                67.9%
10 iter. w/overlap    328.10                     10.31               67.1%

synchronous
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               295.87                     13.90               57.2%
10 iter.              331.68                     12.82               59.8%

ordered swap using nonblocking sync. send
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               291.40                     9.09                67.1%
10 iter.              334.06                     8.98                67.2%
1 iter. w/overlap     294.94                     7.96                69.9%
10 iter. w/overlap    331.77                     8.24                69.6%

ordered swap using nonblocking receive with sync. send
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               290.63                     9.89                66.3%
10 iter.              330.81                     9.78                66.2%
1 iter. w/overlap     290.17                     9.90                66.3%
10 iter. w/overlap    331.32                     10.49               65.5%

ordered swap using nonblocking sync. send and receive
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               288.07                     10.30               65.3%
10 iter.              333.81                     10.46               64.5%
1 iter. w/overlap     289.60                     9.32                67.6%
10 iter. w/overlap    327.16                     10.80               65.5%

ordered simple swap using sync. send
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               291.53                     9.44                66.5%
10 iter.              328.72                     9.43                66.2%


Protocol Sensitivity Summary for Unidirectional Swap of 131072 Bytes (1 iteration/no overlap)
Runtime Statistics
  Msg Size   min Sec   min Sec/Msg   max MBytes/Sec   (mean-min)/min   (median-min)/min   (max-min)/min
  128   1.9913e-02   1.9447e-05   13.16   0.37   0.22   0.92 
  256   1.2809e-02   2.5017e-05   20.47   0.29   0.17   0.70 
  512   1.6253e-02   6.3490e-05   16.13   0.09   0.07   0.22 
  1024   8.2814e-03   6.4698e-05   31.65   0.10   0.09   0.25 
  2048   4.4342e-03   6.9284e-05   59.12   0.10   0.09   0.23 
  4096   2.5884e-03   8.0887e-05   101.28   0.07   0.06   0.15 
  8192   1.6112e-03   1.0070e-04   162.70   0.07   0.08   0.19 
  16384   1.2772e-03   1.5965e-04   205.25   0.06   0.06   0.11 
  32768   1.1664e-03   2.9160e-04   224.75   0.04   0.04   0.10 
  65536   1.0656e-03   5.3280e-04   246.01   0.03   0.03   0.04 
  131072   8.7560e-04   8.7560e-04   299.39   0.03   0.03   0.07 
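
The columns of the Runtime Statistics table appear to be related by simple arithmetic. The following is a minimal sketch of that relationship, not the PSTSWM source: it assumes (inferred from the numbers, not stated on this page) that every row moves 131072 bytes in each direction, split into 131072 / (Msg Size) swaps, and that each swap is counted as two one-way messages sent back to back. The function names are illustrative only.

```python
# Sketch under stated assumptions: how the Runtime Statistics columns seem to relate.
# Assumption: each row swaps TOTAL_BYTES per direction in TOTAL_BYTES / msg_size swaps,
# and one swap is timed as two consecutive one-way messages of msg_size bytes.

TOTAL_BYTES = 131072  # 16384 REAL*8 values in each direction


def swaps_per_run(msg_size):
    """Number of swaps needed to move TOTAL_BYTES with messages of msg_size bytes."""
    return TOTAL_BYTES // msg_size


def seconds_per_swap(min_sec, msg_size):
    """The 'min Sec/Msg' column: total time divided by the number of swaps."""
    return min_sec / swaps_per_run(msg_size)


def unidirectional_mbytes_per_sec(msg_size, sec_per_swap):
    """The 'max MBytes/Sec' column, using 1 MByte = 10**6 bytes."""
    return 2 * msg_size / sec_per_swap / 1.0e6


# (Msg Size, min Sec, min Sec/Msg) copied from three rows of the table above.
rows = [
    (128, 1.9913e-02, 1.9447e-05),
    (4096, 2.5884e-03, 8.0887e-05),
    (131072, 8.7560e-04, 8.7560e-04),
]

for size, min_sec, sec_per_msg in rows:
    print(
        size,
        swaps_per_run(size),
        f"{seconds_per_swap(min_sec, size):.4e}",
        f"{unidirectional_mbytes_per_sec(size, sec_per_msg):.2f}",
    )
```

Under these assumptions the computed bandwidths agree with the tabulated 13.16, 101.28, and 299.39 MBytes/sec, and the per-swap times agree with the 'min Sec/Msg' column up to rounding of the inputs.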
Five Fastest Protocols
  Msg Size   1st   2nd   3rd   4th   5th
  128   0   1   2   7   3 
  256   0   1   10   7   2 
  512   0   1   7   10   2 
  1024   0   1   10   3   7 
  2048   0   10   1   7   2 
  4096   0   10   7   1   3 
  8192   0   7   10   1   3 
  16384   0   10   3   6   7 
  32768   0   6   3   7   10 
  65536   0   6   3   10   7 
  131072   0   3   4   6   10 
Number of Protocols With Runtimes Within X% of Min
  Msg Size   1%   5%   25%
  128    1   2   6 
  256    1   2   6 
  512    1   3   11 
  1024    1   2   11 
  2048    1   4   11 
  4096    1   5   11 
  8192    1   4   11 
  16384    1   2   11 
  32768    1   9   11 
  65536    1   11   11 
  131072    2   10   11 
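
The Latency Definition in the header, (T1024-T512)/512, can be checked against the runtime tables. The sketch below is one reading of that formula, not the author's code: it assumes T1024 and T512 are the 'min Sec' totals for the 128-byte row (1024 swaps) and the 256-byte row (512 swaps), and that the result is halved because each swap consists of two messages. The function name and the `messages_per_swap` parameter are hypothetical.

```python
# Sketch under stated assumptions: one reading of Latency Definition (T1024-T512)/512.
# Assumption: T1024 and T512 are the total times for 1024 swaps of 128 bytes and
# 512 swaps of 256 bytes (same total data), so the difference is the cost of the
# extra 512 swaps; dividing by 2 converts per-swap time to per-message time.


def estimated_latency_usec(t1024, t512, messages_per_swap=2):
    """Estimated per-message latency in microseconds."""
    per_swap_sec = (t1024 - t512) / 512
    return per_swap_sec / messages_per_swap * 1.0e6


# 'min Sec' values for Msg Size 128 and 256, 1 iteration/no overlap.
latency_1_iter = estimated_latency_usec(1.9913e-02, 1.2809e-02)

# 'min Sec' values for Msg Size 128 and 256, 10 iterations/no overlap.
latency_10_iter = estimated_latency_usec(1.9578e-02, 1.2631e-02)

print(f"{latency_1_iter:.2f} usec/msg")
print(f"{latency_10_iter:.2f} usec/msg")
```

With this reading the two values come out at about 6.94 and 6.78 usec/msg, which match the estimated latencies listed for the ordered simple swap. The agreement suggests, but does not prove, that this is how the latency column was computed.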


Protocol Sensitivity Summary for Unidirectional Swap of 131072 Bytes (10 iterations/no overlap)
Runtime Statistics
  Msg Size   min Sec   min Sec/Msg   max MBytes/Sec   (mean-min)/min   (median-min)/min   (max-min)/min
  128   1.9578e-02   1.9119e-05   13.39   0.37   0.22   0.93 
  256   1.2631e-02   2.4670e-05   20.75   0.28   0.17   0.70 
  512   1.5868e-02   6.1983e-05   16.52   0.09   0.07   0.23 
  1024   8.2053e-03   6.4104e-05   31.95   0.09   0.07   0.21 
  2048   4.2964e-03   6.7132e-05   61.01   0.09   0.07   0.20 
  4096   2.4119e-03   7.5372e-05   108.69   0.07   0.06   0.18 
  8192   1.4756e-03   9.2222e-05   177.66   0.07   0.05   0.18 
  16384   1.1467e-03   1.4333e-04   228.61   0.06   0.05   0.14 
  32768   1.0715e-03   2.6787e-04   244.66   0.04   0.03   0.16 
  65536   9.3752e-04   4.6876e-04   279.61   0.03   0.02   0.10 
  131072   7.7698e-04   7.7698e-04   337.39   0.02   0.02   0.04 
Five Fastest Protocols
  Msg Size   1st   2nd   3rd   4th   5th
  128   0   1   2   7   3 
  256   0   1   10   7   2 
  512   0   1   7   10   2 
  1024   0   1   7   10   2 
  2048   0   1   7   10   2 
  4096   0   1   7   10   2 
  8192   0   7   1   2   10 
  16384   0   10   2   7   3 
  32768   0   10   2   7   1 
  65536   0   10   2   3   8 
  131072   0   7   9   3   6 
Number of Protocols With Runtimes Within X% of Min
  Msg Size   1%   5%   25%
  128    1   2   6 
  256    1   2   6 
  512    2   4   11 
  1024    2   4   11 
  2048    1   3   11 
  4096    1   4   11 
  8192    1   5   11 
  16384    1   5   11 
  32768    1   9   11 
  65536    1   10   11 
  131072    2   11   11 


Protocol Sensitivity Summary for Unidirectional Swap of 131072 Bytes (1 iteration with overlap)
Runtime Statistics
  Msg Size   min Sec   min Sec/Msg   max MBytes/Sec   (mean-min)/min   (median-min)/min   (max-min)/min
  128   1.9969e-02   1.9501e-05   13.13   0.23   0.22   0.75 
  256   1.2985e-02   2.5361e-05   20.19   0.17   0.14   0.55 
  512   1.6220e-02   6.3361e-05   16.16   0.06   0.07   0.15 
  1024   8.3220e-03   6.5016e-05   31.50   0.07   0.08   0.14 
  2048   4.5446e-03   7.1009e-05   57.68   0.05   0.06   0.11 
  4096   2.5438e-03   7.9494e-05   103.05   0.08   0.08   0.33 
  8192   1.6104e-03   1.0065e-04   162.78   0.09   0.08   0.31 
  16384   1.2950e-03   1.6187e-04   202.43   0.06   0.06   0.11 
  32768   1.1702e-03   2.9255e-04   224.02   0.10   0.05   0.57 
  65536   1.0664e-03   5.3320e-04   245.82   0.07   0.03   0.37 
  131072   8.7000e-04   8.7000e-04   301.31   0.04   0.02   0.12 
Five Fastest Protocols
  Msg Size   1st   2nd   3rd   4th   5th
  128   0   1   7   2   4 
  256   0   1   7   10   2 
  512   0   1   7   10   4 
  1024   0   1   7   10   4 
  2048   7   10   0   1   4 
  4096   0   7   10   1   4 
  8192   0   10   7   1   3 
  16384   0   4   10   6   8 
  32768   0   4   6   10   8 
  65536   0   4   6   3   5 
  131072   3   5   4   0   10 
Number of Protocols With Runtimes Within X% of Min
  Msg Size   1%   5%   25%
  128    1   2   8 
  256    1   2   9 
  512    1   3   11 
  1024    1   3   11 
  2048    2   4   11 
  4096    1   4   10 
  8192    1   2   10 
  16384    1   4   11 
  32768    1   6   10 
  65536    1   9   10 
  131072    1   9   11 


Protocol Sensitivity Summary for Unidirectional Swap of 131072 Bytes (10 iterations with overlap)
Runtime Statistics
  Msg Size   min Sec   min Sec/Msg   max MBytes/Sec   (mean-min)/min   (median-min)/min   (max-min)/min
  128   2.0450e-02   1.9970e-05   12.82   0.25   0.26   0.64 
  256   1.3122e-02   2.5630e-05   19.98   0.19   0.20   0.54 
  512   1.6299e-02   6.3669e-05   16.08   0.07   0.07   0.15 
  1024   8.4113e-03   6.5713e-05   31.17   0.07   0.08   0.14 
  2048   4.3910e-03   6.8610e-05   59.70   0.09   0.08   0.15 
  4096   2.4711e-03   7.7222e-05   106.08   0.07   0.07   0.13 
  8192   1.4801e-03   9.2509e-05   177.11   0.08   0.09   0.12 
  16384   1.1645e-03   1.4556e-04   225.11   0.05   0.05   0.10 
  32768   1.0765e-03   2.6912e-04   243.52   0.05   0.02   0.32 
  65536   9.3828e-04   4.6914e-04   279.39   0.03   0.02   0.13 
  131072   7.7886e-04   7.7886e-04   336.57   0.02   0.01   0.04 
Five Fastest Protocols
  Msg Size   1st   2nd   3rd   4th   5th
  128   0   1   7   10   2 
  256   0   1   7   10   2 
  512   0   1   7   10   2 
  1024   0   1   7   10   4 
  2048   0   1   7   10   4 
  4096   0   7   10   1   2 
  8192   0   7   2   4   8 
  16384   10   0   2   4   8 
  32768   10   0   2   4   8 
  65536   4   0   2   10   8 
  131072   2   0   4   6   10 
Number of Protocols With Runtimes Within X% of Min
  Msg Size   1%   5%   25%
  128    1   2   4 
  256    1   2   8 
  512    2   4   11 
  1024    1   4   11 
  2048    1   4   11 
  4096    1   4   11 
  8192    1   2   11 
  16384    2   6   11 
  32768    4   10   10 
  65536    4   10   11 
  131072    4   11   11 

DISCUSSION


Patrick H. Worley (worleyph@ornl.gov)
Last Modified Monday, 15-Jul-2002 10:02:33 EDT.