PSTSWM AlphaSC-500 Point-to-Point Communication Performance

Performance Studies using PSTSWM


Compaq AlphaServer SC SWAP Performance

(ordered swap of 8KB message using MPI within a node)

Date/Person: January 26, 2000 / P. Worley
Platform: Compaq AlphaServer SC at Oak Ridge National Laboratory (colt.ccs.ornl.gov):
     16 ES40 4-way SMP nodes (500 MHz Alpha 21264 with 4MB L2 cache)
Environment: Digital UNIX V5.0;   RMS 2.36
Communication Library: MPI
SWAP size: 1024 REAL*8 floating point values each direction
Message size: Largest - 1024 REAL*8 floating point values
              Smallest - 1 REAL*8 floating point value
Processors: 0 and 1
Latency Definition: (T1024-T512)/512
Model Error Range: [1,1024]
Results:

ordered simple swap
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               141.49                     6.46                68.5%
10 iter.              161.04                     6.35                69.2%

ordered swap using nonblocking send
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               130.86                     7.12                66.1%
10 iter.              159.97                     6.85                68.1%
1 iter. w/overlap     134.08                     7.15                66.3%
10 iter. w/overlap    154.89                     7.13                67.9%

ordered swap using nonblocking receive
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               130.65                     8.25                64.5%
10 iter.              153.04                     8.23                65.2%
1 iter. w/overlap     138.38                     8.47                64.8%
10 iter. w/overlap    157.84                     9.37                63.5%

ordered swap using nonblocking send and receive
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               126.42                     8.44                63.3%
10 iter.              153.78                     8.65                64.1%
1 iter. w/overlap     138.85                     9.07                63.1%
10 iter. w/overlap    154.48                     9.76                63.4%

ordered swap using ready send
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               117.53                     14.89               55.4%
10 iter.              141.61                     14.82               51.7%
1 iter. w/overlap     142.22                     8.53                64.2%
10 iter. w/overlap    157.51                     9.33                63.3%

ordered swap using nonblocking ready send
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               114.09                     15.16               49.8%
10 iter.              141.66                     15.23               51.6%
1 iter. w/overlap     135.40                     8.75                64.0%
10 iter. w/overlap    153.55                     9.79                64.2%

synchronous
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               133.20                     11.00               59.5%
10 iter.              149.00                     10.50               61.8%

ordered swap using nonblocking sync. send
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               126.81                     8.69                61.9%
10 iter.              156.10                     8.52                64.1%
1 iter. w/overlap     117.36                     7.70                63.1%
10 iter. w/overlap    152.49                     7.73                66.1%

ordered swap using nonblocking receive with sync. send
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               125.07                     9.36                63.3%
10 iter.              151.99                     9.30                63.1%
1 iter. w/overlap     127.60                     9.52                61.6%
10 iter. w/overlap    155.33                     10.31               61.8%

ordered swap using nonblocking sync. send and receive
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               120.29                     9.87                59.8%
10 iter.              149.65                     9.69                61.5%
1 iter. w/overlap     119.77                     9.45                60.5%
10 iter. w/overlap    153.44                     10.33               62.2%

ordered simple swap using sync. send
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               130.65                     8.95                61.8%
10 iter.              160.85                     8.92                63.3%


Protocol Sensitivity Summary for Unidirectional Swap of 8192 Bytes (1 iteration/no overlap)
Runtime Statistics
Msg Size   min Sec      min Sec/Msg   max MBytes/Sec   (mean-min)/min   (median-min)/min   (max-min)/min
8          1.3273e-02   1.2962e-05    1.23             0.53             0.41               1.37
16         6.6616e-03   1.3011e-05    2.46             0.54             0.44               1.38
32         3.2878e-03   1.2843e-05    4.98             0.57             0.43               1.45
64         2.0614e-03   1.6105e-05    7.95             0.43             0.31               1.16
128        1.2430e-03   1.9422e-05    13.18            0.39             0.21               1.06
256        7.9600e-04   2.4875e-05    20.58            0.31             0.20               0.72
512        1.0230e-03   6.3938e-05    16.02            0.11             0.07               0.35
1024       5.3360e-04   6.6700e-05    30.70            0.09             0.07               0.22
2048       2.9160e-04   7.2900e-05    56.19            0.10             0.08               0.23
4096       1.7200e-04   8.6000e-05    95.26            0.10             0.09               0.24
8192       1.1580e-04   1.1580e-04    141.49           0.12             0.12               0.24
Five Fastest Protocols
Msg Size   1st   2nd   3rd   4th   5th
8          0     1     2     3     7
16         0     1     2     3     7
32         0     1     2     3     7
64         0     1     2     3     7
128        0     1     2     3     10
256        0     1     7     2     3
512        0     1     7     10    3
1024       0     1     7     10    2
2048       0     1     10    2     3
4096       0     1     2     10    3
8192       0     6     1     10    2
Number of Protocols With Runtimes Within X% of Min
Msg Size   1%   5%   25%
8          1    1    2
16         1    1    2
32         1    1    2
64         1    2    3
128        1    2    6
256        1    2    6
512        1    3    10
1024       1    4    11
2048       1    2    11
4096       1    2    11
8192       1    1    11
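
The derived columns of the runtime statistics can be cross-checked from the raw timings. The sketch below is not part of PSTSWM; it rests on two assumptions inferred from the numbers themselves: each experiment splits 8192 bytes into equal-size messages, and the reported rate is 2 x (message size) / (seconds per message) with 1 MByte = 10**6 bytes, which is consistent with an ordered swap performing two one-way transfers in sequence. The function name derived_columns is illustrative only.

```python
# Rows copied from the "1 iteration/no overlap" runtime statistics:
# (message size in bytes, min total seconds, min seconds per message)
ROWS = [
    (8, 1.3273e-02, 1.2962e-05),
    (512, 1.0230e-03, 6.3938e-05),
    (8192, 1.1580e-04, 1.1580e-04),
]
TOTAL_BYTES = 8192  # bytes moved in each direction per experiment


def derived_columns(size, sec_per_msg):
    """Recompute (message count, MByte/sec) for one table row."""
    # Assumption: the 8192 bytes are split into equal-size messages.
    n_msgs = TOTAL_BYTES // size
    # Assumption: the rate counts both directions of the swap and
    # uses 1 MByte = 10**6 bytes.
    mbytes_per_sec = 2 * size / sec_per_msg / 1e6
    return n_msgs, round(mbytes_per_sec, 2)


for size, min_sec, sec_per_msg in ROWS:
    n_msgs, rate = derived_columns(size, sec_per_msg)
    # min_sec / n_msgs should agree with the tabulated seconds per message.
    print(size, n_msgs, f"{min_sec / n_msgs:.3e}", rate)
```

Under these assumptions the recomputed rates for the three sample rows agree with the tabulated values (1.23, 16.02, and 141.49 MBytes/sec).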


Protocol Sensitivity Summary for Unidirectional Swap of 8192 Bytes (10 iterations/no overlap)
Runtime Statistics
Msg Size   min Sec      min Sec/Msg   max MBytes/Sec   (mean-min)/min   (median-min)/min   (max-min)/min
8          1.3130e-02   1.2823e-05    1.25             0.53             0.40               1.37
16         6.6259e-03   1.2941e-05    2.47             0.52             0.40               1.35
32         3.3151e-03   1.2950e-05    4.94             0.53             0.41               1.36
64         2.0619e-03   1.6109e-05    7.95             0.42             0.30               1.15
128        1.2509e-03   1.9545e-05    13.10            0.36             0.21               0.91
256        8.0160e-04   2.5050e-05    20.44            0.28             0.15               0.72
512        9.9040e-04   6.1900e-05    16.54            0.11             0.08               0.26
1024       5.1958e-04   6.4947e-05    31.53            0.09             0.07               0.23
2048       2.7718e-04   6.9295e-05    59.11            0.09             0.07               0.24
4096       1.6032e-04   8.0160e-05    102.20           0.07             0.06               0.17
8192       1.0174e-04   1.0174e-04    161.04           0.06             0.05               0.14
Five Fastest Protocols
Msg Size   1st   2nd   3rd   4th   5th
8          0     1     2     3     7
16         0     1     2     3     7
32         0     1     2     3     7
64         0     1     2     7     3
128        0     1     2     7     3
256        0     1     10    7     2
512        0     1     7     10    2
1024       0     1     10    7     3
2048       0     1     10    7     2
4096       0     10    7     1     2
8192       0     10    1     7     3
Number of Protocols With Runtimes Within X% of Min
Msg Size   1%   5%   25%
8          1    1    2
16         1    1    2
32         1    1    2
64         1    1    3
128        1    2    6
256        1    2    6
512        1    2    10
1024       1    3    11
2048       1    5    11
4096       1    5    11
8192       3    5    11
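
The 8192-byte row of this summary can be rebuilt from the per-protocol "10 iter." peak bandwidths reported in the Results tables. The sketch below is illustrative and makes two assumptions that the page does not state: protocol numbers 0 through 10 follow the order of the result sections (so the entry for protocol 6 is inferred from position only), and runtime is inversely proportional to the reported bandwidth.

```python
# Peak MByte/sec for the 8192-byte swap, "10 iter." rows (no overlap).
# Assumption: protocol IDs 0..10 follow the order of the result sections.
BANDWIDTH = [161.04, 159.97, 153.04, 153.78, 141.61, 141.66,
             149.00, 156.10, 151.99, 149.65, 160.85]

best = max(BANDWIDTH)
# Assumption: runtime is inversely proportional to bandwidth, so the
# relative slowdown versus the fastest protocol is best / bw - 1.
slowdown = [best / bw - 1.0 for bw in BANDWIDTH]

# Protocol IDs ordered from fastest to slowest, truncated to five.
five_fastest = sorted(range(len(BANDWIDTH)), key=lambda p: slowdown[p])[:5]

# Number of protocols whose runtime is within X% of the minimum.
within = {pct: sum(s <= pct / 100.0 for s in slowdown) for pct in (1, 5, 25)}

print("five fastest:", five_fastest)
print("within X% of min:", within)
print("(max-min)/min:", round(max(slowdown), 2))
```

With these inputs the sketch gives protocols 0, 10, 1, 7, 3 as the five fastest and counts of 3, 5, and 11 within 1%, 5%, and 25% of the minimum, matching the 8192-byte rows above.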


Protocol Sensitivity Summary for Unidirectional Swap of 8192 Bytes (1 iteration with overlap)
Runtime Statistics
Msg Size   min Sec      min Sec/Msg   max MBytes/Sec   (mean-min)/min   (median-min)/min   (max-min)/min
8          1.3557e-02   1.3240e-05    1.21             0.31             0.34               0.67
16         6.7498e-03   1.3183e-05    2.43             0.30             0.31               0.67
32         3.4196e-03   1.3358e-05    4.79             0.31             0.33               0.67
64         2.0792e-03   1.6244e-05    7.88             0.26             0.26               0.60
128        1.2572e-03   1.9644e-05    13.03            0.24             0.21               0.78
256        8.1120e-04   2.5350e-05    20.20            0.18             0.16               0.54
512        1.0276e-03   6.4225e-05    15.94            0.07             0.08               0.16
1024       5.2660e-04   6.5825e-05    31.11            0.08             0.09               0.18
2048       3.0000e-04   7.5000e-05    54.61            0.04             0.03               0.08
4096       1.7280e-04   8.6400e-05    94.81            0.06             0.07               0.11
8192       1.1520e-04   1.1520e-04    142.22           0.09             0.06               0.21
Five Fastest Protocols
Msg Size   1st   2nd   3rd   4th   5th
8          0     1     7     2     4
16         0     1     7     2     4
32         0     1     7     2     4
64         0     1     7     4     2
128        0     1     7     2     4
256        0     1     7     2     10
512        0     1     7     4     10
1024       0     1     2     4     7
2048       0     2     3     7     4
4096       0     1     2     3     4
8192       4     0     3     2     5
Number of Protocols With Runtimes Within X% of Min
Msg Size   1%   5%   25%
8          1    1    3
16         1    1    3
32         1    1    4
64         1    1    5
128        1    1    7
256        1    2    9
512        1    3    11
1024       1    2    11
2048       3    6    11
4096       1    4    11
8192       1    4    11


Protocol Sensitivity Summary for Unidirectional Swap of 8192 Bytes (10 iterations with overlap)
Runtime Statistics
Msg Size   min Sec      min Sec/Msg   max MBytes/Sec   (mean-min)/min   (median-min)/min   (max-min)/min
8          1.3898e-02   1.3572e-05    1.18             0.36             0.38               0.70
16         6.9694e-03   1.3612e-05    2.35             0.36             0.39               0.68
32         3.4887e-03   1.3628e-05    4.70             0.36             0.40               0.68
64         2.1734e-03   1.6979e-05    7.54             0.29             0.32               0.57
128        1.3013e-03   2.0332e-05    12.59            0.24             0.26               0.61
256        8.2838e-04   2.5887e-05    19.78            0.20             0.20               0.56
512        1.0392e-03   6.4953e-05    15.77            0.07             0.07               0.15
1024       5.3386e-04   6.6733e-05    30.69            0.08             0.09               0.17
2048       2.8414e-04   7.1035e-05    57.66            0.07             0.07               0.13
4096       1.6450e-04   8.2250e-05    99.60            0.05             0.06               0.13
8192       1.0358e-04   1.0358e-04    158.18           0.02             0.02               0.07
Five Fastest Protocols
Msg Size   1st   2nd   3rd   4th   5th
8          0     1     7     10    2
16         0     1     7     10    2
32         0     1     7     10    4
64         0     1     7     10    2
128        0     1     7     10    4
256        0     1     7     10    4
512        1     0     7     10    4
1024       0     1     7     10    2
2048       0     10    1     7     2
4096       10    0     1     4     7
8192       10    0     2     4     8
Number of Protocols With Runtimes Within X% of Min
Msg Size   1%   5%   25%
8          1    1    3
16         1    1    3
32         1    1    3
64         1    1    3
128        1    2    4
256        1    2    7
512        3    4    11
1024       2    4    11
2048       2    4    11
4096       2    4    11
8192       4    10   11

DISCUSSION


Patrick H. Worley (worleyph@ornl.gov)
Last Modified Monday, 15-Jul-2002 10:02:40 EDT.