************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./m3dp_fsymm_opt.x on a cray-xt3_ named . with 4096 processors, by Unknown Mon Nov 6 11:39:31 2006
Using Petsc Release Version 2.3.0, Patch 21, April, 26, 2005

                         Max       Max/Min        Avg      Total
Time (sec):           3.608e+03      1.00022   3.607e+03
Objects:              2.234e+03      1.00000   2.234e+03
Flops:                1.660e+11      1.02146   1.643e+11  6.730e+14
Flops/sec:            4.603e+07      1.02156   4.555e+07  1.866e+11
MPI Messages:         3.263e+06      5.02354   8.143e+05  3.335e+09
MPI Message Lengths:  1.526e+09      1.53521   1.829e+03  6.099e+12
MPI Reductions:       8.055e+01      1.00979

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 4.8144e-05   0.0%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%
 1:        Starting: 2.7536e+01   0.8%  3.0105e+10   0.0%  4.312e+06   0.1%  3.130e+00        0.2%  5.900e+02   0.2%
 2:    TimeStepping: 3.5799e+03  99.2%  6.7301e+14 100.0%  3.331e+09  99.9%  1.826e+03       99.8%  3.281e+05  99.8%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops/sec: Max - maximum over all processors
                       Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------

      ##########################################################
      #                                                        #
      #                       WARNING!!!                       #
      #                                                        #
      #   This code was run without the PreLoadBegin()         #
      #   macros. To get timing results we always recommend    #
      #   preloading. otherwise timing numbers may be          #
      #   meaningless.                                         #
      ##########################################################

Event                 Count        Time (sec)      Flops/sec                              --- Global ---        --- Stage ---       Total
                     Max  Ratio        Max Ratio      Max Ratio    Mess Avg len  Reduct  %T  %F  %M  %L  %R   %T  %F  %M  %L  %R  Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

--- Event Stage 1: Starting

VecScale              16    1.0 7.7009e-04   1.1 6.57e+08   1.1 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   6   0   0   0  2497521
VecSet                33    1.0 2.5291e-03   1.0 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0        0
VecAssemblyBegin      56    1.0 3.8890e-01  41.1 0.00e+00   0.0 0.0e+00 0.0e+00 1.7e+02   0   0   0   0   0    0   0   0   0  28        0
VecAssemblyEnd        56    1.0 1.9121e-04   1.4 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0        0
VecPointwiseMult      20    1.0 2.4076e-03   1.1 3.47e+08   1.1 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0  11   0   0   0  1331450
VecScatterBegin      212    1.0 1.8757e-02   3.5 0.00e+00   0.0 3.3e+06 1.8e+03 0.0e+00   0   0   0   0   0    0   0  76  57   0        0
VecScatterEnd        212    1.0 1.4751e-01   6.1 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0        0
MatMult               12    1.0 2.0011e-02   1.1 3.36e+08   1.1 1.8e+05 1.8e+03 0.0e+00   0   0   0   0   0    0  83   4   3   0  1248070
MatConvert            25    1.0 1.1215e-01   1.0 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0        0
MatAssemblyBegin      52    1.0 9.7070e-02   9.7 0.00e+00   0.0 5.5e+05 9.3e+00 1.0e+02   0   0   0   0   0    0   0  13   0  18        0
MatAssemblyEnd        52    1.0 1.2826e-01   1.0 0.00e+00   0.0 4.3e+05 1.1e+03 2.2e+02   0   0   0   0   0    0   0  10   5  37        0
MatGetRow           3750 3750.0 5.1417e-03  56.5 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0        0

--- Event Stage 2: TimeStepping

M3d Solver          1307    1.0 2.5026e+03   1.0 2.49e+07   1.0 1.1e+09 1.8e+03 2.1e+05  68  37  34  34  64   69  37  34  34  64    99165
M3d Par            61551    1.0 1.8292e+02   1.6 0.00e+00   0.0 2.4e+09 1.8e+03 0.0e+00   4   0  71  71   0    4   0  71  71   0        0
VecMax               444    1.0 9.1482e-02   1.1 0.00e+00   0.0 0.0e+00 0.0e+00 4.4e+02   0   0   0   0   0    0   0   0   0   0        0
VecMin                16    1.0 3.3209e-03   1.1 0.00e+00   0.0 0.0e+00 0.0e+00 1.6e+01   0   0   0   0   0    0   0   0   0   0        0
VecDot            117896    1.0 2.9368e+01   1.5 4.66e+08   1.5 0.0e+00 0.0e+00 1.2e+05   1   6   0   0  36    1   6   0   0  36  1278142
VecNorm            61158    1.0 2.6013e+01   1.1 2.05e+08   1.1 0.0e+00 0.0e+00 6.1e+04   1   3   0   0  18    1   3   0   0  19   748751
VecScale          114924    1.0 3.3188e+00  23.1 2.31e+09  23.3 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0   407018
VecCopy            42958    1.0 8.8328e+00   1.4 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0        0
VecSet            262244    1.0 1.5963e+01   1.4 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0        0
VecAXPY           204204    1.0 3.2802e+01   1.1 5.86e+08   1.1 0.0e+00 0.0e+00 0.0e+00   1  10   0   0   0    1  10   0   0   0  2120184
VecAYPX            50714    1.0 7.8883e+00   1.4 7.02e+08   1.4 0.0e+00 0.0e+00 0.0e+00   0   2   0   0   0    0   2   0   0   0  2044699
VecAssemblyBegin    6440    1.0 2.6850e+00   4.6 0.00e+00   0.0 0.0e+00 0.0e+00 1.9e+04   0   0   0   0   6    0   0   0   0   6        0
VecAssemblyEnd      6440    1.0 1.0243e+00  -1.0 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0        0
VecPointwiseMult   34065    1.0 8.4336e+00   1.4 2.19e+08   1.4 0.0e+00 0.0e+00 0.0e+00   0   1   0   0   0    0   1   0   0   0   643407
VecScatterBegin   217258    1.0 2.0723e+01   5.0 0.00e+00   0.0 3.3e+09 1.8e+03 0.0e+00   0   0 100 100   0    0   0 100 100   0        0
VecScatterEnd     217258    1.0 1.0335e+02   2.8 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   2   0   0   0   0    2   0   0   0   0        0
MatMult            59558    1.0 1.0393e+02   1.1 3.16e+08   1.1 9.1e+08 1.8e+03 0.0e+00   3  18  27  27   0    3  18  27  27   0  1184756
MatMultAdd          4028    1.0 6.7853e+00   1.3 4.11e+08   1.3 6.2e+07 1.8e+03 0.0e+00   0   1   2   2   0    0   1   2   2   0  1330680
MatCopy             5228    1.0 1.0788e+01   1.3 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0        0
MatScale            5228    1.0 6.6452e+00   1.2 2.54e+08   1.2 0.0e+00 0.0e+00 0.0e+00   0   1   0   0   0    0   1   0   0   0   881765
MatAssemblyBegin    4008    1.0 1.1102e+01  26.5 0.00e+00   0.0 0.0e+00 0.0e+00 8.0e+03   0   0   0   0   2    0   0   0   0   2        0
MatAssemblyEnd      4008    1.0 4.7701e+00   1.7 0.00e+00   0.0 0.0e+00 0.0e+00 4.0e+03   0   0   0   0   1    0   0   0   0   1        0
MatGetRow         790428    1.0 2.2484e+00  -1.3 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0        0
KSPSetup              32    1.0 5.8842e-03   1.0 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0        0
KSPSolve            5228    1.0 2.4310e+03   1.0 2.36e+07   1.0 8.5e+08 1.8e+03 1.8e+05  67  34  26  26  54   67  34  26  26  54    93973
PCSetUp               32    1.0 7.8594e+02   1.0 0.00e+00   0.0 0.0e+00 0.0e+00 2.0e+01  22   0   0   0   0   22   0   0   0   0        0
PCApply            61158    1.0 1.4768e+03   1.0 7.57e+05   1.1 0.0e+00 0.0e+00 1.2e+01  40   1   0   0   0   40   1   0   0   0     2969
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type                  Creations   Destructions     Memory   Descendants' Mem.

--- Event Stage 0: Main Stage

--- Event Stage 1: Starting

Index Set                           66             64      93664                   0
Map                                856            129      40248                   0
Vec                                767             29    9181168                   0
Vec Scatter                         57              0          0                   0
IS Local to global mapping           3              1     320672                   0
Application Order                    1              0          0                   0
Matrix                             212             50          0                   0
Krylov Solver                       56              0          0                   0
Preconditioner                      56              0          0                   0

--- Event Stage 2: TimeStepping

Map                                 32             20       6240                   0
Vec                                128             20    6331840                   0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 9.99928e-05
Average time for zero size MPI_Send(): 7.19581e-06
Compiled without FORTRAN kernels
Compiled with double precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
Configure run at: Fri Oct 7 16:29:43 2005
Configure options: --with-memcmp-ok --sizeof_void_p=8 --sizeof_short=2 --sizeof_int=4 --sizeof_long=8 --sizeof_long_long=8 --sizeof_float=4 --sizeof_double=8 --bits_per_byte=8 --sizeof_MPI_Comm=4 --sizeof_MPI_Fint=4 --with-batch=1 --with-shared=0 --with-cc="cc --target=catamount" --with-cxx="CC --target=catamount" --with-fc="ftn --target=catamount" --with-blas-lib=acml --with-lapack-lib=acml --with-debugging=0 --COPTFLAGS=" -fastsse -O3 -Munroll=c:4 -tp k8-64" --CXXOPTFLAGS=" -fastsse -O3 -Munroll=c:4 -tp k8-64" --FOPTFLAGS=" -fastsse -O3 -Munroll=c:4 -tp k8-64" -PETSC_ARCH=cray-xt3_fast
-----------------------------------------
Libraries compiled on Thu Aug 24 15:02:33 EDT 2006 on jaguar8
Machine characteristics: Linux jaguar8 2.6.5-7.252-ss #6 Mon Jul 31 18:05:34 PDT 2006 x86_64 x86_64 x86_64 GNU/Linux
Using PETSc directory: /apps/PETSC/petsc-2.3.0
Using PETSc arch: cray-xt3_fast
-----------------------------------------
Using C compiler: cc --target=catamount -fastsse -O3 -Munroll=c:4 -tp k8-64
Using Fortran compiler: ftn --target=catamount -fastsse -O3 -Munroll=c:4 -tp k8-64
-----------------------------------------
Using include paths: -I/apps/PETSC/petsc-2.3.0 -I/apps/PETSC/petsc-2.3.0/bmake/cray-xt3_fast -I/apps/PETSC/petsc-2.3.0/include
------------------------------------------
Using C linker: cc --target=catamount -O2
Using Fortran linker: ftn --target=catamount -O2
Using libraries: -Wl,-rpath,/apps/PETSC/petsc-2.3.0/lib/cray-xt3_fast -L/apps/PETSC/petsc-2.3.0/lib/cray-xt3_fast -lpetscts -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc -Wl,-rpath,/spin/apps/HYPRE/hypre-1.10.0b/cray-xt3/lib -L/spin/apps/HYPRE/hypre-1.10.0b/cray-xt3/lib -lHYPRE_DistributedMatrix -lHYPRE_DistributedMatrixPilutSolver -lHYPRE_Euclid -lHYPRE_IJ_mv -lHYPRE_LSI -lHYPRE_MatrixMatrix -lHYPRE_ParaSails -lHYPRE_krylov -lHYPRE_parcsr_ls -lHYPRE_parcsr_mv -lHYPRE_seq_mv -lHYPRE_sstruct_ls -lHYPRE_sstruct_mv -lHYPRE_struct_ls -lHYPRE_struct_mv -lrt -lacml -lacml -L/opt/acml/3.0/pgi64/lib/cray/cnos64 -L/opt/xt-mpt/default/mpich2-64/P2/lib -L/opt/acml/3.0/pgi64/lib -L/opt/xt-libsci/default/pgi/cnos64/lib -L/opt/xt-mpt/default/sma/lib -L/opt/xt-lustre-ss/default/lib64 -L/opt/xt-catamount/default/lib/cnos64 -L/opt/xt-pe/default/lib/cnos64 -L/opt/xt-libc/default/amd64/lib -L/opt/xt-os/default/lib/cnos64 -L/opt/xt-service/default/lib/cnos64 -L/opt/pgi/default/linux86-64/default/lib -L/opt/gcc/3.2.3/lib/gcc-lib/x86_64-suse-linux/3.2.3/ -llapacktimers -lsci -lmpichf90 -lmpich -lacml -llustre -lpgf90 -lpgf90_rpm1 -lpgf902 -lpgf90rtl -lpgftnrtl -lpgc -lm -lcatamount -lsysio -lportals -lC -lcrtend
------------------------------------------
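
Note on the "Total Mflop/s" definition in the phase summary legend: the sketch below is not part of the PETSc output. It is a small Python consistency check that recomputes the aggregate rate from figures printed in this log. It assumes the scale factor is 1e-6 (flops to Mflop), although the legend prints it as 10e-6; the function name total_mflops is hypothetical, and the KSPSolve flop count is estimated from the rounded %F column, so only approximate agreement should be expected.

```python
def total_mflops(total_flops, max_time_sec):
    """Aggregate rate in Mflop/s.

    Assumed form of the legend's definition:
    1e-6 * (sum of flops over all processors) / (max time over all processors)
    """
    return 1e-6 * total_flops / max_time_sec


# Values copied from the header block of this log.
total_flops = 6.730e14            # Flops, Total column
max_time = 3.608e3                # Time (sec), Max column
reported_flops_per_sec = 1.866e11 # Flops/sec, Total column

rate = total_mflops(total_flops, max_time)
print(f"whole run: {rate:.0f} Mflop/s "
      f"(log reports {reported_flops_per_sec * 1e-6:.0f} Mflop/s)")

# KSPSolve row: %F = 34 (rounded), max time 2.4310e+03 s, reported 93973 Mflop/s.
ksp_flops_estimate = 0.34 * total_flops
ksp_rate = total_mflops(ksp_flops_estimate, 2.4310e3)
print(f"KSPSolve (estimated from %F): {ksp_rate:.0f} Mflop/s (log reports 93973)")
```

Both recomputed rates land within about one percent of the printed values, which is what one would expect if the factor is 1e-6 and the remaining gap comes from rounding in the printed columns.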