************************************************************************************************************************
***        WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document                 ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./m3dp.x on a cray-xt3_ named ¸Uj with 512 processors, by Unknown Sat Jun 10 01:44:47 2006
Using Petsc Release Version 2.3.0, Patch 21, April 26, 2005

                         Max       Max/Min        Avg      Total
Time (sec):           5.588e+03      1.00004   5.588e+03
Objects:              6.018e+04      1.00000   6.018e+04
Flops:                9.202e+11      1.14093   8.714e+11  4.462e+14
Flops/sec:            1.647e+08      1.14090   1.559e+08  7.984e+10
MPI Messages:         2.915e+06      1.65886   2.331e+06  1.193e+09
MPI Message Lengths:  3.225e+09      1.10718   1.326e+03  1.582e+12
MPI Reductions:       1.839e+03      1.10838

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                          e.g., VecAXPY() for real vectors of length N --> 2N flops
                          and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------   ----- Flops -----   --- Messages ---   -- Message Lengths --   -- Reductions --
                       Avg     %Total      Avg     %Total     counts  %Total      Avg       %Total      counts  %Total
 0:     Main Stage: 1.5278e-04   0.0%  0.0000e+00    0.0%  0.000e+00   0.0%   0.000e+00      0.0%    0.000e+00   0.0%
 1:       Starting: 2.9401e+00   0.1%  8.6283e+08    0.0%  3.493e+05   0.0%   5.337e-01      0.0%    3.660e+02   0.0%
 2:   TimeStepping: 5.5852e+03  99.9%  4.4617e+14  100.0%  1.193e+09 100.0%   1.325e+03    100.0%    9.022e+05 100.0%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
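The columns in the summary table above are internally consistent: each Total is the sum over all 512 processes (so roughly Avg × 512 for summed quantities), and the Flops/sec rows are the Flops rows divided by run time. A minimal sanity-check sketch, with the values transcribed by hand from this log (nothing here is produced by PETSc itself):

```python
# Verify the Max/Avg/Total relationships in the summary table above.
# All constants are transcribed from this log.
nproc = 512

time_max = 5.588e+03     # Time (sec), Max column
flops_max = 9.202e+11    # Flops, Max column
flops_avg = 8.714e+11    # Flops, Avg column
flops_total = 4.462e+14  # Flops, Total column

# Total flops is the sum over processors, i.e. approximately Avg * nproc.
assert abs(flops_avg * nproc - flops_total) / flops_total < 1e-3

# The Flops/sec rows are flops divided by run time.
flops_per_sec_max = flops_max / time_max  # ~1.647e+08, matches the Max column
flops_per_sec_avg = flops_avg / time_max  # ~1.559e+08, matches the Avg column
print(f"{flops_per_sec_max:.3e} {flops_per_sec_avg:.3e}")
```

The same arithmetic applies to the MPI Messages and Message Lengths rows; only the ratios (load imbalance) cannot be reconstructed from the aggregates alone.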
Phase summary info:
   Count: number of times phase was executed
   Time and Flops/sec: Max - maximum over all processors
                       Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase          %F - percent flops in this phase
      %M - percent messages in this phase      %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 1e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------


      ##########################################################
      #                                                        #
      #                       WARNING!!!                       #
      #                                                        #
      #   This code was run without the PreLoadBegin()         #
      #   macros. To get timing results we always recommend    #
      #   preloading. Otherwise timing numbers may be          #
      #   meaningless.                                         #
      ##########################################################

Event                Count      Time (sec)      Flops/sec                          --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio    Max   Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage


--- Event Stage 1: Starting

VecScale              16 1.0 4.5395e-04  1.0 5.55e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0 14  0  0  0 272048
VecSet                36 1.0 1.4744e-03  1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0      0
VecAssemblyBegin      56 1.0 1.2308e-01 22.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.7e+02  0  0  0  0  0   1  0  0  0 46      0
VecAssemblyEnd        56 1.0 1.8954e-04  1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0      0
VecPointwiseMult      20 1.0 1.5481e-03  1.1 2.98e+08 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0 24  0  0  0 132958
VecScatterBegin      256 1.0 7.4687e-03  2.0 0.00e+00 0.0 3.3e+05 1.3e+03 0.0e+00  0  0  0  0  0   0  0 94 66  0      0
VecScatterEnd        256 1.0 8.1380e-02 21.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0      0
MatMult                4 1.0 3.8230e-03  1.0 2.81e+08 1.0 5.1e+03 1.3e+03 0.0e+00  0  0  0  0  0   0 62  1  1  0 139550
MatConvert             9 1.0 2.0526e-02  1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0      0
MatAssemblyBegin      20 1.0 2.3590e-02 17.8 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+01  0  0  0  0  0   0  0  0  0 11      0
MatAssemblyEnd        20 1.0 2.8718e-02  1.0 0.00e+00 0.0 1.5e+04 7.2e+02 9.2e+01  0  0  0  0  0   1  0  4  2 25      0
MatGetRow         161009 1.0 6.0809e-02  1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   2  0  0  0  0      0

--- Event Stage 2: TimeStepping

M3d Solver          1201 1.0 4.1260e+03  1.2 2.20e+08 1.1 9.9e+08 1.3e+03 7.8e+05 68 93 83 83 87  68 93 83 83 87 100191
M3d-Par            66032 1.0 8.1691e+02  2.4 0.00e+00 0.0 2.2e+08 1.3e+03 0.0e+00  9  0 18 18  0   9  0 18 18  0      0
VecMax              1224 1.0 1.4762e-01  1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+03  0  0  0  0  0   0  0  0  0  0      0
VecMin               408 1.0 4.9033e-02  1.3 0.00e+00 0.0 0.0e+00 0.0e+00 4.1e+02  0  0  0  0  0   0  0  0  0  0      0
VecMDot           368668 1.1 7.6778e+02  2.0 5.09e+08 1.9 0.0e+00 0.0e+00 3.5e+05 10 24  0  0 39  10 24  0  0 39 137774
VecNorm           389128 1.1 1.1836e+02  1.3 1.59e+08 1.2 0.0e+00 0.0e+00 3.7e+05  2  2  0  0 41   2  2  0  0 41  64232
VecScale          490036 1.1 1.5727e+01  1.3 5.78e+08 1.2 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 240395
VecCopy            31856 1.0 3.6184e+00  2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0      0
VecSet            995146 1.1 2.8270e+01  1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0      0
VecAXPY            42112 1.1 3.8459e+00  2.4 1.15e+09 2.4 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 247416
VecMAXPY          384324 1.1 3.1351e+02  1.2 7.65e+08 1.0 0.0e+00 0.0e+00 0.0e+00  5 25  0  0  0   5 25  0  0  0 360365
VecPointwiseMult    7434 1.0 1.1963e+00  6.4 7.97e+08 6.4 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  63950
VecScatterBegin  1335568 1.1 1.3075e+02  1.3 0.00e+00 0.0 1.2e+09 1.3e+03 0.0e+00  2  0 97 94  0   2  0 97 94  0      0
VecScatterEnd    1335568 1.1 1.6310e+02  4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0      0
VecNormalize      384324 1.1 1.2730e+02  1.3 2.11e+08 1.2 0.0e+00 0.0e+00 3.6e+05  2  3  0  0 40   2  3  0  0 40  88417
MatMult           387128 1.1 4.1803e+02  1.3 2.90e+08 1.2 4.7e+08 1.3e+03 0.0e+00  6 11 39 38  0   6 11 39 38  0 117212
MatSolve          389128 1.1 2.2423e+03  1.3 1.41e+08 1.2 0.0e+00 0.0e+00 0.0e+00 35 29  0  0  0  35 29  0  0  0  56824
MatLUFactorNum      4804 1.0 9.2263e+01  1.3 1.73e+08 1.2 0.0e+00 0.0e+00 0.0e+00  1  1  0  0  0   1  1  0  0  0  70614
MatILUFactorSym     4804 1.0 1.8179e+02  1.5 0.00e+00 0.0 0.0e+00 0.0e+00 4.8e+03  2  0  0  0  1   2  0  0  0  1      0
MatCopy             4804 1.0 1.0701e+02  1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+04  2  0  0  0  2   2  0  0  0  2      0
MatAssemblyBegin   13212 1.0 1.9299e+01 62.7 0.00e+00 0.0 0.0e+00 0.0e+00 1.7e+04  0  0  0  0  2   0  0  0  0  2      0
MatAssemblyEnd     13212 1.0 8.4849e+00  1.0 0.00e+00 0.0 0.0e+00 0.0e+00 8.4e+03  0  0  0  0  1   0  0  0  0  1      0
MatGetRow       96566008 1.0 5.1696e+01  2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0      0
MatGetSubMatrice    4804 1.0 4.5104e+01  1.1 0.00e+00 0.0 3.1e+07 3.0e+03 1.4e+04  1  0  3  6  2   1  0  3  6  2      0
MatGetOrdering      4804 1.0 1.1766e+00  7.1 0.00e+00 0.0 0.0e+00 0.0e+00 9.6e+03  0  0  0  0  1   0  0  0  0  1      0
MatIncreaseOvrlp      32 1.0 6.3041e-02  1.1 0.00e+00 0.0 0.0e+00 0.0e+00 6.4e+01  0  0  0  0  0   0  0  0  0  0      0
MatZeroEntries      4804 1.0 8.7557e-01  1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0      0
KSPGMRESOrthog    368668 1.1 1.0571e+03  1.6 6.05e+08 1.5 0.0e+00 0.0e+00 3.5e+05 15 47  0  0 39  15 47  0  0 39 200139
KSPSetup            9608 1.0 1.0135e+00 -1.0 0.00e+00 0.0 0.0e+00 0.0e+00 3.2e+01  0  0  0  0  0   0  0  0  0  0      0
KSPSolve            4804 1.0 3.9817e+03  1.2 2.28e+08 1.1 9.7e+08 1.3e+03 7.5e+05 66 93 81 82 83  66 93 81 82 83 103795
PCSetUp             9608 1.0 3.1571e+02  1.3 5.30e+07 1.3 3.1e+07 2.9e+03 3.4e+04  5  1  3  6  4   5  1  3  6  4  20636
PCSetUpOnBlocks     4804 1.0 2.7004e+02  1.4 6.54e+07 1.3 0.0e+00 0.0e+00 1.4e+04  4  1  0  0  2   4  1  0  0  2  24126
PCApply           389128 1.1 2.3936e+03  1.3 1.30e+08 1.2 4.7e+08 1.3e+03 0.0e+00 37 29 40 38  0  37 29 40 38  0  53231
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions   Memory   Descendants' Mem.
--- Event Stage 0: Main Stage


--- Event Stage 1: Starting

Index Set                      34       32        36384          0
Map                           536       49        15288          0
Vec                           687       13      2100176          0
Vec Scatter                    25        0            0          0
IS Local to global mapping      3        1       403924          0
Application Order               1        0            0          0
Matrix                         84       18            0          0
Krylov Solver                  56        0            0          0
Preconditioner                 56        0            0          0

--- Event Stage 2: TimeStepping

Index Set                   14508    14380   1178857696          0
Map                         24116    23924      7464288          0
Vec                         10368     9640   1557361280          0
Vec Scatter                    32        0            0          0
Matrix                       9608     9544  -2147483648          0
Krylov Solver                  32        0            0          0
Preconditioner                 32        0            0          0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 9.11713e-05
Average time for zero size MPI_Send(): 7.89249e-06
Compiled without FORTRAN kernels
Compiled with double precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
Configure run at: Fri Oct 7 16:29:43 2005
Configure options: --with-memcmp-ok --sizeof_void_p=8 --sizeof_short=2 --sizeof_int=4 --sizeof_long=8 --sizeof_long_long=8 --sizeof_float=4 --sizeof_double=8 --bits_per_byte=8 --sizeof_MPI_Comm=4 --sizeof_MPI_Fint=4 --with-batch=1 --with-shared=0 --with-cc="cc --target=catamount" --with-cxx="CC --target=catamount" --with-fc="ftn --target=catamount" --with-blas-lib=acml --with-lapack-lib=acml --with-debugging=0 --COPTFLAGS=" -fastsse -O3 -Munroll=c:4 -tp k8-64" --CXXOPTFLAGS=" -fastsse -O3 -Munroll=c:4 -tp k8-64" --FOPTFLAGS=" -fastsse -O3 -Munroll=c:4 -tp k8-64" -PETSC_ARCH=cray-xt3_fast
-----------------------------------------
Libraries compiled on Fri Mar 31 16:27:33 EST 2006 on jaguar4
Machine characteristics: Linux jaguar4 2.4.21-0-sles9-ss-lustre #1 Wed Mar 1 15:31:08 PST 2006 x86_64 x86_64 x86_64 GNU/Linux
Using PETSc directory: /spin/apps/PETSC/petsc-2.3.0
Using PETSc arch: cray-xt3_fast
-----------------------------------------
Using C compiler: cc --target=catamount -fastsse -O3 -Munroll=c:4 -tp k8-64
Using Fortran compiler: ftn --target=catamount -fastsse -O3 -Munroll=c:4 -tp k8-64
-----------------------------------------
Using include paths: -I/spin/apps/PETSC/petsc-2.3.0 -I/spin/apps/PETSC/petsc-2.3.0/bmake/cray-xt3_fast -I/spin/apps/PETSC/petsc-2.3.0/include
------------------------------------------
Using C linker: cc --target=catamount -O2
Using Fortran linker: ftn --target=catamount -O2
Using libraries: -Wl,-rpath,/spin/apps/PETSC/petsc-2.3.0/lib/cray-xt3_fast -L/spin/apps/PETSC/petsc-2.3.0/lib/cray-xt3_fast -lpetscts -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc -lrt -lacml -lacml -L/opt/acml/2.6/pgi64/lib/cray/cnos64 -L/opt/xt-mpt/default/mpich2-64/P2/lib -L/opt/acml/2.6/pgi64/lib -L/opt/xt-libsci/default/pgi/cnos64/lib -L/opt/xt-mpt/default/sma/lib -L/opt/xt-lustre-ss/default/lib64 -L/opt/xt-catamount/default/lib/cnos64 -L/opt/xt-pe/default/lib/cnos64 -L/opt/xt-libc/default/amd64/lib -L/opt/xt-os/default/lib/cnos64 -L/opt/xt-service/default/lib/cnos64 -L/opt/pgi/default/linux86-64/default/lib -L/opt/gcc/3.2.3/lib/gcc-lib/x86_64-suse-linux/3.2.3/ -llapacktimers -lsci -lmpichf90 -lmpich -lacml -llustre -lpgf90 -lpgf90_rpm1 -lpgf902 -lpgf90rtl -lpgftnrtl -lpgc -lm -lcatamount -lsysio -lportals -lC -lcrtend
------------------------------------------