************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./m3dp_fsymm_opt.x on a cray-xt3_ named . with 160 processors, by Unknown Fri Sep 29 09:41:10 2006
Using Petsc Release Version 2.3.0, Patch 21, April, 26, 2005

                          Max       Max/Min        Avg      Total
Time (sec):            1.191e+03      1.00004   1.191e+03
Objects:               2.234e+03      1.00000   2.234e+03
Flops:                 9.361e+10      1.00361   9.346e+10  1.495e+13
Flops/sec:             7.858e+07      1.00362   7.845e+07  1.255e+10
MPI Messages:          7.328e+05      1.50766   6.087e+05  9.739e+07
MPI Message Lengths:   7.823e+08      1.00720   1.280e+03  1.246e+11
MPI Reductions:        1.348e+03      1.00930

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                          e.g., VecAXPY() for real vectors of length N --> 2N flops
                          and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------     ----- Flops -----   --- Messages ---   -- Message Lengths --  -- Reductions --
                       Avg     %Total        Avg     %Total     counts  %Total       Avg       %Total    counts   %Total
 0:     Main Stage: 7.4385e-05    0.0%   0.0000e+00    0.0%   0.000e+00    0.0%   0.000e+00     0.0%   0.000e+00    0.0%
 1:       Starting: 2.9177e+00    0.2%   6.0307e+08    0.0%   1.450e+05    0.1%   2.178e+00     0.2%   5.900e+02    0.3%
 2:   TimeStepping: 1.1884e+03   99.8%   1.4953e+13  100.0%   9.725e+07   99.9%   1.277e+03    99.8%   2.141e+05   99.7%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
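The aggregate rates in the summary above follow directly from its totals, and the flop-counting convention fixes per-operation costs. As a quick arithmetic sanity check (plain Python; the numbers are copied from the summary table, so this is an illustration, not part of the original log):

```python
# Totals reported in the summary above (160 processors).
total_flops = 1.495e13   # "Flops:" Total column
max_time    = 1.191e3    # "Time (sec):" Max column

# Aggregate flop rate = total flops / max wall-clock time over all ranks.
rate = total_flops / max_time
print(f"{rate:.3e} flops/sec")  # ~1.255e+10, matching the "Flops/sec" Total entry

# Flop-counting convention stated above: a real VecAXPY (y = a*x + y) on
# vectors of length N does N multiplies and N additions, i.e. 2N flops;
# the complex case costs 8N flops per the same convention.
def real_vecaxpy_flops(n):
    return 2 * n

def complex_vecaxpy_flops(n):
    return 8 * n
```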
Phase summary info:
   Count: number of times phase was executed
   Time and Flops/sec: Max - maximum over all processors
                       Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase          %F - percent flops in this phase
      %M - percent messages in this phase      %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 1e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------


      ##########################################################
      #                                                        #
      #                          WARNING!!!                    #
      #                                                        #
      #   This code was run without the PreLoadBegin()         #
      #   macros. To get timing results we always recommend    #
      #   preloading; otherwise timing numbers may be          #
      #   meaningless.                                         #
      #                                                        #
      ##########################################################


Event                Count      Time (sec)      Flops/sec                          --- Global ---  --- Stage ---   Total
                   Max Ratio  Max        Ratio  Max       Ratio  Mess  Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage


--- Event Stage 1: Starting

VecScale              16 1.0 3.8528e-04 1.0  6.57e+08 1.0  0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  6  0  0  0 100166
VecSet                33 1.0 1.3237e-03 1.0  0.00e+00 0.0  0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0      0
VecAssemblyBegin      56 1.0 3.6373e-02 6.7  0.00e+00 0.0  0.0e+00 0.0e+00 1.7e+02  0  0  0  0  0   0  0  0  0 28      0
VecAssemblyEnd        56 1.0 1.3828e-04 1.2  0.00e+00 0.0  0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0      0
VecPointwiseMult      20 1.0 1.3635e-03 1.1  3.30e+08 1.1  0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0 11  0  0  0  47173
VecScatterBegin      272 1.0 7.4069e-03 -0.0 0.00e+00 0.0  1.1e+05 1.3e+03 0.0e+00 -0  0  0  0  0  -0  0 75 66  0      0
VecScatterEnd        272 1.0 2.3654e-02 -0.0 0.00e+00 0.0  0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0      0
MatMult               12 1.0 1.0212e-02 1.0  3.14e+08 1.0  4.8e+03 1.3e+03 0.0e+00  0  0  0  0  0   0 83  3  3  0  48976
MatConvert            25 1.0 5.7813e-02 1.0  0.00e+00 0.0  0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   2  0  0  0  0      0
MatAssemblyBegin      52 1.0 3.1039e-02 8.0  0.00e+00 0.0  2.3e+04 9.3e+00 1.0e+02  0  0  0  0  0   1  0 16  0 18      0
MatAssemblyEnd        52 1.0 6.4727e-02 1.0  0.00e+00 0.0  1.1e+04 7.2e+02 2.2e+02  0  0  0  0  0   2  0  8  4 37      0
MatGetRow           5001 1.0 3.5343e-03 1.1  0.00e+00 0.0  0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0      0

--- Event Stage 2: TimeStepping

M3d Solver          1201 1.0 6.6599e+02 1.0  3.69e+07 1.0  2.4e+07 1.3e+03 1.7e+05 55 26 25 24 77  55 26 25 24 77   5746
M3d-Par            77476 1.0 8.1936e+01 1.4  0.00e+00 0.0  7.7e+07 1.3e+03 0.0e+00  6  0 79 80  0   6  0 79 80  0      0
VecMax              1224 1.0 9.4970e-02 1.1  0.00e+00 0.0  0.0e+00 0.0e+00 1.2e+03  0  0  0  0  1   0  0  0  0  1      0
VecMin               408 1.0 3.3467e-02 1.1  0.00e+00 0.0  0.0e+00 0.0e+00 4.1e+02  0  0  0  0  0   0  0  0  0  0      0
VecDot             90428 1.0 9.1109e+00 1.3  5.14e+08 1.3  0.0e+00 0.0e+00 9.0e+04  1  4  0  0 42   1  4  0  0 42  63421
VecNorm            47214 1.0 1.0465e+01 1.2  2.22e+08 1.2  0.0e+00 0.0e+00 4.7e+04  1  2  0  0 22   1  2  0  0 22  28838
VecScale           39764 1.0 5.4106e-01 1.0  2.92e+08 1.0  0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  45197
VecCopy            48696 1.0 4.0605e+00 1.4  0.00e+00 0.0  0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0      0
VecSet            215492 1.0 8.2489e+00 1.3  0.00e+00 0.0  0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0      0
VecAXPY           210060 1.0 1.3823e+01 1.3  8.58e+08 1.3  0.0e+00 0.0e+00 0.0e+00  1 10  0  0  0   1 10  0  0  0 105277
VecAYPX            37606 1.0 2.2572e+00 1.9  1.23e+09 1.8  0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0 106320
VecAssemblyBegin    5600 1.0 2.9991e+00 5.7  0.00e+00 0.0  0.0e+00 0.0e+00 1.7e+04  0  0  0  0  8   0  0  0  0  8      0
VecAssemblyEnd      5600 1.0 2.5605e-02 1.2  0.00e+00 0.0  0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0      0
VecPointwiseMult   25434 1.0 2.5509e+00 1.0  2.04e+08 1.0  0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  32065
VecScatterBegin   242694 1.0 7.4044e+00 2.3  0.00e+00 0.0  9.7e+07 1.3e+03 0.0e+00  0  0 100 100 0   0  0 100 100 0     0
VecScatterEnd     242694 1.0 2.5525e+01 3.8  0.00e+00 0.0  0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0      0
MatMult            46018 1.0 4.0437e+01 1.1  3.09e+08 1.0  1.8e+07 1.3e+03 0.0e+00  3 13 19 19  0   3 13 19 19  0  47128
MatMultAdd          3604 1.0 3.5068e+00 1.0  2.93e+08 1.0  1.4e+06 1.3e+03 0.0e+00  0  1  1  1  0   0  1  1  1  0  46140
MatCopy             4804 1.0 5.4352e+00 1.2  0.00e+00 0.0  0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0      0
MatScale            4804 1.0 2.4118e+00 1.0  2.84e+08 1.0  0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  44714
MatAssemblyBegin    3604 1.0 6.9635e+00 23.5 0.00e+00 0.0  2.9e+05 6.7e+00 7.2e+03  0  0  0  0  3   0  0  0  0  3      0
MatAssemblyEnd      3604 1.0 1.7671e+00 1.0  0.00e+00 0.0  0.0e+00 0.0e+00 3.6e+03  0  0  0  0  2   0  0  0  0  2      0
MatGetRow         402824 1.0 1.1429e+00 -1.3 0.00e+00 0.0  0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0      0
KSPSetup              32 1.0 3.5076e-03 1.0  0.00e+00 0.0  0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0      0
KSPSolve            4804 1.0 6.2979e+02 1.0  3.48e+07 1.0  1.7e+07 1.3e+03 1.4e+05 52 23 17 17 63  52 23 17 17 63   5505
PCSetUp               32 1.0 1.0044e+02 1.0  0.00e+00 0.0  0.0e+00 0.0e+00 2.0e+01  8  0  0  0  0   8  0  0  0  0      0
PCApply            47214 1.0 4.6671e+02 1.0  7.61e+05 1.0  0.0e+00 0.0e+00 1.2e+01 39  0  0  0  0  39  0  0  0  0    119
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type                 Creations   Destructions   Memory   Descendants' Mem.

--- Event Stage 0: Main Stage


--- Event Stage 1: Starting

                  Index Set        66            64       72800          0
                        Map       856           129       40248          0
                        Vec       767            29     4685008          0
                Vec Scatter        57             0           0          0
 IS Local to global mapping         3             1      403924          0
          Application Order         1             0           0          0
                     Matrix       212            50           0          0
              Krylov Solver        56             0           0          0
             Preconditioner        56             0           0          0

--- Event Stage 2: TimeStepping

                        Map        32            20        6240          0
                        Vec       128            20     3231040          0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 7.5388e-05
Average time for zero size MPI_Send(): 7.60704e-06
Compiled without FORTRAN kernels
Compiled with double precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
Configure run at: Fri Oct 7 16:29:43 2005
Configure options: --with-memcmp-ok --sizeof_void_p=8 --sizeof_short=2 --sizeof_int=4 --sizeof_long=8 --sizeof_long_long=8 --sizeof_float=4 --sizeof_double=8 --bits_per_byte=8 --sizeof_MPI_Comm=4 --sizeof_MPI_Fint=4 --with-batch=1 --with-shared=0 --with-cc="cc --target=catamount" --with-cxx="CC --target=catamount" --with-fc="ftn --target=catamount" --with-blas-lib=acml --with-lapack-lib=acml --with-debugging=0 --COPTFLAGS=" -fastsse -O3 -Munroll=c:4 -tp k8-64" --CXXOPTFLAGS=" -fastsse -O3 -Munroll=c:4 -tp k8-64" --FOPTFLAGS=" -fastsse -O3 -Munroll=c:4 -tp k8-64" -PETSC_ARCH=cray-xt3_fast
-----------------------------------------
Libraries compiled on Thu Aug 24 15:02:33 EDT 2006 on jaguar8
Machine characteristics: Linux jaguar8 2.6.5-7.252-ss #6 Mon Jul 31 18:05:34 PDT 2006 x86_64 x86_64 x86_64 GNU/Linux
Using PETSc directory: /apps/PETSC/petsc-2.3.0
Using PETSc arch: cray-xt3_fast
-----------------------------------------
Using C compiler: cc --target=catamount -fastsse -O3 -Munroll=c:4 -tp k8-64
Using Fortran compiler: ftn --target=catamount -fastsse -O3 -Munroll=c:4 -tp k8-64
-----------------------------------------
Using include paths: -I/apps/PETSC/petsc-2.3.0 -I/apps/PETSC/petsc-2.3.0/bmake/cray-xt3_fast -I/apps/PETSC/petsc-2.3.0/include
------------------------------------------
Using C linker: cc --target=catamount -O2
Using Fortran linker: ftn --target=catamount -O2
Using libraries: -Wl,-rpath,/apps/PETSC/petsc-2.3.0/lib/cray-xt3_fast -L/apps/PETSC/petsc-2.3.0/lib/cray-xt3_fast -lpetscts -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc -Wl,-rpath,/spin/apps/HYPRE/hypre-1.10.0b/cray-xt3/lib -L/spin/apps/HYPRE/hypre-1.10.0b/cray-xt3/lib -lHYPRE_DistributedMatrix -lHYPRE_DistributedMatrixPilutSolver -lHYPRE_Euclid -lHYPRE_IJ_mv -lHYPRE_LSI -lHYPRE_MatrixMatrix -lHYPRE_ParaSails -lHYPRE_krylov -lHYPRE_parcsr_ls -lHYPRE_parcsr_mv -lHYPRE_seq_mv -lHYPRE_sstruct_ls -lHYPRE_sstruct_mv -lHYPRE_struct_ls -lHYPRE_struct_mv -lrt -lacml -lacml -L/opt/acml/3.0/pgi64/lib/cray/cnos64 -L/opt/xt-mpt/default/mpich2-64/P2/lib -L/opt/acml/3.0/pgi64/lib -L/opt/xt-libsci/default/pgi/cnos64/lib -L/opt/xt-mpt/default/sma/lib -L/opt/xt-lustre-ss/default/lib64 -L/opt/xt-catamount/default/lib/cnos64 -L/opt/xt-pe/default/lib/cnos64 -L/opt/xt-libc/default/amd64/lib -L/opt/xt-os/default/lib/cnos64 -L/opt/xt-service/default/lib/cnos64 -L/opt/pgi/default/linux86-64/default/lib -L/opt/gcc/3.2.3/lib/gcc-lib/x86_64-suse-linux/3.2.3/ -llapacktimers -lsci -lmpichf90 -lmpich -lacml -llustre -lpgf90 -lpgf90_rpm1 -lpgf902 -lpgf90rtl -lpgftnrtl -lpgc -lm -lcatamount -lsysio -lportals -lC -lcrtend
------------------------------------------
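The per-event "Total Mflop/s" column in the log above can be cross-checked against the legend's definition (flops summed over all processors, divided by the max time over all processors, scaled to megaflops). Reconstructing the flop sum from the rounded global %F column gives only an approximate match; a small sketch using the KSPSolve row, with values copied from the log (agreement is limited by %F being rounded to whole percent):

```python
# KSPSolve row, "TimeStepping" stage, from the event table above.
total_flops   = 1.495e13   # total flops over all 160 processors (summary header)
kspsolve_time = 6.2979e2   # max KSPSolve time over all ranks, in seconds
kspsolve_pctF = 23         # global %F column for KSPSolve (rounded)

# Legend formula: Mflop/s = 1e-6 * (sum of flops) / (max time).
est_flops  = total_flops * kspsolve_pctF / 100.0
est_mflops = 1e-6 * est_flops / kspsolve_time
print(round(est_mflops))   # ~5460, within ~1% of the 5505 reported in Mflop/s
```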