- Symptom:
When building a PETSc application (or example) using BOPT=O_complex or any optimized build
under Linux with Version 8.0 of the Intel C/C++ compiler
before patch 066, typically using PETSC_ARCH=linux_intel, the following message may be encountered:
/home/petsc/petsc-2.1.6/lib/libO_complex/linux_intel/libpetscfortran.a(zksp.o):
In function `kspview_': zksp.o(.text+0x156):
undefined reference to `PETSC_VIEWER_SOCKET_(int)'
In make_log_linux_intel_O_complex, there is another clue as to the problem:
libfast in: /home/petsc/petsc-2.1.6/src/sys/src/viewer/impls/socket
send.c(100): error: Unsupported feature: Gnu statement expressions in C++
sa.sin_port = htons((u_short) portnum);
^
send.c(100): error: expected a ")"
sa.sin_port = htons((u_short) portnum);
^
send.c(100): error: expected a ";"
sa.sin_port = htons((u_short) portnum);
^
compilation aborted for send.c (code 2)
ar: send.o: No such file or directory
Problem: The Intel compiler (Version 8.0, before patch 066) attempts to include
GNU-style inline assembly instructions from a Linux header file, which it does not interpret properly.
Cure: There are two solutions. The preferred solution is
to obtain and install patch 066 or later for Version 8.0 of the Intel C/C++ compiler. If this is not practical,
remove references to -use_masm from the $PETSC_DIR/bmake/$PETSC_ARCH/variables file (typically PETSC_ARCH=linux_intel),
and in petscconf.h in the same directory, replace the line #define PETSC_HAVE_SSE "iclsse.h" with #define PETSC_HAVE_SSE "gccsse.h"
- Symptom:
alder> make BOPT=g ex21
cc -Dsun4 -g -o ex21 ex21.o affine3d.o
../../libs/libsg/sun4/domain.a ../../libs/libsg/sun4/Xtools.a
../../libs/libsg/sun4/tools.a ../../libs/libsg/sun4/liblapack.a
../../libs/libsg/sun4/blas.a ../../libs/libsg/sun4/system.a -lX11
-lm
ld: Undefined symbol _s_cmp
_e_wsfe _do_f_out _s_copy _s_wsFe _s_stop
Compilation failed *** Error code 2
Problem: You are attempting to link a program that uses Fortran libraries
that the linker cannot find.
Cure: Try setting FC_LIB in bmake/$PETSC_ARCH/packages to the appropriate
library. Several examples of missing routines are listed in the entries below;
try searching this file for your missing routine names. If you
do not find them, use the UNIX command "nm -o" (nm -Bo on IRIX) to search
libraries for particular routines, as in the following example:
eagle>cd /usr/lang/SC1.0.1/
eagle>nm -o libF77.a libF77.so* | grep do_f_out
libF77.a:dfe.o: U _do_f_out
libF77.a:ioinit.o: U _do_f_out
libF77.a:iio.o: U _do_f_out
libF77.a:sfe.o: U _do_f_out
libF77.a:dofio.o:00000000 T _do_f_out
libF77.so.1.4.1:0003ed90 T _do_f_out
eagle>
From the line T _do_f_out we see that the routine is defined in the
library libF77.a (and its shared library counterpart
libF77.so.1.4.1).
If you do not know which library defines the
routine, you can try something like the following:
eagle>foreach i (*.a *.so*)
foreach? echo $i
foreach? nm -o $i | grep do_f_out
foreach? end
libF77.a
libF77.a:dfe.o: U _do_f_out
libF77.a:ioinit.o: U _do_f_out
libF77.a:iio.o: U _do_f_out
libF77.a:sfe.o: U _do_f_out
libF77.a:dofio.o:00000000 T _do_f_out
libF77_p.a
libF77_p.a:dfe.o: U _do_f_out
libF77_p.a:ioinit.o: U _do_f_out
libF77_p.a:iio.o: U _do_f_out
libF77_p.a:sfe.o: U _do_f_out
libF77_p.a:dofio.o:00000000 T _do_f_out
libFxview.a
libV77.a
libV77_p.a
libm.a
libm_p.a
libpfc.a
libpfc_p.a
libF77.so.1.4.1
libF77.so.1.4.1:0003ed90 T _do_f_out
libV77.so.1.1
libpfc.so.1.1
eagle>
Here we see that the routine is defined in libF77.a, libF77_p.a, and
libF77.so.1.4.1, but is neither used nor defined in any
of the other libraries. On Sun systems the Fortran libraries
are usually hidden in directories like /usr/lang/SC1.0.1 or
/usr/lang/SC3.0; also check /usr/lib.
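For users of a Bourne-style shell, the csh loop above might be sketched as follows. This is only an illustration: the function name find_symbol_definition is hypothetical, it uses the POSIX "nm -A" in place of "nm -o", and it reports only lines marked T (defined), not U (used).

```shell
#!/bin/sh
# Hypothetical helper: list every library in a directory that DEFINES a symbol.
# Usage: find_symbol_definition <directory> <symbol>
find_symbol_definition() {
    dir=$1
    sym=$2
    for lib in "$dir"/*.a "$dir"/*.so*; do
        # Skip glob patterns that matched no file.
        [ -f "$lib" ] || continue
        # nm -A prefixes each output line with the library (and member) name;
        # " T " marks a symbol defined in the text section. Leading
        # underscores added by the compiler are allowed for by "_*".
        nm -A "$lib" 2>/dev/null | grep " T _*${sym}\$" || true
    done
    return 0
}
```

For the example in this entry one would run find_symbol_definition /usr/lang/SC1.0.1 do_f_out. On some systems, stripped shared libraries list their symbols only with an additional nm option, so an empty result for a .so file is not conclusive.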
- Symptom: Corrupt argument:
Problem:
An argument to a function is invalid. In Fortran this may be caused by forgetting to list an argument in the call,
especially the final ierr. Otherwise it is usually caused by memory corruption; that is, somewhere the code is
writing outside array bounds.
Cure:
To track this down, rerun the BOPT=g (or g_c++) version of the code with the option
-trdebug. Occasionally the code may crash only with the BOPT=O (or O_c++) version; in that case run the optimized
version with -trdebug. If you determine that the problem is memory corruption, you can put the macro MEMCHKQ in the
code near the crash to determine exactly which line is causing the problem.
- Symptom: Detected zero pivot in LU factorization:
Problem:
A zero pivot in LU, ILU, Cholesky, or ICC sparse factorization does not always mean that the matrix is singular.
This error can also happen if your matrix is singular; see KSPSetNullSpace() for how to handle this.
If this error occurs in the zeroth row of the matrix, it is likely that you have an error in the code
that generates the matrix.
Cure:
You can use -pc_[type]_shift or -[level]_pc_[type]_shift to prevent the zero pivot. Here [type] is lu, ilu, cholesky, or icc, and
[level] is sub for a block in the bjacobi or ASM preconditioner, or mglevels or mgcoarse
for the multigrid smoothers or the coarse grid solver. See PCLUSetShift(), PCILUSetShift(), PCCholeskySetShift(), or
PCICCSetShift().
- Symptom: alder>make BOPT=g
make: Warning: Can't find `../../bmake/.g':
/Users/barrysmith/petsc-dev/bmake/common/variables:176:
/Users/barrysmith/petsc-dev/bmake//variables: No such file or directory
Problem: You have not set the variable PETSC_ARCH to the architecture of your
machine (e.g., sun4, rs6000).
Cure: Add code to your .cshrc file that
sets it automatically, or include PETSC_ARCH on the
command line every time you use make. For instance, make BOPT=g
PETSC_ARCH=sun4 example4
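If your login shell is a Bourne-style shell rather than csh, a startup-file fragment along the following lines may serve the same purpose. It is a sketch only: the value sun4 is just the example architecture from this entry, and the choice of startup file (for instance ~/.profile) is an assumption. In csh the corresponding line in .cshrc would be setenv PETSC_ARCH sun4.

```shell
#!/bin/sh
# Sketch for a Bourne-style startup file such as ~/.profile.
# "sun4" is only an example value; substitute your machine's architecture.
: "${PETSC_ARCH:=sun4}"   # assign the default only if PETSC_ARCH is unset or empty
export PETSC_ARCH

echo "PETSC_ARCH is set to $PETSC_ARCH"
```

Because the default is applied only when the variable is empty, a value given explicitly on the command line or set earlier in the session is left untouched.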
- Symptom: alder>make BOPT=g ex1
make: Warning: Can't find `/home/joe/bmake/sun4.g':
No such file or directory
make: Fatal error in reader: makefile, line 33: Read of include file `/home/joe/bmake/sun4.g' failed
Problem: The variable PETSC_DIR in the makefile does not point to the PETSc
directory; in this case it points to the directory /home/joe.
Cure: Make sure the variable PETSC_DIR in the makefile points to the PETSc
directory. Be aware that at many sites your home directory may
have different names on different machines, so it is usually better to
make the path relative rather than absolute. That is, use
PETSC_DIR = ../../petsc rather than PETSC_DIR =
/c/cafa/username/petsc.
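A quick sanity check of a candidate PETSC_DIR might look like the sketch below. The function name check_petsc_dir is hypothetical, and the test it applies (the presence of a bmake subdirectory, which the error message in this entry shows make looking for) is an assumption about what a valid PETSc directory contains.

```shell
#!/bin/sh
# Hypothetical check: does a candidate PETSC_DIR contain a bmake directory?
check_petsc_dir() {
    if [ -d "$1/bmake" ]; then
        echo "ok: $1 contains bmake"
    else
        echo "error: $1/bmake not found; PETSC_DIR is probably wrong" >&2
        return 1
    fi
}
```

Running check_petsc_dir ../../petsc from the directory that holds the makefile tests the relative path exactly as make would resolve it.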
- Symptom: alder> make BOPT=g ex1
f77 -g -o ex1 ex1.o affine2d.o \
../../libs/libsg/sun4/domain.a
../../libs/libsg/sun4/Xtools.a ../../libs/libsg/sun4/tools.a
../../libs/libsg/sun4/liblapack.a ../../libs/libsg/sun4/blas.a
../../libs/libsg/sun4/system.a -lX11 -lm
ld: ex1.o: bad magic number
Compilation failed *** Error code 4
Problem: The file ex1.o was compiled on a different architecture or with a
different compiler.
Cure: Remove all .o files and recompile from scratch.
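One way to remove the stale object files is sketched below. The function name remove_object_files is hypothetical; it deletes every *.o file under the given directory, including subdirectories, so point it only at a tree whose object files you intend to rebuild.

```shell
#!/bin/sh
# Hypothetical helper: delete every object file below a directory
# (default: the current directory), printing each path that is removed.
remove_object_files() {
    find "${1:-.}" -type f -name '*.o' -print -exec rm -f {} +
}
```

After running remove_object_files in the example directory, rerun make so that all objects are rebuilt with the current compiler and architecture.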
- Symptom: alder>make BOPT = g ex1
make: Fatal error: Don't know how to make target `BOPT'
Problem: Spaces were placed around the "=" sign. When
using command line options with make, do not place spaces on either
side of the "=" sign.
Cure: Use the following command (with no extra spaces): alder>make BOPT=g
ex1
- Symptom: alder> make BOPT=g ex21
cc -Dsun4 -g -o ex21 ex21.o affine3d.o ../../libs/libsg/sun4/domain.a
../../libs/libsg/sun4/Xtools.a ../../libs/libsg/sun4/tools.a
../../libs/libsg/sun4/liblapack.a ../../libs/libsg/sun4/blas.a
../../libs/libsg/sun4/system.a -lX11 /usr/lang/SC1.0.1/libF77.a -lm
ld: Undefined symbol ___class_quadruple
Compilation failed *** Error code 2
Problem: You are attempting to link a program that uses Fortran libraries
that the linker cannot find.
Cure: See the entry above on the undefined symbols _s_cmp, _e_wsfe, _do_f_out, etc.,
which explains how to set FC_LIB.
- Symptom:
cannot find include file:
Problem: The standard X11 files are not in the usual place, /usr/include.
Cure: Make sure the file ${PETSC_DIR}/bmake/${PETSC_ARCH}/packages has the
correct location of the X11 include files and libraries; for
instance, it may have X11_INCLUDE = -I/usr/openwin/include and X11_LIB
= /usr/openwin/lib/libX11.a
- Symptom: f77 -g -o ex1 ex1.o affine2d.o \
../../libs/libsg/sun4/domain.a ../../libs/libsg/sun4/Xtools.a
../../libs/libsg/sun4/tools.a ../../libs/libsg/sun4/liblapack.a
../../libs/libsg/sun4/blas.a ../../libs/libsg/sun4/system.a -lm
ld: Undefined symbol _XCreateColormap _XGetWMName _XSetWMName
_XAllocColor _XGetImage _XSetStandardProperties _XQueryFont
_XGetGeometry ....
Problem: The standard X libraries are not being found.
Cure: Make sure the file ${PETSC_DIR}/bmake/${PETSC_ARCH}/packages has the
correct location of the X11 include files and libraries; for
instance, it may have X11_INCLUDE = -I/usr/openwin/include and X11_LIB
= /usr/openwin/lib/libX11.a
- Symptom: ld: Undefined symbol _dpotrs_
Problem: This
is a routine within LAPACK. Either you are not linking the LAPACK
library, or your LAPACK library is incomplete.
Cure: Install
the entire LAPACK library, available via netlib. See the file
${PETSC_DIR}/docs/website/documentation/installation.html for information on
retrieving LAPACK.
- Symptom: ld: Undefined symbol ___s_stop ___ansi_fflush
Problem:
These are Fortran system calls.
Cure: See the entry above on the undefined symbols _s_cmp, _e_wsfe, _do_f_out, etc.
On Sun Sparcstations you might try including the libraries
/usr/lang/SC3.0.1/lib/libansi.a or
/usr/lang/SC2.0.1/lib/libansi.a, etc., depending on the compiler version
you are using. Include them in the variable FC_LIB in the file
${PETSC_DIR}/bmake/${PETSC_ARCH}/packages
- Symptom: On
the IBM RS/6000
0706-317 ERROR: Unresolved or undefined symbols detected:
Symbols in error (followed by references) are dumped to the load map.
The -bloadmap: option will create a load map.
.__divss .__mulh
Problem:
These are Fortran system calls, which are linked by the Fortran
linker but not the C linker.
Cure: In petsc/bmake/rs6000/rs6000 make sure the line that defines
CLINKER includes -bI:/usr/lpp/xlf/lib/lowsys.exp ... Also, see
the entry above on the undefined symbols _s_cmp, _e_wsfe, _do_f_out, etc.
- Symptom: On
the Sun running Solaris
Undefined                       first referenced
 symbol                             in file
__pow_di                            bsmith/lapack/solaris/liblapack.a(dlamch.o)
Problem:
These are Fortran library calls, which are linked by the Fortran linker
but not the C linker.
Cure: In the file ${PETSC_DIR}/bmake/solaris/solaris make sure the line
that defines FC_LIB contains /opt/SUNWspro/SC3.0/lib/libM77.a
... Also, see the entry above on the undefined symbols _s_cmp, _e_wsfe, _do_f_out, etc.
- Symptom: On
the IBM RS6000
0706-317 ERROR: Unresolved or undefined symbols detected: Symbols in
error (followed by references) are dumped to the load map. The
-bloadmap: option will create a load map. .errsav .errset
.errstr .einfo .dgef .dgesm .dpof .dposm
Problem:
These are IBM ESSL routines that we assume are called by the IBM
implementation of BLAS or LAPACK.
Cure: In
bmake/rs6000/packages on the line that defines BLAS_LIB add at
the end -lessl ... This will cause PETSc to always search ESSL
for these routines.
- Symptom: [merlin]
make BOPT=O ex1 gcc -DPARCH_sun4 -pipe -c -DHAVE_STROPTS_H
-DHAVE_SEARCH_H -DHAVE_PWD_H -DHAVE_STRING_H -DHAVE_MALLOC_H
-DHAVE_X11 -DHAVE_BLOCKSOLVE -I../../../ -I../../..//include -Dmpi
-I/usr/local/mpi/include -I../../..//src
-I/home/curfman/block_solve_mpi/include -O -Wall -Wshadow
-fomit-frame-pointer -DINLINE_FOR -DPETSC_DEBUG -Dlint -DPETSC_BOPT_O
-DPETSC_LOG ex1.c
Libraries not built in ../../..//lib/libO/sun4
Problem: The
PETSc library for BOPT=O has not yet been built on the sun4.
Cure: In
the PETSc home directory, type make BOPT=O all >&
make_log to build the optimized version of the PETSc library.
Then recompile the example program.
- Symptom: on
DEC alpha or Paragon
Make: Cannot open ./bmake/alpha/./bmake/common. Stop. or
Make: Cannot open ../../../bmake/paragon/../../../bmake/common. Stop.
etc.
Problem: The OSF designers decided to change make for no apparent reason.
The make on these machines tries to include additional makefiles
relative to the path of the last makefile included rather than relative
to the path of the original makefile, as make does on all other
machines.
Cure: Try
either:
- Always run make with -e
PETSC_DIR=the_complete_path_of_the_petscdir This will override
the relative pathname of PETSC_DIR in the makefiles. For example, make
-e BOPT=g PETSC_DIR=/home/bsmith/petsc all
- Use gnumake instead of the default make, and change the
line in the file ${PETSC_DIR}/bmake/${PETSC_ARCH}/base that
defines OMAKE to gnumake.
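The first option needs the complete path of the PETSc directory. A small sketch for obtaining it is shown below; the function name absolute_dir is hypothetical, and the example assumes it is run from inside the PETSc directory, so "." stands in for wherever PETSc actually lives. It only prints the make command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: print the absolute path of a directory.
# The subshell keeps the caller's working directory unchanged.
absolute_dir() {
    (cd "$1" && pwd)
}

# Assumption: the current directory is the PETSc directory.
petsc_dir=$(absolute_dir .)
echo "make -e BOPT=g PETSC_DIR=$petsc_dir all"
```

Printing the command first makes it easy to confirm that PETSC_DIR is an absolute path before starting a long build.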
- Symptom: on
DEC alpha or Paragon
/usr/lib/cmplrs/cc/cfe: Error: mal.c, line 16: 'free' undefined,
reoccurrences will not be reported PetscErrorCode (*PetscFree)(void
*,int,char*) = (PetscErrorCode (*)(void*,int,char*))free;
Problem: The
include files on your machine are out of sync with the ones we
used for developing PETSc.
Cure: Add
and remove entries from petsc/pinclude/petscfix.h to avoid
conflicts with prototypes in the system include files and to
define any functions that are missing in the include files.
- Symptom: on
DEC alpha
adebug.c:
/usr/lib/cmplrs/cc/cfe: Error:
/users/madams/petsc/pinclude/petscfix.h, line 172:
redeclaration of 'vfprintf'; previous declaration at line 189 in file
'/usr/include/stdio.h' extern int vfprintf(FILE*,char*,...);
/usr/lib/cmplrs/cc/cfe: Warning: file.c, line 427: illegal
combination of pointer and integer if (istmp) fname = mktemp(
fname );
/usr/lib/cmplrs/cc/cfe:
Error: mal.c, line 15: 'malloc' undefined, reoccurrences will not be
reported (void*(*)(unsigned int,int,char*))malloc;
Cure: See the previous entry ('free' undefined in mal.c), which describes editing petscfix.h.
- Symptom: on
some IBM RS6000
fp.c: "/usr/include/fpxcp.h", line 84.33: 1506-310 (W) The type "struct
sigcontext" was introduced in a parameter list, and will go out of
scope at the end of the function declaration or definition.
"/usr/include/fpxcp.h", line 85.32: 1506-310 (W) The type
"struct sigcontext" was introduced in a parameter list, and
will go out of scope at the end of the function declaration or
definition. ....
Problem: The
include files are not correctly defining the needed struct
sigcontext.
Cure: Edit
petsc/src/sys/src/fp.c and locate the line #include and make sure there is a line struct sigcontext; above it.
- Symptom: on
linux When compiling and linking Fortran code we got the error
message
make [filename.o] Error 4 (ignored)
Problem:
Unknown
Cure: The
compile seems ok, so this message can be safely ignored.
- Symptom: using
MPICH [0] Truncated message (in CHK_MSGLEN)
[0] Aborting program!
p0_8959: p4_error: (null): 1
Problem: This is due to an error in a call to an MPI routine.
Cure: Run
the program with the option -start_in_debugger. In the
debugger, type "break p4_error" (or "stop in p4_error" for
dbx); then type "cont". When the program aborts, use debugger
commands such as "where" to track down the problem with the call.
- Symptom: on
HP-UX
Make: Unknown flag argument -. Stop.
Make: Unknown flag argument -. Stop.
Make: Unknown flag argument -. Stop.
Problem: We have seen this on HP-UX when using the native
(vendor provided) make.
Cure: Install and use GNU make. To force PETSc to use an alternative make,
edit the file petsc/bmake/$PETSC_ARCH/base and change OMAKE to
your alternative.
- Symptom: on
IBM SP
Could not load program
/afs/rpi.edu/big/00/0000/hongwl/petsc/petsc/src/ksp/examples/ex1
Symbol XSetWMProperties in pmd2 is undefined Symbol XSetWMName in pmd2
is undefined
Error was: Exec format error
Problem: The
libraries on the IBM SP front-end for X may be different than on the
nodes.
Cure: Get
your system administrator to make sure the dynamic libraries on
the nodes are IDENTICAL to those on the compiler server.
- Symptom: using
Fortran While using VecGetArray(), MatGetArray(), ISGetArray()
/usr/local/mpi/bin/mpirun.ch_p4: 17545 Breakpoint then program stops
Problem: You have compiled some of your code with the option that checks for
out-of-bounds array accesses (on the IBM rs6000 this is the -C option).
Cure: Recompile all code, making sure it does not check for out-of-bounds array accesses.
The use of VecGetArray(), etc. requires accessing arrays out of
bounds; this is done safely.
- Symptom: You
create Draw windows or ViewerDraw windows or use options
-ksp_xmonitor or -snes_xmonitor and the program seems to run OK
but windows never open.
Problem: The
libraries were compiled without support for X windows.
Cure: Make
sure that the file petsc/bmake/$PETSC_ARCH/base contains the
-DHAVE_X11 in the definition of CONF. Also, make sure that X11
is installed on your machine. Then recompile the PETSc libraries.
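Whether a configuration file mentions a given flag can be checked with a one-line grep, as in the sketch below. The function name file_has_flag is hypothetical, and the check only looks for the text of the flag anywhere in the file; it does not verify that the flag appears in the CONF definition itself.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a configuration file mentions a flag.
# -F treats the flag as a fixed string; -e keeps the leading "-" of the
# flag from being parsed as a grep option.
file_has_flag() {
    grep -q -F -e "$2" "$1"
}
```

For this entry one might run file_has_flag petsc/bmake/$PETSC_ARCH/base -DHAVE_X11 and inspect the exit status; a nonzero status suggests the flag is missing.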
- Symptom:
Problem: PETSc
cannot work on a machine where the length of C integers does not equal
the length of Fortran integers.
Cure: Change
your compilers so that you use ones that have the same length
for integers. Or check compiler flags to see if you can change
the default integer lengths to match.
- Symptom: [merlin]
make BOPT=g ex19 f77 -c -I/tmp/petsc -I/tmp/petsc/include
-I/usr/local/mpi/include -I/tmp/petsc/src -DHAVE_BLOCKSOLVE
-DHAVE_MPE -DHAVE_STROPTS_H -DHAVE_SEARCH_H -DHAVE_PWD_H
-DHAVE_STRING_H -DHAVE_MALLOC_H -DHAVE_X11
-DHAVE_FORTRAN_UNDERSCORE -DHAVE_DRAND48 -g -Wall -DPETSC_DEBUG
-DPETSC_LOG -DPETSC_BOPT_g -Dlint -g -dalign ex19.F /tmp/cpp.22009.0.f:
MAIN: f77 -g -dalign -o ex19 ex19.o
-L/tmp/petsc/lib/libg/sun4_local -lpetscfortran
-L/tmp/petsc/lib/libg/sun4_local -lpetscsles -lpetscksp -lpetscmat
-lpetscvec -lpetscdraw -lpetscsys /tmp/otherlibs/libBS95.a
/tmp/otherlibs/lapack_double.a /tmp/otherlibs/lapack_complex.a
/tmp/otherlibs/blas_double.a /tmp/otherlibs/blas_complex.a
/tmp/otherlibs/libX11.a /tmp/otherlibs/libmpe.a /tmp/otherlibs/libmpi.a
/tmp/otherlibs/libF77.a -lm /tmp/otherlibs/libfm.a
/usr/lib/debug/malloc.o /usr/lib/debug/mallocmap.o -lm
ld: -lpetscfortran: No such file or directory
Compilation failed *** Error code 4 (ignored) rm -f -f ex19.o
Problem: The
PETSc Fortran interface library does not exist.
Cure: The
PETSc Fortran interface library must be compiled from the base
PETSc directory using the command make BOPT=g fortran (or make
BOPT=O fortran for the optimized version). See the Fortran section
within the file ${PETSC_DIR}/docs/installation.html for details.
- Symptom: under
Linux
You use the -start_in_debugger option and it seems to start up gdb ok,
but "where" gives you funny stuff like
(gdb) where
#0 0x50067114 in globmemsize ()
#1 0x5008d404 in globmemsize ()
Problem: GDB
is confused about where it is. Everything is fine.
Cure: Set
some break points and continue.
- Symptom: with
recent versions of G++ compiler
libfast in: /tmp_mnt/home/schwarz/cai/petsc-2.0.14/src/is/interface
In file included from ../../../include/petsc.h:112, from
../../../include/is.h:9, from ../isimpl.h:13, from index.c:7:
../../../include/options.h:12: type specifier omitted for
parameter ../../../include/options.h:12: parse error before `*'
Problem: Recent versions of GNU g++ completely changed the way complex numbers are handled; they now use a
templated complex class.
Cure: In petsc/bmake/$PETSC_ARCH/variables add -DUSES_TEMPLATED_COMPLEX to the
lines defining GCOMP_PETSCFLAGS and
OCOMP_PETSCFLAGS.
- Symptom: In
file included from
/home/petsc/BlockSolve95/include/BSsparse.h:25, from
../../../../../src/mat/impls/rowbs/mpi/mpirowbs.h:12, from
mpirowbs.c:6: /home/petsc/BlockSolve95/include/BSdepend.h:38:
warning: declaration of `int exit(int)'
/home/petsc/BlockSolve95/include/BSdepend.h:38: warning: conflicts with
built-in declaration `void exit(int)'
Problem: A
BlockSolve95 include file has a prototype it shouldn't have.
Cure: Edit
the file BlockSolve95/include/BSdepend.h and remove the line(s)
extern int exit(int);
- Symptom: on
DEC
alpha g++ -g -o ex1f ex1f.o
-L/home/curfman/petsc/lib/libg_complex/alpha -lpetscfortran
-L/home/curfman/petsc/lib/libg_complex/alpha -lpetscsles -lpetscksp
-lpetscmat -lpetscvec -lpetscsys
/home/petsc/BlockSolve95/lib/libg_complex/alpha/libBS95.a -ldxml -lX11
/usr/local/mpi/lib/alpha/ch_p4/libmpi.a /usr/lib/libutil.a
/usr/lib/libFutil.a /usr/lib/libots.a -lm
collect2: ld returned 1 exit status /usr/ucb/ld: Unresolved: main
for_stop for_write_seq_lis for_set_reentrancy iargc_ getarg_
Problem: It
cannot find certain Fortran library routines.
Cure: For PETSc 2.0.22 and earlier, add -lfor to bmake/alpha/packages
after /usr/lib/libFutil.a
With later versions, please send email to petsc-maint@mcs.anl.gov
- Symptom: On
Sun4 running SunOS 4.1.3 with G++ version 2.7.2, compiling with
complex
eagle> make BOPT=g_complex ex17 g++ -DPETSC_COMPLEX -DPARCH_sun4 -c
-I../../../.. -I../../../../include -I/usr/local/mpi/include
-DHAVE_BLOCKSOLVE -DHAVE_MPE -DPETSC_DEBUG -DPETSC_LOG
-DPETSC_BOPT_g -DPETSC_COMPLEX -DUSES_TEMPLATED_COMPLEX
-D__DIR__='"src/sles/examples/tests/"'
-I/home/petsc/BlockSolve95/include -g ex17.c g++ -g -o ex17
ex17.o -L../../../../lib/libg_complex/sun4_local -lpetscsnes
-lpetscsles -lpetscksp -lpetscmat -lpetscvec -lpetscsys
/tmp/otherlibs/libX11.a /tmp/otherlibs/libBS95.a
/tmp/otherlibs/lapack_complex.a
/home/bsmith/lapack/lapack_sun4_g_double.a
/tmp/otherlibs/blas_complex.a /tmp/otherlibs/blas_double.a
/tmp/otherlibs/libmpe.a /tmp/otherlibs/libmpi.a
/usr/lib/debug/malloc.o /usr/lib/debug/mallocmap.o
collect2: ld returned 2 exit status
ld: Undefined symbol _s_wsFe __Fz_eq _s_stop _s_cmp _do_f_out __Fz_ne
_s_cat _e_wsfe _s_copy ***
Error code 1 (ignored) rm -f ex17.o
Cure: In sun4/packages, list the following libraries for the variable FC_LIB:
/usr/lang/SC1.0.1/libF77.a /usr/lang/SC1.0.1/libm.a See also
the next two troubleshooting problems.
- Symptom: On
Sun4 running SunOS 4.1.3 with G++ version 2.7.2, compiling with
complex
eagle>make BOPT=g_complex ex17 g++ -DPETSC_COMPLEX -DPARCH_sun4 -c
-I../../../.. -I../../../../include -I/usr/local/mpi/include
-DHAVE_BLOCKSOLVE -DHAVE_MPE -DPETSC_DEBUG -DPETSC_LOG
-DPETSC_BOPT_g -DPETSC_COMPLEX -DUSES_TEMPLATED_COMPLEX
-D__DIR__='"src/sles/examples/tests/"'
-I/home/petsc/BlockSolve95/include -g ex17.c g++ -g -o ex17
ex17.o -L../../../../lib/libg_complex/sun4_local -lpetscsnes
-lpetscsles -lpetscksp -lpetscmat -lpetscvec -lpetscsys
/tmp/otherlibs/libX11.a /tmp/otherlibs/libBS95.a
/tmp/otherlibs/lapack_complex.a
/home/bsmith/lapack/lapack_sun4_g_double.a
/tmp/otherlibs/blas_complex.a /tmp/otherlibs/blas_double.a
/tmp/otherlibs/libmpe.a /tmp/otherlibs/libmpi.a
/tmp/otherlibs/libF77.a /usr/lib/debug/malloc.o
/usr/lib/debug/mallocmap.o
collect2: ld returned 2 exit status
ld: Undefined symbol __Fz_eq __Fz_ne ***
Error code 1 (ignored) rm -f ex17.o
Cure: In packages, you must also include the library
/usr/lang/SC1.0.1/libm.a for FC_SITE. See also the next troubleshooting
problem.
- Symptom: On
Sun4 running SunOS 4.1.3 with G++ version 2.7.2, compiling with
complex C
eagle> make BOPT=g_complex ex17 g++ -DPETSC_COMPLEX -DPARCH_sun4 -c
-I../../../.. -I../../../../include -I/usr/local/mpi/include
-DHAVE_BLOCKSOLVE -DHAVE_MPE -DPETSC_DEBUG -DPETSC_LOG
-DPETSC_BOPT_g -DPETSC_COMPLEX -DUSES_TEMPLATED_COMPLEX
-D__DIR__='"src/sles/examples/tests/"'
-I/home/petsc/BlockSolve95/include -g ex17.c g++ -g -o ex17
ex17.o -L../../../../lib/libg_complex/sun4_local -lpetscsnes
-lpetscsles -lpetscksp -lpetscmat -lpetscvec -lpetscsys
/tmp/otherlibs/libX11.a /tmp/otherlibs/libBS95.a -lm
/tmp/otherlibs/lapack_complex.a
/home/bsmith/lapack/lapack_sun4_g_double.a
/tmp/otherlibs/blas_complex.a /tmp/otherlibs/blas_double.a
/tmp/otherlibs/libmpe.a /tmp/otherlibs/libmpi.a
/usr/lang/SC1.0.1/libF77.a /usr/lang/SC1.0.1/libm.a
/usr/lib/debug/malloc.o /usr/lib/debug/mallocmap.o
collect2: ld returned 2 exit status
ld: /lib/libm.a(trig.o): _fp_pi: multiply defined *** Error code 1
(ignored)
Problem: The variable fp_pi is defined in both /usr/lang/SC1.0.1/libm.a and
the usual -lm math library. The g++ linker has a bug that
makes it try to include it from both.
Cure: Make a copy of /usr/lang/SC1.0.1/libm.a, say
cp /usr/lang/SC1.0.1/libm.a ~/libfm.a
then delete the reference to the variable in that copy with
ar d ~/libfm.a __fp_pi.o
ranlib ~/libfm.a
Now in sun4/packages list ~/libfm.a instead of
/usr/lang/SC1.0.1/libm.a
- Symptom: On
DEC alpha
Unaligned access pid=15199 va=140021674 pc=12001e8d8
ra=12001e8c0 type=ldt
Problem: The
system has detected an unaligned variable. This is usually an
unaligned double.
Cure: Make sure in Fortran that you always write double precision numbers
as 10.d0 etc., not just as 10., because otherwise the value will be stored as a
single precision number and may not be properly aligned.
- Symptom: PetscScalarAddressToFortran:C
and Fortran arrays are not commonly aligned
or are too far apart to be indexed by an integer.
Locations: C 1920156 Fortran 2438656
[0] MPI Abort by user Aborting program !
[0] Aborting program!
Problem: This occurs when trying to access a PETSc array from Fortran. The array may
have been obtained with VecGetArray(), MatGetArray(), etc. On
the IRIX64 this is because the Fortran addresses are so far
away from the C addresses that you cannot move between them with an
integer offset (integers are just not big enough). On other machines
this is because the distance between the Fortran array starting
point and the C array starting point is not divisible by the
length of a double (or complex). Thus one cannot access the other with
an integer offset.
Cure: 1) Rewrite the Fortran code to not use the particular XXXGetXXX()
routine. For example, use VecSetValues() instead of directly
stuffing the values into the array. 2) Determine how to force the
Fortran and/or C compiler to commonly align doubles or complex
numbers. That is, if all doubles are double aligned then this
won't be a problem; if all complex numbers are quad aligned then it is not a
problem. If you determine how to do this for a particular machine,
please let us know so we can add it to PETSc.
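The divisibility condition described in this entry can be checked by hand with the two addresses printed on the "Locations:" line of the error message. The sketch below does the arithmetic in the shell; the 8-byte size of a double is an assumption about the machine, not something stated in the message.

```shell
#!/bin/sh
# Addresses copied from the "Locations:" line of the error message in this entry.
c_addr=1920156
fortran_addr=2438656
double_size=8   # assumption: a double occupies 8 bytes

distance=$((fortran_addr - c_addr))
remainder=$((distance % double_size))
echo "distance=$distance remainder=$remainder"
# prints "distance=518500 remainder=4"
```

The nonzero remainder shows that, under the 8-byte assumption, the two arrays are offset by half a double, which is why no integer index can map one onto the other.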
- Symptom: On
rs6000 machines, the program encounters a segmentation fault
when initializing MPI.
[light] mpirun ex1
/light_home2/lmcinnes/mpich/lib/rs6000/ch_p4/mpirun: 23817 Memory fault
See, e.g., the following debugger session:
[light] 525%gdb ex1
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the
conditions. There is absolutely no warranty for GDB; type "show
warranty" for details. GDB 4.13 (rs6000-ibm-aix3.2), Copyright
1994 Free Software Foundation, Inc...
(gdb) run -p4pg joe
Starting program:
/light_home2/lmcinnes/petsc-2.0.15/src/sles/examples/tutorials/ex1
-p4pg joe
Program received signal SIGSEGV, Segmentation fault. 0x10003750 in
getenv ()
(gdb) where
#0 0x10003750 in getenv ()
#1 0x10001438 in MPIR_Init (=0x2ff7f630, =0x2ff7f634)
#2 0x10001384 in MPI_Init (=0x2ff7f630, =0x2ff7f634)
#3 0x100004d8 in main (argc=3, args=0x2ff7f65c) at ex1.c:37
#4 0x10000430 in __start ()
Problem: libxlf.a contains the Fortran routine getenv(), which is
being used instead of the UNIX routine that we really need.
This seems to occur when using gcc/g++ instead of xlc.
Cure: Edit the file petsc/bmake/rs6000/bpackages and define FC_LIB as
follows, making sure to list "-lbsd -lc" BEFORE libxlf.a and
any other Fortran libraries. FC_LIB = -lbsd -lc
/usr/lib/libxlf.a
- Symptom: make
BOPT=g ex1 xlC -DPARCH_rs6000 -D_POSIX_SOURCE -c
-I/tmp_mnt/home/someone/current_petsc/petsc-2.0.15
-I/tmp_mnt/home/someone/current_petsc/petsc-2.0.15/include
-I/usr/local/mpich/include -I/usr/local/mpich/mpe -DHAVE_ESSL
-DHAVE_MPE -DPETSC_DEBUG -DPETSC_LOG -DPETSC_BOPT_g
-D__DIR__='"src/mat/examples/tests/"' -g ex1.c mpcc -g -o ex1 ex1.o
-L/tmp_mnt/home/someone/current_petsc/petsc-2.0.15/lib/libg/rs6000
-lpetscts -lpetscsnes -lpetscsles -lpetscksp -lpetscmat
-lpetscvec -lpetscsys
-lX11 /usr/local/mpich/lib/rs6000/ch_mpl/libpmpi.a
/usr/local/mpich/lib/rs6000/ch_mpl/libmpe.a
/usr/local/mpich/lib/rs6000/ch_mpl/libmpi.a /usr/lib/libxlf.a
/usr/lib/libxlf90.a -bI:/usr/lpp/xlf/lib/lowsys.exp -lm
0706-317 ERROR: Unresolved or undefined symbols detected:
Symbols in error (followed by references) are dumped to the load map.
The -bloadmap:
Problem:
The undefined symbols are IBM library routines for sparse direct solution of
linear systems. You must have compiled PETSc with the flag -DHAVE_ESSL
defined in bmake/rs6000/packages but not listed -lessl on
the line that defines the LAPACK libraries, LAPACK_LIB = ....
Cure:
- If you have ESSL installed on your machine, add -lessl
to the line LAPACK_LIB = .. in bmake/rs6000/packages, or
- if you do not have ESSL,
a) remove -DHAVE_ESSL from bmake/rs6000/packages
b) cd to src/mat/impls/aij/seq
c) type touch essl.c
d) type make BOPT=g (or make BOPT=O). This will rebuild
just the one library that needs to be rebuilt.
- Symptom: While
Installing PETSc on rs6000, using g++ libfast in:
/tmp/petsc/src/sys/src
fdate.c: In function `char * PetscGetDate()':
fdate.c:18: warning: implicit declaration of function `int
gettimeofday(...)'
libfast in: /tmp/petsc/src/viewer/impls/matlab
send.c: In function `int ViewerDestroy_Matlab(struct _PetscObject *)':
send.c:101: warning: implicit declaration of function `int
setsockopt(...)'
Problem: The prototypes of the above functions are not specified in the
gcc/g++ include files.
Cure: Edit petsc/include/pinclude/petscfix.h and,
after the lines
#if defined(PARCH_rs6000)
#if defined(__cplusplus)
extern "C" {
add the following:
extern int setsockopt(...);
extern int gettimeofday(...);
- Symptom: On
Linux and FreeBSD using older versions of f2c/gcc (for example
gcc 2.6.3) while compiling the BLAS libraries
zrotg: Error on line 13 of zrotg.f: bad argument type to intrinsic dsqrt
Problem:
An error in the compiler.
Cure:
- Upgrade your system, or
- remove any reference to the file zrotg.f from the
makefile in the blas1 directory.
- Symptom: On
IBM rs6000 "send.c", line 66.12:
1506-343 () Redeclaration of connect differs from previous declaration
on line 373 of "/usr/include/sys/socket.h".
1506-286: (E) Error in message set 12, unable to retrieve message 377.
"send.c", line 74.12: 1506-343 () Redeclaration of sleep differs from
previous declaration on line 154 of "/usr/include/unistd.h".
Problem: IBM added prototypes for these functions, which previously did not have them.
Cure: Comment out the
prototypes for connect() and sleep() in the file
src/viewers/impls/matlab/send.c
- Symptom: On
SGI Origin 2000
make BOPT=g fortran rm -f -f
/scratch-modi4/barrys/petsc-2.0.15/lib/libg/IRIX64/libpetscfortran.*
Beginning to compile Fortran interface library
Using Fortran compiler: f77 -O -g
Using C/C++ compiler: cc -64 -DPARCH_IRIX64 -woff 1164 -woff 1048 -g
Using PETSc flags: -DPETSC_DEBUG -DPETSC_LOG -DPETSC_BOPT_g
Using configuration flags: -DHAVE_PWD_H -DHAVE_STRING_H
-DHAVE_STROPTS_H -DHAVE_MALLOC_H -DHAVE_64BITS -DHAVE_X11
-DHAVE_FORTRAN_UNDERSCORE -DHAVE_DRAND48 -DHAVE_GETDOMAINNAME
-DHAVE_UNAME -DHAVE_UNISTD_H -DHAVE_SYS_TIME_H
Using include paths: -I/scratch-modi4/barrys/petsc-2.0.15
-I/scratch-modi4/barrys/petsc-2.0.15/include -DUSES_INT_MPI_COMM
Using PETSc directory: /scratch-modi4/barrys/petsc-2.0.15
Using PETSc arch: IRIX64 ------------------------------------------
for i in zoptions.o zksp.o zpc.o zsnes.o zsys.o zmat.o zvec.o zsles.o
zdraw.o zda.o zviewer.o zis.o zplog.o zstart.o zstartf.o zts.o
zao.o;
do make libmember LIBMEMBER=$i ;
done cc -64 -DPARCH_IRIX64 -woff 1164 -woff 1048 -c
-I/scratch-modi4/barrys/petsc-2.0.15
-I/scratch-modi4/barrys/petsc-2.0.15/include -DUSES_INT_MPI_COMM
-DPETSC_DEBUG -DPETSC_LOG -DPETSC_BOPT_g -DHAVE_PWD_H
-DHAVE_STRING_H -DHAVE_STROPTS_H -DHAVE_MALLOC_H -DHAVE_64BITS
-DHAVE_X11 -DHAVE_FORTRAN_UNDERSCORE -DHAVE_DRAND48
-DHAVE_GETDOMAINNAME -DHAVE_UNAME -DHAVE_UNISTD_H
-DHAVE_SYS_TIME_H -D__DIR__='"src/fortran/custom/"' -g
zoptions.c ar cr
/scratch-modi4/barrys/petsc-2.0.15/lib/libg/IRIX64/libpetscfortran.a
zoptions.o rm -f zoptions.o
sh: 12345 Memory fault(coredump) *** Error code 139 (bu21) *** Error
code 1 (bu21) (ignored) for i in somefort.o; do make libmember
LIBMEMBER=$i ; done sh: 12307 Memory fault(coredump) *** Error
code 139 (bu21)
Problem: Bug
in the SGI make
Cure: Compile
the Fortran interface "manually". cd to src/fortran/custom and
run make BOPT=g (or O, etc) libfast then cd to src/fortran/auto
and run make BOPT=g (or O, etc) libfast
- Symptom: libfast
in:
/usr/project/hsg/lemur/lemur_apps/petsc-2.0.15/src/is/interface In file
included from
/disk/usr/project/hsg/lemur/lemur_apps/petsc-2.0.15/include/petsc.h:177,
from
/disk/usr/project/hsg/lemur/lemur_apps/petsc-2.0.15/include/is.h:9,
from
/disk/usr/project/hsg/lemur/lemur_apps/petsc-2.0.15/src/is/isimpl.h:13,
from index.c:7:
/disk/usr/project/hsg/lemur/lemur_apps/petsc-2.0.15/include/plog.h:134:
mpe.h: No such file or directory
Problem: You
are installing PETSc with the HAVE_MPE option, but MPE is not
installed on your machine.
Cure: Either
edit the file bmake/$PETSC_ARCH/packages, locate the line PCONF =, and
remove the reference to -DHAVE_MPE; or install MPE on your system
and edit bmake/$PETSC_ARCH/packages to make sure that the
directory where mpe.h is located is listed on the line MPI_INCLUDE =
-Istuff
- Symptom: under
Solaris "vinv.c", line 61: MPI_Allreduce: macro recursion
gcreatev.c: > vscat.c: > "vscat.c", line 535:
MPI_Allreduce: macro recursion > "vscat.c", line 559:
MPI_Allreduce: macro recursion > "vscat.c", line 577:
MPI_Allreduce: macro recursion > "vscat.c", line 602: MPI_Allreduce:
macro recursion > "vscat.c", line 649: MPI_Allreduce: macro
recursion > "vscat.c", line 673: MPI_Allreduce: macro
recursion > vpscat.c: > >>
Problem: Not
sure. Could be a bug in the C preprocessor (cpp).
Cure: Edit include/plog.h,
locate the line
#if !defined(PETSC_USING_MPIUNI) && !defined(PARCH_hpux)
and change it to
#if !defined(PETSC_USING_MPIUNI) && !defined(PARCH_hpux) && !defined(PARCH_solaris)
In versions of PETSc later than 2.0.15,
edit the file include/petsclog.h instead.
- Symptom: under
Solaris > testexamples_3 in:
/vol/hsm/apps/petsc-2.0.15/src/is/examples/tests > f77 -c
-I/vol/hsm/apps/petsc-2.0.15 -I/vol/hsm/apps/petsc-2.0.15/include
-I/vol/hsm/apps/mpich/include -DPETSC_DEBUG -DPETSC_LOG
-DPETSC_BOPT_g -D__DIR__='"src/is/examples/tests/"' -g -xs
ex1f.F > /tmp/fpp.14568.0.f: > MAIN: > Not in
assembler subset: .xstabs ".stab.index",
.15/src/is/examples/tests;/vol/sunws/SUNWspro/bin/../SC4.2/bin/f77 -c
-I/vol/hsm/apps/petsc-2.0.15 -I/vol/hsm/apps/petsc-2.0.15/include
-I/vol/hsm/apps/mpich/include -DPETSC_DEBUG -DPETSC_LOG -DPETSC_BOPT_g
-D__DIR__='\\"src/is/examples/tests/\\"' -g -xs -qoption f77pass1
-p\\$XA0SD6NKcKFzi64. ex1f.F,0x34,0,0,0 > *** Error code 1 >
make: Fatal error: Command failed for target ex1f.o >
Current working directory /vol/hsm/apps/petsc-2.0.15/src/is/examples/tests > Missing: program
name Program ex1f either does not exist, is not
Problem: The
Sun WorkShop Compiler FORTRAN 77 4.2 uses the symbol
__DIR__ when compiling .F files, and PETSc defines the same symbol, so the two conflict.
Cure: In
the directory src/fortran/custom, remove
-D__DIR__='"${LOCDIR}"' from the makefile. Also, in every examples/tutorials or
tests directory with .F files, remove the -D__DIR__=something.
- Symptom: The
program seems to use more and more memory as it runs, even
though you don't think you are allocating more memory.
Problem:
Possibly some of the following:
- You are creating new PETSc objects but never freeing
them.
- There is a memory leak in PETSc or your code.
- Something much more subtle: (if you are using Fortran).
When you declare a large array in Fortran, the operating
system does not allocate all the memory pages for that array
until you start using the different locations in the array. Thus, in a
code, if at each step you start using later values in the
array your virtual memory usage will "continue" to increase
as measured by ps or top.
- You are running with the -log, -log_mpe, or -log_all
option. With these options a great deal of logging information is stored in
memory until the conclusion of the run.
- You are linking with the MPI profiling libraries; these
cause logging of all MPI activities. Another symptom is that at the
conclusion of the run the program may print a message about writing log
files.
Cures:
- Run with the -trmalloc_log option or -trdump. Use the
commands PetscTrDump() and PetscTrLogDump() sprinkled in
your code to track memory that is allocated and not later
freed. Use the commands PetscTrSpace() and PetscGetResidentSetSize()
to monitor memory allocated and total memory used as the
code progresses.
- For the Fortran array case: this is just the way Unix works and is harmless.
- Do not use the -log, -log_mpe, or -log_all option, or
use PLogEventDeactivate() or PLogEventDeactivateClass(),
PLogEventMPEDeactivate() to turn off logging of specific
events.
- Make sure you do not link with the MPI profiling
libraries. Edit the file bmake/$PETSC_ARCH/packages and
remove all references to libraries with lmpi and pmpi in
their names.
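The first cure amounts to counting what is allocated and what is later freed. The following is a minimal sketch of that idea in plain C (C11), not PETSc's implementation; the names tracked_malloc, tracked_free, tracked_live_bytes, and tracked_live_blocks are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header stored in front of every block so the size is known at free time.
   The union with max_align_t (C11) keeps the user pointer suitably aligned. */
typedef union {
    size_t      size;
    max_align_t align;
} Header;

static size_t live_bytes  = 0;  /* bytes allocated and not yet freed  */
static size_t live_blocks = 0;  /* blocks allocated and not yet freed */

/* Hypothetical wrapper: allocate n bytes and record them as live. */
void *tracked_malloc(size_t n)
{
    Header *h = malloc(sizeof(Header) + n);
    if (h == NULL) return NULL;
    h->size = n;
    live_bytes += n;
    live_blocks++;
    return h + 1;               /* user memory starts after the header */
}

/* Hypothetical wrapper: free a block obtained from tracked_malloc. */
void tracked_free(void *p)
{
    if (p == NULL) return;
    Header *h = (Header *)p - 1;
    assert(live_blocks > 0 && live_bytes >= h->size);
    live_bytes -= h->size;
    live_blocks--;
    free(h);
}

/* Query functions: call these periodically to watch for growth. */
size_t tracked_live_bytes(void)  { return live_bytes; }
size_t tracked_live_blocks(void) { return live_blocks; }
```

If tracked_live_bytes() is printed at the end of each time step and the value keeps growing, some objects are being created but never freed.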
- Symptom: Under
Windows when Installing using g++ libfast in:
/users/petsc/petsc_prj/petsc/src/sys/src str.c: In function `void
PetscStrncpy(char *, char *, int)': str.c:36: warning: implicit
declaration of function `int strncpy(...)' ... ...
Problem: This
is due to the case insensitivity of Windows file systems. Instead of using
string.h, the compiler is picking up String.h, a C++
include file, causing these errors.
Cure:
- In the gcc include directory, run "cp string.h string_bak.h".
- Edit petsc/src/sys/src/str.c and replace string.h with string_bak.h.
- Edit petsc/src/sys/src/memc.c and replace memory.h with string_bak.h.
- Recompile.
- Symptom: on
SGI (or Origin) using SGI MPI version 2.0 MPI Error, rank:0,
function:MPI_ERRHANDLER_SET, Invalid communicator MPI_Abort()
called, aborting program! Other, random crashes in MPI.
Problem: A bug
in version 2.0 of SGI's implementation of MPI (confirmed by
SGI).
Cure: Upgrade
to SGI's version 3.0 of MPI.
- Symptom: on
SGI Powerchallenge running 6.2 ld64: WARNING 134: weak
definition of __dcis in /usr/lib64/mips4/libftn.so preempts that
weak definition in /usr/lib64/mips4/libm.so. ld64: WARNING 134: weak
definition of __rcis in /usr/lib64/mips4/libftn.so preempts
that weak definition in /usr/lib64/mips4/libm.so.
Problem:
The message seems harmless.
Cure: To suppress it, change
the CLINKER and FLINKER in bmake/IRIX64/base to
CLINKER = cc -64 ${COPTFLAGS} -Wl,-woff,84,-woff,85,-woff,134 -rpath
${LDIR}:${DYLIBPATH}
FLINKER = f77 -64 ${FOPTFLAGS} -Wl,-woff,84,-woff,85,-woff,134 -rpath
${LDIR}:${DYLIBPATH}
- Symptom: on
HP running HP's version of MPI. sl0934 222: mpirun -np 1 ./ex12
-f
/ford/sl0934/u/kellwood/misc/petsc/petsc-2.0.17/src/mat/examples/matbinary.ex
ex12: Rank 0: Pid 4610: MPI_Attr_get: Invalid communicator:
Null communicator ex12: Rank 0: Pid 4610: MPI_Abort: Aborting
the application mpirun: Error: Job ID 4609 ended abnormally
Problem: The
HP implementation of MPI uses different values to represent MPI
communicators in C/C++ and Fortran so we have to translate the
values in the Fortran stubs.
Cure: Edit
the file src/fortran/custom/zpetsc.h and change the lines
#else
#define PetscToPointer(a) (a)
#define PetscFromPointer(a) (int)(a)
#define PetscRmPointer(a)
#define PetscToPointerComm(a) (a)
#define PetscFromPointerComm(a) (int)(a)
#endif
to
#else
#define PetscToPointer(a) (a)
#define PetscFromPointer(a) (int)(a)
#define PetscRmPointer(a)
#define PetscToPointerComm(a) MPI_Comm_F2C(a)
#define PetscFromPointerComm(a) MPI_Comm_C2F(a)
#endif
then rebuild the Fortran interface library by
running make BOPT=g (or BOPT=O or g_c++, etc.) fortran in the
main PETSc directory.
- Symptom: on
Cray T3D/T3E mpirun -np 1 ex1 -log_info /bin/mpprun: exec of
'ex1' failed: No such file or directory
Problem: ./
is not in your path.
Cure: add ./
to your PATH variable in your .cshrc
- Symptom: With
GMRES, at restart the second residual norm printed does not
match the first
26 KSP Residual norm 3.421544615851e-04
27 KSP Residual norm 2.973675659493e-04
28 KSP Residual norm 2.588642948270e-04
29 KSP Residual norm 2.268190747349e-04
30 KSP Residual norm 1.977245964368e-04
30 KSP Residual norm 1.994426291979e-04 <----- At restart the
residual norm is printed a second time
Problem:
Actually this is not surprising. GMRES computes the norm of the
residual at each iteration via a recurrence relation between
the norms of the residuals at the previous iterations and quantities
computed at the current iteration; it does not compute
|| b - A x^{n} || directly. Sometimes, especially with an
ill-conditioned matrix, or computation of the matrix-vector
product via differencing, the residual norms computed by GMRES start to
"drift" from the correct values. At the restart, we compute the
residual norm directly, hence the difference between the two
values printed. The drifting, if it remains small, is
harmless (it does not affect the accuracy of the solution that GMRES
computes).
Cure: There
really isn't a cure, but if you use a more powerful
preconditioner the drift will often be smaller and less
noticeable. Or, if you are running matrix-free, you may need to tune the
matrix-free parameters.
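The two quantities being compared can be written side by side. This is a sketch that assumes the common Givens-rotation formulation of GMRES; the text here does not say which formulation PETSc uses, and the symbols \rho_k (recurrence estimate) and s_k (sine of the k-th Givens rotation) are introduced only for illustration.

```latex
% Estimate carried along by the recurrence within one restart cycle:
\rho_0 = \| b - A x^{0} \|, \qquad
\rho_k = |s_k| \, \rho_{k-1}, \quad k = 1, 2, \dots

% Residual norm computed directly, which is what is printed at a restart:
\| r_k \| = \| b - A x^{k} \|

% In exact arithmetic the two agree, \rho_k = \| r_k \|.
% In floating point, \rho_k - \| r_k \| is in general nonzero;
% this gap is the "drift" seen in the output.
```

In the output shown in the symptom, the first value printed at iteration 30 corresponds to \rho_{30} and the second to the directly computed norm.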
- Symptom: Error
message during installation, while compiling
src/mat/impls/rowbs/mpi/mpirowbs.c, about ProcSet, for example,
mpirowbs.c:
cfe: Error: mpirowbs.c, line 1753: Syntax Error
BSctx_set_ps(bspinfo,(ProcSet*)comm); {if (__BSERROR_STATUS)
> {fprintf((&__iob[2]) , "BlockSolve95 Error Code >
%d\n",__BSERROR_STATUS); {if (1) {return
PetscError(1753,"MatCreateMPIRowbs" >
,"mpirowbs.c","src/mat/impls/rowbs/mpi/" ,1,0,(char *)0);} ;} ;}} ;
Problem: BlockSolve95
was changed incompletely.
Cure: Remove
the (ProcSet*) from mpirowbs.c
- Symptom: Error
message when using BlockSolve95 0 - Error in MPI object : Could
not convert index -268437876(effff68c) into a pointer. The
index may be an incorrect argument. Possible sources of this problem
are a missing "include 'mpif.h'", a misspelled MPI object
(e.g., MPI_COM_WORLD instead of MPI_COMM_WORLD) or a misspelled
user variable for an MPI object (e.g., com instead of comm).
[0] Aborting program ! [0] Aborting program!
Problem: This
is due to a change in the prototype of a BlockSolve95 function.
Cure: Install
the latest version of BlockSolve95
- Symptom: Error
When Installing PETSc with BlockSolve95 mpirowbs.c: In function
`int MatCreateMPIRowbs(struct MPIR_COMMUNICATOR *, int, int,
int, int *, void *, struct _p_Mat **)': mpirowbs.c:1755: passing
`MPIR_COMMUNICATOR *' as argument 2 of `BSctx_set_ps(__BSprocinfo *,
MPIR_COMMUNICATOR **)'
Problem: This
is due to a change in the prototype of a BlockSolve95 function.
Cure: Install
the latest version of BlockSolve95
- Symptom: on
Cray T3E/T3D: my mixed Fortran/(C or C++) code works fine on
other machines but does not link (or links but crashes) on
the Cray T3D or T3E.
Probable problems:
- NOT LINKING: the Cray Fortran compiler changes all
Fortran routine names to all caps, so when you call them
from C/C++ with all small letters, the linker cannot find them.
- STRANGE CRASHING: the Cray Fortran compiler uses double
precision to denote quad precision and single precision to
denote "regular" double precision.
Cures:
- You must make sure that when you call Fortran routines
from C/C++ the name of the routine called (in C/C++) is in
all caps. The PETSc macro HAVE_FORTRAN_CAPS is defined on
machines like the Cray, so you can use it in your C/C++ like this:
#if defined(HAVE_FORTRAN_CAPS)
#define myfortranroutine_ MYFORTRANROUTINE
#elif !defined(HAVE_FORTRAN_UNDERSCORE)
#define myfortranroutine_ myfortranroutine
#endif
/* some C code that calls Fortran */
myfortranroutine_(.....)
See src/fortran/custom/zoptions.c for examples.
- To get the Fortran compiler to behave like a normal
Unix Fortran compiler you must make sure that all of your
Fortran routines are compiled with the -dp flag. If you use the
PETSc makefiles and the macro FC to compile your Fortran code, this is
handled automatically.
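The name-mapping cure can be tried without a Fortran compiler. The sketch below is only an illustration: it defines HAVE_FORTRAN_CAPS by hand to mimic a Cray-like machine, and it uses a C function named MYFORTRANROUTINE as a stand-in for the compiled Fortran routine. The helper call_fortran_double is hypothetical.

```c
#include <assert.h>

/* Assumption for this sketch only: pretend the Fortran compiler
   upper-cases routine names, as on the Cray. PETSc normally defines
   this macro for you on such machines. */
#define HAVE_FORTRAN_CAPS

/* Map the name used in the C source onto the name the linker sees. */
#if defined(HAVE_FORTRAN_CAPS)
#define myfortranroutine_ MYFORTRANROUTINE
#elif !defined(HAVE_FORTRAN_UNDERSCORE)
#define myfortranroutine_ myfortranroutine
#endif

/* Stand-in for the Fortran routine, written in C so the example links
   on its own. Fortran passes arguments by reference, hence the pointer. */
void MYFORTRANROUTINE(int *n)
{
    *n = 2 * (*n);
}

/* Hypothetical C caller: it always uses the trailing-underscore
   spelling, and the preprocessor rewrites it to the right symbol. */
int call_fortran_double(int n)
{
    myfortranroutine_(&n);
    return n;
}
```

Because the C code only ever mentions myfortranroutine_, the same source builds unchanged on machines with any of the three naming conventions; only the macro definitions differ.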
- Symptom: Error
when running petsc example
fire >>mpirun ex11
ld.so.1: /XXXXX/petsc/src/sles/examples/tutorials/ex11: fatal:
libcomplex.so.5: can't open file:
errno=2
Problem: This
happens when a shared library is used instead of the regular
library (when the -l option is used, the shared version of the
library is used if present) but the location of this library is not
known to the shared library loader ld.so.
Cure: You can
do either of the following:
1. Add the path where the library is located to the environment
variable LD_LIBRARY_PATH. Assuming that this library is located
in /opt/SUNWspro/lib, you can do:
setenv LD_LIBRARY_PATH /opt/SUNWspro/lib:/lib
Note: For parallel jobs you have to make sure that all processes
started see this variable, so it should be set in your .cshrc
file
2. On some machines (e.g., solaris, IRIX, IRIX64), you can
set this path in the variable DYLIBPATH in ${PETSC_DIR}/bmake/${PETSC_ARCH}/packages
as:
DYLIBPATH = /opt/SUNWspro/lib
On Solaris the library can often be found in /opt/SUNWspro/SC4.2/lib
or /opt/SUNWspro/SC4.4/lib etc
- Symptom: ld64:
ERROR 33: Unresolved text symbol "MPIR_ToPointer" -- 1st
referenced by /d$ > ld64: ERROR 33: Unresolved text symbol
"MPIR_FromPointer" -- 1st referenced by $ > ld64: INFO 60: Output
file removed because of error. or When first compiled the fortran
libraries give this error: > > Error zsys.c: line 114
MPI_Comm incompatible with (void*) parameter. >n*(int*)comm
= PetscFromPointerComm(c);
Cure: Make
sure that the MPI_INCLUDE line in the file bmake/$PETSC_ARCH/packages
contains -DUSES_INT_MPI_COMM; if it does not, add it and rerun
make BOPT=g (or O, etc.) fortran.
- Symptom: [0]
PETSC ERROR: MatAssemblyBegin() line 1858 in
src/mat/interface/matrix.c Not for factored matrix
Problem: You
are trying to assemble a matrix that has been factored.
Normally this does not make sense, unless you are using an in-place
factorization and want to reuse the space.
Cure: Call
MatSetUnfactored(Mat); before calling the MatSetValues()
routines.
- Symptom: Error
while running PETSc examples on IBM SP using IBM's MPI
exec(): 0509-036 Cannot load program ex1 because of
the following errors:
0509-023 Symbol pm_exit_value in /usr/lpp/ppe.poe/lib/libmpi.a is not
defined.
0509-023 Symbol pm_exit_value in /usr/lpp/ppe.poe/lib/libmpi.a is not
defined.
0509-022 Cannot load library libvtd.a[dynamic.o].
0509-026 System error: Cannot run a file that does not have a valid
format.
Problem: This
problem occurs when the version of libc.a on your system is
different from the version of libc.a used to build IBM's MPI.
For more information on this problem refer to 'IBM AIX PE: Hitchhikers
Guide'.
Cure: Edit
bmake/rs6000/packages and add the following to the variable
EXTERNAL_LIB
EXTERNAL_LIB = /usr/lpp/ppe.poe/lib/libc.a /usr/lpp/ppe.poe/lib/libc_r.a
- Symptom: Error
while compiling under Linux (more recent versions of OS), with
C++ compiler
libfast in:/usr/local/petsc-2.0.21/src/sys/src
mem.c:13: declaration of C function 'int getrusage(int, struct
rusage*)' conflicts with
.....
Problem: Linux
recently added a prototype for this function that conflicts
with the one PETSc was using.
Cure: Edit
src/sys/src/mem.c and src/sys/src/cputime.c and remove the line
beginning with
extern int getrusage(....)
- Symptom: Error
when building a fortran example:
ld: 0706-006 Cannot find or open library file: -l petscfortran
ld:open(): A file or directory in the path name does not exist.
make: 1254-004 The error code from the last command is 255.
make: 1254-005 Ignored error code 255 from last command.
rm -f ex4f.o
....
Problem: The Fortran
libraries have not been built.
Cure: Build
the Fortran libraries by invoking the command
make BOPT=g fortran. Note any errors when compiling and send them to
petsc-maint@mcs.anl.gov
- Symptom: Error
while running program
> mpirun ex1
p0_27137: p4_error: semget failed for setnum=%d: 0
Problem:
Improperly installed or configured MPICH. Often this results
from compiling the socket-based version of MPICH, with device
ch_p4, but using the mpirun associated with the shared-memory version, or
the other way around.
Cure: First,
make sure that you can run plain old MPI programs (those
without PETSc). Make sure you are using the correct version of
mpirun for the installed version of MPICH, or reinstall MPICH.
- Symptom:
Error when compiling PETSc examples
> ld : -lg2c no such file
Problem: Your
fortran compiler is probably using libf2c.a instead of libg2c.a
Cure: Edit
bmake/${PETSC_ARCH}/variables and replace -lg2c with -lf2c
- Symptom:
Get the following errors when using PETSc graphics on windows/cygwin-X11
X Error of failed request: BadMatch (invalid parameter attributes)
Major opcode of failed request: 78 (X_CreateColormap)
Serial number of failed request: 8
Current serial number in output stream: 9
Problem: This
problem might occur when using 256-color mode or 32-bit color
mode on Windows.
Cure: This
can be fixed by changing the display settings on Windows to 16-bit
or 24-bit color.
- Symptom: Some
Krylov methods seem to print two residual norms per iteration,
for example
> 1198 KSP Residual norm 1.366052062216e-04
> 1198 KSP Residual norm 1.931875025549e-04
> 1199 KSP Residual norm 1.366026406067e-04
> 1199 KSP Residual norm 1.931819426344e-04
Problem:
Some Krylov methods, for example tfqmr, actually have a "sub-iteration"
of size 2 inside the loop; each of the two substeps has its own
matrix-vector product and application of the preconditioner and updates
the residual approximations. This is why you get this "funny" output
where it looks like there are two residual norms per iteration. You can
also think of it as twice as many iterations.
- Symptom: The
example compiles fine, but at runtime gives the following
error:
[0]PETSC ERROR: PetscInitialize_DynamicLibraries() line 63
in src/sys/src/dll/reg.c
[0]PETSC ERROR: Unable to locate PETSc dynamic library
/home/balay/spetsc/lib/libg/linux/libpetsc
You cannot move the dynamic libraries!
or remove USE_DYNAMIC_LIBRARIES from
${PETSC_DIR}/bmake/$PETSC_ARCH/petscconf.h
and rebuild libraries before moving!
Problem: When
using dynamic libraries, the libraries cannot be moved after
they are installed. This can also happen on clusters where
the paths on the (run) nodes differ from those on the
(compile) front end.
Cure:
Do not use dynamic or shared libraries. This can
be done by removing the flag PETSC_USE_DYNAMIC_LIBRARIES from
the bmake/${PETSC_ARCH}/petscconf.h file and rebuilding the
libraries. You might also want to remove the shared libraries by
invoking
make BOPT=g deleteshared
- Symptom:
When running with -start_in_debugger one gets the error message
PETSC: Attaching gdb to /opt/procast_mpich/procast051003/./procast of pid 31603 on display linux.:0.\
0 on machine linux.
: Can't get address for linux.
Xt error: Can't open display: linux.:0.0
Problem: The remote nodes
do not know where to display the debugger window.
Cure: Run with the additional option
-display displayname, where displayname is something like mymachine:0.0
- Symptom:
Too many communicators (2046) in MPI_Comm_dup
Problem: If you create a PETSc object
with MPI_COMM_WORLD, MPI_COMM_SELF or a communicator you made yourself, PETSc needs to duplicate
the communicator (otherwise there may be conflicts between the tags PETSc uses and those you use). Thus if
you create many PETSc objects you may run out of communicators.
Cure: Use PETSC_COMM_WORLD, PETSC_COMM_SELF,
or a communicator obtained with PetscCommDuplicate() or PetscObjectGetComm().