- How do I diagnose and/or debug an
MPI problem where the code hangs?
- Do we have installation for C++ binding
for mpi++ (on AIX)?
- What causes the .mpirun[process num]
files to appear (on AIX)?
- Is there any limitation on how many
processes my MPI job can fork on a system?
- I am using MPI_Gather and receiving the
error "trying to receive a message when there are no connections." Why?
- What do these errors mean? "MPIRUN
chose the wrong device ch_shmem; program needs device ump2", "ump2main.c:227
Internal Error, The magic is missing." (on GPS machine)
- I am receiving a run-time error: "mpi
Invalid communicator error." Why?
- What is the maximum number of MPI
tasks a job may have? (on Purple, uP, UM, UV machines)
- I don't understand this error message:
"MPI: INTERNAL ERROR catalog was closed, or catalog was not initialized." (on
IBM machines)
- I don't understand this error message:
MPID Die - ump2main.c:453 "ump_init failure." (on SC cluster)
- Fortran compiler "mpiifc"
does not include a link to "mpi-io" libraries. (on ALC, MCR)
- I get a segmentation fault when I compile
with 64-bit MPI. (on Purple, uP, UM, UV)
- What does this warning message mean,
and how can I eliminate it: "weak symbol multiply defined"?
- Are there any web pages or other
documents that might provide "stack size information and strategy" for
the users?
- What does this error mean? "semget
failed for setnum = 0 Abort." (on ILX)
- How can I determine which version
of MPICH is running on a machine?
- What is simultaneous multithreading?
- What are the expected performance
gains for SMT-capable systems?
- I have heard people refer to "the
hypervisor" on the Purple and uP machines. What is the hypervisor? Do all
LLNL machines have hypervisors?
Q: How do I diagnose and/or debug an MPI problem
where the code hangs?
A: Try these steps:
- Try to run with one process; is the error still present? If not, try running
with two processes; is the error still present?
- Try running the job under the debugger (TotalView). When you think the job
is hung, interact with the debugger to determine where the hang is occurring
(i.e., what part of your code or MPI is involved). For example, are you in a
loop sending messages (infinitely), or are you hung because you are waiting for
an event that doesn't appear to happen, such as a message receive?
- Check your MPI environment variables (env | grep
MP_). Are these the right settings for your MPI use?
If none of these things help, give the LC-Hotline more details on what your
code is doing, such as:
- Are you using the vendor's MPI or MPICH?
- Size of messages and number of messages being sent?
- Does this same job run successfully if you run it as a batch job? interactive
job? Are you setting different environment variables with a batch version than
with the interactive version?
- Does your job read input interactively? Is there a message number prefixing
the error message, and what is it? (e.g., 032-xxxx or some such form).
Q: Do we have installation for C++ binding for
mpi++ (on AIX)?
A: The IBM MPI supports C++. There is no mpi++, but there are C++ interfaces
that are provided for C++ codes that make MPI calls. At the present time, IBM
supports only the C compatibility for C++, not the C++ interfaces that were added
for MPI-2. You need to #include <mpi.h>.
You also need to load with the MPI library, which is automatic when you use the
mpCC (C++) MPI script for compiling/linking with the IBM MPI library.
Q: What causes the .mpirun[process num] files to
appear (on AIX)?
A: The .mpirun[process number] files are created when mpirun is used to launch
IBM MPI jobs. You should be using poe instead; mpirun is for launching MPICH
MPI jobs, not a general-purpose parallel job launcher.
Q: Is there any limitation on how many processes
my MPI job can fork on a system?
A: There are no limits other than those imposed by the batch system.
Q: I am using MPI_Gather and receiving the error "trying
to receive a message when there are no connections." Why?
A: You called MPI_Barrier on MPI_COMM_WORLD, but not all processes
executed this call, so the collective could not establish communication with every process.
Q: What do these errors mean? "MPIRUN chose
the wrong device ch_shmem; program needs device ump2", "ump2main.c:227
Internal Error, The magic is missing." (on GPS machine)
A: This happens when a user is mixing and matching the Compaq MPI with
MPICH. Check to see if you have an explicit -lmpi being
loaded. This would (inadvertently) add the Compaq MPI library, and probably before
the -lmpich that would be loaded automatically
by the mpicc script. MPI
Libraries/Building Executables explains that there are two different MPIs
available on GPS. One is the Compaq MPI, whose executables must be run with dmpirun.
The other is MPICH, whose executables must be built and run with the standard
MPICH scripts (mpicc and mpirun). If you want the MPICH version, you should
recompile and load with the mpicc (or mpiCC or mpif90) script, and then run
your executable with mpirun. (By the way, the default MPICH
on GPS is a shared-memory version; there is also a P4 version.) Please note that
the Compaq MPI provides optimizations for the Alphas that MPICH cannot, but you
must run the executable with dmpirun to get the correct initialization of the
MPI environment.
Q: I am receiving a run-time error: "mpi Invalid
communicator error". Why?
A: This sort of error occurs when MPICH header files are mixed with
IBM's MPI libraries, or vice versa. As a first step, decide which version of MPI
you wish to use on White. The IBM MPI is the recommended version. If you are
using MPICH, it is recommended you use the MPICH scripts (mpicc, mpirun, etc.)
that provide the correct -I and -L paths
and the correct libraries. It is possible that you are using explicit -I or -L options
that are no longer valid; this could result in locating the incorrect header
files.
Q: What is the maximum number of MPI tasks a job
may have? (on Purple, uP, UM, UV machines)
A: Up to 4096 User Space tasks. Up to 2048 IP tasks.
Q: I don't understand this error message: "MPI:
INTERNAL ERROR catalog was closed, or catalog was not initialized". (on
IBM machines)
A: Compile with the -binitfini:poe_remote_main linker flag
that is required for POE applications. This will give informative error messages
that indicate any linking problems.
Q: I don't understand this error message: MPID
Die - ump2main.c:453 "ump_init failure" (on SC cluster)
A: This means that not enough shared memory is available. Run mpiclean,
then execute your parallel code again. If the problem persists, try running on
another node of the cluster. If the problem still persists, call the LC-Hotline
and ask them to have the shared-memory cleaned up.
Q: Fortran compiler "mpiifc" does not
include a link to "mpi-io" libraries. (on ALC, MCR)
A: This is a known issue. We are still waiting for the vendor (Quadrics)
to provide a Fortran library libfmpi.a with the mpiio routines included. In the
meantime, users can modify their code to link with the C library libmpi.a which
includes the mpiio routines.
Q: I get a segmentation fault when I compile with
64-bit MPI. (on Purple, uP, UM, UV)
A: To compile with 64-bit MPI:
- Compile with flags -q64 -qwarn64. This
will tell you about all the illegal conversions from int to pointer and back.
- Add -brtl -L... after -q64 for
the link line only (the -brtl can screw
up normal compiles to object files).
- Do not use the flags -bmaxdata or -bmaxstack in
64-bit mode compilations. In 32-bit mode, these give you more memory. In 64-bit
mode, they restrict your memory usage. In 64-bit mode, the default is unlimited.
- Set environment variable: setenv OBJECT_MODE 64
- You cannot mix 32-bit and 64-bit items, so make sure your entire code has
been compiled with these options. If you are loading any of your own libraries,
they too must be compiled with the 64-bit options.
- Do not explicitly include -lxlf90 -lm -lc in
your link line. These should not be necessary, and it is possible for this to
cause problems (usually link problems). We recommend taking out these unless
there is a good reason to have them.
- Caveat: Most 64-bit codes that get a segmentation fault on White are not
prototyping the malloc routine. Your best bet would be to run TotalView on the
executable and see if the segmentation fault happens in C code. I suspect that
a C library (which you link with) is calling malloc. Look for all C routines
that deal with memory allocation and add # include <stdlib.h> in
them. In 32-bit mode, malloc/calloc/etc. works properly even if stdlib.h is not
included. In 64-bit mode, not having the prototype causes the pointer to get
corrupted, resulting in seg faults. On other 64-bit platforms such as IRIX and
Tru64, they eventually modified their compilers to automatically prevent this
type of error for malloc/calloc/etc. (sort of an automatic prototyping) because
of all the problems caused. Make sure a prototype for array_alloc() that returns
a pointer is visible from everywhere it is being used. Any function that returns
a pointer but doesn't have a prototype visible will cause problems. The compiler
will do the wrong thing every time otherwise.
Q: What does this warning message mean, and how
can I eliminate it: "weak symbol multiply defined"?
A: The -w option to mpiCC is passed to
g++, and it suppresses compilation warning messages only. To suppress the load
(multiply-defined symbol) messages, you should try -Wl,-s.
This will pass the -s to the loader (ld). If
the -Wl,-s option is not satisfactory, you could
also try the g++ option to turn off weak symbol support, -fno-weak.
Q: Are there any web pages or other documents
that might provide "stack size information and strategy" for the users?
A: Read Jeff Fier's excellent summary of thread
stack usage.
Q: What does this error mean? "semget failed
for setnum = 0 Abort" (on ILX)
A: This error means that not enough shared memory is available. There
are memory segments left over from a crash that need to be cleaned up. Users
should run mpiclean to clean up any memory segments that may have been left on
the node. If the problem persists, contact the LC-Hotline.
Q: How can I determine which version of MPICH
is running on a machine?
A: To determine which version of MPICH is running on a machine, you may
use the -compile_info or -link_info options
to any of the MPICH compilation scripts, such as mpicc. For example, typing mpicc
-compile_info identifies the default version of MPICH used on the ILX
machine as 1.2.4. On all elan-based clusters (ALC, MCR, Pengra, Emperor, Adelie,
and Lilac), we run the same version of MPI. This MPI (provided by Quadrics) is
version 1.24-8 and is based on MPICH 1.2.4.
Q: What is simultaneous multithreading?
A: Simultaneous multithreading (SMT) is not hard to understand. In traditional
designs, the entire collection of functional units in the CPU belong to one process
at a time. A process can therefore be executing instructions in some or all of
the functional units of the processor and nobody else can be using it at that
instant. With a feature that we called hardware multithreading in the RS64 line
of processors a few years ago, we provided additional hardware resources that
allowed two processes to have their state, essentially, on chip. When a process
had a cache miss that would normally stall it, it would switch to the other process,
the other thread, with a three-cycle pause. So this was still only one process
executing at a time, but could switch back and forth between two of them very,
very rapidly.
In SMT we have widened the data path somewhat to allow a thread indicator
on each instruction. So, we actually can fetch from two different instruction
streams and have instructions from two different instruction streams issuing
simultaneously to the different functional units on the chip.
We currently support two threads on the system, and it is a very general-purpose
mechanism. You can have instructions from different threads in different pipeline
stages of the same functional unit just following each other through. And it
provides the ability to use the hardware, the processor functional units much
more efficiently.
[Adapted from text provided by John McCalpin, IBM]
Q: What are the expected performance gains for
SMT-capable systems?
A: It varies from negative in some cases to gains of up to 60% in others.
It is not at all unusual to see a 20 to 30% speedup on applications without
doing anything special.
[Adapted from text provided by John McCalpin, IBM]
Q: I have heard people refer to "the Hypervisor" on
the Purple and uP machines. What is the Hypervisor? Do all LLNL machines have
Hypervisors?
A: The Hypervisor is software present on IBM POWER5-based machines. Traditionally,
the operating system's job is to provide an interface between the user and the
hardware and to provide protection so that the user cannot access any part of
the hardware in an uncontrolled way. In current directions, especially in server
consolidation projects, one finds that you want to run multiple operating systems
on the same piece of hardware, especially with all of the operating system exploits
and security problems that are happening.
So, what IBM has done is added a new
operating system, essentially, called the Hypervisor, that sits between the operating
systems and the hardware. The Hypervisor is modestly complicated, but it's enough
smaller than an operating system that one can have a lot more confidence about
its reliability. And now when the operating system wants to interact with the
hardware, it has to do it only through the Hypervisor or with the permission
of the Hypervisor. So that you can have, for example, multiple Linux kernels
running on the same hardware, and even if there is a security problem in Linux
and someone compromises the kernel, that kernel is still prevented from interfering
with any of the other partitions on the machine.
[Adapted from text provided
by John McCalpin, IBM]