Date: Sun, 6 May 2001 18:06:09 -0400 (EDT)
From: Jeff Squyres
Subject: MPI_FINALIZE
Sender: owner-mpi-21@XXXXXXXXXXXXX

A few points:

1. Implementing the change in MPI_FINALIZE to make it collective over
   "the union of all processes that have been and continue to be
   connected" requires a non-trivial distributed algorithm, since it is
   essentially a barrier over potentially unrelated and
   not-directly-connected processes.

2. Is there a difference between "have been and continue to be
   connected" and "are connected"?

3. This change could drastically alter the semantics of currently-valid
   MPI programs.

   As one example: currently-valid "task-farm" programs may
   unintentionally create a lot of "zombied" MPI processes that are
   simply waiting for an MPI_FINALIZE from their ancestor(s). Consider
   what happens if a root process continually spawns short-lived MPI
   processes to perform some task in a "fire and forget" kind of model.
   The short-lived child processes could previously invoke MPI_FINALIZE
   and die. With the proposed change, the short-lived processes will
   now block waiting for the parent to invoke MPI_FINALIZE as well.

   This program can be fixed by having the root and child processes
   invoke MPI_COMM_DISCONNECT right after spawning (or whenever the
   last message between the root and children finishes) so that the
   child can MPI_FINALIZE by itself and then die.

   But my concern is backwards compatibility: we have no idea how many
   programs exist that rely on MPI_FINALIZEing over just
   MPI_COMM_WORLD. Changing the spec now could cause unintended
   side-effects in currently-valid MPI programs.

{+} Jeff Squyres
{+} jsquyres@lam-mpi.org
{+} http://www.lam-mpi.org/