This article was kindly scanned, transformed into text, and spellchecked by George Perkins and Jeannette Dubendorf of MSU, with a little postprocessing by Jim Linnemann.

COMPUTERS IN PHYSICS, VOL 9, NO. 2, MAR/APR 1995 p 175

SCIENTIFIC PROGRAMMING

CREATING AND USING PDB FILES

Stewart A. Brown, Paul F. Dubois, and David H. Munro

Stewart A. Brown is a physicist in the Defense Sciences Program at Lawrence Livermore National Laboratory. Paul F. Dubois is a mathematician in the Inertial Confinement Fusion Program at Lawrence Livermore National Laboratory. David H. Munro is a physicist in the Inertial Confinement Fusion Program at Lawrence Livermore National Laboratory.

Department Editor: Paul F. Dubois duboisl@llnl.gov.

Hypertext Link Needed Here

For some time I have been puzzled about how best to discuss certain topics in this department. Obviously, it is nice to be able to talk about programs in Scientific Programming, but it is in fact somewhat hard to do. Any real program, even a real toy program, would occupy page after page in the magazine, which could otherwise be filled with more interesting information (or advertisements!). Up to now, I have tried to reduce things to small excerpts only. However, this has its limitations, and so this month I am trying an experiment. This article refers to a sample program, which you should obtain or view via the Internet for maximum reading pleasure. Instructions for obtaining it are in "Feedback" at the end of this article. Your comments on this approach would be appreciated. You can reach me at duboisl@llnl.gov.

Paul F. Dubois
Department Editor

There are many ways to write scientific data into files for later processing. Recent years have seen a movement away from application-specific formats towards symbolic, self-describing portable files.1 The following advantages offered by this method are compelling:

- Access is keyed by symbolic names and need not be in the order the information was written.
  Unnecessary data do not have to be read.
- The files are portable over a wide range of computers.
- The files are robust with respect to changes in the programs that read and write them.
- The technically difficult part is encapsulated in the library that accesses the files and need not be repeated in each application.

This approach has not yet been standardized, despite the fact that the leading contenders have much the same goals and general philosophy. Most libraries can store only simple basic data types, and there is a lack of standards for indicating the semantics of the data. For example, if we wish to write an array that is dimensioned, say, (-10:10), how do we convey that fact, via the file, to programs that may wish to read and present the data if the database does not usually store such information? How do we store data of complex type if this is not a type native to the database?

For the user of languages that allow pointers and structures, there is the additional question of pointer-following (picking up all the data in the structure and the data it points to, etc.) and morphology preservation (making sure that if objects A and B both point to C as part of themselves, they still share C after retrieval from the file). In the Eiffel language, a facility for storing persistent objects solves both these problems. Indeed, it is possible to transmit objects between client and server. In C++, several methods of solution are available, including the straightforward approach of the Rogue Wave library and the more exotic OODBM schemes such as Object Design's Object Store, in which only minimal distinctions are made between the variables in the program and the persistent data.

In this article, we will show you how to use the PDB library from Fortran. This library is freely redistributable, has some technical advantages, and is easy to use from Fortran.
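The pointer-following and morphology-preservation requirements described above can be stated in a few lines of code. The sketch below uses Python's standard pickle module purely as an illustration of the two properties. It is not PDB or PACT, and the objects obj_a, obj_b, and shared_c are hypothetical.

```python
import pickle

# Hypothetical objects: A and B each hold a reference to the same C.
shared_c = {"name": "C", "values": [1.0, 2.0, 3.0]}
obj_a = {"name": "A", "child": shared_c}
obj_b = {"name": "B", "child": shared_c}

# Serialize both objects together so the serializer can see the sharing.
blob = pickle.dumps((obj_a, obj_b))

# Read them back, as a later program would.
a2, b2 = pickle.loads(blob)

# Pointer-following: the data reachable through A came along with it.
print(a2["child"]["values"])

# Morphology preservation: A and B still share one C, not two copies.
print(a2["child"] is b2["child"])
```

If obj_a and obj_b were serialized in two separate calls instead, each would carry its own copy of C, which is exactly the loss of morphology the text warns about.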
Once we have shown you how to write data files, we will discuss three of the postprocessing tools available for examining these files and how to use them. Instructions for obtaining and installing the software mentioned in this article are in a separate box at the end of this article. All the software is available via anonymous FTP. Two of the tools discussed can also process files written in the popular netCDF format.

Part of PACT

The PDB library is part of a larger development package named PACT. Two other development/postprocessing systems are presented, Basis and Yorick. This article contains brief examples of using these tools. Because all these tools can both read and write PDB and ASCII data, and can read or be modified to read a variety of other formats, they are worth knowing about in their own right. Some of the philosophy behind such programmable-application development packages was presented previously.2 We will also cover how to access the data if you are writing your own postprocessing tool or a data filter to a commercial package. Those programming in C should know that when an application is fully integrated into PACT, an arbitrary C-struct can be stored with pointer-following done by the library.

Writing a file

"Snapshot" is Livermorese for "a set of data written out at a given time" in a time-evolving simulation. A "history" file, on the other hand, is a file or family of files to which new values are periodically added, so that they contain the history of the value of a quantity sampled at discrete time points. The sample program is, for simplicity, a combination of both types, containing a single image of several different scalar and array quantities (a snapshot) plus the time histories of two variables, time and y, stored under the names time and yt in the file.

This is obvious but worth saying: the file can contain the data under a different name, or even a different shape, from that in the program.
In our example, we will create a "complex" type, because that is not native to PDB, and store a complex Fortran array of dimension m in it. However, we could store it as a real array of dimension 2m or as a real array of dimension (2, m). We also store the variable z in the file with a shape that is different from the shape that it had in the program. Creating the complex type allows us to deliver the semantics of this array to the postprocessing utility, which may understand this particular type and be able to deal with it appropriately.

Creating the file

In the sample program, we first initialize the data to be stored. Then we create the file, define the complex type in it, write the snapshot data, and make a series of "time history" calls. Finally we close the file.

The file is created by the PDB library function pfopen. In the sample program, we have encapsulated all the PDB calls (those with names beginning with the letters pf) with routines that call the PDB routines with an easier-to-use set of arguments. These encapsulation routines also check the error return and, in case of error, retrieve the library error message. The names of the encapsulation routines begin with the letters pfx. Utility routines for handling strings and error messages are included. Most notable is strsize, which calculates the length of a character variable without trailing blanks.

Note that the call to pfxopen(filename, "w") is similar to the standard C call for creating a file. The user-defined data type fcomplex is created in the routine pfxwdef, which sets up a small array to contain the definition of the type as a C-struct. In this case, we tell PDB that each element of an fcomplex type array is to contain two floating-point members called r and i, representing the real and imaginary parts.
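The three ways of storing a complex array mentioned above (as records with members r and i, as a real array of dimension 2m, or as a real array of dimension (2, m)) describe the same numbers laid out the same way. The Python sketch below illustrates this with the standard struct module. It does not call PDB; the sample values are hypothetical, and it assumes 32-bit floats with no padding between r and i.

```python
import struct

# Hypothetical sample data: a complex array of dimension m = 3.
w = [1.0 + 2.0j, 3.0 - 4.0j, -5.0 + 0.5j]
m = len(w)

# Layout of an array of {float r; float i;} records: r and i interleaved.
flat = []
for value in w:
    flat.extend((value.real, value.imag))
raw = struct.pack("<%df" % (2 * m), *flat)

# View 1: a real array of dimension 2m.
as_real_2m = list(struct.unpack("<%df" % (2 * m), raw))

# View 2: a real array of dimension (2, m) in Fortran (column-major) order,
# so column k holds the pair (r, i) for element k.
as_real_2_by_m = [as_real_2m[2 * k: 2 * k + 2] for k in range(m)]

# View 3: records with named members r and i, which keeps the semantics.
as_records = [{"r": r, "i": i} for r, i in as_real_2_by_m]

print(as_real_2m)
print(as_records[1])
```

Only the third view tells a reader of the file that the numbers are real and imaginary parts, which is the reason the text gives for defining the type.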
Writing data

To write data to the PDB file, you must specify the file identifier returned by pfopen, the name under which you wish to write the data, a PDB type name corresponding to the type of the data (for the most part, this is the C type), the actual data, the dimensionality of the data ndim, and an array ind(3,ndim) of integers giving, for each dimension, the lowest and highest indices of the data to be stored and an optional stride. For example, to write the scalar integer n, we have a call to pfxwrtd with arguments

    (file1, "n", "long", n, 0, 0)

while for the real array x(-n:n) we have

    ndim = 1
    ind(1,1) = -n
    ind(2,1) = n
    ind(3,1) = 1

and the call to pfxwrtd has arguments

    (file1, "x", "float", x, ndim, ind)

For Fortran character variables, the best thing to do is to pretend these are arrays of characters (C type "char"), so that the value we store in the file appears to have an "extra" first dimension representing the length of each Fortran component. In the example, astring is a variable of Fortran type character*(40), and so we store it as an array of 40 chars.

Closing the file

The file is closed with a call to pfxclose(file1). After writing time-history data, we would normally either flush the file or close the file and use pfopen to open it with an "a" (append) argument the next time in order to add data to it. In this way, the file is in a completely usable state even if the physics crashes.

Namespace considerations

PDB has a facility for dividing variables into "directories" inside the file, similar to a typical file system. Our example does not use it. If you have an x you wish to write out in two different places in your program, you need to choose nonconflicting names, either by using the directory structure or by using a naming convention. The one we use in Basis is to append an @foo so that x@foo means the x that came from the foo part of the physics.

Postprocessing

Having written our file Data1, we have several ways of postprocessing it.
First, we can write our own PDB library calls to read the data from the file into the same or a different program. Or, we can use one of the existing tools described below: PACT utilities, Basis, and Yorick.

In the example code, subroutine rtest reads and prints some data from the file along with the original values for verification. The routine pfxivar calls the routine pfivar, which is one of the basic calls for inquiring about the data in the file. Similar calls exist, and are equally easy to use, to determine the names of the variables in the file, their types, and so on. Also available is a routine similar to the pfread routine for reading only part of an object back into memory.

Existing PDB utilities

All of the utilities we describe below contain many more facilities than we have room to explain. They each contain full-fledged programming languages with facilities for manipulating values, creating variables, creating functions, reading and writing ASCII, executing programs conditionally, and so on. Basis and Yorick are array languages similar in concept to Matlab or IDL (Basis was begun in 1984, inspired in part by the original academic version of Matlab, which later became a commercial product). PACT utilities are based on Scheme, a member of the Lisp family. PDBVIEW and Yorick can read netCDF files and other formats.

PACT tools

PDB is part of the PACT project. PACT is a rich set of utilities and libraries for the development of numerical applications in C and Fortran. PACT is widely portable. PACT was created by Stewart Brown and his group at Lawrence Livermore National Laboratory (LLNL) and is used in applications at LLNL. Naturally, PACT utilities can read any file written with PDB, and they will have at least the basic ability to display user-defined data types as C-structs. A set of mapping facilities can be used to add certain kinds of semantic data to the file, which can be used by PACT utilities.
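The idea of reading only part of an object, mentioned under "Postprocessing," can be pictured with the same kind of (lowest, highest, stride) triple that the article describes for writing. The Python sketch below is an illustration under stated assumptions and not the PDB API: select and count are hypothetical helper names, and it is assumed here that a partial read is described by such a triple per dimension.

```python
def count(low, high, stride=1):
    """Number of elements selected by an inclusive (low, high, stride) triple."""
    return (high - low) // stride + 1


def select(data, lower_bound, low, high, stride=1):
    """Return the elements with Fortran-style indices low..high (inclusive),
    stepping by stride, from a 1-D array whose first index is lower_bound."""
    if stride < 1:
        raise ValueError("stride must be positive")
    if low < lower_bound or high > lower_bound + len(data) - 1:
        raise IndexError("requested range lies outside the stored bounds")
    start = low - lower_bound          # convert to a zero-based offset
    stop = high - lower_bound + 1      # inclusive upper index
    return data[start:stop:stride]


n = 3
# Hypothetical stand-in for the Fortran array x(-n:n).
x = [10 * i for i in range(-n, n + 1)]

print(select(x, -n, -n, n))        # the whole array
print(select(x, -n, 0, n))         # only indices 0..n
print(select(x, -n, -n, n, 2))     # every second element
```

Keeping the stored lower bound with the data is what lets a reader ask for "indices 0 through n" rather than having to know the zero-based offsets.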
Here is the PDBVIEW input for doing some postprocessing:

    ps-flag on
    cgm-flag off
    ; open data1
    cf Data1
    ; plot y and z against x
    plot y x
    lncolor red 1
    plot z x
    lncolor blue 2
    ; make a hardcopy
    hc color
    ; delete the plots from the frame
    dl 1 2
    ; plot the time histories
    plot yt time
    hc
    dl 1
    ; make a shaded surface plot
    ; set up the mapping with the data
    def a (lr 'zz '(xx yy))
    cf nil
    ; set the view angle
    va 60 10 0
    ; set the rendering
    vr shaded
    ; draw it
    pl a
    hc
    dl 1
    ; integrate it
    def b (integrate a)
    pl b
    hc
    ; create a new file
    cf data2 w
    ; copy data from the original file
    cf Data1
    copy data2 x y

Basis

Basis, a development system written by Paul Dubois and his group at LLNL, is used in our Lasnex laser-target-design code and in many other programs. The following Basis input opens the file and plots y and z versus x, with labels that include the value of the variable astring, which we stored in the file. Then we make a filled contour map of zz considered as a function of xx and yy. Finally, we make a new PDB file datax containing a vector of the norms of the elements of the complex array w, and the x and y arrays.

    ps on color
    ezcshow=false
    open Data1
    # plot y and z against x
    titles astring
    plot y x color=green
    plot z color=blue
    nf
    # plot the time histories
    titles "Time History of Y", "Time"
    plot yt time scale=linlog
    nf
    # make a color-filled contour plot
    titles "Contours of ZZ", "XX", "YY"
    plotz zz xx yy color=fill
    nf
    # write out a new file
    real wnorm=sqrt(w*conjg(w))
    create datax
    write wnorm, x, y

Yorick

Yorick is a utility developed by Dave Munro of LLNL. Laser-fusion-target designers use Yorick to postprocess the files written by Lasnex. Yorick is particularly fast and is adept at adapting to new file formats. It can be used with netCDF files. Here is the Yorick code to accomplish something similar to the Basis example:

    hcp_file, "post.ps", dump=1
    palette, "earth.gp"
    f = openb("Data1")
    /* plot y and z against x */
    plg, f.y, f.x, color="green"
    plg, f.z, f.x, color="blue"
    /* astring is not Yorick's "native" string type */
    pltitle, string(&f.astring)
    /* make hardcopy, then frame advance */
    hcp; fma
    /* plot the time history */
    plg, collect(f, "yt"), collect(f, "time")
    logxy, 0, 1
    hcp; fma
    /* make a contour plot on top of filled mesh */
    plf, zncen(f.zz), f.yy, f.xx
    plc, f.zz, f.yy, f.xx, color="red"
    logxy, 0, 0
    hcp
    /* write out a new file */
    /* w is not Yorick's
     * "native" complex type */
    wnorm = abs(f.w.r + 1i*f.w.i)
    x = f.x
    y = f.y
    save, createb("datax"), wnorm, x, y

Feedback

I am delighted to announce that my program-development system, Basis, has recently been given unrestricted release. This means you can now obtain it, as well as the other software discussed in this article, via anonymous FTP by following the directions below.

Sample program

The sample program is available by anonymous FTP from ftp-icf.llnl.gov:/pub/cip/pfx.f. The file Makefile.pfx in the same directory shows how the program is compiled and loaded on a Sun under SunOS.

PACT

PACT is freely redistributable software and is portable. It is available by anonymous FTP from phoenix.ocf.llnl.gov. In the pub area, look for the latest-dated file whose name begins with the word "pact." As of this writing, this file was pact11-16-94.tar.Z. Also in that directory are pact.FAQ, pact.README, and the documentation with a name like pact10_28_93doc.tar.Z.

Yorick

Yorick is freely redistributable software and is portable. The best place to get Yorick is wuarchive.wustl.edu:/languages/yorick, which mirrors the home site ftp-icf.llnl.gov:/pub/Yorick. The above two sites have the Macintosh version as well as the Unix version. The Unix version is also available at sunsite.unc.edu:/pub/languages/yorick and netlib.att.com:/netlib/env and posted to comp.sources.misc (volume 46, v46i071 through v46i138).
In all these cases, the file is yorick-1.0.tar.gz. You need the GNU gzip (from prep.ai.mit.edu:/pub/gnu) and Unix tar utilities to unpack Yorick; the line to do this is

    gzip -dc yorick-1.0.tar.gz | tar xvf -

This will create a directory called yorick-1.0. To build Yorick, cd into this top-level directory and read the text file called README.

Basis

Basis is freely redistributable software and is available via anonymous FTP at ftp-icf.llnl.gov:/pub/basis. See the README file in that directory for instructions. Basis is available on the Cray YMP and C90, the Meiko, and workstations from Sun, HP, IBM, and SGI. Ports are under way to the new DEC and SGI 64-bit workstations. The graphics package requires the NCAR graphics library from the National Center for Atmospheric Research. Installation and use require Perl. Basis is installed on the YMP and C90 computers at the National Energy Research Supercomputer Center.

Paul F. Dubois

References

1. S. A. Brown et al., "Software for Portable Scientific Data Management," Comput. Phys. 7 (3), 304 (1993).
2. Paul F. Dubois, "Making Applications Programmable," Comput. Phys. 8 (1), 70 (1994).