This article was kindly scanned, transformed into text, and spellchecked by
George Perkins and Jeannette Dubendorf of MSU, with a little postprocessing by
Jim Linnemann.

COMPUTERS IN PHYSICS, VOL 9, NO. 2, MAR/APR 1995	p 175

                        SCIENTIFIC PROGRAMMING
                     CREATING AND USING PDB FILES
            Stewart A. Brown, Paul F. Dubois, and David H. Munro


Stewart A. Brown is a physicist in the Defense Sciences Program at Lawrence
Livermore National Laboratory.

Paul F. Dubois is a mathematician in the Inertial Confinement Fusion Program at
Lawrence Livermore National Laboratory.

David H. Munro is a physicist in the Inertial Confinement Fusion Program at
Lawrence Livermore National Laboratory.

Department Editor: Paul F. Dubois 
duboisl@llnl.gov.

    For some time I have been puzzled about how best to discuss certain topics
in this department. Obviously, it is nice to be able to talk about programs
in Scientific Programming, but it is in fact somewhat hard to do. Any real
program, even a real toy program, would occupy page after page in the magazine,
which could otherwise be filled with more interesting information (or
advertisements!). Up to now, I have tried to reduce things to small excerpts
only. However, this has its limitations, and so this month I am trying an
experiment. This article refers to a sample program, which you should obtain or
view via the Internet for maximum reading pleasure. Instructions for obtaining
it are in "Feedback" at the end of this article. Your comments on this approach
would be appreciated. You can reach me at duboisl@llnl.gov.

                        Paul F. Dubois
                       Department Editor


    There are many ways to write scientific data into files for later
processing. Recent years have seen a movement away from application-specific
formats towards symbolic, self-describing portable files.1 The following
advantages offered by this method are compelling:

- Access is keyed by symbolic names and need not be in the order the information
    was written. Unnecessary data do not have to be read.
- The files are portable over a wide range of computers.
- The files are robust with respect to changes in the programs that read and
    write them.
- The technically difficult part is encapsulated in the library that accesses the
    files and need not be repeated in each application.

    This approach has not yet been standardized, despite the fact that the
leading contenders have much the same goals and general philosophy. Most
libraries have the ability to store simple basic data types only. There is a
lack of standards for indicating the semantics of the data. For example:

    If we wish to write an array that is dimensioned, say, (-10:10), how do
we convey that fact, via the file, to programs that may wish to read and
present the data if the database does not usually store such information?

    How do we store data of complex type if this is not a type native to the
database? For the user of languages that allow pointers and structures, there is
the additional question of pointer-following (picking up all the data in the
structure and the data it points to, etc.) and morphology preservation (making
sure that if objects A and B both point to C as part of themselves, they still
share C after retrieval from the file).
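
    As an illustration only of what pointer-following and morphology
preservation mean, here is a small sketch in Python. It uses Python's standard
pickle module, not PDB, and the objects a, b, and c are invented for the
example. Two objects that share a third are stored and read back, and we check
that the sharing survives the round trip:

```python
import pickle

# Hypothetical objects: A and B both refer to the same C.
c = {"values": [1.0, 2.0, 3.0]}
a = {"name": "A", "part": c}
b = {"name": "B", "part": c}

# Store both objects in one stream, then read them back.
# pickle follows the references (pointer-following) and records C only once.
stored = pickle.dumps((a, b))
a2, b2 = pickle.loads(stored)

# Morphology is preserved: the retrieved A and B still share a single C.
shared_after = a2["part"] is b2["part"]

# The retrieved C is a copy of the original, not the original object itself.
is_copy = a2["part"] is not c
```

Had a and b been stored in two separate streams, each would have come back
with its own private copy of c, which is exactly the loss of morphology the
question above is about.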

    In the Eiffel language, a facility for storing persistent objects solves
both these problems. Indeed, it is possible to transmit objects between client
and server. In C++, several methods of solution are available, including the
straightforward approach of the Rogue Wave library and the more exotic OODBM
schemes such as Object Design's ObjectStore, in which only minimal
distinctions are made between the variables in the program and the persistent
data. In this article, we will show you how to use the PDB library from Fortran.
This library is freely redistributable, has some technical advantages, and is
easy to use from Fortran. Once we have shown you how to write data files, we
will discuss three of the postprocessing tools available for examining these
files as well as how to use them.

    Instructions for obtaining and installing the software mentioned in this
article are in a separate box at the end of this article. All the software is
available via anonymous FTP. Two of the tools discussed can also process files
written in the popular netCDF format.

Part of PACT

    The PDB library is part of a larger development package named PACT. Two
other development/postprocessing systems are presented, Basis and Yorick. This
article contains brief examples of using these tools. Because all these tools
can both read and write PDB and ASCII data, and can read or be modified to read
a variety of other formats, they are worth knowing about in their own right.
Some of the philosophy behind such programmable-application development
packages was presented previously.2 We will also cover how to access the data if
you are writing your own postprocessing tool or a data filter to a commercial
package. Those programming in C should know that when an application is fully
integrated into PACT, an arbitrary C-struct can be stored with
pointer-following done by the library.

Writing a file

    "Snapshot" is Livermorese for "a set of data written out at a given time" in
a time-evolving simulation. A "history" file, on the other hand, is a file or
family of files to which new values are periodically added, so that they
contain the history of the value of a quantity sampled at discrete time points.

    The sample program is, for simplicity, a combination of both types,
containing a single image of several different scalar and array quantities (a
snapshot) plus the time histories of two variables, time and y, stored under the
names time and yt in the file.

    This is obvious but worth saying: the file can contain the data under a
different name, or even a different shape, from that in the program. In our
example, we will create a "complex" type, because that is not native to PDB,
and store a complex Fortran array of dimension m in it. However, we could store
it as a real array of dimension 2m or as a real array of dimension (2,m).
We also store the variable z in the file with a shape that is different from
the shape that it had in the program. Creating the complex type allows us to
deliver the semantics of this array to the postprocessing utility, which may
understand this particular type and be able to deal with it appropriately.
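
    The equivalence of these layouts can be sketched in a few lines of Python.
This is an illustration only: it uses the standard struct module rather than
PDB, the sample values are invented, and the choice of 4-byte little-endian
floats is an assumption made for the example.

```python
import struct

# Hypothetical complex data of dimension m = 3.
w = [complex(1.0, 2.0), complex(3.0, -4.0), complex(0.0, 0.5)]
m = len(w)

# Layout 1: an array of m two-member records (r, i), in the spirit of a
# user-defined complex type.
as_struct = [(z.real, z.imag) for z in w]

# Layout 2: a flat real array of dimension 2m: r1, i1, r2, i2, ...
# A Fortran real array dimensioned (2,m) has the same memory order, because
# the first index varies fastest.
as_flat = [part for z in w for part in (z.real, z.imag)]

# Both layouts occupy the same bytes; only the description differs.
# The format "<6f" packs 2m = 6 little-endian 4-byte floats.
fmt = "<%df" % (2 * m)
raw = struct.pack(fmt, *as_flat)

# A reader that knows the (r, i) semantics can rebuild the complex values.
unpacked = struct.unpack(fmt, raw)
rebuilt = [complex(unpacked[2 * k], unpacked[2 * k + 1]) for k in range(m)]
```

The bytes are identical in every case; what the named type adds is the
knowledge that consecutive pairs belong together as real and imaginary parts.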

Creating the file

    In the sample program, we first initialize the data to be stored. Then we
create the file, define the complex type in it, write the snapshot data, and
make a series of "time history" calls. Finally we close the file.

    The file is created by the PDB library function pfopen. In the sample
program, we have encapsulated all the PDB calls (those with names beginning
with the letters pf) with routines that call the PDB routines with an
easier-to-use set of arguments. These encapsulation routines also check the
error return and in case of error retrieve the library error message. The names
of the encapsulation routines begin with the letters pfx. Utility routines
for handling strings and error messages are included. Most notable is strsize,
which calculates the length of a character variable without trailing blanks.
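
    What a routine like strsize computes can be shown with a short Python
sketch. This is not the Fortran routine from the sample program, only an
assumed equivalent written for illustration:

```python
def strsize(text):
    """Return the length of text, not counting trailing blanks."""
    return len(text.rstrip(" "))


# A Fortran character variable is blank-padded to its declared length,
# so a 5-character name held in a 40-character variable looks like this:
padded = "Data1".ljust(40)
used = strsize(padded)
```

Only trailing blanks are dropped; leading and embedded blanks still count
toward the length.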

    Note that the call to pfxopen(filename, "w") is similar to the standard C
call for creating a file. The user-defined data type fcomplex is created in the
routine pfxwdef, which sets up a small array to contain the definition of the
type as a C-struct. In this case, we tell PDB that each element of an fcomplex
type array is to contain two floating-point members called r and i, representing
the real and imaginary parts.

Writing data

    To write data to the PDB file, you must specify the file identifier returned
by pfopen, the name under which you wish to write the data, a PDB type name
corresponding to the type of the data (for the most part, this is the C type),
the actual data, the dimensionality of the data ndim, and an array ind(3,ndim)
of integers giving, for each dimension, the lowest and highest indices of the
data to be stored and an optional stride. For example, to write the scalar
integer n, we have a call to pfxwrtd with arguments

    (filel,"n","long",n,0,0)

while for the real array x(-n:n) we have

    ndim = 1
    ind(1,1) = -n
    ind(2,1) = n
    ind(3,1) = 1

and the call to pfxwrtd has arguments

    (filel,"x","float",x,ndim,ind)
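
    To see how such a description determines the amount of data written, here
is a small sketch in Python. It is not part of PDB; the function element_count
and the list-of-triples representation of ind are assumptions made for the
example, with one (lowest, highest, stride) triple per dimension.

```python
def element_count(ind):
    """Number of elements described by (lowest, highest, stride) triples."""
    total = 1
    for low, high, stride in ind:
        total *= (high - low) // stride + 1
    return total


n = 10

# The array x(-n:n): one dimension running from -n to n with stride 1.
ind_x = [(-n, n, 1)]
count_x = element_count(ind_x)

# A scalar has ndim = 0, so there are no triples and exactly one element.
count_scalar = element_count([])

# Every second element of the same range.
count_strided = element_count([(-n, n, 2)])
```

With n = 10 the full range holds 21 elements, which is why the bounds, and not
just a length, must be carried into the file if a reader is to recover the
(-n:n) indexing.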

    For Fortran character variables, the best thing to do is to pretend these
are arrays of characters (C type "char"), so that the value we store in the
file appears to have an "extra" first dimension representing the length of each
Fortran component. In the example, astring is a variable of Fortran type
character*(40), and so we store it as an array of 40 chars.
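
    The "extra" first dimension can be pictured with a short Python sketch.
The helper names and sample strings are hypothetical and nothing here calls
PDB; the point is only the shape of the stored data.

```python
WIDTH = 40  # matches the declared length of the character variable in the text


def to_char_array(value, width=WIDTH):
    """Pad (or truncate) value with blanks and return a list of single chars."""
    return list(value[:width].ljust(width))


def from_char_array(chars):
    """Join the chars and drop the trailing blank padding."""
    return "".join(chars).rstrip(" ")


# A scalar character variable of length 40 becomes a 1-D array of 40 chars.
astring = to_char_array("Sample title")

# A hypothetical array of 3 such strings gains an "extra" first dimension:
# in Fortran order its shape would be (40, 3).
names = [to_char_array(s) for s in ("x", "y", "yt")]
shape = (WIDTH, len(names))
```

A reader that knows the convention can strip the padding again, as
from_char_array does here.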

Closing the file

    The file is closed with a call to pfxclose(filel). After writing
time-history data, we would normally either flush the file or close the file
and use pfopen to open it with an a (append) argument the next time in order to
add data to it. In this way, the file is in a completely usable state even if
the physics crashes.
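
    The close-and-reopen-for-append pattern is not specific to PDB. The sketch
below shows it in Python with an ordinary text file standing in for the PDB
file; the file name, the record format, and the helper functions are all
assumptions made for the illustration.

```python
import os
import tempfile


def append_record(path, time, y):
    """Open in append mode, write one record, and close again, so that the
    file on disk is complete after every step."""
    with open(path, "a") as f:
        f.write("%r %r\n" % (time, y))


def read_history(path):
    """Read back all (time, y) records written so far."""
    with open(path) as f:
        return [tuple(float(v) for v in line.split()) for line in f]


path = os.path.join(tempfile.mkdtemp(), "history.txt")

# Three time steps of a made-up simulation.
for step in range(3):
    append_record(path, 0.5 * step, float(step * step))

history = read_history(path)
```

If the loop were interrupted after any step, every record written before that
point would already be safely in the file.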

Namespace considerations

    PDB has a facility for dividing variables into "directories" inside the
file, similar to a typical file system. Our example does not use it. If you have
an x you wish to write out in two different places in your program, you need to
choose nonconflicting names, either by using the directory structure or by
using a naming convention. The one we use in Basis is to append an @foo so that
x@foo means the x that came from the foo part of the physics.
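
    A naming convention of this kind is easy to apply mechanically. The Python
sketch below is an illustration only; the helper names and the package names
are hypothetical, and it does not show how Basis itself implements the
convention.

```python
def qualified_name(name, package):
    """Build the name under which a variable is written, e.g. x@foo."""
    return "%s@%s" % (name, package)


def split_name(qualified):
    """Recover (name, package); package is empty if there is no @ suffix."""
    name, _, package = qualified.partition("@")
    return name, package


# The same variable name written from two hypothetical parts of a program.
written = [qualified_name("x", "foo"), qualified_name("x", "bar")]

# The names no longer collide, and the origin can be recovered on reading.
origins = [split_name(q) for q in written]
```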

Postprocessing

    Having written our file Data1, we have several ways of postprocessing it.
First, we can write our own PDB library calls to read the data from the file
into the same or a different program. Or, we can use one of the existing tools
described below: PACT utilities, Basis, and Yorick. In the example code,
subroutine rtest reads and prints some data from the file along with the
original values for verification. The routine pfxivar calls the routine pfivar,
which is one of the basic calls for inquiring about the data in the file.
Similar calls exist, and are equally easy to use, to determine the names of the
variables in the file, their types, and so on. Also available is a routine
similar to the pfread routine for reading only part of an object back into
memory.

Existing PDB utilities

    All of the utilities we describe below contain many more facilities than we
have room to explain. They each contain full-fledged programming languages with
facilities for manipulating values, creating variables, creating functions,
reading and writing ASCII, executing programs conditionally, and so on. Basis
and Yorick are array languages similar in concept to Matlab or IDL (Basis was
begun in 1984, inspired in part by the original academic version of Matlab,
which later became a commercial product). PACT utilities are based on Scheme, a
member of the Lisp family. PDBVIEW and Yorick can read netCDF files and other
formats.

PACT tools

    PDB is part of the PACT project. PACT is a rich set of utilities and
libraries for the development of numerical applications in C and Fortran. PACT
is widely portable. PACT was created by Stewart Brown and his group at Lawrence
Livermore National Laboratory (LLNL) and is used in applications at LLNL.

    Naturally, PACT utilities can read any file written with PDB, and they will
have at least the basic ability to display user-defined data types as
C-structs. A set of mapping facilities can be used to add certain kinds of
semantic data to the file, which can be used by PACT utilities. Here is the
PDBVIEW input for doing some postprocessing:

ps-flag on
cgm-flag off

; open data1
cf Data1

; plot y and z against x 
plot y x lncolor red 1 
plot z x lncolor blue 2

; make a hardcopy 
hc color 
; delete the plots from the frame 
dl 1 2

; plot the time histories 
plot yt time 
hc 
dl 1

; make a shaded surface plot 
; set up the mapping with the data 
def a (lr 'zz '(xx yy))
cf nil

;set the view angle
va 60 10 0

; set the rendering 
vr shaded

;draw it
pl a
hc
dl 1

; integrate it 
def b (integrate a) 
pl b
hc

;create a new file
cf data2 w

; copy data from the original file 
cf Data1
copy data2 x y

Basis

    Basis, a development system written by Paul Dubois and his group at LLNL, is
used in our Lasnex laser-target-design code and in many other programs.

    Following is the Basis input for opening the file, making a plot of y and z
versus x with appropriate labels including the value of the variable astring,
which we stored in the file. Then we make a filled contour map of zz considered
as a function of xx and yy. Finally, we make a new PDB file datax containing a
vector of the norms of the elements of the complex array w, and the x and y
arrays.

ps on color 
ezcshow=false

open Data1 
# plot y and z against x 
titles astring 
plot y x color=green 
plot z color=blue 
nf

# plot the time histories 
titles "Time History of Y", "Time" 
plot yt time 
scale=linlog 
nf

# make a color-filled contour plot
titles "Contours of ZZ", "XX", "YY"
plotz zz xx yy 
color=fill 
nf

# write out a new file 
real wnorm = sqrt(w*conjg(w))
create datax 
write wnorm, x, y

Yorick

    Yorick is a utility developed by Dave Munro of LLNL. Laser-fusion-target
designers use Yorick to postprocess the files written by Lasnex. Yorick is
particularly fast and is adept at adapting to new file formats. It can be used
with netCDF files. Here is the Yorick code to accomplish something similar to
the Basis example:

hcp_file, "post.ps", dump=1
palette, "earth.gp"

f = openb("Data1")

/* plot y and z against x */
plg, f.y, f.x, color="green"
plg, f.z, f.x, color="blue"
/* astring is not Yorick's "native" string type */
pltitle, string(&f.astring)
/* make hardcopy, then frame advance */
hcp; fma
/* plot the time history */
plg, collect(f, "yt"),
    collect(f, "time")
logxy, 0, 1
hcp; fma
/* make a contour plot on top of filled mesh */
plf, zncen(f.zz), f.yy, f.xx
plc, f.zz, f.yy, f.xx, color="red"
logxy, 0, 0
hcp
/* write out a new file */
/* w is not Yorick's
 * "native" complex type */
wnorm = abs(f.w.r + 1i*f.w.i)
x = f.x
y = f.y
save, createb("datax"), wnorm, x, y


Feedback

    I am delighted to announce that my program-development system, Basis, has
recently been given unrestricted release. This means you can now obtain it,
as well as the other software discussed in this article, via anonymous FTP by
following the directions below.

Sample program

The sample program is available by anonymous FTP from
ftp-icf.llnl.gov:/pub/cip/pfx.f. The file Makefile.pfx in the same directory
shows how the program is compiled and loaded on a Sun under SunOS.

PACT

PACT is freely redistributable software and is portable. It is available by
anonymous FTP from phoenix.ocf.llnl.gov. In area pub look for the
latest-dated file whose name begins with the word "pact." As of this writing,
this file was pact11-16-94.tar.Z. Also in that directory are pact.FAQ,
pact.README, and the documentation with a name like pact10_28_93doc.tar.Z.

Yorick

Yorick is freely redistributable software and is portable. The best place to get
Yorick is wuarchive.wustl.edu:/languages/yorick, which mirrors the home site
ftp-icf.llnl.gov:/pub/Yorick. The above two sites have the Macintosh
version as well as the Unix version. The Unix version is also available
at sunsite.unc.edu:/pub/languages/yorick and netlib.att.com:/netlib/env and
was posted to comp.sources.misc (volume 46, v46i071 through v46i138).
In all these cases, the file is yorick-1.0.tar.gz. You need the gnu gzip
(from prep.ai.mit.edu:/pub/gnu) and Unix tar utilities to unpack Yorick;
the line to do this is

    gzip -dc yorick-1.0.tar.gz | tar xvf -

This will create a directory called yorick-1.0. To build Yorick, cd into this
top-level directory and read the text file called README.

Basis

Basis is freely redistributable software and is available via anonymous FTP at
ftp-icf.llnl.gov:/pub/basis. See the README file in that directory for
instructions. Basis is available on the Cray YMP and C90, the Meiko, and
workstations from Sun, HP, IBM, and SGI. Ports are under way to the new DEC and
SGI 64-bit workstations. The graphics package requires the NCAR graphics
library from the National Center for Atmospheric Research. Installation and
use require Perl.

Basis is installed on the YMP and C90 computers at the National Energy Research 
Supercomputer Center.

Paul F. Dubois

References

1. S. A. Brown et al., "Software for Portable Scientific Data
Management," Comput. Phys. 7 (3), 304 (1993).
2. Paul F. Dubois, "Making Applications Programmable,"
Comput. Phys. 8 (1), 70 (1994).