Standard Reference Datasets



Project Proposal

Tools for Evaluating Mathematical and Statistical Software

Ronald F. Boisvert
Eric S. Lagergren
William F. Guthrie

NIST Information Technology Laboratory


Summary

This is a joint project of the NIST Applied and Computational Mathematics Division and Statistical Engineering Division. Its purpose is to improve the reliability of mathematical and statistical software products by providing reference data and computational results that enable objective evaluation of algorithms and software by developers and users.

Technical Strategy

Effective methodologies for testing and evaluating mathematical and statistical algorithms are essential to the development of robust, reliable software. Unfortunately, this area has received little attention from the research community, leaving developers and users without tools to aid in product validation. One universally applied method for testing numerical software is to exercise it on a battery of representative problems. Typically such problems are generated randomly, which ensures that a large number of test cases can be applied easily. Unfortunately, this is rarely sufficient for serious numerical software testing. Errors, or more commonly numerical difficulties, tend to occur for highly structured problems or for those near the boundaries of applicability of the underlying algorithm. Random problem generation rarely samples these parts of the domain, so testing must also be done on problems that exhibit such special behaviors. These problems are quite difficult to produce, and as a result the generation and use of structured problem sets has occurred only on an ad hoc basis. Such data sets nevertheless have a wide variety of uses.
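
The point about random versus structured problems can be illustrated with a small sketch (not part of the project's test corpora; the matrix size, RNG seed, and use of the Hilbert matrix are illustrative choices). A randomly generated matrix is almost always well conditioned, so a solver exercised only on random inputs never faces real numerical difficulty, while a classic structured problem such as the Hilbert matrix is severely ill conditioned even at modest size.

```python
# Illustrative sketch: why random test problems rarely expose numerical
# difficulty, while structured problems do. Sizes and seed are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
n = 8

# Typical randomly generated test matrix: almost always well conditioned.
random_cond = np.linalg.cond(rng.standard_normal((n, n)))

# Structured problem: the n-by-n Hilbert matrix, H[i, j] = 1 / (i + j + 1),
# a well-known ill-conditioned case.
i, j = np.indices((n, n))
hilbert_cond = np.linalg.cond(1.0 / (i + j + 1))

print(f"random  {n}x{n} condition number: {random_cond:.3e}")
print(f"Hilbert {n}x{n} condition number: {hilbert_cond:.3e}")
```

At n = 8 the Hilbert matrix has a condition number on the order of 10^10, many orders of magnitude beyond what random generation typically produces, which is why curated structured problem sets are needed alongside random batteries.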

Unfortunately, these collections are often lost when the underlying technology is adopted by the commercial sector, leaving software developers and users without the tools to judge the capability of their products. A goal of this project is to identify, preserve, and make available such test corpora for use by numerical software researchers, developers and users.

Expectations

The best approach to developing and disseminating test data for mathematical and statistical software depends on the particular problem domain and target audience. As a result, we will select focus areas that characterize different domains in order to learn more about their differing requirements. Within these areas we intend to develop expertise, perform relevant research, and provide highly visible services via modern communications media such as the World Wide Web. The following focus areas have been selected.

Collaborations

This project parallels other NIST programs in many respects. Like the calibration and Standard Reference Materials programs, it will improve measurement assurance by ensuring software reliability. Batteries of tests for software also provide a baseline for product comparison, as standard physical tests do in other domains. NIST is uniquely positioned for such tasks because of its reputation for technical expertise and objectivity.

NIST is well-known as a producer of mathematical and statistical reference data (e.g., AMS 55). For example, Wolfram Research used tabulated data from AMS 55 to verify the special functions in Mathematica. Further indication that this is an important emerging area comes from IFIP WG 2.5 (International Federation for Information Processing, Working Group 2.5, Numerical Software), which has identified The Quality of Numerical Software: Assessment and Enhancement as the theme for its next working conference; ACMD is an invited participant. SED has been asked directly by industry (e.g., DuPont) to take leadership in providing such tools in the area of statistical software.
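
The kind of validation described above, checking a library's special functions against independently tabulated reference values, can be sketched as follows. This is a hedged illustration, not the procedure Wolfram Research used: a handful of Gamma-function values with exact closed forms stand in for a reference table such as AMS 55, and the tolerance is an arbitrary illustrative choice rather than a certified acceptance criterion.

```python
# Illustrative sketch of validating a special-function implementation
# against a table of independently known reference values.
import math

# Reference values: Gamma(x) at points with exact closed forms,
# standing in for a published table such as AMS 55.
reference = {
    0.5: math.sqrt(math.pi),        # Gamma(1/2) = sqrt(pi)
    1.0: 1.0,                       # Gamma(1)   = 0! = 1
    1.5: math.sqrt(math.pi) / 2.0,  # Gamma(3/2) = sqrt(pi)/2
    5.0: 24.0,                      # Gamma(5)   = 4! = 24
}

def worst_relative_error():
    """Worst relative error of math.gamma over the reference table."""
    return max(
        abs(math.gamma(x) - ref) / abs(ref)
        for x, ref in reference.items()
    )

worst_error = worst_relative_error()
print(f"worst relative error vs. reference table: {worst_error:.2e}")
```

A real reference data set would cover many more points, including arguments near the boundaries of the algorithm's applicability, and would document the provenance and accuracy of each tabulated value.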

ACMD and SED also have a long tradition in the development of mathematical and statistical algorithms and software. As a result, the issues involved in producing numerical software are well known to us. Other research institutions with interests in numerical software testing include Boeing Computer Services, George Mason University, Los Alamos National Lab, Oak Ridge National Lab, and Rutherford Appleton Lab. We have had positive feedback from each of these.

Participants

Participants will include: R.F. Boisvert, J. Filliben, L. Gill, W. Guthrie, E. Lagergren, D. Lozier, H.-K. Liu, R. Pozo, K. Remington, J. Rogers, M. Vangel, N.-F. Zhang.
