QUALITY OF WATER BRANCH TECHNICAL MEMORANDUM NO. 76.20

June 24, 1976

Subject: WATER QUALITY: Technical Information--Use and verification of water-quality models

In the course of recent discussions with the Florida District and the Southeastern Region on the above subject, the Quality of Water Branch asked several of the Water Resources Division's experts in hydrologic modeling to provide their views on certain aspects of the use of water-quality models. The quality and applicability of two of these replies merit their wider circulation. Accordingly, the replies of R. A. Baltzer and J. P. Bennett are attached for the information of WRD field offices, along with a copy of the thoughtful memorandum that triggered the discussions. I hope that in the near future, Bob and Jim can put their useful summaries in publishable form so they can be circulated throughout the hydrologic modeling fraternity.

Please see that the attachments get appropriate circulation to WRD personnel.

R. J. Pickering

Attachments

WRD Distribution: A, B, FO, PO

To: R. J. Pickering, WRD, Reston, Va. (MS 412)
Date: January 29, 1976
Thru: T. J. Buchanan, WRD, Miami, Fla.
From: Robert A. Miller, WRD, Miami, Fla.
Subject: WATER QUALITY MODELING: Verification of the calibration of a parametric simulation model

The water-quality review of the Miami office was held during the first week of January. While reviewing my work in calibrating a water-quality parametric simulation model, Phil Greeson brought up the subject of verification. My present concern is this: in a water-quality modeling effort, what degree of validation is needed before a model can be used for predictive purposes?

Validation or verification of a model can mean several different things. First of all, if one is building a model from the ground up, then he is most interested in validating the model itself--the concepts involved, the mathematical expressions for those concepts, and, very important, the logic of the programming. This type of validation does not concern my present work, since the model I am utilizing is an EPA stock model, namely QUAL-II, developed by Water Resources Engineers of California.

One is also interested in knowing just how well the parameter values determined for the model from field data collected at some particular point in time agree with the parameter values for data collected at some different point in time. The general approach is that if the set of parameter values for the first calibration (or fitting) is "the same" as the set for the second calibration, then the model (or, more precisely, the calibration) "is adequate." But this only raises further questions, namely:

1. In calibrating the model for time period one, should some criterion (a numerical measurement) be satisfied before the model is considered calibrated, or is it sufficient to think in somewhat nebulous terms of a "best fit"?

2. If we calibrate for time period one and then recalibrate for time period two (calling the second calibration a verification or validation), how do we decide whether the first calibration is actually verified? This is especially difficult if no criterion for "goodness of calibration" (question 1) is chosen.

3. Past work has suggested, though not proven, I believe, that the parameter values in a parametric simulation water-quality model may vary as a function of time on an annual cycle.
Therefore, if time period one and time period two differ by, say, a quarter of a year, the second calibration may not verify the first--through no fault of the calibration work, but because of a change of state in the real world. Does this "lack of verification" disqualify the model from being used for predictive work?

The above questions involve modeling policy and procedure, which in turn involve project management and design. If policy requires two calibrations, then at best data-collection costs will be doubled, and computer costs, manpower, and time of study will be increased by probably 25 percent. At worst, verification may not be accomplished and use of the model for predictive purposes thereby prohibited. We are interested in these problems not only because of our present work but also because the adoption of a rational approach toward modeling makes for better project management.

R. A. Miller

cc: L. B. Laird, Atlanta
    C. S. Conover, Talla.
    R. A. Kreiger, Atlanta
    M. E. Jennings, Bay St. Louis
    P. E. Greeson, Reston
    Irwin, Talla.

March 17, 1976

Memorandum

To: Chief, Quality of Water Branch, WRD
From: James P. Bennett, WRD-NR
Subject: PROGRAMS AND PLANS: District Activities--Inquiry from the Florida District on need to verify hydrologic models

Re your March 12, 1976, memo, same subject.

In general, every mathematical modeling project should have six components. I can best describe these using a portion of some notes from a recent Training Course; see the enclosure. Each of the six components requires about the same amount of attention if a viable mathematical modeling program is to be maintained.

Let us presume that Mr. Miller is correct in assuming that by using QUAL-II he can short-circuit step 1 (a reasonable assumption). Further assume that he knows how to collect field data adequate for calibration, at least for "some" time period (also probably a good assumption). The numbered points in Mr. Miller's memo indicate a basic lack of understanding of the next two steps, calibration and verification; let us discuss them in order.

1. Given a particular mathematical model and a particular data set, the model is "calibrated" when "best fit" is achieved, that is, when the mean square error function is minimized. How good this best fit is, is determined by the magnitude of the mean square error function at this minimum. Determination of goodness of fit is purely subjective, a management decision, which must be made on the basis of the penalties to be incurred as a result of prediction errors, the resources that can be devoted to the modeling program, and many other factors. Given QUAL-II and its data set, I feel that the best the District can do is report the value of the error function achieved at "best fit."

2. Verification is not recalibration at a different time period. Verification consists of comparing predictions made with parameters derived during calibration to an independent data set. Whether or not the parameters (or model) are accepted depends on whether the prediction mean square error is less than or greater than some pre-selected allowable value. The magnitude of this value again must be determined by a management decision. Here again, the District should simply report the magnitude of the value obtained. The lower limit of this value is, of course, the value obtained during calibration. (Both steps are illustrated in the sketch following point 3.)

3. Mr. Miller is correct: water-quality model parameters will vary markedly over a year, mainly because we do not know enough about the basic processes to be able to call our parameters anything more than "ignorance coefficients." This should not really bother him, however, because water-quality models are generally, and should be (given our lack of knowledge about basic processes), written only for critical periods such as the summer low-flow season. In this case, the parameters should transfer from one critical period to the next. The District should, then, "calibrate" for one low-flow period and "verify" the next. If, for example, a DO model does not "verify" in this manner, some basic process has probably been overlooked.
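An illustrative numerical sketch, in Python, of points 1 and 2. The one-parameter first-order decay model, the coefficient k, and all data values below are hypothetical and are not drawn from QUAL-II or from the Florida District's data; only the procedure--minimize a mean square error function to calibrate, then judge prediction error against an independent data set to verify--follows the description above.

    import numpy as np

    def predict(k, t):
        # Hypothetical one-parameter model: first-order decay of a
        # constituent from an assumed initial concentration of 10 mg/L.
        return 10.0 * np.exp(-k * t)

    def mse(k, t, observed):
        # The mean square error function to be minimized in calibration.
        return float(np.mean((predict(k, t) - observed) ** 2))

    t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])            # travel time, days
    period1 = np.array([10.2, 7.7, 6.1, 4.8, 3.6])     # calibration observations
    period2 = np.array([9.9, 7.9, 6.3, 5.1, 4.0])      # independent observations

    # Calibration: search a range of k for the minimum of the error function.
    ks = np.linspace(0.01, 1.0, 1000)
    errors = np.array([mse(k, t, period1) for k in ks])
    k_best = ks[np.argmin(errors)]
    print("calibrated k:", k_best, " calibration MSE:", errors.min())

    # Verification: apply the calibrated parameter, unchanged, to the
    # independent data set and report the prediction mean square error.
    verification_mse = mse(k_best, t, period2)
    print("verification MSE:", verification_mse)

    # Acceptance is a management decision: compare the verification error
    # with a pre-selected allowable value.
    ALLOWABLE_MSE = 0.25    # hypothetical allowable value
    print("accepted" if verification_mse <= ALLOWABLE_MSE else "rejected")

Note that, as stated in point 2, the verification error can be no smaller than the calibration error, since the parameter was fitted to the first data set.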
Only if the District can demonstrate that both calibration and verification have been successfully accomplished should any reasonable "manager" use the results of the next step, prediction, in the decision process. Note that even with a properly calibrated and verified model, the predictions are only as good as the inputs used to produce them. This is a statement of the GIGO principle (garbage in, garbage out).

Finally, the sixth step, maintenance, is something that the District should do with any model it builds, because this can be the basis for future profitable studies.

James P. Bennett
Hydrologist

cc: Assistant Chief Hydrologist, R & TC
    Regional Hydrologist, NR
    Deputy Assistant Chief Hydrologist, Research

Enclosure

Mathematical Modeling of Unsteady Flow and Sediment Transport
Prepared for Advanced Fluvial Hydraulics and Sediment Transport
by James P. Bennett

COMPONENTS OF A MATHEMATICAL MODELING PROGRAM

Model construction: The formulation of the mathematical equations describing the pertinent physical processes, design of mathematical techniques for the practical solution of these equations, incorporation of these into a computer program, and debugging and testing the program to ensure that the governing equations are solved with acceptable accuracy.

Data collection: Compilation of information descriptive of the physical bounds of the modeled area, along with boundary and initial conditions for running the model, as well as collection of observations of prototype behavior for purposes of defining certain coefficients in the equations and assuring ourselves that our conceptions of the physical processes are reasonably accurate. A rough version of the model can and should be used in designing the data-collection program.

Calibration: The adjustment of model coefficients to obtain "best fit" of model predictions to a set of observations of prototype behavior. The "best fit" criterion should be some mathematically definable function of observations and corresponding predictions, such as a mean square error function. In the process of calibration, this error function is minimized, either automatically or by trial and error, by manipulating the values of the model coefficients. (A sketch of automatic minimization follows this outline.)

Verification: Testing predictions from the model, using coefficients derived in the calibration step, against an independent set of observations to determine if predictions made in the future can reasonably be expected to be acceptably accurate. The criterion for judgment is usually some maximum acceptable value of the error function as defined in the calibration step.
How large "acceptable" may be must be determined on the basis of judgment and depends, among other things, on the costs likely to be incurred as a result of an inaccurate prediction, on how well our mathematical formulation represents the physical processes, and on the resources available to the mathematical modeling program.

Prediction: The use of the model to make projections concerning the future behavior of the prototype. This is the payoff of the mathematical modeling program. Remember that output predictions are only as good as the input information supplied.

Maintenance and recalibration: A continuing process of improving the model on the basis of new information concerning the mathematical description of the physical processes, changes in the system being modeled, and new observations of occurrences which can be used as calibration data. This process is necessary to keep the model current and available for use in obtaining future "payoffs."
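An illustrative sketch, in Python, of the automatic minimization mentioned under "Calibration" above. The one-coefficient model and the observations are hypothetical stand-ins; scipy's bounded scalar minimizer is used here purely for illustration, in place of whatever optimizer a given modeling program actually employs.

    import numpy as np
    from scipy.optimize import minimize_scalar

    # Hypothetical observations of prototype behavior and a one-coefficient
    # stand-in for a model; a real program would run the full model here.
    t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
    observed = np.array([10.2, 7.7, 6.1, 4.8, 3.6])

    def error_function(k):
        # Mean square error between model predictions and observations.
        predicted = 10.0 * np.exp(-k * t)
        return float(np.mean((predicted - observed) ** 2))

    # Automatic minimization over a bounded range of the coefficient,
    # replacing an iterative series of trial-and-error model runs.
    result = minimize_scalar(error_function, bounds=(0.01, 1.0), method="bounded")
    print("best-fit coefficient:", result.x)
    print("error function at best fit:", result.fun)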
Memorandum

To: Robert A. Miller, WRD, Miami, Florida
Date: April 12, 1976
Through: R. J. Pickering, Chief, Quality-of-Water Branch
From: Robert A. Baltzer, WRD-NR Research
Subject: INFORMATION--Verification of parametric models

Your memorandum to the Chief, Quality-of-Water Branch, concerning water-quality model calibration and verification has been passed on to me--among others--for comment. Your questions are both timely and appropriate, not only with respect to the particular model with which you happen to be working, but also with respect to the larger question of water-resources modeling and the calibration and verification of models generally. I begin my comments by providing background information on models and model development. My responses to your questions are presented in what I hope is an orderly, logical manner; they follow, more or less, the sequence in which the questions were raised in your memo. It is my hope that these responses will prove helpful and will afford some clarification of the model calibration/verification question for you, for others working with models, and for those having administrative responsibility for technical programs.

Let me begin by pointing out that modeling is a conceptually based endeavor. The model builder starts out with a mental image or mental concept of a process, synthesized from particular or discrete information. His concept may be unsophisticated and/or intentionally simplistic. On the other hand, he may choose to employ an advanced, very sophisticated concept of the process. The degree of conceptual sophistication used by a modeler usually depends upon 1) his knowledge or scientific understanding of the process, 2) the objective of the modeling effort, 3) the resources--manhours, money, data--available to him, and 4) perhaps certain other peripheral factors as well. How these factors are blended is a matter of much concern to present-day administrative managers of model research and development.

Modeling is generally categorized as being of either of two types: physical modeling or mathematical (numerical) modeling. The latter is the more comprehensive of the two, in the sense that any process that can be described in relational terms or pure logic can be incorporated into the mathematical modeling effort. In other words, a process to be simulated or mathematically modeled may embody a broad array of integrated sub-processes drawn together from several distinct disciplines--biochemistry, hydrodynamics, and economics, for example. Moreover, mathematical modeling affords (or should afford) a high degree of model transferability not inherently possible with physical modeling.

The age-old recorded history of water-resources investigations is filled with examples of the use of physical models. Yet, although the study of mathematics and classical hydromechanics have been intertwined for centuries, mathematical modeling as we think of it today has made its debut as a useful, powerful, and viable tool only within the past decade. This coming-of-age of mathematical modeling is due almost entirely to the advent of large-capacity, high-speed digital computer systems. Generally speaking, mathematical (numerical) models may be classified as being of either the deterministic (structure-imitating) type or of the statistical (usually stochastic) type.

From this thumbnail background sketch of modeling, model development, and model classification, let us now turn to the questions you have raised. The use of a mathematical simulation model, whether it be a water-quality model or some other type of model, depends almost entirely on the fundamental properties embodied in its original design concept. This is particularly true of structure-imitating models. As you clearly pointed out in your memorandum, the person who conceives, formulates, and actually builds a model knows best (or should) what prototype situations can be simulated through its use. The limitations inherent in the modeler's concept govern the prototype circumstances under which it may be appropriately used. To a model user who is not also the model builder, these limitations should likewise be of utmost importance. The user can only hope that the builder has properly conceived the model and has provided him with adequate and accurate documentation of it. It is a little bit like acquiring a dog: without the papers, you don't know whether you are getting a pedigreed animal or a "Heinz." All too often we see model users--yes, even in WRD--attempting to use off-the-shelf models in a cavalier manner, without knowing their limitations or the specific conditions for which they are applicable.

A well-conceived model intended to be appropriate for use with a particular set of prototype circumstances may, nevertheless, still prove to be inadequate. How so? The use of inappropriate numerical processes or poor computer programming technique can sharply curtail the worth and useful scope of what otherwise should have been a useful mathematical model. Numerical instability resulting from computational problems is a prime example of this difficulty; a small sketch below illustrates it. Failure of the numerical scheme to properly conserve the significant non-linear properties of the process can produce spurious results or solution instability, to cite yet another example.

As you clearly indicate, you are making use of an EPA stock model, namely QUAL-II. The questions that you must ask yourself and satisfactorily answer are these:

a) What limiting constraints and assumptions are inherent in the design concept of this model?

b) Given these constraints and assumptions, is the model appropriate to my particular problem situation?

c) Is there sufficient experience with this model to have reasonable confidence in its numerical processes?

Be sure you really know the answers to these questions. Presumably you have already answered them to your satisfaction.
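A small Python sketch of the numerical-instability problem mentioned above. The scheme and the numbers are illustrative only and come from no particular model: explicit upwind differencing of the one-dimensional advection equation is stable only while the Courant number u*dt/dx stays at or below 1, and exceeding that limit makes the computed solution grow without bound.

    import numpy as np

    def advect(courant, nsteps=40, nx=100):
        # March an initial concentration pulse downstream with the
        # explicit upwind scheme c[i] <- c[i] - C * (c[i] - c[i-1]).
        c = np.zeros(nx)
        c[40:60] = 1.0                                   # initial pulse
        for _ in range(nsteps):
            c[1:] = c[1:] - courant * (c[1:] - c[:-1])   # upwind update
        return c

    stable = advect(courant=0.8)    # within the stability limit
    unstable = advect(courant=1.2)  # beyond the stability limit
    print("max after 40 steps, Courant 0.8:", stable.max())    # stays near 1
    print("max after 40 steps, Courant 1.2:", unstable.max())  # grows enormously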
Before attempting to answer your subsequent questions, we need to define and briefly discuss model schematization, model calibration, and model verification--three important aspects of operational modeling.

For the most part, mathematical models remain conceptually general until they are schematized to represent the process(es) occurring in a specifically defined geographic region or at a particular site. To illustrate, a one-dimensional flow model remains conceptually general until physical quantities (number of subreaches, lengths of subreaches, channel cross-sectional values, etc.) are supplied, together with the appropriate process parameters (Chezy's "C" or Manning's "n", the velocity coefficients "a", etc.). Only after these quantities are defined do we have a specific model. The process of schematization requires a blend of 1) scientific understanding of the process(es) being simulated, 2) numerical and operational understanding of, or familiarity with, the model itself, 3) an intuitive ability to discretize the particular prototype situation to best suit the model, and 4) prior field and modeling experience. To a degree, model schematization is an art.
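To make the idea of schematization concrete, here is a hypothetical Python fragment of the kind of quantities that turn a general one-dimensional flow model into a specific one. The field names and values are invented for illustration and do not reproduce the input format of QUAL-II or of any other particular model.

    # Physical quantities and process parameters for three subreaches.
    subreaches = [
        # length (m), cross-sectional area (m^2), Manning's "n"
        {"length": 1200.0, "area": 85.0, "manning_n": 0.030},
        {"length": 950.0, "area": 78.0, "manning_n": 0.035},
        {"length": 1400.0, "area": 92.0, "manning_n": 0.028},
    ]

    # Geometric quantities come from field measurement; process parameters
    # such as Manning's "n" are first estimates, refined later in calibration.
    for i, reach in enumerate(subreaches, start=1):
        print(f"subreach {i}: {reach['length']} m, n = {reach['manning_n']}")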
Calibration of a model is the task of establishing the correct values of the various parameters used in a particular implementation. Some of these values may be readily determined from actual field measurements. For others, an initial estimate may have to suffice until the calibration process indicates appropriate refinements. Unfortunately, the technique often used for calibration refinement is an iterative series of trial-and-error solutions of the model itself. Needless to say, this is a very expensive process, although--given enough time and money--a calibration of sorts usually results. However, other methods are now available for quickly arriving at a more accurate calibration than can be derived by trial and error. Moreover, these new methods have the virtue of providing a quantitative measure of the probable accuracy of the calibration. I will discuss these briefly in a subsequent paragraph.

A word of caution with regard to model calibration! Variables which can be measured and quantitatively determined in the prototype should not be "adjusted" or otherwise modified in order to make the values of the various parameters appear more reasonable. If, as a result of model calibration, one (or more) of the parameters appears unreasonable or seems to be unexpectedly a function of another variable, then one must consider the possibilities 1) that the model, by virtue of its limiting constraints and assumptions, may not be well suited to the prototype situation, 2) that the mathematical/numerical techniques used in the model may be faulty, 3) that there may be a programming, computational, or setup error in the model, and/or 4) that the schematization may have been done improperly. However, one must keep in mind that the normally small errors introduced into the modeling process through discretization of the numerical processes will concentrate in the parametric quantities.

Model verification is the task wherein simulated results obtained with a supposedly calibrated model are compared with accurately determined prototype values obtained from field measurements. Verification usually involves the comparison of simulated results and prototype measured values for several different time periods; hopefully, these depict the more-or-less full range of expected prototype conditions. The usual interpretation of this process is that the difference between the simulated results and the prototype observations is a measure of the degree of verification of the model. All too often, however, we read about or observe model users who first calibrate a model using a measured prototype data set, and then claim to have verified the model using the same set of data. This, of course, is ridiculous. If one is limited in the amount of data available, then one must consider using some means such as a split-sample technique; in other words, use one half of the prototype data sample for model calibration and the other half for model verification.

The range of conditions over which a model is applicable will depend upon the model itself, upon the range of data available for its calibration, upon the stability of the parametric coefficients, and upon the reliability of the answers desired. It usually follows that if a model is based upon well-defined, fundamental processes or concepts that are valid throughout a wide range of conditions, then one can safely assume that the model may be used throughout this range of conditions. If one is dealing with a lumped-parameter or so-called black-box model based upon very simplistic concepts, then the range over which the model may be applicable is probably very limited indeed. Such factors as model range-of-conditions must be considered in the choice of a model.

Let me now attempt to address your specific questions, namely, those identified as numbers 1, 2, and 3.

Question 1: My interpretation of your question is this: How does one obtain a numerical measure of the accuracy of the calibration of a model? In other words, how do we quantitatively appraise the goodness-of-fit of the calibration? To this my answer is that we have now developed statistical techniques for specifically accomplishing this purpose. In dynamic modeling the cross-spectral density function is proving to be an exceptionally powerful tool for measuring the variance--including phase angle, presence (or absence) of harmonics, as well as deviations--of two time series. The use of this technique can also provide probability confidence bands for the calibration. A brief sketch of such a comparison follows.
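The sketch below, in Python, uses synthetic series: a "prototype" tidal signal and a "simulated" signal that lags it by half an hour and carries some noise. The cross-spectral density's phase angle exposes the timing error, and the coherence indicates how closely the two series agree at each frequency. The signals, the lag, and the use of scipy's spectral routines are all assumptions made for illustration only.

    import numpy as np
    from scipy import signal

    fs = 1.0                                   # one sample per hour
    t = np.arange(512) / fs
    rng = np.random.default_rng(0)
    observed = np.sin(2 * np.pi * t / 12.4)    # 12.4-hour tidal signal
    simulated = (np.sin(2 * np.pi * (t - 0.5) / 12.4)
                 + 0.1 * rng.normal(size=t.size))      # lagged, noisy

    # Cross-spectral density: the phase angle at the dominant frequency
    # measures the timing discrepancy between simulation and prototype.
    f, pxy = signal.csd(observed, simulated, fs=fs, nperseg=128)
    peak = np.argmax(np.abs(pxy))
    print("phase at dominant frequency (radians):", np.angle(pxy[peak]))

    # Coherence near 1 indicates close agreement at that frequency;
    # smaller values flag frequencies where the series diverge.
    f, cxy = signal.coherence(observed, simulated, fs=fs, nperseg=128)
    print("coherence at dominant frequency:", cxy[peak])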
Question 2: Let me repeat what I have stated above. In my opinion the proper step-by-step procedure is as follows: 1) attempt to calibrate the model for a specific time period; let us designate this time period as period "A"; 2) do not recalibrate the model for time period "B", but rather use the model as previously calibrated (period "A") for simulating the conditions experienced during time period "B"; 3) attempt to verify the simulated results by comparing them with the field-measured results for time period "B"; and 4) compare the results statistically, and appraise statistically the "goodness-of-fit."

Question 3: It is entirely possible in the modeling game (perhaps more so in water-quality modeling than in others) that parameters will have a long-term dependency on other factors such as temperature, season, etc. This implies two facts. First, the process is not sufficiently understood to account for long-term changes, and thus the model is not capable of automatically responding to seasonal changes; therefore, a different set of calibration parameters will be needed for different seasons of the year. Thus, to use your words, a "lack of verification" does not necessarily totally disqualify the model for predictive purposes, but it does tell the model user that he cannot proceed with abandon to use it without consideration of the seasonal effects. The second fact is that the model must be used with knowledgeable caution.

You pointed out, quite correctly, that the matter of model calibration and verification involves project management and design. I only wish that more of our people in WRD were aware of this fact. In setting up a project, it is vitally important to understand the implications of model calibration and verification and to include these factors in planning project data collection and project duration, as well as in manpower and dollar estimates.

The general subject of model schematization, model calibration, and model verification is in an adolescent state of development in WRD and in the water-resources fraternity in general. Research is producing better techniques and quantitative means for appraising model calibration and modeling results. These techniques should soon become available at the field level, thereby helping simulation modeling come fully of age.

My response to your original questions is much more long-winded than I had intended. If you have read this far, and if my response has clarified any of your questions, then it has been a worthwhile effort.

R. A. Baltzer
Hydrologist

cc: Chief, Quality-of-Water Branch
    Assistant Chief Hydrologist, R & TC
    Regional Hydrologist, SR, Atlanta
    Sub-District Chief, Miami, Florida
    Chief, Surface-Water Branch
    Chief, Ground-Water Branch