Estimation of Nutrient Loads and Water-Quality Analyses
Additionally, differences between sample types could be
related to the degree of mixing at the sample location,
which depends on physical characteristics of the channel
such as its configuration and the distance upstream or
downstream from the control structures.
Model Development
An ordinary least-squares regression technique
was used to develop predictive equations for the pur-
pose of estimating total nitrogen and total phosphorus
loads discharged from the east coast canals to Biscayne
Bay. The predictive equations can be used to estimate
the value of a dependent variable from observations on
a related or independent variable. In this study, load was
used as the dependent or response variable and dis-
charge as the independent or explanatory variable.
Because discharge is used in the computation of load,
a linear relation and a best-fit equation can be
established more readily by relating load to discharge
than by relating concentration to discharge.
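The load-discharge regression described above can be sketched in a few lines. This is a minimal illustration only, not the software or procedure used in the study; the function name `fit_ols` and the sample values are hypothetical.

```python
from statistics import mean


def fit_ols(discharge, load):
    """Return (intercept, slope) of the least-squares line
    load = intercept + slope * discharge."""
    x_bar, y_bar = mean(discharge), mean(load)
    sxx = sum((x - x_bar) ** 2 for x in discharge)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(discharge, load))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical values for illustration only (not data from the study).
discharge = [10.0, 20.0, 30.0, 40.0, 50.0]
load = [25.0, 45.0, 65.0, 85.0, 105.0]  # constructed as 5 + 2 * discharge

b0, b1 = fit_ols(discharge, load)
print(f"load = {b0:.2f} + {b1:.2f} * discharge")
```

Because the sample loads were constructed from an exact line, the fit recovers that line; field data would scatter about the fitted equation.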
Multiple linear regression using more than one
independent variable, such as stage, rainfall, gate
opening, and discharge, was attempted to improve the
predictive equations. However, because discharge is
computed from stage and gate openings and depends on
rainfall, collinearity between the independent variables
precluded this approach. A smoothing procedure called
locally weighted scatterplot smoothing (LOWESS) was
applied to plots of load as a function of discharge to
assess the degree of linearity between the two variables.
When it was deemed necessary to improve the linear
relation between load and discharge, transformations of
the independent variable were made based upon the
relation of the curve to the Mosteller and Tukey bulging
rule (Helsel and Hirsch, 1992, p. 229).
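The idea behind the LOWESS smooth can be sketched as follows. This is a simplified version written for illustration, assuming distinct discharge values, a tricube weight function, and local straight-line fits; it omits the robustness iterations of the full procedure and is not the implementation used in the study. The function name `lowess`, the `frac` parameter, and the sample values are hypothetical.

```python
def lowess(x, y, frac=0.5):
    """Smooth y against x with locally weighted straight-line fits.

    Simplified sketch: tricube weights, no robustness iterations,
    and x values are assumed to be distinct.
    """
    n = len(x)
    k = min(n, max(3, int(round(frac * n))))  # points in each local window
    smoothed = []
    for x0 in x:
        # Indices of the k points closest to x0, nearest first.
        nearest = sorted(range(n), key=lambda i: abs(x[i] - x0))[:k]
        width = abs(x[nearest[-1]] - x0)  # distance to the farthest neighbor
        # Tricube weights: 1 at x0, falling to 0 at the window edge.
        w = [(1 - (abs(x[i] - x0) / width) ** 3) ** 3 for i in nearest]
        sw = sum(w)
        xw = sum(wi * x[i] for wi, i in zip(w, nearest)) / sw
        yw = sum(wi * y[i] for wi, i in zip(w, nearest)) / sw
        sxx = sum(wi * (x[i] - xw) ** 2 for wi, i in zip(w, nearest))
        sxy = sum(wi * (x[i] - xw) * (y[i] - yw) for wi, i in zip(w, nearest))
        slope = sxy / sxx if sxx > 0 else 0.0
        smoothed.append(yw + slope * (x0 - xw))
    return smoothed


# Hypothetical values for illustration only (not data from the study).
discharge = [float(i) for i in range(1, 11)]
load = [2.0 * q + 1.0 for q in discharge]  # an exactly linear relation

smooth = lowess(discharge, load, frac=0.5)
```

For an exactly linear relation the smooth reproduces the input values. With field data, a smooth that bends away from a straight line suggests that a transformation of discharge may be needed.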
Improvement in the models was judged by increases
in the adjusted coefficient of determination (adjusted
R²), which describes the proportion of variation in load
explained by discharge, and by reductions in the
prediction error sum of squares (PRESS) statistic. The
models selected were those having the highest adjusted
R² and the lowest PRESS statistic, as well as the lowest
root mean square error (standard error of the
regression). The null hypothesis of no linear relation
between load and discharge was rejected at the 0.05
significance level (α level) when the attained
significance level (p-value) was less than 0.05.
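The three selection statistics named above can be computed for a one-variable regression as in the sketch below. It assumes the standard textbook definitions (PRESS computed from leverage-adjusted residuals, and p = 2 coefficients for an intercept and a slope); the function name `fit_statistics` and the sample values are hypothetical and are not taken from the study.

```python
import math
from statistics import mean


def fit_statistics(x, y):
    """Return adjusted R-squared, PRESS, and root mean square error
    for the least-squares line y = b0 + b1 * x."""
    n, p = len(x), 2  # p = number of coefficients (intercept and slope)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(e * e for e in resid)             # error sum of squares
    sst = sum((yi - y_bar) ** 2 for yi in y)    # total sum of squares
    # Leverage of each observation in a one-variable regression.
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # PRESS: sum of squared leave-one-out prediction errors.
    press = sum((e / (1 - h)) ** 2 for e, h in zip(resid, leverage))
    adj_r2 = 1 - (sse / (n - p)) / (sst / (n - 1))
    rmse = math.sqrt(sse / (n - p))
    return {"adj_r2": adj_r2, "press": press, "rmse": rmse}


# Hypothetical values for illustration only (not data from the study).
discharge = [1.0, 2.0, 3.0, 4.0, 5.0]
load = [2.1, 3.9, 6.2, 7.8, 10.1]

stats = fit_statistics(discharge, load)
print(stats)
```

Candidate models fitted to the same response variable can then be ranked by these values, preferring a higher adjusted R² and lower PRESS and root mean square error.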
An important part of model development is resid-
uals analysis. Two assumptions of an ordinary least-
squares regression are: (1) variance of the residuals is
constant (homoscedastic) over the range of values, and
(2) residuals are independent. Plots of predicted values
against residuals were examined, and where noncon-
stant variance (heteroscedasticity) over the range of
values was observed, log transformations of the
response variables were made. Because statistics
computed in log space are not directly comparable with
those computed in real space, transformed and
nontransformed models were not compared on the basis of
the adjusted R² or PRESS statistics.
A key element in model development is regression
diagnostics. Basing model adequacy solely on the
adjusted R² may prove to be inadequate because there
may be no indication as to whether the data have been
well fitted. Examination of data points for leverage,
influence, and outliers was required to verify model
adequacy. Observations were considered outliers in the
x direction (high leverage) if their leverage exceeded
3p/n, where p is the number of coefficients and n is the
number of samples (Helsel and Hirsch, 1992, p. 247).
Studentized residuals were used to examine outliers in
the y direction, and Cook's D was used to determine the
influence of outliers. Observations were considered to
have high influence if Cook's D exceeded the value of the
F distribution for p + 1 and n - p degrees of freedom at
α = 0.1 (Helsel and Hirsch, 1992, p. 249). Numerous data
values demonstrated high leverage, but only a few data
values showed both high leverage and high influence.
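The diagnostics described above can be sketched for a one-variable regression as follows. The sketch assumes the internally studentized form of the residual (the text does not say which form was used) and leaves the F-distribution critical value for Cook's D to be looked up separately. The function name `diagnostics` and the sample values are hypothetical and are not data from the study.

```python
import math
from statistics import mean


def diagnostics(x, y):
    """Return leverage, studentized residual, and Cook's D for each
    observation of the least-squares line y = b0 + b1 * x."""
    n, p = len(x), 2  # p = number of coefficients (intercept and slope)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - p)  # residual variance
    rows = []
    for xi, e in zip(x, resid):
        h = 1 / n + (xi - x_bar) ** 2 / sxx   # leverage
        r = e / math.sqrt(s2 * (1 - h))       # internally studentized residual
        cooks_d = r * r * h / (p * (1 - h))   # Cook's D
        rows.append({
            "leverage": h,
            "high_leverage": h > 3 * p / n,   # 3p/n criterion from the text
            "student_resid": r,
            "cooks_d": cooks_d,
        })
    return rows


# Hypothetical values for illustration only; the last point lies far
# from the others in the x direction.
discharge = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 30.0]
load = [2.2, 3.9, 6.1, 8.0, 9.8, 12.1, 14.2, 15.9, 18.1, 45.0]

rows = diagnostics(discharge, load)
for q, row in zip(discharge, rows):
    print(q, round(row["leverage"], 3), row["high_leverage"],
          round(row["cooks_d"], 3))
```

In this example only the last observation exceeds the 3p/n leverage criterion. Whether it also has high influence depends on comparing its Cook's D with the F-distribution value described in the text.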
The ESTIMATOR Program
Because continuous discharge data are currently
being computed and long-term water-quality data are
available at site S-26 along Miami Canal, a software
program, called ESTIMATOR, was used in the devel-
opment of nitrogen and phosphorus load models and for
the estimation of loads at the site. The ESTIMATOR
program develops models and computes loads based on
mean daily discharge and water-quality data files. This
program was used at site S-26 (formerly part of the
NASQAN network as previously discussed) because
continuous discharge data are available and because the
program requires that about 50 water samples be
collected over at least a 2-year period. The ESTIMATOR
program consists of a seven-parameter log/linear model
employing the following regressors: a constant, a qua-
dratic fit to the natural logarithm of discharge, a qua-
dratic fit to time, and a sinusoidal first-order Fourier