(ThreshR User Documentation)
2.1 Database
2.1.1 Input Data
2.1.2 Mapping Specifications: Selecting a
Coordinate System
2.1.3 Digital Elevation Models, Watershed
Delineation, and Flow Length
2.1.4 DEM Pre-Processing And Determination
Of Flow Directions for ThreshR
2.1.5 Release notes for peak flow regression
data layers
2.1 Database
2.1.1 Input Data
2.1.1.1 Data Formats
A description of the GIS data types used in ThreshR is provided
here. More information about these data types is provided in the
ESRI online documentation.
Both raster and vector data formats are used in ThreshR. A raster data
layer is an array of regular square cells. A vector data layer consists
of either points, lines, or polygons. Raster data is useful for representing
spatially continuous data sets like rainfall and elevation. Vector data
is useful for representing discrete objects like watersheds and streams.
Many factors affect the choice of whether a data set is stored in raster
or vector format. Some factors considered are storage efficiency, resolution,
accuracy, and ease of analysis and display. It is easier to transform vector
data among coordinate systems than it is to transform raster data. Projecting
raster data has important implications for this project and is discussed
further below. A nice feature of the ArcView Spatial Analyst software is
that functions are provided to convert between raster and vector data formats
and to perform overlays and analyses using these two different data types.
ThreshR takes advantage of these functions.
Raster Layers Used in ThreshR: All raster layers are stored in
the ESRI Grid format. The specifics of the ESRI Grid storage format are
proprietary, but this information is not required to access and manipulate
the data using ArcView. The ESRI Grid format allows efficient access to
raster data in large data sets by dividing the raster layer into blocks.
Integer Grid layers may have a value attribute table (VAT) associated with
them. The VAT contains a record for each unique value contained in
the grid. The "Count" field in a VAT stores the number of cells containing
each unique value. Additional fields can be added to a VAT that describe
the zones defined by unique integer values. On a UNIX or PC filesystem,
several different files are used to define a grid. The file storage
scheme is transparent to the ArcView user but it is helpful to be aware
of a few facts. For an elevation Grid named "dem," there will be
a directory called "dem" that contains several files. The raster
data and header information is stored in these files. If a Grid has
a VAT, this file will be stored in a separate directory called "Info."
The Info directory exists at the same level as the "dem" directory and
stores attribute tables for all grids at that directory level.
Vector: Two vector data file formats can be used in ArcView,
Shapefiles and Arc/Info Coverages. The distinction between these two data
formats is not important to the casual user, and the two data types are
accessed and manipulated using the same ArcView/Avenue functions. The main
difference is that Arc/Info Coverages contain more topological information
than Shapefiles. Both Shapefiles and Coverages contain geometric data describing
either points, lines, or polygons and related tabular information. Shapefiles
are a more portable data format and the format specification is available
to the public. All vector data sets required for ThreshR are stored as
Shapefiles. On the filesystem, Shapefiles consist of 3-5 files with the
same base name but different file extensions. All Shapefiles contain a
.shp
file that stores feature geometry, a .shx file that stores an index
to the feature geometry, and .dbf file (dBase format) that stores
feature attribute information.
Fundamental to the definition of GIS is that data layers have attribute
tables associated with them. All vector layers have an associated
attribute table with one record corresponding to each point, line, or polygon
feature on the map. One of the most powerful features of ArcView
is the visual linkage between map layers and their associated attributes.
Spatial queries are automatically linked to tabular queries and vice-versa.
The ThreshR Avenue programs use these types of query and other advanced
spatial analysis functions.
2.1.1.2 Input Files
Tables 2.1 and 2.2 describe GIS data layers that are part the ThreshR database.
The data layers in Table 2.1 are common to all RFCs. Table 2.2 summarizes
additional data layers that are required to compute flood peak flows for
selected return periods using USGS regression equations (Jennings, Thomas,
and Riggs, 1994). Some of the data descriptions provided in Table
2 encompass more than one specific type of data layer. For example,
soils characteristics refers to three types of layers: soil permeability,
SCS Type A Soil, and SCS Type D Soil. The USGS peak flow
regression equations require diverse types of data that vary from state
to state. Only the data layers required for the States in an RFC are included
in the database for that RFC.
Table 2.1 Data Sets Common to All RFCs
Data Description |
File Type |
Standard File Name |
Digital Elevation Model (m) |
Grid |
DEM |
Flow Direction |
Grid |
Fd |
Flow Accumulation |
Grid |
Fa |
Downstream Flow Length (m) |
Grid |
Fld |
Grid Cell Slope in ft/ft |
Grid |
Slopeff |
Buffered mask of the RFC Boundary (1's inside,
NODATA outside) (used to define computational zone) |
Grid |
Mask |
State Boundaries |
Shapefile-Polygon |
Statekey.shp |
USGS Region Boundaries |
Shapefile-Polygon |
Regions.shp |
Modified RF1 file |
Shapefile-Line |
Rf1.shp |
HRAP Center Points |
Shapefile-Point |
hrappts.shp |
Parameter Information Table |
Dbase |
Parcode.dbf |
Regression Equation Coefficients Table |
Dbase |
Regequat.dbf |
Table 2.2 Data Layers Only Required for Certain States*
Data Description |
File Type |
# of States requiring this (including Puerto
Rico) |
Surface Water Storage (Lakes, ponds, swamps) |
Shapefile-Polygon |
16 |
Mean Annual Precipitation (inches) |
Grid |
19 |
Minimum mean January Temperature (degrees F) |
Grid |
4 |
Rainfall amount for a given duration (inches) |
Grid |
14 |
Forest Cover (percent) |
Grid |
8 |
Soils Characteristics |
Shapefile-Polygon |
3 |
Mean Annual Snowfall (inches) |
Grid |
2 |
Area of Stratified Drift (percent) |
Shapefile-Polygon |
1 |
Runoff Coefficient |
Grid |
1 |
Drainage Frequency (number of first order streams
per square mile) |
Shapefile-Line |
1 |
Mean Annual Runoff (inches) |
Grid |
1 |
Normal Daily May-March Temp (degrees F) |
Grid |
1 |
Impervious Cover (percent) |
Shapefile-Polygon |
1 |
Annual PET (inches) |
Grid |
1 |
2.1.2 Coordinate System
Choosing a coordinate system is an important first step in any "GIS" project.
A GIS coordinate system typically consists of a map projection and a datum.
Although many GIS data sets are collected and distributed in geographic
coordinates (latitude, longitude), the traditional mapping approach is
to transform coordinates into a flat plane coordinate system so that length
and area calculations can be made more easily. To calculate flow
directions, distances, and areas correctly using ArcView, data should be
transformed into a suitable map projection. Geographic coordinates
(latitude, longitude) are not in a map projection although data stored
in the lat-lon system are sometimes said to be in the geographic projection.
ArcView allows the user to display geographic data sets, but built-in ArcView
functions for computing flow directions, areas, and intersections are prone
to significant spatial errors when applied to data in geographic coordinates
because these functions treat longitude-latitude coordinates as x-y Cartesian
coordinates. ArcView Version 3.1 includes a tool for measuring the
actual distance between two points in geographic space using ellipsoidal
formulas, but this is the exception, rather than the rule for ArcView analysis
functions.
The Integrated Hydrologic Automated Basin Boundary System (IHABBS) software
developed at the National Operational Hydrologic Remote Sensing Center
(NOHRSC) deals with coordinate systems differently than ArcView.
In IHABBS, data is stored in geographic coordinates and a map projection
is not used. Equations for an ellipsoidal earth are used in
IHABBS to make length calculations when accuracy is required. This
difference between ArcView and IHABBS is important because it has implications
for the use of IHABBS data within ArcView. For most types of data,
there is no problem because a map projection can be applied. The
flow direction grids prepared for IHABBS in geographic space present a
different problem because the information gets degraded when projected.
This unique problem is discussed more in Section
2.4. The remainder of this section is devoted to discussing the map
projection selected for the ThreshR project.
For the ThreshR project, an Albers Equal-Area Conic map projection and
the North American Datum of 1983 (NAD83) will be used. The Albers Equal-Area
Conic projection with the standard parallels given in Table 2.3 is one
of the most commonly used projections used by the USGS to make maps of
the conterminous United States (Snyder, 1987). From this point forward,
the Albers Equal-Area projection with the standard parallels of Table 2.3
will be referred to as the "conUS Albers" projection where conUS stands
for conterminous United States. Many data distributors choose the conUS
Albers projection to distribute GIS data sets with nationwide coverage.
The USGS Water-Resources Investigation Report 94-4002 that gives peak flow
regression equations for all 50 states (Jennings, Thomas and Riggs, 1994)
presents many of the required input maps using the conUS Albers projection.
The coordinate system parameters shown in Table 2.3 are also identical
to those chosen by the US Army Corps of Engineers Hydrologic Engineering
Center (HEC) for their proposed Standard Hydrologic Grid (HEC, 1996).
For Alaska and Hawaii, different sets of standard parallels are recommended
to reduce distortion and could be chosen using the "1/6" rule cited by
Snyder (1987) or by following the lead of previous cartographers by selecting
standard parallels of 55° N and 65° N for Alaska and 8° N and
18° N for Hawaii (Snyder, 1987).
Table 2.3. Coordinate System Parameters Used in ThreshR*
Characteristic |
Description/Value |
Projection type |
Albers Equal-Area Conic |
1st standard parallel: |
29.5 N |
2nd standard parallel: |
45.5 N |
Longitude of the central meridian: |
96 W |
Latitude of the projection origin: |
23 N |
False Easting: |
0 |
False Northing: |
0 |
Datum |
NAD83 |
Units |
Meters |
* For gridded data sets, a cell size needs to be determined. Cell size
specification is discussed further in Section 2.4.
One advantage of using an Equal-Area projection is that by design, the
area of a polygon (i.e. a watershed) in projected space closely approximates
the true area of that polygon on an ellipsoidal model earth. Area is an
ideal characteristic of a map to preserve for hydrologic applications because
watershed drainage area is often a key parameter. In order to preserve
area there must be some distortion in shape, angle, and scale, but the
amount of these distortions can be minimized with a prudent choice of projection
parameters. Maps covering large areas inherently involve larger distortions
than maps covering small areas.
In addition to drainage areas, length measurements are also important
for hydrologic calculations. By considering the scale distortion (ratio
of map distance to earth distance) inherent to the conUS Albers projection,
an approximate estimate of what the maximum distortion in a stream length
measurement or a watershed perimeter measurement can be made. For purposes
of this discussion, scale is defined as the ratio of a mapped distance
to the corresponding distance on a model (ellipsoidal) earth. Using the
conUS Albers projection, a maximum scale distortion of 1.25% occurs at
the northernmost and southernmost edges of the map (49° N and 25°
N), decreasing as one moves towards the standard parallels of 29.5°
N and 45.5° N where the scale distortion is 0 (Snyder, 1987). The scale
distortion peaks again midway between the standard parallels at 37.5 N,
but is less than 1% at this point.
These maximum distortion estimates of 1.25% and 1% occur along meridians
of longitude and parallels of latitude. For an equal-area projection, the
scale factor along meridians of longitude (h) is the inverse of the scale
factor along parallels of latitude (k) so when k = 1.0125 (a 1.25% error),
h = 0.9877, thus the product h * k = 1. This means that even near the edges
of the map where maximum distortion is 1.25% (k = 1.0125), the net length
distortion of stream lines that are not aligned perfectly with a meridian
of longitude will be less than 1.25% because the scale factor associated
with parallels of latitude has the opposite effect from the scale factor
associated with meridians of longitude. To demonstrate that the actual
errors in length calculations are less than these maximum errors, a comparison
was made between watershed perimeters computed using (1) the IHABBS ellipsoidal
equations for making length calculations, and (2) perimeters computed using
ArcView for the same polygons in the conUS projected space. These calculations
were made for 48 watersheds in the Arkansas-Red River Basin. The maximum
difference in computed watershed perimeters was 0.5% and the average difference
was only 0.11%. As expected, these figures are lower than the scale distortion
percentages, h, (0.7% to 1%) along meridians of longitude for latitudes
in the Arkansas-Red River Basin (34° to 38° N).
From an implementation standpoint, it is necessary and appropriate to
work in projected space because of the capabilities of ArcView, and because
this is a standard GIS approach. Another aesthetic advantage with working
in projected space is that the shapes of polygons (watersheds or political
boundaries) on the screen and on printed maps are much more realistic looking
when compared with 2-D longitude-latitude plots. Displaying 2-D longitude-latitude
plots is a tremendous distortion of reality because of the fact that meridians
of longitude converge on one another as they move closer to the poles.
A nice feature of working in projected space is that the distances and
sizes of features that a user sees on the screen are closer to reality.
To illustrate this point, Figure 2.1 shows a map of the 50 U.S. states
plotted using geographic coordinates (x-y = long-lat) in an x-y plane and
the same map plotted in the conUS Albers projection. The shapes of features
in the conUS projection are much more similar to those you would see on
a globe.
To summarize, the ThreshR application must use projected data. The conUS
Albers projection defined in Table 2.3 is appropriate for hydrologic parameter
estimation. There is no area distortion associated with this projection
and the maximum possible length distortion is about 1.25% (in most cases
the length distortion will be considerably lower). This distortion error
is small relative to other uncertainties in ThreshR calculations. Products
from the GIS analysis (i.e. basin boundaries, maps of threshold runoff,
stream lines, etc.) can be easily projected into other coordinate systems
for display and use with other NWS software applications.
Figure 2.1. Comparison of a 2-D Lon-lat Plot with an Albers Equal-Area
Plot
2.1.3 Digital Elevation Models, Watershed Delineation,
and Flow Length
A digital terrain model is an ordered array of elevation values and is
commonly stored in one of three data structures: contours, grids, and triangulated
irregular networks (TINs). Moore et al. (1991) review the advantages
and disadvantages of each of these three data structures. Grid terrain
models are the most widely available data structure, terrain analysis methods
using grid models are simple, and the grid structure is compatible with
remote sensing techniques. Only gridded elevation data is used in this
project. The term adopted by the USGS to describe digital terrain data
in a grid format is digital elevation model (DEM).
DEMs are available at many different resolutions. In the United States,
the U.S. Geological Survey (USGS) has taken a lead role in developing and
distributing DEMs that can be used for analysis over a wide range of scales.
Currently, 3 arc-second DEMs are available from the USGS for all of the
United States and its territories. The spacing between elevation values
in a 3 arc-second DEM is about 90 m at the Equator. The USGS is also working
on a seamless 1-arc second (~30 m) DEM for the United States called the
National Elevation Data Set. When working in large areas, use of
even 3 arc-second data can consume large amounts of disk space and computational
time. For this reason, the initial ThreshR database includes terrain grids
with 15 arc-second spacing. The 15 arc-second DEMs used in ThreshR were
created at NOHRSC by resampling 3 arc-second DEMs from the USGS 3 arc-second
data. The 15 arc-second data require 25 times less disk space than the
3 arc-second data. To provide a sense of file sizes, a 15 arc-second DEM
for the largest RFC (MBRFC) would consume about 30 MB while the 3 arc-second
DEM would consume roughly 750 MB. The threshold runoff software
will work Threshold runoff calculations a good deal of uncertainty even
with the best data available so the hypothesis is that any inaccuracies
associated with using the 15 arc-second data are small relative to other
hydrologic uncertainties and the computational and storage savings is worth
the tradeoff.
A DEM can be used to define the drainage path that a water droplet falling
on the land surface takes to a watershed outlet. If water flow is on the
surface or near the surface it is often assumed that the direction of flow
is controlled by topography. A common computer method for determining a
drainage network from DEMs is summarized here.
2.1.3.1 Pit Removal
Depressions or pits in a DEM are cells or groups of cells that have a lower
elevation than all surrounding cells. Pits in a DEM may be real, representing
natural landscape features, but are often artificial, caused by data errors
introduced during elevation sampling or data processing. Both O'Callaghan
and Mark (1984) and Band (1986) note that in areas where fluvial erosion
processes sculpt the landscape most pits in the DEM are caused by data
errors while recently glaciated areas or karst landscapes are more likely
to have natural depressions. Various methods have been developed to remove
pits from DEMs so that a continuous flow path is defined from every cell
to the edge of the data set or the watershed outlet. The pit filling algorithm
described by Jenson and Domingue (1988) is coded into ESRI software and
is used for pre-processing ThreshR data. The Jenson and Domingue (1988)
algorithm raises the elevation of all cells in a pit to the elevation of
the lowest pour-point on the edge of the pit.
2.1.3.2 Flow Direction and Flow Accumulation
Models
Several models for defining a grid of flow directions based on a DEM are
discussed in the literature and good reviews of these methods are provided
by Tarboton (1997) and Costa-Cabral and Burges (1994) who both introduce
their own methods. The simplest and most widely used method (often referred
to as the D8 method) to define flow directions in DEMs is described by
O'Callaghan and Mark (1984) and Jensen and Domingue (1988). A D8 flow direction
function is available in ESRI software and is used for ThreshR pre-processing.
In the D8 model, it is assumed that a water particle in each DEM cell flows
towards one and only one of its neighboring cells, that cell being the
one in the direction of steepest descent. To assign a flow direction value
to a cell, the "distance weighted drop" to each of eight neighboring cells
is computed by taking the difference in elevation values and dividing by
for a diagonal cell and one for a non-diagonal cell. The flow direction
for a cell is assumed to be in the direction with the highest distance
weighted drop. In situations where multiple adjacent cells have equal drops
or where all adjacent cells have no drop (flat areas), special considerations
are made (Jenson and Domingue, 1988). In ESRI software, the eight possible
flow directions are assigned unique numbers based on the following convention
( East = 1; Southeast = 2; South = 4; Southwest = 8; West = 16; Northwest
= 32 ; North = 64; Northeast = 128 ). Figure 2.2a shows an example elevation
grid, Figure 2.2b shows the flow direction assignment convention, Figure
2.2c shows the numerical values assigned to cells in the flow direction
grid, and Figure 2.2d shows the flow directions symbolically with arrows.
Figure 2.2. Assignment of flow directions using the D8 model.
(a) elevations, (b) flow direction codes, (c) flow direction grid values,
(d) symbolic representation of flow directions.
Using a flow direction grid and weight grid as input, a flow accumulation
function is defined which returns, for each cell, the sum of the weights
of all cells that flow to that cell. If the weight grid values are all
1, the cell values in the resulting flow accumulation grid are equal to
the total number of cells contributing to that cell. A flow accumulation
grid and a skeleton network implied by the flow direction grid are shown
in Figure 2.3. In this example, all weights are equal to one.
Figure 2.3. (a) Flow accumulation grid and (b) schematic of drainage
network.
In areas of low relief where DEMs are often flat, use of the D8 model
alone to determine flow directions may result in significant errors. This
problem is addressed further in Section 2.4.
2.1.3.3 Drainage Network and Watershed Definition
A simple and practical method for defining drainage networks is to identify
channel cells as cells which drain a minimum threshold area. Depending
on the choice of threshold value, this method may yield a dense or sparse
channel network. In ThreshR, the drainage area threshold is selected
to define the size of the watersheds to be studied. No attempt is made
to determine the threshold at which there is physical evidence of a transition
from hillslope to channel transport mechanisms.
Delineation of watersheds from a DEM requires a flow direction grid
and a set of pour-points. Using a GIS, pour-points may be (1) selected
interactively with a mouse, (2) selected automatically at the downstream
end of each link in the drainage network created using an area threshold
(Jenson and Domingue (1988); Maidment and Mizgalewicz (1993)), or (3) identified
using a coordinate file. The second method is used to identify points where
ThreshR will be computed. The results of a generic procedure for automatically
delineating a watershed for each link in a drainage network is illustrated
in Figure 2.4. Based on the drainage network of Figure 2.3, Figure 2.4a
shows cells tagged as stream cells based on the arbitrary criterion that
a stream cell has flow accumulation greater than two. Figure 2.4b is a
grid in which each stream link is assigned a unique integer value and Figure
2.5c shows the same grid divided into six subwatersheds. Cells in the same
subwatershed are assigned the same grid-code.
Figure 2.4. (a) Stream cells,
(b) stream link grid, (c) subwatersheds.
2.1.3.4 Flow Length
Flow length is a useful function available in ArcView and Arc/Info that
calculates the length from each cell in a Grid to the nearest outlet point
or "NODATA" cell. A flow length grid computed for the DEM shown in
Figure 2.2a is shown in Figure 2.5. .
Figure 2.5 Flowlength Grid for the DEM From Figure 2.2a
2.1.4 DEM Pre-Processing And
Determination Of Flow Directions for ThreshR
Special DEM pre-processing procedures have been developed to prepare flow
direction grids for ThreshR. With a coarse 15 arc-second DEM as input,
simple application of the D8 flow direction model often yields undesirable
results in flat areas. In flat areas, the locations of digitized streams
are used to help define flow directions.
2.1.4.1 Why not use IHABBs flow direction values?
NOHRSC developed a grid of flow directions with a procedure that uses both
the D8 model and information from a digitized stream network (an RF1 stream
grid). However, this flow direction grid cannot be used here because the
grid was developed using geographic coordinates (latitude, longitude).
The reason that this is a problem is discussed here.
Measured in ellipsoidal coordinates, the spacing between flow direction
values in the IHABBS grid is 15 arc-seconds. The spacing of flow direction
values along parallels of latitude in meters depends on the latitude. The
spacing of flow direction values along meridians of longitude in meters
is relatively constant with only slight variations due to the fact that
the earth is shaped more like an ellipsoid than a sphere. To provide some
example numbers, assume that the earth is a sphere of radius 6371.2 km.
This makes the approximate spacing of 15 arc-seconds along the Equator
and along meridians of longitude throughout the globe equal to the earth
radius (r) times the arc length in radians or:
y = 6,371,200*15/3600*3.1416/180 = 463 m
To get the spacing between flow direction values along parallels of latitude
(x), y is multiplied by cos(latitude). A summary of this "lateral" spacing
between elevation points for the range of latitudes covering the conterminous
United States is given in the Table 2.4 below:
Table 2.4. Meter Equivalent of 15 Arc-second Spacing Along a Parallel
of Latitude
Latitude |
Approximate Meter Equivalent to 15 arc-second
spacing along parallels |
25 |
420 m |
30 |
401 m |
35 |
380 m |
40 |
355 m |
45 |
328 |
50 |
298 |
The implications of this uneven spacing is that gridded data must be
resampled when projected into a Cartesian system like conUS, because by
definition, a Grid in GIS consists of regular, square cells. When projecting
a grid, the first step is to select a cell size for the output space. Based
on the information in Table 2.4 and the convenience of using round numbers,
a cell size of 400-m has been selected for the ThreshR project.
The resampling procedure proceeds as follows. The required areal coverage
for the output grid is determined based on the extent of the input data
set, and a set of regular square cells of the selected size is laid out
in the output plane. The center-points of the grid cells being projected
are transformed into the output plane, and values are assigned to the regular
square cells in this plane based on values from the projected grid center-point(s).
There are three options for assigning values to each of the square output
cells based on nearby projected, center-point values. These options are
nearest neighbor, bilinear, and cubic resampling. For categorical(integer)
values (e.g. flow direction) the value of the nearest projected center
point is assigned to each square output cell. In flow direction data, codes
are discrete -- unlike temperature and rainfall grids and there is no correlation
between an individual cell's flow direction value and the values in surrounding
cells. The nearest neighbor resampling option illustrated in Figure 2.5.
Figure 2.5. Illustration of Nearest Neighbor Resampling (Adapted
from the Arc/Info Online Help)
To demonstrate the problem with resampling flow direction values, the
IHABBS flow direction grid for ABRFC was projected into the conUS Albers
projection using a cell size of 400 m. Projecting and resampling the flow
direction cells creates "mirror" cells - cells that flow to one another.
An example of a situation in which a mirror cell was created is shown in
Figure 2.6. In this figure, the arrows represent the flow directions assigned
to cells in the output plane. The small boxes with x's in them represent
the location of projected center points from the original grid with 15
arc-second spacing. Five of these center points are labeled with their
original flow direction values (East, East, South, West, and South). It
can be seen that the reason for the occurrence of mirror cells 2 and 3
is that cell 2 is assigned the value East and cell 3 is assigned the value
West even though these two directions were separated by a South cell in
the original (lat-lon) grid.
There are too many occurrences of mirror cells to consider identifying
and correcting these problems. There are 9,097 instances of mirror cells
created
in the ABRFC alone. This number might seem high, but it is not high compared
with the total number of flow direction cells in the ABRFC grid (4.1 million).
Figure 2.6. Mirror cell example.
2.1.4.2. ThreshR Procedures to Create a Flow
Direction Grid in Projected Space
As stated earlier, one problem with the D8 flow direction method is that
it is a local method that cannot resolve the flow directions for cells
surrounded by other cells of equal elevation -- "flat" areas. Attempts
to represent areas of low relief using a digital elevation model (DEM)
often results in flat surfaces, even if the true relief exhibits a mild
slope. These flat DEM areas may arise due to inadequate vertical resolution.
In addition, flat areas are created when artificial pits in the DEM are
filled. Steps intended to improve flow direction assignment in flat areas
are described here. The steps are designed to be as close to similar procedures
used at NOHRSC in developing their flow direction grid.
The NOHRSC flow direction analysis starts with filling artificial pits
and applying the D8 flow direction algorithm. Steps that were taken at
NOHRSC to modify ("improve") the raw D8 flow direction grid are summarized
here. The basic philosophy is to let the digitized streams exert control
on flow directions in flat areas where there may not be enough resolution
in the DEM to define flow directions.
NOHRSC Flow Direction Modifications (quotations from Internet documentation)
N1: "Route lake[s] and reservoirs to their outlets define[d] either
by rivers and streams or by the lowest water edge elevation flowing away
from the lake or reservoir,"
N2: "Route rivers and streams from their headwaters to their tail waters.
Route cells immediately adjacent to rivers and streams into the rivers
and streams,"
N3: "Route flat areas to intersecting streams or the nearest routed
cell,"
Steps N1 and N2 can be emulated by "burning" the digitized stream network
into a projected and filled DEM and then computing flow directions. The
edited version of RF1 created at NOHRSC is very useful for this purpose
because the nohrsc_rf1 drainage network contains only single line representations
of streams and single line flow paths have been drawn to replace lakes
and reservoirs. The stream burn-in procedure involves the following steps:
1. Rasterize the nohrsc_rf1 network, making sure that the newly created
stream network cells are the same size as the DEM cells and correctly registered
2. Create a modified DEM by lowering elevation values corresponding
to the rasterized nohrsc_rf1 cells relative to all surrounding (non-stream)
elevation cells. An easy way to do this is to assign all stream cells the
elevation 0.
3. Apply the ArcView-Arc/Info flow direction function to the modified
DEM
The effects of the ArcView-Arc/Info stream burn-in procedures that are
consistent with NOHRSC modifications N1 and N2 are as follows.
1. All stream cells will be routed from their headwaters to their tailwaters,
2. Because a streamline has been used to represent lakes and reservoirs,
water will be routed through these areas in an appropriate manner following
the digitized line.
3. Cells immediately adjacent to rivers will be routed to rivers because
this will be the direction of steepest descent.
Although the stream burn-in procedure has essentially the same effects
as modification steps N1 and N2 performed at NOHRSC, step N3, which deals
with flat areas, is more difficult to emulate. In a "flat" area, an appropriate
flow direction value cannot be assigned to a cell by looking only at each
of its eight immediate neighbors. The location of digitized streams can
be used to guide flow direction assignment in flat areas. This is a summary
of the scheme used by NOHRSC:
Some large flat areas will contain streams while other flat areas may
not. Large flat areas that contain a stream (as defined by the RF1 files)
are essentially split into smaller flat areas by the stream. Any stream
that splits a flat area becomes a bordering cell for that flat area. Assignment
of flow directions for the remainder of flat area cells is done recursively,
starting at border cells with defined flow directions (either stream cells
or cells of lower elevation) and moving progressively inward. The flat
area eventually shrinks to nothing.
This algorithm used in ESRI software (Jenson and Domingue (1988)) is
very similar in concept to that used by NOHRSC, but there is a difference
in the results achieved. Looking at the computed flow directions, it appears
that the assignment of flat area flow directions in Arc/Info is biased
towards cells on the edge of flat areas with the lowest elevation (rather
than simply the nearest border cell). For example, compare the original
flow directions from NOHRSC and the flow directions computed in Arc/Info
using the projected and resampled NOHRSC DEM in Figures 2.7a and 2.7b.
Figure 2.7a. NOHRSC Flow Directions in a Flat Area with No Streams
Figure 2.7b Arc/Info Flow Directions in a Flat Area with No Streams
Since assigning flow directions in flat areas of the DEM involves a
lot of uncertainty, either the NOHRSC approach or the Arc/Info approach
is probably acceptable. However, because the Arc/Info algorithm is particularly
sensitive to the elevation values of cells bordering the flat areas, burning
in the streams can have a significant effect on the flow in flat areas
and give some odd results. As explained above, burning in the streams involves
raising the land surface cells relative to the stream cells in some way.
It turns out that the elevation amount by which the stream cells differ
from the land surface cells influences the way the Arc/Info algorithm assigns
flow directions in flat areas (if a stream crosses the flat area in question).
For example, in a test in an area where a stream crosses a flat portion
of a DEM, the resulting flow directions computed with Arc/Info when lowering
the stream cells relative to the land surface by 1 meter, 10 meters, and
100 meters were compared. The results of this test are shown in Figures
2.8a, 2.8b, 2.8c, and 2.8d where 2.8a is the unmodified DEM. The most dramatic
result is shown in Figure 2.8d where the stream is drawing all cells in
the flat area for which the flow directions cannot be determined using
the D8 algorithm. Note that cells near the bottom of Figure 2.8d are drawn
down by a stream that is not shown. (Although I do not know the reason
for these differences, I'm wondering if it has to do only with the order
of computations.)
Figure 2.8a. No burn-in.
Figure 2.8b. 1 m burn-in
Figure 2.8c - 10 m burn-in
Figure 2.8d - 100 m burn-in
From Figures 2.8 a, b, c, and d it appears that choosing a small stream-landsurface
differential with which to "burn-in" the streams (Figure 2.8a) creates
a more sensible assignment of flow directions in flat area cells. The dilemma
is that if a small stream-landsurface differential is introduced, then
the delineated streams might not precisely match the digitized streams
in non-flat areas. A precise match in non-flat areas may or may not be
important. This leads to the point that there are actually two motivations
for burning in the streams -- to provide consistency between digitized
and delineated streams and to provide added guidance for flow direction
assignment in flat areas. Comparing Figures 2.8a and 2.8b shows the advantages
of burning in streams in flat areas.
In order to produce flow direction assignments that are similar to those
generated at NOHRSC and to eliminate the possibility of flow direction
bias due to burning in the streams (Figure 2.8d), programs have been developed
to introduce a small elevation gradient in flat areas towards cells which
have a defined flow direction. The scheme uses a series of Arc/Info commands
and a C program. These programs are only used for pre-processing and will
not be required by the end user. This scheme eliminates the ambiguity in
assigning flow direction values in flat areas using the D8 algorithm. An
illustration of this scheme is given in Figures 2.9-2.11.
Figure 2.9 shows cells for which a flow direction cannot be assigned
using the D8 model in white and green (white and medium gray in black and
white copies). Black cells in Figure 2.9 are non-flat areas and the continuous
string of cells down the middle of Figure 2.9 is a burned stream. Green
cells (down-border cells) are distinguished because they have neighbors
with D8 flow directions and have no neighbors of higher elevation. Other
white cells should flow towards these green cells. Elevation values in
flat areas are adjusted slightly to introduce a gradient towards the green
cells. This is illustrated in Figure 2.10 where the elevation gradient
is from white (slightly higher cells) to black (lowest cells). When flow
directions are recomputed using the adjusted elevation grid, the results
are fairly consistent with the flow directions derived at NOHRSC (Figure
2.11a and b).
Figure 2.9
Figure 2.10
Figure 2.11a. Flow directions computed with a modified elevation
grid
Figure 2.11b. NOHRSC Flow directions
2.1.4.3 A Comparison of Watershed Areas Delineated
using IHABBS Flow Directions and ArcView-Arc/Info Flow Directions
Drainage areas delineated using (1) the Arc/Info flowdirection function
applied to the projected NOHRSC DEM without modifications, and (2) the
Arc/Info flowdirection function applied to a projected and modified
DEM (modifications include burning in the streams and introducing an artificial
elevation gradient in flat areas) are compared to drainage areas delineated
using IHABBS. The study was made for 47 basins in the ABRFC ranging in
size from 160 km2 - 2000 km2. For case 1, the average
percent difference between drainage areas computed with Arc/Info and with
IHABBS is 1.9%, with a standard deviation of 4.0%, and a maximum difference
of 23.7%. For case 2, the average percent difference in drainage areas
computed with Arc/Info and IHABBS is 0.6%, with a standard deviation of
0.8% and a maximum difference of 4.1%.
In case 1, the most significant differences in basin boundary definition
are caused by differences in how the two software packages split up flow
in flat areas near basin divides. In case 2, the extra pre-processing being
done for ThreshR eliminates some of these discrepancies.
2.1.4.4 Problems with Grid Cell Resolution
The stream burning procedure that is used in ThreshR pre-processing involves
setting all cells that underlie the vector RF1 files to zero. The flow
direction values for each of these zero valued cells are assigned using
the Arc/Info flow direction function. In this function, the stream cell
flow directions are not determined using the direction of steepest descent
because there is no direction of steepest descent. The stream cell flow
directions are actually determined using an iterative procedure that works
its way backwards from the outlet. A problem with setting stream cell values
to zero arises when two stream arcs are so close together that upon rasterization
a cell in one stream segment has a neighbor which is a cell in another
stream segment and these cells are not near a segment junction point.
What can happen then is that a cell which neighbors with another stream
segment may be assigned a flow direction into its neighboring stream segment
rather than to its own stream segment because there are no elevation values
to guide the choice. This problem occurs partly because the source data
scale for the 15 arc-second DEM (~1:1,000,000) is inconsistent with the
source data scale for the RF1 files (~1:500,000). In other words,
the cell size is too large relative to the density of the stream lines.
Ultimately, the best solution for this problem might be go the to use of
a 3 arc-second DEM, but for now work around programs have been developed.
At a later time, it may be worthwhile to investigate the tradeoffs between
large increases in storage and computation time associated with using a
higher resolution DEM and any improvements in results that might be obtained.
There are several possible ways to deal with the DEM cell resolution
problem described in the preceding paragraph. One way to minimize
this problem is to decrease the cell size by resampling the DEM.
This approach is not adopted because it creates a larger data set, implies
a greater resolution in elevation data than what is actually available,
and the cell size required to eliminate this problem is not obvious.
The number of locations at which the adjacent burned stream reach problem
is created might be reduced by using only a 1-m stream-drop when burning
in the streams rather than setting all stream cells to zero elevation;
however, it is not clear thaty this solution would guarantee success in
all cases. In addition, it is difficult to define a rational basis
for choosing a constant stream-drop value. Small stream-drops will
not force the DEM streams to follow the digitized streams. Forcing
stream junctions to occur in the locations defined by digitized streams
is helpful, something that can be more easily achieved with a large stream
drop or setting the stream cell values to zero. By forcing flow into
digitized streams, the ThreshR flow directions will be more cosistent with
the NOHRSC-derived flow directions.
The approach used to correct the adjacent burned-stream reach problem
in the 15 arc-second ThreshR database is to edit the flow direction grids
manually where necessary. A set of pre-processing tools has been
developed to make this task feasible. These tools automatically identify
potential problem areas like that shown in Figure 2.12, and provide an
interactive environment for editing the flowdirections near these problem
areas as deemed necessary. The number of potential problem areas
in ABRFC and SERFC was about 70 each. Only a fraction of these points
actually required editing and the time required for this task is about
1-2 hours. Instructions for implementing the flow direction pre-processing
procedures have been prepared but are not presented here because the end
user is not required to implement these procedures.
Figure 2.12 Example of a situation where the rasterized version
of two distinct creeks contain neighboring cells.
Pre-processing. To develop a grid of flow directions in projected
space, the following procedures
-
Convert the NOHRSC RF1 files, the 15 arc-second DEM, and the slope grid
into Arc/Info format.
-
Project the data into the conUS projection deefined in the Technical Memo.
-
Buffer an RFC basin outline obtained from the NOHRSC web site and use this
to clip out only the data that is really needed for analysis.
-
Convert the NOHRSC RF1 files to a grid and "burn-in" the streams.
-
Identify flat zones in the DEM.
-
Modify the burned DEM by establishing an elevation gradient in the flat
zones towards cells of lower elevation.
-
Re-compute flow direction from the burned DEM.
-
A difficulty that arises when burning-in the streams is that the scale
of the RF1 source data is inconsistent with the 15 arc-second (400 m) digital
elevation model cells being used for this analysis. Digitized streamlines
may be within 800 m of one another even far upstream from stream junction
points. This means that when the stream network is rasterized, there may
be locations where cells from different stream segments neighbor one another.
Subsequently, when streams are "burned-in" to the DEM, crossover of flow
to the wrong stream segment can potentially occur at locations far removed
from the actual stream junctions. After recent discussions with Dennis
J. it seems that the best solution for this problem is to identify potential
problem sites are identified with an automatic procedure developed in ArcView,
but to manually edit the flow direction grids in problem areas. The need
to deal with this problem would be eliminated if 3 arc-second digital elevation
data were used.
Semi-automated procedures have been developed for the above pre-processing
tasks. Running the pre-processing procedures is describing in Section III
of this document.
2.1.5 Release notes for
peak flow data
Iowa: The stream order Grid ("stral_ia") in the ThreshR
database is incorrect. As a temporary solution, baseline runs for
Q2 are made using the regression equations from Carpenter and Georgakakos,
1993 (p. 20) is used. A linefile of first order streams in Iowa is
needed.
Colorado: The equations for the Eastern Colorado Plains drainage
areas less than 20 mi2 have not yet been included. For all size basins
in the Eastern Colorado Plains, there are not regression equations given
in USGS WRIR 94-4002 for Q2.
Texas: In the ThreshR database, the boundaries of regions 3 and
4 on the Texas regions map were extended to include the playa area in the
panhandle where regression equations do not apply. Thus, baseline
calculations in the playa region of the Texas panhandle are unreliable.
Nebraska: Using the database available, no distinction can be
made between total drainage area and contributing drainage area.
|