National Weather Hydrology Laboratory

www.nws.noaa.gov

Site Map

News

Organization

Threshold Runoff

Home

(ThreshR User Documentation)

2.1 Database

2.1.1 Input Data
2.1.2 Mapping Specifications: Selecting a Coordinate System
2.1.3 Digital Elevation Models, Watershed Delineation, and Flow Length
2.1.4 DEM Pre-Processing And Determination Of Flow Directions for ThreshR
2.1.5 Release notes for peak flow regression data layers

2.1 Database

2.1.1 Input Data

2.1.1.1 Data Formats

A description of the GIS data types used in ThreshR is provided here. More information about these data types is provided in the ESRI online documentation.

Both raster and vector data formats are used in ThreshR. A raster data layer is an array of regular square cells. A vector data layer consists of either points, lines, or polygons. Raster data is useful for representing spatially continuous data sets like rainfall and elevation. Vector data is useful for representing discrete objects like watersheds and streams. Many factors affect the choice of whether a data set is stored in raster or vector format. Some factors considered are storage efficiency, resolution, accuracy, and ease of analysis and display. It is easier to transform vector data among coordinate systems than it is to transform raster data. Projecting raster data has important implications for this project and is discussed further below. A nice feature of the ArcView Spatial Analyst software is that functions are provided to convert between raster and vector data formats and to perform overlays and analyses using these two different data types. ThreshR takes advantage of these functions.

Raster Layers Used in ThreshR: All raster layers are stored in the ESRI Grid format. The specifics of the ESRI Grid storage format are proprietary, but this information is not required to access and manipulate the data using ArcView. The ESRI Grid format allows efficient access to raster data in large data sets by dividing the raster layer into blocks. Integer Grid layers may have a value attribute table (VAT) associated with them. The VAT contains a record for each unique value contained in the grid. The "Count" field in a VAT stores the number of cells containing each unique value. Additional fields can be added to a VAT that describe the zones defined by unique integer values. On a UNIX or PC filesystem, several different files are used to define a grid. The file storage scheme is transparent to the ArcView user but it is helpful to be aware of a few facts. For an elevation Grid named "dem," there will be a directory called "dem" that contains several files. The raster data and header information is stored in these files. If a Grid has a VAT, this file will be stored in a separate directory called "Info." The Info directory exists at the same level as the "dem" directory and stores attribute tables for all grids at that directory level.

Vector: Two vector data file formats can be used in ArcView, Shapefiles and Arc/Info Coverages. The distinction between these two data formats is not important to the casual user, and the two data types are accessed and manipulated using the same ArcView/Avenue functions. The main difference is that Arc/Info Coverages contain more topological information than Shapefiles. Both Shapefiles and Coverages contain geometric data describing either points, lines, or polygons and related tabular information. Shapefiles are a more portable data format and the format specification is available to the public. All vector data sets required for ThreshR are stored as Shapefiles. On the filesystem, Shapefiles consist of 3-5 files with the same base name but different file extensions. All Shapefiles contain a .shp file that stores feature geometry, a .shx file that stores an index to the feature geometry, and .dbf file (dBase format) that stores feature attribute information.

Fundamental to the definition of GIS is that data layers have attribute tables associated with them. All vector layers have an associated attribute table with one record corresponding to each point, line, or polygon feature on the map. One of the most powerful features of ArcView is the visual linkage between map layers and their associated attributes. Spatial queries are automatically linked to tabular queries and vice-versa. The ThreshR Avenue programs use these types of query and other advanced spatial analysis functions.

2.1.1.2 Input Files

Tables 2.1 and 2.2 describe GIS data layers that are part the ThreshR database. The data layers in Table 2.1 are common to all RFCs. Table 2.2 summarizes additional data layers that are required to compute flood peak flows for selected return periods using USGS regression equations (Jennings, Thomas, and Riggs, 1994). Some of the data descriptions provided in Table 2 encompass more than one specific type of data layer. For example, soils characteristics refers to three types of layers: soil permeability, SCS Type A Soil, and SCS Type D Soil. The USGS peak flow regression equations require diverse types of data that vary from state to state. Only the data layers required for the States in an RFC are included in the database for that RFC.

Table 2.1 Data Sets Common to All RFCs

Data Description File Type Standard File Name

Digital Elevation Model (m) Grid DEM

Flow Direction Grid Fd

Flow Accumulation Grid Fa

Downstream Flow Length (m) Grid Fld

Grid Cell Slope in ft/ft Grid Slopeff

Buffered mask of the RFC Boundary (1's inside, NODATA outside) (used to define computational zone) Grid Mask

State Boundaries Shapefile-Polygon Statekey.shp

USGS Region Boundaries Shapefile-Polygon Regions.shp

Modified RF1 file Shapefile-Line Rf1.shp

HRAP Center Points Shapefile-Point hrappts.shp

Parameter Information Table Dbase Parcode.dbf

Regression Equation Coefficients Table Dbase Regequat.dbf

Table 2.2 Data Layers Only Required for Certain States*

Data Description File Type # of States requiring this (including Puerto Rico)

Surface Water Storage (Lakes, ponds, swamps) Shapefile-Polygon 16

Mean Annual Precipitation (inches) Grid 19

Minimum mean January Temperature (degrees F) Grid 4

Rainfall amount for a given duration (inches) Grid 14

Forest Cover (percent) Grid 8

Soils Characteristics Shapefile-Polygon 3

Mean Annual Snowfall (inches) Grid 2

Area of Stratified Drift (percent) Shapefile-Polygon 1

Runoff Coefficient Grid 1

Drainage Frequency (number of first order streams per square mile) Shapefile-Line 1

Mean Annual Runoff (inches) Grid 1

Normal Daily May-March Temp (degrees F) Grid 1

Impervious Cover (percent) Shapefile-Polygon 1

Annual PET (inches) Grid 1

2.1.2 Coordinate System

Choosing a coordinate system is an important first step in any "GIS" project. A GIS coordinate system typically consists of a map projection and a datum. Although many GIS data sets are collected and distributed in geographic coordinates (latitude, longitude), the traditional mapping approach is to transform coordinates into a flat plane coordinate system so that length and area calculations can be made more easily. To calculate flow directions, distances, and areas correctly using ArcView, data should be transformed into a suitable map projection. Geographic coordinates (latitude, longitude) are not in a map projection although data stored in the lat-lon system are sometimes said to be in the geographic projection. ArcView allows the user to display geographic data sets, but built-in ArcView functions for computing flow directions, areas, and intersections are prone to significant spatial errors when applied to data in geographic coordinates because these functions treat longitude-latitude coordinates as x-y Cartesian coordinates. ArcView Version 3.1 includes a tool for measuring the actual distance between two points in geographic space using ellipsoidal formulas, but this is the exception, rather than the rule for ArcView analysis functions.

The Integrated Hydrologic Automated Basin Boundary System (IHABBS) software developed at the National Operational Hydrologic Remote Sensing Center (NOHRSC) deals with coordinate systems differently than ArcView. In IHABBS, data is stored in geographic coordinates and a map projection is not used. Equations for an ellipsoidal earth are used in IHABBS to make length calculations when accuracy is required. This difference between ArcView and IHABBS is important because it has implications for the use of IHABBS data within ArcView. For most types of data, there is no problem because a map projection can be applied. The flow direction grids prepared for IHABBS in geographic space present a different problem because the information gets degraded when projected. This unique problem is discussed more in Section 2.4. The remainder of this section is devoted to discussing the map projection selected for the ThreshR project.

For the ThreshR project, an Albers Equal-Area Conic map projection and the North American Datum of 1983 (NAD83) will be used. The Albers Equal-Area Conic projection with the standard parallels given in Table 2.3 is one of the most commonly used projections used by the USGS to make maps of the conterminous United States (Snyder, 1987). From this point forward, the Albers Equal-Area projection with the standard parallels of Table 2.3 will be referred to as the "conUS Albers" projection where conUS stands for conterminous United States. Many data distributors choose the conUS Albers projection to distribute GIS data sets with nationwide coverage. The USGS Water-Resources Investigation Report 94-4002 that gives peak flow regression equations for all 50 states (Jennings, Thomas and Riggs, 1994) presents many of the required input maps using the conUS Albers projection. The coordinate system parameters shown in Table 2.3 are also identical to those chosen by the US Army Corps of Engineers Hydrologic Engineering Center (HEC) for their proposed Standard Hydrologic Grid (HEC, 1996).

For Alaska and Hawaii, different sets of standard parallels are recommended to reduce distortion and could be chosen using the "1/6" rule cited by Snyder (1987) or by following the lead of previous cartographers by selecting standard parallels of 55° N and 65° N for Alaska and 8° N and 18° N for Hawaii (Snyder, 1987).

Table 2.3. Coordinate System Parameters Used in ThreshR*

Characteristic	Description/Value
Projection type	Albers Equal-Area Conic
1st standard parallel:	29.5 N
2nd standard parallel:	45.5 N
Longitude of the central meridian:	96 W
Latitude of the projection origin:	23 N
False Easting:	0
False Northing:	0
Datum	NAD83
Units	Meters

* For gridded data sets, a cell size needs to be determined. Cell size specification is discussed further in Section 2.4.

One advantage of using an Equal-Area projection is that by design, the area of a polygon (i.e. a watershed) in projected space closely approximates the true area of that polygon on an ellipsoidal model earth. Area is an ideal characteristic of a map to preserve for hydrologic applications because watershed drainage area is often a key parameter. In order to preserve area there must be some distortion in shape, angle, and scale, but the amount of these distortions can be minimized with a prudent choice of projection parameters. Maps covering large areas inherently involve larger distortions than maps covering small areas.

In addition to drainage areas, length measurements are also important for hydrologic calculations. By considering the scale distortion (ratio of map distance to earth distance) inherent to the conUS Albers projection, an approximate estimate of what the maximum distortion in a stream length measurement or a watershed perimeter measurement can be made. For purposes of this discussion, scale is defined as the ratio of a mapped distance to the corresponding distance on a model (ellipsoidal) earth. Using the conUS Albers projection, a maximum scale distortion of 1.25% occurs at the northernmost and southernmost edges of the map (49° N and 25° N), decreasing as one moves towards the standard parallels of 29.5° N and 45.5° N where the scale distortion is 0 (Snyder, 1987). The scale distortion peaks again midway between the standard parallels at 37.5 N, but is less than 1% at this point.

These maximum distortion estimates of 1.25% and 1% occur along meridians of longitude and parallels of latitude. For an equal-area projection, the scale factor along meridians of longitude (h) is the inverse of the scale factor along parallels of latitude (k) so when k = 1.0125 (a 1.25% error), h = 0.9877, thus the product h * k = 1. This means that even near the edges of the map where maximum distortion is 1.25% (k = 1.0125), the net length distortion of stream lines that are not aligned perfectly with a meridian of longitude will be less than 1.25% because the scale factor associated with parallels of latitude has the opposite effect from the scale factor associated with meridians of longitude. To demonstrate that the actual errors in length calculations are less than these maximum errors, a comparison was made between watershed perimeters computed using (1) the IHABBS ellipsoidal equations for making length calculations, and (2) perimeters computed using ArcView for the same polygons in the conUS projected space. These calculations were made for 48 watersheds in the Arkansas-Red River Basin. The maximum difference in computed watershed perimeters was 0.5% and the average difference was only 0.11%. As expected, these figures are lower than the scale distortion percentages, h, (0.7% to 1%) along meridians of longitude for latitudes in the Arkansas-Red River Basin (34° to 38° N).

From an implementation standpoint, it is necessary and appropriate to work in projected space because of the capabilities of ArcView, and because this is a standard GIS approach. Another aesthetic advantage with working in projected space is that the shapes of polygons (watersheds or political boundaries) on the screen and on printed maps are much more realistic looking when compared with 2-D longitude-latitude plots. Displaying 2-D longitude-latitude plots is a tremendous distortion of reality because of the fact that meridians of longitude converge on one another as they move closer to the poles. A nice feature of working in projected space is that the distances and sizes of features that a user sees on the screen are closer to reality. To illustrate this point, Figure 2.1 shows a map of the 50 U.S. states plotted using geographic coordinates (x-y = long-lat) in an x-y plane and the same map plotted in the conUS Albers projection. The shapes of features in the conUS projection are much more similar to those you would see on a globe.

To summarize, the ThreshR application must use projected data. The conUS Albers projection defined in Table 2.3 is appropriate for hydrologic parameter estimation. There is no area distortion associated with this projection and the maximum possible length distortion is about 1.25% (in most cases the length distortion will be considerably lower). This distortion error is small relative to other uncertainties in ThreshR calculations. Products from the GIS analysis (i.e. basin boundaries, maps of threshold runoff, stream lines, etc.) can be easily projected into other coordinate systems for display and use with other NWS software applications.

Figure 2.1. Comparison of a 2-D Lon-lat Plot with an Albers Equal-Area Plot

tr graphic

2.1.3 Digital Elevation Models, Watershed Delineation, and Flow Length

A digital terrain model is an ordered array of elevation values and is commonly stored in one of three data structures: contours, grids, and triangulated irregular networks (TINs). Moore et al. (1991) review the advantages and disadvantages of each of these three data structures. Grid terrain models are the most widely available data structure, terrain analysis methods using grid models are simple, and the grid structure is compatible with remote sensing techniques. Only gridded elevation data is used in this project. The term adopted by the USGS to describe digital terrain data in a grid format is digital elevation model (DEM).

DEMs are available at many different resolutions. In the United States, the U.S. Geological Survey (USGS) has taken a lead role in developing and distributing DEMs that can be used for analysis over a wide range of scales. Currently, 3 arc-second DEMs are available from the USGS for all of the United States and its territories. The spacing between elevation values in a 3 arc-second DEM is about 90 m at the Equator. The USGS is also working on a seamless 1-arc second (~30 m) DEM for the United States called the National Elevation Data Set. When working in large areas, use of even 3 arc-second data can consume large amounts of disk space and computational time. For this reason, the initial ThreshR database includes terrain grids with 15 arc-second spacing. The 15 arc-second DEMs used in ThreshR were created at NOHRSC by resampling 3 arc-second DEMs from the USGS 3 arc-second data. The 15 arc-second data require 25 times less disk space than the 3 arc-second data. To provide a sense of file sizes, a 15 arc-second DEM for the largest RFC (MBRFC) would consume about 30 MB while the 3 arc-second DEM would consume roughly 750 MB. The threshold runoff software will work Threshold runoff calculations a good deal of uncertainty even with the best data available so the hypothesis is that any inaccuracies associated with using the 15 arc-second data are small relative to other hydrologic uncertainties and the computational and storage savings is worth the tradeoff.

A DEM can be used to define the drainage path that a water droplet falling on the land surface takes to a watershed outlet. If water flow is on the surface or near the surface it is often assumed that the direction of flow is controlled by topography. A common computer method for determining a drainage network from DEMs is summarized here.

2.1.3.1 Pit Removal

Depressions or pits in a DEM are cells or groups of cells that have a lower elevation than all surrounding cells. Pits in a DEM may be real, representing natural landscape features, but are often artificial, caused by data errors introduced during elevation sampling or data processing. Both O'Callaghan and Mark (1984) and Band (1986) note that in areas where fluvial erosion processes sculpt the landscape most pits in the DEM are caused by data errors while recently glaciated areas or karst landscapes are more likely to have natural depressions. Various methods have been developed to remove pits from DEMs so that a continuous flow path is defined from every cell to the edge of the data set or the watershed outlet. The pit filling algorithm described by Jenson and Domingue (1988) is coded into ESRI software and is used for pre-processing ThreshR data. The Jenson and Domingue (1988) algorithm raises the elevation of all cells in a pit to the elevation of the lowest pour-point on the edge of the pit.

2.1.3.2 Flow Direction and Flow Accumulation Models

Several models for defining a grid of flow directions based on a DEM are discussed in the literature and good reviews of these methods are provided by Tarboton (1997) and Costa-Cabral and Burges (1994) who both introduce their own methods. The simplest and most widely used method (often referred to as the D8 method) to define flow directions in DEMs is described by O'Callaghan and Mark (1984) and Jensen and Domingue (1988). A D8 flow direction function is available in ESRI software and is used for ThreshR pre-processing. In the D8 model, it is assumed that a water particle in each DEM cell flows towards one and only one of its neighboring cells, that cell being the one in the direction of steepest descent. To assign a flow direction value to a cell, the "distance weighted drop" to each of eight neighboring cells is computed by taking the difference in elevation values and dividing by

for a diagonal cell and one for a non-diagonal cell. The flow direction for a cell is assumed to be in the direction with the highest distance weighted drop. In situations where multiple adjacent cells have equal drops or where all adjacent cells have no drop (flat areas), special considerations are made (Jenson and Domingue, 1988). In ESRI software, the eight possible flow directions are assigned unique numbers based on the following convention ( East = 1; Southeast = 2; South = 4; Southwest = 8; West = 16; Northwest = 32 ; North = 64; Northeast = 128 ). Figure 2.2a shows an example elevation grid, Figure 2.2b shows the flow direction assignment convention, Figure 2.2c shows the numerical values assigned to cells in the flow direction grid, and Figure 2.2d shows the flow directions symbolically with arrows.

tr graphic

Figure 2.2. Assignment of flow directions using the D8 model. (a) elevations, (b) flow direction codes, (c) flow direction grid values, (d) symbolic representation of flow directions.

Using a flow direction grid and weight grid as input, a flow accumulation function is defined which returns, for each cell, the sum of the weights of all cells that flow to that cell. If the weight grid values are all 1, the cell values in the resulting flow accumulation grid are equal to the total number of cells contributing to that cell. A flow accumulation grid and a skeleton network implied by the flow direction grid are shown in Figure 2.3. In this example, all weights are equal to one.

tr graphic

Figure 2.3. (a) Flow accumulation grid and (b) schematic of drainage network.

In areas of low relief where DEMs are often flat, use of the D8 model alone to determine flow directions may result in significant errors. This problem is addressed further in Section 2.4.

2.1.3.3 Drainage Network and Watershed Definition

A simple and practical method for defining drainage networks is to identify channel cells as cells which drain a minimum threshold area. Depending on the choice of threshold value, this method may yield a dense or sparse channel network. In ThreshR, the drainage area threshold is selected to define the size of the watersheds to be studied. No attempt is made to determine the threshold at which there is physical evidence of a transition from hillslope to channel transport mechanisms.

Delineation of watersheds from a DEM requires a flow direction grid and a set of pour-points. Using a GIS, pour-points may be (1) selected interactively with a mouse, (2) selected automatically at the downstream end of each link in the drainage network created using an area threshold (Jenson and Domingue (1988); Maidment and Mizgalewicz (1993)), or (3) identified using a coordinate file. The second method is used to identify points where ThreshR will be computed. The results of a generic procedure for automatically delineating a watershed for each link in a drainage network is illustrated in Figure 2.4. Based on the drainage network of Figure 2.3, Figure 2.4a shows cells tagged as stream cells based on the arbitrary criterion that a stream cell has flow accumulation greater than two. Figure 2.4b is a grid in which each stream link is assigned a unique integer value and Figure 2.5c shows the same grid divided into six subwatersheds. Cells in the same subwatershed are assigned the same grid-code.

tr graphic

Figure 2.4. (a) Stream cells, (b) stream link grid, (c) subwatersheds.

2.1.3.4 Flow Length

Flow length is a useful function available in ArcView and Arc/Info that calculates the length from each cell in a Grid to the nearest outlet point or "NODATA" cell. A flow length grid computed for the DEM shown in Figure 2.2a is shown in Figure 2.5. .

Figure 2.5 Flowlength Grid for the DEM From Figure 2.2a

tr graphic

2.1.4 DEM Pre-Processing And Determination Of Flow Directions for ThreshR

Special DEM pre-processing procedures have been developed to prepare flow direction grids for ThreshR. With a coarse 15 arc-second DEM as input, simple application of the D8 flow direction model often yields undesirable results in flat areas. In flat areas, the locations of digitized streams are used to help define flow directions.

2.1.4.1 Why not use IHABBs flow direction values?

NOHRSC developed a grid of flow directions with a procedure that uses both the D8 model and information from a digitized stream network (an RF1 stream grid). However, this flow direction grid cannot be used here because the grid was developed using geographic coordinates (latitude, longitude). The reason that this is a problem is discussed here.

Measured in ellipsoidal coordinates, the spacing between flow direction values in the IHABBS grid is 15 arc-seconds. The spacing of flow direction values along parallels of latitude in meters depends on the latitude. The spacing of flow direction values along meridians of longitude in meters is relatively constant with only slight variations due to the fact that the earth is shaped more like an ellipsoid than a sphere. To provide some example numbers, assume that the earth is a sphere of radius 6371.2 km. This makes the approximate spacing of 15 arc-seconds along the Equator and along meridians of longitude throughout the globe equal to the earth radius (r) times the arc length in radians or:

y = 6,371,200*15/3600*3.1416/180 = 463 m To get the spacing between flow direction values along parallels of latitude (x), y is multiplied by cos(latitude). A summary of this "lateral" spacing between elevation points for the range of latitudes covering the conterminous United States is given in the Table 2.4 below:

Table 2.4. Meter Equivalent of 15 Arc-second Spacing Along a Parallel of Latitude

Latitude	Approximate Meter Equivalent to 15 arc-second spacing along parallels
25	420 m
30	401 m
35	380 m
40	355 m
45	328
50	298

The implications of this uneven spacing is that gridded data must be resampled when projected into a Cartesian system like conUS, because by definition, a Grid in GIS consists of regular, square cells. When projecting a grid, the first step is to select a cell size for the output space. Based on the information in Table 2.4 and the convenience of using round numbers, a cell size of 400-m has been selected for the ThreshR project.

The resampling procedure proceeds as follows. The required areal coverage for the output grid is determined based on the extent of the input data set, and a set of regular square cells of the selected size is laid out in the output plane. The center-points of the grid cells being projected are transformed into the output plane, and values are assigned to the regular square cells in this plane based on values from the projected grid center-point(s). There are three options for assigning values to each of the square output cells based on nearby projected, center-point values. These options are nearest neighbor, bilinear, and cubic resampling. For categorical(integer) values (e.g. flow direction) the value of the nearest projected center point is assigned to each square output cell. In flow direction data, codes are discrete -- unlike temperature and rainfall grids and there is no correlation between an individual cell's flow direction value and the values in surrounding cells. The nearest neighbor resampling option illustrated in Figure 2.5.

Figure 2.5. Illustration of Nearest Neighbor Resampling (Adapted from the Arc/Info Online Help)

tr graphic

To demonstrate the problem with resampling flow direction values, the IHABBS flow direction grid for ABRFC was projected into the conUS Albers projection using a cell size of 400 m. Projecting and resampling the flow direction cells creates "mirror" cells - cells that flow to one another. An example of a situation in which a mirror cell was created is shown in Figure 2.6. In this figure, the arrows represent the flow directions assigned to cells in the output plane. The small boxes with x's in them represent the location of projected center points from the original grid with 15 arc-second spacing. Five of these center points are labeled with their original flow direction values (East, East, South, West, and South). It can be seen that the reason for the occurrence of mirror cells 2 and 3 is that cell 2 is assigned the value East and cell 3 is assigned the value West even though these two directions were separated by a South cell in the original (lat-lon) grid.

There are too many occurrences of mirror cells to consider identifying and correcting these problems. There are 9,097 instances of mirror cells created in the ABRFC alone. This number might seem high, but it is not high compared with the total number of flow direction cells in the ABRFC grid (4.1 million).

Figure 2.6. Mirror cell example.

tr graphic

2.1.4.2. ThreshR Procedures to Create a Flow Direction Grid in Projected Space

As stated earlier, one problem with the D8 flow direction method is that it is a local method that cannot resolve the flow directions for cells surrounded by other cells of equal elevation -- "flat" areas. Attempts to represent areas of low relief using a digital elevation model (DEM) often results in flat surfaces, even if the true relief exhibits a mild slope. These flat DEM areas may arise due to inadequate vertical resolution. In addition, flat areas are created when artificial pits in the DEM are filled. Steps intended to improve flow direction assignment in flat areas are described here. The steps are designed to be as close to similar procedures used at NOHRSC in developing their flow direction grid.

The NOHRSC flow direction analysis starts with filling artificial pits and applying the D8 flow direction algorithm. Steps that were taken at NOHRSC to modify ("improve") the raw D8 flow direction grid are summarized here. The basic philosophy is to let the digitized streams exert control on flow directions in flat areas where there may not be enough resolution in the DEM to define flow directions.

NOHRSC Flow Direction Modifications (quotations from Internet documentation)

N1: "Route lake[s] and reservoirs to their outlets define[d] either by rivers and streams or by the lowest water edge elevation flowing away from the lake or reservoir,"

N2: "Route rivers and streams from their headwaters to their tail waters. Route cells immediately adjacent to rivers and streams into the rivers and streams,"

N3: "Route flat areas to intersecting streams or the nearest routed cell,"

Steps N1 and N2 can be emulated by "burning" the digitized stream network into a projected and filled DEM and then computing flow directions. The edited version of RF1 created at NOHRSC is very useful for this purpose because the nohrsc_rf1 drainage network contains only single line representations of streams and single line flow paths have been drawn to replace lakes and reservoirs. The stream burn-in procedure involves the following steps: 1. Rasterize the nohrsc_rf1 network, making sure that the newly created stream network cells are the same size as the DEM cells and correctly registered

2. Create a modified DEM by lowering elevation values corresponding to the rasterized nohrsc_rf1 cells relative to all surrounding (non-stream) elevation cells. An easy way to do this is to assign all stream cells the elevation 0.

3. Apply the ArcView-Arc/Info flow direction function to the modified DEM

The effects of the ArcView-Arc/Info stream burn-in procedures that are consistent with NOHRSC modifications N1 and N2 are as follows.

1. All stream cells will be routed from their headwaters to their tailwaters,

2. Because a streamline has been used to represent lakes and reservoirs, water will be routed through these areas in an appropriate manner following the digitized line.

3. Cells immediately adjacent to rivers will be routed to rivers because this will be the direction of steepest descent.

Although the stream burn-in procedure has essentially the same effects as modification steps N1 and N2 performed at NOHRSC, step N3, which deals with flat areas, is more difficult to emulate. In a "flat" area, an appropriate flow direction value cannot be assigned to a cell by looking only at each of its eight immediate neighbors. The location of digitized streams can be used to guide flow direction assignment in flat areas. This is a summary of the scheme used by NOHRSC:

Some large flat areas will contain streams while other flat areas may not. Large flat areas that contain a stream (as defined by the RF1 files) are essentially split into smaller flat areas by the stream. Any stream that splits a flat area becomes a bordering cell for that flat area. Assignment of flow directions for the remainder of flat area cells is done recursively, starting at border cells with defined flow directions (either stream cells or cells of lower elevation) and moving progressively inward. The flat area eventually shrinks to nothing.

This algorithm used in ESRI software (Jenson and Domingue (1988)) is very similar in concept to that used by NOHRSC, but there is a difference in the results achieved. Looking at the computed flow directions, it appears that the assignment of flat area flow directions in Arc/Info is biased towards cells on the edge of flat areas with the lowest elevation (rather than simply the nearest border cell). For example, compare the original flow directions from NOHRSC and the flow directions computed in Arc/Info using the projected and resampled NOHRSC DEM in Figures 2.7a and 2.7b.

Figure 2.7a. NOHRSC Flow Directions in a Flat Area with No Streams

tr graphic

Figure 2.7b Arc/Info Flow Directions in a Flat Area with No Streams

tr graphic

Since assigning flow directions in flat areas of the DEM involves a lot of uncertainty, either the NOHRSC approach or the Arc/Info approach is probably acceptable. However, because the Arc/Info algorithm is particularly sensitive to the elevation values of cells bordering the flat areas, burning in the streams can have a significant effect on the flow in flat areas and give some odd results. As explained above, burning in the streams involves raising the land surface cells relative to the stream cells in some way. It turns out that the elevation amount by which the stream cells differ from the land surface cells influences the way the Arc/Info algorithm assigns flow directions in flat areas (if a stream crosses the flat area in question). For example, in a test in an area where a stream crosses a flat portion of a DEM, the resulting flow directions computed with Arc/Info when lowering the stream cells relative to the land surface by 1 meter, 10 meters, and 100 meters were compared. The results of this test are shown in Figures 2.8a, 2.8b, 2.8c, and 2.8d where 2.8a is the unmodified DEM. The most dramatic result is shown in Figure 2.8d where the stream is drawing all cells in the flat area for which the flow directions cannot be determined using the D8 algorithm. Note that cells near the bottom of Figure 2.8d are drawn down by a stream that is not shown. (Although I do not know the reason for these differences, I'm wondering if it has to do only with the order of computations.)

Figure 2.8a. No burn-in.

tr graphic

Figure 2.8b. 1 m burn-in

tr graphic

Figure 2.8c - 10 m burn-in

tr graphic

Figure 2.8d - 100 m burn-in

tr graphic

From Figures 2.8 a, b, c, and d it appears that choosing a small stream-landsurface differential with which to "burn-in" the streams (Figure 2.8a) creates a more sensible assignment of flow directions in flat area cells. The dilemma is that if a small stream-landsurface differential is introduced, then the delineated streams might not precisely match the digitized streams in non-flat areas. A precise match in non-flat areas may or may not be important. This leads to the point that there are actually two motivations for burning in the streams -- to provide consistency between digitized and delineated streams and to provide added guidance for flow direction assignment in flat areas. Comparing Figures 2.8a and 2.8b shows the advantages of burning in streams in flat areas.

In order to produce flow direction assignments that are similar to those generated at NOHRSC and to eliminate the possibility of flow direction bias due to burning in the streams (Figure 2.8d), programs have been developed to introduce a small elevation gradient in flat areas towards cells which have a defined flow direction. The scheme uses a series of Arc/Info commands and a C program. These programs are only used for pre-processing and will not be required by the end user. This scheme eliminates the ambiguity in assigning flow direction values in flat areas using the D8 algorithm. An illustration of this scheme is given in Figures 2.9-2.11.

Figure 2.9 shows cells for which a flow direction cannot be assigned using the D8 model in white and green (white and medium gray in black and white copies). Black cells in Figure 2.9 are non-flat areas and the continuous string of cells down the middle of Figure 2.9 is a burned stream. Green cells (down-border cells) are distinguished because they have neighbors with D8 flow directions and have no neighbors of higher elevation. Other white cells should flow towards these green cells. Elevation values in flat areas are adjusted slightly to introduce a gradient towards the green cells. This is illustrated in Figure 2.10 where the elevation gradient is from white (slightly higher cells) to black (lowest cells). When flow directions are recomputed using the adjusted elevation grid, the results are fairly consistent with the flow directions derived at NOHRSC (Figure 2.11a and b).

Figure 2.9

tr graphic

Figure 2.10

tr graphic

Figure 2.11a. Flow directions computed with a modified elevation grid

tr graphic

Figure 2.11b. NOHRSC Flow directions

tr graphic

2.1.4.3 A Comparison of Watershed Areas Delineated using IHABBS Flow Directions and ArcView-Arc/Info Flow Directions

Drainage areas delineated using (1) the Arc/Info flowdirection function applied to the projected NOHRSC DEM without modifications, and (2) the Arc/Info flowdirection function applied to a projected and modified DEM (modifications include burning in the streams and introducing an artificial elevation gradient in flat areas) are compared to drainage areas delineated using IHABBS. The study was made for 47 basins in the ABRFC ranging in size from 160 km² - 2000 km². For case 1, the average percent difference between drainage areas computed with Arc/Info and with IHABBS is 1.9%, with a standard deviation of 4.0%, and a maximum difference of 23.7%. For case 2, the average percent difference in drainage areas computed with Arc/Info and IHABBS is 0.6%, with a standard deviation of 0.8% and a maximum difference of 4.1%.

In case 1, the most significant differences in basin boundary definition are caused by differences in how the two software packages split up flow in flat areas near basin divides. In case 2, the extra pre-processing being done for ThreshR eliminates some of these discrepancies.

2.1.4.4 Problems with Grid Cell Resolution

The stream burning procedure that is used in ThreshR pre-processing involves setting all cells that underlie the vector RF1 files to zero. The flow direction values for each of these zero valued cells are assigned using the Arc/Info flow direction function. In this function, the stream cell flow directions are not determined using the direction of steepest descent because there is no direction of steepest descent. The stream cell flow directions are actually determined using an iterative procedure that works its way backwards from the outlet. A problem with setting stream cell values to zero arises when two stream arcs are so close together that upon rasterization a cell in one stream segment has a neighbor which is a cell in another stream segment and these cells are not near a segment junction point. What can happen then is that a cell which neighbors with another stream segment may be assigned a flow direction into its neighboring stream segment rather than to its own stream segment because there are no elevation values to guide the choice. This problem occurs partly because the source data scale for the 15 arc-second DEM (~1:1,000,000) is inconsistent with the source data scale for the RF1 files (~1:500,000). In other words, the cell size is too large relative to the density of the stream lines. Ultimately, the best solution for this problem might be go the to use of a 3 arc-second DEM, but for now work around programs have been developed. At a later time, it may be worthwhile to investigate the tradeoffs between large increases in storage and computation time associated with using a higher resolution DEM and any improvements in results that might be obtained.

There are several possible ways to deal with the DEM cell resolution problem described in the preceding paragraph. One way to minimize this problem is to decrease the cell size by resampling the DEM. This approach is not adopted because it creates a larger data set, implies a greater resolution in elevation data than what is actually available, and the cell size required to eliminate this problem is not obvious. The number of locations at which the adjacent burned stream reach problem is created might be reduced by using only a 1-m stream-drop when burning in the streams rather than setting all stream cells to zero elevation; however, it is not clear thaty this solution would guarantee success in all cases. In addition, it is difficult to define a rational basis for choosing a constant stream-drop value. Small stream-drops will not force the DEM streams to follow the digitized streams. Forcing stream junctions to occur in the locations defined by digitized streams is helpful, something that can be more easily achieved with a large stream drop or setting the stream cell values to zero. By forcing flow into digitized streams, the ThreshR flow directions will be more cosistent with the NOHRSC-derived flow directions.

The approach used to correct the adjacent burned-stream reach problem in the 15 arc-second ThreshR database is to edit the flow direction grids manually where necessary. A set of pre-processing tools has been developed to make this task feasible. These tools automatically identify potential problem areas like that shown in Figure 2.12, and provide an interactive environment for editing the flowdirections near these problem areas as deemed necessary. The number of potential problem areas in ABRFC and SERFC was about 70 each. Only a fraction of these points actually required editing and the time required for this task is about 1-2 hours. Instructions for implementing the flow direction pre-processing procedures have been prepared but are not presented here because the end user is not required to implement these procedures.

tr graphic

Figure 2.12 Example of a situation where the rasterized version of two distinct creeks contain neighboring cells.

Pre-processing. To develop a grid of flow directions in projected space, the following procedures

Convert the NOHRSC RF1 files, the 15 arc-second DEM, and the slope grid into Arc/Info format.

Project the data into the conUS projection deefined in the Technical Memo.

Buffer an RFC basin outline obtained from the NOHRSC web site and use this to clip out only the data that is really needed for analysis.

Convert the NOHRSC RF1 files to a grid and "burn-in" the streams.

Identify flat zones in the DEM.

Modify the burned DEM by establishing an elevation gradient in the flat zones towards cells of lower elevation.

Fill the modified DEM.

Re-compute flow direction from the burned DEM.

A difficulty that arises when burning-in the streams is that the scale of the RF1 source data is inconsistent with the 15 arc-second (400 m) digital elevation model cells being used for this analysis. Digitized streamlines may be within 800 m of one another even far upstream from stream junction points. This means that when the stream network is rasterized, there may be locations where cells from different stream segments neighbor one another. Subsequently, when streams are "burned-in" to the DEM, crossover of flow to the wrong stream segment can potentially occur at locations far removed from the actual stream junctions. After recent discussions with Dennis J. it seems that the best solution for this problem is to identify potential problem sites are identified with an automatic procedure developed in ArcView, but to manually edit the flow direction grids in problem areas. The need to deal with this problem would be eliminated if 3 arc-second digital elevation data were used.

Semi-automated procedures have been developed for the above pre-processing tasks. Running the pre-processing procedures is describing in Section III of this document.

2.1.5 Release notes for peak flow data

Iowa: The stream order Grid ("stral_ia") in the ThreshR database is incorrect. As a temporary solution, baseline runs for Q2 are made using the regression equations from Carpenter and Georgakakos, 1993 (p. 20) is used. A linefile of first order streams in Iowa is needed.
Colorado: The equations for the Eastern Colorado Plains drainage areas less than 20 mi2 have not yet been included. For all size basins in the Eastern Colorado Plains, there are not regression equations given in USGS WRIR 94-4002 for Q2.
Texas: In the ThreshR database, the boundaries of regions 3 and 4 on the Texas regions map were extended to include the playa area in the panhandle where regression equations do not apply. Thus, baseline calculations in the playa region of the Texas panhandle are unreliable.
Nebraska: Using the database available, no distinction can be made between total drainage area and contributing drainage area.

Main Link Categories:
Home | OHD


US Department of Commerce National Oceanic and Atmospheric Administration National Weather Service Office of Hydrologic Development 1325 East West Highway Silver Spring, MD 20910 Page Author: OHD webmaster Page last modified: October 14, 2008	Disclaimer Credits Glossary	Privacy Policy About Us Career Opportunities