Coastal Services Center

National Oceanic and Atmospheric Administration


Technical FAQs



How is land cover information derived from aerial and satellite imagery?

Every surface reflects, absorbs, or transmits incident light, and together these three fractions account for 100 percent of it. Different materials reflect and absorb different amounts and wavelengths of light along the electromagnetic spectrum (EMS). This is the basis for identifying surface components from remote sensing.

High-resolution analog aerial photography can be used to delineate geographic themes of information, such as wetlands. The National Wetlands Inventory (NWI) Program, for example, uses aerial photography to delineate wetlands in the landscape by manual interpretation and delineation.

Digital sensors, such as most satellite-based sensors, can collect multiple wavelengths of light in regions of the EMS not possible with an analog photographic medium. It is possible to manipulate and statistically analyze these wavelengths of light to determine unique characteristics of the landscape and ground surface. These characteristics can be turned into information such as land cover.

Digital raster images are analogous to spreadsheets: they are grids of cells filled with observations (in this case, digital values representing the intensity of light reflected from surface materials). Therefore, it is possible to statistically analyze digital imagery to determine land cover. The reflective characteristics of each cell are compared and grouped together to form spectral signatures. In a perfect world, each spectral signature would represent a unique landscape component. This is rarely the case; therefore, many processes and applications have been designed to extract information from digital imagery.
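As a rough illustration of this idea, the sketch below (not C-CAP's production software) treats a multispectral image as a table of per-pixel observations and groups the pixels into spectral clusters, or "signatures," using a generic clustering routine. The array names, band count, and use of scikit-learn are assumptions for the example only.

```python
import numpy as np
from sklearn.cluster import KMeans

rows, cols, n_bands = 200, 200, 6          # e.g., six reflective TM bands (hypothetical)
rng = np.random.default_rng(0)
image = rng.random((rows, cols, n_bands))  # stand-in for real imagery

# Reshape the raster into a spreadsheet-like table: one row per pixel,
# one column per band (digital values of reflected light).
pixels = image.reshape(-1, n_bands)

# Cluster the pixel observations; each cluster center acts as a spectral signature.
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0).fit(pixels)
signatures = kmeans.cluster_centers_               # 10 signatures x 6 bands
cluster_map = kmeans.labels_.reshape(rows, cols)   # which signature each pixel resembles
```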

Each satellite scene is individually classified, focusing first on separating major categories (e.g., water, forest, marsh, herbaceous upland, and developed) using standard supervised and unsupervised classification techniques. Fieldwork at this stage of classification is critical to orient the analyst to the environment and to provide numerous individual areas as training sites for the land cover classification. The mean and covariance statistics for these training areas are passed to an ISODATA classification algorithm, which assigns every unknown pixel to the class in which it has the highest probability of membership. Iterative unsupervised classifications are then performed on each major category individually by masking out all other major categories. By masking out all data but a single major category, the spectral variance is greatly reduced, thus decreasing classification errors. After several classification iterations of the masked data, final classification labels are assigned to the spectral clusters. Changes among major categories are permitted to occur even at this stage of processing. Subsequent fieldwork and the use of collateral data such as U.S. Geological Survey (USGS) maps, Topologically Integrated Geographic Encoding and Referencing system (TIGER) road data, and National Wetlands Inventory data lead to further refinements in the image classification. In small areas where confused land cover classes cannot be separated spectrally, human pattern recognition must be used to recode the data.
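A simplified sketch of the supervised step is shown below: mean and covariance statistics are computed from training sites, and every pixel is assigned to the class for which it has the highest (Gaussian) likelihood. This illustrates the idea of assigning pixels by highest probability of membership; it is not the operational ISODATA implementation, and the function and variable names are hypothetical. The subsequent masked, unsupervised refinement would then be run on the pixels of each major category separately.

```python
import numpy as np
from scipy.stats import multivariate_normal

def train_statistics(pixels, labels):
    """pixels: (n, bands) training samples; labels: (n,) class codes."""
    stats = {}
    for cls in np.unique(labels):
        samples = pixels[labels == cls]
        stats[cls] = (samples.mean(axis=0), np.cov(samples, rowvar=False))
    return stats

def classify(image, stats):
    """image: (rows, cols, bands). Returns a (rows, cols) array of class codes."""
    rows, cols, bands = image.shape
    flat = image.reshape(-1, bands)
    classes = sorted(stats)
    # Log-likelihood of each pixel under each class's Gaussian model.
    scores = np.column_stack([
        multivariate_normal(mean=m, cov=c, allow_singular=True).logpdf(flat)
        for m, c in (stats[cls] for cls in classes)
    ])
    best = np.argmax(scores, axis=1)
    return np.asarray(classes)[best].reshape(rows, cols)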

For more information on Remote Sensing theory, background, and applications see the following Web sites:



How is landscape change analysis performed with satellite imagery?

Sun-synchronous satellites pass over the same point on the face of the Earth at the same local time on a regular revisit cycle. Multiple images of the same location can be compared to each other to statistically extract changes.

When images are analyzed over seasons, it is possible to examine the phenology of the landscape. When anniversary dates of imagery are examined, human and natural landscape changes can be extracted.

Change detection is usually accomplished by subtracting the images or performing correlation analysis to determine differences. Digital images are roughly analogous to spreadsheets. The individual cells contain numbers which can be subtracted, divided, or otherwise manipulated. By subtracting two images, the areas of change can be highlighted. It should be noted that there are several ways of highlighting areas of change, or creating change masks, ranging from very simple to very complex processes involving correlation analysis.

Once the analyst has a baseline land cover analysis for the most recent time period, only the pixels in the previous time period that have changed spectrally need to be reclassified. The assumption is that anything that did not change significantly in spectral terms did not change on the landscape. These pixels (representing no change) simply receive the classification already produced for the baseline time period. This means that the development of an accurate change mask is critical for an accurate change detection analysis.
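The sketch below illustrates how a change mask limits reclassification: pixels flagged as spectrally unchanged keep the class from the baseline classification, and only changed pixels receive a new class for the earlier date. The input arrays are hypothetical stand-ins, not C-CAP data products.

```python
import numpy as np

def apply_change_mask(baseline_classes, changed_mask, classes_for_changed):
    """
    baseline_classes   : (rows, cols) classification for the most recent date
    changed_mask       : (rows, cols) boolean, True where spectral change was detected
    classes_for_changed: (rows, cols) classification of the earlier-date imagery
                         (only meaningful where changed_mask is True)
    """
    earlier_classes = baseline_classes.copy()                      # start from the baseline
    earlier_classes[changed_mask] = classes_for_changed[changed_mask]
    return earlier_classes
```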

Digital image change analysis is limited to analyzing time slices (discrete observations in time). Therefore, we can only know how a pixel was classified from one image to the next, and can only make inferences on the processes at work in the middle. However, many of the processes are readily apparent in the field, such as the transformation of forest and agriculture to development, and logging. Because we are analyzing discrete time slices, landscape change is reported as from-to information in a change matrix which is set up with "From" and "To" designations; each pixel has a "from" value (forest) and a "to" value (development). Just as in double entry accounting, each debit is a credit and each credit is a debit somewhere. It is important to understand this relationship when attempting to make sense of change information.

It is possible to identify the amount of change between two images by image differencing the same band, so long as the two images have previously been rectified to a common base map and normalized. Image differencing involves subtracting the imagery of one date from that of another. The subtraction results in positive and negative values in areas of radiance change and zero values in areas of no change in a new change image. The images are subtracted, resulting in a signed 16-bit analysis with pixel values ranging from -255 to 255. The results are transformed into positive unsigned 16-bit values by adding a constant, C (usually 255). The operation is expressed mathematically as

D_ijk = BV_ijk(1) - BV_ijk(2) + C

where:

D_ijk = change pixel value

BV_ijk(1) = brightness value at time 1

BV_ijk(2) = brightness value at time 2

C = a constant (e.g., 255)

i = row number

j = column number

k = a single band (e.g., Near Infrared, NDVI, Principal Components Analysis)

The change image produced using image differencing usually yields a brightness value distribution that is approximately Gaussian in nature, where pixels of no brightness value change are distributed around the mean and pixels of change are found in the tails of the distribution. A threshold value is carefully chosen to separate spectral "change" and "no-change" pixels in the change image. A "change/no-change" mask is derived by performing image differencing on TM band 4 (Near Infrared), the Normalized Difference Vegetation Index (NDVI), a Principal Components Analysis (PCA), or a combination of the three on the two dates of imagery, and recoding the result into a binary mask file. The change/no-change mask is then overlaid onto the earlier date of imagery, and only those pixels that are detected as having spectrally changed are viewed as candidate pixels for categorical change. These candidate pixels are classified in the same manner as the baseline land cover analysis for consistency.
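The following is a minimal sketch of the differencing equation above combined with a simple symmetric threshold around the mean of the difference image. The plus-or-minus two standard deviation threshold is only an illustrative choice; in practice the threshold is chosen carefully by the analyst, and the function names are hypothetical.

```python
import numpy as np

def change_mask(band_t1, band_t2, c=255, n_std=2.0):
    """band_t1, band_t2: the same band (e.g., TM band 4 or NDVI) from two dates,
    already co-registered and normalized, as 8-bit arrays."""
    diff = band_t1.astype(np.int16) - band_t2.astype(np.int16) + c   # D = BV1 - BV2 + C
    mean, std = diff.mean(), diff.std()
    # Pixels far from the mean (in either tail) are flagged as spectral change.
    return (diff < mean - n_std * std) | (diff > mean + n_std * std)
```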

The two land cover classifications from different time periods are compared on a pixel by pixel basis using a change detection matrix. This traditional post-classification comparison yields "from"-land-cover-class and "to"-land-cover-class change information. Many pixels with sufficient change to be included in the mask of candidate pixels in the spectral change process did not qualify as categorical land cover change. This method may reduce change detection errors (omission and commission) and provides detailed "from-to" change class information. The technique reduces effort by allowing analysts to focus on the small amount of area that has changed between dates.

A Change Matrix is an N x N table of all possible landscape changes, where N is the number of thematic land cover classes in a study. For example, once a 15-class land cover baseline classification and a 15-class change classification are produced, they can be compared in a 225-cell table (15 x 15) showing all the possible changes from each individual land cover category to any other category. The table presents tabular results that quantify change between classes, such as Evergreen Forest being cleared and becoming Bare Land. These tables show the exact amount of change, what the pixel was, and what it became. The matrix makes it possible to see the entire change analysis in one view, and it is a very powerful tool for examining landscape change.
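A change matrix of this kind can be built by cross-tabulating the two classified images pixel by pixel, as in the sketch below. Class codes are assumed to run from 1 to N; the function name is hypothetical. Entry [i, j] then counts pixels that were class i+1 in the earlier date and class j+1 in the later date (e.g., Evergreen Forest to Bare Land).

```python
import numpy as np

def change_matrix(from_classes, to_classes, n_classes):
    """Cross-tabulate two classified images into an (n_classes, n_classes) from-to matrix."""
    idx = (from_classes.ravel() - 1) * n_classes + (to_classes.ravel() - 1)
    counts = np.bincount(idx, minlength=n_classes * n_classes)
    return counts.reshape(n_classes, n_classes)
```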



What does the term geographic scale mean in relation to ground resolution?

Large geographic scale means small area and high detail.

Small geographic scale means large area and low detail.

Scale in landscape ecology is referred to as "grain size," a term used to describe the size of discrete habitat fragments in the landscape.

Scale to a cartographer describes the relationship between the distance on the map and the distance on the ground.

Scale in a vector context refers to the amount of detail you retain, or the error you are willing to tolerate, in your data. It also refers to the size of objects at a given spatial coverage. For instance, National Map Accuracy Standards for 1:24,000-scale mapping mandate that 90 percent of the features in the spatial coverage be within 14 meters of their true location on the face of the Earth. At 1:24,000 scale, a 0.5-millimeter line (a fine pencil width) covers about 12.5 meters on the ground. Therefore, the smallest object you can resolve on a 1:24,000-scale map is roughly 14 meters across.

Resolution in a raster context refers to the smallest unit of area covered by a single pixel. Therefore, a 30-meter pixel would cover an area of 30 meters x 30 meters (900 square meters) on the ground. The smallest observable feature in a raster takes 4 contiguous pixels to be reliably identified. This is known as the Nyquist Frequency. To determine the appropriate resolution for your applications, you must determine the smallest feature you want to resolve, and the pixel size must be half the smallest dimension of the feature in question. For instance if you want to find a car (10 x 6 feet), then your pixel size must be 0.5 x 6 feet = 3 feet to reliably identify cars in a raster context.
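The rule of thumb above (pixel size equals half the smallest dimension of the smallest feature you need to resolve) can be expressed as a small worked example; the function name is hypothetical.

```python
def required_pixel_size(feature_length, feature_width):
    """Return the maximum pixel size that can reliably resolve the feature."""
    return 0.5 * min(feature_length, feature_width)

print(required_pixel_size(10, 6))   # a 10 x 6 ft car -> 3.0 ft pixels
```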

In a raster context, linear features, such as roads, can be extracted at or about the base resolution of the image. So, the scale of linear features extracted from raster imagery is approximately equal to the resolution. However, to extract polygonal features (areas) you need to base your error tolerances on a minimum of 4 pixels to reliably identify features. Therefore, the scale of area features is approximately twice the resolution of the imagery. For example, to delineate a road feature from a 30-meter image, you can extract a line representing the road with 30-meter accuracy, or approximately 1:50,000 scale. If you want to delineate wetland polygons in the landscape, your smallest reliable polygon must be 4 pixels (or 60-meter accuracy), which is approximately 1:100,000 scale.

Scale and resolution are related, but not directly so. Therefore, it is most common to separate the two terms and refer to scale in a vector context and resolution in a raster context.

For more information on scale as it relates to geographic information systems (GIS) and remote sensing, see Quattrochi and Goodchild, 1997, Scale in Remote Sensing and GIS, Lewis Publishers, Washington D.C.



What does accuracy mean in GIS and remote sensing?

Spatial Accuracy is the measure of how well features in a geographically registered layer correspond to their positions on the Earth's surface. This is usually measured by a Root Mean Squared Error (RMSE). For C-CAP applications, an RMSE of one pixel is expected, but often smaller errors are reached. It is possible to reach spatial accuracies of less than one-pixel width RMSE under certain conditions (flat terrain and properly modeled sensor distortion).
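As a small sketch of how such a figure is obtained, the function below computes RMSE from the residuals between registered and surveyed positions of ground control points. The coordinate arrays are hypothetical; units are meters, so with 30-meter pixels an RMSE below 30 meters would meet the one-pixel expectation.

```python
import numpy as np

def rmse(predicted_xy, true_xy):
    """predicted_xy, true_xy: (n, 2) arrays of registered vs. surveyed positions."""
    residuals = np.linalg.norm(predicted_xy - true_xy, axis=1)   # per-point offsets
    return np.sqrt(np.mean(residuals ** 2))
```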

Thematic Accuracy is a measure of how well the attributes (assigned by the analyst) match up to their real-world features. For instance, is a coniferous forest correctly labeled as an Evergreen Forest?

Attribute accuracy is a measure of the probability that the land cover type for any given polygon is properly identified according to the land cover scheme. For example, if a substantial polygon of "High Intensity Developed" land is identified as "Deciduous Woody Wetland," that is a clear instance of categorical error. If 15 percent of all sample polygons for this class are misclassified to "Deciduous Woody Wetland" and other categories, the categorical accuracy for the "High Intensity Developed" class is 85 percent. The remote sensing literature is replete with procedures for measuring attribute accuracy, such as Congalton's 1991 paper in volume 37 of Remote Sensing of Environment, "A review of assessing the accuracy of classifications of remotely sensed data." Generally, these procedures serve well for current time periods and for relatively small study areas. Past time periods, however, cannot be field verified. Conventional procedures also are difficult to apply to large areas. Accuracy assessments of change databases of great size are currently infeasible due to the combination of past time periods, large areas, and the large number of "from" and "to" classes.

Tests for logical consistency should indicate that all row and column positions in the selected latitude/longitude window contain data. Conversion, integration, and registration with vector files should indicate that all positions are consistent with Earth coordinates. Attribute files must be logically consistent. For example, when examining the change matrix for logical consistency, very few pixels should change from urban to any other category, or from water to any category other than bare ground or marsh. The range of appropriate tests is left to the judgment and experience of regional analysts. All attribute classes should be mutually exclusive.

Fitness for Use is very different from a statistical accuracy assessment. The concept behind the process is to take the data to the field and see if it performs suitably to interpret and navigate the landscape. If problems are evident in the field, more work is required. Often, problems can be missed entirely or counted as acceptable inaccuracies in the data during a statistical analysis. However, if the data perform in the field, they are more likely to perform to standards in the accuracy assessment application.

Statistical Accuracy is the reported results of thematic accuracy for a given project. It is important to know how much error is inherent in the data. Therefore, we must sample the data against "ground truth." From these samples, we can derive statistical estimates of error.

Contingency matrices are developed to graphically display the results of the fieldwork. A contingency matrix is an N x N matrix of "observed" and "classified" cells corresponding to N land cover classes. The matrix depicts the land cover classification category versus the field-observed land cover type. The diagonal cells indicate correct observations, meaning that the observations were classified correctly according to the field observations. Any observation off the diagonal indicates a misclassified accuracy control point.

Overall Accuracy is the number of correct observations divided by the total number of observations. This is a very crude measure of accuracy.

User's Accuracy is a measure of how well the classification performed in the field by category (rows). The user's accuracy details errors of commission. An error of commission results when a pixel is committed to an incorrect class.

Producer's Accuracy is a measure of how accurately the analyst classified the image data by category (columns). The producer's accuracy details the errors of omission. An error of omission results when a pixel is incorrectly classified into another category. The pixel is omitted from its correct class.

Kappa Coefficient is a discrete multivariate technique for interpreting the results of a contingency matrix. The Kappa statistic incorporates the off-diagonal observations of the rows and columns, as well as the diagonal, to give a more robust assessment of accuracy than the overall accuracy measure. It is computed from the diagonal total, the marginal row and column totals, and the total number of observations, as shown in the formula below (a short computational sketch follows the formula).

Kappa Coefficient =

N * S(i = 1 to r) x_ii - S(i = 1 to r) (x_i+ * x_+i)
-----------------------------------------------------
N^2 - S(i = 1 to r) (x_i+ * x_+i)

where:

r = number of rows/columns (land cover classes)

x_i+ = marginal total for row i

x_ii = total in row i and column i (the diagonal)

x_+i = marginal total for column i

N = total number of observations
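The sketch below computes the four accuracy measures discussed above from an r x r contingency matrix whose rows are classified categories and whose columns are field-observed categories; the kappa expression follows the formula above. The function name is hypothetical.

```python
import numpy as np

def accuracy_measures(matrix):
    matrix = np.asarray(matrix, dtype=float)
    n = matrix.sum()                       # N: total observations
    diag = np.diag(matrix)                 # x_ii: correctly classified counts
    row_tot = matrix.sum(axis=1)           # x_i+: marginal totals for rows
    col_tot = matrix.sum(axis=0)           # x_+i: marginal totals for columns

    overall = diag.sum() / n
    users = diag / row_tot                 # per class; reflects errors of commission
    producers = diag / col_tot             # per class; reflects errors of omission
    kappa = (n * diag.sum() - (row_tot * col_tot).sum()) / \
            (n ** 2 - (row_tot * col_tot).sum())
    return overall, users, producers, kappa
```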

Adaptive Sampling is a relatively new concept in land cover accuracy assessment. The basic concept is to model the likelihood of a pixel being correctly identified. This type of analysis calculates the sample size based on the variability of the entire thematic class distribution. This area of statistical analysis has potential in the coming years to approach a statistically significant measure of land cover accuracy.



How is fieldwork used to improve thematic classification?

There is no such thing as armchair geography. Reality is the best test for land cover accuracy.

Fieldwork gives the analyst a critical view and feel for the landscape in a holistic sense that is not possible from aerial photography and other traditional methods of accuracy assessment. The C-CAP approach to fieldwork provides two means of assessing accuracy: statistical accuracy and fitness for use.

In traditional methods of field verification, a paper map was used, and a random point was plotted, navigated to, and observed. Under this procedure, the observation of 30 sites per day per team was considered a strong success. Another option is a "windshield survey" approach. This is a biased system in which observers simply note what they can observe from transportation routes. This method provides for more observations, but without statistically valid random sampling. Both systems have their individual advantages and problems.

The C-CAP method combines stratified random samples, buffered by Topologically Integrated Geographic Encoding and Referencing system (TIGER) road ancillary data, with a windshield survey approach. This hybrid method is used to determine the accuracy trends in the data. Using Global Positioning System (GPS) receivers, a database graphical user interface (GUI), and laptop computers, C-CAP field teams can reach 300 or more sites per day. The technology includes

  • Laptop computers
  • Real-time GPS receiver interface software with database applications
  • Computer-based real-time fieldwork database entry and manipulation
  • Georeferenced digital satellite imagery and classified land cover analysis imagery
  • GIS ancillary data, such as roads, other land cover analyses, and digital elevation models

In preparation for field accuracy data collection, TIGER roads are acquired, registered to the digital imagery, and mosaicked. A 300-meter buffer of the land cover imagery is generated based upon the TIGER roads. Random samples are collected from the buffered land cover analysis image in 3 x 3 pixel windows, stratified by land cover classification category (a sketch of this buffering and sampling step appears after the list below). The random samples are then used to create a database with a graphical user interface. Field teams navigate to field sites by GPS, guided by georeferenced TIGER roads, land cover images, and Thematic Mapper (TM) images. Observations are recorded on the laptop for later manipulation. The items that are noted in the field include

  • Canopy cover
  • Vegetation types by species, where applicable
  • Land cover characterization
  • Soils
  • Special conditions and remarks
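The sketch below illustrates the buffering and stratified sampling step described above: a 300-meter buffer is built around rasterized roads, and random 3 x 3 sample windows are drawn per land cover class inside the buffer. The inputs (a rasterized road mask and a classified image) are hypothetical stand-ins, not C-CAP's actual data files, and the pixel size and sample count are illustrative defaults.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def stratified_samples(land_cover, road_mask, pixel_size=30.0,
                       buffer_m=300.0, per_class=20, half=1, seed=0):
    rng = np.random.default_rng(seed)
    # Distance (in meters) from every pixel to the nearest road pixel.
    dist = distance_transform_edt(~road_mask) * pixel_size
    near_road = dist <= buffer_m

    samples = {}
    nrows, ncols = land_cover.shape
    for cls in np.unique(land_cover):
        r, c = np.where((land_cover == cls) & near_road)
        # Keep centers far enough from the edge to hold a full 3 x 3 window.
        keep = (r >= half) & (r < nrows - half) & (c >= half) & (c < ncols - half)
        r, c = r[keep], c[keep]
        if r.size:
            pick = rng.choice(r.size, size=min(per_class, r.size), replace=False)
            samples[int(cls)] = list(zip(r[pick].tolist(), c[pick].tolist()))
    return samples
```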

Once these data are collected, the results are used to refine the initial classification into a usable and accurate land cover inventory. The database is preserved with the field attributes and is suitable for many spreadsheet and GIS software packages. Once all refinements are complete, a final accuracy assessment is performed to determine the statistical accuracy.

Contingency matrices are developed from these field observations, and four statistical measures of accuracy can be derived:

  • Producer's Accuracy
  • User's Accuracy
  • Overall Accuracy
  • Kappa Coefficient



Where can I learn more about datums and projections?

Geodesy deals with the measurement and representation of the Earth with respect to its gravity field and dynamic geophysical characteristics (polar motion, Earth tides, and crustal motion) in three-dimensional space over time. It is primarily concerned with the Earth's gravity field and geometric shape, and with how both change over time.

A Geoid is an idealized equipotential surface of the Earth's gravity field that approximates the true shape of the Earth without accounting for topographic features. The geoid is too complicated to serve as the computational surface for calculating three-dimensional positions.

The Ellipsoid is a simplified representation of the geoid for calculating three-dimensional position.

A Coordinate System is a reference system for defining points in space, or on planes and surfaces, in relation to designated axes, planes, or surfaces.

A Datum is a mathematical model of the imperfect sphere that represents the Earth's surface, typically with the Earth's center of mass as the origin.

A Projection is a systematic framework for transferring surface coordinates, based upon a given datum, onto a flat grid or map.
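As a small illustration of a datum and projection working together, the sketch below uses the pyproj library (an assumption; it is not mentioned in this FAQ) to project geographic NAD83 coordinates onto a UTM zone 18N grid.

```python
from pyproj import Transformer

# EPSG:4269 = geographic coordinates on the NAD83 datum
# EPSG:26918 = NAD83 / UTM zone 18N (a projected coordinate system)
transformer = Transformer.from_crs("EPSG:4269", "EPSG:26918", always_xy=True)

lon, lat = -75.0, 38.5                      # a point along the mid-Atlantic coast
easting, northing = transformer.transform(lon, lat)
print(round(easting, 1), round(northing, 1))
```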

For more information see:
