EXISTING EXPOSURE DATA SOURCES

Existing exposure data sources for use in highway safety analysis are described in this section. The following exposure data sources are included:

A description of each data source has been prepared for a data catalog. The objective of the catalog is to provide the highway safety researcher with sufficient information to assess the feasibility (considering time, level of effort, and cost constraints) of using the exposure data source in designing a highway safety evaluation study. The descriptions contain the following information, as applicable:

Highway Performance Monitoring System (HPMS)

Contents

The Highway Performance Monitoring System (HPMS) is a nationwide inventory system that includes all of the Nation's public road mileage. The primary purpose of the HPMS is to serve the data and information needs of the FHWA and Congress. The HPMS assesses the system length, use, condition, performance, and operating characteristics of the highway infrastructure.

The HPMS was initiated in 1978 to consolidate and streamline the States' data collection efforts and reporting requirements. In keeping with FHWA's mandate to provide information, the HPMS is periodically reassessed and modified to collect data relevant to emerging issues. For example, collection of pavement information was added to the HPMS in 1987, and the system was modified again in 1993 to respond to the need to monitor travel for clean air purposes. The HPMS also changes with advances in technology. In 1993, States were required to submit a linear referencing system for their road systems. Thus, the structure of the HPMS changes over time as data items are added and dropped in response to current information needs.

The HPMS organization, guidance, and analyses are the responsibility of the FHWA. Data reporting for the HPMS is accomplished by the State highway agencies in cooperation with local governmental units and metropolitan planning agencies.

The HPMS report submitted annually by each State consists of:

Areawide Data. The areawide data consist of statewide summaries: totals for mileage, travel, accidents, local system data, land area, population, and travel activity by vehicle type. This information is reported for rural, total small urban, and individual urbanized areas.

Universe Data. Universe data refers to a limited set of data items reported for the entire public roads system as individual sections or grouped length records. The public roads system includes those roads owned by the State highway agency, local governments, and Federal agencies. These data contain a complete inventory of mileage classified by system, jurisdiction, and selected operational characteristics.

Standard Sample Data. The standard sample data include specific inventory, condition, and operational data obtained for the sample panels of highway sections. These data can be expanded to represent the universe of highway mileage.

The data cover:

"Donut" Sample Data. "Donut" data requirements were added to the HPMS in 1993 in response to a need of the Environmental Protection Agency (EPA). The "donut" sample is a supplementary sample of highway panels from the nonurbanized portion (donut area) of National Ambient Air Quality Standards (NAAQS) nonattainment areas. This additional sampling is required to serve EPA's Section 187 Travel Tracking and Forecasting Procedures for the NAAQS non-attainment areas.

The data items are a subset of the data items provided for the standard sample and include identifiers, AADT, and expansion factors.

Linear Referencing System. A linear referencing system (LRS) was added to the HPMS for the 1993 report. These data will enhance the HPMS with Geographic Information System (GIS) capabilities. The data consist of node data file, inventory route and link data files, and inventory route and node maps for the principal arterial system/national highway system (PAS/NHS), and the rural minor arterial system.

Samples

Standard Sample. The HPMS universe consists of all public highways or roads within a State with the exception of roads functionally classified as local. The reporting strata for the HPMS include type of area (rural, small urban, and individual or collective urbanized areas) and functional class (in rural areas, these are Interstate, other principal arterial, minor arterial, major collector, and minor collector; in urban areas, these are Interstate, other freeway or expressway, other principal arterial, minor arterial, and collector). A third level of stratification based on volume was added in 1987 as a statistical device to reduce sample size and to ensure inclusion of the higher volume sections in the sample.

The HPMS sampling element is defined on the basis of road segment, which includes both directions of travel and all travel lanes within the section. The HPMS standard sample design is a stratified simple random sample.

Donut Area Sample. The donut area sampling universe consists of all highway sections functionally classified as rural minor arterial and major collector, and small urban minor arterial and collector that are located within the defined nonattainment boundary and outside of all urbanized area boundaries. This typically forms an annular spatial area and is, therefore, called a "donut."

The donut universe is stratified into two functional systems (the minor arterial and collector) and a limited number of volume-group strata. The sample is a stratified simple random sample.
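As a rough illustration of how a stratified sample of this kind can be expanded to its universe, the sketch below computes a per-stratum expansion factor and uses it to estimate daily vehicle-miles traveled (DVMT). It assumes that the expansion factor is the ratio of universe length to sampled length within a stratum. The function names, field names, and numbers are hypothetical and are not taken from the HPMS Field Manual.

```python
def expansion_factor(universe_miles, sample_miles):
    """Ratio of universe length to sampled length in one stratum (assumed definition)."""
    return universe_miles / sample_miles


def estimate_dvmt(strata):
    """Expand sampled DVMT (length x AADT) to the universe, stratum by stratum.

    Each stratum is a dict with hypothetical keys:
      "universe_miles": total length of all sections in the stratum
      "sections": list of (length_miles, aadt) pairs for the sampled sections
    """
    total = 0.0
    for stratum in strata:
        sample_miles = sum(length for length, _ in stratum["sections"])
        sample_dvmt = sum(length * aadt for length, aadt in stratum["sections"])
        factor = expansion_factor(stratum["universe_miles"], sample_miles)
        total += factor * sample_dvmt
    return total


# Illustrative numbers only.
example_strata = [
    {"universe_miles": 100.0, "sections": [(2.0, 5000), (3.0, 7000)]},
    {"universe_miles": 40.0, "sections": [(4.0, 1000)]},
]
print(estimate_dvmt(example_strata))  # 660000.0
```

Keeping the expansion within each stratum, rather than pooling all sampled sections, reflects the stratified design: each volume group is weighted by its own share of the universe.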

Data Quality

Generally, the quality of data is good. There is some variation in quality of the HPMS reports across the States. Since these data are required by the Federal Government and used for developing national policy and determining the funding of highways, the States comply.

The frequency of missing data is very low. However, whenever there is a change in the HPMS, such as the addition of the donut area information in 1993, there are some problems with the new data from some of the States. Typically, such problems are resolved by the second year of the requirement.

Coverage

FHWA has all the HPMS data from 1978 to the present. Individual States generally will have only their most recent few years.

The national universe data for 1 year contain about 3.25 million records. They are stored on tapes. Records go back to 1980.

The total national standard sample contains approximately 115,000 records per year. Again, these data are stored on tape. Records go back to 1978.

The areawide data for each State are submitted on a series of templates. At first, there were five templates that were submitted on paper. Later, spreadsheet templates were allowed. In 1993, the number of templates was increased to seven and spreadsheet templates (Lotus 1-2-3) were mandated.

Annually, FHWA transfers these records to a mainframe file and stores them on tape. One format was used until 1992. A new format (basically an ASCII file) was instituted in 1993.

The first submissions of the donut sample and linear referencing systems were required in 1994. There are no archives of them at this time.

Measurement

The key variable in the sampling design of the HPMS is AADT. AADT is not directly measured (except for a very small number of continuous permanent counting stations in each State), but is either derived from short counts, factored from previous counts, or estimated in some other manner.

States are asked to maintain at least one automatic traffic recorder (ATR) on each route of the PAS/NHS and a minimum of three on both the rural and urban portions of the non-PAS/NHS highways. These are used to develop day of week and seasonal factors used for expansion of short counts to AADT.

Typically, volumes at the ATRs are measured with pavement loops. Pavement loops are prone to failure, especially in northern climates and from construction vehicles. However, failures at ATR stations are supposed to be repaired as soon as possible. Recently, other more reliable technologies have been introduced.

The HPMS methodology requires that traffic counts of at least 24 h be conducted on one-third of the road sections in the standard sample each year. These counts typically are taken with pneumatic tube-type portable counters. These are reliable and, if a problem is suspected, the count can be easily repeated. The vehicle volume is derived from these counts by adjusting for the number of multi-axle vehicles in the traffic flow.

The AADT for these sections is then calculated from the short period volumes, with the application of adjustment factors developed from volumes at the ATRs.
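The short-count procedure described above can be sketched as follows. This is a minimal illustration, not the HPMS procedure itself: it assumes the portable counter reports axle hits, that the axle adjustment is a division by the average number of axles per vehicle, and that the seasonal and day-of-week factors are multiplicative ratios of AADT to the average daily volume for the month and weekday of the count. All names and numbers are hypothetical.

```python
def axle_corrected_volume(axle_hits, avg_axles_per_vehicle):
    """Convert axle hits from a tube counter into a vehicle volume (assumed form)."""
    return axle_hits / avg_axles_per_vehicle


def estimate_aadt(daily_volume, seasonal_factor, day_of_week_factor):
    """Scale a 24-h vehicle volume to AADT using ATR-derived factors (assumed multiplicative)."""
    return daily_volume * seasonal_factor * day_of_week_factor


# Illustrative numbers only: a 24-h tube count taken on a summer weekday.
vehicles = axle_corrected_volume(axle_hits=9240, avg_axles_per_vehicle=2.2)
aadt = estimate_aadt(vehicles, seasonal_factor=0.95, day_of_week_factor=1.05)
print(round(aadt))
```

Because the factors come from a small number of ATRs, the resulting AADT is an estimate whose error depends on how well the ATR group represents the counted section.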

The AADT at the sites where traffic counts were not made in the current year is factored from previous counts at the site or by other methods (estimation, engineering judgment, tracing volume maps, etc.). The method of AADT estimation for each site is one of the data items for the sample.

Statistical Reliability

The HPMS standard sample design is a stratified simple random sample. The HPMS sample size estimation process was tied to the AADT. Of the approximately 80 data items collected, AADT is perhaps the most variable data item in HPMS. Therefore, the reliability of most other characteristics would be expected to exceed that of AADT.

The sample size for each stratum of the samples is prescribed in the HPMS Manual. The sample sizes per functional system vary by State according to the total number of road sections (universe), the number of predetermined volume groups, the validity of the State's AADT data, and the design precision levels.

For rural, small urban, and collective urbanized areas, sample sizes are based on 90-5 precision levels for volume groups of the Principal Arterial System (PAS), 90-10 for minor arterial system, and 80-10 for the collectors (excluding minor collectors).

For individual urbanized areas with populations > 200,000 that are in NAAQS non-attainment areas, the design precision is 90-10 for the arterial system and 80-10 for collectors.

For individually sampled urban areas with populations < 200,000, the precision levels are 80-10 or 70-15 depending on several other factors.

The only objective of the donut portion of the HPMS is to estimate the daily vehicle-miles traveled (DVMT) within the donut areas with a precision of ± 10 percent with 90 percent confidence. DVMT is determined from AADT. Thus, the sample size for a particular donut area is based on the variability of AADT in that donut area.
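To make the precision levels concrete (for example, 90-10 means a relative error of ±10 percent with 90 percent confidence), the sketch below applies the standard sample-size formula for a simple random sample with a finite population correction. It is offered as an illustration under stated assumptions rather than as the procedure in the HPMS Manual: the coefficient of variation of AADT within the stratum and the universe size used here are hypothetical.

```python
import math
from statistics import NormalDist


def required_sample_size(confidence, precision, cv, universe_size):
    """Sections needed so the mean AADT is within +/- precision at the given confidence.

    confidence    -- e.g. 0.90 for a "90-10" or "90-5" design
    precision     -- relative error, e.g. 0.10 for "90-10"
    cv            -- coefficient of variation of AADT in the stratum (assumed known)
    universe_size -- number of sections in the stratum
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)   # two-sided normal quantile
    n0 = (z * cv / precision) ** 2                      # infinite-population size
    n = n0 / (1.0 + (n0 - 1.0) / universe_size)         # finite population correction
    return min(universe_size, math.ceil(n))


# Illustrative stratum: 500 sections, AADT coefficient of variation of 0.4.
print(required_sample_size(0.90, 0.10, cv=0.4, universe_size=500))  # 40
```

Halving the allowable error roughly quadruples the required sample in a large stratum, which is why the tighter 90-5 level is reserved for the principal arterial volume groups.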

Data Format and Access

The templates for the areawide data and the data format for the universe, standard sample, and donut sample data are shown in the appendix. Note that the fields are marked with an A, S, or D indicating that this field is required for all records, standard sample records, or donut records, respectively.

To obtain these data files or some portion of these data, contact the Highway Systems Performance Division of the FHWA.

All data are available on IBM readable mainframe computer tapes. The types of tapes that the data are stored on correspond to tape technology at the time the data were collected.

The universe data file is extremely large, approximately 3.25 million records per year. It does not appear particularly useful for highway safety research. However, should a researcher have a need for this information, he/she would have to contact the Highway Systems Performance Division and work out the details of copying the desired tapes. The researcher would have to provide the tapes.

The standard sample data consist of about 115,000 records per year. All the available data sets (from 1978) can be obtained on mainframe cartridge tape.

The areawide data are available on mainframe computer tape in Extended Binary Coded Decimal Interchange Code (EBCDIC) format. These files can also be obtained from the Highway Systems Performance Division on PC diskettes in ASCII format.

The HPMS is updated annually and a new HPMS is generated at that time. It is important to note that some of the data fields and even some of the overall structure of HPMS may change from year to year.

The HPMS data from the States for the previous year are due at FHWA on June 15. They become available outside the FHWA around the end of the year. Thus, a researcher can get data from the 1993 HPMS in December 1994 or January 1995.

The FHWA contact for HPMS is:

David R. McElhaney, Director
Office of Highway Information Management
Federal Highway Administration
400 7th Street S.W.
Washington, DC 20590
(202) 366-0180

Reference

Highway Performance Monitoring System Field Manual. Federal Highway Administration. OMB No. 2125-0028. 1993.

Highway Safety Information System (HSIS), FHWA

General: The Highway Safety Information System is produced by the Highway Safety Research Center (HSRC) at the University of North Carolina.

Purpose: The FHWA has selected States for HSIS that provide linked accident, highway inventory, and traffic count data, and has converted the files to SAS format to provide an enhanced analysis capability. This introductory section provides only a general overview of the data. Descriptions specific to each of the States follow.

Source: VMT is estimated from segment lengths and AADT. The AADT volumes are updated every 1 to 5 years. Some values are estimated or interpolated; some sites are permanent and some are year-round. Most are temporary sites, taking 48-h counts. Some have vehicle classification, or "commercial," vehicle counts. "Commercial" is usually any vehicle with two axles and six tires or more.

Coverage: States covered in this write-up include: California, Illinois, Maine, Michigan, Minnesota, North Carolina, Utah, and Washington. Additional States are being added to HSIS. In most States, a major portion (but not all) of the highway system is covered. Usually, these are the State-maintained roads.

Sample: The highway segments covered are usually a purposefully selected subset. Cross-section files in some States contain a sample of segments, usually limited in number.

Strengths: Sample size is large, and there is a diversity of data in different States. The files are in SAS format for convenience, and the documentation is better than usually available from the States. The data are suited for aggregate comparisons.

Limitations: The AADT data are sometimes coarse, and may not be suited for identifying individual, high-risk locations. Entering volumes for both roads of an intersection often are not available. National estimates are not possible. The diversity of data in different States can also be a disadvantage.

Accuracy: AADT volumes are not all observed and are not independent, so the variance cannot be estimated.

Also included at the end of this section is a brief discussion of the statistical implications of the nature of the traffic volume data in most State files. Issues discussed include the use of a purposeful sample rather than a random selection of sites for counts, and the use of estimated or interpolated counts rather than actual counts. A general conclusion is that the traffic volume data will not support a statistically defensible analysis (except when the HPMS procedures have been followed). However, a purposeful sample can be representative, although the variance is likely to be underestimated. Similarly, estimated or interpolated counts may also be reasonable in value, but again, the variance will be underestimated. When highway sections have been stratified prior to selecting sites, the most rigorous use of the data is to calculate estimates at the strata level. Use of the volume data to simply stratify the data into volume groups is also relatively sound.

Thus, the traffic volume data must be used with caution. The actual extent of any of these problems cannot be estimated without additional data. Estimated or interpolated counts mean that the observations are no longer independent, and most statistical techniques are no longer appropriate. In particular, the variance is underestimated and bias may be introduced. The analyst should be aware of the source of the traffic counts in each State and should use good judgment in the selection of an analytic approach. Though statistically sound analyses of accident rates may not be possible with the currently available exposure information, it may be possible to use this information in a productive way, e.g., for stratifying sites, and to perform within the strata only analyses relying on counts.

HSIS Contacts: Jeffrey Paniati at (703) 285-2057 or Yusuf Mohamedshah at (703) 285-2090

California, HSIS

Coverage: The current accident files cover the years 1991 to 1995, and there is roadway information for 1993 and 1994. Accident reporting is not uniform in California, with some municipalities using their own report form and reporting threshold, instead of the California Highway Patrol (CHP) form. Accidents occurring on State routes (including those in urban areas that do not use the CHP form) are location coded. There are about 150,000 accidents annually on State routes (all with location codes) out of an estimated statewide total of 500,000 accidents per year. Reporting is also not complete for uninjured occupants. Information on uninjured occupants is only collected if there is at least one injured occupant. Thus, the occupant injury data are biased to overrepresent injured occupants. However, uninjured drivers have been identified in the driver file by the Highway Safety Research Center (HSRC) by linking the injury information from the occupant file with the vehicle file. Overall, HSRC estimates that information on uninjured occupants is missing for about 50 percent of non-towaway accidents.

The roadway information is contained in three files: the Roadlog file, the Intersection file, and the Interchange Ramp file. The Roadlog file contains information on approximately 24,461 km of roadway, including about 3943 km of Interstate, 17,702 km of other primary highways, and about 2736 km of secondary/county/township roads. The 24,461 km are divided into about 50,000 records in the Roadlog file, for an average section length of 0.5 km.

The Roadlog file contains information describing the functional class of the road, cross section information such as width and number of lanes, as well as information on design speed, median barriers, and other special features. The Intersection file has information on 20,000 intersections, and the Interchange Ramp file has information on 14,000 ramps. Accidents can be linked with all three roadway files and the Intersection file can be linked with the associated segments in the Roadlog file, but the Interchange Ramp data cannot be linked with its associated interchange.

Exposure Information: The Roadlog file includes an AADT and a DVMT for each segment (record). Section length is also included. No information on truck travel is available. In the Intersection file, there is an AADT for the mainline road and for the crossing road, as well as descriptive information for both the mainline and cross road. AADT is also included in the Interchange Ramp file.

Traffic Data: As indicated in the preceding three sections, all three inventory files contain AADT information. In addition, the Roadlog file contains information on DVMT, which is computed as the product of the section length and section AADT estimate.
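The DVMT computation described above is a simple product, and it is the usual denominator for an accident rate. The sketch below shows both steps. The function names and numbers are hypothetical, the section length is assumed to be expressed in miles (a length recorded in kilometers would need to be converted first), and the rate per million vehicle-miles is a conventional measure that is assumed here rather than taken from the California files.

```python
def dvmt(section_length_miles, aadt):
    """Daily vehicle-miles traveled on a section: length x AADT."""
    return section_length_miles * aadt


def accident_rate_per_mvm(accidents, section_dvmt, years=1):
    """Accidents per million vehicle-miles over the study period (conventional measure)."""
    vehicle_miles = section_dvmt * 365 * years
    return accidents * 1_000_000 / vehicle_miles


# Illustrative numbers only.
section_dvmt = dvmt(section_length_miles=0.25, aadt=12000)
print(section_dvmt)  # 3000.0
print(round(accident_rate_per_mvm(accidents=6, section_dvmt=section_dvmt), 2))
```

Because the AADT on many sections is estimated rather than counted, a rate computed this way inherits that uncertainty and is best compared across groups of sections rather than for a single site.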

In California, the 12 district offices have the responsibility of collecting traffic data and developing the AADT estimates for each road section within their district. The Division of Traffic Operations of the Caltrans central office oversees the operation and attempts to maintain consistency in the methods and data across all districts as much as possible. If requested, Traffic Operations personnel will assist a district in calculating the AADT estimates. The division also maintains all count data on an on-line computer file for the districts' use.

There are approximately 2,100 permanent count stations on mainline highways operated by Caltrans in California. Of these, approximately 400 are permanent, continuous counting control stations that operate each day in a given year. Every major State-administered route is counted each year. The 400 permanent continuous count stations form a network that covers all major routes. The remaining control stations are permanent, quarterly counting control stations, i.e., in-pavement loops to which a counter/recorder device is attached for 7 to 14 days during each quarter. Caltrans also collects count data at approximately 700 of these quarterly counting control stations once every 3 years. In a given year, there are approximately 1,000 permanent quarterly counting stations where count data are not collected. California has determined that the AADT estimates, which are derived from the simple average of the four (unadjusted) quarterly counts, do indeed account for seasonal fluctuations without further adjustment based on nearby permanent counters. Consequently, there are no additional adjustments or corrections applied to the AADTs estimated from the quarterly counts.
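The quarterly-station estimate described above amounts to an unweighted mean. A minimal sketch follows, assuming each quarterly value is the average daily volume observed during that quarter's 7- to 14-day deployment; the function name and numbers are hypothetical.

```python
from statistics import mean


def aadt_from_quarterly_counts(quarterly_daily_volumes):
    """Simple average of four unadjusted quarterly average daily volumes."""
    if len(quarterly_daily_volumes) != 4:
        raise ValueError("expected one average daily volume per quarter")
    return mean(quarterly_daily_volumes)


# Illustrative numbers only: winter, spring, summer, fall.
print(aadt_from_quarterly_counts([8200, 9600, 11000, 9000]))  # 9450
```

Sampling once in every quarter is what allows the average to absorb seasonal variation without a separate seasonal factor.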

In addition to the permanent control stations, approximately 1,000 coverage counts are collected annually. The intent is to collect coverage counts on a 3-year cycle (for a total of approximately 3,000 coverage counts), although conditions may force longer intervals in certain districts at times. A coverage count is basically a 24-h to 1-week count.

Coverage counts are expanded to AADT estimates using factors derived from the combined continuous counts and quarterly count data. For road sections that are not counted in a given year, it is the responsibility of the districts to develop these AADT estimates. In some cases, the districts rely on overall traffic growth trends within the district. However, in most cases, the AADT assigned to the section is developed by studying the traffic growth in counts falling on each side of the section.

It is also noted that 24-h to 1-week coverage counts are collected on approximately 3,200 on- and off-ramps per year. These ramp counts are manipulated through ramp balancing to reflect continuity of flow on mainline freeways.

Finally, vehicle classification data are collected at approximately 70 permanent stations across the State. Additional classification counts are collected on an as-requested basis, typically at locations where traffic count data are being collected. Since this is district-based, there is no reliable estimate of how many additional classification counts are collected across all 12 districts per year. In addition, there are approximately 45 weigh-in-motion stations statewide that provide speed, volume, and the "13-bin" vehicle classification information. (Taken from HSIS Guidebook for the California State Data Files.)

Linking Accident and Exposure Information: Accidents can be linked with all three roadway files. Accidents are located manually using the scene diagram on the accident report and maps. Accuracy of the location is believed to be within 0.16 km, and location data are missing for only a few percent of accidents.

Illinois, HSIS

Coverage: During 1985 to 1994, this included 26,232 km of roadway of which 2,736 km were Interstate highways; 15,449 km of other primary roadways; and 8,047 km of secondary, county, and township roads.

Exposure Information: All exposure information is contained in the Roadlog file, which contains records for 197,000 sections; each section, on average, is slightly less than 0.16 km.

Exposure, in terms of VMT, can be calculated from AADT and the section length. In addition to the total, AADT for "heavy commercial vehicles" (defined as having two or more axles and six or more tires) is given.

Intersection information is in the Roadlog file and also in an Intersection Location file. They contain the same information, but the Intersection Location file contains one record for each intersection. If there is more than one intersection in a section, the information from the Roadlog file is repeated for each intersection record. Intersections are characterized as "across," "left," and "right." The crossing road is apparently not identifiable. Thus, it appears that for intersection exposure only, the AADT on the through road is available.

Traffic Data: As indicated earlier, the Roadlog file contains information on AADT, percentage of trucks for 1990 and earlier, and commercial vehicle AADT for 1991 and later. These data are developed in Illinois' traffic volume counting program and are based on a combination of permanent counters that count traffic 24 h each day for 365 days each year and a series of short-term "coverage" counts conducted each year. Illinois has 49 automatic traffic recorders (ATRs), of which 21 are capable of collecting counts by vehicle class in accordance with FHWA's Scheme F. The ATR locations on the 5 different classes of roadway include 7 locations on rural Interstate roadway, 6 locations on urban Interstate roadway, 12 locations on other rural roadways, 19 locations on other urban routes, and 5 locations on "recreational" routes.

In addition to the ATR data, short-term traffic counts on Interstate and primary highway systems are done on a 2-year cycle. During even-numbered years, portable counter devices are deployed in combination with pre-established in-pavement loop detectors. Typically, the counter devices are deployed during 1 week of the year at any given site. Short counts (e.g., 24- or 48-h counts) are collected on Monday through Thursday only. It should be noted that a sample of Interstate sections is counted 1 week out of every 4 months. During odd-numbered years, the Illinois DOT conducts a comprehensive interchange ramp counting program on State highways. These ramp counts are used to supplement ADT data for sections where the State did not have monitors (i.e., counter devices). In total, it is estimated that approximately 96 percent of the primary system is covered during each 2-year cycle.

For other non-primary roads (i.e., the "off" marked route system), Illinois collects 48-h coverage counts in approximately 20 percent of the counties once every 5 years. However, the northeast counties are done every 4 years. With the exception of Cook County, which is also on a 4-year cycle, urban areas within counties are counted on a 5-year statewide cycle.

Additional vehicle classification counts are conducted on HPMS sections. These are made at 300 locations over a 3-year cycle (i.e., approximately 100 each year) to form a representative distribution for the State.

Finally, the districts often have a need for additional traffic data. Consequently, when requested, the State collects 12-hour turning movement counts at intersections and other "special" traffic data to satisfy these needs.

To convert the short-term coverage counts to AADT, Illinois applies adjustments to reflect corrections for number of axles and for seasonal differences in the daily traffic. Axle corrections are developed from both permanent classification counters and from manual (HPMS) counts. For seasonal corrections, each coverage count location is assigned to one of the five categories of roadway where permanent counters are located, as defined above. The seasonal factors are based on averages from all ATRs in that group.

When a road section is not counted during a given year, growth factors are developed and applied to the most recent prior year's count. Average growth factors are created each year for each functional class of roadway using ATR data and data from adjusted short counts for the current year. The growth factor applied to a particular uncounted section is based on its functional class. For sections where no prior AADT exists, AADT/mile averages by functional class are developed and then used to "fill in" the AADTs.
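The growth-factor step described above can be sketched as follows. The source does not state how the average growth factor is computed, so this sketch assumes it is the ratio of summed current-year AADT to summed prior-year AADT over the sections of a functional class that were counted in both years. The class labels, function names, and numbers are hypothetical, and the AADT/mile fill-in for sections with no prior AADT is not sketched.

```python
from collections import defaultdict


def growth_factors_by_class(counted_sections):
    """Average growth factor per functional class (assumed: ratio of summed AADTs).

    counted_sections -- iterable of (functional_class, prior_aadt, current_aadt)
    """
    prior_total = defaultdict(float)
    current_total = defaultdict(float)
    for functional_class, prior_aadt, current_aadt in counted_sections:
        prior_total[functional_class] += prior_aadt
        current_total[functional_class] += current_aadt
    return {fc: current_total[fc] / prior_total[fc] for fc in prior_total}


def factor_uncounted_section(prior_aadt, functional_class, factors):
    """Apply the class growth factor to the most recent prior count."""
    return prior_aadt * factors[functional_class]


# Illustrative numbers only.
counted = [
    ("rural Interstate", 20000, 20600),
    ("rural Interstate", 30000, 30900),
    ("other urban", 10000, 10100),
]
factors = growth_factors_by_class(counted)
print(round(factor_uncounted_section(18000, "rural Interstate", factors)))
```

An AADT produced this way is not an independent observation, which is the statistical concern raised in the HSIS overview about estimated or interpolated counts.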

Finally, it should be noted that the percentages of truck-related "Heavy Commercial Volumes" include "two-axle trucks with six or more tires plus multi-axle vehicles." While pick-ups and vans are excluded, this combination would include single trucks, tractor-semi combinations, and buses. Thus, it cannot be considered a count of just the multiple unit (tractor-trailer) trucks that are found on the roadway system. (Taken from the HSIS Guidebook for the Illinois State Data Files.)

Linking Accident and Exposure Information: Data on different files can be linked by a linkage key, which combines county, route prefix, and route number with the station number.

For intersection accidents, the intersecting route number and route prefix are given. However, it does not appear possible to identify which vehicle approached the intersection from the main road and which one approached from the crossing road. The direction of travel for each vehicle is given, but the direction of the road is not given in the Roadlog file.
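The route-based linkage described in this subsection can be sketched as a lookup on the shared key. The sketch assumes that each Roadlog section carries a beginning and an ending station along its route and that an accident belongs to the section whose station range contains it; these assumptions, the field names, and the sample records are hypothetical and are not taken from the Illinois file documentation.

```python
from bisect import bisect_right


def route_key(record):
    """Shared part of the linkage key: county, route prefix, and route number."""
    return (record["county"], record["route_prefix"], record["route_number"])


def build_index(roadlog):
    """Group Roadlog sections by route and sort them by beginning station."""
    index = {}
    for section in roadlog:
        index.setdefault(route_key(section), []).append(section)
    for sections in index.values():
        sections.sort(key=lambda s: s["begin_station"])
    return index


def find_section(index, accident):
    """Return the section whose station range contains the accident, or None."""
    sections = index.get(route_key(accident), [])
    starts = [s["begin_station"] for s in sections]
    i = bisect_right(starts, accident["station"]) - 1
    if i >= 0 and accident["station"] < sections[i]["end_station"]:
        return sections[i]
    return None


# Illustrative records only.
roadlog = [
    {"county": 10, "route_prefix": "IL", "route_number": 47,
     "begin_station": 0.00, "end_station": 0.10, "aadt": 8400},
    {"county": 10, "route_prefix": "IL", "route_number": 47,
     "begin_station": 0.10, "end_station": 0.20, "aadt": 9100},
]
accident = {"county": 10, "route_prefix": "IL", "route_number": 47, "station": 0.12}
print(find_section(build_index(roadlog), accident)["aadt"])  # 9100
```

Returning None for an unmatched accident makes it easy to count how many records fail to link, which is a useful check on the quality of the location coding.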

Maine, HSIS

Coverage: The Link Record file covers all highways in Maine, including local roads and urban streets. The 35,405 km are divided into 67,000 links. Files are currently available for the years 1985 to 1994.

Exposure Information: The Link Record file contains AADT for each link; the year of AADT; and whether it is an actual count, an interpolation, or an estimate. Together with the length of the link, VMT can be estimated.

Information on intersections is available from the Node Records file, which also includes nodes other than intersections. The configuration of each intersection is given, and up to six legs are identified by the corresponding link numbers. As an exposure measure, only the total number of vehicles entering the intersection is given. However, it is possible to obtain the AADT for each leg from the Road Link file.

Traffic Data: With respect to the traffic information on both the Link and the Node files, the traffic counts that are in the system are extracted from a traffic file again prepared within the Bureau of Planning. The counts are extracted from a series of 54 permanent count stations across the State, 6 of which do detailed vehicle classification counts. There are a total of 9 stations on Interstate routes (which collect counts in both directions), approximately 13 stations on U.S. routes, 24 stations on State routes, and 8 stations on other routes.

In addition to the continuous count stations, each summer, 48-h counts are done at between 1,600 and 2,200 locations on all U.S. and State highways. Beginning in 1994, the number of coverage counts increased to between 3,600 and 4,200. Approximately 10 percent of these counts include vehicle classification counts. Classification estimates exist for other locations that are not high-priority locations.

Each year, these counts are done in either the northern, central, or southern areas of the State. The counters move to a different area the following summer, covering the entire State every 5 years. The southern and central areas are counted in alternate years for the first 4 years of a cycle. Then, the northern area, where counts change less per year, is counted during the fifth year of the cycle.

Seasonal adjustment factors for the coverage counts are based on continuous count stations that fall into the same "highway type" category as the coverage count. Based on extensive analysis in the late 1980's, the three categories used are Urban (including suburban locations), Arterial (including all Interstate locations plus other locations in rural areas), and Recreational locations (whether urban or rural). The actual adjustment factor for a given coverage count location is based on the weekly average ADT for all continuous count stations falling into that category.

For years in which no count data were collected within a given area of the State, historical daily traffic flows are factored up on a county-by-county basis. The growth factor used is based primarily on traffic changes at the continuous count stations falling into the same highway-type category described above. Other information used in developing a specific growth factor includes counts from nearby urbanized areas and special counts that may have been conducted at the location for other reasons. The final growth factor used is based on interpolation between points of known growth (such as 2 or more years at the similar continuous count stations), and is done by personnel with a working knowledge of the system's traffic patterns.

In summary, while some of the counts may be off due to roadside development and/or roadway construction within a specific area of the State that occurred within the 2-year period, in general, the count data are felt to be quite adequate for analysis purposes. (Taken from the HSIS Guidebook for the Maine State Data Files.)

Linking Accident and Exposure Information: Accident and exposure data can be linked by the low and high node numbers that identify each segment and by the distance from the low node given in the accident record.

Intersection accidents are identified as such, distinguishing three-, four-, and five-leg intersections. However, the leg from which a vehicle entered an intersection cannot be determined.

Michigan, HSIS

Coverage: Of 189,897 km of roadway in Michigan, the Roadway Segment file covers only 15,449 km of trunkline divided into 43,000 segments. Data for the years 1985 to 1994 are currently available.

Exposure Information: The Roadway Segment file gives the AADT for each segment, along with AADT categorized into 10 classes. Commercial AADT is also given, but no definition of "commercial" is shown.

A Cross Section file covers 8,047 km of two-lane rural roads with segments selected by a stratified random sample. Very detailed roadside feature information is given. However, there is no information on sample stratum. ADT values are given based on counts in the early 1980s. Counts of accidents by severity are given.

There is an Intersection file that has recently been released for analysis. However, information on AADT or vehicles entering the intersection is not provided.

Traffic Data: As noted above, information on AADT and Commercial Vehicle AADT is found on the Roadlog file. These data are developed in Michigan's traffic counting program, which, like those of other States, includes both full-time permanent counter locations that operate 365 days each year and short-term coverage counts at a much larger number of locations. Michigan DOT currently operates and maintains 121 permanent traffic recording (PTR) stations. These PTRs include 34 on Interstates, 31 on U.S. routes, 23 on Michigan State highways, and 12 on other routes.

In addition, there are a varying number of short-term "coverage counts" conducted each year. Michigan DOT indicated that approximately 3,300 such 48-h "short" counts were requested in 1995. These coverage counts included the following:

Michigan attempts to count every State-maintained road section in a 3-year period. Unless required under the HPMS, Michigan also attempts to collect classification counts over a 6-year cycle. It should be noted that in addition to the State's traffic counting program, other agencies (notably those in urban areas) are also collecting traffic data for HPMS purposes. Furthermore, the Metropolitan Planning Organizations (MPOs) in Michigan have developed and supported urban transportation planning models in accordance with ISTEA requirements. These MPOs subsequently have their own counting programs to support their model development and application.

To factor up the short counts to reflect AADT, seasonal factors are developed. Unlike some States where these seasonal factors are based on PTR counts within the same functional class as the short-count location, Michigan has defined six or seven "cluster-analysis groups." Each of these groups contains a number of PTRs, and the adjustment factors are based on averaging the PTR counts within that group. Each roadway section (and thus each short count) is assigned to one of these cluster-analysis groups.
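The factoring step can be illustrated with a minimal sketch. The text does not give Michigan's actual formula, so the version below assumes that each PTR contributes the ratio of its AADT to its ADT for the month of the short count, and that the group factor is the simple average of those ratios. The function names, field names, and numbers are hypothetical.

```python
def group_seasonal_factor(ptr_stations, month):
    """Average the AADT-to-monthly-ADT ratio over the PTRs in one group.

    ptr_stations: list of dicts with 'aadt' and 'monthly_adt' (month -> ADT).
    Averaging station ratios is an assumption, not Michigan's documented method.
    """
    ratios = [s["aadt"] / s["monthly_adt"][month] for s in ptr_stations]
    return sum(ratios) / len(ratios)


def short_count_to_aadt(short_count_adt, factor):
    """Factor a short-count ADT up (or down) to an AADT estimate."""
    return short_count_adt * factor


# Hypothetical cluster-analysis group with two PTRs, both busier in July
group = [
    {"aadt": 10000, "monthly_adt": {"Jul": 12500}},
    {"aadt": 20000, "monthly_adt": {"Jul": 25000}},
]
factor = group_seasonal_factor(group, "Jul")
aadt_estimate = short_count_to_aadt(5000, factor)
```

With these illustrative values, a July short count is scaled down because July traffic in the group runs above the annual average.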

When a specific roadway section is not counted in a given year, its count from the previous year must be adjusted to represent traffic growth. Here, Michigan attempts to "look up and down the road" and identify the closest, comparable section for which an ADT was estimated (counted) for the given year. They determine the percentage change (i.e., increase or decrease) in the ADT associated with that "comparable" section, and apply that percentage change to the historical count for the specific section in question. (Taken from the HSIS Guidebook for the Michigan State Data Files.)
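The growth adjustment for an uncounted section amounts to a single proportional step. The sketch below shows the arithmetic only; the function name and the sample values are hypothetical, and the choice of the "comparable" section remains a matter of staff judgment.

```python
def adjust_uncounted_section(previous_adt, comparable_prev_adt, comparable_curr_adt):
    """Apply the comparable section's year-over-year change to an uncounted section."""
    if comparable_prev_adt <= 0:
        raise ValueError("comparable section needs a positive prior-year ADT")
    # Percentage change (as a fraction) observed on the comparable section
    change = (comparable_curr_adt - comparable_prev_adt) / comparable_prev_adt
    return previous_adt * (1.0 + change)


# Hypothetical values: the comparable section grew from 10,000 to 10,400 (4 percent)
estimate = adjust_uncounted_section(5000, 10000, 10400)
```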

Linking Accident and Exposure Information: Though the Roadway Segment file covers less than 10 percent of the total highway mileage, about one-third of all accidents can be matched with locations on the Roadway Segment file. Linking can be done via information on the control section and the milepost.

Accidents that occur within 30.5 m of an intersection with a trunkline road are coded for that road with the milepost of the intersection.

Minnesota, HSIS

Coverage: Coverage includes the years 1985 to 1994; however, some files are available only for certain years, and there were changes between the years. Files detail 19,311 km of primary roadways, an additional 37,014 km of State-maintained systems, and 157,711 km of county and local roads.

Exposure Information: Two files provide exposure information: (1) the Roadlog file and (2) the Intersection/Interchange file.

The Roadlog file contains information on about 200,000 road sections on which highway characteristics remain constant. Exposure in terms of VMT can be obtained from the values of AADT given for the segment, and the given length of the segment. Also given is "commercial" ADT. Commercial vehicles are defined as having at least two axles and at least six tires. Exposure estimates can be stratified according to the highway characteristics contained in the file (also according to AADT or AADT per lane).
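As a minimal sketch of the VMT calculation and its stratification, the example below multiplies AADT by section length and by the number of days in the year, then sums by one highway characteristic. The field names and values are hypothetical, and the sketch assumes that section length has already been expressed in miles.

```python
from collections import defaultdict


def daily_vmt(aadt, length_miles):
    """Vehicle-miles traveled per day on one section."""
    return aadt * length_miles


def annual_vmt_by_stratum(sections, key, days=365):
    """Sum annual VMT over sections, grouped by one highway characteristic."""
    totals = defaultdict(float)
    for s in sections:
        totals[s[key]] += daily_vmt(s["aadt"], s["length_miles"]) * days
    return dict(totals)


# Hypothetical sections; field names are illustrative only
sections = [
    {"aadt": 4000, "length_miles": 2.0, "lanes": 2},
    {"aadt": 12000, "length_miles": 0.5, "lanes": 4},
    {"aadt": 1000, "length_miles": 3.0, "lanes": 2},
]
vmt_by_lanes = annual_vmt_by_stratum(sections, "lanes")
```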

The Roadlog file identifies the type of intersection at the beginning of a segment. However, it does not identify the intersecting road. Thus, intersection exposure cannot be obtained from this file.

The Intersection/Interchange file contains data on 3,500 intersections, 256 interchanges, and 2,800 grade crossings, currently for the years 1987, 1989, and 1991. Intersections were originally selected for the purpose of identifying high accident locations, but are retained in the file.

Intersection type and a code describing it in some detail are given. The route on which each approaching segment is located is identified, and there are up to two legs for each segment. The direction (N, NE, E, etc.) of each leg is also shown. This allows reconstruction of the configuration of the intersection. For each leg of each segment, the AADT for several years is given. For the second leg of a crossing minor roadway, in 10 percent to 30 percent of the cases, AADT is missing. In these cases, it is recommended that the value for the first leg be used. Thus, the available exposure for intersections consists of AADT on the intersection approaches.

Commercial AADT is not given for intersections. However, it appears possible, though cumbersome, to obtain this information from the Roadlog file.

Traffic Data: The Traffic file contains information related to AADT data for all roadway sections across the State. This information is manually derived from sample and continuous counts taken at temporary and permanent count stations throughout the State. It contains total AADTs and AADTs for heavy commercial vehicles (which are defined as vehicles with two axles and six or more tires).

Like other States, Minnesota develops traffic volume estimates based on automatic traffic recorder stations (ATRs) and short-term (48-h) "coverage" counts. There are approximately 120 ATRs that count traffic 24 hours per day, 365 days per year, across the various roadway types. These are located on all classes of both rural and urban highways, with approximately 55 percent of the locations being on urban roadways and 45 percent on rural roadways.

In addition, there are approximately 34,000 coverage (temporary) count locations across the State where 48-h counts are made. Approximately 12,000 of these locations are covered each year. For the trunk highway system (including Interstate roads), these counts are made on a 2-year cycle, as are counts on roads within the Twin Cities metropolitan area. For the lower order County State-Aid Highways and the Municipal State-Aid System outside the Twin Cities metropolitan area, the counts are made on a 4-year cycle.

The seasonal adjustment factor for a given coverage count is based on counts made at ATRs which are similar to the coverage count location. Here, ATRs are grouped into the following classifications:

Outside (i.e., non-metropolitan area)

Metropolitan Area

Seasonal adjustment factors, based on the data for the previous 3 years, are developed for each classification and are applied to all coverage counts collected at locations within that classification.

For the "non-count" years, a growth factor is applied to the previous year's data based on changes in counts at the ATR stations located on the same functional class of roadway. When new data are available at the end of the next count cycle, these data for the interim non-count years are readjusted to represent the average of prior and subsequent count years (e.g., a 1987 "non-count" year estimate based on the growth factor would be readjusted to represent the average of 1986 and 1988 counts at that location as soon as the 1988 count year was completed).
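The two-stage treatment of a "non-count" year can be sketched as follows. The interim step assumes that the growth factor is the ratio of current-year to prior-year ATR volume for the functional class; the text does not state how the factor is computed, so that ratio is an assumption. The function names and numbers are hypothetical.

```python
def interim_estimate(prior_count, atr_prior, atr_current):
    """Non-count-year estimate: prior count times an ATR-based growth factor."""
    return prior_count * atr_current / atr_prior


def readjusted_estimate(prior_count, next_count):
    """Readjusted value once the next count year is complete."""
    return (prior_count + next_count) / 2


# Hypothetical location counted in 1986 and 1988, with no count in 1987
interim_1987 = interim_estimate(8000, atr_prior=20000, atr_current=20600)
final_1987 = readjusted_estimate(8000, 8600)
```

The interim value is used only until the 1988 count is available, at which point it is replaced by the average of the two counted years.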

In developing AADT estimates for each section of roadway, there are sometimes road sections with no historical count data (e.g., lower order local facilities, including township roadways and local streets). In these cases, an original "baseline" estimate is based on ATR counts on lowest order roadways with the lowest counted volumes. Growth factors for these uncounted sections are also based on this same ATR group.

MinnDOT also collects vehicle classification counts at about 300 sites per year. These are 16-h (e.g., 6 a.m. to 10 p.m.) manual classification counts usually over 2 different days. In addition, portable vehicle classifiers are deployed to collect 48-h data. Currently, there is no program to seasonally adjust the classification counts. There are an additional 25 weigh-in-motion stations statewide that collect classification data. However, these data are used less than the manual classification counts.

The new count data are placed in the Traffic file within the first 6 months of the subsequent calendar year. While the Traffic file can also be thought of as a "Section" file (with a specified AADT at the beginning count station being assumed constant over the entire section), it differs from the Roadlog file to which it will often be merged in that the beginning and end points (termini) are often located at different points on the roadway. The linking variables are again the route system/route number/reference point (milepost).

There are approximately 208,000 records on the file, but these do not represent a one-to-one match with the 200,000 "true" records on the Roadlog file. Often, there are Roadlog sections with multiple Traffic file records (i.e., multiple count stations), and often there are Roadlog sections with no Traffic file records (i.e., corresponding count stations) located within the section.

Each raw file record contains up to 30 years of AADT information (with the related year "attached"). Thus, to determine the average AADT for a given year for a series of sections on a given route: (1) the traffic section reference points must be matched with the appropriate Roadlog sections by comparing the reference point with the beginning and ending milepoint on Roadlog sections (with the ending milepoint being "assigned" as being equal to the beginning milepoint on the succeeding section), (2) the appropriate yearly AADT for each contained Traffic file record must be extracted, and (3) the counts must be averaged for sections where multiple Traffic file records exist. If no Traffic file record exists for a given Roadlog section, then the section AADT is assumed to be equal to the AADT at the previous (upstream) traffic section on the same route. (This is the assumption made by Minnesota and by HSRC programs. However, other procedures could be followed in calculating AADT if they are felt to be more appropriate for a given research question.) Any AADT assignment program developed must not carry over counts from one route to another; this is a mistake that can easily be made since the Roadlog file is a continuous file in route order. Obviously, averaging traffic over more than 1 year will require additional programming.
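The three steps and the carry-forward rule can be sketched for a single year as shown below. This is an illustrative outline, not the Minnesota or HSRC program: the record layouts are hypothetical, the last section on a route is treated as open-ended because no succeeding section exists to supply its ending milepoint, and the carried value is taken from the last Traffic file record upstream.

```python
from collections import defaultdict


def assign_section_aadt(sections, traffic):
    """Assign one year's AADT to Roadlog sections.

    sections: list of (route, begin_milepoint) tuples.
    traffic:  list of (route, reference_point, aadt) tuples for a single year.
    Returns {(route, begin_milepoint): aadt or None}.
    """
    begins_by_route = defaultdict(list)
    for route, begin in sections:
        begins_by_route[route].append(begin)
    counts_by_route = defaultdict(list)
    for route, ref, aadt in traffic:
        counts_by_route[route].append((ref, aadt))

    result = {}
    for route, begins in begins_by_route.items():
        begins.sort()
        carried = None  # reset for each route so counts never cross routes
        for i, begin in enumerate(begins):
            # Step 1: the ending milepoint is the next section's beginning
            end = begins[i + 1] if i + 1 < len(begins) else float("inf")
            inside = sorted(
                (ref, a) for ref, a in counts_by_route[route] if begin <= ref < end
            )
            if inside:
                # Steps 2 and 3: extract the year's AADT values and average them
                values = [a for _, a in inside]
                result[(route, begin)] = sum(values) / len(values)
                carried = values[-1]  # last upstream traffic record
            else:
                # No record in this section: carry the upstream AADT forward
                result[(route, begin)] = carried
    return result


# Hypothetical data: route B has no count upstream of milepoint 1.5
sections = [("A", 0.0), ("A", 1.0), ("A", 2.0), ("B", 0.0), ("B", 1.5)]
traffic = [("A", 0.2, 1000), ("A", 0.8, 1200), ("B", 1.6, 500)]
aadt_by_section = assign_section_aadt(sections, traffic)
```

Grouping by route before carrying values forward is what prevents the cross-route error noted above; the first section of route B is left without a value rather than inheriting one from route A.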

Currently, there are two HSIS SAS-formatted Traffic files: one developed for 1987 and earlier data, and one containing data for only 1988 and 1989. Again, please note that traffic data were merged with the Roadlog file for years 1987 through 1994. The Traffic file still remains a separate file on the HSIS system for the years 1987 through 1989. It is no longer available as a separate file on the HSIS system after 1989.

The first Traffic file (1987) is similar to the raw file in that it contains up to 10 years of data, with 1987 counts being the most recent data. The second file (1988-1989) contains only counts for 1988 and 1989. Each record on the file contains information on traffic counts for one year for a given location. To combine across years for a given counter location, records with the same location information can be merged.

To make the AADT information even more easily usable in subsequent analyses, HSRC developed a linking program that links the basic AADT information from the SAS Traffic file with the Roadlog file to produce a separate single "Average AADT" variable for each Roadlog section on each of the two Roadlog files (i.e., 1985-1987, 1988-1989). Where necessary, averaging across traffic sections in a given Roadlog section for a given year and "carrying down" AADT information from the prior record have been done in this linkage program. Since the 1987 Roadlog file is used with accident data from 1985-1987, and the 1989 file is used for 1988-1989 accidents, the AADT variable on each Roadlog file represents an average AADT over the respective time periods. That is, the 1987 file contains average AADTs for 1985-1987, and the 1989 Roadlog file contains average AADTs for 1988-1989. Different AADTs (say for individual accident years) could be developed by modifying the existing computer program.

Since it is not possible to perform an independent "check" of the accuracy of the AADT information, it is assumed that the procedure in place in Minnesota to monitor count stations and update the file provides adequate information. As indicated above, these are felt to be excellent data for the trunkline system where they are updated on a 2-year cycle. There are also fairly good data for the county State-aid systems, which are generally updated on a 4-year cycle. (Taken from the HSIS Guidebook for the Minnesota State Data Files.)

Relating Accident and Exposure Data: Accidents are located by information on the route system, route number, and a "reference point." This information allows an accident to be attached to the appropriate section of the Roadlog file.

Accidents in an intersection can also be attached to the Intersection file by using route system and number, and the reference point.

Apparently, the approach from which a vehicle entered an intersection cannot be identified, except possibly by matching the direction of travel with the direction of the approach from the Intersection file.

North Carolina, HSIS

Coverage: The current HSIS files for North Carolina cover the years 1990-1995. Accidents are linked to the Roadway Inventory file with a computerized referencing system that currently covers about 38 percent of the estimated 148,056 total road kilometers in North Carolina. The reference system covers all 22,530 km of primary routes, and an additional 33,473 km of secondary roads (rural secondary roads and city streets). There are no "county" roads in North Carolina, since all are under State control. This system links about 60 percent of the accidents (118,000 out of 192,000) to a road segment in the Roadway Inventory file.

Exposure Information: The Roadway Inventory file describes homogeneous road segments defined by a beginning and ending milepost. An AADT is provided with the year in which the count was taken and the section length in miles. The percent trucks in peak traffic is available for about 40 percent of the sections and an off-peak percent trucks is available for about 10 percent of the sections. The roadway variables include roadway width, number of lanes, lane width, shoulder type and width, median type and width, surface type, whether the section is in the HPMS sample, a traffic growth factor, and other variables.

Currently, intersection and interchange information cannot be linked with accident data because the descriptive information is not available in a suitable format. The available information on roadway segments does not include information on horizontal curvature, vertical grade, or passing sight distance.

Traffic Data: As indicated above, the basic AADT and percent truck information is included on the Roadway Inventory file. The traffic count information used in the development of these variables is developed from a series of permanent control count locations and spot counts across the system. Currently, there are approximately 100 ATRs across the State. These are permanent full-time counters that are used both for counts at their location and to establish seasonal and growth factors used with spot counts from surrounding locations.

In addition to these permanent stations, there are approximately 60,000 points in the State where 24- to 48-h counts are made. The entire primary and Interstate system is covered each year. Fifty percent of the secondary roadway system is covered each year with the remaining 50 percent being done in the alternate year. The spot counts are linked with a group of nearby ATRs in order to establish distributional factors. The data are reviewed internally by the inter-office traffic staff, edited, and checked for quality, and then factors are developed. The traffic counts are closed out for the count year in October of each year and then sent to the roadway inventory staff for inclusion in the Inventory file.

Ramp counts are made each year on all interchange ramps on the Interstate system. These ramp counts are used to generate turning volumes and to balance counts on the mainline for the Interstate and crossing roadways. This represents approximately a 2-week count on each ramp. Older ramp counts are kept in paper files; ramp counts have been computerized since early 1993.

Truck counts are made on a 3-year cycle at 300 vehicle classification sites across the State. The 300 count locations are not necessarily at all of the ATR sites. There are approximately 90 truck weigh stations in the State related to the SHRP program. In addition, it was noted that truck counts are made every 3 years on all HPMS sections in the State.

Finally, for intersections that are in the State's Traffic Improvement Program, turning counts are done on an as-needed basis. These turning counts include both a.m. and p.m. peak traffic, with each count being conducted for approximately 7 h. It is estimated that approximately 500 of these are done each year. These are found in a paper file, which may be computerized in the next 1 to 2 years.

Examination of the traffic-related variables in the HSIS Inventory file indicates that ADT is present for 99.9 percent of the sections. However, what is missing is data on percent trucks. Here, the variable concerning "Percent Trucks at Peak" is uncoded for approximately 60 percent of the mileage. The variable related to "Off-Peak Percent Trucks" is uncoded for almost 90 percent of the mileage. Conversations with department of highways staff indicated that this is the result of the fact that these variables are only coded if there is fairly high confidence in the percentages. This would occur if a classification count had been done on the section (as in an HPMS sample section) or on an adjacent or nearby section. Thus, while the data present should be fairly accurate, data are missing for a large number of miles.

Linking Accident and Exposure Information: The linking system for the accident data is unusual in that it is based on a "paper" reference system. The linkage information is the county, route, and milepost. However, there are no physical mileposts on the roads. The investigating officer records the distance and direction to a reference point that may be an intersection, bridge, or city boundary. Mileposts are determined in a computerized referencing system, based on the location of the reference given. The accident is linked by using the milepost generated by the computerized reference system to locate the section in the Roadway Inventory file, which includes this milepost within the beginning and ending milepost defining the section. Nearly all accidents on the primary road system are linked with this system, plus a large number of accidents on the secondary roads. About 90 percent of the mileage in the reference system is in rural areas. About 80 percent of the rural accident locations are believed to be accurate within 0.16 km, and 80 to 90 percent of the urban accident locations are thought to be accurate within 30.5 m.
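The final lookup step, locating the Roadway Inventory section whose beginning and ending mileposts bracket the accident milepost, can be sketched as a binary search. The record layout and section identifiers below are hypothetical, and the sketch assumes that the sections for one county and route have already been selected and sorted. A milepost that falls exactly on a shared boundary is assigned here to the downstream section, which is an arbitrary choice.

```python
from bisect import bisect_right


def find_section(sections, milepost):
    """Return the id of the section containing the milepost, or None.

    sections: list of (begin_mp, end_mp, section_id) for one county and route,
    sorted by begin_mp.
    """
    begins = [s[0] for s in sections]
    # Index of the last section that begins at or before the milepost
    i = bisect_right(begins, milepost) - 1
    if i >= 0 and sections[i][0] <= milepost <= sections[i][1]:
        return sections[i][2]
    return None


# Hypothetical sections on one route
route_sections = [(0.0, 1.2, "S1"), (1.2, 3.0, "S2")]
match = find_section(route_sections, 0.5)
```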

Intersection characteristics are not currently available for linkage with the accident data.

Utah, HSIS

Coverage: Accident data for 1985-1994 are included, but highway data for 1990 are not available.

Of the 80,465 highway kilometers in Utah, 69,200 km are on the Roads file. However, only 20,599 km of these have inventory information and can be used for analytical purposes.

Exposure Information: The Roads file contains AADT for each section. Also given are the percentage of trucks in off-peak periods and the percentage of commercial vehicles in peak periods. No definitions of "trucks" and "commercial vehicles" are given. Together with the segment length, AADT allows VMT to be estimated.

No separate information for intersection exposure is available. The only information given for intersections is the number of intersections by segment, also separated by type of control. The intersecting roads are not identifiable.

For the State-controlled system, a Horizontal Curve file and a Vertical Grade file are also available. They allow disaggregation of exposure by grade and curvature.

For a random sample of sections of two-lane roads, a Cross Section file is available. It contains extensive information on cross-section and roadside features, including trees, posts, hydrants, recovery area, etc. This would allow the inclusion of highly specialized exposure measures, such as the number of trees passed, etc. Counts of accidents by severity are also given.

Traffic Data: As noted earlier, traffic data related to AADT and truck percentages are found on the Roadlog file. These data are based on Utah's traffic count program. In this program, there are 85 permanent ATRs on Interstate and Utah State roads that are in operation 365 days/year. Of these, 53 ATRs capture volume and vehicle classification counts and 32 ATRs count volume only. These ATRs conform with FHWA's HPMS guidelines. In addition, there are approximately 10 ATRs on roads inside National Parks in Utah that are operated by the National Park Service.

In addition to these permanent counts, Utah collects 48-h coverage counts at approximately 1,000 locations per year. Counts on the State-system roadway are done on a 3- to 5-year cycle. Approximately 100 traffic counting machines are used to collect traffic data for 11,426 km of State-system roads in Utah. In terms of coverage, Utah tends to have a better sample coverage of high-volume roads compared to lower functional categories. From a purely statistical perspective, a larger sample might be more appropriate for the lower functional classes of roads. However, Utah believes that limited resources for counting should be devoted to the roads that carry the bulk of the traffic. In addition to these coverage counts, approximately 100 short-term vehicle classification counts are conducted each year.

Short-term counts are expanded to AADT estimates using ATR data for roads with similar characteristics, functional class, and volume group. For a year in which no count is made, the previous year's count for a section is modified by a "growth factor" that is based on data from an "assigned" (similar) ATR station, area count data, and/or estimated statewide averages. In this manner, volume assignments are made to each section of State-system roadway each year. Finally, Utah staff also develop estimates of truck percentages and equivalent single-axle loadings (ESALs) for "on-system" roadways. Traffic information is entered into the Traffic file as it is being collected, but is transferred to the computerized system and, thus, to the Roadlog file at the end of the year.

With respect to the accuracy of the traffic information, Utah staff indicated that the data are currently being corrected so that errors would probably not be greater than ±10 percent for almost all of the sections. (Taken from the HSIS Guidebook for the Utah State Data Files.)

Linking Accident and Exposure Information: Accident and highway files contain the route number and milepost, which allow linking of the data. Intersection accidents can be identified by a code based on the officer's intersection sketch. However, they cannot be linked to a specific intersection in a segment, except if there is only one in a segment.

Washington State, HSIS

Coverage: The current HSIS files for Washington State cover the years 1993-1995. Data for 1991 and 1992 will be added later when they are available. There are approximately 120,000 accidents per year in Washington State. Approximately 42,000 of these occur on State routes, and are location coded manually, based on the scene diagram and location information on the accident report. About 20 percent of these are "citizen" reports. Omission of these citizen reports reduces the located accidents on State routes to about 34,000.

A total of 13,840 km are described in the Roadlog file. This mileage includes 11,748 km of mainline roads, and another 2,092 km of ramp front and other non-mainline roads. For example, information on each ramp for 876 interchanges is included. Interstate, U.S., and State routes are included. About 85 percent of the mileage is rural and there are about 1,408 freeway kilometers. Each record describes a homogeneous section of road, as created by HSRC from point-by-point files supplied by the State. There are a total of 41,000 sections at an average section length of 0.3 km. Although the points at which intersecting roads cross are identified, there is not sufficient information (milepost) to link in the section data for the crossing road. Thus, the Washington State data do not appear well suited to an analysis of intersection accidents.

Exposure Information: The Roadlog file includes the beginning and ending mileposts and section length, the latter two calculated by HSRC. AADT is also given. By linking with the Traffic file, additional weekday and weekend counts are available, as well as single- and double-trailer truck volume. The available roadway characteristics include surface width, lane width and type, shoulder width and type, median information, functional class, posted speed, and other information.

The Traffic file created by HSRC describes road sections with approximately constant volume. The beginning milepost is identified, and the endpoint is found as the beginning milepost for the next record. However, one must check that the route has not changed. Additional section files describe 33,000 vertical grade sections and 14,600 horizontal curve sections. These can also be linked with the Roadlog file based on beginning and ending mileposts.

Traffic Data: As noted above, traffic count data captured on the Trips file, and thus in the HSIS system, contain a number of variables. These include AADT, average weekday volume, average weekend volume, single-trailer truck percentage, double-trailer truck percentage, and various peak-hour descriptive percentages. While AADT information has been merged into the HSIS Roadlog file to facilitate rate-based analyses, the other variables can be linked with the Roadlog file through linkage variables contained in both files.

In the base traffic file from which this information is derived, a new record is begun when there is a change in the AADT. The traffic census staff go through each of the inventory groups and identify what they feel are "discontinuities" along the routes in terms of volumes. These discontinuities would represent locations where the staff expect there to be significant changes in the AADT, such as an intersection with a significant turning volume or the location of a major traffic generator such as a shopping mall exit. In short, the Traffic file is a set of "homogeneous traffic sections." Thus, even though the file is organized as "point data" with only a "beginning" milepost, the data should not change until the next milepost. (In using and merging the file, some caution should be taken to ensure that the next milepost on the file is within the same route.)
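The caution about route changes can be made concrete with a short sketch that converts the "point data" into explicit sections. The record layout is hypothetical. The sketch assumes the records are sorted by route and then by beginning milepost, and it leaves the ending milepost undefined for the last record on each route instead of borrowing one from the next route.

```python
def add_end_mileposts(records):
    """Attach an ending milepost to each traffic record.

    records: list of (route, begin_mp, aadt), sorted by route and begin_mp.
    Returns a list of (route, begin_mp, end_mp, aadt); end_mp is None for
    the last record on a route.
    """
    out = []
    for i, (route, begin, aadt) in enumerate(records):
        nxt = records[i + 1] if i + 1 < len(records) else None
        # Use the next beginning milepost only if it lies on the same route
        end = nxt[1] if nxt is not None and nxt[0] == route else None
        out.append((route, begin, end, aadt))
    return out


# Hypothetical records spanning two routes
traffic_records = [("005", 0.0, 30000), ("005", 4.2, 42000), ("090", 0.0, 15000)]
traffic_sections = add_end_mileposts(traffic_records)
```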

The basis for the traffic information is a series of permanent and non-permanent count stations across the State. There were 117 permanent ATRs in the State as of December 1993; all 117 produced volume counts. Of these permanent count stations, 70 produced vehicle classification counts, 32 produced truck weight plus classification counts, 22 produced vehicle length counts, and 47 produced speed counts.

In addition to the permanent count stations, the traffic census staff conducts approximately 3,500 weekday counts each year. Each of these is a 72-h, Tuesday through Thursday count. Approximately 400 of these include additional vehicle classification counts each year. The counts are not always taken at the exact same sites, but do cover all HPMS locations as well as certain project counts that are conducted each year. In Washington State, there are 3,200 HPMS sections. The traffic staff feel that there are approximately 5,000 unique "homogeneous traffic" sections in the State each year. Counts are made at each of these locations every other year or every third year. In addition to these counts, there are ramp counts done at 120 to 150 interchanges each year.

With respect to accuracy and completeness, the DOT staff feel that they have very good data on approximately 90 to 95 percent of the roadway in the trips system. They feel that the least accurate information on the file is the vehicle classification counts. This is due to the limited number of count stations that are, by necessity, available for this type of count. However, traffic census staff are working toward increasing the accuracy of these truck counts. Their current feeling is that the variable related to daily truck percentage in the peak hour now contains good data. The overall truck count system was redone in 1987. One of the current points of interest is to try to expand the seasonal factors for trucks to make these even more accurate.

As noted under specific variable descriptions in the later format section, certain other variables (such as "Peak Hour Percentage" and "Peak Hour Split") have significant numbers of uncoded ("zero") locations. These represent locations where counts were not made or where old, erroneous counts have been deleted from the file. Washington State staff recommend carrying forward values from the preceding valid count location in these cases.
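The recommended carry-forward can be sketched as a single pass over the file. The record layout is hypothetical. The sketch treats a zero (or missing) value as uncoded and, as an added assumption, does not carry a value across a change of route.

```python
def carry_forward(records):
    """Fill uncoded values from the preceding valid location on the same route.

    records: list of (route, begin_mp, value) in route and milepost order,
    where 0 or None marks an uncoded value.
    """
    out = []
    last_route, last_value = None, None
    for route, begin, value in records:
        if route != last_route:
            # New route: forget the value carried from the previous route
            last_route, last_value = route, None
        if value:
            last_value = value
        out.append((route, begin, last_value))
    return out


# Hypothetical peak-hour percentages with uncoded (zero) locations
peak_pct = [("005", 0.0, 9.5), ("005", 2.1, 0), ("005", 3.7, 8.0), ("090", 0.0, 0)]
filled = carry_forward(peak_pct)
```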

Linking Accident and Exposure Information: County, route, and milepost in the accident files can be used to create an 11-character variable that can be linked based on the route identifier and the beginning and ending mileposts in the Roadlog file. In the Traffic file, the beginning milepost is given, and the endpoint is assumed to be the beginning of the next record after checking that the route is the same.

Intersection volume and characteristics are only available for the mainline roads. Information for the crossing road sections cannot be linked.
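The milepost-based link to the Traffic file can be sketched as follows. The tuple layout and function names are hypothetical, and the construction of the 11-character key is not reproduced because its layout is not given here; the sketch matches on route and milepost directly.

```python
def build_segments(traffic_records):
    """Derive segment end points from consecutive Traffic file records.

    `traffic_records` is a list of (route, begin_milepost, aadt) tuples
    sorted by route and begin milepost. The end of a segment is taken as
    the beginning of the next record on the same route; the last record
    on a route gets an end of None (open-ended in this sketch).
    """
    segments = []
    for i, (route, begin, aadt) in enumerate(traffic_records):
        end = None
        if i + 1 < len(traffic_records) and traffic_records[i + 1][0] == route:
            end = traffic_records[i + 1][1]
        segments.append((route, begin, end, aadt))
    return segments


def find_segment(segments, route, milepost):
    """Return the segment containing an accident location, or None."""
    for seg in segments:
        seg_route, begin, end, _ = seg
        if seg_route == route and begin <= milepost and (end is None or milepost < end):
            return seg
    return None


# Hypothetical Traffic file records.
traffic = [("005", 0.0, 50000), ("005", 3.5, 62000), ("090", 0.0, 20000)]
segments = build_segments(traffic)
match = find_segment(segments, "005", 2.1)
```

In practice the open-ended last segment on each route would be closed with the ending milepost from the Roadlog file.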

Exposure Information in Highway Files

Highway files typically contain AADT for each segment in the file. Sometimes additional information is given, e.g., AADT for commercial vehicles or peak ADT. Together with the section length, AADT allows calculation of VMT on that section. If a segment ends at an intersection, AADT provides the number of vehicles entering and leaving the intersection from each approach. For an intersection within a segment, the same values must be assumed for the two approaches on this road.

In a formal sense, this provides enough information to calculate and analyze accident rates. However, if accident rates or accident counts in relation to AADT are used in statistical analyses, then the statistical characteristics of the AADT information in the files need to be known.
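The calculation of VMT and an accident rate from AADT and section length can be sketched as follows. The function names are illustrative, and expressing the rate per million vehicle miles is an assumption of this example rather than something specified by the highway files.

```python
def annual_vmt(aadt, length_miles, days=365):
    """Vehicle miles traveled on a section in one year."""
    return aadt * length_miles * days


def accident_rate_per_mvm(accidents, aadt, length_miles, years=1.0):
    """Accidents per million vehicle miles over `years` years.

    Assumes the same AADT applies to every year in the period.
    """
    vmt = annual_vmt(aadt, length_miles) * years
    return accidents / vmt * 1_000_000


# Hypothetical section: 2 miles long, AADT of 10,000, 15 accidents in 1 year.
vmt = annual_vmt(10000, 2.0)
rate = accident_rate_per_mvm(15, 10000, 2.0)
```

The result is only as reliable as the AADT value entered, which is the concern raised in the remainder of this section.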

There are basically three types of accident studies:

(1) Making and comparing aggregate estimates.

(2) Studying relationships between accidents and highways and other factors using segments or intersections as observations.

(3) Identification of hazardous locations--"black spots."

The statistical characteristics of the AADT information affect these analyses in different ways.

The AADT values for the many sections of a highway file are derived from relatively few actual counts. At continuous counting stations, counts are made 24 hours a day, 365 days a year. At temporary counting stations, counts are usually made for 24 or 48 h, at intervals of 1 or several years.

There are two statistical questions: (1) what are the sampling characteristics of the actual counts, and (2) how are the AADT values for the sections without counts obtained from those for the sections with counts?

The answers to these questions determine the statistical analyses that can be validly performed with accident rates as dependent, or AADT as independent, variables.

To allow generalization beyond the sites with actual counts, sites should be randomly sampled from a well-defined "frame," e.g., all sections on Interstate highways. This is often not done. Historically, "judgment" samples have often been used: sites were selected that experts thought to be "typical" or to represent the entire range of highway characteristics. While a judgment sample can give unbiased estimates, one cannot be certain of this. In particular, one cannot validly predict the errors of estimates based on judgment samples.

At the temporary counting stations, there is also sampling over time. If counting is spread over the entire year rather than confined to certain parts of it, sampling over time may be adequately close to random sampling.

Statistical analyses of a sample yield estimates (totals or averages) for the entire sampling frame. In this application, the estimate would be the number of all vehicles entering intersections on the highway network constituting the frame, or the AADT representing an average over all sites on this highway network.

If the sample is stratified, then the estimates apply to each stratum separately, and estimates for all strata combined can also be obtained.

Such estimates can be used for studies of broad questions, e.g., comparing accident risks among highway systems, among highways with different numbers of lanes, classes, and intersections, etc. The level of detail such studies can consider is limited, because each stratum provides a single observation. However, if a detailed sampling plan is developed that stratifies according to many factors and their interactions, then even if the minimum of two sampling sites per stratum is used, detailed analyses may be possible.
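The way stratum estimates combine into a frame-wide estimate can be sketched with the textbook stratified-sampling estimator. This is an illustration under stated assumptions, not a method taken from any State's procedures: sites are assumed to be drawn by simple random sampling within each stratum, and the stratum sizes and counts shown are invented.

```python
from math import sqrt
from statistics import mean, variance


def stratified_mean(strata):
    """Estimate the frame-wide mean AADT and its standard error.

    `strata` is a list of (sections_in_stratum, sampled_aadt_values).
    Each stratum needs at least two sampled sites; with only one, the
    within-stratum variance cannot be estimated.
    """
    total_sections = sum(size for size, _ in strata)
    estimate, var = 0.0, 0.0
    for size, sample in strata:
        n = len(sample)
        if n < 2:
            raise ValueError("each stratum needs at least two sampled sites")
        weight = size / total_sections
        fpc = 1 - n / size  # finite population correction
        estimate += weight * mean(sample)
        var += weight ** 2 * fpc * variance(sample) / n
    return estimate, sqrt(var)


# Hypothetical frame: 100 sections in one stratum, 300 in another,
# with two counted sites in each.
strata = [(100, [10000, 12000]), (300, [4000, 6000])]
estimate, std_error = stratified_mean(strata)
```

The error check reflects the minimum of two sampling sites per stratum mentioned above.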

One limitation of this type of analysis is that it does not allow identification of high-risk sites or "black spots." Highway data files contain information that, in principle, allows identification of such black spots, e.g., AADT for short highway sections. With this information, an analyst can calculate accident risks for sections and intersections, and identify high-risk locations. However, without fully understanding how the AADT values for the individual sections are obtained from the relatively few sites with actual counts, the analyst cannot assess the statistical characteristics of the AADT values, and analyses based on them may be invalid.

One approach is to assign to each section the value of the preceding section until a section with an actual count is encountered, then carry that count forward, and so on. An alternative is to linearly interpolate AADT on the sections between counting stations. While such approaches may give realistic order-of-magnitude estimates, and may even be quite accurate under certain conditions, this is not guaranteed. Thus, estimates of accident rates based on them can be biased and unrealistic.

A more subtle, but no less important, aspect is that the estimates are not independent. Usually, the estimates on adjacent sections are positively correlated. A consequence is that analyses that use individual sections with their accident counts and AADT values as observations tend to underestimate the uncertainties and errors of the results. They may also lead to the identification of "black spots" that appear to have unusually high accident risks only because the variability of the calculated rates is underestimated. Therefore, the statistical value of AADT figures by segment is very limited when there is no indication of the stations and the method from which they were derived.
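The two ways of assigning AADT to uncounted sections, carrying a count forward and interpolating linearly between stations, can be sketched as follows. The function names and station values are hypothetical, and the handling of sections that lie outside the range of the stations (the nearest station's value is used) is an assumption of this example.

```python
from bisect import bisect_right


def assign_carry_forward(stations, milepost):
    """AADT from the nearest counting station at or before `milepost`.

    `stations` is a list of (milepost, counted_aadt) sorted by milepost.
    """
    station_mps = [mp for mp, _ in stations]
    i = bisect_right(station_mps, milepost) - 1
    return stations[max(i, 0)][1]


def assign_interpolated(stations, milepost):
    """AADT linearly interpolated between the two surrounding stations."""
    station_mps = [mp for mp, _ in stations]
    i = bisect_right(station_mps, milepost) - 1
    if i < 0:
        return stations[0][1]
    if i >= len(stations) - 1:
        return stations[-1][1]
    (mp0, aadt0), (mp1, aadt1) = stations[i], stations[i + 1]
    return aadt0 + (aadt1 - aadt0) * (milepost - mp0) / (mp1 - mp0)


# Hypothetical route with counts at mileposts 0 and 10.
stations = [(0.0, 10000), (10.0, 20000)]
carried = assign_carry_forward(stations, 4.0)
interpolated = assign_interpolated(stations, 4.0)
```

Under either rule, every section between two stations inherits its value from the same one or two counts, so an error in a count is shared by all of those sections. This shared dependence is one reason the derived values cannot be treated as independent observations.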