Estimating Confidence Intervals for Transport Mode Share
STEPHEN D. CLARK *,1 JOHN MCKIMM 2
ABSTRACT
One of the common statistics used to monitor transport activity
is the total travel by a particular method or mode and, for each
mode, this share is routinely expressed as a percentage of total
personal travel. This article describes a simple model to estimate a
confidence interval around this percentage using Monte Carlo
simulation. The model takes into account the impact of both
measurement errors in counting traffic and daily variations in
traffic levels. These confidence intervals can then be used to test
reliably for significant changes in mode share. The model can also
be used in sensitivity analysis to investigate how sensitive the
width of this interval is to changes in the size of the measurement
errors and daily fluctuations. A bootstrap technique is then used to
validate the Monte Carlo estimated confidence interval.
KEYWORDS: Mode share, confidence intervals, Monte
Carlo, bootstrap.
INTRODUCTION
The last 5 to 10 years in United Kingdom transport have seen the
establishment of an increasing number of targets against which the
performance of the transport system is to be measured. Many of these
targets are expressed in precise numerical terms, and sophisticated
monitoring regimes are in place to determine the current value of
the measure of interest. In some cases, this monitoring can provide
complete information about the measure (the population), but more
commonly only information on a sample of the measure is possible.
Information from the sample is then used to infer the behavior of
the population. Statistics tell us that all samples are subject to
variation and in judging the value of an indicator (and in
particular whether a target has been achieved) some account of this
variability is necessary. Therefore, it is important to ensure that
the precision of the monitoring regime that estimates the required
indicator is compatible with the specified target level for the
indicator.
The following section presents the background to the statistic to
be modeled in this paper: the percentage of people who travel by a
particular mode. The next section describes the Monte Carlo
technique used to estimate the confidence interval around this
statistic. The following section presents the survey methodology
used by the city of Leeds in the United Kingdom to collect the base
data. By using the information on how the base data were collected,
ranges can be set for likely measurement errors and daily variation,
which are detailed in the next two sections. A number of the
implicit assumptions that result from this exercise are then
highlighted. We next report on the application of the Monte Carlo
technique and the issues surrounding the sensitivity analysis and
sample size determination. The penultimate section uses the
technique of bootstrap estimation to "validate" the Monte Carlo
estimates of mode share deviation. The final section provides some
suggestions on how the technique can be adapted for other purposes.
MODE SHARE STATISTICS
Local government authorities regularly undertake surveys to
measure the volume of traffic and travel in their areas to aid in
planning services and targeting investment. The measure of travel
usually adopted is that by people rather than by vehicle. This
allows for a more meaningful measure of travel to be estimated,
because, for example, a fully loaded bus carries far more people
than a single car. These surveys can range over a designated area
(e.g., a town or city), be concerned purely with journeys across a
designated cordon, or may result from an individual or household
travel diary.
Because the volume of total travel in different areas varies, it
is common to present, for each mode, these volumes as a share of the
total travel volume in the area and to express this share as a
percentage. This then enables a comparison to be made of mode shares
between areas. Also, if such surveys are conducted at regular time
intervals, then trends in each mode of travel can be identified.
Concerns arise when these surveys are based on a small sample
size, maybe as few as one full day of observation (Royal Statistical
Society 2005; USDOT 2003). These small sample sizes should not,
however, be much of a surprise since, typically, a six-hour survey
in a large metropolitan area may cost upwards of £10,000 (about
$18,000). Obtaining a more reliable estimate of the mode share and
the precision of this estimate would require more survey days; just
to halve the standard error of the mean estimate requires three
extra days, bringing the cost of the survey to £40,000 (about
$72,000). But without an indication of this sampling variability, it
is difficult to conclude that any observed changes are real and
statistically significant.
Some survey techniques, such as stated preference surveys,
attempt to estimate mode share, and, since they use well-understood
statistical models, they are able to provide confidence bounds
around any mode share estimates (Ortuzar and Willumsen 1994). Such
surveys are, however, typically concerned with making a choice that
involves at least one hypothetical alternative. Furthermore, they
have other errors that may lead to greater imprecision than already
present and are costly to administer and analyze.
The study described in this paper focuses on an alternative form
of data, namely revealed preference data, where the modes actually
used by individuals are recorded. Also, this information is provided
in an aggregate form of travel data (i.e., the number of people
traveling by the different modes) rather than the disaggregate form
of household or individual travel diaries.
MONTE CARLO SIMULATION
Simulation is an attempt to replicate a real world phenomenon
using a model and a set of simplifying assumptions. One form of
simulation that involves the assessment of the behavior of random
variables (e.g., observed traffic flow or vehicle occupancy) is the
Monte Carlo approach. The method assumes that the traffic flow (or
other variable) follows a statistical probability distribution. As
part of the simulation process, repeated instances of random
observations are taken from this assumed distribution, and the
impact of these random draws on some output measure is recorded.
Using this simple sampling approach, many replications can be made,
and a reliable estimate of the output measure and its spread can be
obtained.
This paper uses the Monte Carlo approach to simulate the observed
differences that can occur as a result of measurement error and
daily variation associated with the conduct of a cordon traffic
survey. By obtaining a large simulated set of these errors and
variations and using them to "correct" the observed count, it will
be possible to calculate a set of confidence intervals around the
output measure, in this case mode share.
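The general procedure can be sketched in a few lines of Python. This is only a minimal illustration of the draw, record, and summarize loop, not the spreadsheet model used in this paper: the two counts, the assumed normal spreads, and the function name are all hypothetical.

```python
import random
import statistics

def simulate_car_share(car_count, other_count, car_sd, other_sd,
                       replications=5000, seed=1):
    """Monte Carlo sketch: draw counts from assumed normal
    distributions and record the resulting car share (%)."""
    rng = random.Random(seed)
    shares = []
    for _ in range(replications):
        cars = rng.gauss(car_count, car_sd)        # random draw for cars
        others = rng.gauss(other_count, other_sd)  # random draw for all other modes
        shares.append(100.0 * cars / (cars + others))
    # Summarize the output measure: its center and its spread.
    return statistics.mean(shares), statistics.stdev(shares)

# Hypothetical inputs: 6,000 car travelers and 4,000 travelers by other modes.
mean_share, sd_share = simulate_car_share(6000, 4000, 300, 200)
print(f"mean share = {mean_share:.1f}%, standard deviation = {sd_share:.2f}%")
```

The spread of the recorded shares is what later provides the confidence interval; the more replications are made, the more stable this estimate of spread becomes.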
While results from the statistical literature allow the
distribution for a mode share to be established (see appendix), this
closed-form distribution approach has a number of
disadvantages:
- reliable estimates of the parameters for the distributions are
difficult to obtain, because few sample observations are
available;
- incorporating sophisticated multivariate relationships into
the model is necessary, because, for example, the estimate of the
share of travel by rail will impact on the share by all other
modes; and
- the model and the methodology need to be easily explainable to
nonstatisticians; mathematical models involving Greek symbols are
not useful to such an audience.
Monte Carlo approaches have been used previously in the
transportation field. These include structural reliability
(Pothisiri and Hjelmstad 2003; Zhao and Ang 2003), traffic modeling
(Cassidy et al. 1994; Tarko 2000), network reliability (Chen et al.
1999, 2002; Lam and Xu 1999), and activity modeling (Kreihich 1979;
Veldhuisen et al. 2000; Castiglione et al. 2003).
Perhaps the most similar study to the work described here is that
reported in Williamson et al. (2002), where the Monte Carlo approach
was used to investigate whether short period traffic counts (of 5-,
10-, and 20-minute duration) can accurately represent hourly traffic
counts. The first stage was to assume a Weibull distribution for the
count data and to estimate the scale and shape parameters of the
distribution. In the next stage, 1,000 instances of 60 observations
(1 observation for each minute) from the appropriate Weibull
distribution were generated and used to construct a cumulative
distribution plot. From this plot, 90% confidence intervals were
estimated, and if the actual observed hourly count fell within this
interval then the estimation was deemed a success. An application of
this methodology showed that contiguous 20-minute counts were
required in order to accurately estimate an hourly traffic count.
SURVEY METHODOLOGY
This section describes the survey methodology used to collect the
data for the example application of the Monte Carlo simulation. A
thorough understanding of the survey methodology is important,
because this will later help in defining the ranges for measurement
errors and daily variations. All the data here (except rail data)
were obtained from on-street observation by a team of enumerators,
where all movements in one direction, across a datum line, were
recorded. A discussion of the methodology for each mode of travel
follows.
Cars. Each enumerator was asked to count the number of
cars, categorized by the number of occupants (1, 2, 3, and 4 or
more). Depending on the volume of traffic on the road, they may also
have been required to count goods vehicles and cyclists.
Goods vehicles and cyclists. If the person who was
counting cars could not handle this category of traffic, another
enumerator was used to count these vehicles. Cyclists using
dedicated paths or the pedestrian pavement were included in the
count.
Buses. An enumerator recorded the type of bus observed and
made a roadside assessment, without boarding the bus, of how full it
was. Four types of buses were counted: mini, single deck, double
deck, and articulated. The occupancy was recorded as empty,
one-quarter full, half full, three-quarters full, full, and full
with standing passengers.
Rail. The local Passenger Transport Executive (PTE)
provided an estimate of the average volume of passengers arriving at
the central train station. This estimate was based on onboard head
count surveys conducted by train operator staff on three days during
a year and was supplemented by additional PTE commissioned counts.
They were then reconciled with other databases to provide an
adjusted estimate.
Walk. The number of people walking across the datum line
was recorded.
The surveys of the 34 radial roads into Leeds City Centre (figure
1) were conducted over 17 separate weekdays in May 2002 from
7:30 a.m. to 9:30 a.m. and 2:00 p.m. to 6:00 p.m. The number of
radial roads surveyed on each day varied from one to five, but
each road was counted only once.
The next two sections present ranges for the possible accuracy of
the counts and the degree of daily variability during the morning
peak. To a large degree, a Delphic approach (Dajani and Gilbert
1975), involving transport planners, survey managers, and
statisticians, was used to arrive at a consensus opinion on the size
of these error ranges. Other ranges may be used without invalidating
the general Monte Carlo approach presented here.
MEASUREMENT ERROR
The measurement error assesses the accuracy of the enumerator
counts. This is equivalent to comparing two (or more) counts of the
same thing at the same time by different people to see how close
they are in agreement. Clearly, this will depend on the skill and
expertise of the staff involved.
Cars. The Traffic Appraisal Manual (DfT 2003)
suggests that a skilled enumerator can achieve a 95% confidence
interval accuracy of ±10%. Our own validation checks conducted by a
second enumerator suggest that an interval of between ±5% and ±15%
is usual. An error range of ±10% was selected for estimating the
volume of single-occupant cars and a slightly larger range of ±12%
for cars with more than one occupant, because this is a slightly
more complex task.
Buses. It is likely that buses will be counted with more
accuracy than cars, since they are a more visible presence on the
road. Conversely, the measure of occupancy is likely to be
inaccurate, because estimations of the occupancy must be made from
the roadside. Table 1 gives the volume measurement and occupancy estimation errors
for each type of bus. Minibuses have an error range similar to cars.
All other bus types have a reduced error range, because they should
be more noticeable. The error in estimating the occupancy of
minibuses is low, because it is relatively easy for a quick and near
precise estimate to be made. It is slightly more difficult to
estimate the occupancy of single-deck buses. The most difficult task
is estimating the vehicle occupancy of double-deck and articulated
buses: with double-deck buses, it is very difficult to judge how
full the top deck is; and with articulated buses, there is a large
volume of information to assess visually. For these reasons, the
occupancy error was set high at ±15%.
Rail. The PTE who provided the estimates for rail
patronage judged the numbers to be accurate within a range of
±5%.
Walk and pedalcycle. Both these volumes are thought to be
recorded at similar levels of accuracy to each other, near the ±10%
mark.
Powered two-wheelers (PTWs). A PTW vehicle can be an
inconspicuous part of the traffic. They do not necessarily keep to
designated lanes and can easily speed along the carriageway or weave
between lanes. This rationale led to a high measurement error range
of ±15%.
So far in this section, only the errors specific for each mode of
travel have been quantified. In addition, it is not unreasonable to
assume that there is a global error that affects all the modes
counted on the same day. This may be due to generally unfavorable
(foggy or wet) or favorable (dry and warm) roadside conditions. This
global error is in addition to the mode-specific errors for all
road-based volumes (i.e., not rail) and in this way modifies the
mode-specific errors.
For the global volume errors, the range was set at ±5%. This
means that, for example, a sample value for the error in estimating
the volume of single-occupant car traffic was in the range of ±15%
(a mode-specific element of ±10% and a global element of ±5%). Buses
also have a global occupancy error to reflect the fact that in
certain conditions (e.g., misty windows) occupancy in all buses will
be difficult to estimate and also that rounding (to the nearest
quarter) is involved. For buses, the global volume error was the
same as for the other road-based modes, and the global occupancy
error was set high at ±15%.
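One way to picture how the mode-specific and global elements combine is sketched below. It assumes that the two elements are independent normal draws that are simply added, that one global draw is shared by every road-based mode counted on the same day, and that 95% of draws fall inside each stated range (the convention adopted later in this paper). The function names and the dictionary layout are hypothetical.

```python
import random

Z95 = 1.96  # two-sided 95% point of the standard normal distribution

def sample_error(rng, half_range):
    """Draw a proportional error whose 95% range is +/- half_range."""
    return rng.gauss(0.0, half_range / Z95)

def sample_day_errors(rng, mode_ranges, global_range=0.05):
    """Return one combined volume error per mode for a single survey day.

    One global draw is shared by every road-based mode counted that day;
    each mode then adds its own independent draw (assumed additive).
    """
    global_error = sample_error(rng, global_range)
    return {mode: sample_error(rng, half_range) + global_error
            for mode, half_range in mode_ranges.items()}

# Mode-specific volume ranges quoted in this section (road-based modes only).
MODE_RANGES = {"car_single": 0.10, "car_multi": 0.12,
               "walk": 0.10, "pedalcycle": 0.10, "ptw": 0.15}

rng = random.Random(42)
errors = sample_day_errors(rng, MODE_RANGES)
for mode, err in errors.items():
    print(f"{mode}: {err:+.3f}")
```

Because the global draw is common to all modes on a given day, it pushes every road-based count in the same direction, whereas the mode-specific draws act independently.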
DAILY VARIATION
In addition to measurement error, taking into account the natural
daily fluctuations that occur in traffic volumes is necessary. These
variations can result from many causes; for example, a person may
change his or her mode of travel or time of departure on successive
days. Even if we were to count traffic with perfect accuracy, these
daily variations will still be present in our data, and, in this
section, estimates of the extent of these fluctuations are
provided.
Cars. The daily variation in the volume of people
traveling by car is specified for each category of car occupancy.
These are set at ±5% for single- and double-occupant cars, ±8% for
three-occupant cars, and ±12% for four or more occupants in a car.
Some published evidence supports these ranges of variation. Phillips
(1979) used a range of coefficients of variation of between 2.5% and
15% in determining the sample size for daily traffic flow
estimation. Fox et al. (1998) suggested that a range for the
coefficient of variation of 8% to 15% is appropriate, and, in the
peak period, this value can be at the low end of this range (near
10%).
Buses. Buses run on a regular schedule each day, and,
therefore, we would expect only small day-to-day variations in the
number of buses counted. To quantify this, information reported in
the 2003 West Yorkshire Local Transport Plan (WYPTA 2003) (which
includes Leeds) shows that only 1.4% of all buses were canceled and
of those that ran, 90% were less than 6 minutes late. In addition to
this variation in the volume of scheduled buses, there was also
variation in the average occupancy of buses. Both the volume and the
occupancy variation are limited to ±5%.
Rail. Like buses, the volume of rail travel should be
consistent from day to day. Statistics from the Strategic Rail
Authority (2002) for the commuter rail operator in West Yorkshire
show that the level of service reliability is comparable to that for
buses. The percentage of train cancellations is 1.5%; however, the
punctuality is slightly worse for trains, with just 83.8% of trains
arriving within 5 minutes of their scheduled time (but 91.7% within
10 minutes). The range of variation was, therefore, set at ±5%,
similar to the level for buses.
Walk. The volume of walk traffic is anticipated to vary
slightly more than motorized methods of travel, because the traveler
may easily substitute another mode (e.g., as a car passenger some
days of the week or via bus on rainy days). The range was,
therefore, set at ±10%.
PTWs. This mode is thought to be a highly variable form of
travel. Statistics from the Department for Transport (1994) show
that nearly 40% of motorcycle trips take place in the summer months
and only 16% in the winter months. Many of these summer journeys
will be for leisure purposes, and, because the primary concern here
is with morning peak commuting trips, this suggests a range less
than that indicated by the statistics. The range was set at
±12%.
Pedalcycle. Like walking and PTWs, this mode is thought to
be highly variable on a day-to-day basis for many of the same
reasons (DfT 1994, 1996). Cycling can, however, be even more
unpleasant during adverse weather conditions than other modes
(primarily for safety and comfort reasons) and so the variation
range was set high at ±15%.
In addition to the mode-specific ranges of variation described
here, an additional global element of variation was applied (in a
similar manner to the global measurement error). This range of
variation was set at ±5%. As a result, and referring to the values
suggested for cars in this section, a compounded variation range of
±10% for single- and double-occupancy cars (a mode-specific element
of ±5% and global element of ±5%) is possible.
MONTE CARLO SUMMARY
Before progressing to an illustrative example to show how this
information is able to produce confidence intervals for mode share
statistics, a few points are worth making.
- Two distinct sources of uncertainty. The measurement
error represents the accuracy of the count, while the daily
variability represents the fluctuation in these counts. Even if it
were possible to count with 100% accuracy, there would still be
daily variability, and, even if every traveler made the same
journey by the same mode at the same time each day, there would
still be differences in what enumerators counted.
- Error structures. Depending on the survey methodology
adopted, the structure of the errors will change. If, instead of
classifying cars by number of occupants, one person counts both cars and
people separately, it is likely that the measurement errors will
be negatively correlated (i.e., the enumerator counts vehicles
accurately but people inaccurately, or vice versa).
- Expertise required. To set the ranges for the errors
and variability requires some expertise and assumptions. One
approach is to start with a fairly well understood measure (e.g.,
the accuracy in enumerating cars) and set other rates relative to
this.
- Count duration. The range of daily variation will
depend on the schedule of when counts are conducted. The ranges
for a survey of 25 locations, all conducted on 1 day, should be
larger than an alternative survey where 5 locations are counted on
5 days and their values summed.
- Correlation between days. In the model specified here,
no correlation exists in the errors or the variation between
consecutive days. If it appears that, for example, high errors in
counting at locations on one day would lead to a tendency to high
errors on other days, then this could be accommodated within the
model framework presented here.
- Limitations on model use. The model is purely concerned
with travel behavior in an aggregate form and no information on
the traveler's individual characteristics (e.g., gender, age,
income) is required or used. The model cannot, therefore,
anticipate the detailed results of policy interventions or produce
forecasts of future behavior.
EXAMPLE APPLICATION
To apply the Monte Carlo technique to the problem of estimating
confidence intervals, we used the Excel spreadsheet package. Excel
provides all the facilities required to conduct the simulation
(primarily the generation of random numbers, although some care is
required; see Knusel 1998). It has the tools to interpret the output
(i.e., produce graphs and tables) and is commonly available.
One aspect that still needs to be defined is the underlying
distribution from which the sample errors and levels of variation
are drawn. The simplest distribution available is the uniform
distribution where each sample value within a range is equally
likely. This does not appeal intuitively, because smaller error or
variability values would be more likely than larger values. This
requirement suggests that the normal distribution should be
used. The normal distribution does not, however, have a limiting
range; sampled values can extend between plus and minus infinity.
Clearly, these more extreme values would not be expected to arise in
practice, so we adopted the convention that 95% of the sampled error
or variability rates should be within the set ranges for errors or
variability as described above. The normal distribution is also
symmetric. If it is thought that the measurement errors are one
sided (i.e., either mostly under- or overestimates), then it is
possible to sample primarily positive or negative values.
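The 95% convention fixes the standard deviation of each sampling distribution: for a range of ±r, the standard deviation is r/1.96. The short sketch below checks this empirically for the ±10% range; the function names, the number of draws, and the seed are illustrative choices only.

```python
import random

Z95 = 1.96  # 95% of a standard normal distribution lies within +/- 1.96

def sd_from_range(half_range):
    """Standard deviation for which 95% of normal draws fall within +/- half_range."""
    return half_range / Z95

def fraction_within(half_range, draws=100_000, seed=7):
    """Empirical fraction of sampled errors that fall inside the stated range."""
    rng = random.Random(seed)
    sd = sd_from_range(half_range)
    inside = sum(abs(rng.gauss(0.0, sd)) <= half_range for _ in range(draws))
    return inside / draws

print(f"sd for a +/-10% range: {sd_from_range(0.10):.4f}")
print(f"fraction of draws inside the range: {fraction_within(0.10):.3f}")
```

Roughly 1 draw in 20 still falls outside the range, which is the intended behavior under this convention rather than a fault in the sampling.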
The sampling regime as described in this paper is built within a
workbook.
A series of 17 worksheets hold the morning peak data collected on
each of the 17 survey days. Each of these worksheets contains the
following traffic information for all sites that were counted on
that survey day:
1. the existing base case as surveyed during May 2002,
2. the sampled values for the measurement error; these errors are
applied to the observed counts so that the measurement errors they
contain are "corrected,"
3. the sampled daily variations; these are applied to the "error
corrected" values calculated in step 2 to represent values that
could reasonably be counted on a different survey day,
4. the measurement error calculations for buses; these
calculations are more complex, because they are disaggregated by
the four vehicle types and six occupancy levels,
5. the final results, which are the updated counts after the application
of both the measurement errors and daily variations.
A summary spreadsheet accumulates the updated counts for all 17
sites around the cordon to produce an overall estimate of the mode
share.
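The worksheet logic can be sketched outside a spreadsheet as follows. This is a simplified illustration rather than the workbook itself: it assumes the sampled errors and variations act multiplicatively on the counts, it omits the global elements and the separate bus occupancy calculations, it treats cars as a single category using the single-occupant ranges, and the observed counts are hypothetical.

```python
import random
import statistics

Z95 = 1.96  # assumed: 95% of normal draws fall inside the stated range

def draw(rng, half_range):
    """Sample a proportional error or variation with 95% range +/- half_range."""
    return rng.gauss(0.0, half_range / Z95)

def replicate_shares(observed, error_range, variation_range, rng):
    """One replication: correct the counts, vary them, and return mode shares (%)."""
    updated = {}
    for mode, count in observed.items():
        # Correct the observed count for a sampled measurement error.
        corrected = count * (1.0 + draw(rng, error_range[mode]))
        # Apply a sampled daily variation to the error-corrected count.
        updated[mode] = corrected * (1.0 + draw(rng, variation_range[mode]))
    total = sum(updated.values())
    return {mode: 100.0 * value / total for mode, value in updated.items()}

OBSERVED = {"car": 6000, "rail": 2500, "walk": 1500}          # hypothetical counts
ERROR_RANGE = {"car": 0.10, "rail": 0.05, "walk": 0.10}       # ranges quoted in the text
VARIATION_RANGE = {"car": 0.05, "rail": 0.05, "walk": 0.10}   # ranges quoted in the text

rng = random.Random(2002)
car_shares = [replicate_shares(OBSERVED, ERROR_RANGE, VARIATION_RANGE, rng)["car"]
              for _ in range(5000)]
car_mean = statistics.mean(car_shares)
car_sd = statistics.stdev(car_shares)
print(f"car share: mean {car_mean:.1f}%, standard deviation {car_sd:.2f}%")
```

Each call to `replicate_shares` plays the role of one recalculation of the workbook; repeating it many times builds up the distribution of the mode share.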
The process of generating repeated measurement errors and daily
variations was achieved with the aid of a simple Visual Basic macro,
and the resultant mode shares were recorded and graphed. Figure 2
shows the distribution of the mode share for cars after 5,000
such samples were conducted, which took less than 5 minutes to
calculate on a 2 GHz desktop PC.
The distribution has a mean of 60.3% and a standard deviation of
0.71%. The distribution appears normal with an estimated skewness of
0.01 and an (adjusted) kurtosis of 0.06, both of which are close to
the values expected for a normal distribution. It is, therefore,
possible to estimate a 95% confidence interval for the car mode
share between 58.9% and 61.7%. Similar confidence intervals can be
calculated for the other modes. It should be noted that the
resultant normal shape of this mode share distribution does not
depend on the normality of the underlying sampling distribution; if
a uniform sampling distribution is used, the same shape results,
albeit, with a different spread.
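As a check on the arithmetic, and assuming the usual two-sided normal critical value of 1.96 (the value is not stated explicitly above), the interval follows directly from the reported mean and standard deviation:

```latex
\[
60.3 \pm 1.96 \times 0.71 = 60.3 \pm 1.39
\quad\Longrightarrow\quad (58.9\%,\ 61.7\%)
\]
```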
SENSITIVITY ANALYSIS
The measurement errors used here could be improved on if further
resources were devoted to data collection. As an illustration of
this possibility, the question is posed as to what degree of
improvement would result from a halving in the mode-specific error
with which single-occupant cars are counted and classified, from 10%
down to 5%. When the Monte Carlo simulation model is re-run with
this new error range, the interval reduces only slightly to between
59.0% and 61.6%.
A wider view of the sensitivity of the spread in the car mode
share can be obtained by graphing the standard deviation
for a series of values of one or more of the assumed ranges. Figure 3
shows how the standard deviation of the single-occupant car
mode share changes as the single-occupant car-specific measurement
error changes from 0% to 25% and the daily variation in single-occupant
cars changes from 0% to 25%. All other ranges stay at their
default values.
As expected, the standard deviation increases as the ranges of
variation increase. Even at a 0% value for both ranges, variation
remains in the mode share for cars. This is due to the fact that the
other modes are still varying at their old levels, and, since we are
dealing with a share, their variability will also impact on the
variability of single-occupant travel by car.
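The residual variation at a 0% range can be reproduced with a deliberately simplified two-mode sketch. It collapses measurement error and daily variation into a single proportional range per mode, and the counts (6,000 by car and 4,000 by all other modes) and the ±10% range for the other modes are hypothetical.

```python
import random
import statistics

Z95 = 1.96  # assumed: 95% of normal draws fall inside the stated range

def car_share_sd(car_range, other_range, replications=5000, seed=3):
    """Standard deviation of the car share (%) when each count is perturbed
    by a normal proportional draw with the given 95% range."""
    rng = random.Random(seed)
    shares = []
    for _ in range(replications):
        cars = 6000 * (1.0 + rng.gauss(0.0, car_range / Z95))
        others = 4000 * (1.0 + rng.gauss(0.0, other_range / Z95))
        shares.append(100.0 * cars / (cars + others))
    return statistics.stdev(shares)

# Vary the car range only; the other modes keep their default range.
for car_range in (0.0, 0.05, 0.10, 0.15, 0.20, 0.25):
    sd = car_share_sd(car_range, 0.10)
    print(f"car range +/-{car_range:.0%}: share sd = {sd:.2f}%")
```

Even with the car range set to zero, the standard deviation of the car share stays above zero, because the denominator of the share still varies with the other modes.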
SURVEY IMPLICATIONS
Information on the degree of variability of the car mode share
statistic allows us to compute the minimum sample sizes required to
reliably detect a specified level of change. Using the following
equation for sample size estimation (Ortuzar and Willumsen
1994):

$$ n' = \left( \frac{z_{\alpha}\, s}{\delta} \right)^{2} $$

where
$n'$ is the required sample size,
$z_{\alpha}$ is the critical value of the standard normal
distribution at the α% level,
$s$ is the estimated standard deviation of the measured
quantity, and
$\delta$ is the minimum required change to detect,
and an example of an absolute one percentage point change as the
target, the estimated sample size is:

$$ n' = \left( \frac{1.96 \times 0.71}{1} \right)^{2} \approx 1.94 $$

which suggests that a sample size of two survey days is required
to be 95% sure that an observed change of at least one percentage
point in the average mode share for cars is significant. Table 2
shows the required sample sizes for a range of these changes
for each of the three main modes of travel, using each of the Monte
Carlo-derived estimates of the mode's standard deviation.
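The calculation is easy to repeat for other target changes. The sketch below assumes the sample size formula takes the form n′ = (z·s/δ)², which is consistent with the symbol definitions and the two-day result quoted above, and uses z = 1.96 for the 95% level; the function names are hypothetical, and the printed values are not taken from Table 2.

```python
import math

def required_days(sd, delta, z=1.96):
    """Minimum number of survey days, n' = (z * sd / delta)^2, rounded up."""
    return math.ceil((z * sd / delta) ** 2)

def detectable_change(sd, days, z=1.96):
    """Smallest change (same units as sd) detectable with a given number of days."""
    return z * sd / math.sqrt(days)

CAR_SD = 0.71  # Monte Carlo standard deviation of the car share, in percentage points

for delta in (0.5, 1.0, 2.0):
    print(f"change of {delta} points: {required_days(CAR_SD, delta)} day(s)")
print(f"detectable with 4 days: {detectable_change(CAR_SD, 4):.2f} points")
```

Inverting the formula in this way shows the smallest change that a survey of a given length can be expected to detect.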
BOOTSTRAP ESTIMATION
The technique of bootstrap estimation falls within the resampling
family of techniques (Efron 1982; Efron and Tibshirani 1993). It is
particularly useful when no simple expression is available to
compute the summary statistics for a measure or only a limited
sample size is available. The process essentially involves taking
repeated subsamples from a larger sample (with or without
replacement) and calculating the statistic of importance based on
this subsample. The distribution of these subsample statistics is
then used to infer information about the population as a whole.
The bootstrap technique has had some application within the
transport field. Rilett et al. (1999) used the technique to estimate
the variance of freeway travel time forecasts derived from an
artificial neural network. This allowed predictions to be made of
future confidence intervals for journey times along a freeway and
then used as input to Advanced Traveler Information Systems. A study
by Brundell-Freij (2000) focuses on assessing the accuracy in the
estimates produced by complex transport models. This study used both
Monte Carlo simulation and bootstrap techniques to show how
different kinds of variation in the input data affect the quality of
the final model estimates. The study suggests that these variations
can be a large but unknown feature of transport models. Hjorth
(2002) used the bootstrap technique to estimate the covariance
structure of traffic counts conducted at pairs of sites. This
information was then used to construct route flow proportions and
probabilities.
Here we are interested in using the bootstrap technique to obtain
estimates of the mode share confidence intervals from a limited
number of surveys (DiCiccio and Efron 1996; Wood 2004). If we have a
count of the traffic entering the city center on a limited number of
days at each site, it would be possible to choose, at random, one
day from each site and add them together to arrive at an estimate of
the total volume of traffic entering the city center and hence
calculate mode shares. So, for example, one bootstrap draw could
combine the counts from day five at site A, day two at site B, day
one at site C, and so on, while the next draw would combine counts
from day four at site A, day three at site B, day one (again) at
site C, and so on. A large number of these draws could be taken and
the distribution and summary statistics established for either the
total volume or the mode shares.
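A minimal sketch of this resampling scheme follows. The three sites, their three survey days each, and all of the counts are hypothetical, as are the function name and the data layout; only the procedure (one randomly chosen day per site, summed across sites) follows the description above.

```python
import random
import statistics

def bootstrap_car_share(counts, draws=4000, seed=11):
    """counts[site] is a list of per-day dictionaries {mode: people}.

    Each draw picks one survey day at random for every site, sums the
    counts across sites, and records the resulting car share (%).
    """
    rng = random.Random(seed)
    shares = []
    for _ in range(draws):
        totals = {}
        for days in counts.values():
            day = rng.choice(days)  # one randomly chosen survey day for this site
            for mode, people in day.items():
                totals[mode] = totals.get(mode, 0) + people
        shares.append(100.0 * totals["car"] / sum(totals.values()))
    return statistics.mean(shares), statistics.stdev(shares)

# Hypothetical counts: three sites, each surveyed on three days.
COUNTS = {
    "A": [{"car": 2000, "bus": 900, "walk": 300},
          {"car": 2100, "bus": 850, "walk": 280},
          {"car": 1950, "bus": 920, "walk": 310}],
    "B": [{"car": 1500, "bus": 700, "walk": 200},
          {"car": 1450, "bus": 720, "walk": 220},
          {"car": 1550, "bus": 690, "walk": 190}],
    "C": [{"car": 2500, "bus": 1100, "walk": 400},
          {"car": 2400, "bus": 1150, "walk": 420},
          {"car": 2600, "bus": 1080, "walk": 390}],
}

boot_mean, boot_sd = bootstrap_car_share(COUNTS)
print(f"bootstrap car share: mean {boot_mean:.1f}%, standard deviation {boot_sd:.2f}%")
```

Because the same day can be chosen for a site in more than one draw, the draws overlap; the spread of the recorded shares is what serves as the bootstrap estimate of the standard deviation.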
Based on the Monte Carlo simulation work described earlier,
additional surveys were conducted in May 2004, so that each radial
road into Leeds City Center was surveyed on four days rather than
the more usual one day. This sample size allows changes as small as
0.7% in the mode share for cars to be detected reliably. To increase
the representative nature of the data, the survey was designed so
that each of the 34 roads would be surveyed once on 4 different
weekdays (excluding Fridays).
Aggregating the survey data together to produce a mode share for
traffic crossing the entire cordon involved selecting 1 survey day
from the 4 possible days at each of 34 survey locations. This
produced a large number of possible combinations of days and sites,
$4^{34}$ to be precise. To make this exercise more
manageable, adjacent sites were grouped together to form seven
corridors (see figure 1). This decreased the number of possible
combinations to $4^{7}$ = 16,384. For the bootstrap exercise,
just a fraction of these combinations were used: 4,000 selected at
random from over 16,000 possibilities. The bootstrap mean of the
4,000 car mode shares selected was 57.3%, much lower than the Monte
Carlo mean value calculated in 2002.
Table 3 gives the estimated standard deviations for the mode shares of
car, bus, rail, and walk from the Monte Carlo and bootstrap
techniques. The bootstrap-estimated standard deviation for car-based
trips is 0.64%, compared with the Monte Carlo estimate of 0.71%. The
bootstrap deviation could be expected to be different for a number
of reasons:
- The range of daily variation selected for the Monte Carlo
simulation was designed to account for the variety seen throughout
the year, while the bootstrap estimate was based on just the
variability observed within one calendar month. If the surveys
used in the bootstrap estimation were conducted throughout 2004
rather than just in May, it is likely that a wider spread of
observed variation would be present and the estimated standard
deviation would increase above the 0.64% value found here.
- The survey enumerators knew there would be repeated surveys at
each site. This ability to cross check counts may have encouraged
them to be more accurate in their counting. A more accurate and
consistent set of counts would produce a smaller deviation.
- The mean share was reduced significantly: the 0.71% Monte
Carlo estimate is for a car share of about 60.3%, while the
bootstrap estimate is a lower share of about 57.3%.
FURTHER IDEAS
In this paper, a Monte Carlo simulation regime was established to
estimate the variability in mode share for a traffic cordon survey.
While the illustrative example used a specific experimental
methodology to collect the data and determine the structure of the
model, the simulation approach proposed is flexible enough to allow
the use of data that are collected through different survey designs.
Of particular note here is that no "conservation of flow" principle
has been applied to the changes (i.e., changes in one mode of travel
are not mirrored with compensatory changes in another) but if
thought necessary, this principle could easily be incorporated in
the model.
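As a rough illustration of how such a simulation might look, and of one way a "conservation of flow" adjustment could be added, consider the Python sketch below. It rests on several assumptions that are not taken from the paper: the observed counts and the ranges are invented, both the daily variation and the measurement error are drawn from uniform distributions over a plus-or-minus range, and the function name `simulate_shares` and its `conserve_flow` flag are hypothetical. When the flag is set, the net daily change in trips is handed back to the modes in proportion to their size, so that total travel stays fixed.

```python
import numpy as np


def simulate_shares(observed, daily_range, error_range,
                    n_sims=10_000, conserve_flow=False, seed=0):
    """Monte Carlo mode shares under daily variation and counting error."""
    rng = np.random.default_rng(seed)
    observed = np.asarray(observed, dtype=float)
    n_modes = observed.size
    # Daily variation: proportional change per mode (uniform, assumption).
    daily = rng.uniform(-daily_range, daily_range, size=(n_sims, n_modes))
    change = observed * daily
    if conserve_flow:
        # Offset the net change across modes so total travel is unchanged.
        net = change.sum(axis=1, keepdims=True)
        change -= net * observed / observed.sum()
    true_volume = observed + change
    # Measurement error: proportional counting error per mode (uniform).
    error = rng.uniform(-error_range, error_range, size=(n_sims, n_modes))
    counted = true_volume * (1.0 + error)
    return counted / counted.sum(axis=1, keepdims=True)


# Invented cordon totals for car, bus, rail, and walk (assumption).
observed = np.array([56000.0, 24500.0, 10500.0, 7000.0])
daily_range = np.array([0.05, 0.05, 0.05, 0.05])
error_range = np.array([0.02, 0.05, 0.03, 0.10])

free = simulate_shares(observed, daily_range, error_range)
fixed = simulate_shares(observed, daily_range, error_range,
                        conserve_flow=True)
for label, shares in (("no conservation", free), ("conserved", fixed)):
    low, high = np.percentile(shares[:, 0], [2.5, 97.5])
    print(f"{label:16s} car share 95% interval: "
          f"{100 * low:.2f}% to {100 * high:.2f}%")
```

Changing the entries of `daily_range` or `error_range` one at a time gives a simple sensitivity analysis of the interval width, which is the kind of comparison needed to judge where additional survey effort would be most worthwhile.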
There is nearly always a value in conducting more surveys to
measure the important ranges that define both measurement errors and
daily variation ("Whenever you can, count." Sir Francis
Galton). These surveys do, however, come at a cost. The model
proposed in this study can help to identify which survey methodology
has the greatest impact on the accuracy of mode share and,
therefore, provide the best value for the money.
ACKNOWLEDGMENTS
The authors would like to thank their colleagues, Ken Mason and
Mohammed Mahmood, for their valuable contributions to the work
reported here. We would also like to thank the three anonymous
referees and Dr. Susan Grant-Muller for comments on earlier drafts
of this paper. The opinions and ideas expressed in this paper are
those of the authors alone and should not be taken to be those of
Leeds City Council or its agencies.
REFERENCES
Brundell-Freij, K. 2000. Sampling, Specification and
Estimation as Sources of Inaccuracy in Complex Transport ModelsSome
Examples Analyzed by Monte Carlo Simulation and Bootstrap.
Proceedings of Seminar F, European Transport Conference, Cambridge,
England, pp. 225237.
Cassidy, M.J., Y.T. Son, and D.V. Rosowsky. 1994.
Estimating Motorist Delay at Two-Lane Highway Work Zones.
Transportation Research A 28(5):433444.
Castiglione, J, J. Freedman, and M. Bradley. 2003. A
Systematic Investigation of Variability Due to Random Simulation
Error in Activity Based Micro-Simulation Forecasting Model, paper
presented at the 2003 Annual Meetings of the Transportation Research
Board, Washington, DC.
Chen, A, H. Yang, H.K. Lo, and W.H. Tang. 1999.
Capacity Related Reliability for Transportation Networks. Journal
of Advanced Transportation 33(2):183200.
____. 2002. Capacity Reliability of a Road Network: An
Assessment Methodology and Numerical Results. Transportation
Research B (36):225252.
Dajani, J.S. and G. Gilbert. 1975. Delphic Predictions
and Cross Impact Simulation. ASCE Journal of the Urban Planning
and Development Division 10(1):4959. May.
Department for Transport (DfT). 1994. National
Travel Survey: 1991/93. London, England. Chapter 6, Motorcycles
and Their Users, pp. 3743.
____. 1996. Transport Statistics Report: Cycling in
Great Britain. London, England. Chapter 2, Cycle Traffic.
____. 2003. Design Manual for Roads and Bridges.
Volume 12, Section 1, Part 1: Traffic Appraisal Manual, Annex
10.3.6. Available at
www.official-documents.co.uk/document/deps/ha/dmrb/index.htm.
DiCiccio, T.J. and B. Efron. 1996. Bootstrap
Confidence Intervals. Statistical Science 11(3):189228.
Efron, B. 1982. The Jackknife, the Bootstrap, and
Other Resampling Plans. Philadelphia, PA: Society for Industrial
and Applied Mathematics.
Efron, B. and R.J. Tibshirani. 1993. An
Introduction to the Bootstrap. New York, NY: Chapman &
Hall.
Fox, K, S. Clark, R. Boddy, F. Montgomery, and M.
Bell. 1998. Some Benefits of a SCOOT UTC System: An Independent
Assessment by Micro-Simulation. Traffic Engineering and
Control 38(8):484489.
Hjorth, U. 2002. Traffic Sub-Flow Estimation and
Bootstrap Analysis from Filtered Counts. Transportation Research
B 36(4):345359.
Knusel, L. 1998. On the Accuracy of Statistical
Distributions in Microsoft Excel 97. Computational Statistics and
Data Analysis 26:375377. See also
http://www.stat.uni-muenchen.de/~knuesel.
Kreihich, V. 1979. Modelling of Car Availability Modal
Split and Trip Distribution by Monte Carlo Simulation: A Short Way
to Integrated Models. Transportation 8(2):153166.
Lam, W.H.K. and G. Xu. 1999. A Traffic Flow Simulator
for Network Reliability. Journal of Advanced Transportation
22(2):159182.
Ortuzar, J.D. and L.G. Willumsen. 1994. Modelling
Transport, 2nd ed. New York, NY: Wiley.
Philips, G. 1979. Accuracy of Annual Traffic Flow
Estimates from Short Period Count. TRRL Supplementary Report
514. Crowthorne, Berkshire, UK: Transport Research Laboratory.
Pothisiri, T. and K.D. Hjelmstad. 2003. Structural
Damage Detection and Assessment from Modal Response. Journal of
Engineering Mechanics 129(2):135145.
Rilett, L.R., D. Park, and B. Gajewski. 1999.
Estimating Confidence Interval for Freeway Corridor Travel Time
Forecasts. Proceedings of the 6th World Congress on Intelligent
Transport Systems, Toronto, Canada.
Royal Statistical Society. 2005. Performance
Indicators: Good, Bad and UglyThe Report of the Working Party on
Performance Monitoring in the Public Services, chaired by Professor
S.M. Bird, submitted October 23rd, 2003. Journal of the Royal
Statistical Society A 168(1):127. Available from
www.rss.org.uk.
Strategic Rail Authority. 2002. On Track.
London, England. Available at www.sra.gov.uk.
Tarko, AP. 2000. Random Queues in Signalized Road
Networks. Transportation Science 38(4):415425.
U.S. Department of Transportation (USDOT), Bureau of
Transportation Statistics. 2003. Guide to Good Statistical Practice
in the Transportation Field. Available at http://testcentral.bts.gov:8080/publications/guide_to_good_statistical_practice_in_the_transportation_field/.
Veldhuisen, J., H. Timmermans, and L. Kapoen. 2000.
Microsimulation Model of Activity-Travel Patterns and Traffic Flow:
Specification, Validation Tests and Monte Carlo Error.
Transportation Research Record 1706:126135.
West Yorkshire Passenger Transport Authority (WYPTA).
2003. West Yorkshire Local Transport Plan: Annual Progress
Report, 20022003. West Yorkshire, UK.
Williamson, D.G., M. Yao, and J. McFadden. 2002. Monte
Carlo Simulation in Sampling Techniques of Traffic Data Collection.
Transportation Research Record 1804.
Wood, M. 2004. Statistical Inference Using Bootstrap
Confidence Intervals. Significance 1(4):180182.
Zhao, Y.G. and A.H.S. Ang. 2003. System Reliability
Assessment by Method of Moments. Journal of Structural
Engineering 129(10):13411349.
APPENDIX
Distributional Alternative
A question arises as to whether any results from the statistical
literature will allow inferences to be made concerning the
distribution of a share:
$$S_1 = \frac{X_1}{X_1 + X_2}$$
where
$S_1$ is the share of mode one,
$X_1$ is the volume of mode one,
$X_2$ is the volume of all other modes, and
the distributions of both $X_1$ and $X_2$ are known.
The γ distribution is one form of distribution that is
quite flexible in the range of distributional shapes that it can
represent. Another feature of the γ distribution is that if
$X_1$ and $X_2$ are independent γ-distributed random variables with
$X_1 \sim \gamma(\alpha_1, \beta_1)$ and
$X_2 \sim \gamma(\alpha_2, \beta_1)$, then $S_1$ has a β
distribution, $S_1 \sim \beta(\alpha_1, \alpha_2)$. Critical to the
use of this result is that both γ distributions have the same (or, in
practice, very similar) value for the scale parameter, $\beta_1$.
In the context of the data used in this study, each mode of
travel will have a different scale; the volume of travel by car is
greater than that by bus and rail. This suggests that the β
distribution approach to modeling the distribution of mode share may
not be realistic.
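The role of the common scale parameter can be checked numerically. The Python sketch below is an illustration only, with invented shape and scale values and hypothetical function names (`simulate_share`, `beta_mean_var`). It simulates the share from two independent γ variables and compares its mean and variance with those of a β distribution with parameters α1 and α2, first with equal scales and then with unequal scales.

```python
import numpy as np


def simulate_share(alpha1, alpha2, scale1, scale2, n=200_000, seed=0):
    """Simulate S1 = X1 / (X1 + X2) for independent gamma X1 and X2."""
    rng = np.random.default_rng(seed)
    x1 = rng.gamma(shape=alpha1, scale=scale1, size=n)
    x2 = rng.gamma(shape=alpha2, scale=scale2, size=n)
    return x1 / (x1 + x2)


def beta_mean_var(a, b):
    """Theoretical mean and variance of a beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


# Invented parameter values (assumption), used only for illustration.
alpha1, alpha2 = 6.0, 4.0
beta_mean, beta_var = beta_mean_var(alpha1, alpha2)

equal = simulate_share(alpha1, alpha2, scale1=1000.0, scale2=1000.0)
unequal = simulate_share(alpha1, alpha2, scale1=1000.0, scale2=300.0)

print(f"beta theory   : mean {beta_mean:.4f}, variance {beta_var:.5f}")
print(f"equal scales  : mean {equal.mean():.4f}, "
      f"variance {equal.var():.5f}")
print(f"unequal scales: mean {unequal.mean():.4f}, "
      f"variance {unequal.var():.5f}")
```

With equal scales the simulated moments should agree closely with the β values, while with unequal scales they should not, which is the situation the appendix describes for modes carrying very different volumes of travel.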
END NOTE
1. Available from http://www.stephenclark.clara.net/.
ADDRESSES FOR CORRESPONDENCE
Corresponding author: S. Clark,
Transport Policy Monitoring, Development Department, Leeds City
Council, Leonardo Building, 2 Rossington Street, Leeds LS2 8HD
England. E-mail: Stephen.clark@leeds.gov.uk
J. McKimm, Transport Policy
Monitoring, Development Department, Leeds City Council, Leonardo
Building, 2 Rossington Street, Leeds LS2 8HD England. E-mail: john.mckimm@leeds.gov.uk