A Self-Instructing Course in Disaggregate Mode Choice Modeling









                    A Self-instructing Course
                    in Disaggregate
                    Mode Choice Modeling

                    Final Report
                    December 1986


                    Prepared by

                    Joel L. Horowitz
                    University of Iowa
                    Iowa City, Iowa 52242
                       and
                    Frank S. Koppelman
                    Northwestern University
                    Evanston, Illinois 60201
                       and
                    Steven R. Lerman
                    Massachusetts Institute of Technology
                    Cambridge, Massachusetts 02139

                    Prepared for

                    University Research and Training Program
                    Urban Mass Transportation Administration
                       (now Federal Transit Administration)
                    Washington, D.C. 20590

                    Distributed in Cooperation with

                    Technology Sharing Program
                    U.S. Department of Transportation
                    Washington, D.C. 20590


                    DOT-T-93-18





                                MODULE 1
                              INTRODUCTION

1.1   The Motivation for This Course

   Many practical transportation policy issues are concerned with mode
choice.  For example, the gain or loss in transit revenues caused by a
fare increase depends on how travelers' mode choices are affected by the
increase.  If few current transit riders switch to other modes because
of the fare increase, transit revenues will increase proportionally to
the increase in fare.  But if many riders switch to other modes,
revenues will increase less than proportionally to the fare increase and
may decrease.  Similarly, the effects of changes in transit routes and
schedules on ridership, revenues, and traffic congestion all depend on
how the changes affect individual travelers' mode choices.  The
effectiveness of programs to encourage ridesharing -- for example,
preferential parking or preferential access to freeways for carpools --
also depends on how the programs affect mode choice.  In most
situations, planners must choose among a variety of fare schedules and
service designs.  An understanding of the separate and combined effects
of these decisions on travel mode choice is essential to selection of
the best plan to meet specific transportation objectives.
   The importance of mode choice in transportation policy analysis and
decision making has led to a variety of methods for predicting the
effects of policy measures on travelers' mode choices.  Two well-known
and frequently used prediction methods are the method of elasticities
and aggregate mode split modeling.  Both of these methods have serious
defects that greatly restrict their practical usefulness.  For example,
the method of elasticities cannot predict accurately the effects of
making several changes in transit service simultaneously (e.g., of
increasing both the fare and the schedule frequency or of adding a new
route to the system).
Aggregate mode split models can be exceedingly costly and cumbersome to
develop.  Moreover, they are subject to serious biases and prediction
errors owing to their reliance on aggregate travel data rather than
records of individual trips.  The range of policy questions that can be
treated with aggregate models is quite limited.  For example, it usually
is not possible to carry out multimodal analyses with these models
(e.g., analyses in which it is necessary to predict the use of several
different modes such as bus transit, rail transit, carpool, and single-
occupant automobile).
   This course is concerned with a third class of mode choice models,
called disaggregate models, that have substantial practical advantages
over both elasticity methods and aggregate mode split models. 
Disaggregate models achieve a higher degree of policy sensitivity than
either elasticity or aggregate mode split models.  Disaggregate models
can represent a wider range of policy variables than can either
elasticity or aggregate models, and they can treat multimodal problems
without difficulty.  Moreover, disaggregate models avoid the biases
inherent in aggregate models, and they are much more efficient than
aggregate models in terms of data and computational requirements. 
Disaggregate models can be developed using data from only 1000-3000
households -- less than one tenth the number required by aggregate
models -- and they can be implemented on microcomputers.  In fact, as
the examples given later in this course will show, many useful
applications of disaggregate models can be made by hand with the aid of
a desk calculator.
   Disaggregate mode choice models have been available for use in
transportation planning and policy analysis for nearly 15 years.  Many
transportation agencies now use these models for practical policy
analysis.  This makes it important for transportation professionals to
understand the principles underlying the development and use of
disaggregate models, since failure to understand these principles can
lead to the development of seriously erroneous models and to serious
prediction errors.
   Unfortunately, materials that explain how to use disaggregate models
are not readily available.  Most descriptions of disaggregate modeling
techniques are written for members of the research community or for
graduate students.  People in both groups have extensive backgrounds in
mathematics and statistics, and graduate students may be able to spend
several months learning to use the techniques.  Consequently, the
available descriptions emphasize the mathematical and statistical
details of the techniques and, thereby, convey the impression that the
techniques are useful mainly to researchers and can be used only by
people with considerable mathematical training.  This is a false
impression.  The main concepts and methods of disaggregate mode choice
modeling can be understood and applied by anybody who has mastered
high-school algebra.  The purpose of this course is to explain what
disaggregate mode choice models are, how they work, and how they can be
applied to practical problems, and to do this with a minimum of
mathematics and jargon.

1.2   Description of the Course

   This is a complete, self-instructing course in disaggregate mode
choice modeling.  It includes a text, worked examples, problems for
readers to solve, and solutions to the problems.  The course is designed
for readers who are familiar with urban transportation planning issues
and methods and have knowledge of mathematics at the level of high
school algebra.  No prior
familiarity with statistics or computer programming is needed.  The only
equipment required, apart from pencil and paper, is a desk calculator. 
A supplement to the course provides problems to be worked on a
microcomputer.  This supplement may be skipped by readers without access
to an IBM-compatible microcomputer.  The course is divided into
self-contained modules of 1-2 hours' duration.  It is expected that most
individuals will be able to complete the entire course in 15-20 hours of
work.
   The purpose of the course is to familiarize readers with the basic
concepts and methods of disaggregate mode choice modeling and to do so
with a minimum of mathematics and technical jargon.  It is designed to
help readers understand how disaggregate models work and why they are
useful so that readers can become informed users of these models and
their outputs.  The course will not make experts out of its readers.  No
short, self-instructing, non-mathematical course could do this.  However,
this course will enable readers to understand what the experts are doing
(or should be doing) and how the results can be used.  It also will
enable readers to do some of the things that they previously may have
thought required the services of an expert.  Readers who wish to achieve
a more detailed understanding of disaggregate models or a higher level
of expertise than this course provides should consider taking a college
course in travel demand modeling or reading one or more of the
references listed at the end of this module.
   The course consists of 7 modules, including this introduction. 
Modules 2-4 describe the conceptual foundations of disaggregate mode
choice modeling.  These modules explain the assumptions about travel
behavior that underlie disaggregate mode choice models and show how
these assumptions are represented in models suitable for use in
practical analysis.  Numerical
examples are given that illustrate the usefulness of the behavioral
assumptions and the plausibility of mode choice models based on these
assumptions.
   Modules 5-7 are concerned with the practical development and
implementation of mode choice models.  Module 5 discusses the
explanatory variables that typically are used in disaggregate mode
choice models, the choices that analysts face in selecting variables,
and the practical consequences of alternative choices.  Module 6
explains how disaggregate mode choice models are estimated or
calibrated.  This module also describes the data requirements of these
models and discusses how the models can be tested empirically. 
Particular emphasis is placed on practical procedures for determining
whether the correct explanatory variables have been used in a model and
on comparing different versions of the same model to determine which
provides the best explanation of the available data.  Module 7 explains
how aggregate travel demand can be predicted using disaggregate mode
choice models.
   The modules build on one another.  Each uses material from its
predecessors, and none can be understood without first understanding its
predecessors.  Therefore, readers are strongly advised to work through
the modules in sequence without skipping any.  Each module contains
numerical examples that illustrate the material being presented, and
each includes problems for the reader to solve.  Readers are urged to
work through the numerical examples and to understand them fully. 
Readers are also urged to solve the problems.  It is possible to gain a
complete understanding of the ideas presented here only by working with
them.  The problems provide an opportunity to do such work.  Solutions
to the problems are given following Module 7.

   The time required to work through a module will vary greatly among
both modules and readers.  It is likely that most readers will be able
to work through the text and examples of most modules in 1-2 hours,
although some modules and readers may require more or less time. 
Working the problems at the end of a module may require an additional
hour.

1.3 Summary

   Disaggregate mode choice models have important practical advantages
over other available methods for predicting the consequences of
transportation policy measures that affect mode choice.  This course is
designed to provide readers with a working knowledge of practical
disaggregate mode choice modeling.  It does not require extensive
mathematical training or other special technical preparation.  It can be
worked by readers who are familiar with urban transportation planning
practice and are comfortable with high school algebra.  The course is
suitable for individuals who must carry out mode choice analyses, use
and interpret the outputs of such analyses, or supervise those who do
such work.

                               REFERENCES

   Readers wishing additional information on disaggregate mode choice
modeling, beyond that provided in this course, are encouraged to consult
the following references.

M. Ben-Akiva and S.R. Lerman, Discrete Choice Analysis: Theory and
Application to Travel Demand, The M.I.T. Press, Cambridge, MA, 1985.

Cambridge Systematics, Inc., Analytic Procedures for Estimating Changes
in Travel Demand and Fuel Consumption, report DOE/PE/8628-1, Vol. 2,
U.S. Department of Energy, October 1979.

Cambridge Systematics, Inc., Case City Applications of Analysis
Methodologies, report DOE/PE/8628-1, Vol. 3, U.S. Department of Energy,
October 1979.

D.A. Hensher and L.W. Johnson, Applied Discrete-Choice Modelling, Croom
Helm, London, 1981.

T.A. Domencich and D. McFadden, Urban Travel Demand: A Behavioral
Analysis, North Holland/American Elsevier, New York, 1975.

D.L. McFadden, "The Theory and Practice of Disaggregate Demand
Forecasting for Various Modes of Urban Transportation," in Emerging
Transportation Planning Methods, report DOT-RSPA-DPB-50-78-2, U.S.
Department of Transportation, August 1978.

                                MODULE 2

                      INTRODUCTION TO CHOICE THEORY

2.1   Introduction

   This module introduces the behavioral theory that forms the basis of
disaggregate mode choice models.  The theory is presented in its
simplest form in this module.  The theory is expanded and made more
realistic in Modules 3 and 4.

2.2   The Role of Choice in Generating Travel Demand

   The basic idea underlying modern approaches to travel demand modeling
is that travel is the result of choices made by individuals or
collective decision-making units such as households.  For example, an
individual preparing to travel to work must choose whether to drive
alone, take a bus, travel in a carpool, etc.  The individual also must
choose when to leave home and, depending on the chosen mode, may have to
choose which route to use.  The objective of travel demand modeling is
to model and predict the outcomes of these choices by individuals (or,
if appropriate, by collective decision-making units such as households). 
Measures of aggregate travel, such as bus ridership, are obtained by
adding up the choices of individuals.

   To model the outcomes of individuals' choices, it is necessary to:

   1. Identify the decisions that must be made and the options, or
      alternative outcomes, that are available to the individual.  In
      this course, the decision that will be considered is choice of
      mode, and the options are travel modes such as drive alone,
      carpool, and bus.  However, the methods that will be discussed
      also are applicable to other travel choices, including choices of trip
      frequencies, destinations, and routes.
   2. Identify variables likely to affect the choices of interest.  It
      is particularly important to identify policy variables -- i.e.,
      variables whose values may be changed through deliberate policy
      decisions -- since much practical travel demand modeling is
      concerned with predicting the consequences of changing the values
      of these variables.  Travel time and travel cost are examples of
      policy variables relevant to mode choice.

   3. Develop a mathematical formula that describes the dependence of
      choices on the relevant variables.

   This module is concerned primarily with item 3. The module describes
a theory of human preferences and choices that is useful for guiding the
development of mathematical formulas relating choices to appropriate
sets of variables.  The application of the theory is illustrated with
examples involving the prediction of mode choice.
   To minimize the complexity of the presentation, it will be assumed in
this module that all of the variables relevant to individuals' choices
of modes are known to the analyst.  This makes it possible to develop
models that predict individuals' choices of modes with certainty and
without error.  Of course, it is not possible in practice to achieve
such a high degree of modeling perfection, and it will be necessary to
modify the models discussed in this module to make them suitable for use
in real-world applications.  The modified models, which are explained in
Modules 3 and 4, are based on the behavioral concepts described in this
module.  The modifications needed to achieve practical models extend,
rather than replace, the concepts presented in this module.

2.3   Preferences

   An individual's choice represents an expression of his preferences
among the available options at the time and under the conditions in
which the choice is made.  For example, if an individual decides to
travel to work by bus rather than by driving alone or by carpooling,
this means that he prefers bus to the other two modes under the
conditions that exist when the choice is made.
   It is important to understand that the preferences relevant to choice
are the ones that pertain to the chooser's existing circumstances, not
to an ideal set of circumstances.  For example, a commuter boarding a
bus may think to himself that he would really rather take a taxi if he
could afford it and that he is taking the bus only because he does not
have much money.  Such thoughts do not imply that the commuter prefers
taxi to bus under the existing circumstances.  He would prefer taxi to
bus under ideal circumstances (e.g., having a lot of money), but under
the existing circumstances (e.g., having to give up lunch if he spends
money on a taxi), he prefers bus.
   Since choice is an expression of preferences, modeling and predicting
choices is equivalent to modeling and predicting preferences -- i.e., if
one has a model that enables one to predict an individual's preferences
among the available options, one also is able to predict the same
individual's choices.  Preferences among a set of options depend on the
attributes of the options and of the individual involved.  For example,
attributes of travel modes that are relevant to preferences among modes
include travel time, travel cost, comfort, and reliability.  Attributes
of individuals that affect preferences among modes include income and
the number of automobiles
owned.  The next section describes a way of relating attributes to
preferences and choices.

2.4   Utility Theory

   Virtually all operational models for predicting individuals' choices
are based on a behavioral principle called "utility maximization." This
principle and its relation to choice can be stated in words very simply. 
According to the utility maximization principle, there is a mathematical
function U, called a utility function, whose numerical value depends on
attributes of the available options and the individual.  The utility
function has the property that its value for one option exceeds its
value for another if and only if the individual prefers the first option
to the second.  Thus, the ranking of the available options according to
the individual's preferences and the ranking according to the values of
the utility function are the same.  The individual chooses the most
preferred option, which is the one with the highest utility-function
value.
   The utility maximization principle can be stated mathematically as
follows.  Let C denote the set of options available to an individual
(e.g., drive alone, carpool, and bus in the case of mode choice).  C is
called the choice set.  For each option i in C, let X_i denote the
attributes of i for the individual in question.  For example, if i
corresponds to drive alone, X_i denotes the travel time, travel cost,
and other relevant attributes of the drive alone mode for the individual
in question.  Let S denote the attributes of the individual that are
relevant to preferences among the options in C (e.g., income,
automobiles owned, etc.). Then, according to the utility maximization
principle, there is a function U (the utility function) of the
attributes of options and individuals that describes
individuals' preferences.  U has the property that for any two options i
and j in C

      U(X_i, S) > U(X_j, S)                                        (2.1)

implies that the individual prefers alternative i to alternative j and
will choose i if given a choice between i and j. Given a choice among
many options, alternative i in C is chosen if

      U(X_i, S) > U(X_j, S)                                        (2.2)

for all alternatives j (other than i) in C.
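
   The choice rule in equation (2.2) can be sketched in a few lines of
Python.  This is only an illustration under stated assumptions: the
function name choose, the utility function example_utility, and the
attribute values are all hypothetical and are not taken from this
course.  Ties are not handled specially, since equation (2.2) assumes a
strict inequality.

```python
def choose(choice_set, individual, utility):
    """Return the option in the choice set C with the highest utility.

    choice_set -- dict mapping each option name i to its attributes X_i
    individual -- the attributes S of the individual
    utility    -- a function U(X_i, S); the same U is used for every option
    """
    return max(choice_set, key=lambda i: utility(choice_set[i], individual))


# Hypothetical utility function and attribute values, for illustration only.
def example_utility(x, s):
    return -2.0 * x["time"] - x["cost"] / s["income"]


choice_set = {
    "bus": {"time": 1.0, "cost": 1.0},           # hours, dollars
    "drive alone": {"time": 0.5, "cost": 3.0},
}
traveler = {"income": 20.0}                       # thousands of dollars per year

print(choose(choice_set, traveler, example_utility))   # prints "drive alone"
```

   Note that the same function is applied to every option, and only
the attribute values passed to it differ.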

   The utility function U is defined to have the following properties:

   1. The function U is the same for all options.  Differences among
      options are accounted for by differences in the numerical values
      of the attributes X, not by changing the function U. (Of course,
      the numerical value of U depends on the option, but the functional
      form of U is the same for all options.)
   2. The utility of an alternative depends only on attributes of that
      alternative and of the individual.  It does not depend on
      attributes of other alternatives.  Thus, for example, the utility
      of driving alone does not depend on bus travel time and cost.  Of
      course, the choice the individual makes depends on the attributes
      of all alternatives since the chosen mode is the one with the
      highest utility.

The following example illustrates the use of the utility maximization
principle in mode choice analysis.

   Example 2.1: A Utility Model of Mode Choice

   Suppose that an individual can travel to work by driving alone,
carpooling, or riding the bus.  Assume that the relevant attributes of
these
modes are travel time and travel cost.  Assume that the relevant
attribute of the individual is annual income.  Let T denote door-to-door
travel time in hours, C denote travel cost in dollars, and Y denote
annual income in thousands of dollars per year.  Let the utility
function be

      U(T,C,Y) =  -T - 5C/Y                                        (2.3)

Suppose the values of travel time and cost for the available modes are:

                          Time (T)       Cost (C)
             Mode          (hours)          ($)

          Drive Alone       0.50           2.00

            Carpool         0.75           1.00

              Bus           1.00           0.75

Then if income is $40,000 per year, for example, the value of U for
drive alone is -0.50 - 5(2.00)/40 = -0.75. The following table shows the
value of U corresponding to each mode for an individual whose income is
$40,000 per year (Y = 40) and an individual whose income is $10,000 per
year (Y = 10):

                                    U

             Mode          Y = 40         Y = 10

          Drive Alone       -0.75          -1.50
            Carpool         -0.88          -1.25
              Bus           -1.09          -1.38

The high-income individual chooses to drive alone (because drive alone
has the highest utility for this individual), and the low-income
individual chooses to carpool.  Note that all utilities are negative
because U consists
of (generalized) costs of travel but excludes the value of reaching the
destination.  In this case, the highest value of U is the one that is
least negative.
   Now suppose that the quality of transit service is improved so that
travel time for bus is 0.75 hr.  Then the utilities become:

                                    U
             Mode          Y = 40         Y = 10

          Drive Alone       -0.75          -1.50
            Carpool         -0.88          -1.25
              Bus           -0.84          -1.13

The high-income individual still drives alone, but the low-income
individual switches to bus.

   Although this example is very simple, it illustrates some important
characteristics of choice models based on the utility maximization
principle.  First, it shows how a utility function can be used to
describe the dependence of preferences and choices on attributes of
individuals and options.  Notice, in particular, that the same utility
function describes the preferences of more than one individual.  It is
not necessary to have a separate utility function for each individual if
differences among individuals can be accounted for by attribute
variables such as income.  Second, the example illustrates the use of
utility theory to predict changes in preferences and choices that occur
when an attribute of one of the options changes.  Moreover, the utility
model is able to capture differences in the responses of different
individuals to the same attribute change.
Finally, the example illustrates some advantages of utility models over
many traditional mode choice models.  For instance, the model in the
example treats choice among three modes and can easily be extended to
treat more than three modes.  Many traditional models are able to treat
only two modes.  In addition, since the utility model operates at the
level of individuals, it guarantees that the percentages of individuals
choosing a mode always are within the range 0-100% and always add up to
100%.  Many traditional mode choice models do not have this obviously
desirable property.

2.5   Properties of Utility Functions and Utility Models

   It is tempting to interpret the numerical values of the utilities of
a set of options as indicators of an individual's strengths of
preference for the options.  For example, if a certain individual's
utility of driving alone is 5 and his utility of traveling by bus is 1,
then it might be said that the individual's preference for driving alone
is 5 times greater than his preference for traveling by bus.  As it
turns out, such an interpretation is both unnecessary and incorrect.
   The interpretation is unnecessary because choice does not depend on
strengths of preference; it depends only on preference ordering.  A
utility model always predicts that the option with the highest utility
will be chosen, regardless of whether that option's utility is much
larger or only slightly larger than the utilities of the other available
options.  For example, driving alone will be chosen if the utilities of
driving alone, carpooling, and traveling by bus are 100.0, 2.0, and 1.0,
respectively; and driving alone will also be chosen if the utilities are
2.1, 2.0, and 1.9.
   That the preference strength interpretation of utility is incorrect
follows from the observation that the utility function is defined only
as a
function whose numerical values for the available options have the same
ordering (e.g., highest to lowest) as the individual's preferences among
the options.  The definition of a utility function does not include any
assumptions or statements about strengths of preference.  Any function
that reproduces preference orderings can serve as a utility function and
will give the same predictions of choice, regardless of the signs or
numerical values of the utilities.  Thus, the utility function contains
no information about strengths of preference.
   In fact, there are infinitely many utility functions that can be used
to describe the same preferences and that give the same predictions of
choices.  This nonuniqueness of utility functions is illustrated by the
following example.

   Example 2.2: Nonuniqueness of Utility Functions

   In Example 2.1, preferences among the alternatives drive alone,
carpool, and bus were described with the utility function

   U(T,C,Y) =  -T - 5C/Y,                                          (2.4)

where T, C, and Y, respectively, are travel time in hours, travel cost
in dollars, and income in thousands of dollars per year.  However,
exactly the same preference rankings and choice predictions would be
obtained with any of the following alternative utility functions:

   V(T,C,Y) =  -TY - 5C,                                           (2.5)

   W(T,C,Y) =  10 - 20T - 100C/Y,                                  (2.6)

   X(T,C,Y) =  -T^2 - 10CT/Y - 25C^2/Y^2.                          (2.7)

To see this for the case of utility function V, suppose that

      U(T_1, C_1, Y) > U(T_2, C_2, Y),

where T_1 and C_1 denote the travel time and cost of option 1, and T_2
and C_2 denote the travel time and cost of option 2.  Then

      -T_1 - 5C_1/Y > -T_2 - 5C_2/Y.                               (2.8)

Multiplying both sides of inequality (2.8) by Y, which is positive,
preserves the direction of the inequality and yields

      -T_1 Y - 5C_1 > -T_2 Y - 5C_2,                               (2.9)

which is equivalent to

      V(T_1, C_1, Y) > V(T_2, C_2, Y).                            (2.10)

Thus, the utility functions U and V are interchangeable: they give the
same preference orderings and the same predictions of choice. 
Similarly, the utility functions W and X defined in equations (2.6) and
(2.7) are interchangeable with U. You will be asked to show this in
Exercise 2.3.
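
   A numerical spot check of this interchangeability can be made with
the following Python sketch.  It is not a substitute for the algebraic
argument requested in Exercise 2.3.  The sketch assumes that W and X
have the forms W = 10 - 20T - 100C/Y and X = -T^2 - 10CT/Y - 25C^2/Y^2,
and it uses the times and costs of Example 2.1; the function name
ranking is hypothetical.

```python
# (time in hours, cost in dollars) for each mode, from Example 2.1
MODES = {
    "Drive Alone": (0.50, 2.00),
    "Carpool": (0.75, 1.00),
    "Bus": (1.00, 0.75),
}


def U(T, C, Y):
    return -T - 5 * C / Y                      # equation (2.4)


def V(T, C, Y):
    return -T * Y - 5 * C                      # equation (2.5)


def W(T, C, Y):
    return 10 - 20 * T - 100 * C / Y           # equation (2.6), form assumed above


def X(T, C, Y):
    # equation (2.7), form assumed above
    return -T**2 - 10 * C * T / Y - 25 * C**2 / Y**2


def ranking(f, income):
    """Modes ordered from most to least preferred under utility function f."""
    return sorted(MODES, key=lambda m: f(MODES[m][0], MODES[m][1], income),
                  reverse=True)


for income in (10, 40):
    orders = [ranking(f, income) for f in (U, V, W, X)]
    # The final value printed is True when all four rankings agree.
    print(income, orders[0], all(o == orders[0] for o in orders))
```

   Agreement at a few income levels does not prove that the functions
are interchangeable, but a disagreement would show that they are not.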

2.6   Predictions of Aggregate Travel Behavior

   The utility maximization principle and utility-based choice models
are methods for describing and predicting choices made by individuals. 
However, practical travel demand analysis rarely is concerned with the
choices of individual travelers.  Rather, it is concerned with the
behavior of large groups or aggregates of travelers.  A utility model
can be used to obtain predictions of aggregate travel behavior: one
simply adds up the model's predictions of the choices of the individuals
in the group of interest.  This process is illustrated in the following
example.

   Example 2.3: Aggregate Travel Behavior

   As in Example 2.1, consider choice among the modes drive alone,
carpool, and bus with the utility function

         U(T,C,Y) =  -T - 5C/Y,                                   (2.11)


where T, C, and Y, respectively, denote travel time in hours, travel
cost in dollars, and income in thousands of dollars per year.  Suppose
that individuals who live in a certain suburb of a city and work downtown
face the following travel times and costs for trips from home to work:

                          Time (T)       Cost (C)
             Mode          (hours)          ($)

          Drive Alone       0.50           2.00

            Carpool         0.75           1.00

              Bus           1.00           0.75

In addition, suppose that the incomes of these individuals are
distributed as follows:

                      Income           Percentage of
                 ($1000s per year)      Individuals

                        17                   5
                        19                  15
                        27                  25
                        33                  25
                        37                  20
                        40                  10

Then the utility values and mode choices according to income group are:

  Income     Drive Alone     Carpool        Bus          Choice

    17          -1.09         -1.04        -1.22         Carpool
    19          -1.03         -1.01        -1.20         Carpool
    27          -0.87         -0.94        -1.14       Drive Alone
    33          -0.80         -0.90        -1.11       Drive Alone
    37          -0.77         -0.89        -1.10       Drive Alone
    40          -0.75         -0.88        -1.09       Drive Alone


Since 20% of the individuals belong to income groups in which carpool is
chosen and 80% belong to income groups in which drive alone is chosen,
the aggregate mode shares are 20% for carpool and 80% for drive alone. 
No individuals in the population under consideration choose bus.
   Notice that aggregate behavior cannot be predicted correctly by
averaging the utility values over individuals and predicting aggregate
behavior using the average utilities.  The average utility of driving
alone in this example is -0.86 [i.e., 0.05(-1.09) + 0.15(-1.03) + ...  +
0.10(-0.75)], and the average utilities of carpooling and traveling by
bus are -0.93 and -1.13, respectively.  Thus, use of the average utility
values would result in the erroneous prediction that all of the
travelers in the population under consideration drive alone.
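
   The two calculations above can be checked with a short program.  The
following Python sketch is illustrative only; the names MODES,
INCOME_SHARES, utility, and chosen_mode are introduced here and are not
part of the original course.  It first predicts the choice of each
income group and adds up the groups, and then repeats the prediction
using utilities averaged over the population.

```python
# Travel time (hours) and cost ($) for each mode, from the table above.
MODES = {
    "Drive Alone": (0.50, 2.00),
    "Carpool": (0.75, 1.00),
    "Bus": (1.00, 0.75),
}

# Income (thousands of dollars per year) -> percentage of individuals.
INCOME_SHARES = {17: 5, 19: 15, 27: 25, 33: 25, 37: 20, 40: 10}


def utility(time, cost, income):
    """Utility function of Equation (2.11): U = -T - 5C/Y."""
    return -time - 5.0 * cost / income


def chosen_mode(income):
    """Return the mode with the highest utility at the given income."""
    return max(MODES, key=lambda mode: utility(*MODES[mode], income))


# Correct method: predict each group's choice, then add up the groups.
shares = {mode: 0 for mode in MODES}
for income, pct in INCOME_SHARES.items():
    shares[chosen_mode(income)] += pct

# Incorrect method: average the utilities first, then pick a single mode.
avg_utility = {
    mode: sum(pct / 100 * utility(t, c, y) for y, pct in INCOME_SHARES.items())
    for mode, (t, c) in MODES.items()
}
naive_choice = max(avg_utility, key=avg_utility.get)

print("Aggregate shares (%):", shares)
print("Average utilities:", {m: round(u, 2) for m, u in avg_utility.items()})
print("Choice implied by average utilities:", naive_choice)
```

The first method reproduces the 20% carpool and 80% drive alone shares,
whereas the second assigns every traveler to drive alone.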

2.7 Summary

   Modern approaches to travel demand modeling are based on a behavioral
principle called utility maximization.  This module has explained the
utility maximization principle and illustrated its use in predicting
individuals' mode choices and aggregate mode shares.

                                EXERCISES


2.1   Refer to Example 2.1. Suppose the utility function in this example
      were

         U(W,T,C,Y)  =  W - T - 5C/Y,

      where W is the value of arriving at work.  Let W = 3.0, and let
      the values of T and C be as in Example 2.1. Compute the utilities
      of drive alone, carpool, and bus for Y = 40 and Y = 10 using this
      new utility function.  Are there now any negative utilities?  Are
      any of the predicted choices different from those in Example 2.1?

2.2   Suppose, as in Example 2.1, that an individual can travel to work
      by driving alone, carpooling, or riding the bus.  Assume that the
      relevant attributes of these modes are travel time and travel
      cost.  Assume that the relevant attribute of the individual is
      annual income.  Using the notation of Example 2.1, let T denote
      door-to-door travel time in hours, C denote travel cost in
      dollars, and Y denote annual income in thousands of dollars per
      year.  Let the utility function be

         U(T,C,Y) =  -3T - 8C/Y.

      Let the values of travel time and cost for the available modes be

                          Time (T)       Cost (C)
             Mode          (hours)          ($)

          Drive Alone       0.35           2.25
            Carpool         0.60           0.95
              Bus           0.75           0.60

      Compute the utility values for an individual with an income of
      $40,000 per year and for an individual with an income of $10,000
      per year.  What mode would each individual choose?

2.3   Show that the utility functions W and X in Example 2.2 are
      interchangeable with U. Evaluate the utility functions U, V, W,
      and X for some representative values of the attributes of the
      drive alone, carpool, and bus modes.  Use the results to
      illustrate why utility values should not be interpreted as
      strengths of preference.

2.4   Consider choice among the modes drive alone, carpool, and bus
      with the utility function:

         U(T,C,Y) =  -T - 5C/Y,

      where T, C, and Y, respectively, denote travel time in hours,
      travel cost in dollars, and income in thousands of dollars per
      year.  Suppose that individuals who live in a certain suburb of a
      city and work downtown face the following travel times and costs
      for trips from home to work:

                          Time (T)       Cost (C)
             Mode          (hours)          ($)

          Drive Alone       0.50           2.50

            Carpool         0.75           1.25

              Bus           1.00           0.50

      In addition, suppose that the incomes of these individuals are
      distributed as follows:

                                  Percentage of
                 Income            Individuals

                   14                   5
                   18                  15
                   22                  25
                   26                  25
                   30                  20
                   34                  10

      Determine individuals' mode choices according to income, and use
      these to compute the aggregate shares of each mode.

      Now suppose that bus travel time is reduced to 0.95 hr.  Compute
      the new aggregate shares and the percentage changes in the shares
      resulting from the improvement in bus service.  Also, compute the
      aggregate shares by first averaging the utilities over individuals
      and then predicting mode choices based on the average utilities.
      Do you obtain the same aggregate shares as when you determine the
      mode choices before averaging?  If not, which method is correct?

                                MODULE 3

               INTRODUCTION TO PROBABILISTIC CHOICE THEORY

3.1   Introduction

   Module 2 introduced the theory of behavior that forms the basis of
disaggregate mode choice models.  This module continues developing the
theory in ways that make it more realistic and useful for practical
applications.

3.2   Inadequacy of Deterministic Utility Models

   The theory of travel choice described in Module 2 yields a simple
model of decision making that makes deterministic predictions of travel
choices.  In other words, according to this theory there is no
uncertainty in the predicted choices.  An individual is predicted to
choose the alternative with the highest utility, and according to the
model, there is no possibility that any other alternative will be
chosen.  Models based on utility maximization that yield deterministic
predictions of choice are called deterministic utility models.
   If deterministic utility models describe travel behavior correctly,
then similar individuals would be expected to make the same travel
choices when faced with the same sets of alternatives.  In practice,
however, it is not unusual for apparently similar individuals to make
different choices when faced with similar or even identical
alternatives.  In fact, the same individual may make different choices
when faced with the same alternatives on different occasions.  For
example, in studies of work trip mode choice it is frequently found that
individuals who have identical personal characteristics according to the
available data and who face similar sets of
travel alternatives choose different modes of travel to work.  Some of
these individuals may vary their choices from day to day for no apparent
reason.
   Deterministic utility models cannot treat such "unexplained"
variations in travel behavior.  Thus, deterministic utility models
provide inadequate descriptions of travel behavior.  The purpose of this
module is to explore the reasons for this inadequacy and to lay the
groundwork for a family of models, also based on utility theory, that do
take account of unexplained variations in travel choices.  This family
of improved models is presented in detail in Module 4.
   It is easy to understand the basic sources of the inadequacy of
deterministic utility models.  First, analysts and the individuals
making the travel choices being modeled are unlikely to have the same
information about the available alternatives.  For example, the analyst
may not have data on the reliability of a particular bus line or the
likelihood that a particular individual will get a seat on the bus.  But
the individual in question is likely either to know these things (if he
has had experience with the bus line) or to have opinions about them
that are unknown to the analyst.  Second, the analyst is unlikely to
know all the characteristics of each individual that are relevant to
mode choice.  For example, an individual's choice of mode for the work
trip may depend on whether other family members want to use the car that
will be driven to work if automobile is the chosen mode.  However, it is
unlikely that an analyst will have detailed information on the
activities of family members.
   If analysts had data on all of the variables relevant to mode choice,
it would be reasonable to expect that mode choice could be described and
predicted satisfactorily by deterministic utility models.  Experience
has shown, however, that analysts do not have such data and have no
realistic
possibility of obtaining them.  Therefore, mode choice models should
take a form that recognizes and accommodates analysts' lack of
information.  In this module and the next, it is shown how deterministic
utility models can be modified to achieve this objective.  The resulting
models are called "random utility models" or "probabilistic choice
models" because they describe preferences and choice in terms of
probabilities.  Instead of predicting that an individual will choose a
particular mode with certainty, these models give probabilities that
each of the available modes will be chosen.  Thus, the analyst's lack of
complete information about the attributes of alternatives and
individuals is accommodated in the modeling process by predicting the
probabilities with which choices will be made instead of predicting that
a specific choice will be made with certainty.
   The remainder of this module describes some specific limitations of
information that affect mode choice modeling and explores their
consequences.  Examples are presented to show how these limitations lead
to "unexplained" variations in choice and, therefore, the appearance of
probabilistic choice behavior (i.e., choices that are made according to
a deterministic utility model but that appear probabilistic to an
analyst who has only partial knowledge of the relevant variables). 
These examples suggest the usefulness of probabilistic models to
describe this behavior.

3.3   Limitations of Analysts' Information

   Two types of limitations of analysts' information make deterministic
utility models inadequate for practical mode choice analysis.  First,
travelers may not have exact knowledge of the attributes of the
available alternatives.  For example, a traveler choosing between bus
and car for a trip to a downtown shopping location is unlikely to know
the exact travel
time by car or bus, the exact waiting time for the bus, whether he will
get a seat on the bus, or the exact likelihood of finding a free or low-
cost parking space near his destination.  Consequently, the traveler's
opinions or perceptions of these attributes are likely to differ from
the objectively measured values of the attributes.  An analyst often has
no way of obtaining exact information about an individual's opinions and
perceptions.  However, without this information, the analyst will not be
able to predict precisely the individual's choices.
   Second, the analyst may not know the true values of the travel
attributes important to the individual, and he may not know the
individual's utility function.  These limitations of knowledge further
restrict the ability of analysts to predict the travel behavior of
individuals accurately.  In the remainder of this section, five specific
limitations of analysts' knowledge are discussed.  Each of these
limitations leads to the occurrence of unexplained variation in travel
choices.  The five limitations are:

   1. Omission of relevant variables from models: A model of mode choice
      may omit one or more variables that are important to the traveler. 
      This may happen either because such omission achieves a useful
      simplification of the model or because the analyst does not have
      data on the omitted variables.  For example, the utility function
      used in Module 2 includes only travel time and cost as attributes
      of modes.  If travelers consider other factors, such as comfort,
      reliability, and privacy, in addition to travel time and cost,
      their mode choices will vary, even if travel time and cost do not,
      according to their perceptions of these other factors.

   2. Measurement error: Analysts' information about service quality may
      be subject to measurement errors.  For example, data on travel
      time may be obtained from network models that yield estimates of
      travel times between zone centroids.  These estimates may be
      erroneous due to network coding errors, errors in the assumed
      volume-delay functions, or because the trips in question originate
      or terminate at locations other than zone centroids.  Thus, the
      analyst's estimates of travel time may be substantially different
      from the travel times actually experienced by travelers.

   3. Proxy variables: It is often necessary in practical modeling to
      use variables that are different from the ones that are
      theoretically appropriate.  For example, employment density may be
      used as a proxy for carpooling opportunities.  If individuals'
      mode choices depend on carpooling opportunities, these choices
      will not be predicted precisely by a model that uses employment
      density in place of more precise indicators of carpooling
      opportunities.  Another proxy variable commonly used in mode
      choice models is income as a proxy for automobile ownership.

   4. Differences between individuals may be ignored: Different
      individuals may evaluate alternatives differently.  For example,
      differences in costs among alternatives may be less important to
      wealthy people than to poor people.  Some models attempt to
      capture this difference by, for example, dividing the cost of
      travel by the individual's income or wage rate, as was done in
      Module 2. However, other differences are more difficult to
      represent in a model.  For example, physical characteristics of
      the individual that may affect the importance of seating
      availability, walk
      distance, or wait time may not be known to the analyst and,
      therefore, not included in a model.

   5. Day-to-day variations in the choice context may be ignored: The
      data customarily used in mode choice modeling do not include
      information on day-to-day variations in the choice context that
      may affect mode choice.  Examples of such variations include
      short-term unavailability of a car due to repair needs or
      variations in the needs of another family member, the need to
      carry heavy packages on a particular day, or variations in the
      activities planned to be undertaken after work.

   Each of the foregoing limitations of analysts' knowledge can cause
variations in mode choices, either among individuals or by the same
individual on different occasions, that cannot be explained by the
observed (or measured) attributes of travelers or modes.  The next
section presents several examples that illustrate how such unexplained
variation in choices arises and show how it creates the appearance of
probabilistic travel behavior.

3.4   Examples of Unexplained Variation in Choice Behavior

   The examples in this section illustrate how the limitations of
analysts' knowledge described in Section 3.3 lead to unexplained
variation in choices and the appearance of probabilistic behavior.

   Example 3.1: Missing Variables
   Suppose, as in Example 2.1, that an individual can travel to work by
driving alone, carpooling, or riding the bus.  Suppose that the relevant
attributes of these modes are travel time (T) and cost (C).  However,
extending Example 2.1, let the relevant characteristics of the
individual include both annual income (Y) and the number of automobiles
owned by his household (A).  The effect of automobile ownership is to
increase or decrease the utility of the drive alone and carpool modes,
depending on the number of automobiles owned.  This effect is 
represented by the following equations for the utility function:

         U_{DA}  =  -T_{DA}  -  5C_{DA}/Y  +  0.4(A - 1)              (3.1a)

         U_{CP}  =  -T_{CP}  -  5C_{CP}/Y  +  0.2(A - 1)              (3.1b)

         U_{B}   =  -T_{B}   -  5C_{B}/Y.                             (3.1c)

As in Example 2.1, suppose the values of travel time and cost are:

                          Time (T)       Cost (C)
             Mode          (Hours)          ($)

          Drive Alone       0.50           2.00

            Carpool         0.75           1.00

              Bus           1.00           0.75

If income is $15,000 per year (Y = 15), then the utilities of the three
alternatives for households with no automobiles (A = 0) are:

         U_{DA}  =  -0.5   -  5(2.0/15)   +  0.4(0 - 1)   =  -1.57

         U_{CP}  =  -0.75  -  5(1.0/15)   +  0.2(0 - 1)   =  -1.28

         U_{B}   =  -1.0   -  5(0.75/15)                  =  -1.25.

Using the same equations, the utility values of the three alternatives
for three different levels of automobile ownership are:

                            Zero            One            Two
             Mode           Cars            Car           Cars

          Drive Alone       -1.57          -1.17          -0.77

            Carpool         -1.28          -1.08          -0.88

              Bus           -1.25          -1.25          -1.25

          Mode Chosen        Bus          Carpool      Drive Alone

According to the principle of utility maximization, individuals in
households without cars use bus, those in households with one car
carpool, and those in households with two cars drive alone.

   Now consider what happens if the automobile ownership variable is not
included in the data set and the analyst predicts choice with the
utility function of Example 2.1. All individuals would be predicted to
have utilities equal to -1.17 for drive alone, -1.08 for carpool, and
-1.25 for bus, so all individuals would be predicted to travel by
carpool.  However, if there were 20 zero-car households, 50 one-car
households, and 30 two-car households, it would be observed that, in
fact, 20% of the individuals choose bus, 50% choose carpool, and 30%
drive alone.  Thus, the omission of the automobile ownership variable
from the utility function causes variations in travel choices that are
not explained by the model.  These variations in choices among
alternatives give the appearance of probabilistic choice behavior
because they can be described by a probability distribution in which the
probabilities that bus, carpool, and drive alone are chosen are 0.20,
0.50, and 0.30, respectively.
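
   The contrast between the observed shares and the analyst's prediction
in Example 3.1 can be reproduced with the following Python sketch.  It
is an illustration only; the names TIME_COST, AUTO_COEF, HOUSEHOLDS,
true_utility, and specified_utility are introduced here for convenience
and do not appear in the original course.

```python
INCOME = 15  # thousands of dollars per year

# Travel time (hours) and cost ($) for each mode, as in Example 3.1.
TIME_COST = {
    "Drive Alone": (0.50, 2.00),
    "Carpool": (0.75, 1.00),
    "Bus": (1.00, 0.75),
}

# Coefficient of (A - 1) in Equations (3.1a) to (3.1c).
AUTO_COEF = {"Drive Alone": 0.4, "Carpool": 0.2, "Bus": 0.0}

# Number of automobiles owned -> number of households.
HOUSEHOLDS = {0: 20, 1: 50, 2: 30}


def true_utility(mode, autos):
    """Utility including automobile ownership, Equations (3.1)."""
    time, cost = TIME_COST[mode]
    return -time - 5.0 * cost / INCOME + AUTO_COEF[mode] * (autos - 1)


def specified_utility(mode):
    """Utility as the analyst sees it when A is missing from the data."""
    time, cost = TIME_COST[mode]
    return -time - 5.0 * cost / INCOME


# What actually happens: each household maximizes its true utility.
observed = {mode: 0 for mode in TIME_COST}
for autos, count in HOUSEHOLDS.items():
    choice = max(TIME_COST, key=lambda mode: true_utility(mode, autos))
    observed[choice] += count

# What the analyst predicts: one choice for everybody.
predicted = max(TIME_COST, key=specified_utility)

print("Observed choices per 100 households:", observed)
print("Analyst's prediction for every household:", predicted)
```

The analyst's single prediction (carpool) is right for only the 50
one-car households; the other 50 choices are unexplained by the model.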

   Example 3.2: Measurement Error
   Now consider the zero-car households in Example 3.1, but assume that
different individuals have different travel times for the automobile
modes.  Specifically, assume that the true drive alone and carpool
travel times for individuals are distributed according to the following
relative frequencies:

   Percentage of Individuals           20%       50%       20%       10%
   Drive Alone Travel Time (hr.)      0.40      0.50      0.60      0.70
   Carpool Travel Time (hr.)          0.65      0.75      0.85      0.95

In this table, the travel times in the second column are the same as
those in Example 3.1. The travel times in the first column are lower
than those in Example 3.1, and the travel times in the third and fourth
columns are higher than in Example 3.1. Thus, the travel times in
Example 3.1 are measured with error.  Fifty percent of the individuals
have travel times that are given correctly by Example 3.1, but 20% have
travel times that are lower than those of Example 3.1 and 30% have
travel times that are higher.  Assume that the travel costs are the same
as in Example 3.1.
   The utilities of the three modes can be obtained from the utility
function of equations (3.1). For zero-car households with incomes of
$15,000 per year (Y = 15), the utilities are:

                     Utilities Based on the Travel Time     Based on
Percentage of        Distributions in the Previous Table    Ex. 3.1
Individuals                    20%     50%     20%     10%    100%

Drive Alone                   -1.47   -1.57   -1.67   -1.77   -1.57
Carpool                       -1.18   -1.28   -1.38   -1.48   -1.28
Bus                           -1.25   -1.25   -1.25   -1.25   -1.25

Chosen Mode                  Carpool   Bus     Bus     Bus     Bus

In this table, the first column gives the utilities for travelers whose
drive alone and carpool travel times are less than those in Example 3.1,
the second column gives the utilities for travelers with travel times
equal to those in Example 3.1, and the third and fourth columns give the
utilities for travelers with drive alone and carpool travel times
greater than those in Example 3.1. It can be seen that 20% of the
travelers (those with "lower" drive alone and carpool travel times)
choose carpool and the remaining 80% choose bus.  However, if the
distributions of travel time were ignored as in Example 3.1, 100% of
these travelers would be predicted to choose bus.  The choices of 20% of
the travelers would be predicted erroneously because erroneous travel
time data had been used.
   If the same travel time distributions were applied to one-car
households with incomes of $15,000 per year, the utilities and mode
choices would be:

                     Utilities Based on the Travel Time     Based on
Percentage of           Distributions                       Ex. 3.1
Individuals                    20%     50%     20%     10%    100%

Drive Alone                   -1.07   -1.17   -1.27   -1.37   -1.17
Carpool                       -0.98   -1.08   -1.18   -1.28   -1.08
Bus                           -1.25   -1.25   -1.25   -1.25   -1.25

Chosen Mode                  Carpool Carpool Carpool   Bus   Carpool


In this case, 90% of the travelers would choose carpool and 10% would
choose bus.  However, if the distributions of travel time were ignored,
all of these travelers would be predicted to choose carpool.  In this
case, the use of erroneous travel time data would cause erroneous
predictions of the choices of 10% of the travelers.
   In summary, ignoring the distributions of travel times of zero- and
one-car households results in predictions that do not reflect the true
variations in mode choices.  In other words, the actual choices vary in
ways not explained by the model used to make the predictions.  As in
Example 3.1, the variations can be described by a probability
distribution.
   In Exercise 3.2, you will be asked to determine the percentages of
two-car households choosing each mode when travel time is distributed as
in this example.
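
   The zero-car and one-car results of this example can be verified with
the following Python sketch.  It is illustrative only; TIME_GROUPS and
shares_for are names introduced here, not part of the original course.
The function applies Equations (3.1) to each travel-time group and adds
up the percentage of individuals choosing each mode.

```python
INCOME = 15  # thousands of dollars per year
COST = {"Drive Alone": 2.00, "Carpool": 1.00, "Bus": 0.75}
AUTO_COEF = {"Drive Alone": 0.4, "Carpool": 0.2, "Bus": 0.0}
BUS_TIME = 1.00  # hours; bus travel time is the same for everyone

# (percentage of individuals, true drive alone time, true carpool time)
TIME_GROUPS = [
    (20, 0.40, 0.65),
    (50, 0.50, 0.75),
    (20, 0.60, 0.85),
    (10, 0.70, 0.95),
]


def shares_for(autos):
    """Percentage choosing each mode, given the number of cars owned."""
    shares = {mode: 0 for mode in COST}
    for pct, t_da, t_cp in TIME_GROUPS:
        times = {"Drive Alone": t_da, "Carpool": t_cp, "Bus": BUS_TIME}
        utilities = {
            mode: -times[mode]
            - 5.0 * COST[mode] / INCOME
            + AUTO_COEF[mode] * (autos - 1)
            for mode in COST
        }
        shares[max(utilities, key=utilities.get)] += pct
    return shares


print("Zero-car households (%):", shares_for(0))
print("One-car households (%):", shares_for(1))
```

Using only the measured travel times (the second group) would assign
all zero-car households to bus and all one-car households to carpool.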

   Example 3.3: Differences in Preferences among Individuals

   Examples 3.1 and 3.2 assume that the same utility function applies to
every individual.  However, it is possible that, for reasons not known
to the analyst, different individuals have different preferences among
the same sets of alternatives.  When this happens, the preferences of
different individuals are described by different utility functions.  For
example,
consider a population consisting of two groups in which time is valued
differently.  Suppose the preferences of one group are described by the
utility functions:

      U^{(1)}_{DA}  =  -0.75T_{DA}  -  5C_{DA}/Y  +  0.4(A - 1)       (3.2a)

      U^{(1)}_{CP}  =  -0.75T_{CP}  -  5C_{CP}/Y  +  0.2(A - 1)       (3.2b)

      U^{(1)}_{B}   =  -0.75T_{B}   -  5C_{B}/Y.                      (3.2c)

Suppose the preferences of the second group are described by the utility
functions:

      U^{(2)}_{DA}  =  -1.5T_{DA}  -  5C_{DA}/Y  +  0.4(A - 1)        (3.3a)

      U^{(2)}_{CP}  =  -1.5T_{CP}  -  5C_{CP}/Y  +  0.2(A - 1)        (3.3b)

      U^{(2)}_{B}   =  -1.5T_{B}   -  5C_{B}/Y.                       (3.3c)

Then the members of group 2 consider time to be twice as valuable as
do the members of group 1. As a result, the utilities of the available
modes will be different for members of the two groups and the choices of
mode will be different.

   To illustrate this, suppose that the travel times and costs of
Example 3.1 are correct, and consider individuals whose incomes are
$15,000 per year.  The utility values for individuals in group 1 who
own zero cars are:

   U^{(1)}_{DA}  =  -0.75(0.5)   -  5(2.0/15)   +  0.4(0 - 1)   =  -1.44

   U^{(1)}_{CP}  =  -0.75(0.75)  -  5(1.0/15)   +  0.2(0 - 1)   =  -1.10

   U^{(1)}_{B}   =  -0.75(1.0)   -  5(0.75/15)                  =  -1.00.

The utilities for members of both groups owning zero, one, or two cars
are:

                    Zero Cars          One Car           Two Cars
                 Group    Group     Group    Group     Group    Group
  Mode             1        2         1        2         1        2

  Drive Alone    -1.44    -1.82     -1.04    -1.42     -0.64    -1.02

  Carpool        -1.10    -1.66     -0.90    -1.46     -0.70    -1.26

  Bus            -1.00    -1.75     -1.00    -1.75     -1.00    -1.75

  Chosen Mode     Bus    Carpool   Carpool   Drive     Drive    Drive
                                             Alone     Alone    Alone

Notice that for zero- and one-car households, the mode chosen by members
of group 1 is different from the mode chosen by members of group 2.
Thus, the proportions of individuals owning zero, one, and two cars
choosing each mode depend on the proportions that belong to each
preference group.  However, if the analyst does not know that
individuals belong to different preference groups, it will appear that
"identical" individuals (i.e., individuals who have the same incomes and
levels of automobile ownership) make different choices when faced with
identical alternatives.  This variation in choices will be unexplained
by the information available to the analyst and will give the appearance
of probabilistic choice.
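
   The table of Example 3.3 can be regenerated with the following Python
sketch, which is illustrative only.  The names TIME_COEF, utilities, and
choices are introduced here and are not part of the original course.
The only difference between the two preference groups is the weight
placed on travel time.

```python
INCOME = 15  # thousands of dollars per year

# Travel time (hours) and cost ($) for each mode, as in Example 3.1.
TIME_COST = {
    "Drive Alone": (0.50, 2.00),
    "Carpool": (0.75, 1.00),
    "Bus": (1.00, 0.75),
}
AUTO_COEF = {"Drive Alone": 0.4, "Carpool": 0.2, "Bus": 0.0}

# Preference group -> weight on travel time, Equations (3.2) and (3.3).
TIME_COEF = {1: 0.75, 2: 1.5}


def utilities(group, autos):
    """Utilities of all three modes for one group and ownership level."""
    return {
        mode: -TIME_COEF[group] * time
        - 5.0 * cost / INCOME
        + AUTO_COEF[mode] * (autos - 1)
        for mode, (time, cost) in TIME_COST.items()
    }


choices = {}
for autos in (0, 1, 2):
    for group in (1, 2):
        u = utilities(group, autos)
        choices[(autos, group)] = max(u, key=u.get)
        rounded = {mode: round(value, 2) for mode, value in u.items()}
        print(f"cars={autos} group={group} {rounded} -> {choices[(autos, group)]}")
```

An analyst who observes only income and automobile ownership sees the
two entries for each ownership level as a single category of traveler.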

   Example 3.4:  Multiple Sources of Unexplained Variations in Choices

   The sources of unexplained variations in mode choices that have been
illustrated in Examples 3.1-3.3 can occur simultaneously in
practice.  That is, a choice model may be used that omits the automobile
ownership variable even though automobile ownership affects mode choice,
does not account for differences in preferences among individuals, and
is based on data that include errors in measured travel time.  As an
example of this, suppose that the utility function used by the analyst
is:

      U(T,C,Y) =  -T - 5C/Y                                        (3.4)

Suppose, also, that the measured values of T and C are as in Example
3.1.  Then for an individual whose income is $15,000 per year, the
analyst would estimate the utilities of drive alone, carpool, and bus
to be:

                    Mode                     Utility

                    Drive Alone              -1.17
                    Carpool                  -1.08
                    Bus                      -1.25

The analyst would predict that all of these individuals will choose
carpool.  However, the actual choices of a specific individual will
depend on which preference group he is in, the number of automobiles
owned by his household, and his true travel times.  Therefore, the modes
chosen will vary among individuals in ways that are not explained by the
analyst's specification of the utility function (Equation 3.4). Choices
will appear to be probabilistic to the analyst.
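
   To see how the three sources of variation combine, the following
Python sketch applies them together to a hypothetical population.  The
text does not specify how the population is divided, so the sketch
makes assumptions that are not part of the original course: the
automobile ownership percentages of Example 3.1 and the travel time
percentages of Example 3.2 are reused, the two preference groups of
Example 3.3 are assumed to be of equal size, and the three
characteristics are assumed to be independent of one another.  All
names in the sketch are illustrative.

```python
from itertools import product

INCOME = 15  # thousands of dollars per year
COST = {"Drive Alone": 2.00, "Carpool": 1.00, "Bus": 0.75}
AUTO_COEF = {"Drive Alone": 0.4, "Carpool": 0.2, "Bus": 0.0}
BUS_TIME = 1.00
MEASURED_TIME = {"Drive Alone": 0.50, "Carpool": 0.75, "Bus": BUS_TIME}

AUTO_PCT = {0: 20, 1: 50, 2: 30}  # cars owned -> percent (Example 3.1)
TIME_PCT = {                      # (drive alone, carpool) hours -> percent
    (0.40, 0.65): 20,
    (0.50, 0.75): 50,
    (0.60, 0.85): 20,
    (0.70, 0.95): 10,
}
GROUP_PCT = {0.75: 50, 1.5: 50}   # time weight -> percent (ASSUMED split)


def best_mode(times, time_coef=1.0, autos=None):
    """Mode with the highest utility; no auto term if autos is None."""
    def u(mode):
        value = -time_coef * times[mode] - 5.0 * COST[mode] / INCOME
        if autos is not None:
            value += AUTO_COEF[mode] * (autos - 1)
        return value
    return max(COST, key=u)


# Analyst's model, Equation (3.4): measured times, no A, one time weight.
predicted = best_mode(MEASURED_TIME)

# Assumed true behavior: every combination of the three characteristics,
# weighted by the product of its percentages (independence ASSUMED).
weights = {mode: 0 for mode in COST}
for (autos, a), ((t_da, t_cp), t), (coef, g) in product(
    AUTO_PCT.items(), TIME_PCT.items(), GROUP_PCT.items()
):
    times = {"Drive Alone": t_da, "Carpool": t_cp, "Bus": BUS_TIME}
    weights[best_mode(times, coef, autos)] += a * t * g

# The weights sum to 100 * 100 * 100, so dividing by 10000 gives percent.
actual = {mode: w / 10000 for mode, w in weights.items()}

print("Analyst predicts:", predicted)
print("Shares under the assumed population (%):", actual)
```

Under these assumed proportions the sketch gives 55% drive alone, 29.5%
carpool, and 15.5% bus, while the analyst's model assigns everyone to
carpool.  Different assumed proportions would give different shares.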

   The foregoing examples illustrate some of the ways in which
unexplained variation in travel behavior can arise.  As has been
discussed, this variation causes travel behavior to appear to be
probabilistic.  The next two sections discuss in more detail how
unexplained variation in travel behavior can be described in
probabilistic terms.  The use of a probabilistic representation of
travel behavior enables models to reflect both the effects of variables
that are included in the analyst's specification of the utility function
and the effects of errors in the analyst's specification.

3.5   The Basic Formulation of Probabilistic Models

   In each of the examples of the preceding section, the correct utility
function differs from that used by the analyst due to the omission of a
variable that influences mode choice (Example 3.1), measurement error
(Example 3.2), variations in preferences among individuals (Example
3.3), or all of these (Example 3.4). In each case, the correct utility
function, U, can be written as the sum of the utility function specified
by the analyst, V, and an error term, e. That is:

         U  =  V + e.                                              (3.5)

The specified utility functions (V) and error terms (e) for Example 3.1
are:

            Components of the Utility Function -- Example 3.1

          Mode                Specified Utility   Error Term


          Drive Alone         -T_{DA} - 5C_{DA}/Y     0.4(A - 1)

          Carpool             -T_{CP} - 5C_{CP}/Y     0.2(A - 1)

          Bus                 -T_{B} - 5C_{B}/Y       0

The utilities of drive alone and carpool include an error term due to
the omission of automobile ownership from the specified utilities.
   The specified utility functions and error terms for Example 3.2 are
given in the following table.  In this table T* denotes measured travel
time, and T denotes true travel time.

            Components of the Utility Function -- Example 3.2

   Mode            Specified Utility                     Error Term

   Drive Alone     -T*_{DA} - 5C_{DA}/Y + 0.4(A - 1)     T*_{DA} - T_{DA}

   Carpool         -T*_{CP} - 5C_{CP}/Y + 0.2(A - 1)     T*_{CP} - T_{CP}

   Bus             -T_{B} - 5C_{B}/Y                     0

In this case, the utilities of drive alone and carpool include an error
term because the drive alone and carpool travel times of some
individuals are measured with error.  There is no error term in the
utility of bus because bus travel time is measured without error.
   The components of the utility function for Example 3.3 are:

            Components of the Utility Function -- Example 3.3

   Mode            Specified Utility                     Error Term

   Drive Alone     -T_{DA} - 5C_{DA}/Y + 0.4(A - 1)       0.25T_{DA}  (group 1)
                                                         -0.50T_{DA}  (group 2)

   Carpool         -T_{CP} - 5C_{CP}/Y + 0.2(A - 1)       0.25T_{CP}  (group 1)
                                                         -0.50T_{CP}  (group 2)

   Bus             -T_{B} - 5C_{B}/Y                      0.25T_{B}   (group 1)
                                                         -0.50T_{B}   (group 2)

In this case, errors are present in all the utilities because the
specified utilities ignore the differences between the two population
groups.
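
   The decomposition U = V + e of Equation (3.5) can be checked
numerically for one case from each example.  The following Python
sketch is illustrative only; the function utility and the list cases
are names introduced here, and the particular cases are chosen
arbitrarily.  In each case the error term is computed as the difference
between the true and the specified utility, e = U - V.

```python
import math

INCOME = 15  # thousands of dollars per year
COST = {"Drive Alone": 2.00, "Carpool": 1.00, "Bus": 0.75}
AUTO_COEF = {"Drive Alone": 0.4, "Carpool": 0.2, "Bus": 0.0}


def utility(mode, time, time_coef=1.0, autos=None):
    """-time_coef*T - 5C/Y, plus the automobile term when autos is given."""
    value = -time_coef * time - 5.0 * COST[mode] / INCOME
    if autos is not None:
        value += AUTO_COEF[mode] * (autos - 1)
    return value


# Each case: (example, true utility U, specified utility V, error term e).
cases = [
    # Example 3.1, drive alone, two cars: the analyst omits A,
    # so e = 0.4(A - 1).
    ("3.1", utility("Drive Alone", 0.50, autos=2),
     utility("Drive Alone", 0.50), 0.4 * (2 - 1)),
    # Example 3.2, carpool, zero cars: true T = 0.65, measured T* = 0.75.
    # Time enters utility with a minus sign, so e = U - V = T* - T.
    ("3.2", utility("Carpool", 0.65, autos=0),
     utility("Carpool", 0.75, autos=0), 0.75 - 0.65),
    # Example 3.3, bus, group 2: true time weight 1.5, specified weight 1,
    # so e = -0.50T.
    ("3.3", utility("Bus", 1.00, time_coef=1.5),
     utility("Bus", 1.00), -0.50 * 1.00),
]

for example, u, v, e in cases:
    ok = math.isclose(u, v + e)
    print(f"Example {example}: U={u:.2f}  V={v:.2f}  e={e:+.2f}  U=V+e: {ok}")
```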
   Of course, these examples are artificial.  The true utility functions
are known, and there is no need to use a representation such as Equation
(3.5) that replaces part of the known utility function with an error
term.  In practice, however, an analyst never knows the true utility
function.  An analyst can hope to know the utility function only up to
an error term.  In effect, the analyst always measures or estimates
utility with error, and an error term of unknown size is always present
in the analyst's specification of the utility function.  This error term
accounts for variables that the analyst knows influence travel behavior
but that are not included in his data set or that he chooses to omit
from his model (e.g., because he cannot forecast them well).  It also
accounts for any variables that influence travel behavior but are
completely unknown to the analyst.
   It is customary to think of the error term as a random variable whose
values are described by a probability distribution.  The true utility
function, U, is then also a random variable consisting of the sum of a
deterministic component, V, and a random component, e. The deterministic
component of the utility function is what the analyst can measure or
estimate, and the random component is the difference between the true
utility function and the deterministic component.
   When the true utilities of the alternatives are random variables, it
is not possible to state with certainty which alternative has the
greatest utility or which alternative is chosen.  This is because
utility and choice depend on the random components of the utilities of
the available alternatives, and these components cannot be measured. 
The most an analyst can do is predict the probability that an
alternative has the maximum utility and, therefore, the probability that
the alternative is chosen.  Accordingly, the analyst must represent
travel behavior as being probabilistic.  The need for a probabilistic
representation is a consequence of the analyst's inability to make the
utility measurements required to predict behavior with certainty.
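   The decomposition described above can be written compactly.  The
following is a sketch in standard notation; the subscripts i and j, which
index the available alternatives, and the symbol P(i) are introduced here
for illustration only.

```latex
U_i = V_i + e_i, \qquad
P(i) = \Pr\left( V_i + e_i \ge V_j + e_j \ \text{for all } j \ne i \right)
```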
   Of course, the probability that a particular alternative is chosen
depends on the values of the deterministic components of the utilities,
and the values of these components can be measured (or estimated).  In
general, the probability that a particular alternative is chosen either
increases or stays the same when the deterministic component of its
utility increases and either decreases or stays the same when the
deterministic component decreases.  This relation is illustrated by the
following example.

   Example 3.5:   Dependence of Choice on the Deterministic Component of
                  Utility

   Let the true utilities be given by equations (3.2) and (3.3), and let
the deterministic (or specified) component of the utility function be
given by equation (3.4). Thus, the effects of differences in automobile
ownership and differences in preferences among population groups are not
represented in the deterministic component of the utility function and
are accounted for by the error term (i.e., the difference between true
utility and the deterministic component of utility).  Let the population
be distributed over automobile ownership classes and preference groups
according to the following percentages:

                                Automobiles Owned

                              0         1         2
     Preference Group 1      15%       25%       5%
     Preference Group 2       5        30        20

Let income be $15,000 per year (Y = 15), and let the travel times and
costs according to mode be as in Example 3.1. Then, the values of the
deterministic component of the utility function are -1.17 for drive
alone, -1.08 for carpool, and -1.25 for bus.  The values of true utility
and the chosen modes according to automobile ownership and preference
group are the same as in Example 3.3 (see the table on p. 34).  The
percentages of the population choosing each mode can be computed from
the following table:

                                       Percent of
     Autos Owned   Preference Group    Population      Mode Chosen

          0                1               15              Bus

                           2                5            Carpool

          1                1               25            Carpool

                           2               30          Drive Alone

          2                1                5          Drive Alone

                           2               20          Drive Alone

The percentages of the entire population choosing each mode are 55% for
drive alone, 30% for carpool, and 15% for bus.  However, the
deterministic component of the utility function for each mode has the
same values for all preference groups and automobile ownership levels. 
Thus, the deterministic component of utility cannot account for the
variations in choice within the population.  Since an analyst knows only
the deterministic components of utility, the population will appear to
be choosing among the modes with probabilities 0.55, 0.30, and 0.15 for
drive alone, carpool, and bus, respectively.  This appearance of
probabilistic choice is due to the fact that the analyst does not know
the variations in preferences and automobile ownership levels that cause
different individuals to choose different modes.
   Now, suppose that in an effort to shift mode choice to high-occupancy
modes, a parking tax of $0.50 is imposed on each single-occupant
vehicle.  Then, the cost of driving alone increases to $2.50 while the
costs of carpooling and traveling by bus remain unchanged at $1.00 and
$0.75, respectively.  The deterministic component of drive alone utility
decreases to -1.34 from its former value of -1.17. The values of the
deterministic components of the utilities of carpool and bus remain
unchanged.  The new values of the true utilities are:

                  Zero Cars        One Car         Two Cars
                Group   Group   Group   Group   Group   Group
  Mode            1       2       1       2       1       2

  Drive Alone   -1.61   -1.99   -1.21   -1.59   -0.81   -1.19

  Carpool       -1.10   -1.66   -0.90   -1.46   -0.70   -1.26

  Bus           -1.00   -1.75   -1.00   -1.75   -1.00   -1.75

  Chosen Mode    Bus    Car-    Car-    Car-    Car-    Drive
                        pool    pool    pool    pool    Alone

The percentages of the population choosing each mode are now 20% for
drive alone, 65% for carpool, and 15% for bus.  Again, since the analyst
knows only the values of the deterministic component of utility, the
population will appear to be choosing among the modes with probabilities
of 0.20, 0.65, and 0.15 for drive alone, carpool, and bus, respectively.
   The next table shows the value of the deterministic component of
utility for drive alone and the percentages of the population choosing
each mode for three different values of the parking tax on single-
occupant vehicles:

                  Deterministic     Percentage of Population
                  Component of              Choosing
                   Utility for    Drive
    Tax            Drive Alone    Alone      Carpool      Bus
   $0.00              -1.17        55          30         15
    0.25              -1.25        20          65         15
    0.50              -1.34        20          65         15

Notice that as the parking tax increases, the deterministic component of
the utility of drive alone decreases and the percentage of the
population choosing to drive alone either decreases or stays unchanged. 
Since the analyst knows only the deterministic components of utility,
mode choice will appear to be probabilistic.  The probability that a
given mode is chosen will be the percentage of the population choosing
that mode divided by 100.  Therefore, such an analyst will observe that
as the deterministic component of the utility of drive alone decreases,
the probability that drive alone is chosen either decreases or stays the
same.
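   The calculations in Example 3.5 can be reproduced with a short
program.  The following is a minimal sketch, not part of the original
course.  It assumes the true utilities have the form tabulated for
Example 3.3 (specified utility plus an error term of 0.25T for group 1
and -0.50T for group 2), that the parking tax is simply added to the
drive alone cost, and that travel times are 0.50, 0.75, and 1.00 hours
for drive alone, carpool, and bus.  The bus time is inferred from the
deterministic utility of -1.25, and all variable and function names are
hypothetical.

```python
# Sketch of Example 3.5. Times, costs, and utility forms are assumptions
# taken from Examples 3.1 and 3.3; all names are hypothetical.
TIME = {"drive alone": 0.50, "carpool": 0.75, "bus": 1.00}   # hours
COST = {"drive alone": 2.00, "carpool": 1.00, "bus": 0.75}   # dollars
AUTO_COEF = {"drive alone": 0.4, "carpool": 0.2, "bus": 0.0}
TIME_ERROR = {1: 0.25, 2: -0.50}   # error-term coefficient on T, by group
INCOME = 15.0                      # thousands of dollars per year

# Percent of the population by (autos owned, preference group).
POPULATION = {(0, 1): 15, (0, 2): 5, (1, 1): 25,
              (1, 2): 30, (2, 1): 5, (2, 2): 20}


def true_utility(mode, autos, group, tax):
    """Specified utility (with auto ownership) plus the group error term."""
    cost = COST[mode] + (tax if mode == "drive alone" else 0.0)
    specified = (-TIME[mode] - 5.0 * cost / INCOME
                 + AUTO_COEF[mode] * (autos - 1))
    return specified + TIME_ERROR[group] * TIME[mode]


def mode_shares(tax):
    """Percent of the population choosing each mode for a given tax."""
    shares = {mode: 0 for mode in TIME}
    for (autos, group), percent in POPULATION.items():
        chosen = max(TIME, key=lambda m: true_utility(m, autos, group, tax))
        shares[chosen] += percent
    return shares


for tax in (0.00, 0.25, 0.50):
    print(tax, mode_shares(tax))
```

Under these assumptions, each traveler's choice is deterministic.  Only
the analyst, who does not observe automobile ownership or preference
group, sees the resulting shares as choice probabilities.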

3.6   The Probability of Observing a Specific Choice

   An important interpretation of probabilistic travel behavior can be
obtained by considering a common procedure for collecting mode choice
data.  Consider a group of apparently similar travelers who make
different choices when faced with the same alternatives.  If
observations of mode choice are obtained by sampling this group
randomly, the probabilities of selecting travelers who choose drive
alone, carpool, and bus are the same as the probabilities that
individual travelers make these choices.  In other words, the sampling
probabilities and the individual choice probabilities are the
same.
   To illustrate this, consider Example 3.1, in which the utility
function specified by the analyst includes travel time, travel cost, and
income but not automobile ownership.  Assume that all travelers have
incomes of $15,000 per year and that the travel times and costs of drive
alone, carpool, and bus for all travelers are as in Example 3.1. If 20%
of the travelers' households own 0 cars, 50% own 1 car, and 30% own 2
cars, the probability that a randomly sampled traveler chooses bus is
0.20, the probability that he chooses carpool is 0.50, and the
probability that he chooses drive
alone is 0.30. These sampling probabilities are the same as the choice
probabilities obtained in Example 3.1 by considering a group of 20 zero-
car, 50 one-car, and 30 two-car travelers.  Thus, one can think of the
probability of a given choice as being the probability of sampling an
individual who makes that choice.  In fact, as will be discussed in
Module 6, this interpretation forms the basis of methods for calibrating
or estimating probabilistic models of choice.
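   The sampling interpretation can be checked numerically.  The sketch
below is illustrative only.  It builds the group of 20 zero-car, 50
one-car, and 30 two-car travelers described above, assigns each the mode
chosen in Example 3.1, and compares the shares in the group with the
frequencies obtained by random sampling.  The sample size, the random
seed, and all names are arbitrary assumptions.

```python
import random

# Hypothetical group of 100 similar travelers from Example 3.1:
# zero-car travelers choose bus, one-car travelers choose carpool,
# and two-car travelers choose drive alone.
travelers = ["bus"] * 20 + ["carpool"] * 50 + ["drive alone"] * 30

# Choice shares within the group.
shares = {mode: travelers.count(mode) / len(travelers)
          for mode in set(travelers)}

# Estimate sampling probabilities by drawing travelers at random with
# replacement; the seed only makes the run repeatable.
rng = random.Random(0)
draws = [rng.choice(travelers) for _ in range(100_000)]
estimates = {mode: draws.count(mode) / len(draws) for mode in shares}

for mode in sorted(shares):
    print(mode, shares[mode], round(estimates[mode], 3))
```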

3.7   Aggregate Prediction with Probabilistic Choice

   In Module 2, where the choices of individuals were predicted
deterministically, predictions of aggregate travel behavior were
obtained by summing the choices of the individuals comprising the
aggregate group.  Aggregate predictions when choices by individuals are
predicted probabilistically are obtained in an analogous manner: the
probabilities of choices by the individuals in the aggregate group are
summed.
   Consider Example 3.1, where choice is influenced by automobile
ownership but the analyst's specification of the utility function
includes only travel time, travel cost, and income.  Assume that the
distribution of automobile ownership in different income classes is as
follows:

     Income           % Zero-          % One-            % Two-
     ($000)       Car Households   Car Households    Car Households

      17.5              40               50                10

      22.5              20               60                20

      27.5              15               60                25

      32.5              10               60                30

      37.5               5               55                40

      42.5               0               50                50

The choice of each traveler can be obtained by substituting the values
of the travel time, travel cost, income, and automobile ownership
variables into Equation (3.1). The chosen modes for each income and
automobile ownership class are:

       Income            Automobile Ownership
       ($000)    Zero Cars      One Car         Two Cars

        17.5        Bus         Carpool        Drive Alone

        22.5      Carpool     Drive Alone      Drive Alone

        27.5      Carpool     Drive Alone      Drive Alone

        32.5      Carpool     Drive Alone      Drive Alone

        37.5      Carpool     Drive Alone      Drive Alone

        42.5      Carpool     Drive Alone      Drive Alone

Since the percentages of travelers in each income class owning 0, 1, and
2 cars are known, the percentages of travelers choosing each mode
according to income class can be computed.  The results are:

                                     Percent Choosing
            Income          Drive
            ($000)          Alone         Carpool          Bus

             17.5            10             50             40

             22.5            80             20              0

             27.5            85             15              0

             32.5            90             10              0

             37.5            95              5              0

             42.5            100             0              0

These percentages constitute the analyst's probabilistic predictions of
mode choice for each income class. (Recall that the analyst's utility
function does not include automobile ownership, so the analyst cannot
predict choice deterministically.) Using these predictions of individual
choice, predictions of aggregate choice are obtained in the following
way.  For each income class, the probability that an individual chooses
a given mode is multiplied by the number of individuals in the class,
thereby obtaining the number of individuals in the class who are
predicted to choose the given mode.  These predictions are then summed
over all income classes to obtain the total number of individuals who
are predicted to choose the given mode.
   As an example of this, suppose the numbers of individuals in the
income classes are known, so the analyst has the following information
available:

                                          Percent Choosing
       Income        Number of        Drive
       ($000)       Individuals       Alone    Carpool     Bus

        17.5            20             10        50        40

        22.5            60             80        20         0

        27.5            100            85        15         0

        32.5            100            90        10         0

        37.5            80             95         5         0

        42.5            40             100        0         0

Then, the number of individuals predicted to choose drive alone is
0.10(20) + 0.80(60) + 0.85(100) + 0.90(100) + 0.95(80) + 1.0(40) = 341.
Similarly, the numbers of individuals predicted to choose carpool and
bus are 51 and 8, respectively.
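   The aggregation procedure can be sketched in a few lines of code.
The example below is illustrative only.  It copies the numbers of
individuals and the choice percentages from the table above, and the
data structure and function name are hypothetical.

```python
# Income class (thousands of dollars) -> (number of individuals,
# percent choosing drive alone, carpool, bus), copied from the table.
CLASSES = {
    17.5: (20, 10, 50, 40),
    22.5: (60, 80, 20, 0),
    27.5: (100, 85, 15, 0),
    32.5: (100, 90, 10, 0),
    37.5: (80, 95, 5, 0),
    42.5: (40, 100, 0, 0),
}
MODES = ("drive alone", "carpool", "bus")


def aggregate(classes):
    """Sum (class size x choice probability) over all income classes."""
    totals = {mode: 0.0 for mode in MODES}
    for count, *percents in classes.values():
        for mode, percent in zip(MODES, percents):
            totals[mode] += count * percent / 100.0
    return totals


totals = aggregate(CLASSES)
print({mode: round(n) for mode, n in totals.items()})
```

The three totals sum to 400, the number of individuals in all income
classes combined.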

3.8 Summary

   The utility maximization principle provides a valuable framework for
the analysis of travel choice behavior.  However, deterministic utility
models are inadequate due to the inability of the analyst to know the
exact utility function of an individual and to measure accurately all of
the variables relevant to travel choice.  This module has explained how
limitations of the analyst's knowledge create unexplained variations in
travel choices and the appearance of probabilistic behavior.  In
addition, the module has provided a basis for modeling travel choice
probabilistically and has shown how aggregate travel behavior can be
predicted when choices by individuals are predicted probabilistically. 
The next module describes a specific probabilistic choice model that can
be used to predict individual choice probabilities.

                                EXERCISES

3.1   In Example 3.1, show how the utility values for each mode are
      obtained for individuals in households with one or two cars.  What
      is the corresponding set of utilities for individuals in
      households with three cars?

3.2   Using Example 3.2, determine the mode choice percentages for
      two-car households.

3.3   Combine Examples 3.1 and 3.2. Assume that the proportions of
      zero-, one-, and two-car households are 0.20, 0.50, and 0.30,
      respectively.  Also assume that the distribution of travel times
      for each automobile ownership class is as shown in Example 3.2.
      Compute the mode choice for each automobile ownership-travel time
      group.  Also compute the overall probability that each mode is
      chosen.

3.4   Suppose that the distribution of individuals according to income
      class is:

                        Income            Number of
                        ($000)           Individuals

                         17.5                10

                         22.5                20

                         27.5                20

                         32.5                30

                         37.5                20

                         42.5                 3

      Repeat the aggregate prediction of Section 3.7 using the above
      income distribution, thereby obtaining the total numbers of
      individuals predicted to choose each mode.

                                MODULE 4
                         THE LOGIT CHOICE MODEL

4.1   Introduction

   The preceding module showed that limitations of analysts' knowledge
of the variables that influence individuals' mode choices make it
necessary to predict these choices in terms of probabilities.  This
module introduces the most widely used mathematical model for making
probabilistic predictions of mode choices.
   Before the model is introduced, it is worthwhile to identify some of
the properties that a probabilistic choice model should have.  These
properties include:
   1. The probability of choosing a particular alternative should depend
      on the deterministic components of the utilities of all available
      alternatives.  The chosen alternative is the one with the highest
      total utility.  Therefore, it depends on the relative values of
      the total utilities of all alternatives.  It depends on the
      deterministic components of the utilities of all alternatives
      because the total utilities depend on the deterministic components
      of utility.
   2. The probability that an alternative is chosen should increase when
      the deterministic component of its utility increases.  It should
      decrease when the deterministic component of the utility of any
      other alternative increases.  This property follows from the fact
      that increasing the deterministic component of an alternative's
      utility increases the probability that that alternative has the
      highest utility.  Increasing the deterministic component of
      another alternative's utility decreases the probability that the
      first alternative has the highest utility.
   3. The model should accommodate choice sets containing any number of
      alternatives so that it can be applied regardless of the number of
      alternatives involved and can be used to predict the effects of
      changing the number of alternatives (e.g., as would happen when
      transit is initiated in an area that previously had none).
   4. The model should be easy to understand and to use in practice.

4.2   The Binomial Logit Model

   Before considering how choices among arbitrary numbers of
alternatives can be modeled, it is useful to consider a model of choice
among only two alternatives.  The most frequently used model of
probabilistic choice among two alternatives is the binomial logit model. 
In this model, the probability that alternative 1 is chosen when the
choice set consists of alternatives 1 and 2 is given by the following
formula:

   $$ \Pr(1) = \frac{\exp(V_1)}{\exp(V_1) + \exp(V_2)}   \tag{4.1} $$

where Pr(1) is the probability that an individual chooses alternative 1,
exp( ) is the exponential function, and V1 and V2 are the
deterministic components of the utilities of alternatives 1 and 2,
respectively.  The exponential function transforms its argument (the
expression in parentheses) as shown in Figure 4.1 and as tabulated in
Table 4.1. It can be seen from the figure and the table that the
exponential function is monotonically increasing (i.e., its value
increases when the value of its argument increases) and that its value
is always positive.

                  Figure 4.1: The Exponential Function


                  TABLE 4.1 -- THE EXPONENTIAL FUNCTION

                           Z                EXP(Z)

                         -3.0                0.050
                         -2.5                0.082
                         -2.0                0.135
                         -1.5                0.223
                         -1.0                0.368
                         -0.5                0.607
                          0.0                1.000
                          0.5                1.649
                          1.0                2.718
                          1.5                4.482
                          2.0                7.389
                          2.5               12.182
                          3.0               20.086

   As was discussed in Section 3.6 of Module 3, Pr(1) is the probability
that a randomly selected individual with deterministic utility
components V1 and V2 chooses alternative 1. Equation (4.1) implies
that in the binomial logit model, this probability increases
monotonically with the deterministic component of the utility of
alternative 1 and decreases monotonically with the deterministic
component of the utility of alternative 2.
   Since there are only two choices available to the individual, the
probability that alternative 2 is chosen is one minus the probability
that alternative 1 is chosen.  Thus,

$$
\begin{aligned}
\Pr(2) &= 1 - \frac{\exp(V_1)}{\exp(V_1) + \exp(V_2)} \\
       &= \frac{\exp(V_1) + \exp(V_2)}{\exp(V_1) + \exp(V_2)}
          - \frac{\exp(V_1)}{\exp(V_1) + \exp(V_2)} \\
       &= \frac{\exp(V_2)}{\exp(V_1) + \exp(V_2)}
\end{aligned}
\tag{4.2}
$$

This probability increases monotonically with the deterministic
component of the utility of alternative 2 and decreases monotonically
with the deterministic component of the utility of alternative 1.
   The binomial logit model has three of the four desirable properties
of probabilistic choice models that were listed in Section 4.1. The
binomial logit choice probabilities depend on the deterministic
components of the utilities of all alternatives (property 1); the
probability of choosing a particular alternative increases when the
deterministic component of the utility of that alternative increases,
and the probability decreases when the deterministic component of the
utility of the other alternative increases (property 2); and the model
is easy to understand and apply (property 4).  The binomial logit model
cannot treat choice among more than two alternatives, so it does not
have property 3.
   In the binomial logit model, the probabilities of choosing
alternatives 1 and 2 are equal when the deterministic components of the
two alternatives' utilities are equal.  Moreover, the choice
probabilities are most sensitive to changes in the deterministic
components of the utilities when these components are approximately
equal and the choice probabilities are close to 0.5. To see this, divide
the numerator and denominator of Equation (4.1) by exp(V1) to obtain

   $$ \Pr(1) = \frac{1}{1 + \exp[-(V_1 - V_2)]}   \tag{4.3} $$

Equation (4.3) shows that Pr(1) depends only on the difference between
V1 and V2. Figure 4.2 shows a graph of Pr(1) as a function of this
difference, and Table 4.2 tabulates Pr(1) for selected values of V1 and
V2.

        TABLE 4.2 -- VALUES OF PR(1) IN THE BINOMIAL LOGIT MODEL

        Case        V1        V2     V1 - V2    Pr(1)

          1        0.0       0.0       0.0      0.50
          2        0.5       0.0       0.5      0.62
          3        2.0       0.0       2.0      0.88
          4        2.5       0.0       2.5      0.92
          5        2.0      -0.5       2.5      0.92

It can be seen from the table and figure that:

   1. The probabilities of choosing alternatives 1 and 2 are both equal
      to 0.5 when the deterministic components of the alternatives'
      utilities are equal (i.e., when V1 - V2 = 0).  This is because
      exp(0) = 1 (see Table 4.1).
   2. The probability of choosing alternative 1 is more sensitive to
      changes in the deterministic component of the utility of either
      alternative when Pr(1) is close to 0.5 (i.e., when the
      deterministic components of utility are approximately equal) than
      when Pr(1) is close to 0 or 1. (The same statement applies to
      Pr(2).) For example, in Table 4.2, Pr(1) is equal to 0.5 in case 1
      but closer to 1.0 than to 0.5 in case 3.  V1 increases by 0.5
      from case 1 to case 2 and from case 3 to case 4. However, Pr(1)
      increases by 0.12 from case 1 to case 2 but by only 0.04 from case
      3 to case 4. This property is illustrated in Figure 4.2, where the
      plot of Pr(1) is steeper when Pr(1) is close to 0.5 than when it
      is close to 0 or 1.

             [Figure 4.2: Pr(1) as a function of V1 - V2]

   Another property of the binomial logit model is that Pr(1) is
affected equally by an increase in the value of V1 and an equal decrease
in the value of V2.  This property is illustrated in Table 4.2. The
change in Pr(1) is the same in going from case 3 to case 4 (an increase
of 0.5 in the value of V1) as in going from case 3 to case 5 (a decrease
of 0.5 in the value of V2). This result follows from the fact that the
change in V1 - V2 is the same in going from case 3 to case 4 as it is in
going from case 3 to case 5.
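   The entries of Table 4.2 can be reproduced directly from Equation
(4.3).  The following is a minimal sketch.  The function name pr1 and
the list of cases are hypothetical, and the (V1, V2) pairs are copied
from the table.

```python
import math


def pr1(v1, v2):
    """Binomial logit probability of alternative 1, Equation (4.3)."""
    return 1.0 / (1.0 + math.exp(-(v1 - v2)))


# (V1, V2) pairs for cases 1 through 5 of Table 4.2.
CASES = [(0.0, 0.0), (0.5, 0.0), (2.0, 0.0), (2.5, 0.0), (2.0, -0.5)]

for case, (v1, v2) in enumerate(CASES, start=1):
    print(case, v1, v2, v1 - v2, round(pr1(v1, v2), 2))
```

Cases 4 and 5 give the same probability because V1 - V2 equals 2.5 in
both.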

4.3   The Multinomial Logit Model

   The binomial logit model can easily be extended to accommodate
choices among more than two alternatives.  To see how this is done,
suppose, first, that there are three alternatives in the choice set and
that the deterministic components of their utilities are V1, V2, and
V3. In the extended model, the probability that alternative 1 is chosen
is

   $$ \Pr(1) = \frac{\exp(V_1)}{\exp(V_1) + \exp(V_2) + \exp(V_3)}   \tag{4.4} $$


The additional alternative is incorporated by adding an additional
exponential term to the denominator of the equation for Pr(1).  The
probability of choosing alternative 1 remains proportional to the
exponential function of the deterministic component of its utility.  In
general, Pr(1) is smaller when there are three alternatives in the
choice set than when there are only two (assuming that the values of V1
and V2 are the same in both cases).  This is because the total
probability of choosing any of the alternatives, which always equals
one, must be shared among more alternatives when there are three
alternatives in the choice set than
when there are only two.
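   The effect of adding a third alternative can be seen in a small
numerical sketch.  The utility values used here (2.5, 2.0, and 1.0) and
the function name are chosen purely for illustration.

```python
import math


def pr1(utilities):
    """Probability of the first alternative, as in Equations (4.1), (4.4)."""
    return math.exp(utilities[0]) / sum(math.exp(v) for v in utilities)


two = pr1([2.5, 2.0])          # choice set with alternatives 1 and 2 only
three = pr1([2.5, 2.0, 1.0])   # same V1 and V2, plus a third alternative
print(round(two, 2), round(three, 2))
```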
   A model that accommodates any specified number of alternatives can be
obtained by adding the appropriate exponential terms to the denominator
of equation (4.4). Specifically, suppose there are J alternatives in the
choice set, where J is any number greater than or equal to 2. Let the
deterministic components of the utilities of the alternatives be V1,
V2, ..., VJ.  Then the probability that alternative 1 is chosen is

   $$ \Pr(1) = \frac{\exp(V_1)}{\sum_{j=1}^{J} \exp(V_j)}   \tag{4.5} $$

   The probability of choosing alternative 2 is

   $$ \Pr(2) = \frac{\exp(V_2)}{\sum_{j=1}^{J} \exp(V_j)}   \tag{4.6} $$

   In general, the probability of choosing alternative i (i = 1, ..., J)
is

   $$ \Pr(i) = \frac{\exp(V_i)}{\sum_{j=1}^{J} \exp(V_j)}   \tag{4.7} $$


   Equation (4.7) is called the multinomial logit model. In the
multinomial logit model, as in the binomial logit model, the probability
of choosing an alternative increases monotonically with the
deterministic component of that alternative's utility and decreases
monotonically with the deterministic component of the utility of any
other alternative.
   The multinomial logit model has all of the desirable properties of
the binomial logit model and, in addition, can be applied to any number
of alternatives.  Thus, it has all of the desirable properties of a
probabilistic choice model listed in Section 4.1.
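   Equation (4.7) translates directly into code.  The sketch below is
one possible implementation, not a prescribed one.  The function name,
the alternative labels A through D, and the utility values are
hypothetical and serve only to show the model applied to four
alternatives and its response to a change in one utility.

```python
import math


def logit_probabilities(utilities):
    """Equation (4.7): map each alternative's V to its choice probability."""
    exp_v = {alt: math.exp(v) for alt, v in utilities.items()}
    total = sum(exp_v.values())   # denominator shared by all alternatives
    return {alt: e / total for alt, e in exp_v.items()}


# Hypothetical deterministic utilities for a four-alternative choice set.
V = {"A": 1.0, "B": 0.0, "C": -1.0, "D": 0.5}
base = logit_probabilities(V)

# Raise the deterministic utility of alternative A only and recompute.
raised = logit_probabilities({**V, "A": V["A"] + 0.5})

for alt in V:
    print(alt, round(base[alt], 3), round(raised[alt], 3))
```

In this sketch, raising V for alternative A increases Pr(A) and lowers
the probability of every other alternative, as property 2 of Section 4.1
requires.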

   An important additional property of both the binomial and multinomial
logit models is that the choice probabilities depend only on the
differences between the deterministic components of the alternatives'
utilities.  This property is illustrated in Equation (4.3) and Table
4.2 for the binomial logit model, where the probability of choosing
either alternative depends only on V1 - V2.  The corresponding
dependence for the multinomial logit model is illustrated by dividing
the numerator and denominator of Equation (4.4) by exp(V1) to obtain

   $$ \Pr(1) = \frac{1}{1 + \exp[-(V_1 - V_2)] + \exp[-(V_1 - V_3)]}   \tag{4.8} $$

In this case, the probability that alternative 1 is chosen depends only
on the values of V1 - V2 and V1 - V3.  If there were J alternatives
in the choice set, where J exceeds 3, then the probability that
alternative 1 is chosen would depend only on V1 - V2, V1 - V3, ...,
V1 - VJ.  In the multinomial logit model, choice probabilities never
depend on ratios of utilities such as V1/V2, V1/V3, etc.
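   The dependence on differences rather than ratios can be demonstrated
numerically.  In the sketch below, which uses hypothetical utility
values, adding the same constant to every utility leaves all differences
unchanged, whereas doubling every utility leaves all ratios unchanged.
Only the first operation leaves the choice probabilities unchanged.

```python
import math


def logit_probabilities(utilities):
    """Multinomial logit probabilities for a list of V values."""
    exp_v = [math.exp(v) for v in utilities]
    total = sum(exp_v)
    return [e / total for e in exp_v]


base = [2.5, 2.0, 1.0]
shifted = [v + 10.0 for v in base]   # same differences, different ratios
scaled = [2.0 * v for v in base]     # same ratios, different differences

p_base = logit_probabilities(base)
p_shifted = logit_probabilities(shifted)
p_scaled = logit_probabilities(scaled)

print([round(p, 3) for p in p_base])
print([round(p, 3) for p in p_shifted])
print([round(p, 3) for p in p_scaled])
```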

4.4   Application of the Multinomial Logit Model to Mode Choice Analysis

   The following example illustrates the application of the multinomial
logit model to mode choice analysis.

   Example 4.1: Application of the Multinomial Logit Model

   Consider travel to work, and let there be three modes in the choice
set: drive alone, carpool, and bus.  Let the deterministic components of
the utilities of these modes be:

                         Mode                V

                         Drive Alone        2.5
                         Carpool            2.0
                         Bus                1.0

Then, the values of the terms exp(Vj) and of their sum are

                         Mode               exp(V)

                         Drive Alone         12.18
                         Carpool             7.39
                         Bus                 2.72

                         Sum                 22.29

Substitution of these values into Equation (4.7) yields

               Pr(Drive Alone)     =  12.18/22.29      =  0.55

               Pr(Carpool)         =  7.39/22.29       =  0.33

               Pr(Bus)             =  2.72/22.29       =  0.12

As expected, the mode with the highest deterministic component of
utility (drive alone) has the highest probability of being chosen. 
Notice, also, that the sum of the probabilities over all available modes
equals 1. This always happens in the multinomial logit model because one
of the alternatives must be chosen.
   To verify that you understand the computation of multinomial logit
choice probabilities, compute the probabilities of drive alone, carpool,
and bus when the deterministic components of the utilities are 2.5 for
drive alone, 1.5 for carpool, and 1.0 for bus.  You can obtain the
values of the exponential function from Table 4.1. The correct
probabilities are 0.63
for drive alone, 0.23 for carpool, and 0.14 for bus.
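
   The arithmetic of Example 4.1 and of the check that follows it can be
reproduced with a few lines of code.  This is a minimal sketch rather
than part of the course: the function name mnl_probabilities is an
assumption, and math.exp stands in for the lookup in Table 4.1.

```python
import math

def mnl_probabilities(utilities):
    """Multinomial logit probabilities for a dict of mode -> V."""
    exps = {mode: math.exp(v) for mode, v in utilities.items()}
    total = sum(exps.values())
    return {mode: e / total for mode, e in exps.items()}

# Deterministic utility components from Example 4.1 and the self-check.
example_4_1 = {"Drive Alone": 2.5, "Carpool": 2.0, "Bus": 1.0}
self_check = {"Drive Alone": 2.5, "Carpool": 1.5, "Bus": 1.0}

for label, utilities in [("Example 4.1", example_4_1),
                         ("Self-check", self_check)]:
    probs = mnl_probabilities(utilities)
    print(label, {mode: round(p, 2) for mode, p in probs.items()})
```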

4.5   Incorporation of Attributes of Alternatives and Individuals

   Example 4.1 assumed a fixed value of the deterministic component of
each mode's utility.  In practice, the deterministic component of a
mode's utility depends on attributes of that mode (but not of other
modes) and of the individual making the choice.  The following example
illustrates how choice probabilities can be made to depend on attributes
of alternatives and individuals.

   Example 4.2: Choice Probabilities That Depend on Attributes

   Suppose that the deterministic component of the utility of mode j
(j = drive alone, carpool, or bus) is

   $$V_j = -T_j - \frac{5 C_j}{Y}, \tag{4.9}$$

where Tj and Cj, respectively, are the travel time (in hours) and cost
(in dollars) of mode j, and Y is the annual income (in thousands of
dollars) of the traveler.  Suppose the travel time and cost values are:

                    Mode              Time           Cost

                    Drive Alone       0.50           2.00
                    Carpool           0.75           1.00
                    Bus               1.00           0.75

Then the deterministic components of the modes' utilities and their
exponentials for individuals with incomes of $15,000 per year (Y = 15)
and $30,000 per year (Y = 30) are:

                                Y = 15              Y = 30
          Mode                V      exp(V)       V      exp(V)

          Drive Alone       -1.17     0.31      -0.83     0.44

          Carpool           -1.08     0.34      -0.92     0.40

          Bus               -1.25     0.29      -1.13     0.32

          Sum                         0.94                1.16


The corresponding choice probabilities are:


                                Y = 15         Y = 30
               Mode            Pr(Mode)       Pr(Mode)

               Drive Alone       0.33           0.38

               Carpool           0.36           0.34

               Bus               0.31           0.28

               Sum               1.00           1.00

Note that the probabilities of choosing the relatively inexpensive modes
(carpool and bus) are higher for the low-income individuals than for the
high-income ones.
   Now suppose that it is desired to predict the effects of increasing
the bus fare by $0.25. This fare change is represented in the
multinomial logit model by increasing the value of Cbus by $0.25. The
resulting choice probabilities are

                                Y = 15         Y = 30
               Mode            Pr(Mode)       Pr(Mode)

               Drive Alone       0.34           0.38

               Carpool           0.37           0.35

               Bus               0.29           0.27

               Sum               1.00           1.00

These probabilities reflect a shift away from the bus because of its
increased cost and resulting lower utility.  You should compute the
probabilities yourself to make sure you understand how they were
obtained.
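
   One way to carry out this computation is sketched below.  It is an
illustration under stated assumptions, not the course's own procedure:
the function and variable names are hypothetical, and because the sketch
carries unrounded intermediate values its output can differ from the
tables above in the last digit.

```python
import math

def utility(time_hr, cost_usd, income_k):
    """Equation (4.9): V = -T - 5C/Y, with Y in thousands of dollars."""
    return -time_hr - 5.0 * cost_usd / income_k

def mnl_probabilities(utilities):
    """Multinomial logit probabilities for a dict of mode -> V."""
    exps = {mode: math.exp(v) for mode, v in utilities.items()}
    total = sum(exps.values())
    return {mode: e / total for mode, e in exps.items()}

def choice_probabilities(modes, income_k):
    """modes maps mode name -> (travel time in hours, cost in dollars)."""
    v = {m: utility(t, c, income_k) for m, (t, c) in modes.items()}
    return mnl_probabilities(v)

# Travel times and costs from Example 4.2.
modes = {"Drive Alone": (0.50, 2.00),
         "Carpool": (0.75, 1.00),
         "Bus": (1.00, 0.75)}

# The $0.25 fare increase is represented by raising the bus cost.
modes_after = dict(modes)
modes_after["Bus"] = (1.00, 0.75 + 0.25)

for income_k in (15, 30):
    before = choice_probabilities(modes, income_k)
    after = choice_probabilities(modes_after, income_k)
    for m in modes:
        print(f"Y={income_k} {m:12s} {before[m]:.3f} -> {after[m]:.3f}")
```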

4.6   Alternative-Specific Constants

   In the logit model used in Example 4.2, two modes have equal
probabilities of being chosen if they have equal travel times and travel
costs. (As a check of your understanding of the logit model, explain why
this is so.) In practice, however, other factors, such as comfort,
reliability, and safety, may cause one mode to have a greater
probability of being chosen than another, even if the two modes have
equal travel times and costs.  The best way to account for the effects
of such other factors is to include variables representing them in the
deterministic component of the utility function.  However, this often is
not possible in practice, since many of these factors are difficult to
measure and predict.  An alternative method that always can be
implemented easily consists of adding appropriate constant terms to the
deterministic components of the utility functions of all the modes
except one.  These constants are called alternative-specific constants. 
The mode whose deterministic utility component does not include such a
constant is called the base mode.  The alternative-specific constant
for a given mode is the average amount that factors not included in the
deterministic component of the utility function contribute to the
difference between the utilities of the given mode and the base mode. 
In other words, it is the average contribution of the error terms to the
differences between the two modes' utilities.  It does not matter which
mode is selected as the base mode; the values of the choice
probabilities will be the same for any base mode if the values of the
alternative-specific constants are assigned correctly. (Alternative-
specific constants are sometimes called bias constants since they seem
to represent biases by travelers toward or against other modes compared
to the base mode.  However, this term is misleading.  The constants do
not represent biases.  They represent the average effects of variables
not present in the model.)
   The following example illustrates the use of alternative-specific
constants.

   Example 4.3: Alternative-Specific Constants

   Suppose that the deterministic components of the utility functions of
drive alone, carpool, and bus are

   $$V_{DA} = 0.8 - T_{DA} - \frac{5 C_{DA}}{Y} \tag{4.10a}$$

   $$V_{CP} = 0.2 - T_{CP} - \frac{5 C_{CP}}{Y} \tag{4.10b}$$

   $$V_B = -T_B - \frac{5 C_B}{Y}. \tag{4.10c}$$

   In this case, bus is the base mode, and the alternative-specific
constants for drive alone and carpool are 0.8 and 0.2, respectively. The
signs and magnitudes of these constants indicate that on the average,
factors other than travel time and cost that affect mode choice tend to
favor drive alone over both carpool and bus and carpool over bus.

   To illustrate the effects of alternative-specific constants on logit
choice probabilities, suppose that the travel time and cost values are
the same as in Example 4.2 and that Y = 30.  Then the values of the
deterministic components of utility with and without the alternative-
specific constants are:

                      Without Constants      With Constants
   Mode               V       exp(V)         V       exp(V)

   Drive Alone      -0.83      0.44        -0.03      0.97
   Carpool          -0.92      0.40        -0.72      0.49
   Bus              -1.13      0.32        -1.13      0.32

   Sum                         1.16                   1.78

The resulting logit choice probabilities with and without the
alternative-specific constants are:

                           Without         With
                          Constants      Constants
          Mode            Pr(Mode)       Pr(Mode)

          Drive Alone       0.38           0.54

          Carpool           0.34           0.28

          Bus               0.28           0.18

          Sum               1.00           1.00

The choice probabilities with the alternative-specific constants are
very different from and more realistic than those obtained without the
constants.
   Any mode can be selected as the base mode when alternative-specific
constants are introduced into a model.  It does not matter which mode is
the base.  The choice probabilities will be the same, regardless of base, if
the differences between the values of the alternative-specific constants
for any two alternatives are the same for all choices of base.  For
example, suppose that drive alone, rather than bus, had been selected as
the base mode in Example 4.3. Suppose, in addition, that the
deterministic components of the utility functions had been

   $$V_{DA} = -T_{DA} - \frac{5 C_{DA}}{Y} \tag{4.11a}$$

   $$V_{CP} = -0.6 - T_{CP} - \frac{5 C_{CP}}{Y} \tag{4.11b}$$

   $$V_B = -0.8 - T_B - \frac{5 C_B}{Y}. \tag{4.11c}$$

Then, just as in Equations (4.10), the difference between the
alternative-specific constants for drive alone and carpool is 0.6 (that
is, 0.0 - (-0.6)), the difference between the constants for drive alone
and bus is 0.8 (that is, 0.0 - (-0.8)), and the difference between the
constants for carpool and bus is 0.2 (that is, -0.6 - (-0.8)). You
should verify that logit models based on Equations (4.10) and (4.11)
yield the same choice probabilities by evaluating these probabilities
for the values of the travel time, cost, and income variables used in
Example 4.2.
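
   The suggested verification can also be done by machine.  The sketch
below is an illustration only; the dictionary and function names are
assumptions introduced here.  It evaluates the logit probabilities with
the constants of Equations (4.10) (bus as base) and of Equations (4.11)
(drive alone as base) for the Example 4.2 times and costs and Y = 30.

```python
import math

def mnl_probabilities(utilities):
    """Multinomial logit probabilities for a dict of mode -> V."""
    exps = {mode: math.exp(v) for mode, v in utilities.items()}
    total = sum(exps.values())
    return {mode: e / total for mode, e in exps.items()}

# Travel times (hours) and costs (dollars) from Example 4.2.
times = {"DA": 0.50, "CP": 0.75, "B": 1.00}
costs = {"DA": 2.00, "CP": 1.00, "B": 0.75}
income_k = 30.0

constants_bus_base = {"DA": 0.8, "CP": 0.2, "B": 0.0}    # Equations (4.10)
constants_da_base = {"DA": 0.0, "CP": -0.6, "B": -0.8}   # Equations (4.11)

def probabilities(constants):
    """V = constant - T - 5C/Y for each mode, then the logit formula."""
    v = {m: constants[m] - times[m] - 5.0 * costs[m] / income_k
         for m in times}
    return mnl_probabilities(v)

p_bus_base = probabilities(constants_bus_base)
p_da_base = probabilities(constants_da_base)

for m in times:
    print(f"{m:3s} {p_bus_base[m]:.4f} {p_da_base[m]:.4f}")
```

The two columns agree because the constants in Equations (4.11) equal
those in Equations (4.10) minus 0.8 for every mode, and a common shift
in all utilities cancels in the logit formula.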

4.7   Independence from Irrelevant Alternatives

   One of the most important properties of the multinomial logit model
is independence from irrelevant alternatives (IIA).  The IIA property
states that for any individual, the ratio of the probabilities of
choosing two alternatives is independent of the availability or
attributes of any other alternatives.  For example, in a multinomial
logit model of choice between drive alone, carpool, and bus, the
probabilities of choosing drive alone and carpool are

   $$\Pr(DA) = \frac{\exp(V_{DA})}{\exp(V_{DA}) + \exp(V_{CP}) + \exp(V_B)} \tag{4.12a}$$

   and

   $$\Pr(CP) = \frac{\exp(V_{CP})}{\exp(V_{DA}) + \exp(V_{CP}) + \exp(V_B)} \tag{4.12b}$$

The ratio of these probabilities is

   $$\frac{\Pr(DA)}{\Pr(CP)} = \frac{\exp(V_{DA})}{\exp(V_{CP})} = \exp(V_{DA} - V_{CP}) \tag{4.13}$$

This ratio is independent of the attributes and availability of bus. 
The ratio is the same regardless of whether bus is an available
alternative.
   In the general multinomial logit model, the probability of choosing
alternative i when there are J alternatives in the choice set is given
by Equation (4.7). Equation (4.7) implies that for any two alternatives
i and k,

   $$\frac{\Pr(i)}{\Pr(k)} = \frac{\exp(V_i)}{\exp(V_k)} = \exp(V_i - V_k) \tag{4.14}$$

This equation shows that the ratio Pr(i)/Pr(k) depends only on Vi and
Vk.  The ratio is the same regardless of which other alternatives, if
any, are in the choice set and regardless of the attributes of any other
alternatives.
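
   The constancy of this ratio can be illustrated numerically.  The
following sketch is an assumption-laden illustration, not part of the
course: it borrows the drive alone and carpool utilities of Example 4.1,
and the scenario labels and the bus utility of 3.0 are hypothetical.

```python
import math

def mnl_probabilities(utilities):
    """Multinomial logit probabilities for a dict of mode -> V."""
    exps = {mode: math.exp(v) for mode, v in utilities.items()}
    total = sum(exps.values())
    return {mode: e / total for mode, e in exps.items()}

v_da, v_cp = 2.5, 2.0  # utilities of drive alone and carpool (Example 4.1)

# Three choice sets: bus absent, bus as in Example 4.1, bus much improved.
scenarios = {
    "no bus": {"DA": v_da, "CP": v_cp},
    "bus, V = 1.0": {"DA": v_da, "CP": v_cp, "B": 1.0},
    "bus, V = 3.0": {"DA": v_da, "CP": v_cp, "B": 3.0},
}

ratios = {}
for label, utilities in scenarios.items():
    p = mnl_probabilities(utilities)
    ratios[label] = p["DA"] / p["CP"]
    print(f"{label:13s} Pr(DA)={p['DA']:.3f} Pr(CP)={p['CP']:.3f} "
          f"ratio={ratios[label]:.4f}")
```

In every scenario the ratio equals exp(VDA - VCP), as Equation (4.13)
requires, even though the individual probabilities change.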
   The IIA property limits the responses to transportation changes that
can be predicted by the multinomial logit model.  For example, if the
available modes are drive alone, carpool, and bus, a multinomial logit
model predicts that the proportion of non-bus travelers choosing carpool
(the ratio Pr(CP)/[Pr(DA) + Pr(CP)]) is independent of the quality of
bus service.  Therefore, an improvement in bus service would be
predicted by a multinomial logit model to draw travelers from drive
alone and carpool in proportion to the original shares of these modes. 
The improvement in bus service would not be predicted to draw travelers
mainly from carpools, say, unless carpooling were the dominant non-bus
mode.  This is an important consequence of the IIA property that will be
discussed further in Subsection 4.7a.
   There are two other important practical consequences of the IIA
property, in addition to the limitations it places on the predictions
that can be made by multinomial logit models.  These are:
   1. It greatly simplifies the process of predicting the consequences
      of adding a mode to the choice set.
   2. It provides great flexibility in the forms of data that can be
      used to calibrate models.

The first of these consequences is discussed in Subsection 4.7b.
Discussion of the second consequence is an advanced topic that will not
be treated in this course.

    4.7a Limits on the Applicability of Multinomial Logit Due to IIA

The IIA property limits the effectiveness of the multinomial logit model
in predicting choices and changes in choices in certain circumstances. 
An extreme example of this problem is called the red bus/blue bus
paradox.

   Example 4.4: The Red Bus/Blue Bus Paradox

   Suppose the modes available for travel between home and work are
drive alone and a bus that is painted red (red bus or RB).  Assume that
the attributes of drive alone and red bus are such that VDA = VRB. 
Then the binomial logit formula (Equations (4.1) and (4.2)) implies that
Pr(DA) = Pr(RB) = 0.5. Now suppose a competing bus operator starts
operating a bus painted blue (blue bus or BB) on the same route as the
red bus.  The blue bus uses exactly the same kind of vehicle, runs on
exactly the same schedule, and serves exactly the same stops as the red
bus.  The only difference between the red and blue buses is their
color.  If color does
not affect choice of mode, then initiation of blue bus service should
cause existing bus riders to divide evenly between the red and blue
buses.  The addition of blue bus to travelers' choice sets should have
no effect on travelers who choose to drive alone because it does not
affect the relative service quality of drive alone and bus. (The
assumption that the red and blue buses have identical schedules implies
that effective service frequency is unchanged by the initiation of blue
bus service.) Therefore, the choice probabilities following the
initiation of blue bus service should be Pr(DA) = 0.5, Pr(RB) = 0.25,
and Pr(BB) = 0.25.
   Now consider the prediction made by the logit model.  Since the red
and blue buses are identical in all attributes relevant to mode choice,
VRB = VBB.  In addition, VDA = VRB by assumption.  Therefore, the
deterministic components of the utilities of the three modes drive
alone, red bus, and blue bus are equal.  Let V denote this common value. 
Then for any of the three modes

   $$\Pr(\text{mode}) = \frac{\exp(V)}{\exp(V) + \exp(V) + \exp(V)} = \frac{1}{3}. \tag{4.15}$$


According to this equation, introduction of the blue bus causes the
share of drive alone to decrease from 1/2 to 1/3 of the travelers.  That
is, 1/3 of the original drive alone travelers are predicted to switch to
bus. (To obtain this result, suppose that there are 30 travelers in all.
Then before the initiation of blue bus service, the number predicted to
drive alone is 0.5(30) = 15.  If choices after the initiation of blue
bus service are given by Equation (4.15), then the number of travelers
choosing to drive alone after blue bus service starts is predicted to be
30(1/3) = 10.  Thus 1/3 of the drive alone travelers are predicted to
switch to bus.) This result is both inconsistent with the expectations
developed in the previous paragraph
and unreasonable.
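
   The logit prediction in this example can be reproduced with the short
sketch below.  It is illustrative only: the helper name is an assumption,
and the common utility value is set to 0 arbitrarily, since any common
value gives the same probabilities.

```python
import math

def mnl_probabilities(utilities):
    """Multinomial logit probabilities for a dict of mode -> V."""
    exps = {mode: math.exp(v) for mode, v in utilities.items()}
    total = sum(exps.values())
    return {mode: e / total for mode, e in exps.items()}

v = 0.0  # common utility value; the choice of 0 is arbitrary

before = mnl_probabilities({"DA": v, "RB": v})
after = mnl_probabilities({"DA": v, "RB": v, "BB": v})

travelers = 30
drivers_before = travelers * before["DA"]
drivers_after = travelers * after["DA"]

print("Pr(DA) before blue bus:", round(before["DA"], 3))
print("Pr(DA) after blue bus: ", round(after["DA"], 3))
print("drivers predicted to switch:", round(drivers_before - drivers_after, 1))
```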

   The red bus/blue bus paradox provides an important illustration of
the possible consequences of IIA, but it is extreme.  A more realistic
example of the effects of IIA is the following:

   Example 4.5: Effects of the IIA Property

   Consider an individual who has a choice between drive alone, carpool,
bus, and light rail.  Let the deterministic component of the logit
utility function be

   $$V_{DA} = 0.8 - T_{DA} - 0.25 C_{DA} \tag{4.16a}$$

   $$V_{CP} = 0.2 - T_{CP} - 0.25 C_{CP} \tag{4.16b}$$

   $$V_B = -0.2 - T_B - 0.25 C_B \tag{4.16c}$$

   $$V_{LR} = -T_{LR} - 0.25 C_{LR}, \tag{4.16d}$$

where LR denotes light rail, and T and C are the travel time in hours
and travel cost in dollars.  Let the values of T and C be

               Mode         Time      Cost

               Drive Alone  0.50      2.00
               Carpool      0.75      1.00
               Bus          1.20      0.50
               Light Rail   1.00      0.75

Then the values of V and the choice probabilities are:

               Mode           V      exp(V)  Pr(Mode)

               Drive Alone  -0.20     0.819    0.458
               Carpool      -0.80     0.449    0.251
               Bus          -1.53     0.217    0.121
               Light Rail   -1.19     0.304    0.170

               Sum                    1.789    1.000

    Now suppose that the cost of traveling by light rail increases by
$0.50. If bus and light rail operate in the same corridors, we would
expect that most individuals diverted away from light rail would choose
to travel by bus.  However, according to the logit model, the new choice
probabilities are:

               Mode           V      exp(V)  Pr(Mode)

               Drive Alone  -0.20     0.819    0.467
               Carpool      -0.80     0.449    0.256
               Bus          -1.53     0.217    0.123
               Light Rail   -1.31     0.270    0.154

               Sum                    1.755    1.000

The logit model's prediction of change in the probability of choosing
each mode is shown in the following table:

                               Pr(Mode)
                    Before Cost   After Cost
         Mode        Increase      Increase     Change

         Drive Alone   0.458         0.467      +0.009

         Carpool       0.251         0.256      +0.005

         Bus           0.121         0.123      +0.002

         Light Rail    0.170         0.154      -0.016

         Sum           1.000         1.000       0.000

Notice that the probability of choosing each mode other than light rail
is predicted to increase in proportion to its original share.  This is a
consequence of the IIA property, which requires the ratios Pr(drive
alone)/Pr(carpool) and Pr(drive alone)/Pr(bus) to stay constant when the
cost of light rail travel increases (see equation (4.14)). In aggregate
terms, the riders who stop using light rail when its cost increases are
predicted to distribute themselves among the remaining modes in
proportion to the initial probabilities of choosing the remaining modes. 
Therefore, most of the riders who leave light rail are predicted to
drive alone since drive alone has the highest initial choice probability
(0.458). However, such a result, though possible (e.g., if bus and light
rail operate in different corridors so that bus is not a feasible
alternative for light rail travelers), is not necessarily realistic. 
For example, it is not consistent with our expectations if bus is an
alternative to light rail.  This inconsistency between the predictions
of the logit model and reasonable expectations limits the usefulness of
the multinomial logit model in situations such as this one.
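
   The proportional redistribution described above can be confirmed
numerically.  The sketch below is a hedged illustration using the
utility functions of Equations (4.16); the variable names are assumptions
introduced here, and small last-digit differences from the tables can
arise because the sketch does not round intermediate values.

```python
import math

def mnl_probabilities(utilities):
    """Multinomial logit probabilities for a dict of mode -> V."""
    exps = {mode: math.exp(v) for mode, v in utilities.items()}
    total = sum(exps.values())
    return {mode: e / total for mode, e in exps.items()}

# Constants, times (hours), and costs (dollars) from Example 4.5.
constants = {"Drive Alone": 0.8, "Carpool": 0.2, "Bus": -0.2, "Light Rail": 0.0}
times = {"Drive Alone": 0.50, "Carpool": 0.75, "Bus": 1.20, "Light Rail": 1.00}
costs = {"Drive Alone": 2.00, "Carpool": 1.00, "Bus": 0.50, "Light Rail": 0.75}

def probabilities(costs):
    """Equations (4.16): V = constant - T - 0.25C."""
    v = {m: constants[m] - times[m] - 0.25 * costs[m] for m in constants}
    return mnl_probabilities(v)

before = probabilities(costs)
after = probabilities({**costs, "Light Rail": costs["Light Rail"] + 0.50})

# Probability lost by light rail, and the non-rail total it is spread over.
lost = before["Light Rail"] - after["Light Rail"]
non_rail_total = 1.0 - before["Light Rail"]

gains = {}
proportional = {}
for m in ("Drive Alone", "Carpool", "Bus"):
    gains[m] = after[m] - before[m]
    proportional[m] = lost * before[m] / non_rail_total
    print(f"{m:12s} gain={gains[m]:+.4f} proportional={proportional[m]:+.4f}")
```

Each mode's gain equals the probability lost by light rail multiplied by
that mode's share of the original non-rail probability, which is the
IIA pattern discussed in the text.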

   In most existing models, IIA problems involving trade-offs between
competing transit modes are avoided through the simplifying assumption
that transit travelers choose the transit mode that provides the fastest
travel.  Thus, the choice between bus and light rail is most commonly
treated during the building of paths through the transit network, rather
than in predicting mode shares.  The mode choice model includes only a
single, generic transit mode that involves bus travel for certain trips
and light rail travel for others.
   Combining transit modes in this way, however, can lead to serious
prediction errors.  Moreover, the potential problems posed by IIA are
not restricted to choices among transit modes.  They can also arise in
choices among automobile-based modes, such as drive alone and carpool. 
Therefore, it is worthwhile to consider how IIA problems can be avoided
without combining modes.
   Frequently, it is possible to avoid unrealistic consequences of IIA
by including additional variables in the deterministic component of the
utility function.  As an illustration of this, suppose that in Example
4.5 the light rail travelers are mainly individuals who do not have cars
and, therefore, are highly unlikely to choose to drive alone.  Then
individuals who choose not to use light rail after the cost increases
will switch mainly to carpool and bus.  If carpooling is difficult for
individuals who do not have cars available, then individuals who stop
using light rail will switch mainly to bus.  These effects can be
accommodated within a multinomial logit model by including the variable
automobile ownership in the deterministic component of the utility
function.  The following example illustrates this.

   Example 4.6: Avoiding the Unrealistic Consequences of IIA

   Suppose, as in Example 4.5, that travelers choose between the modes
drive alone, carpool, bus, and light rail.  However, let the
deterministic components of the utilities of these modes be

   $$V_{DA} = -2.84 - T_{DA} - 0.25 C_{DA} + 4.5A \tag{4.17a}$$

   $$V_{CP} = -2.17 - T_{CP} - 0.25 C_{CP} + 3.5A \tag{4.17b}$$

   $$V_B = -0.20 - T_B - 0.25 C_B \tag{4.17c}$$

   $$V_{LR} = -T_{LR} - 0.25 C_{LR}, \tag{4.17d}$$

where A is the number of automobiles owned by the traveler's household. 
As in Example 4.5, let the values of travel time (T) and travel cost (C)
be:

                           Time           Cost
              Mode        (Hrs.)           ($)

              Drive Alone  0.50           2.00

              Carpool      0.75           1.00

              Bus          1.20           0.50

              Light Rail   1.00           0.75

Then the values of V and exp(V) for travelers whose households own 0, 1,
and 2 cars are:

                   0 Cars               1 Car                2 Cars
 Mode            V      exp(V)       V      exp(V)        V      exp(V)

 Drive Alone   -3.83     0.022     0.664     1.94       5.16      174.

 Carpool       -3.17     0.042     0.334     1.40       3.83      46.2

 Bus           -1.53     0.217     -1.53     0.217     -1.53      0.217

 Light Rail    -1.19     0.305     -1.19     0.304     -1.19      0.304

 Sum                     0.586               3.86                 221.

The multinomial logit choice probabilities according to automobile
ownership level are:

                                         Pr(Mode)
           Mode           0 Cars         1 Car          2 Cars

           Drive Alone    0.0368         0.503          0.789

           Carpool        0.0720         0.362          0.209

           Bus            0.370          0.0562         0.0010

           Light Rail     0.521          0.0790         0.0014

           Sum            1.000          1.000          1.000

Notice that bus and light rail are used mainly by 0-car owners and that
drive alone and carpool are used mainly by 1- and 2-car owners.  Suppose
that 25% of the travelers under consideration own 0 cars, 50% own 1 car,
and 25% own 2 cars.  Then the aggregate share of each mode in the
population as a whole can be obtained by substituting the choice
probabilities according to automobile ownership level into the formula

   Share(Mode) =  0.25Pr(Mode for A = 0) + 0.50Pr(Mode for A = 1)

               +  0.25Pr(Mode for A = 2).                         (4.18)

The results of this substitution are shown in the following table:

                                    Aggregate
                         Mode         Share

                         Drive Alone  0.458

                         Carpool      0.251

                         Bus          0.121

                         Light Rail   0.170

Notice that these aggregate shares are exactly the same as the choice
probabilities in Example 4.5.
   Now assume that the cost of light rail transit increases by $0.50.
The following table shows the resulting values of V and exp(V) according
to automobile ownership for each mode:

                 0 Cars              1 Car               2 Cars
 Mode          V      exp(V)       V      exp(V)       V      exp(V)

 Drive Alone -3.83     0.022     0.664     1.94      5.16      174.

 Carpool     -3.17     0.042     0.334     1.40      3.83      46.2

 Bus         -1.53     0.217     -1.53     0.217     -1.53     0.217

 Light Rail  -1.31     0.269     -1.31     0.269     -1.31     0.269

 Sum                   0.550               3.83                221.


The new choice probabilities according to automobile ownership level
are:

                                   Pr(Mode)
         Mode             0 Cars     1 Car    2 Cars

         Drive Alone      0.0392     0.508     0.789

         Carpool          0.0767     0.365     0.209

         Bus               0.395    0.0567    0.0010

         Light Rail        0.489    0.0704    0.0012

         Sum               1.000     1.000     1.000

The changes in the choice probabilities according to automobile
ownership level can be obtained by subtracting Pr(Mode) before the
increase in light rail cost from Pr(Mode) after the increase.  The
results are:

                          Change in Pr(Mode)
         Mode             0 Cars     1 Car    2 Cars

         Drive Alone      +0.0024   +0.005     0.000

         Carpool          +0.0047   +0.003     0.000

         Bus              +0.025    +0.0005    0.000

         Light Rail       -0.032    -0.0086   -0.0002

Notice that, as expected, the travelers who have changed mode are mainly
those who own 0 cars and that the mode change by these travelers
consists mainly of switching from light rail to bus.
   The aggregate shares following the increase in the cost of light rail
travel can be obtained from Equation (4.18). These shares and the
changes in shares caused by the cost increase are:

                      Share
                   Before Cost    After Cost
       Mode         Increase       Increase         Change

       Drive Alone    0.458          0.461          +0.003

       Carpool        0.251          0.254          +0.003

       Bus            0.121          0.127          +0.006

       Light Rail     0.170          0.158          -0.012

       Sum            1.000          1.000           0.000

Notice that in contrast to the situation in Example 4.5, the bus share
now increases by twice as much as either the drive alone or the carpool
share.  This result is consistent with expectations when light rail and
bus serve the same corridors and the light rail travelers consist mainly
of individuals without cars. (Recall that expectations under these
conditions were developed in Example 4.5.) Thus, the change in the
specification of the deterministic component of the utility function
has remedied the
unreasonable consequences of the IIA property that were found in Example
4.5.
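
   The calculations of Example 4.6 can be sketched in code as follows.
This is an illustration under stated assumptions, not the course's own
procedure: the names are hypothetical, the coefficients are taken as
printed in Equations (4.17), and because intermediate values are not
rounded the output can differ from the tables in the last digit.

```python
import math

def mnl_probabilities(utilities):
    """Multinomial logit probabilities for a dict of mode -> V."""
    exps = {mode: math.exp(v) for mode, v in utilities.items()}
    total = sum(exps.values())
    return {mode: e / total for mode, e in exps.items()}

# Coefficients as printed in Equations (4.17); A is household autos owned.
constants = {"Drive Alone": -2.84, "Carpool": -2.17, "Bus": -0.20, "Light Rail": 0.0}
auto_coef = {"Drive Alone": 4.5, "Carpool": 3.5, "Bus": 0.0, "Light Rail": 0.0}
times = {"Drive Alone": 0.50, "Carpool": 0.75, "Bus": 1.20, "Light Rail": 1.00}
costs = {"Drive Alone": 2.00, "Carpool": 1.00, "Bus": 0.50, "Light Rail": 0.75}

# Population weights from the text: 25% own 0 cars, 50% own 1, 25% own 2.
segment_weights = {0: 0.25, 1: 0.50, 2: 0.25}

def aggregate_shares(costs):
    """Equation (4.18): weight each segment's probabilities by its size."""
    shares = {mode: 0.0 for mode in constants}
    for autos, weight in segment_weights.items():
        v = {m: constants[m] - times[m] - 0.25 * costs[m] + auto_coef[m] * autos
             for m in constants}
        for mode, p in mnl_probabilities(v).items():
            shares[mode] += weight * p
    return shares

before = aggregate_shares(costs)
after = aggregate_shares({**costs, "Light Rail": costs["Light Rail"] + 0.50})

for mode in constants:
    change = after[mode] - before[mode]
    print(f"{mode:12s} {before[mode]:.3f} {after[mode]:.3f} {change:+.3f}")
```

Within each automobile ownership level the model still has the IIA
property; the aggregate shares depart from the proportional pattern only
because the segments have different choice probabilities and are then
averaged.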

   Example 4.6 has shown how the unreasonable consequences of the IIA
property can be alleviated by including an additional variable in the
deterministic component of the utility function.  Another way to
alleviate these consequences is to base predictions on a model, other
than the multinomial logit model, that does not have the IIA property. 
Discussion of such models is beyond the scope of this course.  Readers
interested in learning about them should consult the book by Ben-Akiva
and Lerman listed in the references at the end of Module 1.

   4.7b Introduction of New Modes

   An important problem in transportation analysis is the prediction of
ridership on new travel modes.  One of the advantages of the IIA
property is that it greatly simplifies the process of predicting the
effects of adding a new mode to the choice set.  The following example
illustrates how this is done.

   Example 4.7: Introduction of a New Mode

   Consider a traveler who can choose between drive alone and carpool. 
Let the probabilities with which these modes are chosen be given by a
binomial logit model in which the deterministic component of the utility
function is as in Equation (4.9). Let the traveler's income be
$20,000 per year, and let the values of the travel time and cost
variables be:

                                Time           Cost
              Mode             (Hours)          ($)

              Drive Alone        0.5           2.00

              Carpool            0.6           1.00


Then, the values of V and the choice probabilities are

           Mode           V           exp(V)         Pr(Mode)

           Drive Alone  -1.00          0.37           0.46
           Carpool      -0.85          0.43           0.54

           Sum                         0.80           1.00

   Now suppose that bus service is initiated and that it has a travel
time of 0.8 hr. and a fare of $0.60. The probability that a traveler
will choose bus and the new probabilities that drive alone and carpool
are chosen can be obtained from the multinomial logit model (Equation
(4.7)) if the value of the deterministic component of bus utility is
known.  This value can be obtained by substituting the values of T and C
for bus into Equation (4.9) to obtain VB = -0.95. The resulting
computation of the mode choice probabilities for the three-mode choice
set is:

           Mode           V           exp(V)         Pr(Mode)

           Drive Alone  -1.00          0.37            0.31
           Carpool      -0.85          0.43            0.36
           Bus          -0.95          0.39            0.33

           Sum                         1.19            1.00
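
   The computation in Example 4.7 can be sketched as follows.  The code
is an illustration rather than part of the course, and its function and
variable names are assumptions.  It also shows the simplification that
IIA provides: once Pr(Bus) is known, the new probabilities of the
existing modes are their old probabilities multiplied by 1 - Pr(Bus).

```python
import math

def utility(time_hr, cost_usd, income_k):
    """Equation (4.9): V = -T - 5C/Y, with Y in thousands of dollars."""
    return -time_hr - 5.0 * cost_usd / income_k

def mnl_probabilities(utilities):
    """Multinomial logit probabilities for a dict of mode -> V."""
    exps = {mode: math.exp(v) for mode, v in utilities.items()}
    total = sum(exps.values())
    return {mode: e / total for mode, e in exps.items()}

income_k = 20.0
existing = {"Drive Alone": (0.5, 2.00), "Carpool": (0.6, 1.00)}
new_mode = {"Bus": (0.8, 0.60)}

def probabilities(modes):
    """modes maps mode name -> (travel time in hours, cost in dollars)."""
    return mnl_probabilities(
        {m: utility(t, c, income_k) for m, (t, c) in modes.items()})

before = probabilities(existing)
after = probabilities({**existing, **new_mode})

# IIA: existing modes keep their relative shares after bus is added.
rescaled = {m: before[m] * (1.0 - after["Bus"]) for m in existing}

for m in after:
    print(f"{m:12s} {after[m]:.2f}")
```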

   In Example 4.7, the deterministic component of the utility function
does not contain alternative-specific constants.  In practice, these
constants usually are present, which makes it necessary to assign a
value to the alternative-specific constant for the new mode before
predictions of the effects of adding this mode can be made.  This value
usually must be assigned judgmentally in practice.  Although guidance as
to an appropriate range of values sometimes can be obtained by examining
mode choice models developed for cities where the new mode already is in
operation, there is almost always considerable uncertainty as to the
best value to use.  As a result, there is likely to be considerable
uncertainty as to the effects of introducing the new mode.

4.8 Summary

   This module has presented the binomial and multinomial logit models
and has explained how they can be used to describe probabilistic choice
behavior.  Several examples have illustrated the properties of these
models, including their sensitivity to changes in the deterministic
components of utility, their dependence on utility differences, the
importance of alternative-specific constants, the IIA property, and
their ability to facilitate prediction of the effects of introducing a
new mode.

                                EXERCISES

   4.1   Use Equation (4.3) to compute the probability that alternative
         1 is chosen for the following values of the deterministic
         components of utility:

                      Case       V1       V2

                        6        1.0      -1.5
                        7        0.5      -2.0
                        8        3.0      -1.0
                        9        0.0      -0.5
                       10        0.0       2.5

      a. Referring to Table 4.2, compare cases 4, 5, and 7. What can you
         conclude about the importance of utility values as opposed to
         utility differences?
      b. Compare cases 4 and 10.  What can you conclude about the
         relation between these cases?

   4.2   Repeat the analysis shown in Example 4.1 for the case where the
         deterministic component of the utility of carpool is 1.5.
         Repeat again using 1.0 as the deterministic component of the
         utility of carpool.

   4.3   Suppose that in Example 4.7 two new modes were added, bus and
         bicycle.  Let the travel time and cost of bus be as in the
         example.  Let the travel time of bicycle be 1.0, and let its
         cost be 0.
      a. Compute the choice probabilities of all four modes, drive
         alone, carpool, bus, and bicycle, using Equation (4.9). Assume
         that income is $20,000 per year.
      b. Predict the choice probabilities for all four modes using
         Equations (4.12) after adding an appropriate equation for the
         deterministic component of the bicycle utility function.  Let
         the value of the alternative-specific constant for bicycle be
         -1.0.


                                MODULE 5

                     VARIABLES OF MODE CHOICE MODELS

5.1   Introduction

   Probabilistic choice models generally, and logit models in
particular, make it possible to develop useful mode choice models that
do not include all of the variables that influence mode choice.  This is
a very important property of probabilistic choice models since, as was
discussed in Module 3, the variables that influence mode choice are not
all known to analysts and not all of the known variables can be measured
in practice.  It does not follow, however, that a model based on any
subset of the influential variables will be useful.  On the contrary,
there are certain types of variables that must be included to obtain a
useful model.  This module identifies these classes of variables and
explains the forms that the variables can take.
   This module also identifies variables that have been found useful in
previously developed mode choice models.  However, it does not provide a
list of standard variables that always should be used in mode choice
models or a standard procedure for selecting variables.  No such list or
procedure exists.  The appropriate variables to use in a model depend on
the purposes for which the model is to be used and on the available
data.  They also depend on behavioral relations that normally are
revealed in the process of model development.  In fact, the process of
selecting variables for a practical mode choice model is as much an art
as a science.  It relies as much on judgment and experience as on
statistical techniques.  The purpose of this module is to provide
information that will contribute to informed and sensible judgments in
the selection of variables.  Statistical
techniques that can help to guide the selection of variables are
discussed in Section 6.5 of Module 6.
   Throughout this module, the term "utility function" will mean the
deterministic component of the utility function of a logit model.  The
modifier "deterministic component of the" will not be used.

5.2   Classes of Variables That Must Be Included in Models

   There are three kinds of variables that must be included in a model
to make it useful: (1) policy variables, (2) variables that affect mode
choice and that identify any demographic characteristics or population
groups of interest, and (3) other variables that influence mode choice
and are correlated with either the policy variables or the variables
used to identify demographic characteristics and population groups.
   In addition, the utility function of each mode except one should
include an alternative-specific constant.  The use of alternative-
specific constants was discussed in Section 4.6 of Module 4. As was
explained in Section 4.6, it does not matter which mode is selected
as the base mode whose utility function does not include an alternative-
specific constant. (To minimize the complexity of the subsequent
discussion, the utility functions in the examples presented in this
module and in Module 6 do not all include alternative-specific
constants.  The omission of alternative-specific constants from the
examples is for reasons of expository clarity and should not be
interpreted as contradicting the principle that these constants should
be included in the utility functions of practical models.)

   5.2a Policy Variables

   One of the most important uses of mode choice models is predicting
the effects of policy measures.  For example, a transportation planner
may want to predict the change in bus ridership that will occur if bus
fares change or bus travel becomes faster.  Such predictions can be made
only if the model includes explanatory variables, called policy
variables, that represent the policy measures being considered.  For
example, policy variables such as bus fare and bus travel time must be
included in a model to make it useful for predicting the effects of
policy measures that would change fares or travel times.  A model that
will be used to predict the effects of measures to improve the
reliability of bus service must include policy variables, such as the
percent of on-time arrivals, that can represent the effects of the
measures being considered.

   Example 5.1: Policy Variables

   Consider the following binomial logit model of work-trip mode choice. 
The available modes are automobile and bus, and the probabilities of
choosing these modes are:

   $$ P_{auto} = \frac{\exp(-T_a - 5C_a/Y)}{\exp(-T_a - 5C_a/Y) + \exp(-T_b - 5C_b/Y)} $$   (5.1)

   $$ P_{bus} = 1.0 - P_{auto}, $$                                      (5.2)

where

   Ta, Tb    =  Automobile and bus travel times in hours;

   Ca, Cb    =  Automobile and bus travel costs in dollars;

      Y     =  Income of the traveler's household in thousands of
               dollars per year.

In this model, Ta, Tb, Ca, and Cb are policy variables since their
values can be influenced by transportation policy measures.  Y is not a
policy variable since individuals' incomes are not directly influenced
by transportation policy measures.
   The model of equations (5.1) and (5.2) can be used to predict the
effects on mode choice of policy measures that change automobile or bus
travel times and costs since travel times and costs are policy variables
of the model.  In contrast, comfort and safety are not policy variables
of the model.  The model cannot predict the effects of policy measures
designed to influence these variables.

   5.2b  Variables That Affect Mode Choice and Identify Demographic
         Characteristics or Population Groups

   Often, it is important to be able to predict the effects of policy
measures on different groups in the population or to predict the effects
of changes in demographic characteristics of the population.  For
example, it may be important to know whether increasing bus fares will
be particularly burdensome to low-income travelers or whether a certain
improvement in transit service will succeed in attracting members of
multi-car households to transit.  A model can answer questions such as
these only if it includes variables that permit the effects of policy
measures on different population groups of interest to be
differentiated.  In the example of bus fares just given, the variables
income and automobiles owned would serve this purpose.  The model
discussed in Example 5.1 includes the variable Y (income) and,
therefore, is capable of differentiating among the effects of changes in
travel time (T) and travel cost (C) on different income groups.  The
model also is capable of predicting the effects on mode choice of any
changes in Y that may occur in the future.  The model does not include
a variable for
automobile ownership and, therefore, is not capable of differentiating
among households with different levels of automobile ownership or of
predicting the effects of changes in automobile ownership.

   Example 5.2:   Effects on Different Income Groups of a Change in Bus
                  Fare

   Suppose that the model of Equation (5.2) is used to predict the
effects on bus ridership of increasing the fare from $0.50 to $1.00 on a
certain route.  Assume that Ca = $1.50, Ta = 0.50 hr., and Tb = 1.0
hr. for the affected travelers and that these travelers include
individuals whose incomes are $20,000 and $40,000 per year.  Then,
before the fare increase, the probability that the lower-income
travelers choose bus is

   $$ P_{bus}(\text{low, before}) = \frac{\exp[-1.0 - 5(0.5)/20]}{\exp[-0.5 - 5(1.5)/20] + \exp[-1.0 - 5(0.5)/20]} = 0.44, $$

and the probability that the higher-income travelers choose bus is

   $$ P_{bus}(\text{hi, before}) = \frac{\exp[-1.0 - 5(0.5)/40]}{\exp[-0.5 - 5(1.5)/40] + \exp[-1.0 - 5(0.5)/40]} = 0.41. $$

After the fare increase, the probabilities of bus choice by the low- and
high-income travelers are

   $$ P_{bus}(\text{low, after}) = \frac{\exp[-1.0 - 5(1.0)/20]}{\exp[-0.5 - 5(1.5)/20] + \exp[-1.0 - 5(1.0)/20]} = 0.41, $$

and

   $$ P_{bus}(\text{hi, after}) = \frac{\exp[-1.0 - 5(1.0)/40]}{\exp[-0.5 - 5(1.5)/40] + \exp[-1.0 - 5(1.0)/40]} = 0.39. $$

Therefore, the fare increase is predicted to cause a reduction of 7
percent (100 x 0.03/0.44) in the probability that a low-income traveler
chooses bus but only a 5 percent (100 x 0.02/0.41) reduction in the
probability that a high-income traveler chooses bus.  Accordingly, the
fare increase is predicted to have a greater impact on the low-income
travelers than on the high-income travelers.  This prediction could not
have been made if the model did not include the variable Y that enables
income groups to be distinguished.
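
   The four probabilities in Example 5.2 can be checked with the short
sketch below.  It is an illustration added here, not part of the
original course; the function name p_bus and its argument names are
assumptions.  It evaluates Equations (5.1) and (5.2) with the travel
times, costs, and incomes given in the example.

```python
import math

def p_bus(t_auto, c_auto, t_bus, c_bus, income):
    """Bus probability from Equations (5.1)-(5.2); income in $1000s per year."""
    v_auto = -t_auto - 5.0 * c_auto / income
    v_bus = -t_bus - 5.0 * c_bus / income
    return math.exp(v_bus) / (math.exp(v_auto) + math.exp(v_bus))

results = {}
for label, income in (("low", 20.0), ("high", 40.0)):
    before = p_bus(0.5, 1.5, 1.0, 0.50, income)  # bus fare $0.50
    after = p_bus(0.5, 1.5, 1.0, 1.00, income)   # bus fare $1.00
    results[label] = (before, after)
    print(f"{label:4s} income: before {before:.2f}, after {after:.2f}")
```

   The percentages quoted in the example are computed from the rounded
probabilities.  With unrounded probabilities the reductions are about 7
percent for the low-income travelers and just under 4 percent for the
high-income travelers, so the qualitative conclusion is the same.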

   5.2c  Other Variables That Influence Mode Choice and Are Correlated
         with the Policy, Demographic, or Grouping Variables

   Frequently, it is possible to identify variables that are not of
interest themselves but that affect mode choice and are correlated with
one or more of the policy or grouping variables in a model.  Such
variables also must be included in the model or else the model will give
incorrect predictions of the effects of the policy and grouping
variables.  For example, suppose that travel time (a policy variable)
and income (a grouping variable) are included in a model of mode choice. 
Suppose, also, that the number of automobiles owned by a traveler's
household -- a variable that is known to have a strong effect on mode
choice -- is not of interest in a particular study but that multi-car
households tend to have higher incomes and to live farther from their
workplaces than do single-car and non-car-owning households.  Since
automobile ownership has an independent effect on mode choice, apart
from its association with travel time and income, a model that does not
include automobile ownership as a variable will give incorrect
predictions of the effects of travel time and income on mode choice. 
Such a model's predictions of the effects of travel time will reflect
not only the true effects of travel time but, also, the effects of
differences in
automobile ownership that are associated with differences in travel time
through the tendency of households with large travel times to own many
cars.  In other words, the travel time variable will operate, in part,
as a surrogate for automobile ownership and, therefore, will not
correctly describe the true effects of changes in travel time alone. 
Similarly, the model's predictions of the effects of income on mode
choice will reflect both the true effects of income and the effects of
differences in automobile ownership that are associated with differences
in income through the tendency of high-income households to own many
cars.
   The prediction errors caused by omitting a variable that is
correlated with a policy variable are further illustrated by the
following example.

   Example 5.3:   Effect of Omitting a Variable That Is Correlated with
                  a Policy Variable

   Suppose a model of choice between the modes automobile and bus has
the binomial logit form:

   $$ P_{auto} = \frac{\exp(V_a)}{\exp(V_a) + \exp(V_b)} $$              (5.3)

   $$ P_{bus} = \frac{\exp(V_b)}{\exp(V_a) + \exp(V_b)} $$               (5.4)


Let Va and Vb, the automobile and bus utility functions, have the
forms

   $$ V_a = -T_a + 0.5A $$                                             (5.5)

   $$ V_b = -T_b, $$                                                   (5.6)


where Ta and  Tb, respectively, are automobile and bus travel time in
hours, and A is the  number of automobiles owned by the traveler's
household.

Substitution of Equations (5.5) and (5.6) into Equation (5.3), followed
by some algebra, yields

   $$ P_{auto} = \frac{1}{1 + \exp[-(T_b - T_a) - 0.5A]}. $$             (5.7)

The solid lines in Figure 5.1 show graphs of Pauto as a function of Tb
- Ta for each of three different values of A. As expected, the graphs
show that, given the same value of Tb - Ta, an individual whose
household owns several automobiles has a higher probability of choosing
automobile than does an individual whose household owns only one
automobile.
   Now suppose that the available data consist of measurements of
Pauto, Tb - Ta , and A for three groups of individuals, as follows:

         Group      Tb - Ta     A      Pauto

         1            0.25        1       0.68
         2            0.60        2       0.83
         3            0.75        3       0.90

These data are plotted as large dots in Figure 5.1. Notice that
increases in automobile ownership are associated with increases in Tb - 
Ta.  In other words, automobile ownership is positively correlated with
Tb - Ta . A model based on these data that included the policy
variable Tb - Ta but not the automobile ownership variable A would
conclude that the relation between Pauto and Tb - Ta is the one
obtained by connecting the three dots.  This relation is shown by the
dashed line in Figure 5.1. Notice that this line is much steeper than
the solid lines.  In other words, the model that omits the automobile
ownership variable predicts that policy changes (i.e., changes in Tb -
Ta ) have larger effects on mode choice than they really have.  The
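model makes this prediction error because, owing to the omission of the
variable A, the predicted effects of changes in Tb - Ta reflect not only
the true effects of changes in this variable, but also the effects of
the changes in A that are associated in the data with changes in
Tb - Ta.

[Figure 5.1: Pauto as a function of Tb - Ta.  Solid lines: Equation
(5.7) for three values of A.  Large dots: the three groups in the table
above.  Dashed line: the relation obtained by connecting the dots.]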
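
   One way to put a number on the bias in Example 5.3 is to compare
slopes on the log-odds scale, where a logit model is a straight line.
The sketch below is an illustration added here and is not the authors'
construction of Figure 5.1, which connects the dots on the probability
scale; the function names and the choice of groups 1 and 3 as end
points are assumptions.  It recomputes the table from Equation (5.7)
and then measures the slope implied by the data when A is ignored.

```python
import math

def p_auto(dt, autos):
    """Equation (5.7): dt is Tb - Ta in hours, autos is A."""
    return 1.0 / (1.0 + math.exp(-dt - 0.5 * autos))

def log_odds(p):
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))

# (Tb - Ta, A) for the three groups in the table of Example 5.3.
groups = [(0.25, 1), (0.60, 2), (0.75, 3)]
probs = [p_auto(dt, a) for dt, a in groups]

# Log-odds change per hour implied by joining groups 1 and 3 while ignoring A.
apparent_slope = (log_odds(probs[2]) - log_odds(probs[0])) / (groups[2][0] - groups[0][0])
true_slope = 1.0  # coefficient of (Tb - Ta) in Equation (5.7)

print([round(p, 2) for p in probs], round(apparent_slope, 2), true_slope)
```

   Measured this way, the model that omits A attributes to travel time
an effect three times as large as the true one.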

   5.2d  Alternative-Specific Constants

   An alternative-specific constant is a constant that is added to the
utility function of a mode and whose numerical value may be different
for different modes.  Example 4.3 in Module 4 illustrates the use of
alternative-specific constants.  For example, in the utility functions

   $$ V_{DA} = 0.8 - T_{DA} - 5C_{DA}/Y $$                              (5.8a)

   $$ V_{CP} = 0.2 - T_{CP} - 5C_{CP}/Y $$                              (5.8b)

   $$ V_B = -T_B - C_B/Y, $$                                           (5.8c)


the alternative-specific constants are 0.8 for drive alone and 0.2 for
carpool.
   Alternative-specific constants provide a convenient way to account
for the average effects of all variables affecting choice that are not
explanatory variables of the model.  The number of alternative-specific
constants in a model must not exceed the number of modes in the model
minus one.  The predictions of the model are the same, regardless of
which mode is selected to be the one that has no alternative-specific
constant in its utility function.  The following example illustrates the
prediction errors that can occur when alternative-specific constants are
not included in a model.
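
   The statement that predictions do not depend on the choice of base
mode can be checked numerically.  The sketch below is an illustration
added here.  It uses the constants of Equations (5.8a)-(5.8c), in which
bus is the base mode, and then subtracts 0.2 from every constant so that
carpool becomes the base mode.  The values of the time-and-cost part of
each utility function are hypothetical numbers chosen only for the
demonstration.

```python
import math

def logit_probabilities(utilities):
    """Return multinomial logit choice probabilities for a dict of mode -> V."""
    exp_v = {m: math.exp(v) for m, v in utilities.items()}
    total = sum(exp_v.values())
    return {m: e / total for m, e in exp_v.items()}

# Constants from Equations (5.8a)-(5.8c), with bus as the base mode.
constants_bus_base = {"DA": 0.8, "CP": 0.2, "B": 0.0}
# The same model with carpool as the base mode: subtract 0.2 from each constant.
constants_cp_base = {m: c - 0.2 for m, c in constants_bus_base.items()}

# Hypothetical values of the time-and-cost part of each utility (assumed).
time_cost_part = {"DA": -0.9, "CP": -1.1, "B": -1.3}

p_bus_base = logit_probabilities(
    {m: constants_bus_base[m] + time_cost_part[m] for m in time_cost_part})
p_cp_base = logit_probabilities(
    {m: constants_cp_base[m] + time_cost_part[m] for m in time_cost_part})

for m in time_cost_part:
    print(f"{m:3s} {p_bus_base[m]:.4f} {p_cp_base[m]:.4f}")
```

   The two sets of probabilities agree because subtracting the same
number from every utility leaves all utility differences unchanged.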

   Example 5.4: Alternative-Specific Constants

   Suppose that choice among the modes automobile and bus is described
by a logit model.  Let the logit utility functions be

   $$ V_a = 0.5 - T_a $$                                               (5.9a)

   $$ V_b = -T_b, $$                                                   (5.9b)


where Ta and Tb, respectively, denote automobile and bus travel time,
and 0.5 is the value of the alternative-specific constant in the
automobile utility function.  The probability that automobile is chosen
is

   $$ P_{auto} = \frac{\exp(0.5 - T_a)}{\exp(0.5 - T_a) + \exp(-T_b)} $$   (5.10)

Equivalently,

   $$ P_{auto} = \frac{1}{1 + \exp[-(T_b - T_a) - 0.5]} $$               (5.11)

The solid line in Figure 5.2 shows a graph of Pauto as a function of
Tb - Ta.
   Now suppose that the available data consist of observations of the
mode choices of individuals for whom Tb - Ta = 0.50 hr.  The
probability that such individuals choose automobile can be computed from
equation (5.11) and is

   $$ P_{auto} = \frac{1}{1 + \exp(-0.5 - 0.5)} = 0.73. $$

   The point Pauto = 0.73, Tb - Ta = 0.50 is identified by the solid
dot in Figure 5.2. A logit model of choice between automobile and bus
that did not include an alternative-specific constant would have the
form

   $$ P_{auto} = \frac{1}{1 + \exp[-c(T_b - T_a)]} $$                    (5.12)

where c is a positive constant.  According to this model, Pauto = 0.5
when Tb - Ta = 0. The open dot in Figure 5.2 identifies the point
Pauto = 0.5, Tb - Ta = 0. The logit model without an
alternative-specific constant that fits the observed choices is the one
illustrated
by the dashed line in Figure 5.2. This line corresponds to equation
(5.12) with c = 2.0. Notice that the dashed line is much steeper than
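the solid line.  In other words, the model without the
alternative-specific constant predicts that changes in travel time have
larger effects on mode choice than they really have.

[Figure 5.2: Pauto as a function of Tb - Ta.  Solid line: Equation
(5.11).  Solid dot: the observed point Pauto = 0.73, Tb - Ta = 0.50.
Open dot: Pauto = 0.5, Tb - Ta = 0.  Dashed line: Equation (5.12) with
c = 2.0.]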
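
   The value c = 2.0 in Example 5.4 follows from requiring Equation
(5.12) to pass through the observed point.  The sketch below is an
illustration added here; the function names are assumptions, and the
increase of Tb - Ta from 0.50 to 0.75 hr. is a hypothetical policy
change used only to compare the two models' predictions.

```python
import math

def p_auto_true(dt):
    """Equation (5.11): dt is Tb - Ta in hours."""
    return 1.0 / (1.0 + math.exp(-dt - 0.5))

def p_auto_no_constant(dt, c):
    """Equation (5.12): logit model without an alternative-specific constant."""
    return 1.0 / (1.0 + math.exp(-c * dt))

dt_observed = 0.50
p_observed = p_auto_true(dt_observed)

# Solve Equation (5.12) for c so that the curve passes through the observed point.
c_fitted = math.log(p_observed / (1.0 - p_observed)) / dt_observed

# Predicted change in Pauto for a hypothetical rise of Tb - Ta to 0.75 hr.
true_change = p_auto_true(0.75) - p_observed
misspecified_change = p_auto_no_constant(0.75, c_fitted) - p_observed

print(round(c_fitted, 2), round(true_change, 3), round(misspecified_change, 3))
```

   Under these assumptions the model without the constant predicts an
increase in Pauto of about 0.09, compared with about 0.05 from Equation
(5.11).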

5.3   Functional Forms and Disaggregation of Variables

   An important aspect of selecting variables for a model is deciding
how these variables should depend on the observed attributes of modes
and individuals.  For example, the attribute "travel time" might be
represented in a model by the variable T (travel time measured in, say,
hours) or it might be represented by the variable ln(T), the natural
logarithm of T. Alternatively, T might be disaggregated into the
components in-vehicle travel time, walk time, wait time, transfer time,
etc., and each of these components (or possibly its logarithm or some
other transformation) used as a separate variable of the model.  The
form in which an attribute such as travel time enters a model frequently
has important behavioral implications, and it can have a large effect on
the model's forecasts.
   Although the need to decide the relation between an attribute and the
variable representing it can arise with virtually any attribute that
might enter a model, the attributes that seem to cause the greatest
difficulty in practical mode choice modeling are travel time, travel
cost, income, and automobile ownership.  This section describes the
variables most frequently used to represent these attributes in practice
and explains the implications of different choices of variables.

   5.3a Travel Time -- Disaggregation

   The most important decision that must be made with respect to travel
time is whether it should be disaggregated into components (e.g., in-
vehicle travel time, walk time, etc.) and, if so, what the components
should be.  Disaggregation admits the possibility that equal changes in
different
components of travel time may have different effects on mode choice. 
For example, disaggregating travel time into in-vehicle and out-of-
vehicle components admits the possibility that a 5 minute increase in
in-vehicle travel time and a 5 minute increase in out-of-vehicle travel
time have different effects on mode choice. (In fact, experience
indicates that travelers consider out-of-vehicle travel time to be more
burdensome than in-vehicle travel time, so a 5 minute increase in out-of-
vehicle travel time does have a greater effect on mode choice than does
a 5 minute increase in in-vehicle travel time.) Similarly,
disaggregating out-of-vehicle travel time into the components walk time
and wait time admits the possibility that equal changes in the values of
these components have different effects on mode choice.  In contrast,
representation of travel time by the single variable "total travel time"
is equivalent to assuming that equal changes in the various components
of travel time have equal effects on mode choice.  Similarly, combining
all of the components of out-of-vehicle travel time into the single
variable "total out-of-vehicle travel time" is equivalent to assuming
that equal changes in the various components of out-of-vehicle travel
time (e.g., walk time, wait time, etc.) have equal effects on mode
choice.

   Example 5.5: Components of Travel Time

   Consider the following two logit models of choice between automobile
and bus:

   Model 1:
   $$ P_{auto} = \frac{\exp(-T_a)}{\exp(-T_a) + \exp(-T_b)} $$           (5.13)

   Model 2:
   $$ P_{auto} = \frac{\exp(-0.48TI_a - 1.21TO_a)}{\exp(-0.48TI_a - 1.21TO_a) + \exp(-0.48TI_b - 1.21TO_b)} $$   (5.14)

   Models 1 and 2:
   $$ P_{bus} = 1.0 - P_{auto} $$                                       (5.15)


In these models,

   Ta, Tb    =  Total travel time by auto and bus in hours;

   TIa, TIb  =  In-vehicle travel time by auto and bus in hours;

   TOa, TOb  =  Out-of-vehicle travel time by auto and bus in hours.

Thus, Model 2 disaggregates travel time into the components in-vehicle
travel time and out-of-vehicle travel time, whereas Model 1 does not
disaggregate travel time.  Equations (5.13) and (5.14) are equivalent to


   Model 1:
   $$ P_{auto} = \frac{1}{1 + \exp[-(T_b - T_a)]} $$                     (5.16)

   Model 2:
   $$ P_{auto} = \frac{1}{1 + \exp[-0.48(TI_b - TI_a) - 1.21(TO_b - TO_a)]} $$   (5.17)


   Suppose that at present, TIb = 0.5 hr., TIa = 0.4 hr., TOb = 0.30
hr., and TOa = 0.05 hr. (the base case).  Then Tb = 0.80 hr., and Ta =
0.45 hr.  In Model 1,

   $$ P_{auto} = \frac{1}{1 + \exp[-(0.80 - 0.45)]} = 0.59, $$

and in Model 2,

   $$ P_{auto} = \frac{1}{1 + \exp[-0.48(0.5 - 0.4) - 1.21(0.30 - 0.05)]} = 0.59. $$

The probability that automobile is chosen is the same in both models.
   Now consider the effects of increasing TIb (in-vehicle travel time)
by 0.1 hr. while TOb (out-of-vehicle travel time) remains unchanged and
of increasing TOb by 0.1 hr. while TIb remains unchanged.  The values
of TIa and TOa remain as before.  The following table shows the new
values of Pauto obtained from the two models as well as the value of
Pauto in the base case:

                      Pauto According to    Change from Base Case
       Case           Model 1   Model 2     Model 1   Model 2

       Base            0.59      0.59        0.0       0.0
       Increase TIb    0.61      0.60        0.02      0.01
       Increase TOb    0.61      0.62        0.02      0.03

It can be seen that according to Model 1, which does not disaggregate
travel time into its components, increasing TIb by 0.1 hr. and
increasing TOb by 0.1 hr. have the same effect on mode choice -- they
both increase Pauto by 0.02.  However, in Model 2, which does
disaggregate travel time, increasing TIb by 0.1 hr. increases Pauto by
only 0.01, whereas increasing TOb by 0.1 hr. increases Pauto by 0.03.
In other words, in Model 2, the increase in out-of-vehicle travel time
has about three times the effect of the same increase in in-vehicle
travel time (using the rounded values in the table).  In Model 1,
travelers are equally sensitive to changes in in-vehicle and
out-of-vehicle travel time.  In Model 2, travelers are more
sensitive to changes in out-of-vehicle travel time than to changes in
in-vehicle travel time.
   The differences between Models 1 and 2 have practical policy
consequences.  For example, suppose that Model 2 is correct.  Then use
of Model 1 overstates the importance of in-vehicle travel time savings
and understates the importance of out-of-vehicle travel-time savings. 
Use of Model 1 will cause a bus operator trying to increase service
quality and ridership to place too much emphasis on in-vehicle travel
time and too little on out-of-vehicle travel time.
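
   The table in Example 5.5 can be regenerated from Equations (5.16)
and (5.17).  The sketch below is an illustration added here; the
function and argument names are assumptions.  Each case changes one bus
travel time component by 0.1 hr. relative to the base case.

```python
import math

def model_1(ti_a, to_a, ti_b, to_b):
    """Equation (5.16): only total travel time matters."""
    t_a, t_b = ti_a + to_a, ti_b + to_b
    return 1.0 / (1.0 + math.exp(-(t_b - t_a)))

def model_2(ti_a, to_a, ti_b, to_b):
    """Equation (5.17): in-vehicle and out-of-vehicle time enter separately."""
    return 1.0 / (1.0 + math.exp(-0.48 * (ti_b - ti_a) - 1.21 * (to_b - to_a)))

# Base case travel times in hours, from Example 5.5.
base = dict(ti_a=0.4, to_a=0.05, ti_b=0.5, to_b=0.30)
cases = {
    "Base": base,
    "Increase TIb": {**base, "ti_b": base["ti_b"] + 0.1},
    "Increase TOb": {**base, "to_b": base["to_b"] + 0.1},
}

table = {name: (model_1(**v), model_2(**v)) for name, v in cases.items()}
for name, (p1, p2) in table.items():
    print(f"{name:13s} {p1:.2f} {p2:.2f}")
```

   Model 1 gives the same Pauto for both changes because only the sum
of the components enters its utility function.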

   5.3b Travel Time -- Mode-Specific Representation

   An issue that is related to the disaggregation issue is whether
travel time (or one or more of its components) should be represented as
a generic or a mode-specific variable.  Travel time (or one of its
components) is
generic if it is represented by the same variable in all modes.  It is
mode-specific if it is represented by different variables in different
modes.  For example, automobile travel time and bus travel time might be
represented by separate variables in the utility function, with the
value of the automobile travel time variable being zero for the bus
mode and the value of the bus travel time variable being zero for
the automobile mode.
   Use of a mode-specific variable to represent an attribute admits the
possibility that travelers evaluate that attribute differently for
different modes.  Use of a generic variable excludes this possibility. 
In-vehicle travel time is a travel time component that sometimes is
represented by a mode-specific variable in mode choice models.  The
behavioral rationale for this is that transit travelers often can spend
their in-vehicle time reading or sleeping, whereas automobile travelers
(particularly if they are drivers) may not be able to do these things. 
Therefore, travelers may perceive transit in-vehicle travel time as
being less burdensome than automobile in-vehicle travel time.

   Example 5.6: Generic and Mode-Specific Travel Time Variables

   Consider the following two models of choice between automobile and
bus:

   Model 1:
   $$ P_{auto} = \frac{\exp(-T_a)}{\exp(-T_a) + \exp(-T_b)} $$           (5.18)

   $$ \phantom{P_{auto}} = \frac{1}{1 + \exp[-(T_b - T_a)]} $$           (5.19)

   Model 2:
   $$ P_{auto} = \frac{\exp(-4.2TA_a - 2.8TB_a)}{\exp(-4.2TA_a - 2.8TB_a) + \exp(-4.2TA_b - 2.8TB_b)} $$   (5.20)

   $$ \phantom{P_{auto}} = \frac{1}{1 + \exp[-4.2(TA_b - TA_a) - 2.8(TB_b - TB_a)]} $$   (5.21)

   Models 1 and 2:
   $$ P_{bus} = 1.0 - P_{auto} $$                                       (5.22)

In these models,

   Ta, Tb    =  Value of the generic variable "total travel time" for
                  automobile (a) and bus (b).
   TAa, TAb  =  Value of the mode-specific variable "automobile travel
                  time" for automobile (a) and bus (b).  TAb = 0 always
                  since no time is spent traveling by automobile if bus
                  is chosen.  The value of TAa is the same as the value
                  of Ta.
   TBa, TBb  =  Value of the mode-specific variable "bus travel time"
                  for automobile (a) and bus (b).  TBa = 0 always since
                  no time is spent traveling by bus if automobile is
                  chosen.  The value of TBb is the same as the value of
                  Tb.

Since TAb  =  TBa  =  0,  TAa  =  Ta ,  and TBb  =  Tb, Model 2 is
equivalent to


   Model 2:
   $$ P_{auto} = \frac{1}{1 + \exp(4.2T_a - 2.8T_b)}. $$                 (5.23)


The difference between Models 1 and 2 is that in Model 1, travel time is
a generic variable, whereas travel time is a mode-specific variable in
Model 2.

   Suppose that at present, Tb = 0.80 hr. and Ta = 0.45 hr. (the base
case).  Then in Model 1,

   $$ P_{auto} = \frac{1}{1 + \exp[-(0.80 - 0.45)]} = 0.59, $$

and in Model 2,

   $$ P_{auto} = \frac{1}{1 + \exp[4.2(0.45) - 2.8(0.80)]} = 0.59. $$

The probability that automobile is chosen is the same in both models.

   Now consider the effects of increasing Tb by 0.1 hr. while Ta
remains unchanged and of decreasing Ta by 0.1 hr. while Tb remains
unchanged.  The following table shows the new values of Pauto obtained
from the two models as well as the value of Pauto in the base case:

                    Pauto According to     Change from Base Case
    Case            Model 1    Model 2      Model 1    Model 2

    Base             0.59       0.59         0.00       0.00
    Increase Tb      0.61       0.65         0.02       0.06
    Decrease Ta      0.61       0.68         0.02       0.09

It can be seen that according to Model 1, in which travel time is a
generic variable, increasing Tb by 0.1 hr. and decreasing Ta by 0.1
hr. have the same effect on mode choice -- they both increase Pauto by
0.02. However, in Model 2, which treats travel time as a mode-specific
variable, increasing Tb increases Pauto by 0.06, whereas decreasing
Ta increases Pauto by 0.09. In other words, the change in automobile
travel time has a 50 percent larger effect than an equal but opposite
change in bus travel time.  In Model 1, travelers are equally sensitive
to changes in automobile and bus travel time.  In Model 2, travelers are
more sensitive to changes in automobile travel time than to changes in
bus travel time.
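
The entries of the table can be checked with a few lines of code.  The
following Python sketch is an illustration added to this example; the
function names are arbitrary, and the formulas are the two-mode forms of
Models 1 and 2 used in the base-case calculation above.

```python
import math

def p_auto_model1(t_a, t_b):
    """Model 1: travel time is generic (coefficient -1 for both modes)."""
    return 1.0 / (1.0 + math.exp(-(t_b - t_a)))

def p_auto_model2(t_a, t_b):
    """Model 2, equation (5.23): coefficients -4.2 (auto) and -2.8 (bus)."""
    return 1.0 / (1.0 + math.exp(4.2 * t_a - 2.8 * t_b))

# (Ta, Tb) in hours for the three cases of the table.
cases = {
    "Base":        (0.45, 0.80),
    "Increase Tb": (0.45, 0.90),
    "Decrease Ta": (0.35, 0.80),
}

for name, (t_a, t_b) in cases.items():
    p1 = p_auto_model1(t_a, t_b)
    p2 = p_auto_model2(t_a, t_b)
    print(f"{name:12s}  Model 1: {p1:.2f}  Model 2: {p2:.2f}")
```

Rounded to two decimal places, the printed values agree with the Pauto
columns of the table.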

   5.3c Travel Time -- Functional Form

   The final decision that must be made about travel time variables is
what the functional form of the relation between the variables and the
physically measured attributes should be.  Two functional forms are used
frequently in practice, the linear and the logarithmic.  In the linear
form, the travel time variable is simply measured travel time (T), and
in the logarithmic form, the variable is the natural logarithm of
measured travel time (ln(T)).  T can represent either total travel time
or any component of travel time.  Adoption of the linear form is
equivalent to assuming that travelers find a given increase in T equally
burdensome regardless of the current value of T. For example, if T
denotes total travel time, use of the linear form means that adding 5
minutes to a 1-hr. trip is perceived by travelers as being just as
burdensome as adding 5 minutes to a 10-min. trip.  In the logarithmic
form, travelers find a given percentage increase in T to be equally
burdensome regardless of the current value of T. Thus, for example,
adding 5 minutes to a 10-min. trip -- a 50 percent increase in travel
time -- is perceived as being just as burdensome as adding 30 minutes to
a 1-hr. trip.
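
The difference between the two forms can be made concrete by computing
the change in the travel time variable itself.  The Python sketch below
is an added illustration; the function names are arbitrary, and a
coefficient of -1 on the travel time variable is assumed, so that the
added disutility equals the increase in the variable.

```python
import math

def added_disutility_linear(t_min, delta_min):
    """Increase in T when delta_min minutes are added to a t_min-minute trip."""
    return (t_min + delta_min) - t_min

def added_disutility_log(t_min, delta_min):
    """Increase in ln(T) for the same change; depends only on the ratio."""
    return math.log((t_min + delta_min) / t_min)

# (current trip time, added time), both in minutes
for t, d in [(10, 5), (60, 5), (60, 30)]:
    lin = added_disutility_linear(t, d)
    log = added_disutility_log(t, d)
    print(f"add {d:2d} min to {t:2d}-min trip:  linear {lin:5.1f}   log {log:.3f}")
```

In the linear form the first two cases are equally burdensome; in the
logarithmic form the first and third cases are.
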
   The linear and logarithmic forms of travel time yield predictions of
mode choice that may be very different from one another, as is
illustrated by the following example.

   Example 5.7: Linear and Logarithmic Forms of Travel Time

   Consider the following two models of choice between two modes that
will be called mode 1 and mode 2:

                        exp( -T  )
                               1
   Model 1 -- P   =  ---------------------                        (5.24)
              1      exp( -T ) + exp( -T  )
                            1           2


                  =  1/(1 + exp[ -(T   - T   )])                  (5.25)
                                    2     1


                        exp( -ln  T  )
                                   1
   Model 2 -- P   =  ---------------------------                  (5.26)
               1     exp(-ln T  ) + exp(-ln T  )
                              1              2


                  =  1/(1 + exp[-ln(   T   / T   )])              (5.27)
                                        2     1


                  =  1/( 1 + T   / T  )                           (5.28)
                              1     2


   Models 1 and 2 --  P =  1.0 - P  ,                             (5.29)
                       2          1

where T1  and T2 denote total travel time in hours by modes 1 and 2.   
Figure 5.3 shows a graph of the relation between P1 and T2 for each
model when T1 = 0.5 hr. It can be seen from the figure that when mode 1
is the faster mode, the two models yield similar values of P1. However,
the values of P1 obtained from the two models differ greatly when T2 is
less than about 0.3 hr.

[Figure 5.3: Relation between P1 and T2 for Models 1 and 2 when T1 = 0.5 hr.]
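
The comparison can also be tabulated numerically.  The following Python
sketch is an added illustration (the function names and the grid of T2
values are arbitrary choices); it evaluates equation (5.24) for Model 1
and equation (5.28) for Model 2 with T1 = 0.5 hr.

```python
import math

def p1_linear(t1, t2):
    """Model 1, equation (5.24): utilities are -T1 and -T2."""
    return math.exp(-t1) / (math.exp(-t1) + math.exp(-t2))

def p1_log(t1, t2):
    """Model 2, equation (5.28): utilities are -ln(T1) and -ln(T2)."""
    return 1.0 / (1.0 + t1 / t2)

T1 = 0.5  # hours
for t2 in (0.1, 0.2, 0.3, 0.5, 1.0, 1.5):
    print(f"T2 = {t2:.1f} hr   linear: {p1_linear(T1, t2):.3f}   "
          f"log: {p1_log(T1, t2):.3f}")
```

Both models give P1 = 0.5 when T2 = T1, and the gap between them widens
as T2 becomes small.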

   5.3d Travel Cost and Income

   Like travel time, travel cost can be divided into components (e.g.,
automobile fuel and maintenance costs, parking costs, tolls and fares,
etc.) and can be represented as either a generic or mode-specific
attribute.  In most mode choice models, however, travel cost is treated
as generic and is not divided into components.
   An important consideration in the selection of travel cost variables
is their interaction with income.  Economic theory and everyday
experience both suggest that a traveler's sensitivity to changes in
travel costs may depend on his income, with high-income travelers being
less sensitive than low-income ones.  To represent this income
dependence, the travel cost variable in mode choice models often takes
the form C/Y, where Y is the total or after-tax income of the traveler's
household. (After-tax income is the better variable to use, because only
after-tax income can be allocated among travel and non-travel uses at
the discretion of the household.  After-tax income can be computed from
total income -- the only type of income data normally available to
transportation planners -- if the proportion of total income paid in
taxes as a function of income level is known.  Average values of this
proportion are published in Vital Statistics of the United States.)
   Income also can be used as a surrogate for unobserved personal
attributes that affect choice.  For example, suppose that commuters
whose households have high incomes have jobs or tend to engage in other
activities that cause them to particularly value the schedule
flexibility provided by the automobile.  This tendency can be
represented in a model of choice
between automobile and transit by adding to the utility function of the
automobile mode a variable equal to the income (or, possibly, the
logarithm of the income) of the traveler's household.  Such an income
variable then acts as a surrogate for unobserved attributes, such as a
preference for schedule flexibility, that tend to make the automobile
mode particularly attractive to high-income travelers.  Either total
income or after-tax income can be used for this purpose.
   When income is used in this way, it always is a mode-specific
variable, and there must always be at least one mode whose utility
function does not contain such a variable.  Thus, for example, in a
model of choice between automobile and transit, income can be a variable
of either the automobile or the transit utility function but not of both
utility functions.  In a model of choice between drive-alone, carpool,
and transit, income can enter the utility functions of any two of the
three modes.  The model yields the same predictions of mode choice,
regardless of which alternatives are assigned the mode-specific income
variables and which alternative has no such variable.
   The following example illustrates the use of income as a surrogate
for unobserved personal attributes that affect mode choice.

   Example 5.8:   Use of the Income Variable

   In a logit model of choice between automobile and bus, let the
utility function be:

      V  =  -T  -  5C  /Y  +  0.001Y                             (5.30a)
       a      a      a

      V  =  -T  -  5C  /Y  +  0.001Y                             (5.30b)
       b      b      b

where T, C, and Y, respectively, denote travel time in hours, travel
cost in dollars, and after-tax income in thousands of dollars per year. 
Notice that
the additive income term 0.001Y is present in the utilities of both
modes.  The probability that automobile is chosen is

                          exp(-T  - 5C /Y + 0.001Y)
                                a     a
   P     =  -----------------------------------------------------  (5.31)
    auto    exp(-T  - 5C /Y + 0.001Y) + exp(-T  - 5C /Y + 0.001Y)
                  a     a                     b     b

Dividing the numerator and denominator of equation (5.31) by exp(-Ta -
5Ca/Y + 0.001Y) yields the equivalent model

                            1
   P     =  ----------------------------------                    (5.32)
    auto    1 + exp[-(T  - T ) - 5(C  - C )/Y]
                       b    a       b    a

Notice that the additive income term is no longer present in the model:
it has cancelled out in the division and, therefore, has no effect on
the probability that auto is chosen or the probability that bus is
chosen.  Income affects the choice probabilities only through the term
5(Cb - Ca)/Y, in which income interacts with travel cost.

   Now suppose that the utilities are

      V  =  -T  - 5C  /Y + 0.001Y                                (5.33a)
       a      a     a

      V  =  -T  - 5C  /Y.                                        (5.33b)
       b      b     b

   In this  case the additive income term is present in the utility
function of only one of the two modes.  The probability that automobile
is chosen is

                  exp( -T  -  5C  /Y + 0.001Y)
                         a      a
   P     =  ----------------------------------------------------- (5.34)
    auto      exp( -T  - 5C  /Y + 0.001Y) + exp( - T  - 5C  /Y)
                     a     a                        b     b


Dividing the numerator and denominator of (5.34) by exp( -Ta - 5Ca/Y +
0.001Y) yields the equivalent model

                                 1
   P     =  --------------------------------------------------    (5.35)
    auto     1 + exp[-(T  - T  ) - 5(C  - C  )/Y - 0.001Y]
                        b    a        b    a

   Notice that the additive income term is present in equation (5.35).
It has not cancelled in the division. Its form is such that Pauto   
increases when Y increases, as is to be expected from the form of
equations (5.33) for the utility function.  Thus, the additive income
term affects the choice
probabilities when it is excluded from the utility function of one mode
but not when it is included in the utility functions of all modes.
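
The cancellation can be verified numerically.  The Python sketch below
is an added illustration: the function name, its arguments, and the trip
values are hypothetical, and only the utility functions themselves are
taken from equations (5.30) and (5.33).

```python
import math

def p_auto(t_a, c_a, t_b, c_b, y, inc_a, inc_b):
    """Binary logit probability that automobile is chosen.

    inc_a and inc_b are the coefficients of the additive income term in
    the automobile and bus utilities (0.0 means the term is absent).
    """
    v_a = -t_a - 5.0 * c_a / y + inc_a * y
    v_b = -t_b - 5.0 * c_b / y + inc_b * y
    return 1.0 / (1.0 + math.exp(v_b - v_a))

# Hypothetical trip: times in hours, costs in dollars.
trip = dict(t_a=0.5, c_a=1.0, t_b=1.0, c_b=0.5)

for y in (10.0, 20.0, 40.0):  # after-tax income, thousands of dollars/year
    neither = p_auto(**trip, y=y, inc_a=0.0, inc_b=0.0)
    both = p_auto(**trip, y=y, inc_a=0.001, inc_b=0.001)     # as in (5.30)
    auto_only = p_auto(**trip, y=y, inc_a=0.001, inc_b=0.0)  # as in (5.33)
    print(f"Y = {y:4.0f}   neither: {neither:.4f}   both: {both:.4f}   "
          f"auto only: {auto_only:.4f}")
```

For every income level the "neither" and "both" columns coincide, while
the "auto only" column is larger.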

   5.3e Automobile Ownership

   The automobile ownership variable in mode choice models usually takes
one of the following three forms:

   1. A     Total number of automobiles owned by the traveler's
            household
   2. A/LD  Number of automobiles per licensed driver in the traveler's
            household
   3. A/W   Number of automobiles per worker in the traveler's
            household.

The second and third forms represent the possibility that as the number
of licensed drivers or workers in a household increases, the likelihood
that any particular individual can have the use of an automobile
decreases.
   Regardless of which form of the automobile ownership variable is
used, it is usually mode-specific and enters the utility function
additively.  As with additive income variables, the number of additive
automobile ownership variables can be, at most, one fewer than the
number of modes in the model.
   The following example illustrates the use of automobile ownership in
a mode choice model.

   Example 5.9: Automobile Ownership Variables
   Suppose that choice among the modes drive alone, carpool, and bus is
described by a trinomial logit model. Let the utility function be

   V   =  -T   - 5C  /Y + 0.1A/W                                  (5.36a)
    DA      DA     DA

   V   =  -T   - 5C  /Y + 0.05A/W                                 (5.36b)
    CP      CP     CP

   V   =  -T  - 5C /Y,                                            (5.36c)
    B       B     B

where T and C, respectively, denote travel time in hours and travel cost
in dollars, Y denotes the after-tax income of the traveler's household
in thousands of dollars per year, and A/W denotes the number of
automobiles per worker in the traveler's household.  Notice that A/W is
mode-specific (that is, its coefficient is different for different
modes) and that it enters the utility functions of only two of the three
modes.  The choice probabilities are:

                     exp(V  )
                          DA
   P   = -----------------------------------------               (5.37a)
    DA      exp(V  ) +  exp(V  ) + exp(V  )
                 DA          CP         B


                     exp(V  )
                          CP
   P   = -----------------------------------------               (5.37b)
    CP      exp(V  ) +  exp(V  ) + exp(V  )
                 DA          CP         B


                     exp(V  )
                          B
   P   = -----------------------------------------               (5.37c)
    B       exp(V  ) +  exp(V  ) + exp(V  )
                 DA          CP         B


5.4   Other Variables

   Many other variables, in addition to those already discussed, can be
included in disaggregate mode choice models.  Table 5.1 lists some of
the variables that have been used in such models in the past.
   As an illustration of the use of some of these variables (no model
includes all of them), Table 5.2 shows the utility function of a mode
choice model that was developed using data from the San Francisco area.
(This model includes more variables than do many mode choice models,
which is why it was picked for this illustration.) There are three modes
in the model: drive alone, carpool, and transit.  Access to transit can
be either on foot or by automobile.  The table shows the variables of
the model and the values
of their coefficients in the utility function.  Thus, the utility
function for mode i is


   V  =  -4.697X     - 3.658X     - 21.43C/Y - 0.0122IVTT - ...
    i           i,DA         i,CP

         + 0.327NW + 0.0000137YD.                                 (5.38)


   The roles of the variables of the model that have not previously been
discussed in this course are as follows.  HD1 and HD2 represent the
inconvenience of having to wait for transit.  These variables account
for the effect of headway on wait time and, through its effect on wait
time, on mode choice.  The choice of headway variables is based on the
assumption that an increase in transit headway when the headway exceeds
8 min. is less onerous than an equal increase when the headway is less
than 8 min. (e.g., because when the headway exceeds 8 min., additional
waiting time can be spent at home or the office, rather than at the
transit stop).  The relative magnitudes of the coefficients of HD1 and
HD2 support this assumption.  CBD1 and CBD2 represent the effects of
variables other than in-vehicle travel time and walk time that may make
automobile travel to the CBD particularly difficult.  Examples of such
variables are the need to drive in heavily congested traffic and the
difficulty of finding a parking space.  CA accounts for the possibility
that the random components of the utilities of transit with auto access
and transit with walk access may not have the same average values.  PW
captures the possibility that the "primary worker" in a household may
have a greater claim to an automobile than do other household members. 
NW captures the possibility that members of a multi-worker household can
form a carpool among themselves, thereby reducing the difficulty of
carpool formation and increasing the probability that the carpool mode
is chosen.


5.5   The Selection of Choice Sets

   A problem related to that of selecting explanatory variables for a
model is that of selecting the choice set.  In principle, a traveler's
choice set consists of every mode whose probability of being chosen
exceeds zero.  In practice, this can include a large number of
infrequently chosen modes for which data acquisition may be difficult
(e.g., walk, bicycle, boat).  Except in studies where these modes are of
particular interest, little is lost by excluding them from the choice
set (i.e., making the approximation that they are never chosen).  Thus,
in practice, the choice set contains every mode whose probability of
being chosen is large enough to be practically significant.
   Even when obviously infrequently used modes (e.g., boat) are
excluded, selecting choice sets can present difficult decisions.  For
example, should drive alone be included in the choice set of a traveler
whose household does not own an automobile?  The answer is no if there
is no significant likelihood that such a traveler has access to an
automobile.  However, it may be yes if substantial numbers of non-
automobile-owning travelers borrow or lease cars or drive cars provided
by their employers. (The difficulty of deciding whether drive alone
should be included in the choice set is greatly reduced if the data
include information on the number of cars available to a household,
including cars not owned.  Drive alone usually can be safely excluded
from the choice set of a traveler whose household has no car available.)
   There are no rigorous analytic methods for assigning choice sets to
travelers.  The assignment must be based mainly on the experience and
judgment of the analyst.  This judgment should be exercised carefully
since, as is illustrated by the following example, the choice set can
have a substantial effect on a model's choice probabilities.  The method
used to assign choice sets in applying a model must be the same as the
method used in developing it.

   Example 5.10: The Effect of the Choice Set on Choice Probabilities

   Suppose that for travelers who have access to all three modes, choice
among the modes drive alone, carpool, and bus is described by a
multinomial logit model in which the utility function is


   V     =  -T -  5C   /Y  +  0.1A/W                             (5.39a)
    DA        DA    DA

   V     =  -T -  5C   /Y  +  0.05A/W                            (5.39b)
    CP        CP    CP

   V     =  -T -  5C   /Y,                                       (5.39c)
    B         B     B


where Ti is the travel time (in hours) by mode i, Ci is the cost (in
dollars) of travel by mode i, Y is the income of the traveler's
household in thousands of dollars per year, and A/W is the number of
automobiles that the traveler's household owns per worker in the
household.  Then, the choice probabilities are


                     exp(V  )
                          DA
   P   = -----------------------------------------                (5.40)
    DA      exp(V  ) +  exp(V  ) + exp(V  )
                 DA          CP         B


                     exp(V  )
                          CP
   P   = -----------------------------------------                (5.41)
    CP      exp(V  ) +  exp(V  ) + exp(V  )
                 DA          CP         B


                     exp(V  )
                          B
   P   = -----------------------------------------                (5.42)
    B       exp(V  ) +  exp(V  ) + exp(V  )
                 DA          CP         B


   It is a consequence of the IIA property of the logit model that if
Equations (5.40) - (5.42) describe mode choice by a traveler with access
to all three modes, then mode choice by a traveler who lacks access to
drive alone (i.e., a traveler whose choice set consists of carpool and
bus but not drive alone) is described by the following binomial logit
model:


                  exp(V  )
                       CP
   P   = -------------------------------                          (5.43)
    CP      exp(V  ) + exp(V  )
                 CP         B


                  exp(V  )
                       B
   P   = -----------------------------                            (5.44)
    B       exp(V  ) + exp(V  )
                 CP         B


   In this model PDA = 0.


   Now suppose that for a particular traveler, TDA = 0.50 hr., TCP =
0.60 hr., TB = 1.0 hr., CDA = $1.00, CCP = CB = $0.50, Y = 20, and A
= 0; that is, the traveler's household owns no car.  If drive alone is
included in this traveler's choice set, the choice probabilities are

   P     =
    DA

                        exp[-0.5 - 5(1/20)]
-----------------------------------------------------------------------
exp[-0.5 - 5(1/20)] + exp[-0.6 - 5(0.50/20)] + exp[-1.0 - 5(0.50/20)]


         =  0.37



   P     =
    CP
                     exp[-0.6 - 5(0.50/20)]
-----------------------------------------------------------------------
exp[-0.5 - 5(1/20)] + exp[-0.6 - 5(0.50/20)] + exp[-1.0 - 5(0.50/20)]


         =  0.38


   P     =
    B
                     exp[-1.0 - 5(0.50/20)]
-----------------------------------------------------------------------
exp[-0.5 - 5(1/20)] + exp[-0.6 - 5(0.50/20)] + exp[-1.0 - 5(0.50/20)]


         =  0.25.


If drive alone is not included in the choice set, the choice
probabilities are

   P  =  0
    DA



                     exp[ -0.6 - 5(0.50/20)]
   P     =  -------------------------------------------------
    CP      exp[ -0.6 - 5(0.50/20)] + exp[ -1.0 - 5(0.50/20)]

         =  0.60


                     exp[ -1.0 - 5(0.50/20)]
   P  =  -----------------------------------------------------
    B       exp[-0.6 - 5(0.50/20)] + exp[ -1.0 - 5(0.50/20)]


         =  0.40.


In this case, the decision whether to include drive alone in the choice
set makes a difference of 0.37 in the probability that drive alone is
chosen, 0.22 in the probability that carpool is chosen, and 0.15 in the
probability that bus is chosen.
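
The calculations of this example can be reproduced with the following
Python sketch, added here as an illustration.  The function names and
the dictionary layout are arbitrary; the utilities are those of
equations (5.39a) - (5.39c), and A/W is taken to be 0 because A = 0.

```python
import math

def utilities(t, c, y, a_per_w):
    """Utilities (5.39a)-(5.39c); t and c are dicts keyed by mode."""
    aw_coef = {"DA": 0.1, "CP": 0.05, "B": 0.0}
    return {m: -t[m] - 5.0 * c[m] / y + aw_coef[m] * a_per_w for m in t}

def logit_probabilities(v, choice_set):
    """Logit probabilities over choice_set; excluded modes get probability 0."""
    denom = sum(math.exp(v[m]) for m in choice_set)
    return {m: (math.exp(v[m]) / denom if m in choice_set else 0.0) for m in v}

t = {"DA": 0.50, "CP": 0.60, "B": 1.0}   # hours
c = {"DA": 1.00, "CP": 0.50, "B": 0.50}  # dollars
v = utilities(t, c, y=20.0, a_per_w=0.0)

full = logit_probabilities(v, {"DA", "CP", "B"})
no_da = logit_probabilities(v, {"CP", "B"})
for m in ("DA", "CP", "B"):
    print(f"{m:2s}  with drive alone: {full[m]:.2f}   without: {no_da[m]:.2f}")
```

The printed values are 0.37, 0.38, and 0.25 with drive alone in the
choice set and 0.00, 0.60, and 0.40 without it.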

5.6   Summary

   This module has been concerned with the problem of selecting
variables and choice sets for multinomial logit mode choice models.  It
can be seen that the analyst has considerable flexibility in making
these selections.  To a large extent, the analyst must rely on judgment
and past experience in deciding which variables to include in a model. 
However, there also are systematic, empirical procedures for testing
models.  These procedures help to guide the selection of variables by
enabling the analyst to determine whether a particular selection is
seriously inconsistent with the available data.  Procedures for testing
models in this way are discussed in Section 6.5 of Module 6.


                                TABLE 5.1
VARIABLES THAT HAVE BEEN USED IN MODE CHOICE MODELS

Total travel time
Logarithm of total travel time
In-vehicle travel time
Logarithm of in-vehicle travel time
Out-of-vehicle travel time
Out-of-vehicle travel time divided by travel distance
Walk time
Walk distance
Wait time
Transit headway
Wait time for transfers
Number of transfers
Total travel cost
Total travel cost divided by income of traveler's household
Auto mileage-related cost divided by income of traveler's household
Auto parking cost divided by income of traveler's household
Auto tolls divided by income of traveler's household
Bus fare divided by income of traveler's household
Income of traveler's household
Size of traveler's household
Number of automobiles (automobiles per worker, automobiles per licensed
   driver) owned by the traveler's household
Number of licensed drivers in the traveler's household
Employment density at traveler's workplace
Dummy variable indicating whether the traveler's workplace is in the CBD
Dummy variable indicating whether traveler is the "primary worker" in
   household or head of household
Number of workers in traveler's household
Sex of traveler
Age of traveler


                                TABLE 5.2
       VARIABLES AND COEFFICIENTS OF A WORK-TRIP MODE CHOICE MODEL
                      FOR THE SAN FRANCISCO AREA (a)

Symbol of
Variable    Definition of Variable                           Coefficient

 X          Dummy variable equal to 1 if i = DA and 0          -4.697
  i,DA      otherwise

 X          Dummy variable equal to 1 if i = CP and 0          -3.658
  i,CP      otherwise

 C/Y        Travel cost (round trip cents) divided by          -21.43
            household income (annual dollars)

 IVTT       In-vehicle travel time (round trip minutes)        -0.0122

 WT         Walk time (round trip minutes)                     -0.0335

 HD1        Transit headway up to 8 minutes (round trip        -0.0155
            minutes)

 HD2        Transit headway greater than 8 minutes             -0.0107

 TW         Transfer wait time (round trip minutes)            -0.0302

 CBD1       Dummy variable equal to 1 for drive alone if       -1.067
            the traveler's workplace is in the CBD, and
            equal to 0 for other modes and workplaces

 CBD2       Dummy variable equal to 1 for carpool if the       -0.347
            traveler's workplace is in the CBD, and
            equal to 0 for other modes and workplaces

 AWDA       Automobiles per worker in the traveler's            1.958
            household for drive alone, and 0 otherwise

 AWCP       Automobiles per worker in the traveler's            1.763
            household for carpool, and 0 otherwise

 AWTA       Automobiles per worker in the traveler's            1.389
            household for transit with auto access, and
            0 otherwise

 CA         Dummy variable equal to 1 for transit with         -1.237
            auto access, and 0 otherwise

 PW         Dummy variable equal to 1 for drive alone if        0.677
            the traveler is the primary worker in his
            household, and 0 otherwise

 NW         Variable equal to the number of workers in          0.327
            the traveler's household for carpool, and 0
            for other modes

 YD         Disposable income (dollars) of the                  0.0000137
            traveler's household for drive alone and
            carpool, and 0 for transit
___________________________

(a) Source: Cambridge Systematics, Inc. (1978), Analytic Procedures for
    Estimating Changes in Travel Demand and Fuel Consumption, Report
    DOE/PE/8628-1, Vol. II, U.S. Department of Energy, Washington, D.C.


                                EXERCISES

5.1   Suppose you want to predict the effects on transit ridership of
      improving the reliability of transit service.  Give three examples
      of policy variables that could be used to represent transit
      reliability in a mode choice model.

5.2   Suppose the utility function of a mode choice model is

               2
         V = -T ,

      where T is travel time in hours.  According to this utility
      function, is adding 5 minutes to a 15-minute trip more or less
      burdensome to travelers than adding 5 minutes to a 1-hour trip?
      What would your answer be if the utility function were

               1/2
         V = -T   ?

              1/2
      (Note: T    is the square root of T.)

5.3   Let T, C, and Y, respectively, denote travel time in hours, travel
      cost in dollars, and after-tax income in thousands of dollars per
      year.  For each of the following utility functions, determine
      whether high-income travelers are more sensitive, equally
      sensitive, or less sensitive to changes in travel cost than are
      low-income travelers:

      a.  V  =  -T - 0.01CY

      b.  V  =  -0.05TY - 0.025C

      c.  V  =  -T + 5ln(Y - C)

      d.  V  =  -T - 0.01C + 0.2Y

      e.  V  =  -T - 0.01C - 4C/Y

5.4   Suppose, as in Example 5.9, that the utility function of a logit
      model of choice between drive alone, carpool, and bus is


   V     =  -T -  5C   /Y  +  0.1A/W
    DA        DA    DA

   V     =  -T -  5C   /Y  +  0.05A/W
    CP        CP    CP

   V     =  -T -  5C   /Y.
    B         B     B

Show that if the coefficients b1 and b2 have appropriate values, you
obtain the same choice probabilities from the logit model whose utility
function is

   V     =  -T -  5C   /Y
    DA        DA    DA


   V     =  -T - 5C   /Y + b  A/W
    CP        CP   CP       1


   V     =  -T - 5C   /Y + b  A/W.
    B         B    B        2


Also, show that by choosing b3 and b4 appropriately, you obtain the
same choice probabilities from the logit model whose utility function is


   V     =  -T -  5C   /Y + b  A/W
    DA        DA    DA       3


   V     =  -T -  5C   /Y
    CP        CP    CP


   V     =  -T -  5C  /Y   +  b  A/W.
    B         B     B          4


What are the appropriate values of the b's?


                                MODULE 6
            ESTIMATING THE UTILITY FUNCTIONS OF CHOICE MODELS

6.1   Introduction

   In practice, the deterministic component of a logit model's utility
function (called simply the utility function in this module) is never
known a priori.  In fact, an analyst who is in the initial stages of
developing a logit model typically has very little information about the
utility function.  At best, he is likely to have a list of variables
that he thinks should be present in the utility function.  But he may
not be certain that all of the variables are needed, and he is highly
unlikely to know how the variables should be transformed (if at all) or
the numerical values of any parameters that enter the utility function. 
For example, the analyst may believe strongly that a logit model of
choice between automobile and bus should include the variables in-
vehicle travel time (IVTT), out-of-vehicle travel time (OVTT), and
travel cost (C).  But he is less likely to have strong a priori beliefs
about such matters as:

   a. Whether OVTT should be subdivided into walking time and waiting
      time.
   b. Whether log (IVTT) gives a better representation of the effects of
      in-vehicle travel time than does IVTT.
   c. The values of the numerical coefficients that enter the utility
      function.  For example, suppose it has been determined that the
      appropriate form of the utility function is

         V  =  a  IVTT + a  OVTT + a  C.
                1         2         3

      What numerical values should be assigned to the coefficients a1,
      a2, and a3?

These questions must be answered empirically by fitting one or more
models to appropriate data and then testing and comparing the models to
see which one best describes the data.  This module describes methods
for fitting and testing logit models.
   The need to fit a model to data arises in the process of developing
virtually any travel demand model, not just logit models.  This fitting
process is often called calibration.  However, the words "estimation"
and "testing" give a more precise description of the process that will
be discussed in this module since this process consists of estimating
the values of the numerical coefficients of models and testing different
models to determine which one best explains the available data.

6.2   Acquisition of Data

   The first step of any estimation and testing process is acquisition
of appropriate data.  The data needed to develop a logit mode choice
model are:
   a. Observations of actual mode choices by a random sample of
      individuals who made the types of trips to which the model will
      apply.  For example, if a model of work-trip mode choice is being
      developed, observations of mode choices by a sample of travelers
      to work are needed.  This sample is called the estimation sample.
   b. The corresponding values of all attributes of both the chosen and
      the non-chosen modes that may be used as variables of the model. 
      For example, suppose that total travel time is being considered
      for use as a variable of the model.  Then for each individual in
      the estimation sample, the data must include the observed value of
      total travel time for each mode available to that individual.  A
      logit model cannot be developed if the attribute data pertain only
      to the chosen mode. (In practice, modes that are highly unlikely
      to be chosen can be considered unavailable and ignored in the
      data-acquisition step.)
   c. The values of any attributes of individuals that may be used as
      variables of the model.  For example, if automobile ownership is
      being considered for inclusion in the model, then the data must
      include the number of automobiles owned by the household of each
      individual in the estimation sample.

   Observations of the choices and attributes of individuals usually are
obtained from home-interview or telephone-interview surveys of randomly
sampled individuals or households.  Data on the attributes of modes
(e.g., travel times and costs) can be obtained from analyses of highway
and transit networks and transit schedules.  Since it is necessary to
have data on individual travelers, aggregate data sets, such as the U.S.
Census, cannot be used as sources of travel data unless special
procedures are implemented to remove the biases caused by aggregation of
the data.  These procedures are quite complex, and discussion of them is
beyond the scope of this course.
   The size of the sample needed to develop a logit mode choice model
usually is in the range 1000-3000 individuals, not counting non-
respondents and unusable responses.  Although the upper end of this
range is preferable to the lower, a mode choice model usually can be
developed satisfactorily from a sample of 1000 observations if cost or
other considerations prohibit acquisition of a larger data set.
   The data used for developing a logit model must consist of
observations of choices by individuals, and the attribute data must
pertain to these individuals.  Use of aggregate data, such as mode
shares according to
traffic zone and average values of attributes according to traffic zone,
can result in the development of a highly erroneous model unless special
procedures are used to remove the effects of aggregation bias.
   Table 6.1 illustrates the data needed to develop a simple model of
choice between the modes drive alone, carpool, and bus.  The attributes
of modes and travelers that will be used to form the variables of this
model are total travel time and the number of automobiles owned by the
traveler's household.  The entry NA for the travel time of a mode
signifies that this mode either is not available to the traveler or is
so unlikely to be chosen that it can be treated as unavailable without
making a serious error.  Of course, in practice, many more attributes
would be included in the data set than are shown in Table 6.1, and the
data table would be correspondingly larger.  However, its form and
structure would be as shown in Table 6.1.

     TABLE 6.1 -- DATA FOR DEVELOPMENT OF A SIMPLE MODE CHOICE MODEL

         Autos     Chosen       Travel Time (Min.) by Mode
Person   Owned     Mode       Drive Alone Carpool    Bus
   1       1      Drive Alone     20        30       45
   2       0      Bus             NA        25       35
   3       2      Drive Alone     15        22       60
   4       1      Carpool         30        35       55
   5       1      Carpool         10        12       25
   6       1      Bus             20        30       15
   7       0      Carpool         NA        20       15
   8       3      Drive Alone     30        40       75
   9       2      Carpool         10        12        8
  10       1      Bus             50        60       40
   -       -      -                -         -        -
   -       -      -                -         -        -
   -       -      -                -         -        -
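
   The layout of Table 6.1 can be sketched in code as one record per
traveler.  The sketch below is illustrative only: the field names and
the helper functions are assumptions introduced here, not part of the
course, and None stands for the NA entries of the table.

```python
# Illustrative encoding of the first rows of Table 6.1.
# Field names are assumptions; None marks an unavailable mode (NA).
MODES = ("drive_alone", "carpool", "bus")

estimation_sample = [
    {"person": 1, "autos": 1, "chosen": "drive_alone",
     "time": {"drive_alone": 20, "carpool": 30, "bus": 45}},
    {"person": 2, "autos": 0, "chosen": "bus",
     "time": {"drive_alone": None, "carpool": 25, "bus": 35}},
    {"person": 3, "autos": 2, "chosen": "drive_alone",
     "time": {"drive_alone": 15, "carpool": 22, "bus": 60}},
]


def available_modes(record):
    """Return the modes for which attribute data are present."""
    return [m for m in MODES if record["time"][m] is not None]


def is_usable(record):
    """A record needs data for the chosen mode and at least one other mode."""
    modes = available_modes(record)
    return record["chosen"] in modes and len(modes) >= 2


for rec in estimation_sample:
    print(rec["person"], available_modes(rec), is_usable(rec))
```

   The usability check mirrors the requirement of Section 6.2 that
attribute data be available for the non-chosen modes as well as for the
chosen mode.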


6.3   Specifying the Model

   After the data have been acquired, the next step in developing a
logit model consists of specifying, tentatively, one or more forms of
the utility function.  This specification step usually includes
identifying the variables of the utility function, including any
transformations of the attribute data, and the functional form of the
relation between the variables and the deterministic component of
utility.  It usually does not include specifying the numerical values of
the constant coefficients that enter the utility function.  For example,
in developing a logit mode choice model from the data shown in Table
6.1, the following two alternative forms of the utility function might
be specified:


   Form 1:  V_DA  =  a1 T_DA  +  a2 A  +  a3                      (6.1a)

            V_CP  =  a1 T_CP  +  a4 A  +  a5                      (6.1b)

            V_B   =  a1 T_B                                       (6.1c)


   Form 2:  V_DA  =  b1 log(T_DA)  +  b2 A  +  b3                 (6.2a)

            V_CP  =  b1 log(T_CP)  +  b4 A  +  b5                 (6.2b)

            V_B   =  b1 log(T_B).                                 (6.2c)

In these equations, T denotes travel time in minutes, A denotes the
number of automobiles owned by the traveler's household, the subscripts
DA, CP, and B denote the modes drive alone, carpool, and bus, and a1 to
a5 and b1 to b5 are constant coefficients.  The specification of the
forms (6.1) and
(6.2) at this stage does not imply that the analyst necessarily believes
either to be correct.  Rather, (6.1) and (6.2) are forms that the
analyst believes are worthy of estimation and testing.  During the
process of estimation and testing, information will be obtained that
helps to determine whether these forms should be modified (e.g., by
deleting one or more variables from either or both) and that provides a
way to determine which form best explains the observed choices of the
individuals in the estimation sample.
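
   To make the two specifications concrete, the following sketch
evaluates Forms 1 and 2 for person 1 of Table 6.1 and converts the
utilities to multinomial logit choice probabilities.  The coefficient
values are hypothetical, chosen only for illustration; actual values are
obtained by estimation.  The function names are likewise assumptions.

```python
import math


def utilities(times, autos, coef, transform=lambda t: t):
    """Deterministic utilities of Equations (6.1) or (6.2).

    coef = (c1, c2, c3, c4, c5).  transform is the identity for Form 1
    and math.log for Form 2.  Modes whose time is None are unavailable.
    """
    c1, c2, c3, c4, c5 = coef
    mode_terms = {"DA": c2 * autos + c3, "CP": c4 * autos + c5, "B": 0.0}
    return {m: c1 * transform(t) + mode_terms[m]
            for m, t in times.items() if t is not None}


def logit_probabilities(v):
    """P(m) = exp(V_m) / (sum of exp(V_k) over the available modes)."""
    v_max = max(v.values())                 # shift for numerical stability
    exp_v = {m: math.exp(x - v_max) for m, x in v.items()}
    total = sum(exp_v.values())
    return {m: e / total for m, e in exp_v.items()}


# Person 1 of Table 6.1; coefficient values are hypothetical.
times, autos = {"DA": 20, "CP": 30, "B": 45}, 1
form1 = logit_probabilities(
    utilities(times, autos, (-0.05, 0.8, 0.5, 0.4, -0.3)))
form2 = logit_probabilities(
    utilities(times, autos, (-1.5, 0.8, 0.5, 0.4, -0.3), math.log))
print(form1)
print(form2)
```

   Only the transformation of travel time differs between the two
calls, which is exactly the difference between Forms 1 and 2.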

6.4   Estimation -- The Maximum Likelihood Method

   The third step of model development consists of estimating the
numerical values of the models' coefficients (e.g., the values of a1 to
a5 and b1 to b5 in equations (6.1) and (6.2)) by fitting the models
to the available data.  The fitting technique that is usually used in
practice is called the maximum likelihood method.  This consists of
choosing the values of the coefficients so as to maximize the
likelihood (or probability), according to the model being developed, of
observing the choices made by the individuals in the estimation sample.
It can be shown that, when the model is correctly specified and the
estimation sample is large, the maximum likelihood method yields
estimates of the coefficients and predictions of choice probabilities
that are at least as accurate as those of any other estimation method.
   The values of the coefficients obtained by the maximum likelihood
method are called estimates because they are based on a data set that is
a random sample of all travelers who might make the choices being
modeled.  Re-estimation of the coefficients using a different random
sample from the same population of travelers would give different values
of the coefficients
owing to the effects of random sampling error.  The "true" coefficient
values could be found only if estimation were carried out using all
individuals in the population of interest.  This is never possible in
practice.  Therefore, the values of the coefficients can be known in
practice only up to the effects of random sampling error.  This is why
the numerical values of the coefficients are called estimates.
   The following example illustrates the maximum likelihood method.

   Example 6.1: The Maximum Likelihood Method
   Suppose a logit model of choice between the modes automobile and bus
is being developed.  The only variable of the model is total travel
time, T. The deterministic component of the model's utility function is
specified as

   V  =  aT,                                 (6.3)

where a is a constant coefficient.  Suppose that the estimation sample
consists of observations of the mode choices of three individuals.  Of
course, a sample size of three is far too small to be useful in
practice, but it is convenient for illustrative purposes because it
enables all the necessary computations to be performed with a desk
calculator.  Maximum likelihood estimation with a sample of realistic
size requires the use of a digital computer.  Let the choices and travel
times for the individuals in the estimation sample be:

                                          Travel Time (Min.)
            Person      Chosen Mode     Automobile         Bus

               1           Auto             50             30
               2           Auto             10             20
               3            Bus             30             40

According to the model, the probabilities of the observed mode choices
are

   Individual 1: P(Auto)   =   [exp(50a)]/[exp(50a) + exp(30a)]
   Individual 2: P(Auto)   =   [exp(10a)]/[exp(10a) + exp(20a)]
   Individual 3: P(Bus)    =   [exp(40a)]/[exp(30a) + exp(40a)].

Equivalently, the probabilities are

   Individual 1: P(Auto)   =   1/[1 + exp(-20a)]
   Individual 2: P(Auto)   =   1/[1 + exp(10a)]
   Individual 3: P(Bus)    =   1/[1 + exp(-10a)].

The probability of the entire estimation sample is

   L  =  P(person 1 chooses auto) x P(person 2 chooses auto)

         x P(person 3 chooses bus).

   Therefore,

   L  =  1/[1 + exp(-20a)]  x  1/[1 + exp(10a)]  x  1/[1 + exp(-10a)].

                                                                   (6.4)

L is called the sample likelihood.  In practice it is customary to work
with the natural logarithm of L, which is called the log likelihood and
is denoted by log L. In this example

   log L =  -(log[1 + exp( -20a)] + log[1 + exp(10a)]

            + log[1 + exp( -10a)]).                                (6.5)

The likelihood of a logit model is always a number whose value is
between zero and one (because the likelihood is a probability). 
Therefore, the log likelihood is always a negative number.
   The maximum likelihood method chooses the value of a so as to
maximize L or, equivalently, log L. Although this usually requires the
use of a digital computer, it can be done graphically in this case. 
Figure 6.1 shows the graph of log L in Equation (6.5) as a function of
a.  It can be seen that the maximum occurs at a =  0.08. This value is
called the maximum likelihood estimate of a.
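
   The maximization in Example 6.1 can also be carried out numerically.
The sketch below evaluates Equation (6.5) and locates its maximum with
a ternary search, which is valid here because the log likelihood of a
logit model is a concave function of the coefficient.  The search
routine is a simple stand-in chosen for illustration; it is not the
algorithm used by the estimation software mentioned in this module.

```python
import math

# Travel time of the chosen mode minus that of the non-chosen mode for
# the three travelers of Example 6.1: 50 - 30, 10 - 20, 40 - 30.
TIME_DIFFS = [20.0, -10.0, 10.0]


def log_likelihood(a):
    """Equation (6.5): log L = -(sum of log[1 + exp(-a * d)])."""
    return -sum(math.log(1.0 + math.exp(-a * d)) for d in TIME_DIFFS)


def maximize(f, lo, hi, tol=1e-8):
    """Ternary search for the maximum of a concave function on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1          # the maximum lies to the right of m1
        else:
            hi = m2          # the maximum lies to the left of m2
    return 0.5 * (lo + hi)


a_hat = maximize(log_likelihood, -1.0, 1.0)
print(round(a_hat, 4), round(log_likelihood(a_hat), 4))
```

   The search returns a value of a close to 0.076, which agrees with
the value of 0.08 read from the graph to the precision that a graph
allows.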

   The only difference between the use of the maximum likelihood method
in the foregoing example and the use of the method in actual practice is
that in practice, the model being estimated has many unknown
coefficients (e.g., the work trip mode choice model for San Francisco
discussed in Section 5.4 of Module 5 has 17 coefficients) and the
estimation data set contains many more than three observations.  As a
result, the values of the coefficients that maximize the likelihood and
log likelihood functions cannot be found graphically; they must be
found using a digital computer.  Software for carrying out maximum
likelihood estimation of logit models is available for both mainframe
computers (e.g., the subroutine ULOGIT in the UTMS system) and
microcomputers (e.g., the MDA software package).

6.5   Interpreting the Estimation Results -- Testing the Model

   The outputs of logit estimation software include, in addition to the
estimated values of the model's coefficients, a variety of information
that is useful for interpreting the estimated coefficients, deciding
which variables should be included in the model, and comparing one model
with another to determine which best describes the data.

   6.5a  The Precision of the Estimates -- Standard Errors of the
         Estimates

   The outputs of most logit estimation software include, along with the
estimated values of the coefficients, a set of numbers called the
standard errors of the estimates.  The standard error of the estimate of
the value of a particular coefficient is an indicator of the amount by
which the estimated value of the coefficient is likely to differ from
the true value as a result of random sampling error.  Thus, the standard
error of the
estimate is an indicator of the precision with which the coefficient has
been estimated.  If the model is correctly specified, then there is a
probability of 0.95 that the true coefficient value is within
1.96x(standard error of the estimate) of the estimated value.  In other
words, if b_est is the estimated value of a coefficient, b_true is its
unknown true value, and s is its standard error of estimate, the
following inequality holds with probability 0.95:

      b_est - 1.96s  <=  b_true  <=  b_est + 1.96s.                (6.6)

Changing the number 1.96 to 1.645 or 2.575 causes the inequality to hold
with probability 0.90 or 0.99.  The interval b_est - 1.96s to b_est +
1.96s is called a 95 percent confidence interval for b_true.  The
analogous intervals formed by replacing 1.96 by 1.645 and 2.575 are
called 90 percent and 99 percent confidence intervals.

   Example 6.2 -- Confidence Intervals

   Suppose the estimated value of a particular coefficient is 2.337 and
its standard error of estimate is 0.875. Then a 95 percent confidence
interval for the true value of the coefficient is 0.622 to 4.052. In
other words, the interval 0.622 to 4.052 contains the true value of the
coefficient with probability 0.95.
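
   The arithmetic of Example 6.2 can be reproduced with a few lines of
code.  This is a minimal sketch; the function name and its arguments
are assumptions made for illustration.

```python
def confidence_interval(b_est, s, z=1.96):
    """Return the interval from b_est - z*s to b_est + z*s.

    z = 1.96 gives a 95 percent interval; 1.645 and 2.575 give
    90 percent and 99 percent intervals.
    """
    half_width = z * s
    return b_est - half_width, b_est + half_width


# Example 6.2: estimate 2.337 with standard error 0.875.
low95, high95 = confidence_interval(2.337, 0.875)
low90, high90 = confidence_interval(2.337, 0.875, z=1.645)
print(round(low95, 3), round(high95, 3))
print(round(low90, 3), round(high90, 3))
```

   As expected, the 90 percent interval is narrower than the 95 percent
interval computed from the same estimate and standard error.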

   6.5b  Deciding Whether to Retain a Variable -- The t Statistic

   In addition to the standard errors of the coefficient estimates, most
logit software reports numbers called t statistics of the coefficients.
The t statistic of a coefficient is simply the estimated value of the
coefficient divided by the standard error of the estimate.  In other
words, t = b_est/s.

   The t statistic of a coefficient is useful for deciding whether the
variable associated with that coefficient contributes significantly to
the ability of the model to describe or explain the data.  This makes
the t statistic useful for deciding whether a variable should be
retained in the model or dropped.  Variables with significant
explanatory power should be retained, whereas variables with little
explanatory power should be dropped.  In general, variables whose
coefficients have large positive or negative t statistics have greater
explanatory power than variables whose coefficients have t statistics
that are close to zero.  Thus, variables whose coefficients have large
positive or negative t statistics should be retained, whereas variables
whose coefficients have t statistics close to zero may be dropped from a
model.  There is no single "correct" dividing line between t values that
indicate a variable should be retained and t values that indicate a
variable can be dropped.  However, experience suggests that it is a good
policy to retain variables whose coefficients have t statistics that are
greater than 1.0 or less than -1.0.  Failure to retain variables whose
coefficients have t statistics outside of the range -1.0 to 1.0 may
cause the estimates of the remaining coefficients to be seriously biased
and the resulting model to be highly erroneous.
   Variables whose coefficients have t statistics between -1.0 and 1.0
can be considered candidates for being dropped from the model.  Before
such variables are dropped, however, it is necessary to take account of
certain qualifications that will be discussed later in this subsection
and in subsection 6.5c. In addition, the analyst may have strong reasons
for believing a variable has an important influence on choice, even if
its t statistic is close to 0. In such a case, the analyst may decide to
retain the variable, despite its low t statistic.  However, the
coefficient of the
variable and, therefore, its influence on choice will be known very
imprecisely.  It will not be possible to conclude with confidence that
predictions obtained from a model that includes the variable are more
accurate than predictions obtained from a model that excludes it.

   Example 6.3 -- t Statistics

   Suppose that in a model of choice between automobile (A) and bus (B)
for travel to work, the utility function is specified as


   V_A  =  b1 + b2 IVTT_A + b3 OVTT_A + b4 C_A + b5 A + b6 D      (6.7a)

   V_B  =       b2 IVTT_B + b3 OVTT_B + b4 C_B,                   (6.7b)


where IVTT denotes in-vehicle travel time, OVTT denotes out-of-vehicle
travel time, C denotes travel cost, A denotes the number of automobiles
owned by the traveler's household, and D equals 1 if the traveler's
workplace is in the central business district and 0 otherwise.  Let the
estimation results be:

                          Estimated  Standard Error
 Coefficient  Variable     Value      of Estimate  t Statistic

     b1      Intercept     1.45          0.390         3.72
     b2        IVTT       -0.00897       0.00632      -1.42
     b3        OVTT       -0.0308        0.0106       -2.91
     b4          C        -0.115         0.0262       -4.39
     b5          A         0.770         0.244         3.16
     b6          D        -0.561         0.783        -0.716

The t statistic of b6 is between -1.0 and 1.0, which suggests that the
variable D has little explanatory power and that this variable can be
dropped from the model.  None of the other coefficients have t
statistics between -1.0 and 1.0, so none of the other variables can be
dropped.
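
   The screening step of Example 6.3 can be sketched as follows.  The
estimates are those of the example, with the standard errors entered as
positive magnitudes; the function names and the threshold argument are
assumptions made for illustration.

```python
# (coefficient, variable, estimated value, standard error), Example 6.3
RESULTS = [
    ("b1", "Intercept", 1.45, 0.390),
    ("b2", "IVTT", -0.00897, 0.00632),
    ("b3", "OVTT", -0.0308, 0.0106),
    ("b4", "C", -0.115, 0.0262),
    ("b5", "A", 0.770, 0.244),
    ("b6", "D", -0.561, 0.783),
]


def t_statistic(estimate, std_error):
    """t = estimated value divided by its standard error."""
    return estimate / std_error


def drop_candidates(rows, threshold=1.0):
    """Coefficients whose t statistics lie between -threshold and threshold."""
    return [name for name, _, est, se in rows
            if abs(t_statistic(est, se)) < threshold]


for name, variable, est, se in RESULTS:
    print(name, variable, round(t_statistic(est, se), 3))
print("candidates for dropping:", drop_candidates(RESULTS))
```

   A coefficient flagged by this screen is only a candidate for
dropping; the qualifications discussed in the remainder of this
subsection and in subsection 6.5c still apply.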

   The fact that a coefficient has a low t statistic does not
automatically mean that the corresponding variable should be dropped
from the model.  Errors in specifying the model's utility function can
cause one or more coefficients to have low t statistics even if the
attributes their variables represent are important to mode choice.  For
example, if the correct way to represent a certain attribute is with the
variable X^2 but in the estimated model the attribute is represented
incorrectly by X, then the coefficient of X may have a low t statistic,
even if the attribute represented by X is important to mode choice.  In
this case, if the model were re-estimated with the variable X^2 in place
of X, the coefficient of X^2 might be found to have a very high t
statistic.  Thus, it is often useful to experiment with different
transformations of an attribute before concluding on the basis of t
statistics that the attribute need not be represented in the utility
function.
   Another situation in which a low t statistic may not indicate that a
variable should be dropped is when two or more coefficients have low t
statistics.  It is possible for the t statistics of several coefficients
to be low even though the variables associated with these coefficients
collectively have significant explanatory power.  In other words, it is
possible for individual variables to have low explanatory power while a
group of such variables has high explanatory power.  In this case, it
would be undesirable to drop any of the variables, despite the low t
statistics of their coefficients.  A method for identifying the
occurrence of this situation is presented in the next subsection.

   There is one situation in which a t statistic outside of the range
-1.0 to 1.0 indicates that a model is erroneously specified.  If a
coefficient has a t statistic outside this range and its sign is
inconsistent with well-established theory, then the model involved is
almost certainly incorrect.  For example, the coefficient of travel cost
in a mode choice model should be negative.  Thus, for example, if
estimating a particular model yields a travel cost coefficient of +0.50
with a t statistic of 2.7, the model is almost certainly incorrect and
should be reformulated.

   6.5c  Deciding Whether to Retain a Group of Variables -- The
         Likelihood Ratio Test

   Most logit estimation software reports the value that the sample log
likelihood has when the values of the coefficients equal the maximum
likelihood estimates.  This maximum value of the log likelihood provides
the basis of a procedure for deciding whether a group of variables can
be dropped from a model.  The procedure is called a likelihood ratio
test.  Intuitively, it works as follows.  If the group of variables in
question has little explanatory power, then dropping them from the model
should have little effect on the maximum value of the log likelihood.
Dropping one or more variables always will cause the maximum value of
the log likelihood to decrease, but it will not decrease by much if the
variables that have been dropped have little explanatory power.  In
other words, if the group of variables in question has little
explanatory power, the difference between the log likelihoods of models
estimated with and without these variables will be close to zero.
   The likelihood ratio test is carried out quantitatively as follows:

      a. Estimate the model with all variables included.  Let log L1
         denote the resulting maximum value of the log likelihood.
      b. Drop the variables in question and re-estimate the model.  Let
         log L2  denote the resulting maximum value of the log
         likelihood.
      c. Compute the quantity LR = 2(log L1 - log L2).  LR is called
         the likelihood ratio test statistic.  It usually must be
         computed by hand using the values of log L1 and log L2
         reported by the logit estimation software.  LR is always a
         positive number.
      d. If LR exceeds an appropriately determined critical value, CV,
         then the variables being tested should be retained in the
         model, even if all of their coefficients have t statistics in
         the range -1.0 to 1.0.  If LR is less than CV, then it may be
         desirable to drop the variables from the model.

   The critical value, CV, for the likelihood ratio test statistic
depends on the number of variables being tested.  Table 6.2 shows the
appropriate values of CV for testing 2 to 5 variables.  A likelihood
ratio test of one variable is equivalent to the t-test procedure
described in subsection 6.5b. Thus, there is no need to carry out a
likelihood ratio test of a single variable.
   The group of variables to which the likelihood ratio test is applied
should be selected before looking at the variables' t statistics.  It is
an incorrect use of the test to define the group as the set of variables
whose t statistics are between -1.0 and 1.0.

TABLE 6.2 -- CRITICAL VALUES OF THE LIKELIHOOD RATIO TEST STATISTIC

          Number of Variables         Critical
             Being Tested               Value

                   2                    2.408

                   3                    3.665

                   4                    4.878

                   5                    6.064
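
   The course does not say how the entries of Table 6.2 were derived.
They can be checked against the chi-square distribution with degrees of
freedom equal to the number of variables being tested: each tabulated
value is exceeded by such a random variable with a probability very
close to 0.30.  The sketch below performs this check using closed-form
expressions for the upper-tail probability; the function name is an
assumption, and the link to the 0.30 level is an observation about the
numbers rather than a statement made in the course.

```python
import math

TABLE_6_2 = {2: 2.408, 3: 3.665, 4: 4.878, 5: 6.064}


def chi2_upper_tail(x, df):
    """P(chi-square with df degrees of freedom exceeds x), integer df."""
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over j < df/2 of (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= (x / 2.0) / j
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: erfc(sqrt(x/2)) plus
    # sqrt(2x/pi) * exp(-x/2) * [1 + x/3 + x^2/(3*5) + ...]
    term, total = 1.0, 0.0
    for r in range((df - 1) // 2):
        if r > 0:
            term *= x / (2 * r + 1)
        total += term
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0) * total)


for df, cv in TABLE_6_2.items():
    print(df, cv, round(chi2_upper_tail(cv, df), 4))
```

   A level of 0.30 is much less stringent than the 0.05 level often
used in hypothesis testing, which is in keeping with the policy stated
in subsection 6.5b of retaining variables whose t statistics lie
outside the range -1.0 to 1.0.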

   The use of the likelihood ratio test is illustrated by the following
example.

   Example 6.4: The Likelihood Ratio Test

   Suppose that estimation of the logit model of Example 6.3 (i.e., the
model whose utility function is specified in Equations 6.7) had yielded
the following results:

                           Estimated   Standard Error
 Coefficient   Variable      Value       of Estimate   t Statistic

     b1       Intercept      1.45          0.390          3.72
     b2         IVTT        -0.00897       0.0163        -0.549
     b3         OVTT        -0.0308        0.0106        -2.91
     b4           C         -0.115         0.0262        -4.39
     b5           A          0.770         0.244          3.16
     b6           D         -0.561         0.783         -0.716

     log L         =       -374.4

Suppose, also, that it is uncertain whether the variables IVTT and D
contribute significantly to the explanatory power of the model.  To
determine whether these variables can be dropped, re-estimate the model
using the utility function

   V_A  =  b1 + b3 OVTT_A + b4 C_A + b5 A                         (6.8a)

   V_B  =       b3 OVTT_B + b4 C_B.                               (6.8b)

Suppose the results are as follows:

                              Estimated Standard Error
    Coefficient   Variable      Value     of Estimate  t Statistic

        b1       Intercept     2.67         0.438        6.10
        b3         OVTT       -0.0291       0.0143      -2.04
        b4           C        -0.175        0.0482      -3.63
        b5           A         0.567        0.163        3.48

       log L          =        -377.2

Then the likelihood ratio test statistic is LR = 2[(-374.4) - (-377.2)]
= 5.60.  There are two variables being tested.  According to Table 6.2,
the critical value of the likelihood ratio statistic for testing two
variables is 2.408.  Since LR exceeds this value, the variables IVTT and
D together have significant explanatory power, even though neither has a
t statistic outside of the range -1.0 to 1.0.  Although the influence of
each variable on choice can be estimated only very
imprecisely, neither variable should be dropped from the model.  Doing
so may seriously bias the remaining coefficients and lead to large
prediction errors.  In other words, it is not possible with this model
to make accurate predictions of the effects on mode choice of changes in
in-vehicle travel time or workplace location.  However,
it is necessary to retain these variables in the model to prevent
predictions of the effects of changes in the other variables from being
biased.
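
   The computation in Example 6.4 follows steps a to d of the procedure
given earlier in this subsection and can be sketched as below.  The
critical values are those of Table 6.2; the function name and return
format are assumptions made for illustration.

```python
CRITICAL_VALUES = {2: 2.408, 3: 3.665, 4: 4.878, 5: 6.064}   # Table 6.2


def likelihood_ratio_test(log_l_full, log_l_restricted, n_tested):
    """Return (LR, CV, retain) for a group of n_tested variables.

    log_l_full       -- maximum log likelihood with all variables included
    log_l_restricted -- maximum log likelihood with the group dropped
    retain           -- True if LR exceeds the critical value
    """
    lr = 2.0 * (log_l_full - log_l_restricted)
    cv = CRITICAL_VALUES[n_tested]
    return lr, cv, lr > cv


# Example 6.4: IVTT and D are tested together.
lr, cv, retain = likelihood_ratio_test(-374.4, -377.2, 2)
print(round(lr, 2), cv, retain)
```

   Because the lookup table covers only 2 to 5 variables, the sketch
raises a KeyError for other group sizes rather than guessing a critical
value.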

   6.5d Other Uses of t Statistics and Likelihood Ratio Tests

   Other important uses of t statistics and likelihood ratio tests are
to determine whether attributes such as travel time should be decomposed
into components and to determine whether such attributes are generic or
mode specific.  The following examples illustrate how these
determinations can be made.

   Example 6.5:   Determining Whether Out-of-Vehicle Travel Time Should
                  Be Subdivided into the Components Walk Time and Wait
                  Time

   Consider, again, the logit mode choice model of Example 6.3 whose
utility function is given by Equations (6.7). If walk time and wait time
are evaluated differently by travelers, then the variable OVTT should be
replaced by its components WK (walk time) and WT (wait time).  The term
b3 OVTT in the utility function should be replaced by b3 WK + a WT,
where b3 and a are constant coefficients.  But

      b3 WK + a WT  =  b3 (WK + WT) + (a - b3) WT.                 (6.9)

Since OVTT is the sum of WK and WT,

      b3 WK + a WT  =  b3 OVTT + (a - b3) WT.                     (6.10)

Let b7 = (a - b3).  Then

      b3 WK + a WT  =  b3 OVTT + b7 WT.                           (6.11)


Therefore, the utility function for the model in which out-of-vehicle
travel time is decomposed into the components walk time and wait time
can be written in the form

   V_A  =  b1 + b2 IVTT_A + b3 OVTT_A + b4 C_A + b5 A + b6 D
            + b7 WT_A                                           (6.12a)

   V_B  =       b2 IVTT_B + b3 OVTT_B + b4 C_B
            + b7 WT_B.                                          (6.12b)


To determine whether the decomposition is worthwhile, estimate the logit
model using the utility function (6.12). If the t statistic of the
coefficient b7 is outside of the range -1.0 to 1.0, then the separate
variable WT should not be dropped from the model.  The decomposition of
out-of-vehicle travel time into components adds significant explanatory
power to the model.  If the t statistic of b7 is between -1.0 and 1.0,
then WT does not have significant explanatory power and probably can be
dropped.  In other words, if the t statistic is between -1.0 and 1.0,
decomposition of out-of-vehicle travel time into its components is
unnecessary.
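
   The reparameterization used in Example 6.5 can be checked
numerically.  The sketch below confirms that separate walk and wait
coefficients give the same utility contribution as the form with OVTT
and an additional WT term, and shows how the wait-time coefficient is
recovered as b3 + b7.  The coefficient values and times are
hypothetical.

```python
def decomposed_term(b3, a, wk, wt):
    """Walk and wait time with separate coefficients: b3*WK + a*WT."""
    return b3 * wk + a * wt


def reparameterized_term(b3, b7, wk, wt):
    """Equivalent form: b3*OVTT + b7*WT, where OVTT = WK + WT."""
    return b3 * (wk + wt) + b7 * wt


b3, a = -0.03, -0.05        # hypothetical walk and wait coefficients
b7 = a - b3                 # extra coefficient on wait time
wk, wt = 6.0, 4.0           # hypothetical walk and wait times (min.)

print(decomposed_term(b3, a, wk, wt), reparameterized_term(b3, b7, wk, wt))
print("wait-time coefficient recovered as b3 + b7:", b3 + b7)
```

   If b7 is zero, walk time and wait time have the same coefficient,
which is why the t statistic of b7 is the relevant test of whether the
decomposition is needed.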

   Example 6.6: Generic or Mode-Specific Travel Time and Cost Variables 

   Refer, again, to the mode choice model of Example 6.3. Suppose it is
desired to determine whether in-vehicle travel time and cost should be
represented as mode-specific variables.  If in-vehicle travel time is
mode-specific, then Equations (6.7) should be replaced by

   V_A  =  b1 + b2 IVTT_A + b3 OVTT_A + b4 C_A + b5 A + b6 D     (6.13a)

   V_B  =       b7 IVTT_B + b3 OVTT_B + b4 C_B,                  (6.13b)

where b7 is a constant coefficient.  This representation causes the
in-vehicle travel time term of the utility function, b2 IVTT for auto
and b7 IVTT for bus, to be mode specific.  If, in addition, travel cost
is mode specific, the utility function becomes

   V_A  =  b1 + b2 IVTT_A + b3 OVTT_A + b4 C_A + b5 A + b6 D     (6.14a)

   V_B  =       b7 IVTT_B + b3 OVTT_B + b8 C_B,                  (6.14b)

where b8 is a constant coefficient.  Equations (6.14) are equivalent to

   V_A  =  b1 + b2 IVTT_A + b3 OVTT_A + b4 C_A + b5 A + b6 D     (6.15a)

   V_B  =  b2 IVTT_B + b3 OVTT_B + b4 C_B
            + (b7 - b2) IVTT_B + (b8 - b4) C_B.                  (6.15b)

Define new coefficients b9  and b10  by

   b9   =  b7  -  b2                                             (6.16a)

   b10  =  b8  -  b4.                                            (6.16b)

Then the utility function with mode specific in-vehicle travel time and
travel cost variables can be written in the form:

   V_A  =  b1 + b2 IVTT_A + b3 OVTT_A + b4 C_A + b5 A + b6 D     (6.17a)

   V_B  =  b2 IVTT_B + b3 OVTT_B + b4 C_B
            + b9 IVTT_B + b10 C_B.                               (6.17b)

   To determine whether the mode-specific representation of in-vehicle
travel time and travel cost is worthwhile, estimate the logit model
whose utility function is given by Equations (6.17). Then use a
likelihood ratio test to determine whether the terms b9 IVTT_B and
b10 C_B add significant explanatory power to the model.  If they do not,
then the mode-specific representation is unnecessary.  If they do, then
at least one of the attributes in-vehicle travel time and cost should be
represented as a mode specific variable in the model.  If only one of
the coefficients b9 and b10 has a t-statistic between -1.0 and 1.0,
then the corresponding attribute can be represented by a generic
variable.  If neither or both coefficients have t-statistics in this
range, then both attributes should be represented by mode-specific
variables. (If the likelihood ratio test indicates that the mode-
specific representations of travel time and cost add significant
explanatory power to the model, then at least one of the variables
travel time and travel cost must be treated as mode specific, regardless
of the values of the t statistics of b9 and b10, If, in addition, b9
and b10 both have t statistics between -1.0 and 1.0, it is not possible
to determine

                                   139





whether one of the variables -- either travel time or travel cost -- can
be treated as generic.)
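
   The decision rule just described can be written out as a short
Python sketch.  It assumes the usual likelihood ratio statistic (twice
the difference between the unrestricted and restricted log likelihoods)
with 2 degrees of freedom, one for each of b9 and b10.  The 0.05
significance level, the function names, and all numerical inputs are
assumptions made for illustration only.

```python
import math

def lr_statistic(logl_restricted, logl_unrestricted):
    """Likelihood ratio statistic for two nested models."""
    return 2.0 * (logl_unrestricted - logl_restricted)

def chi2_tail_2df(x):
    """P(chi-square with 2 degrees of freedom > x); equals exp(-x/2)."""
    return math.exp(-x / 2.0)

def choose_representation(lr, t_b9, t_b10, alpha=0.05):
    """Apply the rule in the text to in-vehicle time (b9) and cost (b10)."""
    if chi2_tail_2df(lr) >= alpha:
        # b9 and b10 add no significant explanatory power.
        return {"IVTT": "generic", "cost": "generic"}
    small_b9 = abs(t_b9) < 1.0
    small_b10 = abs(t_b10) < 1.0
    if small_b9 and not small_b10:
        return {"IVTT": "generic", "cost": "mode specific"}
    if small_b10 and not small_b9:
        return {"IVTT": "mode specific", "cost": "generic"}
    # Neither or both t-statistics are small: keep both mode specific.
    return {"IVTT": "mode specific", "cost": "mode specific"}

# Hypothetical estimation results (not from the course data).
lr = lr_statistic(logl_restricted=-182.40, logl_unrestricted=-178.10)
result = choose_representation(lr, t_b9=0.6, t_b10=-2.4)
print(round(lr, 2), result)
```

With these hypothetical inputs the test statistic is 8.6, which is
significant at the assumed level, and only b9 has a small t-statistic,
so in-vehicle travel time would be treated as generic and cost as mode
specific.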

   6.5e  Comparisons of Non-Nested Models -- The Modified Likelihood
         Ratio Test
   All of the tests of models that have been discussed so far have been
formulated as tests of whether a particular variable or group of
variables should be dropped from a model.  Not all tests can be
formulated this way.  For example, suppose that two logit models of mode
choice are under consideration.  It is desired to determine which model
best explains the available data.  Let the deterministic components of
the utility functions of these models be

   Model 1: V  =  a1T + a2C                                     (6.18)

   Model 2: V  =  b1log T + b2C,                                (6.19)

where T and C, respectively, denote travel time and travel cost, and the
a's and b's are constant coefficients.  The t and likelihood ratio test
procedures discussed in the preceding subsections cannot be used to
determine which model is best because neither model can be obtained by
adding variables to or dropping variables from the other.  Models that
cannot be obtained from one another by addition or deletion of variables
are said to be non-nested.
   Intuitively, one might expect that if one of two non-nested models
explains the available data better than the other model does, then the
better model should yield a larger maximum value of the sample log
likelihood.  Thus, one might expect that a test similar to a likelihood
ratio test can be developed for testing non-nested models against one
another.  This expectation is correct.

   The modified likelihood ratio test procedure is as follows.  Let the
non-nested models be called models 1 and 2. Let log L1 and log L2,
respectively, denote the maximum values of the sample log likelihood for
models 1 and 2, and let K1 and K2 denote the numbers of estimated
coefficients in the two models. (For example, in the models represented
by Equations (6.18) and (6.19), K1 = K2 = 2.) Assume that log L1 >=
log L2, which suggests that model 1 is preferred to model 2. (If log
L1 < log L2, then renumber the models to make log L1 >= log L2.)
Define the modified likelihood ratio test statistic, MLR, by

   MLR  =  (log L1  -  K1 /2) - (log L2  -  K2 /2).           (6.20)

If MLR > 1.35, then model 1 explains the available data substantially
better than does model 2. Moreover, model 2 almost certainly is
misspecified and should be dropped from further consideration.
   The following example illustrates the comparison of two non-nested
models using the modified likelihood ratio test.

   Example 6.7 -- Comparison of Non-Nested Models

   Consider the logit mode choice models whose utility functions are as
follows:

   Model 1: V  =  a1log IVTT  +  a2log OVTT + a3C              (6.21)

   Model 2: V  =  b1T  +  b2C,                                  (6.22)

where T, IVTT, OVTT, and C are total travel time, in-vehicle travel time,
out-of-vehicle travel time, and travel cost, respectively, and the a's
and b's are constant coefficients.  Suppose that maximum likelihood
estimation of the two models yields the results log L1 = -437.7 and
log L2 = -440.2.  There are 3 estimated coefficients in model 1 and 2
in model 2.  Therefore, K1 = 3 and K2 = 2.  The value of the modified
likelihood ratio test statistic is

   MLR  =  ( -437.7 - 3/2) - ( -440.2 - 2/2)  =  2.00.

Since MLR exceeds 1.35, model 1 explains the data better than model 2
does, and model 2 is almost certainly incorrect.
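
   The arithmetic of Example 6.7 can be reproduced with a few lines of
Python.  The function name and argument names below are illustrative
choices; the formula is Equation (6.20) and the threshold of 1.35 is the
one given in the text.

```python
def modified_lr(logl_1, k_1, logl_2, k_2):
    """Modified likelihood ratio statistic of Equation (6.20)."""
    if logl_1 < logl_2:
        # Renumber so that model 1 has the larger log likelihood.
        logl_1, k_1, logl_2, k_2 = logl_2, k_2, logl_1, k_1
    return (logl_1 - k_1 / 2) - (logl_2 - k_2 / 2)

# Values from Example 6.7.
mlr = modified_lr(logl_1=-437.7, k_1=3, logl_2=-440.2, k_2=2)
print(round(mlr, 2))      # prints 2.0
print(mlr > 1.35)         # prints True
```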

   6.6   Some Estimation Problems and How to Avoid Them

   There are certain kinds of specification errors that make maximum
likelihood estimation of logit models impossible.  If one or more of
these specification errors occurs, estimation software will terminate
abnormally and, possibly, produce an error or warning message.  Any
estimates that are obtained under these conditions will be meaningless. 
The most frequently arising specification errors that make estimation
impossible will now be described.

      a.  Use of too many alternative-specific constants: In most
practical situations, the utility functions of logit mode choice models
include alternative-specific constants.  As was discussed in Section 4.6
of Module 4, the number of such constants that are included in a model
must not exceed the number of modes in the model minus one.  If the
number of alternative-specific constants equals the number of modes,
there will not be a unique set of coefficient values that maximizes the
sample log likelihood.  This usually will cause estimation software to
terminate abnormally or to produce a message indicating that an
estimation problem has occurred.
      b.  Incorrect specification of socioeconomic variables:
Socioeconomic variables, such as income and automobile ownership, have
the same values for all alternatives.  As was discussed in Section 5.3
of Module 5, these variables can enter a logit model only if they are
mode-specific or are multiplied or divided by an attribute variable
whose values differ
across alternatives.  Socioeconomic variables have no effect on choice
probabilities if they enter a logit model's utility function as generic
variables that do not interact with variables whose values vary across
alternatives.  As a result, when generic socioeconomic variables that do
not interact with other variables are present in a logit model, there is
not a unique set of coefficient values that maximizes the sample log
likelihood, and abnormal termination of estimation software occurs.
   As was also discussed in Section 5.3, the number of mode-specific
socioeconomic variables must not exceed the total number of modes in the
model minus one.  Violation of this rule will cause coefficient
estimation to fail and logit estimation software to terminate abnormally
or produce an error message.

      c.  Perfect collinearity of variables: Perfect collinearity is a
condition in which one or more variables of the utility function are
exact linear combinations of other variables.  For example, suppose that
T, IVTT, and OVTT denote total travel time, in-vehicle travel time, and
out-of-vehicle travel time, respectively.  Let the utility function of a
logit mode choice model be specified as

   V  =  b1T  +  b2IVTT  +  b3OVTT  +  other terms.            (6.23)

Then perfect collinearity exists because T is an exact linear
combination of IVTT and OVTT.  Specifically, T  =  IVTT + OVTT.  The
problem that perfect collinearity poses for estimation can be seen by
rewriting (6.23) in the form

   V  =  b1 (IVTT + OVTT) + b2IVTT + b3OVTT + other terms      (6.24)

      =  (b1 + b2 )IVTT + (b1 + b3 )OVTT + other terms.       (6.25)

Equation (6.25) shows that predictions of choice depend only on the
values of b1 + b2 and b1  +  b3.  But there are infinitely many
combinations of b1, b2 and b3 that yield the same values of b1 + b2
and b1 + b3.  As a result, it is not possible to find a unique set of
b values that maximizes the model's log likelihood, and attempts to
estimate the coefficients will result in abnormal termination of the
software.
   Perfect collinearity always causes there to be infinitely many
combinations of coefficient values that maximize the log likelihood and,
for this reason, logit estimation software terminates abnormally or
produces an error message when it occurs.

      d.  Models with one or more unbounded coefficients: It is possible
to create models in which some of the variables, when multiplied by an
infinite coefficient, perfectly explain the choices of a subset of the
travelers in the estimation data set without affecting the estimated
choice probabilities for the other travelers.  For example, suppose a
certain variable always equals one for observed transit users and always
equals zero for other travelers. (Such a variable might represent a
special preference for transit that only transit users are thought to
have.) Then, if the utility function coefficient of this variable is
infinity, the predicted probability of choosing transit will be 1 for
all observed transit users, but the variable will have no effect on the
choice probabilities of travelers not observed to choose transit.
   When this condition occurs, estimation software will not be able to
find a set of coefficient values that maximizes the log likelihood
(because a computer cannot represent an infinite constant).  The
estimation software will terminate abnormally, possibly after reporting
a series of numerical overflows and usually with an indication that the
coefficient values that maximize the log likelihood have not been
found.  Restarting
the estimation procedure will consume computer time but will not produce
successful maximum likelihood estimates of the coefficients.
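
   The reasons that estimation fails in cases a through d can be seen in
a small numerical sketch.  The Python code below is an illustration
under stated assumptions: the utilities, coefficients, and the
six-traveler sample are all hypothetical, and the sample for case d
assumes a binary choice in which the only term in the utility difference
is the coefficient c times the special transit variable.

```python
import math

def logit_probs(utilities):
    """Logit choice probabilities for one traveler's list of utilities."""
    top = max(utilities)  # subtract the maximum for numerical stability
    expv = [math.exp(v - top) for v in utilities]
    total = sum(expv)
    return [e / total for e in expv]

# Cases a and b: a term that has the same value for every mode (an extra
# constant, or a generic socioeconomic variable) cancels out of the
# choice probabilities, so its coefficient cannot be estimated.
base = [1.5 - 0.1 * 20.0, -0.1 * 30.0]   # hypothetical auto, bus utilities
shifted = [v + 7.0 for v in base]         # the same 7.0 added to both modes
same_probs = all(math.isclose(p, q)
                 for p, q in zip(logit_probs(base), logit_probs(shifted)))

# Case c: with T = IVTT + OVTT, utility depends only on b1 + b2 and
# b1 + b3, so different coefficient sets give identical utilities.
def utility_6_23(b1, b2, b3, ivtt, ovtt):
    return b1 * (ivtt + ovtt) + b2 * ivtt + b3 * ovtt

v_first = utility_6_23(-0.05, -0.02, -0.08, 20.0, 10.0)
v_second = utility_6_23(-0.15, 0.08, 0.02, 20.0, 10.0)
same_utility = math.isclose(v_first, v_second)

# Case d: the variable equals 1 for observed transit users and 0 for
# everyone else.  The log likelihood keeps rising as c grows, so no
# finite value of c maximizes it.
def log_likelihood(c, n_transit=3, n_auto=3):
    # Transit users: P(transit) = 1 / (1 + exp(-c)).
    # Auto users: the variable is 0, so P(auto) = 0.5 whatever c is.
    return -n_transit * math.log1p(math.exp(-c)) + n_auto * math.log(0.5)

ll_values = [log_likelihood(c) for c in (1.0, 5.0, 10.0, 20.0)]
keeps_rising = all(a < b for a, b in zip(ll_values, ll_values[1:]))

print(same_probs, same_utility, keeps_rising)
```

In each case the data cannot single out one best set of coefficient
values, which is why the estimation software fails rather than
returning usable estimates.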

   6.7   Conclusions

   This module has explained the method normally used to estimate the
coefficients of logit mode choice models.  It has also described
statistical procedures that can guide the selection of variables for and
testing of logit models.  The statistical procedures are invaluable aids
in model development, but it is important to understand that the use of
statistical methods alone cannot guarantee the development of a
satisfactory model.  As was discussed in Section 5.1 of Module 5, model
development is as much an art as a science, and judgment and experience
are important elements of the art.
   The need for judgment and experience, even when "objective"
statistical methods are available, arises mainly from the fact that
statistical tests cannot determine when a model is correct (or, at
least, sufficiently free of serious errors to be satisfactory for its
intended uses).  They can only determine that a model is wrong.  If a
model is found to be wrong, statistical methods rarely provide useful
insight into why it is wrong or how to correct it.  The analyst must use
judgment and experience to identify likely sources of error in the model
and to formulate modifications of the model that might remove the
errors.  The modified models then can be subjected to statistical tests
to determine whether they are seriously erroneous.  Thus, practical
model development always involves alternating between statistical
analysis and judgmental activities.

                                EXERCISES


6.1   Suppose you want to develop a binomial logit model of choice
      between automobile and bus for the work trip.  The utility
      function of the model is

            V  =  aT,

      where T is total travel time in minutes, and a is a constant
      coefficient.  Suppose that the estimation sample consists of the
      following three observations:

                                             Travel Time (Min.)
            Person      Chosen Mode     Automobile         Bus

               1           Auto             20             25
               2           Auto             25             40
               3            Bus             30             40

      (Of course a real model would have a much more complicated utility
      function and would be estimated using a much larger data set. 
      However, this simple model and small data set are exactly right
      for this exercise.)

      a. What is the log likelihood of this sample according to the
         specified model? (Hint: Refer to Example 6.1.)
      b. Evaluate the log likelihood for a = -0.025, -0.045, and -0.065.
         Which value of a yields the largest value of the log
         likelihood?
      c. Can you find a value of a that yields a larger value of the log
         likelihood? (Hint: Evaluate the log likelihood using two new
         values of a, one that is slightly larger and one that is
         slightly smaller than the value that yielded the largest log
         likelihood in part b.)

6.2   In a logit model of choice between automobile and bus for travel
      to work, the utility function has the form

         V  =  a1IVTT + a2OVTT + a3LC + a4PC,

where IVTT denotes in-vehicle travel time, OVTT denotes walk time, LC
denotes linehaul cost (e.g., cost of automobile fuel and maintenance,
bus fare), PC denotes parking cost, and the a's are constant
coefficients.  The values of the coefficients were estimated by maximum
likelihood, and the following results were obtained:

                                    Estimated   Standard Error
     Coefficient     Variable         Value       of Estimate

         a1           IVTT           -1.72          0.79

         a2           OVTT           -2.36          0.52

         a3            LC            -0.79          0.55

         a4            PC            -0.84          0.41

Log likelihood: -179.37

      a. Do the estimation results suggest that there are variables of
         the model that do not significantly affect mode choice and,
         therefore, are candidates for being dropped?  If so, which one
         or ones?
      b. Since the estimated values of a3 and a4 are close, you would
         like to determine whether LC and PC can be combined into the
         single variable, total travel cost.  What model should be
         estimated in order to make this determination?  Write the
         utility function of the model.  What estimation results will
         you use to decide whether to retain the decomposition of
         travel cost into separate components?
   c. A second analyst has suggested adding two additional variables to
      the model: CBD, which is a variable equal to 1 for the automobile
      mode if the traveler works in the CBD; and TR, the number of
      transfers required if transit is used.  If the second analyst's
      suggestion is accepted, the utility function will be

         Va  =  a1IVTTa  +  a2OVTTa  +  a3LCa  +  a4PC  +  a5CBD

         Vb  =  a1IVTTb + a2OVTTb  +  a3LCb  +  a6TR,

      where the subscripts a and b denote automobile and bus,
      respectively, and it is assumed that there is no parking cost if
      bus is chosen.  Estimation of this model yielded the results:

                                    Estimated   Standard Error
     Coefficient     Variable         Value       of Estimate

         a1           IVTT           -1.65          0.83

         a2           OVTT           -2.53          0.61

         a3            LC            -0.73          0.52

         a4            PC            -0.72          0.59

         a5            CBD           -0.23          0.47

         a6            TR            -0.09          0.39

Log likelihood: -178.22

      Based on these results, do you believe the variables suggested by
      the second analyst should be retained in the model?

      d. A third analyst thinks that travelers may find out-of-vehicle
         travel time less onerous for long trips than for short ones. 
         This analyst has proposed specifying the utility function as

         V  =  a1IVTT + a2OVTT/DIST + a3LC + a4PC,

         where DIST denotes travel distance.  Estimation of this model
         yielded the results:

                                    Estimated   Standard Error
     Coefficient     Variable         Value       of Estimate

         a1           IVTT           -1.43          0.69

         a2         OVTT/DIST        -0.27          0.10

         a3            LC            -0.64          0.43

         a4            PC            -0.89          0.38

Log likelihood: -177.53

         Based on the estimation results, does the model with OVTT/DIST
         explain the data significantly better than does the model with
         OVTT?

                                MODULE 7

                   PREDICTION WITH DISAGGREGATE MODELS

7.1   Introduction

   One of the main objectives of transportation analysis is to support
the evaluation of transportation plans and policies and, thereby, to aid
the transportation decision-making process.  The evaluation is based, in
part, on the predicted effects of alternative capital investment and
operating decisions on travel flows, levels of service, and external or
non-user impacts.  The decision process requires information about
aggregate travel volumes because these are important measures of system
performance and they affect travel service and external impacts.  Thus,
aggregate travel volumes are important outputs of travel demand
prediction models.
   The preceding modules have described the formulation and estimation
of models of individual travel behavior.  The decision to use models of
individual behavior, or disaggregate models, is based on theoretical and
empirical evidence that such models reduce data collection costs and are
necessary to properly capture the effects of changes in population
characteristics and transportation service attributes on travel
behavior.  To use these models for making predictions of aggregate
travel volumes, a method is needed to aggregate the model's predictions
of the behavior of individuals.  This module describes methods for
obtaining aggregate forecasts from disaggregate models.
   Figure 7.1 summarizes the principal steps involved in developing a
disaggregate model and using it to make predictions of aggregate travel.
The main flow of activities is shown on the lower line.  The first block
on the left represents model formulation, which includes identification
of the set of alternatives and selection of variables.  The next block
represents estimation of the model using a disaggregate data set.  The
results of the estimation process may lead to revisions of the model
formulation as indicated by the broken feedback arrow.  The next block
represents the process of prediction using the estimated disaggregate
model and predicted values of the model's variables.  The predicted
values of the variables describe anticipated future conditions,
including the effects of policy measures.  The final block represents
the output of the modeling and prediction process: predicted aggregate
travel volumes conditional on the predicted values of the model's
variables.

                               Figure 7.1

     Development and Application of Disaggregate Mode Choice Models

   The preceding modules have discussed the formulation and estimation
of disaggregate mode choice models.  This module is concerned with
methods for using an estimated disaggregate model to make aggregate
predictions.  Section 7.2 reviews the reasons for estimating
disaggregate rather than aggregate models.  Section 7.3 describes the
relations between aggregate and disaggregate travel behavior.  Section
7.4 describes and evaluates three methods that can be used to make
aggregate predictions with disaggregate models, and Section 7.5 presents
an example of the application of one of these procedures.

   7.2   Reasons for Estimating Disaggregate Models

   Since special methods are needed to obtain aggregate predictions
from disaggregate models, it is reasonable to ask why disaggregate
models should be used for making such predictions.  It may seem that a
better procedure would be to estimate models directly from aggregate
data and use the resulting aggregate models to make aggregate
predictions.  There are two important reasons for not basing
aggregate predictions on models
estimated from aggregate data.  First, estimation of models from
aggregate data does not use the data efficiently.  It wastes some of the
information contained in the data.  For example, aggregate data sets
often are obtained by computing average values of demographic
characteristics and travel behavior of individuals living in the same
geographical area (usually a traffic zone or district).  The use of such
average values discards information about differences among individuals
within the same district or zone.  The loss of such information, which
is expensive to collect, is a waste of resources.  It is particularly
important to avoid this waste in an era when data collection costs are
high and the resources available for transportation studies are limited. 
The use of disaggregate models, which use data more efficiently than do
aggregate models, makes it possible to collect less data than otherwise
would be needed and, thereby, to conserve planning resources.
   The second important reason for estimating disaggregate models rather
than aggregate ones is that estimation of models from aggregate data
often yields parameter estimates that do not correctly reflect the
relations that influence travel behavior.  Such incorrect estimation
will lead to an improper understanding of travel behavior and may result
in the design and selection of transportation alternatives that are not
effective in meeting public objectives.
   The incorrect estimation of parameters when aggregate data are used
results from complicated statistical relations in the data.  These
relations are difficult to explain in non-technical terms.  As an
alternative, two examples will be used to illustrate the types of errors
that can occur.  The first example is based on estimation of a linear
regression model of household trip generation as a function of
automobile ownership.  The second example is based on estimation of a
logit model of choice between
automobile and bus.  The linear regression example illustrates the
effects of data aggregation in a familiar context.  The mode choice
example illustrates these effects in the travel prediction context that
is the subject of this course.
   Each example assumes the existence of an underlying relation that
describes the behavior of the individual or household.  This relation is
used to generate simulated data about individuals or households.  The
data are then aggregated in various ways, and the aggregated data are
used to estimate models of trip generation and mode choice.  Finally,
the models estimated from the aggregated data are compared with the
original models to determine the extent to which the aggregate models
recover the true parameter values.

   Example 7.1:   A Linear Regression Model of Trip Generation

   Assume that the number of daily trips made by the members of a
household is related to the number of automobiles owned by the household
as follows:

      Trips =  2 + 2A + U,                                         (7.1)

where A is the number of automobiles owned, and U is a random term that
represents the effects on trip generation of variables other than
automobile ownership.  Assume that the probability distribution of U is:

                           U         Probability

                          +2             0.1
                          +1             0.2
                           0             0.4
                          -1             0.2
                          -2             0.1

This distribution makes it possible to compute the probability that the
number of trips made by a household is 2 + 2A - 2, 2 + 2A - 1, and so
forth up to 2 + 2A + 2. For example, the probability that the number of
trips is 2 + 2A - 2 is 0.1.
   Suppose that the following data on trip generation by a sample of
households residing in 3 traffic districts have been obtained:

      District 1        District 2         District 3
      Autos   Trips     Autos   Trips     Autos  Trips

        1       2         1       4         1      5

        1       3         1       4         1      5

        1       3         1       4         1      6

        1       4         2       6         2      7

        2       4         2       6         2      7

        2       5         2       6         2      8

        2       5         2       6         3      8

        3       6         3       8         3      9

        3       7         3       8         3      9

        3       7         3       8         3     10


   If linear regression is used to estimate the relation between trips
and automobiles owned from the individual data, the resulting estimated
model is

            Trips =  2 + 2A.

Thus, the dependence of trip generation on automobile ownership that is
given by the original model is recovered exactly.  This dependence is
illustrated by the solid line in Figure 7.2.

   Now suppose that the data are summarized by district average values
as follows:

              District 1          District 2     District 3
             Autos   Trips       Autos  Trips    Autos  Trips

Total         19      46          20     60       21     74

Average       1.9     4.6         2.0    6.0      2.1    7.4

If linear regression is used to estimate average trips per household
from these aggregated data, the resulting estimated model is

         Trips =  -22 + 14A.                                       (7.2)

This relation is illustrated by the dashed line in Figure 7.2. Although
the aggregate model correctly replicates the district averages, as
illustrated by the points in Figure 7.2, it incorrectly predicts that
increases in automobile ownership increase trips at the rate of 14 trips
per automobile on the average rather than two trips per automobile.  An
error of this magnitude would cause seriously erroneous predictions of
future travel under conditions of changed automobile ownership.

                               [Figure 7.2]

   The result just obtained is specific to the districts that were used
for aggregating the data.  The use of different districts produces
different estimation results.  For example, if the boundaries of the
districts are changed in a way that causes the first 5 households in
district 2 to be reassigned to district 1 and the last 5 to be assigned
to district 3, the total and average values of the data become:

                    District 1               District 3
                  Autos     Trips          Autos     Trips
   Total           26        70             34        110
   Average        1.73      4.67           2.27      7.33

Application of linear regression to these aggregate data yields the
estimated relation

         Trips =  -4 + 5A.                                         (7.3)

This relation is illustrated by the dashed line in Figure 7.3. Although
Equation (7.3) is closer than is Equation (7.2) to the original model,
it still seriously overestimates the effect of automobile ownership on
trip generation.
   In practice, it is not possible to know how close a particular set of
parameter values estimated from aggregate data is to the true parameter
values.  Therefore, it is not possible to select districts or other
groupings of the data that can be certain of producing parameter
estimates that are close to the true values.
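
   The three regressions of Example 7.1 can be reproduced with the
following Python sketch.  The household data are those tabulated above;
the helper functions `ols` and `mean` are illustrative names, and
ordinary least squares with a single explanatory variable is computed
directly rather than with a statistics package.

```python
def ols(xs, ys):
    """Intercept and slope of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def mean(values):
    return sum(values) / len(values)

# Autos owned and daily trips for the 10 sampled households per district.
autos = {1: [1, 1, 1, 1, 2, 2, 2, 3, 3, 3],
         2: [1, 1, 1, 2, 2, 2, 2, 3, 3, 3],
         3: [1, 1, 1, 2, 2, 2, 3, 3, 3, 3]}
trips = {1: [2, 3, 3, 4, 4, 5, 5, 6, 7, 7],
         2: [4, 4, 4, 6, 6, 6, 6, 8, 8, 8],
         3: [5, 5, 6, 7, 7, 8, 8, 9, 9, 10]}

# Regression on the 30 individual households.
individual = ols(autos[1] + autos[2] + autos[3],
                 trips[1] + trips[2] + trips[3])

# Regression on the three district averages, as in Equation (7.2).
district = ols([mean(autos[d]) for d in (1, 2, 3)],
               [mean(trips[d]) for d in (1, 2, 3)])

# Regression after splitting district 2 between districts 1 and 3,
# as in Equation (7.3).
group_1 = (autos[1] + autos[2][:5], trips[1] + trips[2][:5])
group_3 = (autos[3] + autos[2][5:], trips[3] + trips[2][5:])
regrouped = ols([mean(group_1[0]), mean(group_3[0])],
                [mean(group_1[1]), mean(group_3[1])])

for label, (intercept, slope) in [("individual", individual),
                                  ("3 districts", district),
                                  ("2 districts", regrouped)]:
    print(f"{label}: Trips = {intercept:.1f} + {slope:.1f}A")
```

The same 30 households give a slope of 2, 14, or 5 trips per automobile
depending only on whether and how they are grouped before estimation.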

   A similar pattern of bias in parameter estimates occurs when
aggregate mode share models are estimated instead of disaggregate mode
choice models.  Bias in aggregate mode share models is illustrated by
the following example.

                               [Figure 7.3]

   Example 7.2: Aggregate Mode Share Models

   Let choice between automobile and bus for travel to work be described
by the binomial logit model whose utility function is

   Va    =  1.5  -  0.1Ta                                        (7.4)

   Vb  =         -  0.1Tb,                                       (7.5)

where the subscripts a and b signify automobile and bus, respectively,
and T is travel time in minutes.  Substitution of these utility
functions into the formula for binomial logit choice probabilities
(Equation (4.3) of Module 4) yields the following probability that
automobile is chosen:
                           1
      Pa  =   ------------------------------------                (7.6)
               1 + exp[-1.5 - 0.1(Tb - Ta)]

In a large sample, the proportion of individuals with a given value of
Tb- Ta who choose automobile is approximately equal to the probability
that a single individual with the same value of Tb - Ta chooses
automobile.  The following simulated data set is based on these
relations with individuals assigned to two traffic districts:

          District 1                         District 2
      Tb - Ta     Number Choosing     Tb  -  Ta    Number Choosing

       (min.)     Auto       Bus          (min.)     Auto       Bus
         20        49         1             15        47         3
         15        48         2             10        45         5
         10        48         2             10        45         5
          5        46         4              0        39        11
          0        43         7             -5        34        16
         -5        39        11            -10        31        19

Estimation of the utility functions for automobile and bus using the
methods described in Module 6 and the foregoing individual data yields

         Va  =  1.496 - 0.101Ta                                  (7.7)

         Vb  =        - 0.101Tb.                                 (7.8)

The estimated utility function is almost identical to the original one
given in Equations (7.4) and (7.5).
   Now suppose that the travel times and mode choices are averaged
according to district to yield:

                                   District 1        District 2
Average Value of Tb - Ta (min.)     7.5               2.5
Auto Volume                         273               238
Bus Volume                           27                62
Auto Share                            0.910             0.793
Bus Share                             0.090             0.207

The estimated utility function corresponding to this aggregate data set
is

   Va  =  0.861 - 0.194Ta                                        (7.9)

   Vb  =        - 0.194Tb.                                      (7.10)

The model based on this utility function is illustrated in Figure 7.4.
As in the linear regression example, use of aggregate data to estimate a
model has yielded seriously biased parameter estimates.
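
   The aggregate estimates in Equations (7.9) and (7.10) can be
reproduced from the district totals.  The Python sketch below assumes
that the aggregate model is fitted so that it reproduces the two
observed district shares exactly, which is what any fit of a
two-parameter model to two groups will do; the course text does not
state the estimation method that was used, and the function name is an
illustrative choice.

```python
import math

def fit_two_group_logit(time_diffs, auto_counts, bus_counts):
    """Constant and time coefficient that reproduce two observed shares.

    Solves log(auto / bus) = const + coef * (Tb - Ta) for two groups.
    In utility form this is Va = const - coef * Ta and Vb = -coef * Tb.
    """
    x1, x2 = time_diffs
    y1 = math.log(auto_counts[0] / bus_counts[0])
    y2 = math.log(auto_counts[1] / bus_counts[1])
    coef = (y1 - y2) / (x1 - x2)
    const = y1 - coef * x1
    return const, coef

# District averages of Tb - Ta and district mode volumes from the table.
const, coef = fit_two_group_logit(time_diffs=(7.5, 2.5),
                                  auto_counts=(273, 238),
                                  bus_counts=(27, 62))
print(round(const, 3), round(coef, 3))   # prints 0.861 0.194
```

The fitted time coefficient is nearly twice the true value of 0.1 even
though the model matches both district shares exactly.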

                               [Figure 7.4]

   As in the linear regression case, grouping the data into different
geographical areas produces differently biased parameter estimates.  For
example, the districts that have been defined can be divided into two
traffic zones each such that the average time differences and mode
shares are as shown in the following table and Figure 7.5:

                                   Zone      Zone      Zone      Zone
                                    1A        1B        2A        2B

Average Value of Tb - Ta (min.)    12.5       5.0       5.0       0.0

Auto Volume                        97       176       128       110

Bus Volume                          3        24        22        40

Auto Share                          0.970     0.880     0.853     0.733

Bus Share                           0.030     0.120     0.147     0.267


The estimated utility function based on the zonal data is

         Va  =  0.987  - 0.187Ta                                (7.11)

         Vb  =         - 0.187Tb.                               (7.12)

   This estimate is different from the one obtained using the district
data and from the original model of Equations (7.4) and (7.5). Still
different estimates would be obtained using different groupings of the
data.
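
With four zones and two parameters the model can no longer fit every
share exactly, so the parameters must be found by maximizing the
grouped-data log likelihood.  The following Python sketch does this with
Newton's method.  It assumes that the reported estimates are ordinary
maximum likelihood estimates; the function name and the fixed iteration
count are choices made for this illustration only.

```python
import math

# Zonal data from the table above:
# (average Tb - Ta in minutes, auto volume, bus volume).
zones = [(12.5, 97, 3), (5.0, 176, 24), (5.0, 128, 22), (0.0, 110, 40)]

def fit_grouped_binary_logit(data, iterations=25):
    """Fit Pa = 1/(1 + exp(-(c + s*x))) to grouped counts by Newton's method."""
    c, s = 0.0, 0.0
    for _ in range(iterations):
        g_c = g_s = h_cc = h_cs = h_ss = 0.0
        for x, auto, bus in data:
            n = auto + bus
            p = 1.0 / (1.0 + math.exp(-(c + s * x)))
            resid = auto - n * p       # observed minus predicted auto volume
            w = n * p * (1.0 - p)      # curvature weight for this zone
            g_c += resid
            g_s += resid * x
            h_cc += w
            h_cs += w * x
            h_ss += w * x * x
        det = h_cc * h_ss - h_cs * h_cs
        # Newton step: solve the 2x2 system H * delta = gradient.
        c += (h_ss * g_c - h_cs * g_s) / det
        s += (h_cc * g_s - h_cs * g_c) / det
    return c, s

const, slope = fit_grouped_binary_logit(zones)
print(f"constant = {const:.3f}, time coefficient = {-slope:.3f}")
```

The fitted values should land close to the constant and time coefficient
in Equations (7.11) and (7.12).
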
   This example shows that, as with linear regression models, use of
aggregate data to estimate mode choice models is likely to produce
incorrect parameter estimates.  The errors in parameter estimation
depend on the grouping of observations.  The resulting models will give
incorrect predictions of the effects of changes in automobile or bus
travel times.  This failure to properly represent the effects of changes
in travel time occurs despite the fact that the estimated model
accurately replicates the aggregate mode shares in the data (as shown in
Figures 7.4 and 7.5). The estimation errors that are caused by use of
aggregate data typically increase in severity as explanatory variables
are added to the model.


   Despite these problems in estimating models from aggregate data,
planning practice requires information on the aggregate share of
individuals choosing each mode.  Thus, it is necessary to develop
methods for correctly estimating models of travelers' responses to
service changes and for correctly predicting aggregate mode shares and
volumes.  The preceding modules have described the formulation and
estimation of disaggregate models that correctly describe travelers'
responses to service changes.  The following sections of this module
describe how to use disaggregate models of individual choice to predict
aggregate shares and volumes.

7.3   Relation between Individual Choices and Mode Shares

   If the choices of individuals are known with certainty, the number of
individuals choosing a given mode can be obtained by simple counting. 
The aggregate share of the mode is obtained by dividing the total number
of individuals choosing the mode by the total number of individuals in
the sample.  This is what was done to obtain the mode shares in Example
7.2. For example, in the two-district case, the automobile share for
district 1 was obtained by counting the individuals choosing car (273 in
district 1) and dividing by the total number of individuals in the
district 1 sample (300), thereby obtaining the share of 0.910.
   When only probabilities of choices are known, rather than actual
choices, the number of individuals predicted to choose a mode is the sum
of the probabilities of choosing that mode for all of the individuals in
the population of travelers.  Thus, for example, if the available modes
are automobile and bus, the numbers of individuals predicted to choose
each are

         N_{auto} = \sum_{n} P_{auto}(n),                        (7.13)

and

         N_{bus} = \sum_{n} P_{bus}(n),                          (7.14)


where Pauto(n) and Pbus(n), respectively, are the probabilities that
individual n chooses automobile and bus.  The corresponding mode shares
are obtained by dividing the predicted numbers choosing each mode by the
total number of individuals in the population.  That is


   S_{auto} = \frac{1}{N} \sum_{n} P_{auto}(n)                   (7.15)

and

   S_{bus} = \frac{1}{N} \sum_{n} P_{bus}(n),                    (7.16)

where Sauto and Sbus, respectively, denote the automobile and bus
shares, and N is the total number of travelers by both modes combined. 
Thus, the predicted share of a mode is the average value of its choice
probability in the population of travelers.
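
As a quick consistency check on Equations (7.15) and (7.16): because
each traveler's two choice probabilities sum to one in a binary model,
the predicted shares also sum to one.  The subscripted LaTeX symbols
below are a typesetting choice made for this note.

```latex
S_{auto} + S_{bus}
  = \frac{1}{N}\sum_{n}\left[P_{auto}(n) + P_{bus}(n)\right]
  = \frac{1}{N}\sum_{n} 1
  = 1 .
```
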
   The following example illustrates this prediction process using a
binomial logit model of choice between automobile and bus.

   Example 7.3:   Aggregate Prediction with a Disaggregate Model

   Let the logit model's utility function be

   Va  =   0.5 - 0.1Ta + 0.5A                                   (7.17)


   Vb  =       - 0.1Tb,                                         (7.18)

where the subscripts a and b denote automobile and bus, respectively, T
denotes travel time in minutes, and A is the number of automobiles owned
by the traveler's household.  Substitution of this utility function into
Equation (4.3) yields the following formula for the probability that
automobile is chosen:

   P_a = \frac{1}{1 + \exp[-0.5 - 0.1(T_b - T_a) - 0.5A]}       (7.19)


   The following table gives the values of Pa corresponding to
various  values of Tb - Ta and A:

                 A  =  1                            A  = 2

      Tb - Ta  No. of                  Tb - Ta  No. of
       (min.)     Cases      Pa          (min.)     Cases      Pa

         10        20       0.881           30        20       0.989
          5        20       0.818           25        20       0.982
          0        20       0.731           20        20       0.971
         -5        20       0.622           15        20       0.953
         -10       20       0.500           10        20       0.924
         -15       20       0.378            5        20       0.881

The number of individuals predicted to choose automobile is obtained by
multiplying the number of cases for each time difference and automobile
ownership level by the corresponding probability that automobile is
chosen and then summing over all time differences and automobile
ownership levels.  The predicted automobile share is obtained by
dividing the number of individuals predicted to choose automobile by the
number of individuals under consideration.  The results of these
computations are:

   Number predicted to choose auto:    192.6

   Predicted auto share:                   0.802.
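
The computation in this example can be reproduced with a few lines of
Python.  The sketch below is illustrative only; the function p_auto and
the list layout are names chosen here, and the probability formula is
the binary logit formula applied to the utilities of Equations (7.17)
and (7.18).

```python
import math

def p_auto(time_diff, autos):
    """Binary logit probability of choosing automobile.

    time_diff is Tb - Ta in minutes; autos is household auto ownership A.
    Va - Vb = 0.5 + 0.1 * (Tb - Ta) + 0.5 * A.
    """
    return 1.0 / (1.0 + math.exp(-(0.5 + 0.1 * time_diff + 0.5 * autos)))

# One entry per table cell: (Tb - Ta, A, number of cases).
cells = ([(t, 1, 20) for t in (10, 5, 0, -5, -10, -15)]
         + [(t, 2, 20) for t in (30, 25, 20, 15, 10, 5)])

n_total = sum(cases for _, _, cases in cells)
# Equation (7.13): sum the choice probabilities over all individuals.
n_auto = sum(cases * p_auto(t, a) for t, a, cases in cells)
# Equation (7.15): the share is the average choice probability.
share_auto = n_auto / n_total

print(f"Number predicted to choose auto: {n_auto:.1f}")
print(f"Predicted auto share: {share_auto:.3f}")
```
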

   The method used in the foregoing example to obtain aggregate
predictions from a disaggregate model is straightforward.  However, to
use this method in practice, it would be necessary to compute mode
choice probabilities for each individual in the population of interest.  In the
example, this computation was easy because the numbers of travel time
differences and automobile ownership levels were small.  In practice,
there would be many more levels, and an impractically large data set
would be needed to compute choice probabilities for every individual. 
Therefore, it is necessary in practice to have procedures that do not
require enumerating the entire population to obtain aggregate
predictions from disaggregate models.  Several practical procedures are
described and evaluated in the next section.

7.4   Methods for Aggregate Prediction with Disaggregate Models

   This section describes the three most frequently used methods for
making aggregate predictions with disaggregate models.  The first is the
naive method, which is popular because of its simplicity and its
similarity to methods used in prediction with aggregate models.  The
second is the market segmentation method, which is also easy to apply
and can give substantially improved predictions using a small quantity
of additional data.  The third is the sample enumeration method, which
provides very good estimates of aggregate behavior when adequate data
are available.

   7.4a The Naive Method

   This method consists of substituting average values of all of the
explanatory variables into the utility equations of the logit mode
choice model.  The resulting average utility values then are substituted
into the logit formula to obtain estimates of mode shares.  This method,
however, does not yield the same aggregate share estimates that would be
obtained by computing the choice probabilities of individuals and
averaging these probabilities according to Equation (7.15).  The
difference between the
two estimates is a consequence of the fact that the average of a
nonlinear function (in this case the average of logit choice
probabilities) is not equal to the function evaluated at the average
values of its variables (in this case the logit choice probabilities of
an individual with the average values of the explanatory variables). 
The errors associated with the naive method are illustrated in the
following example.

   Example 7.4:   Errors of the Naive Aggregation Method

   Consider the choice between automobile and bus for travel to work. 
Assume that the logit model of Equation (7.19) applies and that all of
the individuals of concern own one car (A  =  1 in Equation (7.19)).
Suppose that the difference between bus and automobile travel time,
Tb - Ta, is 10 min. for one half of the individuals and -5 minutes for
the other half.  Then the probabilities Pa that automobile is chosen
for the two groups of individuals are as follows:

                          Group 1        Group 2
         Tb - Ta (min.)   10             -5

         Vb - Va          -2.0           -0.5

         Pa                0.881          0.622

The true aggregate share of automobile in this case is the weighted
average of the choice probabilities of the individuals in the two
groups, where the weights are proportional to group size.  Since, in
this example, the sizes of the groups are equal, the weights are 0.5 for
each group and the aggregate share is 0.5(0.881) + 0.5(0.622) = 0.752,
as shown in Figure 7.6.
However, when the naive method is used, the aggregate share is based on
the average difference between the utilities of bus and automobile.  In
this example, the average difference is 0.5(-2.0) + 0.5(-0.5) = -1.25.
The value of Pa corresponding to this utility difference is
1/[1 + exp(-1.25)] = 0.777, as is also shown in Figure 7.6.  The naive
method produces an error of 0.026 in the predicted shares of automobile
and bus.  The percentage errors are 100(0.777 - 0.752)/0.752 = 3.3% for
the automobile share and 100(0.223 - 0.248)/0.248 = -10.1% for the bus
share.
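
The two-group comparison can be sketched in Python as follows.  The
function and variable names are illustrative; the only inputs are the
two travel time differences, the equal group weights, and the logit
model with A = 1.

```python
import math

def p_auto(time_diff, autos=1):
    """Binary logit auto probability with Va - Vb = 0.5 + 0.1*(Tb - Ta) + 0.5*A."""
    return 1.0 / (1.0 + math.exp(-(0.5 + 0.1 * time_diff + 0.5 * autos)))

# (Tb - Ta in minutes, share of the population in the group)
groups = [(10.0, 0.5), (-5.0, 0.5)]

# Correct aggregation: average the individual probabilities.
true_share = sum(weight * p_auto(t) for t, weight in groups)

# Naive aggregation: evaluate the probability at the average time difference.
mean_diff = sum(weight * t for t, weight in groups)
naive_share = p_auto(mean_diff)

print(f"true share  = {true_share:.3f}")
print(f"naive share = {naive_share:.3f}")
print(f"error       = {naive_share - true_share:.3f}")
```

Because the utility functions are linear in travel time, evaluating the
model at the average time difference is the same as evaluating it at the
average utility difference.
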
   Now consider the data used in Example 7.3. As was shown in that
example, the correct aggregate automobile share is 0.802. The naive
aggregation method sets the aggregate share equal to the logit
automobile choice probability at the average values of the variables. 
In Example 7.3, the average value of Tb - Ta is 7.5 min. and the
average value of A is 1.5. Accordingly, the aggregate automobile share
predicted by the naive method is 1/(1 + exp[-0.5 - 0.1(7.5) - 0.5(1.5)])
= 0.881.  The correct share and the share predicted by the naive method
differ by 0.079. The corresponding percentage errors are 10% for the
automobile share and 40% for the bus share.  Thus, in this case the
naive method makes very large prediction errors.
   In practice, the sizes of the prediction errors made by the naive
method depend on the distributions of utility values in the population
for which the predictions are being made.  The foregoing example shows
that the errors can be large.  Large errors also have been found in
actual practice.  The errors made by the naive method generally reduce
the predicted shares of low probability modes and increase the predicted
shares of high probability modes.  Such prediction errors may seriously
affect the evaluation of transportation policy options.  Therefore, it is
important to seek other
methods of aggregate prediction that reduce these prediction errors.
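
The direction of these errors can be traced to the curvature of the
logit formula.  The derivation below is a supplementary sketch and is
not taken from the original report.  Here x denotes the utility
difference Va - Vb, and the result covers the case in which every
traveler has x > 0, so that automobile is each traveler's more probable
mode.  When utility differences straddle zero, the sign of the error is
not guaranteed.

```latex
P(x) = \frac{1}{1 + e^{-x}}, \qquad
P'(x) = P(x)\,[1 - P(x)], \qquad
P''(x) = P(x)\,[1 - P(x)]\,[1 - 2P(x)] < 0 \quad \text{for } x > 0 .

\text{By Jensen's inequality, if all } x_n > 0:\qquad
\frac{1}{N}\sum_{n} P(x_n) \;\le\; P\!\left(\frac{1}{N}\sum_{n} x_n\right).
```

In that case the naive method overstates the share of the
high-probability mode and therefore understates the share of the
low-probability mode.
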

   7.4b The Market Segmentation Method

   This method divides the population for which a forecast is required
into segments within which individuals are similar but not necessarily
identical in terms of the values of important explanatory variables. 
Separate mode share predictions are made for each segment using the
naive method, and the mode share for the entire population is obtained
by weighted averaging of the shares for the segments using weights
proportional to the segment sizes.  In most cases, this method
substantially improves the accuracy of the aggregate predictions
relative to applying the naive method to the entire population.  As an
illustration of this, suppose that in Example 7.3 the market segments
consist of the different levels of automobile ownership.  The market
segmentation method predicts the aggregate mode shares for each group by
substituting into the logit utility functions the true automobile
ownership levels and the average travel time differences for each of
these levels.  The results of this computation are summarized in the
following table:

                                          A = 1          A = 2

    Average value of Tb - Ta (min.)     -2.5           17.5

    Number of cases                      120            120

    Auto choice share                      0.679          0.963

The market segmentation method's prediction of the automobile share for
the entire population is [120(0.679) + 120(0.963)]/240  =  0.821. This
prediction is considerably closer to the true share of 0.802 than is the
prediction
of the naive method.
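
The segment calculation can be sketched in Python as follows.  The
dictionary layout and names are illustrative; the segments are the two
automobile ownership levels of Example 7.3, and the naive method is
applied inside each segment.

```python
import math

def p_auto(time_diff, autos):
    """Binary logit auto probability with Va - Vb = 0.5 + 0.1*(Tb - Ta) + 0.5*A."""
    return 1.0 / (1.0 + math.exp(-(0.5 + 0.1 * time_diff + 0.5 * autos)))

# Example 7.3 population, keyed by auto ownership A:
# each entry is (Tb - Ta in minutes, number of cases).
population = {
    1: [(t, 20) for t in (10, 5, 0, -5, -10, -15)],
    2: [(t, 20) for t in (30, 25, 20, 15, 10, 5)],
}

segment_shares = {}
segment_sizes = {}
for autos, cells in population.items():
    size = sum(n for _, n in cells)
    mean_diff = sum(t * n for t, n in cells) / size
    # Naive method within the segment: use the segment's average time difference.
    segment_shares[autos] = p_auto(mean_diff, autos)
    segment_sizes[autos] = size

total = sum(segment_sizes.values())
# Weight the segment shares by segment size.
share_auto = sum(segment_shares[a] * segment_sizes[a] for a in population) / total

for a in population:
    print(f"A = {a}: segment auto share = {segment_shares[a]:.3f}")
print(f"Population auto share = {share_auto:.3f}")
```
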
   The accuracy of the market segmentation method can be increased by
increasing the number of market segments.  For example, in the case just
described, the prediction accuracy could be improved by dividing the
district into geographical subareas so as to reduce the variation of
travel times within segments.  The variables used to define market
segments also influence the accuracy of the method.  It is best to
select variables whose variation is likely to have large effects on mode
choice in the circumstances being modeled.

   7.4c The Sample Enumeration Method

   The ultimate extension of the market segmentation method is to
continue subdividing the population until each segment consists of a
single individual.  However, data on the explanatory variables of models
usually cannot be obtained for every individual in a population of
practical interest.  A practical alternative is to base predictions on a
random sample of individuals from the population.  This method, called
the sample enumeration method, predicts the mode choice probabilities
for each individual in the sample and averages these as in Equation
(7.15) to obtain an estimate of the mode share for the population.  The
sample can be drawn from the same data used for model estimation.  Again
using the data from Example 7.3, a sample of 20 individuals might be:

            A = 1                              A = 2
Tb - Ta  No. of                   Tb - Ta No. of
 (min.)     Cases       Pa          (min.)    Cases      Pa

   10        1         0.881           30        1       0.989
    5        3         0.818           25        2       0.982
    0        1         0.731           20        2       0.971
   -5        2         0.622           15        1       0.953
  -10        2         0.500           10        3       0.924
  -15        1         0.378            5        1       0.881

To estimate the automobile mode share by the sample enumeration method,
multiply each probability by the corresponding number of cases, add the
products, and divide the sum by the total number of cases.  The
resulting share is [1(0.881) + 3(0.818) + ... + 3(0.924) + 1(0.881)]/20
= 0.809. This estimate is very close to the correct share of 0.802.
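
The sample enumeration estimate can be reproduced with the Python sketch
below.  The names are illustrative, and the sample is the 20-person
sample tabulated above, not a fresh random draw.

```python
import math

def p_auto(time_diff, autos):
    """Binary logit auto probability with Va - Vb = 0.5 + 0.1*(Tb - Ta) + 0.5*A."""
    return 1.0 / (1.0 + math.exp(-(0.5 + 0.1 * time_diff + 0.5 * autos)))

# Sample of 20 individuals: (Tb - Ta in minutes, A, number of cases).
sample = [
    (10, 1, 1), (5, 1, 3), (0, 1, 1), (-5, 1, 2), (-10, 1, 2), (-15, 1, 1),
    (30, 2, 1), (25, 2, 2), (20, 2, 2), (15, 2, 1), (10, 2, 3), (5, 2, 1),
]

n_sample = sum(cases for _, _, cases in sample)
# Average the individual choice probabilities over the sample, as in Equation (7.15).
share_auto = sum(cases * p_auto(t, a) for t, a, cases in sample) / n_sample

print(f"sample size = {n_sample}, estimated auto share = {share_auto:.3f}")
```
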
   In general, the error made by the sample enumeration method depends
on the size and representativeness of the sample.  However, for random
samples consisting of 20-30 individuals per traffic zone or district,
the method usually gives predictions that are very close to the true
shares.
   The following table compares the shares predicted by the naive,
market segmentation, and sample enumeration methods for the data in
Example 7.3:

                                                Percent   Error in
                    Auto Share    Bus Share   Auto Share  Bus Share

 True value            0.802        0.198         0.0%       0.0%

 Naive method          0.881        0.119         9.9      -39.9

 Market segmentation   0.821        0.179         2.4      - 9.6

 Sample enumeration    0.809        0.191         0.9      - 3.5


In this case, as in most applications, the error associated with the
naive method is large, particularly in percentage terms for the bus mode
share.  This error is smaller for the market segmentation method and is
negligible for the sample enumeration method.  As suggested by these
results and confirmed by other, more complete studies, the sample
enumeration method should be used whenever the required data are
available.  When the data are not available, the market segmentation
method should be used.  The naive method should not be used except in
cases where data limitations preclude the use of the other methods.  In
such cases, the potential errors associated with the naive method should
be recognized, and the uncertain accuracy of the resulting predictions
should be considered in decision making.
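
The percentage errors in the comparison table follow directly from the
predicted shares.  The Python sketch below recomputes them from the
rounded auto shares reported above; the dictionary and function names
are illustrative.

```python
true_auto = 0.802
predicted_auto = {
    "Naive method": 0.881,
    "Market segmentation": 0.821,
    "Sample enumeration": 0.809,
}

def percent_error(predicted, true):
    """Signed percentage error of a predicted share relative to the true share."""
    return 100.0 * (predicted - true) / true

errors = {}
for method, auto in predicted_auto.items():
    bus = 1.0 - auto  # binary choice: the bus share is the complement
    errors[method] = (
        percent_error(auto, true_auto),
        percent_error(bus, 1.0 - true_auto),
    )
    print(f"{method:20s} auto {errors[method][0]:6.1f}%   bus {errors[method][1]:6.1f}%")
```

The same absolute error produces a much larger percentage error for bus
because the bus share is the smaller of the two.
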
   In cases where the 20-30 observations per traffic zone or district
needed to implement the sample enumeration method are not available, a
modified procedure called pseudosample enumeration often can be used. 
This procedure consists of using estimates of the distributions of the
relevant variables to generate artificially a sample of individuals in
each zone or district.  The sample enumeration method then is applied to
this pseudosample.
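
A minimal Python sketch of pseudosample enumeration follows.  Everything
about the assumed distributions is hypothetical and chosen only for
illustration: the travel time difference is drawn from a normal
distribution, automobile ownership is one or two with a fixed
probability, and the two variables are drawn independently.  In an
application these distributions would come from zonal data, and any
correlation between the variables would need to be represented.

```python
import math
import random

def p_auto(time_diff, autos):
    """Binary logit auto probability with Va - Vb = 0.5 + 0.1*(Tb - Ta) + 0.5*A."""
    return 1.0 / (1.0 + math.exp(-(0.5 + 0.1 * time_diff + 0.5 * autos)))

def pseudosample_share(n_draws, mean_diff, sd_diff, p_two_cars, seed=0):
    """Estimate the auto share from an artificially generated sample.

    All distributional inputs are assumptions supplied by the analyst.
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    total = 0.0
    for _ in range(n_draws):
        time_diff = rng.gauss(mean_diff, sd_diff)         # hypothetical normal distribution
        autos = 2 if rng.random() < p_two_cars else 1     # hypothetical ownership split
        total += p_auto(time_diff, autos)
    # Sample enumeration applied to the pseudosample.
    return total / n_draws

share = pseudosample_share(n_draws=1000, mean_diff=7.5, sd_diff=10.0, p_two_cars=0.5)
print(f"pseudosample auto share = {share:.3f}")
```
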

7.5 Summary

   This module has described methods for making forecasts of aggregate
choice shares and traffic volumes using disaggregate models.  The module
has explained the need for using disaggregate models even when the
objective is to predict aggregate travel.  The module has also explained
the theoretical basis for making aggregate predictions with disaggregate
models.  Three practical methods for making aggregate predictions have
been described
and their accuracy discussed.
   This module concludes the core of the course.  The course has been
designed to provide you with an introduction to and general
understanding of the estimation and use of disaggregate mode choice
models.  The microcomputer-based exercises in the supplement to the
course will help you to solidify your understanding of these models. 
The references listed at the end of Module 1 provide additional
resources if you wish to further enhance your understanding of these
models.


                         SOLUTIONS TO EXERCISES
                                MODULE 2

2.1

                             Utility
            Mode          Y = 40     Y = 10

         Drive Alone       2.25       1.50

           Carpool         2.12       1.75

             Bus           1.91       1.62

   The choices are the same as in Example 2.1.

2.2

                             Utility
            Mode          Y = 40     Y = 10
         Drive Alone       -1.50      -2.85
           Carpool         -1.99      -2.56
             Bus           -2.37      -2.73
           Choice       Drive Alone  Carpool


2.3   W = 10 + 20U, and X = -U^2.  The values of U, V, W, and X
      corresponding to the travel times and costs in Example 2.1 and two
      different values of Y are:

                                   U
            Mode          Y = 40        Y = 10

         Drive Alone       -0.75        -1.50

           Carpool         -0.88        -1.25

             Bus           -1.09        -1.38

                                    V
            Mode          Y = 40        Y = 10

         Drive Alone       -30.0        -15.0

           Carpool         -35.0        -12.5

             Bus          -43.75        -13.75


                                   W
            Mode          Y = 40        Y = 10

         Drive Alone       -5.0         -20.0

           Carpool         -7.5         -15.0

             Bus          -11.88        -17.5

                                    X
            Mode          Y = 40        Y = 10

         Drive Alone       -0.56        -2.25

           Carpool         -0.77        -1.56

             Bus           -1.20        -1.89

   All of the utility functions predict that the individual with Y = 40
   will choose drive alone and that the individual with Y = 10 will
   choose carpool.

2.4   The utility values and mode choices according to income group are:

                              Utility
         Income    Drive Alone   Carpool       Bus       Choice

           14         -1.39       -1.20       -1.18        Bus

           18         -1.19       -1.10       -1.14      Carpool

           22         -1.07       -1.03       -1.14      Carpool

           26         -0.98       -0.99       -1.11    Drive Alone

           30         -0.91       -0.96       -1.10    Drive Alone

           34         -0.87       -0.93       -1.07    Drive Alone

   The aggregate mode shares are 55% for drive alone, 40% for carpools
   and 5% for bus.
      The average utility values are -1.03 for drive alone, -0.96 for
   carpools and -1.11 for bus.  According to the average utilities, all
   travelers are predicted to choose carpool, which is incorrect.
      After the reduction in bus travel time, the utilities and mode
   choices according to income are:


                              Utility
         Income    Drive Alone   Carpool       Bus       Choice

           14         -1.39       -1.20       -1.13        Bus

           18         -1.19       -1.10       -1.09        Bus

           22         -1.07       -1.03       -1.06      Carpool

           26         -0.98       -0.99       -1.05    Drive Alone

           30         -0.91       -0.96       -1.03    Drive Alone

           34         -0.87       -0.93       -1.02    Drive Alone

The new aggregate mode shares are 55% for drive alone, 25% for carpool,
and 20% for bus.  Thus, the reduction in bus travel time has caused the
mode share of carpool to decrease and that of bus to increase.
   The average utilities after the reduction in bus travel time are
-1.03 for drive alone, -0.96 for carpool, and -1.06 for bus.  Using
average utilities, one predicts, erroneously, that all travelers choose
carpool and, therefore, that the reduction in bus travel time has caused
no change in the mode shares.

                                MODULE 3

3.1

   One car:    UDA = -0.5  - 5(2/15)    + 0.4(1 - 1) = -1.17
               UCP = -0.75 - 5(1/15)    + 0.2(1 - 1) = -1.08
               UB  = -1.0  - 5(0.75/15)              = -1.25

   Two cars:   UDA = -0.5  - 5(2/15)    + 0.4(2 - 1) = -0.77
               UCP = -0.75 - 5(1/15)    + 0.2(2 - 1) = -0.88
               UB  = -1.0  - 5(0.75/15)              = -1.25

   Three cars: UDA = -0.5  - 5(2/15)    + 0.4(3 - 1) = -0.37
               UCP = -0.75 - 5(1/15)    + 0.2(3 - 1) = -0.68
               UB  = -1.0  - 5(0.75/15)              = -1.25


3.2
                           Utilities Based on the Travel Time   Based on
                           Distributions in Example 3.2          Ex. 3.1

 Percentage of Individuals  20%      50%    20%       10%       100%
 Drive Alone               -0.67    -0.77  -0.87     -0.97      -0.77
 Carpool                   -0.78    -0.88  -0.98     -1.08      -0.88
 Bus                       -1.25    -1.25  -1.25     -1.25      -1.25

   Chosen Mode                         Drive alone in all cases


3.3   0 cars:

                                 Utilities Based on the Travel Time
                                 Distributions in Example 3.2
    Percentage of Individuals    20%       50%        20%      10%
    Drive Alone                 -1.47     -1.57      -1.67    -1.77
    Carpool                     -1.18     -1.28      -1.38    -1.48
    Bus                         -1.25     -1.25      -1.25    -1.25
    Chosen Mode                 CP       Bus        Bus      Bus
    % of All Travelers           4        10          4        2

    One car:
                                  Utilities Based on the Travel Time
                                  Distributions in Example 3.2
    Percentage of Individuals    20%       50%        20%      10%
    Drive Alone                 -1.07     -1.17      -1.27    -1.37
    Carpool                     -0.98     -1.08      -1.18    -1.28
    Bus                         -1.25     -1.25      -1.25    -1.25
    Chosen Mode                 CP        CP         CP      Bus
    % of All Travelers          10        25         10        5

    Two cars:
                                  Utilities Based on the Travel Time
                                  Distributions in Example 3.2
    Percentage of Individuals    20%       50%        20%      10%
    Drive Alone                 -0.67     -0.77      -0.87    -0.97
    Carpool                     -0.78     -0.88      -0.98    -1.08
    Bus                         -1.25     -1.25      -1.25    -1.25
    Chosen Mode                 DA        DA         DA       DA
    % of All Travelers           6        15          6        3

   Aggregate mode shares:  30% (drive alone), 49% (carpool), 21% (bus).

3.4
                               Percent Choosing
         Income       Number of      Drive
         ($000)      Individuals     Alone      Carpool        Bus

          17.5           10           10          50           40

          22.5           20           80          20            0

          27.5           20           85          15            0

          32.5           30           90          10            0

          37.5           20           95           5            0

          42.5            3           100          0            0


                                Number choosing
         Income       Number of      Drive
         ($000)      Individuals     Alone      Carpool        Bus

          17.5           10            1           5            4

          22.5           20           16           4            0

          27.5           20           17           3            0

          32.5           30           27           3            0

          37.5           20           19           1            0

          42.5            3            3           0            0

          Total          103          83          16            4

                                MODULE 4

4.1     Case       V1        V2    V1 - V2   Pr(1)

          6        1.0      -1.5       2.5      0.92
          7        0.5      -2.0       2.5      0.92
          8        3.0      -1.0       4.0      0.98
          9        0.0      -0.5       0.5      0.62
         10        0.0       2.5      -2.5      0.08



4.2     Carpool utility = 1.5:

        Mode        V      Exp(V)      Pr
         DA        2.5      12.18     0.63
         CP        1.5      4.48      0.23
         Bus       1.0      2.72      0.14

                            19.38     1.00


        Carpool utility = 1.0:

        Mode        V      Exp(V)      Pr
         DA        2.5      12.18     0.69
         CP        1.0      2.72      0.155
         Bus       1.0      2.72      0.155

                            17.62     1.000


4.3 a.  Mode        T         C         V      Exp(V)       Pr
         DA        0.5       2.0      -1.0      0.37        0.237
         CP        0.6       1.0      -0.85     0.43        0.276
         Bus       0.8       0.6      -0.95     0.39        0.250
        Bike       1.0       0.0      -1.0      0.37        0.237

                                                1.56        1.00


4.3 b.  Alternative-specific constant for bicycle is -1.0:

        Mode        T         C         V      Exp(V)       Pr
         DA        0.5       2.0      -1.0      0.37        0.278
         CP        0.6       1.0      -0.85     0.43        0.323
         Bus       0.8       0.6      -0.95     0.39        0.293
        Bike       1.0       0.0      -2.0      0.14        0.105

                                                1.33        1.00
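
The probabilities in Solution 4.3 can be checked with the short Python
sketch below.  The function name and dictionary keys are illustrative;
the utilities are the V values tabulated above.  Because the tables
round Exp(V) to two decimals before dividing, unrounded probabilities
can differ from the tabulated ones in the third decimal place.

```python
import math

def mnl_probabilities(utilities):
    """Multinomial logit: Pr(i) = exp(Vi) / sum over j of exp(Vj)."""
    exps = {mode: math.exp(v) for mode, v in utilities.items()}
    denom = sum(exps.values())
    return {mode: e / denom for mode, e in exps.items()}

# Part a: bicycle has no alternative-specific constant.
part_a = mnl_probabilities({"DA": -1.0, "CP": -0.85, "Bus": -0.95, "Bike": -1.0})
# Part b: alternative-specific constant of -1.0 lowers bicycle utility to -2.0.
part_b = mnl_probabilities({"DA": -1.0, "CP": -0.85, "Bus": -0.95, "Bike": -2.0})

for label, probs in (("a", part_a), ("b", part_b)):
    row = "  ".join(f"{mode}: {p:.3f}" for mode, p in probs.items())
    print(f"4.3 {label}.  {row}")
```
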



                                MODULE 5

5.1   (1)   Percentage of runs that arrive at a given point within 3
            minutes of schedule.

      (2)   Percentage of runs that arrive at a given point no more than
            3 minutes late.

      (3)   Root-mean-square deviation from scheduled arrival time.


5.2   When V = -T^2, adding 5 minutes is more burdensome for the 1-hour
      trip.  When V = -T^(1/2), adding 5 minutes is more burdensome for
      the 15-minute trip.

5.3   a. High-income travelers are more sensitive.

      b. Equally sensitive.

      c. Low-income travelers are more sensitive.

      d. Equally sensitive.

      e. Low-income travelers are more sensitive.

5.4   For the first group of utility functions, use b1 = 0.05, b2 = 0.10.
      For the second group, use b3 = 0.05, b4 = 0.05.

                                MODULE 6

6.1   a. log L =

         -log[1 + exp(5a)] - log[1 + exp(15a)]  -  log[1 + exp( -10a)]


      b.          a        log L
               -0.025      -1.98
               -0.045      -1.94
               -0.065      -1.93

      c. To 2 significant figures, there is no value of a that gives a
         log L larger than -1.93.
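
The log-likelihood values in Solution 6.1 can be verified with the
Python sketch below.  The function name is illustrative; the formula is
the one given in part a.

```python
import math

def log_likelihood(a):
    """Log likelihood from Solution 6.1, part a, as a function of the coefficient a."""
    return -(math.log(1.0 + math.exp(5.0 * a))
             + math.log(1.0 + math.exp(15.0 * a))
             + math.log(1.0 + math.exp(-10.0 * a)))

values = {a: log_likelihood(a) for a in (-0.025, -0.045, -0.065)}
for a, ll in values.items():
    print(f"a = {a:6.3f}   log L = {ll:6.2f}")
```

Evaluating the function on a finer grid of a values is one way to
explore part c.
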


6.2   a.    Variable       t statistic
            IVTT              -2.18
            OVTT              -4.54
            LC                -1.44
            PC                -2.05

         The t statistics indicate that none of the variables should be
         dropped.

      b. Estimate the model whose utility function is

            V = a1IVTT + a2OVTT + a3(LC + PC) + a4PC,

         and use the t statistic of a4  to determine whether the last
         term can be dropped.

      c. The likelihood ratio test statistic is

            LR = 2[(-178.22) - (-179.37)] = 2.30.

         This is below the critical value for two variables, so the two
         additional variables can be dropped.

      d. The value of the modified likelihood ratio test statistic is

             MLR = (-177.53 - 4/2) - (-179.37 - 4/2) = 1.84.

         Since this exceeds the critical value for MLR, the model with
         OVTT/DIST is significantly better.


                                   184

         *U.S. Government Printing Office: 1993 -- 343-120/85890





NOTICE

This document is disseminated under the sponsorship of the U.S.
Department of Transportation in the interest of information exchange. 
The United States Government assumes no liability for its contents or
use thereof.

The United States Government does not endorse manufacturers or products. 
Trade names appear in the document only because they are essential to
the content of the report.

This report is being distributed through the U.S. Department of
Transportation's Technology Sharing Program.

DOT-T-93-18



