
 

  Statistical Policy Working Paper 20 - Seminar on Quality of Federal Data - Part 2 of 3



 



 



                           Statistical Policy

                            Working Paper 20







                   Seminar on Quality of Federal Data



 



                              Part 2 of 3



  



               Federal Committee on Statistical Methodology



 



                          Statistical Policy Office



               Office of Information and Regulatory Affairs



                   Office of Management and Budget



 



                              March 1991



 



                  



 



                   MEMBERS OF THE FEDERAL COMMITTEE ON



                          STATISTICAL METHODOLOGY



 



                              (February 1991)



 



                         Maria E. Gonzalez, Chair



                      Office of Management and Budget



 



 



    Yvonne M. Bishop                  Daniel Kasprzyk



    Energy Information                Bureau of the Census



      Administration



                                      Daniel Melnick



    Warren L. Buckler                 National Science Foundation



    Social Security Administration



                                      Robert P. Parker



    Charles E. Caudill                Bureau of Economic Analysis



    National Agricultural



      Statistics Service              David A. Pierce



                                      Federal Reserve Board



    Cynthia Z.F. Clark



    National Agricultural             Thomas J. Plewes



      Statistics Service              Bureau of Labor Statistics



 



    Zahava D. Doering                 Wesley L. Schaible



    Smithsonian Institution           Bureau of Labor Statistics



 



    Robert M. Groves                  Fritz J. Scheuren



    Bureau of the Census              Internal Revenue Service



 



    Roger A. Herriot                  Monroe G. Sirken



    National Center for               National Center for



      Education Statistics              Health Statistics



 



    C. Terry Ireland                  Robert D. Tortora



    National Computer Security        Bureau of the Census



      Center



 



    Charles D. Jones



    Bureau of the Census



 



                                PREFACE



 



 



  In 1975, the Office of Management and Budget (OMB) organized the



  Federal Committee on Statistical Methodology.  Composed of
  individuals selected by OMB for their expertise and interest in
  statistical methods, the committee has during the past 15 years
  determined areas that merit investigation and discussion, and



  overseen the work of subcommittees organized to study particular



  issues.  Since 1978, 19 Statistical Policy Working Papers have been



  published under the auspices of the Committee.



 



  On May 23-24, 1990, the Council of Professional Associations on



  Federal Statistics (COPAFS) hosted a "Seminar on the Quality of



  Federal Data." Developed to capitalize on work undertaken during



  the past dozen years by the Federal Committee on Statistical



  Methodology and its subcommittees, the seminar focused on a variety



  of topics that have been explored thus far in the Statistical



  Policy Working Paper series.  The subjects covered at the seminar



  included:



 



       Survey Quality Profiles



       Paradigm Shifts Using Administrative Records



       Survey Coverage Evaluation



       Telephone Data Collection



       Data Editing



       Computer Assisted Statistical Surveys



       Quality in Business Surveys



       Cognitive Laboratories



       Employer Reporting Unit Match Study



       Approaches to Developing Questionnaires



       Statistical Disclosure-Avoidance



       Federal Longitudinal Surveys



 



  Each  of these topics was presented in a two-hour session that



  featured formal papers and discussion, followed by informal



  dialogue among all speakers and attendees.



 



  Statistical Policy Working Paper 20, published in three parts,



  presents the proceedings of the "Seminar on the Quality of Federal



  Data." In addition to providing the papers and formal discussions



  from each of the twelve sessions, this working paper includes



  Robert M. Groves' keynote address, "Towards Quality in a Working



  Paper Series on Quality," and comments by Stephen E. Fienberg,



  Margaret E. Martin, and Hermann Habermann at the closing session,



  "Towards an Agenda for the Future."



 



  We are indebted to all of our colleagues who assisted in organizing



  the seminar, and to the many individuals who not only presented



  papers and discussions but also prepared these materials for



  publication.  A special thanks is due to Terry Ireland and his



  staff for their work in assembling this working paper.



 



                      Table of Contents



 



                    Wednesday, May 23, 1990



 



 



                             Part 1



 



 



                        KEYNOTE ADDRESS



 



 



TOWARDS QUALITY IN A WORKING PAPER SERIES ON QUALITY. . . . . .  3



    Robert M. Groves, The University of Michigan and U. S.



    Bureau of the Census



 



 



 



         Session 1 - SURVEY QUALITY PROFILES



 



 



 



THE SIPP QUALITY PROFILE. . . . . . . . . . . . . . . . . . .   19



    Thomas B. Jabine, Statistical Consultant



 



INITIAL REPORT ON THE QUALITY OF AGRICULTURAL SURVEY PROGRAM.   29



    George A. Hanuschak, National Agricultural Statistics



    Service



 



DISCUSSION. . . . . . . . . . . . . . . . . . . . . . . . . . . 40



    Barbara A. Bailar, American Statistical Association



 



DISCUSSION. . . . . . . . . . . . . . . . . . . . . . . . . .   46



    Nancy A. Mathiowetz, U. S. Bureau of the Census



 



 



 



Session 2 - PARADIGM SHIFTS USING ADMINISTRATIVE



                           RECORDS



 



 



 



PARADIGM SHIFTS: ADMINISTRATIVE RECORDS AND CENSUS-TAKING. . . 53



    Fritz Scheuren, Internal Revenue Service



 



AN ADMINISTRATIVE RECORD PARADIGM: A CANADIAN EXPERIENCE . . . 66



    John Leyes, Statistics Canada



                                               



 DISCUSSION. . . . . . . . . . . . . . . . . . . . . . . . .  77



        Gerald Gates, U. S. Bureau of the Census



 



 DISCUSSION. . . . . . . . . . . . . . . . . . . . . . . . .  83



       Edward J. Spar, Market Statistics



 



 



 



        Session 3 - SURVEY COVERAGE EVALUATION



 



 



 



 



 CONTROL, MEASUREMENT, AND IMPROVEMENT OF SURVEY COVERAGE. . . . 87



       Gary M. Shapiro, U. S. Bureau of the Census; Raymond R.



       Bosecker, National Agricultural Statistics Service



 



 QUALITY OF SURVEY FRAMES. . . . . . . . . . . . . . . . .   100



       Judith T. Lessler, Research Triangle Institute



 



 DISCUSSION. . . . . . . . . . . . . . . . . . . . . . . . . 108



       Fritz Scheuren, Internal Revenue Service



 



 DISCUSSION. . . . . . . . . . . . . . . . . . . . . . . .   114



       Joseph Waksberg, Westat, Inc.



 



 



 



          Session 4 - TELEPHONE DATA COLLECTION



 



 



 



 QUALITY IMPROVEMENT IN TELEPHONE SURVEYS. . . . . . . . . . 123



       Leyla Mohadjer, David Morganstein, Westat, Inc.



 



 COMPUTER ASSISTED SURVEY TECHNOLOGIES IN GOVERNMENT:



       AN OVERVIEW. . . . . . . . . . . . . . . . . . . . . . . . 137



       Marc Tosiano, National Agricultural Statistics Service



 



 DISCUSSION . . . . . . . . . . . . . . . . . . . . . . . .  155



       William L. Nicholls II, U. S. Bureau of the Census



 



 DISCUSSION. . . . . . . . . . . . . . . . . . . . . . . . . 161



       James T. Massey, National Center for Health Statistics



 



 



 



 



 



 



 



 






 



            



 



                                  Part 2



 



 



 



                          Session 5 - DATA EDITING



 



 



 



    OVERVIEW OF DATA EDITING IN FEDERAL STATISTICAL AGENCIES .167



          David A. Pierce, Federal Reserve Board



 



    EDITING SOFTWARE (An excerpt from Chapter IV of Working
          Paper 18). . . . . . . . . . . . . . . . . . . . . . . .173



          Mark Pierzchala, National Agricultural Statistics



          Service



 



    RESEARCH ON EDITING. . . . . . . . . . . . . . . . . . .  180



          Yahia Ahmed, Internal Revenue Service



 



    DISCUSSION. . . . . . . . . . . . . . . . . . . . . . . . .  184



          Charles E. Caudill, National Agricultural Statistics



          Service



    DISCUSSION. . . . . . . . . . . . . . . . . . . . . . . . 186



          Richard Bolstein, George Mason University



 



 



 



          Session 6 - COMPUTER ASSISTED STATISTICAL



                                SURVEYS



 



 



 



  OVERVIEW OF COMPUTER ASSISTED SURVEY INFORMATION COLLECTION. 191



          Richard L. Clayton, U. S. Bureau of Labor Statistics



 



    A COMPARISON BETWEEN CATI AND CAPI. . . . . . . . . . . . .197



          Martin Baum, National Center for Health Statistics



 



    COMPUTER ASSISTED SELF INTERVIEWING. . . . . . . . . . . . .  202



          Ralph Gillmann, Energy Information Administration



 



    COMPUTER ASSISTED SELF INTERVIEWING: RIGS AND PEDRO,



          TWO EXAMPLES . . . . . . . . . . . . . . . . . . . . 205



          Ann M. Ducca, Energy Information Administration



 



    DATA COLLECTION . . . . . . . . . . . . . . . . . . . . . . . 209



          Cathy Mazur, National Agricultural Statistics Service



 






 



   DISCUSSION. . . . . . . . . . . . . . . . . . . . .  . . . 212



         Robert N. Tinari, U. S. Bureau of the Census



 



   DISCUSSION. . . . . . . . . . . . . . . . . . . . . . . . . 216



         David Morganstein, Westat, Inc.



 



 



                         Thursday, May 24, 1990



 



 



            Session 7 - QUALITY IN BUSINESS SURVEYS



 



 



 



  IMPROVING ESTABLISHMENT SURVEYS AT THE BUREAU OF LABOR



        STATISTICS. . . . . . . . . . . . . . . . . . . . . . . . 221
        Brian MacDonald, Alan R. Tupek, U. S. Bureau of Labor



        Statistics



 



  A REVIEW OF NONSAMPLING ERRORS IN FEDERAL ESTABLISHMENT



  SURVEYS WITH SOME AGRIBUSINESS EXAMPLES. . . . . . . . . . . 232



        Ron Fecso, National Agricultural Statistics Service



 



  DISCUSSION. . . . . . . . . . . . . . . . . . . . . . . . .  243



        David A. Binder, Statistics Canada



 



  DISCUSSION. . . . . . . . . . . . . . . . . . . . . . . . . .247



        Charles D. Cowan, Opinion Research Corporation



 



 



              Session 8 - COGNITIVE LABORATORIES



 



 



 



 THE  BUREAU OF LABOR STATISTICS' COLLECTION PROCEDURES



 RESEARCH LABORATORY: ACCOMPLISHMENTS AND FUTURE DIRECTIONS . .253



       Cathryn S. Dippo, Douglas Herrmann, U. S. Bureau of Labor



       Statistics



 



 THE ROLE OF A COGNITIVE LABORATORY IN A STATISTICAL AGENCY. . 268



       Monroe G. Sirken, National Center for Health Statistics



 



 DISCUSSION. . . . . . . . . . . . . . . . . . . . . . . . . . 278



       Elizabeth Martin, U. S. Bureau of the Census



 



 DISCUSSION. . . . . . . . . . . . . . . . . . . . . . . . .  .281



                                             



       Murray Aborn, National Science Foundation (retired)



 






 



                            Part 3



 



         Session 9 - EMPLOYER REPORTING UNIT MATCH



                                     STUDY



 



   INTERAGENCY AGREEMENTS FOR MICRODATA ACCESS:



         THE ERUMS EXPERIENCE. . . . . . . . . . . . . . . .   291



         Thomas B. Petska, Internal Revenue Service; Lois



         Alexander, Social Security Administration



 



   SAMPLE SELECTION AND MATCHING PROCEDURES USED IN ERUMS. . . 301



         John Pinkos, Kenneth LeVasseur, Marlene Einstein,



         U. S. Bureau of Labor Statistics; Joel Packman, Social



         Security Administration



 



   RESULTS, FINDINGS, AND RECOMMENDATIONS OF THE ERUMS PROJECT. . 309



         Vern Renshaw, Bureau of Economic Analysis; Tom Jabine,



         Statistical Consultant



 



   DISCUSSION. . . . . . . . . . . . . . . . . . . . . . . . .  318



         W. Joel Richardson, Charles A. Waite, U. S. Bureau of the



         Census



 



   DISCUSSION. . . . . . . . . . . . . . . . . . . . . . . . . .324



         Thomas J. Plewes, U. S. Bureau of Labor Statistics



 



            Session 10 - APPROACHES TO DEVELOPING



                              QUESTIONNAIRES



 



   TOOLS FOR USE IN DEVELOPING QUESTIONS AND TESTING



         QUESTIONNAIRES. . . . . . . . . . . . . . . . . . . .  331



         Theresa J. DeMaio, U. S. Bureau of the Census



 



   TECHNIQUES FOR EVALUATING THE QUESTIONNAIRE DRAFT. . . . . .  340



         Deborah H. Bercini, National Center for Health Statistics



 



   DESIGNING QUESTIONNAIRES FOR CATI IN A MIXED MODE



         ENVIRONMENT. . . . . . . . . . . . . . . . . . . . . . 349



         Gemma Furno, U. S. Bureau of the Census



 



   DISCUSSION . . . . . . . . . . . . . . . . . . . . . . . . . 360



         Carol C. House, National Agricultural Statistics Service






 



     Session 11 - STATISTICAL DISCLOSURE-AVOIDANCE



 



                           



 



  DISCLOSURE AVOIDANCE PRACTICES AT THE CENSUS BUREAU. . . . . .367



        Brian Greenberg, U. S. Bureau of the Census



 



  THE MICRODATA RELEASE PROGRAM OF THE NATIONAL CENTER



   FOR HEALTH STATISTICS. . . . . . . . . . . . . . . . . . . .  377



        Robert H. Mugge, National Center for Health Statistics



        (retired)



 



  DISCUSSION. . . . . . . . . . . . . . . . . . . . . . . . . . 385



        George Duncan, Carnegie Mellon University



 



 



        Session 12 - FEDERAL LONGITUDINAL SURVEYS



 



 



 



  FEDERAL LONGITUDINAL SURVEYS . . . . . .  . . . . . . . . . . 393



        Daniel Kasprzyk, U. S. Bureau of the Census; Curtis



        Jacobs, U. S. Bureau of Labor Statistics



 



   THE ADVANTAGES AND DISADVANTAGES OF LONGITUDINAL SURVEYS. . . 407



        Robert W. Pearson, Social Science Research Council



 



  LONGITUDINAL ANALYSIS OF FEDERAL SURVEY DATA. . . . . . . . . 425



         Patricia Ruggles, Joint Economic Committee



 



  DISCUSSION. . . . . . . . . . . . . . . . . . . . . . . . . . 438



        Michael Brick, Westat, Inc.



 



  DISCUSSION. . . . . . . . . . . . . . . . . . . . . . . . .   447



        Marilyn E. Manser, U. S. Bureau of Labor Statistics



 



 



                 TOWARDS AN AGENDA FOR THE FUTURE



 



 



 



  Stephen E. Fienberg, Carnegie Mellon University . . . . . . . 455



 



  Margaret E. Martin. . . . . . . . . . . . . . . . . . . . . . 462



 



  Hermann Habermann, Office of Management and Budget. . . . . . 465



 






 



                      



 



                              Part 2



                             Session 5



                           DATA EDITING



 



 



 



 



 



 



 



 






 






 



    OVERVIEW OF DATA EDITING IN FEDERAL STATISTICAL AGENCIES



 



                         David A. Pierce



                      Federal Reserve Board



 



 



Abstract



 



     This paper is the first of three in the session on Data



Editing, presenting highlights of the report "Data Editing in



Federal Statistical Agencies", Statistical Policy Working Paper 18,



OMB, prepared by the Subcommittee on Data Editing in Federal



Statistical Agencies, FCSM.  Included in this paper are a listing of



the Subcommittee members, a discussion of its mission statement



from the FCSM, definition and concepts of data editing, the major



areas investigated and the methods used to do so, the development



of case studies, and the Subcommittee's recommendations for data



editing in Federal statistical agencies.  The paper highlights the



findings from a survey of current data editing practices which was



conducted by the Subcommittee.



 



 



1. Introduction



 



     The Subcommittee on Data Editing in Federal Statistical Agen-



cies was established by the Federal Committee on Statistical



Methodology (FCSM) in November 1988 to document, profile, and



discuss the topic of data editing in Federal censuses and surveys.



The Subcommittee consisted of the following individuals:



 



     George Hanuschak, National Agricultural Statistics Service,



          Chair



     Yahia Ahmed, Internal Revenue Service



     Laura Bauer, Federal Reserve Board



     Charles Day, Internal Revenue Service



     Maria Gonzalez, Office of Management and Budget



     Brian Greenberg, Bureau of the Census



     Anne Hafner, National Center for Education Statistics



     Gerry Hendershot, National Center for Health Statistics



     Rita Hohenbrink, National Agricultural Statistics Service



     Renee Miller, Energy Information Administration



     Tom Petkunas, Bureau of the Census



     David Pierce, Federal Reserve Board



 



 



 



 






 



 



       Mark Pierzchala, National Agricultural Statistics Service



       Marybeth Tschetter, Bureau of Labor Statistics



        Paula Weir, Energy Information Administration



 



       A key aim of this effort was to further the awareness within



 agencies of each other's data editing practices, as well as of the



 state of the art of data editing, and thus to promote improvements



 in data quality throughout Federal statistical agencies.  To



 further these goals, the Subcommittee was given a "charge", or



 mission statement, of



 



       determining how data editing is currently being done in



       Federal agencies, recognizing areas that may need



       attention, and, if appropriate, recommending any



       potential improvements for the editing process.



 



 Among the many items investigated by the Subcommittee were the role



 of subject matter specialists; hardware, software, and the data



 base environment; new technologies of data collection and editing,



 such as CATI and CAPI; current research efforts in the various



 agencies; and some recently developed editing systems such as at



 the Census Bureau and Statistics Canada.



 



       In fulfilling its mission the Subcommittee followed a number



 of paths, including developing a questionnaire on survey editing



 practices, assembling several case studies of editing practices,



 investigating alternative editing systems and software, exploring



 research needs and practices, and compiling an annotated



 bibliography of literature on editing.  The result of the



 Subcommittee's work is its report (1990), organized into 5 main



 chapters with several supporting appendices as follows:



 



         Chapters                      Appendices



 



   I. Executive Summary             A.  Questionnaire Responses



  II. Background                    B.  Case Studies



 III. Current Editing Practices     C.  Software Functions Checklist



  IV. Editing Software              D.  Annotated Bibliography



   V. Research on Editing           E.  Glossary of Terms



 



 After discussing some general topics pertaining to editing and to



 the Subcommittee's work, this paper summarizes some of the main



 results of a questionnaire on Current Editing Practices, designed,



 administered, and compiled by the Subcommittee.  The two papers



 immediately following address, respectively, the subjects of



 software developments and recent research findings in editing.



 



 



 



 



 



 



 






 



 2. Data Editing--Definition and Concepts



 



      The subcommittee first addressed the definition of data



 editing.   While no universal definition of survey data editing



 exists, the following working definition was developed:



 



      Procedures designed and used for detecting erroneous



      and/or questionable survey data, with the goal of



      correcting (manually or electronically) as much of the



      erroneous data as possible (not necessarily all of the



      questioned data), usually prior to data imputation and



      summary procedures.



 



 Thus  data editing can be seen as a data quality improvement tool by



 which erroneous or highly suspect data are found and (if necessary)



 corrected.  We have focused primarily on editing rather than



 imputation in our work, though in practice the boundary between



 these is not absolute.



 



 3. Current Editing Practices



 



      To obtain a profile of current editing practices in the
 various Federal statistical agencies, the Subcommittee developed an



 editing questionnaire, which was completed for 117 Federal censuses



 and surveys representing 14 different Federal agencies.  These 117



 surveys were selected by subcommittee members, and thus they were



 not a scientific sample of all Federal surveys; however the



 Subcommittee felt that the 117 surveys represented a broad coverage



 of agencies and types of surveys or censuses that would present



 different editing situations.



 



      The Subcommittee members primarily involved with the



 questionnaire and editing profile were Charles Day, Yahia Ahmed,



 George Hanuschak, Rita Hohenbrink and Renee Miller.



 



      The questionnaire that was designed was a six-page document



 containing general questions about the particular survey as well as



 specific questions on editing.  The report contains a complete



 listing of the questions asked, along with a tally of the results



 obtained for the 117 surveys, and should serve as a useful



 reference for the current (1990) state of data editing practice.



 A few of the major results follow.



 



      Regarding general characteristics of the surveys, about three-



 fourths of the surveys are actually sample surveys, and the



 remaining one-fourth censuses.  A wide range of frequencies of



 collection are represented, from daily to quinquennial.  About one-



 fourth are completed by individuals, and three-fourths by



 establishments.  While traditional means of data collection such as



 mail, personal and telephone interviews were most common, a small



 






 



proportion of the surveys used CATI, and some were administrative



 records.



 



      Turning to editing, while the idea that there's no such thing



 as a free lunch seems to be as true of data editing as it is of



 anything else, there was wide variation in the actual cost of



 editing as a percent of total survey cost.  The median editing cost



 for the surveys was more than one-third of the total cost of the



 survey.   One of the interesting findings was that surveys of



 individuals had lower relative editing costs than surveys of



 establishments.



 



      The questionnaire also elicited information on when in the



 survey process the editing occurs.  For about two-thirds of the 117



 surveys, most of the data editing takes place after data entry.



 Editing at the time of data entry is on the increase but not yet



 common.



 



      Subject matter analysts play a large and important role in



 data editing.  In about three-fourths of the surveys, subject



 matter analysts review all unusual or large cases. Only seven of



 the surveys had little or no intervention by subject-matter



 specialists.  In this regard, we found that surveys of



 establishments had heavier involvement from subject-matter



 specialists than surveys of individuals; and this could also be



 related to the finding, mentioned above, of lower editing costs in



 individual than in establishment surveys.



 



      The degree of automation in data editing varies considerably



 among the surveys in our study.  In about three-fifths of the



 surveys, automated edit checking is done, but error correction is



 performed by clerks or analysts.  In about one-third of the cases,



 only unusual situations are referred to analysts.  Only 3% of the



 surveys were totally automated, though all but 1% had at least some



 automation.



 



      There are different types of edits that are applied to



 surveys. Almost all the surveys in our study use validation



 editing, which detects inconsistent data within a record.  About



 five-sixths also use macro editing, where aggregated data are



 examined.   The majority of surveys use other types of edits as



 well, such as range edits, edits using historical data, and ratio
 edits, some of which may overlap.  Additional information is also
 utilized in editing many of the surveys, such as comparisons with



 other surveys, comparison to a value estimated by regression



 analysis, or the use of interquartile measures.
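
 To make these terms concrete, the following minimal sketch (ours,
 in Python, not from the report; the field names and bounds are
 hypothetical) applies a range edit, a ratio edit, and an edit
 using historical data to a single record:

     def range_edit(value, lo, hi):
         # Flag a value outside prescribed bounds.
         return lo <= value <= hi

     def ratio_edit(num, den, lo, hi):
         # Flag a record whose ratio of two fields is out of bounds.
         if den == 0:
             return False        # cannot form the ratio; refer out
         return lo <= num / den <= hi

     def historical_edit(current, prior, max_rel_change=0.5):
         # Flag a value that moved too far from the prior period.
         if prior == 0:
             return current == 0
         return abs(current - prior) / abs(prior) <= max_rel_change

     # Hypothetical establishment record.
     rec = {"payroll": 120000.0, "employees": 40,
            "payroll_prior": 70000.0}

     flags = []
     if not range_edit(rec["employees"], 1, 10000):
         flags.append("employees out of range")
     if not ratio_edit(rec["payroll"], rec["employees"], 500, 20000):
         flags.append("payroll per employee out of bounds")
     if not historical_edit(rec["payroll"], rec["payroll_prior"]):
         flags.append("sharp change from prior period")

     print(flags)    # ['sharp change from prior period']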



 



      Satisfaction with the current editing system varied widely.



 About half the respondents were satisfied with their current



 editing systems, and another one-fourth felt only minor changes



 were needed.  The remaining one-fourth thought major changes were
 required, with 5% of those favoring a complete overhaul.



 






 



Among those desiring improvements, those most frequently mentioned



were:



 



     an on-line system for data editing,



     the use of prior periods' data to test the current period,



     more statistical edits,



     more sophisticated validation and macro editing,



     an audit trail,



     more automation, particularly automated error correction,



     user-friendlier systems,



     incorporation of imputation into the error package,



     evaluation of effects of data editing,



     reduction of the number of edit flags to follow up,



     incorporation of information on auxiliary variables,



     greater use of Expert Systems, and



     multivariate editing.



 



An audit trail, or a complete record of the original and corrected
data, the edits failed, and any other relevant information, is very



helpful in monitoring and improving the editing process.  The



importance of an evaluation of the effects of editing on the data,



and our current lack of knowledge of such effects, have also been



noted by Bailar (1990).
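
As a rough sketch of the record keeping an audit trail implies
(ours, not the report's; the entry layout is one plausible choice),
each change might be captured as follows:

     from dataclasses import dataclass, field
     from datetime import datetime

     @dataclass
     class AuditEntry:
         # One change to one field of one record: enough context to
         # reconstruct the raw data and attribute the action.
         record_id: str
         field_name: str
         original_value: float
         corrected_value: float
         edits_failed: list      # edit rules that flagged the value
         editor: str             # person or program making the change
         timestamp: datetime = field(default_factory=datetime.now)

     trail = [AuditEntry("0417", "yield", 1000.0, 100.0,
                         edits_failed=["yield_range"],
                         editor="analyst_3")]

     # With raw values retained, the impact of editing on a total
     # can be measured directly.
     print(sum(e.corrected_value - e.original_value for e in trail))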



 



4. Case Studies



 



     In addition to the breadth of valuable information obtained



from the questionnaire, the Subcommittee also felt that an



examination of a relatively few surveys in greater depth would shed



light on the complexity of the different editing situations in



operation.  Therefore several case studies are described, some in



two-paragraph summary format and others in greater detail.  These



comprise Appendix B of the report.  Anne Hafner and Yahia Ahmed had



primary responsibility for preparation of the Case Studies.



 



5. Recommendations



 



     The report lists a number of recommendations for future data



editing practice, some general and some specific. Many of them



fall into the following general categories.



 



     The quality of an agency's existing editing practices and



     technology should be examined in the light of possible



     improvements or alternatives, with respect to such



     criteria as cost efficiency, timeliness, statistical



     defensibility, and accuracy.



 



     Important recent developments in data processing, such as



     new microcomputers, workstations, local area networks,



     data base software, and mainframe linkages, should be



 






 



    examined for their possible incorporation into the survey



     editing process.



 



     Agencies should stay in communication with each other and



     with other professionals regarding their research in



     editing, particularly the development and implementation



     of new editing procedures and related methodologies such



     as data base technologies and expert systems.



 



References



 



Bailar, Barbara (1990), "Discussion of 'Survey Quality Profiles'",



Seminar on the Quality of Federal Data, May 23, 1990, COPAFS.  This



Proceedings.



 



Groves, Robert (1990), "Towards Quality in a Working Paper Series



on Quality", Keynote Address, Seminar on the Quality of Federal



Data, May 23, 1990, COPAFS.  This Proceedings.



 



Hanuschak, George, Yahia Ahmed, Laura Bauer, Charles Day, Maria



Gonzalez, Brian Greenberg, Anne Hafner, Gerry Hendershot, Rita



Hohenbrink, Renee Miller, Tom Petkunas, David Pierce, Mark



Pierzchala, Marybeth Tschetter and Paula Weir (1990), Data Editing



in Federal Statistical Agencies, Statistical Policy Working Paper



18, Statistical Policy Office, Office of Management and Budget,



Washington, DC.



 



 



 



 



 



 



 



 






 



                          EDITING SOFTWARE



          (An excerpt from Chapter IV of Working Paper 18)



 



                           Mark Pierzchala



              National Agricultural Statistics Service



 



 



 A. Introduction



 



      For  most surveys, large parts of the editing process are



 carried out through the use of computer systems.  The task of the



 Software Subgroup has been to investigate software that in some way



 incorporates new methodologies, has new ways of presenting data,



 operates in recently developed hardware environments, or integrates



 editing with other functions.  In order to fulfill this charge, the



 Subgroup has evaluated or been given demonstrations of new editing



 software.  In addition, the Subgroup has developed an editing             

 

 software evaluation checklist that appears in Appendix C of



 Statistical Policy Working Paper 18. This checklist contains



 possible functions and attributes of editing software, which would



 be useful for an organization to use when evaluating editing



 software.



 



      Extremely technical jargon can be associated with new editing



  systems, and new approaches to editing may not be familiar to the



 reader.  The purpose of section B is to explain these approaches



 and their associated terminology as well as to discuss briefly the



 role of editing in assuring data quality.



 



      A distinction must be made between generalized systems and



 software meant for one or a few surveys.  The former is meant to be



 used for a variety of surveys.  Usually there is an institutional



 commitment to spend staff time and money over several years to



 develop the system.  It is hoped that the investment will be more



 than recaptured after the system is developed through the reduction



 in resources spent on editing itself and in the elimination of



 duplication of effort in preparing editing programs. Some software



 programs have been developed that address specific problems in a



 particular survey.  While the ideas inherent in this software may



 be of general interest, it may not be possible to apply the



 software directly to other surveys.  Section C of Chapter IV of



 Working Paper 18 describes three generalized systems in some



 detail, and then briefly describes other systems and software.



 These three systems have been used or evaluated by Subgroup members



 in their own surveys.



 



      New and exciting statistical methodology is also improving the



 editing process.  This includes developments in detecting outliers,



 aggregate level data editing, imputation strategy, and statistical



 quality control of the process itself.  The implementation of these



 activities, however, requires that the techniques be encoded into



 a computer program or system.



 






 



B. Software Improving Quality and Productivity



 



 Reasons for the Development of New Editing Software



 



      Traditional editing systems do not fully utilize the talents



 or expertise of subject matter specialists.  Much of their time may



 be spent in dealing with unimportant or spurious error signals and



 in coping with system shortcomings.  As a result, the specialist



 has less time to deal with important problems.  In addition,



 editing systems may be able to give feedback on the survey itself.



 For example, a pattern of edit failures may suggest



 misunderstandings by the respondent or interviewer.  If this is



 recognized, then the expertise of the specialist may then be used



 to improve the survey itself.



 



      Labor costs are a large part of the editing costs and are



 either steady or increasing, whereas the cost of computing is



 decreasing.  In order to justify the heavy reliance on people in



 editing, their productivity will have to be improved through the



 use of more powerful tools.  However, even if productivity is



 improved, different people may do different things in similar



 situations.  If so, this makes the process less repeatable



 (reproducible) and more subject to criticism.  When work is done on



 paper, it is hard to track, and it is impossible to estimate the



 effect of editing actions on estimates.  Finally, some tasks are



 beyond the capability of human editors.  For example, it may be



 impossible for a person to maintain the multivariate frequency



 structure of the data when making changes.



 



      These reasons and several others are commonly given as



 explanations for the increased use of computer software to improve



 the editing process.  It is in the reconciliation of these two



  goals (the increased use of computers for some tasks and the more



 intelligent use of human expertise), that the major challenge in



 software development lies.  There will always be a role for people,



 but it will be modified.  One positive feature of new editing



 software is that it can often improve the quality of the editing



 process and productivity at the same time.



 



 



 Ways That Productivity Can Be Improved



 



      One way to improve productivity is to break the constraints



 imposed by computer systems themselves. The use of mainframe



 systems for editing data is widespread.  In some cases, however,



 an editor may not use the system directly.  For example, error



 signals may be presented on paper printouts, and changes entered by



 data typists.  Processing costs may dictate that editing jobs are



 run at low priority, overnight, or even less frequently.  The



 effect of the changes made by the editor may not be immediately



 



 






 



  known: thus, paper forms may be filed, taken from files, and



   re-filed several times.



 



       The proliferation of microcomputers promises to eliminate many



   of these bottlenecks, while at the same time it creates some



   challenges in the process.  The editor will have direct access to



   the computer, and will be able to prioritize its use.  Once the



   microcomputer is acquired, user fees are eliminated, thus



   resource-intensive programs such as interactive editing can be



   employed, provided the microcomputers are fast enough.  Moving from



   a centralized environment (i. e., the mainframe) to a decentralized



   environment (i.e., microcomputers) will present challenges of



   control and consistency.  In processing a large survey on two or



   more microcomputers, communications will be necessary. This will



   best be done by connecting them into a Local Area Network (LAN).



 



        New systems may reduce or eliminate some editing tasks.  For



   example, where data are edited in batch and error signals are



   presented on printouts, a manual edit of the questionnaires before



   the machine edit may be a practical necessity.  Editing data and



   error messages on a printout can be a hard, unsatisfactory chore



   because of the volume of paper and the static and sometimes



   incomplete presentation of data.  The purpose of the manual edit in



   this situation is to reduce the number of machine-generated error



   signals.  In an interactive environment, information can be



   efficiently presented and immediately processed.  The penalty



   associated with machine-generated signals is greatly reduced. As



   a result, the preliminary manual edit may be eliminated.  In



   addition, questionnaires are handled only once, further reducing



   filing and data entry tasks.



 



        Productivity may be increased by reducing the need for editing



   after data are collected.  Instruments for Computer Assisted



   Telephone Interviewing (CATI), Computer Assisted Personal



    Interviewing (CAPI), and on-site data entry and editing programs



   are gaining wider use.  Routing instructions are automatically



   followed, and other edit failures are verified at the time of the



   interview.  There may still be many error signals from suspicious



    edits; however, the analyst has more confidence in the data and is



   more likely to let them pass.



 



        There are two major ways that productivity can be improved in



   the programming of the editing instruments.  First is to provide a



   system that will handle all, or an important class, of the agency's



   editing needs.  In this way the applications programmer need not



   worry about systems details.  For example, in an interactive



   system, the programmer does not have to worry about how and where



   to flag edit failures as it is already provided.  The programmer



   only codes the edit specification itself.  In addition, the



   end-user has to learn only one system when editing different



   surveys.  Second is the elimination of multiple specification and



   programming of variables and edits.  For example, if data are



 






 



  collected by CATI, and edited with another system, then essentially



   the same edits will be programmed twice, possibly by two sets of



   people.  If the system integrates several functions, e.g., data



   entry, data editing, and computer assisted data collection, then



   one program may be able to handle all of these tasks.  This



   integration would also reduce time spent on data conversion from



   one system to another.



 



 



   Systems That Take Editing and Imputation Actions



 



        Some edit and imputation systems take actions usually reserved



   for people.  They choose fields to be changed and then change them.



   The human element is not removed, rather this expertise is



   incorporated into the system.  One way to incorporate expertise is



   to use the edits themselves to define a feasible region.  This is



   the approach outlined in a famous article by Fellegi and Holt



   (1976).  Edits that are explicitly written are used to generate



    implied edits.  For example, if 100 < x/y < 200 and 3 < y/z < 4
    are explicit edits, then an implied edit obtained algebraically is
    300 < x/z < 800.  Once all implied edits are



   generated, the set of complete edits is defined as the union of the



   explicit and implied edits.  This complete set of edits is then



   used to determine a set of fields to be changed for every possible



   edit failure.  This is called error localization. An essential



   aspect to this method is that changes are made to as few fields as



   possible, or alternatively to the least reliable set of fields



   which are determined by weights given to each field.
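
    The algebra in the example above is mechanical enough to sketch
    in a few lines (a sketch of ours covering only the implied-edit
    step, not the full Fellegi-Holt error-localization algorithm;
    it assumes all fields are positive):

        def imply_ratio_edit(edit_xy, edit_yz):
            # Combine lo1 < x/y < hi1 and lo2 < y/z < hi2 into the
            # implied edit lo1*lo2 < x/z < hi1*hi2 (positive fields).
            lo1, hi1 = edit_xy
            lo2, hi2 = edit_yz
            return (lo1 * lo2, hi1 * hi2)

        print(imply_ratio_edit((100, 200), (3, 4)))    # (300, 800)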



 



        The analyst is given an opportunity to evaluate the explicit



   edits.  This is done through the inspection of the implied edits



   and extremal records (the most extreme records that can pass



   through the edits without causing an edit failure).  In inspecting



   the implied edits, it may be determined if the data are being



   constrained in an unintended way. In inspecting extremal records,



   the analyst is presented with combinations of the most extreme



   values possible that can pass the edits.  The human editor has



   several ways to inject expertise into this kind of a system:  (1)



   the specification of the edits;  (2) the inspection of implied



   edits and extremal records and then the re-specification of edits;



   (3) the weighting of variables according to their relative



   reliability.



 



        There are some constraints in systems that allow the computer



   to take editing actions.  Fellegi and Holt systems cannot handle



   certain kinds of edits, notably nonlinear and conditional edits.



   Also algorithms that can handle categorical data cannot handle



   continuous data and vice versa.  Within these constraints (and



    others), most edits can be handled.  For surveys with continuous



   data, a considerable amount of human attention may still be



   necessary, either before the system is applied to data or after.



 



 






 



      Another way that computers can take editing actions is by



 modeling human behavior.  This is the "expert system" approach.



 For example, if typically maize yields average 100 bushels per



 acre, and the value 1,000 is entered, then the most likely



 correction is to assume that an extra zero was typed.  The computer



 can be programmed to substitute 100 for 1,000 directly and then to



 re-edit the data.
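
 A minimal sketch of such a rule (ours; the yield bounds and the
 divide-by-ten repair are illustrative assumptions, not a
 specification from Working Paper 18):

      def correct_extra_zero(value, lo=50.0, hi=250.0):
          # If a reported yield fails its range edit but becomes
          # plausible after dropping one trailing zero, assume a
          # keying error; otherwise refer the record to an analyst.
          if lo <= value <= hi:
              return value, "passed"
          if lo <= value / 10.0 <= hi:
              return value / 10.0, "corrected: extra zero assumed"
          return value, "referred to analyst"

      print(correct_extra_zero(1000.0))
      # (100.0, 'corrected: extra zero assumed')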



 



 



 Ways That Data Quality Can Be Improved or Maintained



 



      It is not clear that editing done after data collection can



 always improve the quality of data by reducing non-sampling errors.



 An organization may not have the time or budget to recontact many



 of the respondents or may refrain from recontacts in order to



 reduce respondent burden.  Additionally, there may be cognitive



 errors or systematic errors that an edit system cannot detect.



 Often, all that can be done is to maintain the quality of the data



 as they are collected.  To use the maize yield example again, if



 the edit program detects 1,000 bushels per acre, and sets the value



 to 100 bushels per acre, then the edit program has only prevented



 the data from getting worse.  Suppose the true value was really 103



 bushels per acre.  The edit and imputation program could not get



 the value closer to the truth in this case.  Detecting outliers is



 usually not the only problem.  The proper action to take after



 detection is the more difficult problem.  One of the main reasons



 that Computer Assisted Data Collection is employed is that data are



 corrected at the time of collection.



 



      There are a few ways that an editing system may be able to



 improve data quality. A system that captures raw data, keeps track



 of changes, and provides well conceived reports, may provide



 feedback on the performance of the survey.  This information can be



 used to improve the survey in the future.  To take another



 agricultural example, farmers often harvest corn for silage (the



 whole plant is harvested, chopped into small pieces, and blown into



 a silo). Production of silage is requested in tons.  Farmers often



 do not know their silage production in tons.  Instead, the farmer



 will give the size (diameter and height) of all silos containing



 silage.  In the office, silo sizes are converted into tons of



 production.  If this conversion takes place before data are



 entered, then there is no indication from the machine edit of the



 extent of this reporting problem.
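
 For concreteness, the office conversion might be sketched as
 follows (ours; the cylinder volume is standard geometry, but the
 density figure is a hypothetical stand-in for whatever conversion
 factor an agency actually uses):

      import math

      LBS_PER_CUBIC_FOOT = 40.0    # hypothetical silage density

      def silo_tons(diameter_ft, height_ft):
          # Convert reported silo dimensions to estimated tons.
          volume = math.pi * (diameter_ft / 2.0) ** 2 * height_ft
          return volume * LBS_PER_CUBIC_FOOT / 2000.0   # lbs -> tons

      print(round(silo_tons(20.0, 60.0)))    # about 377 tons

 Performing this conversion inside the edit system, rather than
 before data entry, would leave a machine-readable record of how
 often the reporting problem occurs.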



 



      Another way that editing software can improve the quality of



 the data is to reduce the opportunity cost of editing.  The time



 spent on editing leaves less time for other tasks, such as



 persuading people to participate, checking overlap of respondents



 between multiple frames, and research on cognitive errors.



 



 



 



 






 



Ways That Quality of the Editing Process Can Be Defended or



 Confirmed



 



      There is a difference between data quality and the quality of



 the editing process itself.  To refer once again to the maize yield



 example, a good quality process will have detected the



 transcription error.  A poor quality process might have let it



 pass.  Although neither process will have improved data quality,



 the good quality process would have prevented their deterioration



 from the transcription error.  Editing and imputation have the



 potential to distort data as well as to maintain their quality.  



 This distortion may affect the levels of estimates and the



 univariate and multivariate distributions.  A high quality process



 will attempt to minimize distortions.  For example, in Fellegi and



 Holt systems, changes to the data will be made to the fewest fields



 possible and in a way such that distributions are maintained.



 



      A survey organization should be able to show that the editing



 process is not abusing the data.  For editing after data



 collection, this may be done by capturing raw (unedited) data and



 keeping track of changes and the reasons for change.  This is



 called an audit trail.  Given this record keeping, it will be



 possible to estimate the impact of editing and imputation on



 expansions and on distributions.  It will also be possible to



 determine the editor effect on the estimates.  In traditional batch



 mode editing on paper printouts, it is not unusual for two or more



  specialists to edit the same record.  For example, one may edit the



 questionnaire before data entry while another may edit the record



 after the machine edit.  In this case, it is impossible to assign



 responsibility for an editing action.  In an on-line mode one



 person handles a record until it is done.  Thus all changes can be



 traced to a person.  For editing at the time of data collection,



 (e.g., in CATI), it may be necessary to conduct an experiment to



 see if either the mode of collection, or the edits employed, will



 lead to changes in the data.



 



      A high quality editing process will have other features as



 well.  For example, the process should be repeatable, in time and



 in space.  This means that the same data passed through the same



 process in two different locations, or twice in one location, will



 look (nearly) the same.  The process will have recognizable



 criteria for determining when editing is done.  It will detect



 real errors without generating too many spurious error signals.



 The system should be easy to program in and have an easy user



 interface.  It should promote the integration of survey functions



 such as micro- and macro-editing.  Changes made by people should



 be on-line (interactive) and traceable.  Database connections will



  allow for quick and easy access to historical and sampling frame



 data.   An editing system should be able to take actions of minor



 impact without human intervention.  It should be able to



 accommodate new advances in statistical editing methodology.



 






 



Finally, quality can be promoted by providing statistically



defensible methods and software modules to the user.



 



Acknowledgements



 



    Other members of the Editing Software Working Group for



Working Paper 18 were Tom Petkunas, Bureau of the Census, Gerry



Hendershot, National Center for Health Statistics, Charles Day,



Internal Revenue Service, Marybeth Tschetter, Bureau of Labor



Statistics, and Rita Hohenbrink, National Agricultural Statistics



Service.



 



 



 



 



 



 



 



 






 



                         RESEARCH ON EDITING



 



                             Yahia Ahmed



                      Internal Revenue Service



 



 Introduction



 



      This paper is one of three papers presented in a session



 organized to present topics from the Statistical Policy Working



 Paper 18, "Data Editing in Federal Statistical Agencies."  The



 Subcommittee on Data Editing in Federal Statistical Agencies was



 established by the Federal Committee on Statistical Methodology to



 document, profile and discuss data editing practices in Federal



 surveys.  To effectively accomplish its mission, the subcommittee



  was divided into four major groups: Editing Profile, Case Studies,



 Editing Software, and Editing Research.



 



      The purpose of this paper is to present briefly the goals,



 findings and recommendations of the Editing Research Group.  A more



 detailed description of editing research is provided in Chapter V



 of the Working Paper.



 



      The goals of the Editing Research Group were to identify areas



 in which improvements to edit systems would prove most useful, to



 describe recent and current research activities designed to enhance



  edit capabilities, to make recommendations for future research, and



 to develop an annotated bibliography on editing.



 



  Areas Which Need Improvement



 



      The Editing Research Group used two sources of information to



 identify areas which need improvement.  The first source was the



  editing profile questionnaire, which was administered to managers of



 117 Federal surveys covering 14 different agencies.  This



  questionnaire included questions about desired edit improvements.  One



 question asked was "For future applications, what would you like



 your edit system to do that it doesn't do now?" The second source



 was discussions with those responsible for edit tasks within a



 number of Federal agencies.  The following areas emerged as



 priorities:



 



  o    More on-line edit capabilities

  o    Better ways to detect potentially erroneous responses

  o    More sophisticated and extensive macro-editing

  o    Evaluation of the effect of data editing.



 



 






 



 Areas of Edit Research



 



      Much editing research has been conducted in national



 statistical offices around the world.  It is these organizations,



 which conduct huge and complicated surveys, that have the most to



 be gained from developing new systems and techniques.  They also



 have the resources upon which to draw for this development.



 



      One area of current research interest is that of "on-line



 edit capabilities".  BLAISE, SPEER, and PEDRO discussed in the



 preceding paper are examples of such research activities.



 



      A second area of active research is in the detection of



 potentially erroneous responses.  The method most commonly used is



 to employ explicit edit rules.  For example, edit rules may require



 that:



 



    1) the ratio of two fields lie between prescribed bounds,

    2) various linear inequalities and/or equalities hold, or

    3) the current response be within some range of a predicted
       value based on a time series or other models (see the
       sketch after this list).
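
  A minimal sketch of the third kind of rule (ours, in Python; the
  smoothing constant and tolerance are illustrative), using simple
  exponential smoothing to form the predicted value:

      def smoothed_forecast(history, alpha=0.3):
          # One-step-ahead forecast from exponential smoothing.
          level = history[0]
          for obs in history[1:]:
              level = alpha * obs + (1 - alpha) * level
          return level

      def prediction_edit(current, history, tol=0.25):
          # Accept a response within +/- tol (as a fraction) of the
          # value predicted from past periods.
          predicted = smoothed_forecast(history)
          return abs(current - predicted) <= tol * abs(predicted)

      history = [98.0, 102.0, 99.0, 101.0]
      print(prediction_edit(104.0, history))    # True
      print(prediction_edit(160.0, history))    # False: flag it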



 



      Edit rules and parameters are highly survey specific.  A



 related area of editing research is the design of edit rules and



 the development of methods for obtaining sensitive parameters.



 



      In order to make sure that all errors are flagged, often many



 unimportant error flags are generated.  These extra flags not only



 take time to examine but also distract the reviewer from important



 problems.  These extra flags are generated because of the way that



 the error limits are set.  A related area of research focuses on



  developing statistical editing techniques to reduce the number of
  error flags while, at the same time, ensuring that not many errors
  escape detection.  Several research studies in which different
  statistical techniques (such as clustering, exponential smoothing,
  and Tukey's biweight) are used to detect potentially erroneous
  responses or to set error bounds are described in the working paper.
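
  As one example of the flavor of these techniques, a robust bound
  built around Tukey's biweight location might be sketched as
  follows (a sketch of ours, not a method prescribed in the working
  paper; the tuning constants are conventional choices):

      import numpy as np

      def biweight_location(x, c=6.0):
          # Tukey's biweight center; values far from the median
          # receive zero weight.
          x = np.asarray(x, dtype=float)
          m = np.median(x)
          mad = np.median(np.abs(x - m))
          if mad == 0:
              return m
          u = (x - m) / (c * mad)
          w = np.where(np.abs(u) < 1, (1 - u**2) ** 2, 0.0)
          return m + np.sum(w * (x - m)) / np.sum(w)

      def flag_outliers(x, k=4.0):
          # Flag responses more than k robust-scale units from the
          # biweight center, so wild values cannot inflate the limits.
          x = np.asarray(x, dtype=float)
          scale = 1.4826 * np.median(np.abs(x - np.median(x)))
          return np.abs(x - biweight_location(x)) > k * scale

      print(flag_outliers([96, 101, 99, 104, 98, 1000]))
      # [False False False False False  True]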



 



      In contrast to the rule-driven method for the detection of



 potentially erroneous response combinations within a record, one



 alternative procedure is to analyze the distribution of



  questionnaire responses.  Records which do not conform to the



 observed distribution are then targeted as outliers and are



 selected for review.  Although there has been research interest in



 this method, no application of these multivariate methods was



 found.



 



 



 



 



 






 



 Recommendations



 



      The most important recommendation is that agencies recognize



  the value of editing research and place a high priority on
  devoting resources to their own research, on monitoring
  developments in data editing at other agencies and elsewhere, and
  on implementing improvements.



 



      Often innovations in editing methods made by survey staff are



 viewed as enhancements to processing for that particular survey and



 little thought is given to the broader applicability of methods



  developed.  Accordingly, survey staff do not prepare discussions of



 new methods for publication.  We encourage survey staff to take the



  time to describe their work and publish it in order to share



 their experiences with others who may be working under similar



 conditions.  It is often in such articles that methods which may be



 applicable to more than one survey are first introduced and



 described.



 



      The survey on editing practices indicated that there was



 little analysis of the effect of editing on the estimates that were



 produced.  Considering that the cost of editing is significant for



 most surveys, this is clearly an area in which more work is



 required.  A related issue is the need to attempt to determine when



  to edit and when not to edit.



 



      Clearly, not all of the errors are going to be found, and we



 should not attempt to find them all.  Therefore, there is a need to



 design guidelines for determining what is an acceptable level of



 editing.



 



      Another neglected research area in this country concerns the



 editing of data at the time they are keyed from mail responses.



 This area is usually discussed in the setting of quality control;



 however, it is an area that can benefit from further research from



 the perspective of data editing.



 



 



 Annotated Bibliography



 



      It is quite difficult to provide a complete assessment of



 current research activities in the area of editing because so much



  of the research, progress, and innovation is described only in
  specific documentation.  However, the group was able to identify 86
  references which describe research efforts over recent years.



 Appendix D of the working paper contains the annotated



  bibliography.  The annotations are brief and are only intended to



 give a very general idea of the paper's content.  The appendix



 provides a valuable source of information on the editing



  literature.  In addition, it includes papers which describe the



 underlying methods, the software, proposed uses, and possible



 



 






 



  advantages of three generalized editing software systems -- GEIS,



   BLAISE and SPEER.



 



   Acknowledgements



 



        Other members of the Editing Research Group for Working Paper



   18 were Laura Bauer, Federal Reserve Board, Brian Greenberg, Bureau



   of the Census, Renee Miller, Energy Information Administration,



   David Pierce, Federal Reserve Board, Paula Weir, Energy Information



   Administration.



 



 



 



 



 



 



 



 






 



                               DISCUSSION



 



                           Charles E. Caudill



               National Agricultural Statistics Service



 



 



       As Administrator of a Federal-State Cooperative Statistical



  Agency, I am quite impressed with the information contained in OMB



  Statistical Policy Working Paper No. 18 on Data Editing in Federal



  Statistical Agencies.  The working paper thoroughly documents many
  existing editing practices and generalized editing software
  developments, and provides a detailed software evaluation protocol.



  In addition, it covers current research activities on editing,



  provides an annotated bibliography and has a good executive summary



  including recommendations.



 



       I believe that this report, if read and seriously considered



  by federal survey managers and administrators, can have a



  substantial effect on improving productivity.  Thus, "precious"



  resources could be freed up to more formally address nonsampling



  errors, quality control, and total survey error models,



  measurements and structures.  In my opinion, if there were ever a



  report that survey administrators should take seriously, this is



  it.



 



      There are several more detailed comments and observations that



  I have about working paper number 18.  The data on the costs of



  editing were intriguing.  My observation is that there may be an



  upward bias in the data, and some non-editing cost may have been



  included.  However, even if this is the case, there obviously is



  still plenty of room for productivity gains in the editing process.



  With the proliferation of personal computer networks and data base



  software, there is substantial potential to improve the



  productivity of editing systems by being on-line and providing the



  editor with immediate screen feedback and re-editing of their



  proposed changes.



 



      Recent advances in computer processing technology also make
  audit trails available to more users.  Inexpensive
  audit trails provide the capability to analyze and conduct research
  on the effects of editing on the estimators and on the overall
  performance of the survey.



 



     The detailed checklist of edit software system features in



  Appendix C of working paper 18 will be beneficial to both the



  development of new systems and maintenance and evaluation of



  existing systems.  The annotated bibliography of articles and



  papers on editing presented in Appendix D will be valuable for



  researchers and system developers as a substantial source of



  literature and information.



 



 



 






 



        Working paper 18 certainly demonstrated that current data



  editing practices are labor intensive.  Many remain mainframe and



  batch oriented, with multiple passes of the data.  Also, I think



  that there may be a tendency to stay with existing systems too



  long.



 



       My final comments are on total quality management of surveys.



  As an Administrator, one of my major concerns is with the quality



  of the final products and reports that the Agency delivers to the



  public.  Thus, if the editing process can be made more efficient,



  without degrading accuracy, then that adds to the potential of



  using the saved resources on other important areas of the survey



  process.  Total quality management techniques applied to surveys



  are useful tools in efficiently identifying the most important



  potential sources of survey error.



 



                              DISCUSSION



 



                           Richard Bolstein



                       George Mason University



 



      The serious impact that erroneous survey data can have on



  results, the fact that the number of errors tends to increase with



  the size and complexity of the survey, and the relatively large



  proportion of survey costs currently required to edit and correct



  data, make the need for new and improved methods of data editing



  imperative.  To this end, the authors have done a laudable job in



  researching methods currently used, presenting several case



  studies, testing and discussing the advantages and disadvantages of



  some current and developing editing software, and providing a



  synopsis of current research.



 



      A working definition of editing was clearly necessary in this



  study, since, among other things, in order to estimate costs



  of editing, a fairly rigorous definition of the scope of editing was



  required.  The working definition used by the authors, namely,



  "procedure(s) designed and used for detecting erroneous and/or



  questionable survey data with the goal of correcting as much of the



  erroneous data as possible, usually prior to data imputation and



  summary procedures" is quite suitable for this purpose.  We should



  keep in mind, however, that while it feels comfortable to clean up



  erroneous data prior to imputation for missing data, in practice



  the two are often intertwined.



 



     The paper states that the cost of editing was available for



  40% of the 117 surveys in the sample, and cost estimates were



  possible for an additional 40%.  It was reported that between 75%



  and 80% of these surveys had editing costs of at least 20% of total



  costs.  It is not too meaningful to compare the relative costs of



  editing across all types of surveys, however, since one would



  naturally expect that these costs would be higher in less expensive



  surveys (such as mail or administrative records) than in expensive



  surveys (such as personal interview, surveys of institutions), as



  found by the authors.  Thus, it would be more informative if the



  relative cost figures cited above were reported by survey type.



  Another factor that can account for a large percentage of editing



  costs is the presence of a relatively large number of questions



  requiring open-ended responses and subsequent coding of the



  responses.  But although the distribution of the relative cost of



  editing may vary considerably, there is no doubt that editing is



  costly and methods to reduce this cost and improve data quality are



  much needed.



 



     Finally, no discussion of the costs of editing is complete



  without determining what percentage is due to bad data that should



  not have occurred but for inadequate interviewer training, poor



 supervision and quality control of interviewers, and simple common
  sense errors.  These are errors which should not have occurred;
  their cost should be deducted from the editing cost estimates for
  the surveys above, since they are likely to have varied
  considerably.



 



       Although elimination of such unnecessary errors was not part



  of the project of the three authors, it seems appropriate in a



  discussion of improving data editing procedures to mention ways in



  which the need for editing can be reduced.  To illustrate an



  example of a common sense error that should be eliminated, in a



  certain survey, the sponsor of which I will not name, fishermen are



  interviewed and their catch is weighed and measured.  The



  interviewer is supposed to record weight in kilograms, but the



  scale used shows weight in both pounds and kilograms.  As expected,



  frequent errors occur. The obvious solution is to use a scale that



  only shows kilograms, but when I suggested this to the survey firm,



  the response was "no one makes such a scale".  When I then



  suggested taping over the side of the scale showing pounds, the



  reply was "but the fishermen want to know what their fish weigh in



  English".  Finally, I suggested taping over the kilogram side of



  the scale, have the interviewer record the weight in pounds, and



  have the data entry program convert it to kilograms.  The response



  to this suggestion I am sure you have all heard before: "well,



  that's the way we're used to doing it".  There are numerous other



  examples of course (for example, in some surveys interviewers are



  required to record the hour in military time).
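
      A data entry program can absorb the conversion in the scale
 example above.  The sketch below is a minimal illustration of the
 suggested fix: the interviewer keys in pounds as read from the
 scale, and the program stores kilograms.

    POUNDS_PER_KILOGRAM = 2.20462

    def record_catch_weight(pounds):
        """Accept a weight read from the pounds side of the scale and
        store it in kilograms, as the survey requires."""
        if pounds <= 0:
            raise ValueError("weight must be positive")
        return round(pounds / POUNDS_PER_KILOGRAM, 2)

    print(record_catch_weight(11.0))   # 4.99 kilograms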



 



       The most promising methods to reduce editing costs and improve



  data quality (after elimination of the unnecessary errors) are



  found in interactive data entry software and in general editing



  software systems.  These methods seem appropriate for large,



  complex surveys, or surveys which are repeated.  For small one-time



  surveys the cost of purchasing, learning, and programming the



  software will most likely outweigh the savings, as is true even
  with CATI.  But this is generally not the case with surveys



  gathering Federal Data.  The three generalized editing software



  systems studied in detail by Mark Pierzchala seem very promising,



  especially BLAISE because of its generality and ability to handle



  both categorical and continuous data.  GEIS and SPEER are specific



  to economic type surveys.



 



       To what extent can graphics or other theoretical tools be used



  in editing systems?  The STAR WARS software described uses graphics



  to compare edited values with the originals, but not to detect



  outliers.  The parallel coordinate system for graphic displays of



  high-dimensional data [see Miller and Wegman (1989), Wegman (1990)]



  may be used to detect outliers.  Yahia Ahmed noted that analysis of



  the multivariate distribution of questionnaire responses to flag



  records that don't conform to the distribution as outliers has been



  infrequently used, no doubt due to its complexity.  I believe that



  graphical methods for detecting outliers will meet with more
  acceptance than the multivariate analysis approach has, but they would
  not be cheap (time-wise) and probably would be best used as a final
  check rather than at the front-end of the editing task.
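
      As a rough sketch of how a parallel coordinate display might be
 used in editing, the fragment below draws each record as a polyline
 across one axis per variable; a line that departs from the bundle
 stands out as a potential outlier.  The variables and values are
 invented, and the pandas plotting helper stands in for purpose-built
 editing graphics.

    import matplotlib.pyplot as plt
    import pandas as pd
    from pandas.plotting import parallel_coordinates

    # Hypothetical survey records; "status" marks a record already
    # suspected by a prior screening step.
    frame = pd.DataFrame(
        {"employees": [12, 15, 14, 90],
         "payroll":   [30, 36, 33, 20],
         "hours":     [480, 600, 560, 4000],
         "status":    ["ok", "ok", "ok", "review"]})

    parallel_coordinates(frame, class_column="status",
                         color=["gray", "red"])
    plt.title("Questionnaire records on parallel coordinates")
    plt.show()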



 



      Finally, I have two recommendations.  In view of the



  increasing abundance of software we will see in the future, we



  should construct a standard set of test data sets for evaluating



  present and future software editing systems.  Secondly, a one or



  two-day demonstration seminar of some of these systems would be



  well received.



 



 



  References



 



  Miller, J.J. and Wegman, E.J. (1989), "Construction of line



  densities for parallel coordinate plots", Technical Report No. 53,



  Center for Computational Statistics, George Mason University.



 



  Wegman, E.J. (1990), "Hyperdimensional data analysis using parallel



  coordinates", Journal of the American Statistical Association, to



  appear.



 



 



                                Session 6



                   COMPUTER ASSISTED STATISTICAL SURVEYS



 



 



 



 



 



 



 



 






 



   OVERVIEW OF COMPUTER ASSISTED SURVEY INFORMATION COLLECTION



 



                         Richard L. Clayton



                  U. S. Bureau of Labor Statistics



 



 



      This section provides a summary of Working Paper 19 on



 Computer Assisted Survey Information Collection (CASIC).  For



 additional information, we encourage you to see this document.



 



      The power of rapid calculating has been applied to virtually



 every phase of the survey process, including sample design and



 selection, and estimation.  The most important implication of these



 applications is that survey practitioners can now consider



 a growing range of techniques which were not affordable prior to



 the availability of inexpensive and fast calculating capability.



 



      The field of computer assisted collection applications may be



 the area of greatest and most rapid change in survey methods.  This



 field includes the rapidly expanding variety of applications based



 on the availability of powerful and inexpensive computers.  Most



 familiar of the new techniques are CATI and CAPI.  However, a



 variety of other collection methods are being developed across the



 Federal government's statistical agencies, including Touchtone Data



 Entry, Prepared Data Entry, and, more recently, Voice Recognition



 Entry.



 



      High quality published data begins with collecting high



 quality data from our respondents.  Much of survey processing



 addresses, and compensates for, weaknesses in the quality of the



 collected data and the data we do not collect.  Methods should be
 developed which capture data quickly and accurately and which allow
 respondents to answer our questions accurately and quickly.  With
 this in mind, we provided the results of research and development
 activities throughout the Federal government that use new
 technological features, both in seeking new data collection methods
 and in modifying the old, to improve the quality of data collection.



 



      For the purposes of this report, we defined computer assisted



 survey information collection methods as those using computers as



 a major feature in the collection of data from respondents, and in



 transmitting data to other sites for post-collection processing.



 



      Goal:  The overall goal of Working Paper 19 was to provide
 information on new data collection methods and to challenge Federal
 survey managers to reconsider their operations in light of survey
 methods recently made available, or made attainable, through
 changing technology, and thus to reassess how they accomplish the
 common goal of providing the public with critical information that
 is accurate, timely, and relevant.  We hope that by sharing
 information and experiences, others may gain and advance the
 overall effectiveness of governmental activities.



 






 



     Objectives:  The primary objective is to describe emerging



methods of interactive electronic data collection, the potential



 benefits, and current examples of their use in Federal surveys.  In



describing current uses and tests, a secondary objective is to pose



questions about the implications of use of computer assisted



methods and try to suggest some answers.  These questions involve



 such factors as quality, costs, and respondent reaction to



computerized surveys.



 



      Scope:  The survey operations included in this report include
 all of the activities and tasks from transmittal of the
 questionnaire through conduct of the interview, data entry, editing,
 and followup for nonresponse or edit reconciliation.



 



     The last major survey operation to benefit from automation is



data collection.  Computers were first applied to collection using



mainframes to control certain aspects of telephone collection, and



Computer Assisted Telephone Interviewing (CATI) was born.  The



first applications of CATI stimulated new research worldwide



 evaluating the impact of CATI on the survey error profile and



costs.  CATI is now used to assist interviewers in all collection



activities, including scheduling calls, controlling detailed



interview branching, editing and reconciliation, providing much



greater control over the collection process and reducing many



sources of error.  At the same time, a tremendous amount of



 information is captured by the computer, providing additional



insight into the data collection process.



 



     The ongoing advances in computer technology, and particularly



the advent of microcomputers, continue to offer additional



opportunities for improving the quality of published data.  The



first portable computers were quickly pressed into service to



 duplicate the advantages of CATI in a personal visit environment.



Thus, Computer Assisted Personal Interviewing (CAPI) was launched



from the work in CATI.



 



     While CATI and CAPI represent advances for surveys requiring



interviewers, microcomputers are now finding important roles in



self-administered questionnaires, where interviewers are not



needed.



 



     Prepared Data Entry (PDE), developed by the Energy Information



 Administration, allows respondents who have a compatible



microcomputer or terminal to access and complete the questionnaire



directly on their screen.



 



     Touchtone Data Entry (TDE), developed at the Bureau of Labor



Statistics, allows respondents to call a toll-free telephone



number.   Questions posed by a computer are answered using the



 keypad of their touchtone telephone.  The machine repeats the
 answers to the respondent for verification, and the answers are
 stored in a database.  TDE systems are now commonplace for bank
 transfers and telephone call routing, for example.  We have just
 applied existing technology to the data collection process.
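
      A TDE exchange can be pictured as a short scripted dialogue.
 The sketch below simulates the flow just described: prompt, keypad
 entry, read-back for verification, then storage.  The prompts,
 fields, and keypresses are all hypothetical.

    def touchtone_session(prompts, keypresses):
        """Simulate a TDE call: pose each prompt, read keypad digits,
        repeat the answer back, and store confirmed answers."""
        answers = {}
        keys = iter(keypresses)
        for field, prompt in prompts:
            digits = next(keys)
            print("COMPUTER:", prompt)
            print("CALLER keys in:", digits)
            print(f"COMPUTER: You entered {digits}.  Press 1 to confirm.")
            if next(keys) == "1":
                answers[field] = int(digits)   # stored in the database
        return answers

    print(touchtone_session(
        [("employees", "Enter total employees, then the pound key."),
         ("hours", "Enter total hours paid, then the pound key.")],
        ["42", "1", "1680", "1"]))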



 



      As an extension of this approach, techniques have been



 developed more recently allowing respondents to answer the



 questions by speaking directly into the telephone.  The incoming



 sounds are matched to known patterns recognizing the digits and the



 words "yes" and "no".  Voice Recognition Entry (VRE), as this is



 known, is not the distant future.  The Bureau of Labor Statistics



 is currently conducting live tests where this method is being



 warmly received by respondents as natural and convenient.



 



      Both TDE and VRE offer inexpensive data collection where the



 respondents initiate the calls, enter and verify the data.



 Refinements to procedures will now focus on minimizing nonresponse



 prompting activities.



 



      Respondent Burden:  For many respondents, the use of automated



 methods can actually reduce the collection burden placed on them.



 For example, use of Prepared Data Entry, where respondents interact



 with computer screens, provides a single set of step-by-step



 procedures with on-line editing to prevent inconsistent or



 incorrect reporting, thus reducing the need for expensive and



 troublesome recontacts.  Also, these methods have, in some cases,



 substantially reduced the time taken to provide complex data for



 large establishments.  Similar methods may be applied to other



 surveys covering large establishments where the one-time costs of



 data conversion to a standard format would be cost-effective,



 especially in repeated surveys.



 



       Quality:   Automated collection allows for improved control,
 yielding reduced error from several sources, including errors caused



 by the respondent, the interviewer, and post collection processes



 such as key entry error.  The instant status capabilities of CATI,



 for example, provide stronger intervention features for nonresponse



 prompting, reducing nonresponse error.



 



       In deciding which collection method to use, quality can become



 a relative concept that is affected by a tradeoff between cost and



 benefit.  The choice of a data collection method is usually based



 on a combination of performance and cost factors determining



 affordable quality.  For traditional collection methods, these



 factors and the decision-making process are fairly well known.



 Now, these new methods discussed in Working Paper 19 expand the



 array of potential collection tools and challenge the survey



 designer to reevaluate old cost/performance assumptions.



 



       Costs: The data collection process is composed of a few major



 activities, including transmitting and receiving the questionnaire,



 data entry, editing and nonresponse prompting.  The labor and



 nonlabor costs will vary depending on the method used.  For



 example, under mail collection virtually each action is conducted
 manually and postage is the dominant nonlabor cost.  By contrast,
 CATI operations can minimize postage costs and reduce many of the
 expensive mail handling operations.  However, CATI adds new costs
 in the form of telephone line charges and computers (including
 systems design and ongoing maintenance).  Self-response methods,



 such as TDE, VRE and PDE collection, reduce postage, the manual



 mail operations and the labor involved in CATI interview



 activities, but may still require edit reconciliation and



 nonresponse followup.
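
      These tradeoffs can be made concrete by totaling the cost
 components per response under each method.  The figures below are
 purely illustrative placeholders, not measured costs; only the
 structure of the comparison is the point.

    # Hypothetical per-response cost components, in dollars.
    COST_COMPONENTS = {
        "mail": {"postage": 0.75, "mail handling": 1.50, "key entry": 0.60},
        "CATI": {"telephone": 0.90, "interviewer": 2.00, "systems": 0.80},
        "TDE":  {"telephone": 0.30, "systems": 0.50, "edit followup": 0.40},
    }

    def per_response_cost(method):
        """Sum the labor and nonlabor components for one method."""
        return sum(COST_COMPONENTS[method].values())

    for method in COST_COMPONENTS:
        print(f"{method}: ${per_response_cost(method):.2f} per response")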



 



      Thus, the factors of production, and the composition of each of
 those inputs, vary greatly among the existing and newer techniques.



 Many factors can change in a short period.  Only a few years ago,



 automation costs were driven by the scarcity of mainframe hardware



 capacity.  Now, the costs of automation are driven by the labor
 involved in developing specialized systems.  Portable and desktop
 microcomputers were not widely available at the beginning of this
 decade.  Now, microcomputers are



 widely available, very inexpensive and extremely powerful.



 



      Old assumptions about costs need to be reevaluated.  Labor and



 postage costs have risen steadily in recent years, while capital



 costs, such as microcomputers and telephone services have been



 declining.



 



      The decision on which collection mode to use, or which



 combination, will depend on the particular survey application and



 the existing cost structure.  However, it is important to view such



 investments over the long-term as the relative costs of each of the



 inputs do not remain constant over time.  Survey managers should



 periodically review old assumptions in light of new technology and



 project operating costs over the reasonably foreseeable future when
 deciding whether to investigate new methods.



 



      Users: Automated data collection includes three major groups



 of people: the respondents, the interviewers and the designers and



 developers of the system and procedures for collection.  This



 report covers the essential factors involved in successfully



 addressing the requirements of each group.



 



      Respondents: The respondent must be considered the primary



 user of any survey vehicle, whether automated or not, and all



 aspects of the response environment must be developed with the



 respondent in mind.  The cooperation of the respondent is the



 single most critical factor in survey operations. Respondents must



 be treated with the greatest care.  We must consider our
 respondents as customers; after all, if our survey vehicle doesn't
 "sell", that is, if the questionnaire is not successful in getting an
 accurate response, we will have no input for the rest of our



 production process.



 



 






 



       Even one-time surveys must strive to leave the respondent with



 the feeling of contribution and importance, and most of all, a



 willingness to participate in other surveys in the future if called



 on. Thus, our primary job is to develop techniques which allow the



 respondent to complete the survey fully and accurately and



 with a minimum level of burden.



 



       The use of these collection methods, while bringing



 improvements in the quality of collected data, has entailed other



 challenges.  These automated collection methods are made possible



 through the close interaction of subject matter experts,



 statisticians, and computer scientists.  To effectively use these



 methods, each of these groups learned the basic tenets of the



 others.  This close relationship will only continue to grow, with



 advances in each field aiding advances in the others.



 



       Interviewers:  The second most important user is the



 interviewer.  The systems provided to assist in the interview



 process must be easy to use, must work infallibly and must actually



 provide improvements in his or her work environment.  Interviewers



 must feel that they are the most valuable feature in the interview,



 that the machine is merely a tool to expedite and simplify their



 work.  This is not always an easy task.



 



       Survey Practitioners: We are the third major group of users.



 The decisions made early in the development process will carry over



 into the ongoing use and maintenance of the system.



 



       Systems designers face difficult choices, such as building



 customized systems from scratch versus linking standardized "off



 the shelf" routines or commercial, packages.  The inevitable



 limitations would have to be traded off against reduced maintenance



 and lower start up costs.



 



       Automated collection methods can also improve data quality.



 All of the methods discussed could be designed to include on-line



 editing to prevent impossible and inconsistent entries.  Some of



 these methods, such as TDE and VRE, improve data quality by



 verifying recorded data with the respondent.



 



       These are potential improvements.  The final impact on quality
 lies in the up-front planning and execution.  This places
 responsibility for clearly defining and controlling the collection
 environment directly on the survey designer.



 



       Future:  The future application of these techniques is limited



 only by the creativity and initiative of program managers and



 planners.  The "case studies" serve to illustrate the options



 available, and will surely raise many more questions for further



 investigation.



 



 



 






 



     We hope that the discussion of technological advances



 generates discussion and stimulates creative new applications to



the whole range of governmental information collection activities.



 



     In addition to the methods described here there are other



 advances in technology which hold potential for vastly changing



data collection.  Integrated Services Digital Network (ISDN) is a



powerful network system which will provide simultaneous



transmission of sound, video and data.  The result could be a



 change in the way some surveys are conducted, offering all of the



benefits of personal interviewing with the lower costs of telephone



interviewing.



 



      You have heard several different collection methods



described and discussed which are currently available.  And you can



see that the pace of change will accelerate and match changes in



technology.  So what does the future hold?



 



     You have to ask yourself how your survey operations will be



conducted in 5 or perhaps 10 years.  In doing so, ask yourself how



things were done 5 or 10 years ago.  What sorts of things have



happened and what were their implications?



 



 



 



 



 



 



 



 






 



                 A COMPARISON BETWEEN  CATI AND CAPI



 



                             Martin Baum



               National Center for Health Statistics



 



  Introduction



 



      I will describe for you some of the critical factors one must



  consider when deciding whether to conduct a survey by either CATI



  or CAPI.  I also will try to indicate the similarities and



   differences between these two methods of survey data collection



  automation.



 



  Definition



 



      Let me first define each of the methods.  Computer Assisted



  Telephone Interviewing (CATI) is a computer assisted survey process



  which uses the telephone for voice communications between the



  interviewer and the respondent.  Computer Assisted Personal



  Interviewing (CAPI) is a personal interview usually conducted at



  the home or business of the respondent using a portable computer.



 



  Rationale



 



       The rationale for the development and for your use of these



   methods is based primarily on improved data quality and



  improved timeliness of data release.  Cost is a factor, but in our



  experience, it has been a break-even situation; the cost of



  automating has equaled the savings.  This result has been due



  primarily to the high cost of software development.



 



  Factors



 



      The following are critical factors that must be considered in



   addition to those of improved data quality, timeliness, and cost,



  when deciding whether to use CATI or CAPI for your survey data



  collection.  I will discuss each of these factors in some detail.



 



  Hardware CATI



 



       Initially CATI was developed as a mainframe application but



  as computer technology changed, CATI moved to the mini computer and



  then to a networked micro computer application.  The investment in



   hardware has steadily decreased without any loss of capability.
   Telephone technology, which affects telephone availability, is
   important to the CATI application: no phone, no respondent.



 






 



 Hardware CAPI



 



      The most important computer hardware criteria for a CAPI



 application are generally quite different from those that would be



 critical to most other applications.  The major reason is the role



 that environmental conditions play in the selection of CAPI



 hardware.  The fact that CAPI is a personal interview situation,



 usually taking place in or at the home of the respondent, dictates



 a number of possible circumstances under which the interview will



 be conducted.



 



      For example, screen visibility becomes a paramount criterion



 because of the environmental conditions.  Interviews will take



  place under all types of lighting conditions: outside in bright
  sunlight, twilight, and normal light, and inside under lamp light,
  fluorescent light, and bare bulb.



 



      Weight is especially critical because of the variety of



 environmental conditions.    Interviewers may be conducting the



 survey in an urban setting where the computer will be carried up



 and down the stairs of apartment houses; or in a suburban setting



 where the computer is carried many blocks; or in a rural setting



 where the computer is carried long distances from car to house.  In



 any of these conditions, the computer is moved in and out of a car



 many times.  This situation is further compounded by the fact that



  the interviewer must also carry considerable paper, e.g., back-up



 paper questionnaires in case the computer fails, letters of



 explanation, introduction, and thank you.  Carrying all of this



 weight in and out of cars and up and down steps all day is no easy



 job, particularly if the computer and back up battery weighs 10



 plus lbs. and the paper weighs an additional 5 lbs. or more.



 



     For a household type survey, the interviewers are generally



 reluctant to ask for the respondent's permission to use power for



 the computer because of fear of possibly losing the interview.



 Also, surveys frequently are conducted outside of the house where



  no power is available.  Many of our surveys can last as long as
  2-4 hours.  Consequently, battery life is critical.



 



     Environmental conditions often impact the ergonomics of the



 hardware.  Consider a survey interview conducted where the computer



 must be placed on the interviewer's lap.  This situation would be



  quite difficult if the computer were top heavy when open, or if
  the interviewer were small and the computer's depth long;
  balancing would be a problem.  Also consider the doorstep
  interview with a 10 lb. clamshell design computer.



 



 Software



 



     Now let's discuss the most costly factor in the CATI/CAPI



  decision: software.  There are four components to the CATI/CAPI
  software:  Questionnaire, Case Management, Output Reporting, and
  Authoring System.



 



      The questionnaire component refers to the software that places



 each question in the survey on the computer screen in the proper



 sequence with the appropriate information (i.e. prompts) and allows



 the entry of an answer or answers to the question with edits on



  those answers, such as range, specific values, and consistency with
  another question's answer.  This software should also contain
  on-screen help and, if necessary, rostering.
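
      A minimal sketch of such edits follows; the field names, limits,
 and rules are hypothetical, but they show the three kinds of edit
 just mentioned: range, specific values, and cross-question
 consistency.

    def check_answer(field, value, answers):
        """Apply range, specific-value, and consistency edits to one
        answer, given the answers already entered."""
        problems = []
        if field == "age" and not 0 <= value <= 120:
            problems.append("age out of range")            # range edit
        elif field == "employed" and value not in ("yes", "no"):
            problems.append("employed must be yes or no")  # specific values
        elif field == "hours_worked":
            if answers.get("employed") == "no" and value > 0:
                # consistency with another question's answer
                problems.append("hours reported for a non-worker")
        return problems

    answers = {"age": 34, "employed": "no"}
    print(check_answer("hours_worked", 40, answers))
    # ['hours reported for a non-worker']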



 



      The case management component is the software that allows the



  interviewer to keep track of the status of the survey interview:
  Is the interview complete?  If not, what has been completed, and
  what is the next question to be asked?  Is the interview a partial
  interview, or is it to be completed later?  What sections of the
  survey are mandatory?  In some instances it also tracks
  interviewer assignments.  In the case of



 CATI, case management software also would provide the sample



 selection and dialing of the phone number.
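
      The status information a case manager tracks can be as simple
 as a small record per case.  The sketch below is an invented,
 stripped-down illustration of the bookkeeping just described.

    from dataclasses import dataclass, field

    @dataclass
    class Case:
        """Minimal case record (all field names are hypothetical)."""
        case_id: str
        completed_sections: list = field(default_factory=list)
        next_question: str = "Q1"
        status: str = "not started"   # or "partial", "complete"

    def resume_point(case):
        """Where should the interviewer pick up an unfinished case?"""
        return None if case.status == "complete" else case.next_question

    case = Case("00123", completed_sections=["household roster"],
                next_question="Q17", status="partial")
    print(resume_point(case))   # Q17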



 



      The output reporting component is often either overlooked or



 given minimal consideration.  This is a big mistake.  Collection of



 the data is not very useful if the data cannot be easily accessed



 for analysis.  Output reports can be categorized as either survey



 questionnaire statistics or management statistics.  The level of



 detail and complexity can vary significantly.  Survey questionnaire



 reporting can be as little as the ability to place the data into



  a specific analysis software file format (e.g., SAS) or can include



 actual analyses.



 



      Management statistics can be extremely useful for the conduct



 of the survey data collection.  For example, data can be



 automatically collected on the time to complete a section of the



 questionnaire by interviewer.  This information could provide



 insights for training and/or question rewrite.
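
      Computing such a statistic from an audit trail is
 straightforward.  The sketch below averages completion times by
 interviewer and section; the records are invented.

    from collections import defaultdict
    from statistics import mean

    # Hypothetical audit-trail records:
    # (interviewer, questionnaire section, minutes to complete).
    timings = [
        ("A", "health", 12.0), ("A", "health", 15.5),
        ("B", "health", 27.0), ("B", "income", 9.0),
    ]

    by_cell = defaultdict(list)
    for interviewer, section, minutes in timings:
        by_cell[(interviewer, section)].append(minutes)

    # An unusually slow cell may point to a training need or to a
    # question that should be rewritten.
    for cell, minutes in sorted(by_cell.items()):
        print(cell, f"mean {mean(minutes):.1f} min")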



 



       The authoring system allows a non-computer programmer, e.g., a



 survey questionnaire designer, to create the questionnaire while



 simultaneously and automatically generating the questionnaire



 software component.  It has been our experience that this is the



 most difficult component to develop.  Although there are a number



 of such systems that are available, none of these systems has met



 all of our requirements for the type of complex survey we conduct



  (e.g., the NHIS).  The authoring system should be extremely user-friendly



 and be able to handle a large number of question types.



 



 



 



 



 



 






 



Data Transmission



 



     In the case of CATI, the data is automatically transmitted to



 a central point for either uploading to a larger computer or further
 processing, e.g., analysis.



 



     In the case of CAPI, the data collection is dispersed



generally over a wide geographic area.  The two primary methods for



data transmission have been mailed floppy disk or



telecommunications.  For data that is needed in one day or later,



floppy disk has been adequate.  Telecommunications, however, adds



 a new dimension: two-way communications.  Not only can data be



transmitted to a central point, but instructions for the



interviewers, for example, could be transmitted from the central



point to the field.  The major problem with the telecommunications



 method has been the inconsistent quality of the communication lines.



Cost can also be a barrier.



 



Interviewer Training



 



     The level and amount of training needed depend, to a large
 extent, on the user-friendliness of the software.  Our
 experience has shown that the type of training is different for
 a CATI or CAPI conducted survey than for a pencil and
 paper conducted survey.  In the paper and pencil survey,
 training is focused almost entirely on the content of the
 questionnaire, management of the questionnaire, and proper
 question sequencing.  It would not be unusual to have an
 accompanying instruction manual 3-4 inches thick that would have to
 be learned by each interviewer.  In the CATI or CAPI
 conducted survey, training includes both questionnaire content and
 the care and use of the computer.  The major focus is the
 computer, not the content, because the computer software can handle
 most of the problems the interviewer needs to worry about in the
 pencil and paper survey, such as probes, question
 sequencing, and completeness.



 



    There is one major difference between CATI and CAPI that



impacts on the training: the level of interviewer anxiety.  CATI is



conducted at a central location where supervision and help are



readily available.  CAPI, on the other hand, is conducted in the



field where no supervision or help is readily available.



Therefore, CAPI training must try to provide the interviewers with



sufficient confidence in the software and hardware to cope with



this lack of help.   One method that has proven effective is to



emphasize hands-on practice.  Interviewers are encouraged to take



home their computer and practice interviews with anyone they can



get prior to going into the field.  In addition, interviewers are



given their computer prior to the training so they can have some



 familiarity with them.  CAPI interviewers must be able to cope with
 problem occurrences.  Consequently, training must concentrate on
 such situations.



 



 



Future Technology



 



     Impending technological advances can have a profound impact on
 these automation methods, particularly CAPI.  Changes in hardware,
 such as an "etch-a-sketch" microcomputer and an inexpensive long-
 life, light-weight battery, would open new possibilities for the
 CAPI conducted survey.  Use of a light-weight computer, under 5
 lbs., with no keyboard and with light-pen handwritten entry, would
 allow doorstep surveys as well as reduce training efforts.  The
 "etch-a-sketch" computer has been introduced by one vendor, and
 several others are about to announce.  The long-life, light-weight,
 inexpensive battery, although not currently announced or available,
 will when available make possible much faster and larger
 light-weight computers, thus allowing larger and more complex
 surveys to be automated.



 



     The development of a generalized authoring system software



would open up the use of CATI and CAPI to the quick-turn-around



type survey.  Survey questionnaires could be designed and



implemented quickly and easily.  Staff productivity would also



increase significantly because computer programming efforts to



automate each survey questionnaire would be reduced to a minimum.



The survey designer, in effect, would be programming the survey



while designing the questionnaire.



 



 



 



 



 



 



 



 






 



                COMPUTER ASSISTED SELF INTERVIEWING



 



                           Ralph Gillmann



                 Energy Information Administration



 



     The phrase "computer assisted self interviewing" (CASI)



covers all survey methods in which respondents access computers.



These methods include "computerized self administered



questionnaires" (CSAQ) and "prepared data entry" (PDE) where the



respondent fills out a computerized version of the survey



instrument.  Also included are methods where the respondent uses a



telephone to access a computer: "touch tone data entry" (TDE) and



"voice recognition data entry" (VRE).



 



     Let's step back for a moment and look at different ways that



computers can be used in interviews:



 



[Graphic: diagram of the interactions among interviewer, respondent, and computers, as described below.]



 



 



 



 



 



 



     The top line represents direct interaction between an



interviewer and a respondent.  The left line represents the



interviewer accessing a computer such as in CATI and CAPI which



were previously discussed.  CASI methods are illustrated by the



lower right triangle.  The diagonal represents respondents



accessing an agency computer as in TDE and VRE.  The right line



represents respondents accessing their own computers as in PDE.



With the personal computer (PC) becoming ubiquitous, at least in



establishments, respondents usually have access to a computer.



 



     The bottom represents computer to computer interaction for



data transmission.  The missing diagonal would represent the



activities of hackers and spies.



 



 



 



 



 



 



 



 






 Next, let's compare manual and computer assisted methods:



 



[Graphic: comparison of manual and computer assisted collection methods.]



 



      Some methods are part manual and part computer assisted.  For



 instance, CATI and CAPI combine a personal interview with an



 electronic survey instrument.   One survey which uses all of the



 computer assisted methods is the Petroleum Electronic Data



 Reporting option (PEDRO) in use at the Energy Information



 Administration.  In general, the manual methods are slower and more



 prone to processing errors.  Labor and postage costs are also



 rising faster than the operational expenses of computer assisted



 methods.



 



      For transmission of the data to the collecting agency, paper



 copies can be sent via facsimile machines (fax).  This method is



 faster than the mail but doesn't eliminate the need to key in the



 data.  If the data are in electronic form, a diskette with the data



 can be mailed in.  This is useful if security and authenticity are



 a particular concern.  Transmission time may be saved by sending



 the data over the telephone network or using "electronic mail" over



 a computer network.  (Note that it's becoming harder to tell



 telephone and computer networks apart.)



 



      The use of an electronic mail service is feasible now and



 likely to be more important in the future.  This method allows a



 third party to handle the support for telephone lines, security,



 and temporary storage.  Respondents only need to have a terminal



 which operates over ordinary telephone lines if the survey



 instrument resides with the electronic mail service in the form of



 an electronic questionnaire.  Security can be provided by passwords



 and data encryption.  The survey agency can retrieve the data at



 its convenience.



 



      Finally, CASI offers several quality improvements:



 



      Increased timeliness of the data (especially important in



      monthly and weekly surveys)



 



      Fewer follow-up calls to respondents (because many, if



      not all, data edits can be done immediately)



 






 



    Reduced respondent burden (fewer persons are needed to



     fill out an electronic form)



 



     Lower costs (at least in cases where labor and postage



     make up a large part of the costs)



 



 



 



 



 



 



 



 






 



                COMPUTER ASSISTED SELF INTERVIEWING:



                    RIGS AND PEDRO, TWO EXAMPLES



 



                            Ann M. Ducca



                 Energy Information Administration



 



 



      I am going to talk about two systems that the Energy



 Information Administration has for reporting data using personal



 computers (PC's).  One system is a mail submission of a PC



 diskette, and the other uses telecommunications between the



 respondent's PC and our mainframe computer.



 



      The first example is the Reserves Information Gathering



 System, known as RIGS.  It is a system for reporting data on



 domestic oil and natural gas reserves on PC diskettes.  The data



 are collected by the EIA in its annual survey of oil and natural



 gas well operators.  Reporting to this survey is mandatory.



 



      Briefly, this survey is a stratified sample survey with the



 stratification being done according to the amount of production of



  oil and natural gas.  Respondents in the first stratum, representing



 the largest amounts of production and having the most data to



 report, are eligible to report using RIGS.  They will also continue



 to have the option of reporting on paper forms.  The EIA cannot



 require an electronic form of submission.  RIGS first became



 operational for the reporting of 1988 data.   We anticipate that



 25-30 percent of the 1989 reserves information will be reported



 using the RIGS system.



 



      The EIA sends PC diskettes containing the RIGS processing



 software by mail to respondents.  A user's guide is also provided.



 The respondents install RIGS onto their PC's and use it to enter



 data.



 



      The basic hardware requirement is an IBM compatible PC with at



 least 360K of random access memory, and two floppy disk drives or



 one floppy and one hard disk drive.  A printer should also be



 attached to the system so that a hard copy can be printed.  Version



 2.0 or higher of MS DOS is also required.  The IBM PC compatible



 computer was chosen because of its wide availability.



 



      The software for RIGS was originally written in dBASE III, a



 PC database management system.  dBASE III programs can only be



 executed using the dBASE III software, that is, stand-alone



 programs cannot be created.  Since the EIA did not want to purchase



 and provide the dBASE III software for every respondent, Clipper,



 a linkage compiler, was used to compile dBASE III into object code



 to make it a portable system.  The licensing agreement with Clipper



 permits run-time programs created by it to be operated outside the



 agency.  Thus, the respondents are provided with an executable load



  module, not programs.  Licensing agreements must be carefully
  reviewed before planning to use software products outside an
  agency.



 



      An advantage of a load module is that respondents cannot



directly or inadvertently change the programs.  Also, there is no



cost to the respondents since the RIGS software was developed by



the government.



 



      Using the RIGS software, the respondents enter data directly



on their PC.  The data entry screens for RIGS are formatted like



the data collection form.  There may be some benefits to exploring



other formats which take advantage of options available to



automated collection, such as question sequencing.



 



      There is also the option of sending an ASCII file to the RIGS



system so that data already available in an automated form at the



respondent site can be submitted without re-keying.  The RIGS



User's Guide gives the instructions and record layout requirements



for downloading ASCII files.
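
      Reading such a file reduces to slicing each record by column
 positions.  The layout below is invented for illustration; the real
 one is specified in the RIGS User's Guide.

    # (field name, start column, end column), 1-based and inclusive,
    # as fixed-width layouts were usually documented.
    LAYOUT = [("operator_id", 1, 8),
              ("field_code", 9, 14),
              ("reserves", 15, 24)]

    def parse_record(line):
        """Split one fixed-width ASCII record into named fields."""
        return {name: line[start - 1:end].strip()
                for name, start, end in LAYOUT}

    print(parse_record("OP001234FLD001    123456"))
    # {'operator_id': 'OP001234', 'field_code': 'FLD001',
    #  'reserves': '123456'}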



 



      Respondents are required to submit to us by mail a diskette



containing a copy of the cover page and the data. They must also



return a paper copy of the cover page with the signature of the



certifying official.



 



      Because the survey is an annual one, it was decided that



telecommunications with the EIA mainframe computer was not needed,



and that the mail submission would be sufficient.  Since the data



in the RIGS system are proprietary, it was also decided that



respondents would not be provided with their previous year's data



because of the risk of sending confidential data to the wrong



respondent.



 



      Preliminary edits such as range checks are performed as the



data are entered into the RIGS system.  If the system detects an



 incorrect entry, the bell sounds and a message appears across the
 top of the data entry screen.  The message will prompt the user for
 a response.  Help screens are available to assist the user, and



help is also available by telephone on a toll-free number.  For



data that have been downloaded into RIGS, an edit report is



produced afterwards.  A respondent may then use the RIGS edit



function to correct the errors.
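
      The interaction described, an entry failing a range edit, a
 warning message, and a prompt to re-enter, can be sketched as a
 small loop.  The question and bounds are hypothetical, and canned
 responses stand in for the keyboard so the fragment runs on its own.

    def prompt_until_valid(prompt, low, high, responses):
        """Re-prompt until an entry passes the range edit."""
        supplied = iter(responses)
        while True:
            entry = next(supplied)
            print(prompt, entry)
            try:
                value = float(entry)
            except ValueError:
                value = None
            if value is not None and low <= value <= high:
                return value
            print("\a*** Entry outside expected range; please re-enter ***")

    accepted = prompt_until_valid("Annual production (Mbbl):",
                                  0, 50000, ["999999", "4200"])
    print("accepted:", accepted)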



 



      Final edits, such as comparisons with the previous year's reports,



are made after the data are returned to the EIA.  These edits are



performed on our mainframe system.  When questionable data are



identified, a quality control analyst contacts the respondent by



telephone and changes are made by the EIA.



 



      Respondents also have the option to make notes in a footnote.



These notes may be helpful in explaining data that appear to be



questionable.



 






 



      The second example is the Petroleum Electronic Data Reporting



  Option (PEDRO).  It gathers monthly data for petroleum supplies



  from petroleum companies.  The respondents eligible to use PEDRO



  participate in 7 monthly surveys.  They include refineries, storage



  facilities,  pipelines, importers, and extraction facilities.



  Reporting to these surveys is also mandatory.  But again, the EIA



  cannot require an electronic form of submission.



 



      The participation in PEDRO varies among the 7 surveys.  The



  market share represented by reports to PEDRO ranges from 25 to 90



  percent of the total volume for a survey.



 



      The main difference between the PEDRO and RIGS systems is that



  PEDRO uses telecommunications to transmit data directly to the EIA



  mainframe computer.  PEDRO users need an IBM compatible PC with a



  hard disk and a floppy drive, and a modem.  As with the RIGS



  system, respondents are provided with an executable load module at



  no cost.  PEDRO also requires the Arbiter communications software



  which is licensed only for use with the EIA.  Arbiter was selected



  because it satisfied our security needs.  The EIA supplies the



  respondents with Arbiter.



 



      The basic methods of entering data to PEDRO are the same as



  those with RIGS -- keying on the PC or sending an ASCII file to the



  PEDRO system.  However, data submission in PEDRO is done by



  telecommunications directly to our mainframe, rather than by



  mailing diskettes.  Since these are proprietary data, PEDRO



  submissions are encrypted.  The transmissions are time-stamped to



  replicate a postmark.  The respondents must use passwords to



  transmit data, and the password, rather than a written signature,



  serves as the certification of the validity of the data.



 



       All edits in the PEDRO system appear on the respondent's PC.



  Since there is a direct link to our mainframe, all data needed for



  editing comparisons, for example prior month's data, are available



   on-line.  Preliminary edits are performed before respondents
   transmit any data.  Final edits are performed after the data are
   transmitted over the link to the EIA mainframe, and the results are
   transmitted back to the user.



 



       The EIA is very pleased with the RIGS and PEDRO reporting



  systems.   We believe that we are getting data faster and more



  accurately from these systems, and are encouraged by the increase



  in interest in using them.



 



 



 



 



  



 



 



 






 



 



 






 



 



                          DATA COLLECTION



 



                             Cathy Mazur



             National Agricultural Statistics Service



 



 



      In this session, I will first mention several factors to



 consider when deciding on a mode of data collection.  Then I will



 spend a few minutes comparing the modes of data collection that



 have been discussed.



 



      The primary factors in choosing a method of data collection



 for a given survey are (as previously mentioned) the available



 time frame, the desired quality, and the cost of resources.  It is



 unusual to have all three of these in abundance.  Therefore,



 tradeoffs must be considered.



 



      Several other factors to consider which relate to survey



 design and operation are whether the survey is mandatory or



 voluntary, whether a one-time or ongoing survey is to be



 implemented, whether households or businesses are sampled, whether



 the data will be collected in a centralized or decentralized



 manner, whether networking of computers will be done, the sample



 size, and the complexity of the questionnaire.



 



      The remaining factors to consider in automated data collection



 refer to the characteristics of the technology.  First is the speed



 of the hardware and data transmission over the phone lines.  Next



 is the size of the computer's memory, and the system's weight (as



 in CAPI).  Portability is a concern to data collection when



 different hardware and/or software is to be used (as in Prepared



 Data Entry (PDE)).  The type of display is important in some modes



 (as in CAPI).  The mode of data entry can be through the keyboard,



 a pushbutton phone, or using one's voice.  Data verification



 depends on the desire for quality, the complexity of the data, and



 other factors.  The database generation is also an important step



 (as was discussed by Martin Baum).  It refers to integrating the



 data with other survey processes (label generating, data



 summaries).  Hardware is selected based on cost, the amount of time



 available, the data quality desired, and the background of the



 staff that will operate the machines.  Lastly, training is



 important in any survey, the amount of which depends on the



 technology chosen.



 



      The priorities that are given to these factors and the



 relationships between them, help to decide which technology to use.



 All combine data collection with data entry, and most add editing



 at the time of data collection.  This reduces the time component



 and increases the quality component.  Also, mixed modes of data



 collection are possible in a survey.



 



 



 






 



      First, (as a means of comparison), a mail or manual survey



 would require a fairly long time to send out personal enumerators



 or to send and receive questionnaires through the mail.  The amount



 of editing is very limited, as data entry and editing are done after



 all the data is collected and the interview is completed.  The cost



 is fairly high if personal interviews are done, and nonresponse may



 also be high if questionnaires are mailed out.



 



      CATI is used because it collects data quickly and accurately.



 The cost component (which is fairly high), comes from the hardware,



 software, training, and support factors (such as phone charges).



 One cost component which is eliminated is the travel expense.  One



  suggestion is that CATI improves the cost-benefit tradeoff.  The respondent,



 however, must have a phone.  Other benefits are that it is useful



 in complex survey environments, can provide information on call



 scheduling successes/failures, and can be used for non-response



 follow up.



 



      CAPI also has fairly high costs, but it provides accurate data



 with a tendency for higher response rates (which may be a problem



  in CATI), and saves on the separate key-entry time.  The largest



 cost component is due to travel (with some in hardware and software



 support costs).  The weight, battery life, and screen visibility



 are important issues to CAPI.



 



      As to computer-assisted interviewing, three data collection modes



 are discussed -- Prepared Data Entry (PDE), Touchtone Data Entry



 (TDE) and Voice Recognition Entry (VRE).  PDE provides faster and



 more accurate data, for an average cost.  Costs are incurred in



 software development and support areas.  This mode requires that a
 PC be available to the respondent (usually the case for
 establishments), and two issues are data security and data
 integration (as different PC's are used).



 



      TDE allows respondents to call and answer questions posed by



 a computer using the keypad of their touchtone telephone.  VRE also



 allows respondents to call and answer questions posed by a



 computer, but the respondent answers by speaking directly into the



 telephone, and a computer system translates the incoming sounds



 into text.  TDE and VRE offer low-cost alternatives with a short
 data collection time, but editing is more limited.  In both,
 surveys tend to be shorter and simpler, non-response prompts are
 used, and respondent acceptance is a concern.  TDE requires access
 to a touchtone phone and service, whereas VRE can use any phone.  The



 Bureau of Labor Statistics collects data monthly for the Current



 Employment Statistics Program using mail, CATI, TDE, and VRE.  The



 VRE system recognizes any American English-speaking person with



 continuous speech of the numbers 0-9, yes, and no.
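
      To make the TDE dialog concrete, the following is a minimal
 sketch of the prompt-and-validate loop such a system might
 implement.  It is purely illustrative: the question wording, digit
 limits, and retry message are our own assumptions, not the BLS
 design, and typed input stands in for decoded touchtone signals.

     def read_digits(prompt, max_digits):
         # Simulate keypad entry; a production TDE system would
         # decode DTMF tones from the line, not read typed input.
         while True:
             raw = input(prompt + " (end with #): ").rstrip("#").strip()
             if raw.isdigit() and 0 < len(raw) <= max_digits:
                 return int(raw)
             # A non-response prompt, as noted above for TDE/VRE.
             print("Entry not understood; please re-enter.")

     # Hypothetical items loosely patterned on an employment report.
     QUESTIONS = [("Enter the number of employees", 6),
                  ("Enter total quarterly payroll in whole dollars", 9)]

     def run_tde_interview():
         return {q: read_digits(q, width) for q, width in QUESTIONS}

     if __name__ == "__main__":
         print(run_tde_interview())

 Note that the same loop, with speech recognition of the digits in
 place of keyed entry, describes the VRE flow.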



 



      These are not simple issues, and there are no clear-cut
 answers.  The definitions and importance of the factors must be
 agreed upon.  This comparison only represents the current state of
 technology; much will change with future development.



 



    Lastly, I hope this session has made you more aware of the



possibilities, the issues, and what to consider when choosing a



data collection method.



 



 



 



 



 



 



 



 






 



                           DISCUSSION



 



                          Robert N. Tinari



                     U. S. Bureau of the Census



 



 



      I want to begin my remarks today by noting that this paper is



 a very thorough treatment of the issues surrounding automated



 survey collection methodologies.



 



      I am impressed with the organization of the paper and the



 thoroughness of discussion of the many considerations that go into



 selecting, designing, and implementing these types of data



 collection systems.  The subcommittee is to be commended for the



 excellent job they have done in bringing together in one document



 a tremendous amount of information that I think will be extremely



 useful to those considering alternative data collection



 methodologies.



 



      Based on my experience as a program manager responsible for



 the initial development and implementation of CATI on the National



 Crime Survey, there are several issues raised in the paper that I



 believe need more emphasis.



 



      The first issue I want to discuss has to do with organization



 and its effect on CATI/CAPI development and implementation.



 



      In its conclusion, the committee notes that increased reliance



 on software development has important implications for hiring and



 training skilled survey designers.  It also states that previously



 distinct boundaries between occupational groups will continuously



 blur and disappear and survey design will likely be increasingly



 accomplished through teams of skilled workers from different



 occupations.



 



      Based upon my experience, I believe that this is an accurate



 assessment.  Obtaining the maximum benefit from these data



 collection methodologies requires that a fully integrated system be



 developed and this, in turn, requires the concerted effort and



 collaboration of programmers, survey design experts, statisticians,



 field staff, program managers, and survey sponsors.



 



      However, the level of cooperation and communication necessary



 to successfully design and implement CATI/CAPI may be very



 difficult to achieve in a large, hierarchical organization.  Staffs



 tend to be highly specialized and not experienced in projects



 requiring a multi-disciplined approach.



 



      From my own experience working on one of the first CATI



 applications at the Census Bureau, we had a very difficult time



 organizing the right team with the right experience necessary to get the



 project underway and in keeping the lines of communication



 






 



 open among the various divisions involved to implement it



  successfully.



 



      We learned a lot from that process and have come a long way.



  A recent example is a cooperative effort between the Economic Area



  and the Demographic Area in successfully developing and



  implementing a CATI system for the Survey of Manufacturing



  Technology.  The Industry Division was responsible for conducting



  the survey and wanted to use CATI for nonresponse followup of



  manufacturing plants.  The division lacked the experience to



  develop the questionnaire on CATI.  Demographic Surveys Division



  offered to help with the authoring, Industry assisted with testing



  and Field Division worked on interviewer training and data



  collection.  The survey was carried out on time, within budget, and



  with high quality.  This is a good example of what can be



  accomplished by individuals working together from the various



  divisions and sharing their expertise to get the job done.



 



       Poor organization and control can have a very serious impact



  on the cost and time of development and the quality of the final



  product.  I believe that what is needed to successfully design and



  implement automated data collection methodologies is:



 



  o    commitment and full support from upper-level management;

  o    a full-time, dedicated staff - no part-time work along
       with other projects;

  o    open lines of communication with clear assignment of
       responsibility/accountability;

  o    a designated project coordinator/facilitator;

  o    a breaking down of traditional barriers between survey
       statisticians, mathematicians, survey designers,
       programmers, and field staff in order to work
       effectively;

  o    ongoing commitment and organizational change to adapt to
       the needs of the new data collection methodology -
       especially important if you are using a mixed mode such
       as personal visit (paper) and centralized telephone
       (CATI);

  o    reduced layers of bureaucracy; and

  o    empowerment of the team to get the job done.



 



       We must think of new ways of organizing ourselves to be more



  flexible and effective in designing and implementing new



  technologies.  In addition, there must be more sharing of



 






 



 information among the various statistical agencies on approaches



  and experiences in the area of organization.



 



      The second issue has to do with interviewer acceptance of new



  technologies like CATI and CAPI.  The paper points out the



  importance of involving the user in the design process.  I do not



  think this point can be over-emphasized.



 



      In the rush to develop survey instruments on tight time



  schedules or in deciding which portable machines to use for CAPI



  applications, we the developers and/or program managers, take it



  upon ourselves to decide what is best for the interviewers and may



  not actively involve them in the decision or development process.



  This can be a big mistake.



 



      If the interviewers are not comfortable with the interface, if



  it is slow, clumsy or awkward to use, "not natural" feeling, not



  helpful, etc., the survey is in serious trouble.  If the



  interviewers have no say in the design and for any reason should



  decide that the system is not helping them to get the job done



  better, then you face an uphill struggle to gain their acceptance,



  and in some instances, the system may never be fully accepted.



  Interviewers may work to defeat the system, morale may suffer,



  respondent cooperation may suffer, turnover rates will increase,



  quality will suffer, and costs will escalate.



 



      In addition, if you are contemplating switching from a



 personal visit environment to CATI, you must consider the effect on



 the interviewer staff out in the field.  Field interviewers will be



 concerned about losing their jobs and quality may suffer during the



 transition to CATI.  How the Field interviewers will be treated and



 the possible impact on data quality during the transition period should



 most definitely be taken into account.  For example, in planning



 the transition of cases from personal visit to CATI for the



 National Crime Survey, we used attrition among interviewing staff
 and hard-to-enumerate areas for conversion to CATI.  By using this
 approach, CATI was viewed as a positive tool by Field staff.  This



 plan helped to gain acceptance of CATI.



 



      The third and final area I want to discuss has to do with the



 need for adequate testing and evaluation of these new



 methodologies.



 



      Before implementing any survey operation, it is good practice



 to allow enough time for adequate testing and evaluation of the



 instrument and the data collection and processing system.  This is



 especially crucial for automated data collection systems.  Complex



 questionnaires (those with complex branching or edits) need to be



 thoroughly tested and evaluated before they are introduced on a



 production basis.



 



 



 






 



      While the automated data collection systems provide us with



 the ability to field much more complex questionnaires than we could



 using conventional paper forms, they also pose additional



 challenges related to testing.  Aside from the obvious problems



 that may surface during interviewing, if the instrument is not



 adequately tested, there may be logic errors hidden in the



 instrument that go undetected or aren't found until after the data



 collection phase is complete.



 



      In addition, when changes are introduced to the questionnaire,



 (even minor ones), thorough testing should be conducted again to



 ensure that other questions or skip patterns have not been



 affected.
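
      One way to make such retesting systematic is to treat the
 instrument's skip logic as a directed graph and re-enumerate every
 path through it after each change.  The sketch below is our
 illustration only -- a made-up three-question instrument, not an
 actual authoring-system format -- and it assumes the skip logic
 contains no loops.  Unreachable questions, one of the hidden logic
 errors described above, are flagged automatically.

     # Skip logic as a graph: question -> {answer: next question}.
     SKIP_LOGIC = {
         "Q1": {"yes": "Q2", "no": "Q3"},
         "Q2": {"yes": "Q3", "no": "END"},
         "Q3": {"yes": "END", "no": "END"},
     }

     def all_paths(start="Q1"):
         # Depth-first enumeration of every route to END.
         stack, paths = [(start, [start])], []
         while stack:
             node, path = stack.pop()
             if node == "END":
                 paths.append(path)
                 continue
             for nxt in SKIP_LOGIC[node].values():
                 stack.append((nxt, path + [nxt]))
         return paths

     paths = all_paths()
     reached = {q for p in paths for q in p}
     print(len(paths), "paths; unreachable questions:",
           set(SKIP_LOGIC) - reached or "none")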



 



      In the paper, the committee discusses the possible application



 of expert systems in questionnaire development.  I would suggest



 that perhaps some application could be found for these systems in
 testing and evaluation as well.  There is definitely a need for



 more systematic and thorough methods for checking out the



 questionnaire.  In addition, attention must be paid to testing the



 case management, call scheduling, training, data transmission, and



 processing systems before the survey is fielded.



 



      This is not something that only needs to be done before a



 survey is fielded.  It should be an ongoing effort to evaluate how



 well the system is functioning.  It should allow for feedback for



 continuous improvement/refinement, such as monitoring,
 observation, and debriefing of interviewers/respondents.



 



      I want to thank the organizers for giving me the opportunity



 to share my views on this important topic.  I think the committee



 has made an important contribution by bringing together in one



 document many of the issues facing project managers in deciding



 whether or not to adopt these technologies.  I hope that the



 document will be treated as a dynamic one that will be expanded as



 we gain more experience with the various aspects of these data



 collection methodologies.



 



 



 



 



 



 



 



 






 



                              DISCUSSION



 



                           David Morganstein



                              Westat, Inc.



 



 



      I thank Terry Ireland for organizing this intriguing session



 and I would like to express my appreciation to the speakers for the



 work they have done in their examination of new methods for



 assisting in the process of conducting government surveys.  It is



 a pleasure to be given this opportunity to participate in the



 session as a discussant.



 



      The job description for a discussant might be:



 



 -    To agree with the speakers' comments,



 -    To point out errors or omissions,



 -    To suggest areas of new research, or



 -    To do something completely different that they'd like to



      do!



 



 I think I will try a little of all four of these objectives.



 



      There is a great need for new approaches to gaining



 cooperation as the respondent population is increasingly bombarded



 with requests for survey participation.  The initial 1990 Census



 experience indicates the level of difficulty surveyors can



 anticipate.



 



      According to our speakers, their "Primary job is to



 develop ... computer related techniques which allow the respondent



 to answer the survey completely and accurately".  The emphasis on



 the respondent's cooperation is very appropriate.  There is a



 potential trap of having the software developed by software experts



 who have little knowledge of or interest in the



 respondent/interviewer who must use the system.  At a minimum, a



 part of the system designer team should be practitioners of long



 standing who understand the process.  There may be good reason to



 have the leader of the team be such a practitioner.



 



      I was concerned by the following statement found in the paper,



 "Interviewers must believe that Computer Assistance will improve



 their effectiveness.  They need to be convinced that the computer



 is simply a tool to expedite and simplify their work."  This sounds



 a bit like psychological behavior modification.  Such verbal



 persuasion should be unnecessary.  In fact, the users WILL believe



 and be convinced IF the system actually DOES this!  You can be sure



 that no amount of argumentation will ensure the interviewers'
 support if the system is awkward, difficult to use, and makes their



 work harder.



 



 



 






 



     The focus of the paper was primarily on the technology.  It



 said little about comparison studies which measure the



 accuracy/reliability of CASIC responses as compared to more



 traditional methods.  For example, a 1984 paper by Waterton & Duffy



 in the International Statistical Review indicated self-reported



 alcohol consumption rates that were significantly higher when



 obtained via CASI than previously measured by interviewers.  Perhaps



 there have not been enough such studies; however, there is a need



 for them.



 



     The paper pointed out the importance of a good authoring



 system to CAPI but didn't say the same for CATI.  I believe it is



 true in that environment as well.



 



     Quality Measures (Human Interface discussion) are very



 important and are needed if we are to evaluate the efficacy of



 these new approaches.  The authors also mentioned an evaluation by



 'user' (interviewer), something I agree is important as it speaks



 to the committee's 'primary job' mentioned earlier.



 



      I found the Appendix 3 examples a useful reference for



 contacts.  The authors would perform a valuable service if they



 would include names and phone numbers for all contacts.



 



      These approaches conform to the modern concept of quality.



 Reduced variability is designed into the system.  They reduce the



 potential for 'creative interviewing' in which undesired variation



 is introduced by the interviewer during the interview process.



 



      While I have not worked with CASI, it would appear that it



 could suffer from a potential loss of control by the survey



 operator.  It could be subject to 'creative respondents' who are



 intrigued by technology or who seek to befuddle the survey



 operators.  Care must be taken to ensure that this does not occur.



 



      The survey instrument's logic/design still depends upon the



 human mind.  Techniques for encoding it into a CATI/CAPI/CACI



 system need to be better understood.  An unrealized advantage of



 these methods is that they force the designer to better understand



 the instrument/flow earlier in the process.  The designer can't



 rely upon last minute training/role plays with the interviewers to



 clarify muddy logic or instrument flow.



 



       I would like to close my comments on the value of these high-



 tech methods for assisting in survey operations with the following



 short essay on the beauty of the abacus written by Robert Fulghum.



 



       Essay taken from All I Really Need to Know I Learned in
 Kindergarten, by Robert Fulghum.



 



 



 



 






 



 






 



 



                            Session 7



                   QUALITY IN BUSINESS SURVEYS



 



 



 



 



 



 



 



 






 



                                



 






 



 IMPROVING ESTABLISHMENT SURVEYS AT THE BUREAU OF LABOR STATISTICS



 



                           Brian MacDonald



                            Alan R. Tupek



                  U. S. Bureau of Labor Statistics



 



 



 Introduction



 



     The report on "Quality in Establishment Surveys" (see



 Statistical Policy Working Paper 15, 1988) concluded that there



 were few commonly accepted approaches to the design, collection,



 estimation and analysis of establishment surveys.  In contrast to



 household surveys, there was little standardization of



 methodological approaches across establishment surveys.  The report



 classified potential sources of errors in establishment surveys and



 examined the range of practices which are used to improve and



 measure quality.



 



      Each Federal agency which collects statistical data from



 establishments develops its own frame of business establishments.



 These frames are of varying quality, which greatly affects the



 methodology for surveys and contributes to the divergence of



 methodology across establishment surveys.



 



      This paper first provides a summary of the design



 considerations for establishment surveys as discussed in



 Statistical Policy Working Paper 15.  This paper then describes the



 efforts at the Bureau of Labor Statistics (BLS) for improving its



 business establishment list, the effect of these improvements on



 BLS surveys, and the potential impact on other statistical



 agencies.



 



 Design Considerations for Establishment Surveys



 



      Establishment populations differ from household populations in



 several ways (see Statistical Policy Working Paper 15).  These



 dissimilarities result in frame development, sample design, and



 estimation approaches which are in some areas markedly different



 from approaches for household surveys.  Among the major



 distinctions between establishment and household populations and



 frames are:



 



    1. Establishments come from skewed populations wherein units



        do not contribute equally (or nearly equally) to



       characteristic totals, as is the case for households; and



 



 



    2. Accuracy of frame information about individual population



       units is crucial to sample design and estimation for



       establishment surveys, while for household surveys the



 






 



 



       accuracy of frame characteristics concerning individual



       units is not as critical to the sample design.



 



       Establishment surveys are characterized by the skewed nature



 of the establishment population (see, for example, Table 1).  A few



 large firms commonly dominate the estimates for most of the



 characteristics of interest.  This is especially true for



 characteristics tabulated within an industry.  Small firms may be



 numerous, but often have little impact on survey estimates of level



 although they may be more critical to estimates of change over time



 or for measuring characteristics related to new businesses.  This



 distribution has a major impact on both the frame development and



 maintenance and on the sample designs used for establishment



 surveys.



 



 



 [Table 1 graphic omitted.]



 



 



 



 



 SOURCE:    U.S. Bureau of Labor Statistics



 



 



       List frames are widely used in establishment surveys conducted



 by the Federal government.  The use of list frames for



 establishment surveys arose from the availability of administrative



 records on businesses compiled mainly for tax purposes.  However,



 because these administrative record files are not normally



 developed for statistical purposes, they often need refinement



 before being used as sampling frames for surveys of businesses.



 Extensive resources are spent on maintaining the list frames since



 a significant source of nonsampling error may be due to



 inadequacies in the frame.



 






 



 



     Establishment list frames typically are characterized by



 detailed establishment identification information, periodic



 updating of this information, and multiple sources for the



 information.  The data on the frame are required for sample design,



 sample selection, identification of sample units, and estimation.



 The primary source of administrative records for a frame may have



 shortcomings which require the identification information to be



 supplemented using other sources of information.  This may include



 using identification information from the surveys themselves.



 Supplemental files, including the use of area frames, may also be



 required to overcome coverage problems in the primary source.



 Duplication of sampling units may also be a problem associated with



 the use of list frames.  Refinement of the frame includes efforts



 to unduplicate units prior to sampling.
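
      As a rough illustration of what unduplication involves, the
 sketch below drops frame records that share a normalized
 name-and-address key.  The normalization rule and the sample
 records are our own invention; production list matching relies on
 much richer identifying information than this.

     import re

     def norm(text):
         # Crude key: uppercase, drop punctuation, collapse spaces.
         return " ".join(re.sub(r"[^A-Z0-9 ]", " ", text.upper()).split())

     def unduplicate(frame):
         seen, kept = set(), []
         for unit in frame:
             key = (norm(unit["name"]), norm(unit["address"]))
             if key not in seen:
                 seen.add(key)
                 kept.append(unit)
         return kept

     frame = [{"name": "Acme Mfg. Co.", "address": "12 Main St."},
              {"name": "ACME MFG CO",   "address": "12 Main St"}]
     print(unduplicate(frame))   # the second record is dropped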



 



      The individual establishment information on the frame is



 critical to the effectiveness of the sample design and estimation



 for the survey.  Maintaining a frame over time is complicated by



 the dynamic nature of the establishment community.  Changes in



 ownership, mergers, buyouts, and internal reorganizations make



 frame maintenance a real challenge.  Matching and maintaining unit



 integrity over time provides the opportunity for consistent unit



 identification in the numerous periodic surveys conducted by the



 Federal Government.



 



      New establishments must be added to the frame.  However, it is



 often difficult to differentiate, using administrative records, new



 establishments from formerly existing establishments that have



 changed their name or corporate identity.  It is also difficult to



 link businesses over time when there have been ownership or other



 changes.  Each survey may have different requirements as to



 handling of new establishments and changes in existing



 establishments.  The timeliness of adding new establishments to the



 frame and reflecting them in the sample is also a problem.  The lag



 time between formation of new establishments and selecting them



 into the sample may be anywhere from several months to several



 years.  While new establishments may have little impact on



 estimates of level, in some instances they may dominate estimates



 of change.



 



 



 The Business Establishment List Improvement Project



 



      In May 1987, the Economic Policy Council issued a report that



 noted five areas in national economic statistics where improvements



 were needed.  One of these areas dealt with the business lists used



 by the three major Federal statistical agencies to conduct their



 surveys.  One of their recommendations was that the Bureau of Labor



 Statistics and the National Agricultural Statistical Service of the



 Department of Agriculture be designated as the central Federal



  government agencies for the collection of nonagricultural and
  agricultural business identification information, respectively.



 






 



In addition, the Economic Policy Council recommended that efforts



 be initiated to revise the statutes that prohibit the sharing of



 survey data collected by the Census Bureau with other specified



 Federal statistical agencies.  The main purpose of the Economic



  Policy Council recommendations was to have a single, high-quality



 source of business data available to selected Federal statistical



 agencies in order to increase the quality and comparability of



 national economic statistics.



 



       Shortly thereafter, the Office of Management and Budget (OMB)



 requested that the BLS develop a proposal to assume this role.  The



 issue of devoting resources to developing a central frame is not



 unique to the fragmented U.S. statistical system.  Statistics



 Canada is in the process of developing a central frame for its



  business establishment surveys (see Colledge and Lussier 1987).



 



      For the BLS universe file to sufficiently serve as the primary



 frame for statistical survey sampling by Federal statistical



 agencies, the BLS recognized that modifications to its existing



 file were necessary.  The most critical need was to improve the



 information available about employers engaged in multiple



 operations within a State.  The Business Establishment List (BEL)



 Improvement Project was initiated to do this.  Its primary purpose



  is to create an establishment-based (i.e., worksite-based)
  register of units with full identification information on United
  States businesses.  At present, data for multi-worksite employers in the



 BLS register are available mostly at a higher level of aggregation.



 



      The data for the current BLS universe file come primarily from



 administrative records collected by State Employment Security



 Agencies (SESAs) as part of the administration of the Federal/State



 Unemployment Insurance (UI) System.



 



      All employers covered by unemployment insurance are required



 to file quarterly UI Contributions Reports with the SESAs for each



 of their UI accounts.  On these forms, employers report the number



  of full- and part-time workers employed during the pay period



 including the 12th of each month in the quarter and the total



 payroll for the quarter.  This reporting is mandatory for single



 location employers as well as those engaged in multiple operations



 in the State.



 



      Data collection and classification procedures for multi-unit



 employers differ from those for single units.  For multi-unit



 employers, the statistical branch of the SESA is responsible for



 the direct collection and review of monthly employment and



 quarterly wages at the reporting unit (county by industry) level of



 detail.  A multi-unit employer is defined as an employer who has



 more than one industrial activity (four-digit SIC) and/or county



  location covered by the same UI account and meets the following
  criteria.  To qualify as a multi-unit employer, the employer must



 have 50 or more employees in the sum of their secondary industries



 






 



or counties.  The primary industry or county is defined as the



 industry or county that has the greatest number of employees.
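
       Read as a decision rule, the definition above can be
  paraphrased in the following sketch.  The field names and the
  sample account are illustrative only; the operative details are
  those given in the text.

     from collections import Counter

     THRESHOLD = 50   # lowered to 10 under the BEL Improvement Project

     def secondary_employment(units, field):
         # Employment outside the primary (largest) industry or county.
         totals = Counter()
         for u in units:
             totals[u[field]] += u["employment"]
         return sum(totals.values()) - max(totals.values())

     def is_multi_unit(units):
         # units: activity/location records under a single UI account.
         if len({(u["sic"], u["county"]) for u in units}) <= 1:
             return False    # one industry and one county: single unit
         return max(secondary_employment(units, "sic"),
                    secondary_employment(units, "county")) >= THRESHOLD

     account = [{"sic": "2011", "county": "A", "employment": 120},
                {"sic": "2013", "county": "A", "employment": 60},
                {"sic": "2011", "county": "B", "employment": 40}]
     print(is_multi_unit(account))   # 60 secondary-SIC employees -> True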



 



     Under the BEL Improvement Project (see Searson and Pinkos



 1990), this threshold is being lowered from 50 employees to 10



 employees with the States being responsible for collecting



 employment, wage and identifying data at the worksite level.  Thus,



 more detailed business identification information will be available



 for small multi-establishment employers.



 



     Multi-unit employers that do not meet the above criteria are



 treated as if they were single-unit employers for data collection



 and recordkeeping purposes.  These small multi-unit employers who



 are engaged in multiple industrial activities within one county are



 assigned industry codes based on their primary activity (that is,



 the activity providing the most shipments or sales).  Conversely,



 those in one industry with several locations are given a county



 code based on the location employing a majority of all the



 employees.



 



      Large multi-unit employers are treated differently than single



 units as they are requested to file a quarterly statistical



 supplement form in addition to the Contributions Report.  On the



 SESAs' current forms, large multi-unit employers report monthly



 employment, quarterly wages, industry and location information for



 each reporting unit.  These supplements are used to maintain



 separate identification and characteristic records on the



 individual reporting units to ensure correct geographical and



 industrial totals are maintained.



 



      As part of the BEL Improvement Project, the BLS is replacing



 the 53 individually-designed State forms with a standardized



 statistical supplement form.  The name of the form is being changed



 to the Multiple Worksite Report.  Each quarter, the employer will



 be requested to verify the identifying information (trade name,



 description of the establishment, and physical location address)



 for each establishment (worksite) that will be computer printed on



 the new Multiple Worksite Report.  In addition, the employer will



 be requested to provide the monthly employment and total wages for



 each worksite for that quarter.  By using a standardized form, the



 reporting burden on many large employers, especially those engaged



 in multiple economic activities at various locations across



 numerous States, should be reduced.  States will accept listings



 and floppy diskettes of this information in lieu of the form.  In



 addition, the BLS is investigating the central collection of



 multiple worksite employers data from major multi-establishment



 employers.  The Multiple Worksite Report form will be used in all



 States to collect data by establishment (worksite) beginning with



 data for the first quarter of 1991.  Some twenty-one States,



  however, are switching to a State version of the new form with data



 collected for the first quarter of 1990.



 



 






 



       As a result of these efforts at worksite reporting, we expect



 the number of units on the frame to increase from approximately six



 million to slightly more than seven million.  Because the UI system



 still serves as the basis for the worksite based frame, both the



 scope as well as the data on employment and wages on the new frame



 will be identical to that on the old frame, only the level of



 disaggregation will be different.



 



 



 Implications of BEL on BLS Surveys



 



       Several features of the BEL Improvement Project will affect



 the design of BLS sample surveys (see Plewes 1989).  These include:



 



  -     reporting unit number for each worksite of multi-unit



        companies;



 



  -     better identification information, including multiple



        unit multiple addresses, worksite descriptions and



        telephone numbers;



 



  -     better linking of data over time through the use of



        reporting unit number for worksites within multi-unit UI



        account numbers.  Also, UI accounts will be linked



        through the use of predecessor and successor codes for



        ownership changes such as buyouts, mergers, etc;



 



  -     more data items for each unit, such as initial date of



        tax liability, date of establishing a new worksite, and



        comment codes for explaining unusual employment changes;



 



  -     quarterly data, historical files, and response history



        files to track the surveys for which a worksite has been



        selected and whether they have responded;



 



  -     linking of units within enterprises or corporations,



        across UI accounts; and



 



  -     improved standard industrial classification (SIC)



         refiling process, in order to identify new multi-worksite



        reporters in addition to updating SIC codes on a 3-year



        cycle.



 



       The effect of these BEL improvements on four areas of survey



  design will be examined.  These include sample frame development,



  sample design, data collection, and estimation.  Implications for



  the short-term, during the period in which the survey program will



  transition into the improved system, as well as the long-term will



  be discussed.  The transitional period implications are usually



  related to problems in maintaining consistency of survey estimates



  while BEL improvements are implemented.  The long-term implications



 



 






 



  are usually related to improvements that can be made to survey



   designs by reexamining survey design objectives.



 



         Over the years, each BLS survey has developed activities for



   creating their sampling frame from the old Universe Maintenance



   System, which BLS will change.  These unique activities for each



   survey focus on specific survey requirements as well as limitations



   of the list.  For example, BLS surveys which attempt to maximize



   sample overlap over time must match frame units from one time



   period to another.  The BEL improvements will affect the matching



   operation, due to the shift to worksite reporting.  During the



   transition period, the surveys may need to reexamine the need to



   maximize sample overlap.  If they maintain this objective, then



   less sample overlap is likely, and much of the operation will need



   to be done manually.  However, in the long-term the use of



   reporting unit numbers, and predecessor and successor codes should



   greatly facilitate the automated matching operation.  Other BLS



   surveys use supplemental frames to survey populations not entirely



   covered by the BEL.  These populations may include railroads;



   federal, state and local government; religious organizations; and



   seasonal industries.  BEL improvements will allow many surveys to



   reexamine the need for supplemental frames, especially for state



   and local governments, and seasonal industries.



 



         Several other long-term benefits for sample frame development



   are possible through BEL improvements.  The availability of



   quarterly data can be used by some surveys for creating their



   sample frame.  The identification of new businesses on the BEL can



   be used as a stratification variable for surveys.



 



         Although BLS does not now do so, the new list will enable



   survey operators to conduct surveys of enterprises or companies.



   This will bring about reconsideration of the scope of the surveys.



   All surveys will need to modify their control file systems to



   handle additional data items on the BEL.



 



         At this stage of the planning process, certain obvious changes



   have been identified for each survey.  The following three examples



   illustrate the types of operational modifications which are



   planned.



 



         First, the survey which is used to develop the Producer



          Price Index (PPI) must use first quarter data for measures



         of size.  The BEL improvements will allow PPI to use more



         current quarterly data, or other quarters for seasonal



         industries.  This is expected to improve the coverage of



         some industries, and to increase the sample design



         efficiency.



 



         Second, an annual survey which measures occupational



          injuries and illnesses supplements the BEL with a frame



         of the 500 largest companies in the United States,



 






 



   including all of their subsidiaries.   Currently, this



    supplemental frame is developed specifically for this



    survey.  The BEL improvement plan will provide adequate



    organizational relationships for large companies, so



    that the separate operation will be terminated.



 



    Third, a monthly survey of employers, which measures



    employment and average hourly earnings, lags in measuring



     the effect of new businesses.  A sampling strategy is



    being developed for this survey, which will bring in a



    sample of new businesses each month, once the BEL



    improvements are introduced.



 



       Greater flexibility in sample designs will be possible with



    the introduction of BEL improvements.  Separate strata for seasonal



    or volatile firms can be considered.  Stratification by age of firm



    may be appropriate for some surveys.  Surveys designed to produce



    local area estimates can use worksite locations for stratification.



    Surveys may want to stratify by multi-reporters versus single



    reporters, or by enterprise size.  The survey response history can



    be used to avoid overlap between surveys and to spread respondent



    burden.



 



       During the transition period for BEL improvements, there will



 be some loss in sample design efficiency.  The use of current data



 to develop sample designs for surveys conducted during the



 transition period will be somewhat inappropriate.  In the long-



 term, sample design efficiencies will be possible through the use



 of new design variables and more homogeneity within size classes.



 



       Surveys with size cutoffs will need to reevaluate the survey



 scope or target population.  Some BLS surveys cover only large



 establishments.  For example, most of the occupational wage surveys



  cover only establishments with 50 or more employees.  The BEL



 improvements will shift units between size classes.  In general,



 the sampling unit will shift from a county-wide report to a



 worksite report.  Maintaining a 50 or more employee size cutoff



 will artificially move units in or out-of-scope of the survey and



 decrease employment coverage.  The effect on wage estimates will



 need to be examined, and decisions made on how to maintain



 consistency over time.



 



       Surveys designed to measure change can use the linking of data



 over time to improve on the efficiency of the sample design through



 sample overlap.  Samples for surveys conducted three or more years



 apart are now independently selected.  With historical relations



 maintained over time, samples could be selected which improve upon



 estimates of change, possibly using composite estimation.
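
       Composite estimation is not developed further here; for
  reference, one simple form (notation ours) blends the current
  direct estimate with the previous composite carried forward by the
  change measured on the overlapping sample:

     \[
        \hat{Y}_t^{\,C} \;=\; (1-\alpha)\,\hat{Y}_t
        \;+\; \alpha\bigl(\hat{Y}_{t-1}^{\,C} + \hat{\Delta}_t\bigr),
        \qquad 0 \le \alpha \le 1,
     \]

  where \hat{Y}_t is the direct estimate for period t,
  \hat{\Delta}_t is the change estimated from the units sampled in
  both periods, and the weight \alpha controls how much the overlap
  is exploited.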



 



       The new features of BEL will be most beneficial during the



 data collection phase.  Because of better address information,



 especially physical location addresses and telephone numbers,



 






 



  response rates are expected to increase for mail and telephone



   surveys, since one of the primary reasons for low response rates is



   failure to reach the correct respondent.  Additionally, better



   address information will result in a decrease in data collection



   time and effort, such as reduction in telephone and mail follow-up



   of nonrespondents.



 



        The breakdown of the multi-establishment companies that



   presently report on a consolidated basis (e.g., county-wide) into



   establishment or worksite level reporting will affect all BLS



   surveys.  Surveys will need to make special reporting arrangements



   with these companies to provide data on a worksite basis.  Recent



   cognitive research conducted by Statistics Canada shows that



   respondents who are in the survey on a regular basis report data in



   the same manner from one time period to another and usually do not



   take into account changes to the survey instrument or procedures.



   The worksite information should reduce the reporting error due to



   failure to identify the selected sample unit.



 



        The impact of BEL during the estimation process for BLS



   surveys will vary significantly by survey type and estimation



   procedures used.  An area of survey estimation that will be



   affected by BEL is benchmarking.  Benchmarking is a process that



   accounts for changes that occur during the time lapse between the



   reference date of the sampling frame and the date of data



   collection.  In other words, it accounts for births, or those units



   which have come into existence since the sampling frame was



   created.  This is accomplished by multiplying the sample estimates



   of totals by the benchmark factor at the estimating cell level,



   usually SIC or size class within an SIC.  For BLS surveys, the



   benchmark factor is calculated at the estimating cell level as the



    ratio of the reference period employment (benchmark employment) to



   the weighted employment from the sample.
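
       In symbols (notation ours, following the description above),
  for estimating cell c the benchmark factor and the benchmarked
  total are

     \[
        F_c \;=\; \frac{E_c}{\sum_{i \in s_c} w_i\, e_i},
        \qquad
        \hat{Y}_c^{\,\mathrm{bmk}} \;=\; F_c\,\hat{Y}_c,
     \]

  where E_c is the reference-period (benchmark) employment in cell
  c, w_i and e_i are the sample weight and reported employment of
  unit i in the cell's sample s_c, and \hat{Y}_c is the sample
  estimate of the total being benchmarked.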



 



        Surveys that benchmark at the size class level would be most



   affected because of the change in the distribution of units across



   size classes due to worksite level reporting.  For example, size



   class benchmarks for a survey that measures occupational employment



   statistics (OES) by industry may be inappropriate during the



   transition period.  A possible solution for all surveys which



   benchmark by size class is to benchmark at the industry level



   during the transition period.



 



        With the new business registry, population data for



   benchmarking employment will be available for all 12 months.  This



   additional information may be utilized by the Current Employment



   Statistics (CES) Survey, which is a monthly survey of about 300,000



   establishments that measures employment at National and State



   levels by industry, to benchmark the employment data quarterly and



   thereby better analyze the components of error by time period.



 



 



 






 



  Central Agency Status



 



       When the OMB issues the directive naming BLS as the central



  agency charged with maintaining a list for nonagricultural



  businesses, several actions will have to be undertaken before



  extracts from the BLS list can be made available to other Federal



  statistical agencies for use in surveys.



 



       First, BLS will have to conduct a series of negotiations with



  the State Employment Security Agencies to gain their agreement to



  waive or modify existing  State confidentiality rules and



  regulations that would currently not allow widespread use of the



  state provided UI data.  We expect that most SESAs will readily



  welcome the sharing for statistical purposes of these data.  There



  have recently been examples where most, if not all, State agencies



  authorized this type of data sharing, but on a much more limited



  basis.  In those few States where current State law might prohibit



  the sharing with other Federal statistical agencies, we will



  propose modifications to the State Unemployment Insurance laws to



  allow the sharing and work with the state agencies to seek passage



   of the needed legislation.



 



       Similarly, there will have to be certain actions taken both by



  BLS and those Federal statistical agencies authorized by OMB to



  have access to the BLS list before the sharing can begin.  BLS will



  have to develop formal procedures for use of the file by other



  agencies.  These procedures will include such obvious items as



  security measures for the data, assurances that the confidential



  data will be used for statistical purposes only, agreements on



  feeding back 'corrections' or updates to the file, access rules and



  techniques (the BLS list is maintained at the NIH computer



   facility) and arrangements made to cover marginal operating costs



  for providing the data.  A possible solution to the question of



  providing for satisfactory computer security may be for the using



  agency to have conducted an application security review for its own



  sensitive Automated Information System in compliance with the



  requirements of OMB circular A-130.



 



  Summary



 



      A central agency charged with maintaining a list of



  nonagricultural businesses provides an opportunity for improving



  business establishment surveys conducted by the Federal Government.



  However, the key to its success will rest with the ability of all



  the agencies involved to provide clear and concise requirements to



  the central agency, and to weigh the costs of improvements to the



  central list against the benefits to survey operations and data



  quality.



 



 



 



 






 



References



 



Colledge, M. and Lussier, R. (1987), "A Generalized Methodology for



Economic Surveys" in Proceedings of the Business and Economic



Section of the American Statistical Association Annual Meetings,



pp. 131-149.



 



Plewes, T. (1989), "Improving the Business Establishment List:



Survey Design Implications" in Proceedings of the Fourth



International Roundtable on Business Survey Frames, Newport, Gwent,



United Kingdom: Available through the U.S. Department of Labor,



Bureau of Labor Statistics, in press.



 



Searson, M., and Pinkos, J. (1990), "The Bureau of Labor



Statistics' Business Establishment List Improvement Project" in



Proceedings of the Sixth Annual Research Conference, Washington,



D.C.: U.S. Department of Commerce, Bureau of the Census, in press.



 



Statistical Policy Working Paper 15 (1988), "Quality in



Establishment Surveys", U.S. Office of Management and Budget.



 



 



 



 



 



 



 



 






 



            A REVIEW OF NONSAMPLING ERRORS IN FEDERAL



      ESTABLISHMENT SURVEYS WITH SOME AGRIBUSINESS EXAMPLES



 



                             Ron Fecso



             National Agricultural Statistics Service



 



 



      Working Paper 15 (WP-15), "Quality in Establishment Surveys,"



addresses the accuracy of establishment surveys.  Although WP-15



concentrates on accuracy, we need to recognize that accuracy is



only a part of the total quality picture.  Remember the importance



of other aspects of quality and their interaction with accuracy



concepts.  The definition of survey quality is the totality of



features and characteristics of a survey that bears upon its



ability to satisfy a given need.  Sometimes these ideas are



referred to as "fitness for use." Discussions of quality usually



address how well something is made.  We must also address the true



needs of the product or service as well as productivity issues such



as increased output and unit cost.  Continued pressure on budgets



and demands for increased statistical output are quality aspects



which may be occupying major portions of our time.  Thus, a model



for survey quality needs four elements: accuracy, timeliness,



relevance and resources.



 



     The intent of this paper is to provide a glimpse of the



nonsampling error treatment from WP-15 and several examples of the



treatment of nonsampling errors in agricultural surveys.  I hope



 that I can persuade the audience to study Working Paper 15 in more



detail after seeing this commercial.



 



     Many sources of error are possible in establishment surveys.



While there are several good ways to organize the presentation of



these errors, WP-15 chose two main groupings:  design and



estimation, and methods and operations.  The latter group contains



the nonsampling errors which are highlighted here.



 



 



Nonsampling Errors



 



     Errors which arise during the specifications for and the



conduct of establishment surveys are called nonsampling errors.



Commonly known examples of nonsampling errors include incomplete



sampling frames, nonresponse and keypunching errors.  The variety



of nonsampling error sources and results from studies of these



sources lead survey researchers to believe that nonsampling errors



may often far exceed sampling error.  There are three objectives



found in the chapter on nonsampling errors in WP-15.  The



objectives are to outline major categories of nonsampling errors in



establishment surveys, to identify some of the diverse sources of



error in each category, and to provide insight into strategies to



detect, measure, and control these errors.  The error categories



 



 






 



   discussed are specification, coverage, response, nonresponse, and



    processing errors.



 



         WP-15 defines each of these error groups, gives examples,



    identifies major sources of the error, describes methods to control



    and measure the errors, and profiles the control and measurement



    techniques used in the major establishment surveys of the Federal



    Government (9 agencies and 55 surveys).  (The presentation



    contained some detail about response error treatment and examples



    of WP-15's graphics since most of the audience had not seen WP-15.



    These materials are not reproduced here.)



 



         Although several good references are available concerning



    nonsampling errors in surveys of individuals (for example United



    Nations, 1982), WP-15 is the first detailed treatment for Federal



    establishment surveys.  The need for this separate treatment arises



    because establishment surveys differ from surveys of individuals by



    typically seeking hard data for which records are available.  This



    characteristic both simplifies the collection and complicates the



    interpretation of the data.  The collection is simplified when hard



    data on record can be used, rather than relying on the memory,



    opinions, or interpretations of the respondents.  These differences



    present complications when establishing the concepts and



    definitions to be used in the surveys.  Special care must be taken



    to consider carefully the establishments' recordkeeping systems,



    definitions, and data availability to avoid introducing



    specification error into the data.



 



         Establishment surveys, which commonly use list frames, are



    subject to errors such as duplication, overcoverage of out-of-scope



     and out-of-business units, undercoverage of business births, and



    misclassification of units.  The availability of records affects



    the structure of the response and nonresponse errors as well as the



    methods to measure and control them.  The treatment of processing



    errors differs the least from other types of surveys.



 



 



    SOME HIGHLIGHTS OF WP-15



 



         WP-15, unfortunately, makes no specific recommendations.  Yet,



    the profile of nonsampling error practices used in 55 Federal



    establishment surveys by nine agencies provides considerable



    insight into the state of quality in these surveys.   This



    commercial for the paper will present a few of the highlights.



 



     o    No single measurement of specification error is used in
          a large majority of the surveys profiled.

     o    Relatively little is done to measure specification error.

     o    Few direct measures of list coverage error were reported
          as regularly used.

     o    Outside of the calculation of edit failure rates, little
          response error measurement is done.

     o    Although follow-up procedures for large units are common,
          very little is done to directly measure nonresponse
          error.

     o    Cognitive studies are rare.

     o    Questionnaire pretesting was not widely used on a regular
          basis.

     o    Relatively few nonsampling error measurements are
          published.

     o    There is relatively little information about processing
          errors.



 



         WP-15 contains considerably more detail on good practices



    which are currently in use as well as those practices which are



     lacking in use and need examination.  WP-15 states in an overview



    that "Nevertheless, the tenor of the findings can be depicted as



    recommending more work to improve and document the quality of



    surveys... a need to focus additional attention, and resources, on



    the general improvement and documentation of survey practices."



 



 



    A Reinterview Study from Agribusiness



 



         An example of measuring response error in an establishment
     survey is next.  The results presented are from a reinterview study



    which measured the bias of Computer Assisted Telephone Interviewing



    (CATI) methods on a National Agricultural Statistics Service (NASS)



     survey (Fecso and Pafford).  As part of its estimating program, the



    NASS publishes quarterly estimates of crop acreage, intentions to



    plant, actual plantings, harvested acreage, stocks of grains, and



    livestock numbers.  The source of these estimates is a multi-



    purpose, multi-frame survey.



 



        Because of the detailed nature of acreage, stocks and



    livestock inventory items, the NASS had relied primarily on



    personal interviews to get the most accurate answers from the farm



    population.  For example, on-farm grain stocks data, extremely



     important because of their effect on commodity trading, are a



    collection problem because farmers may store these grains in



     multiple bins on property they own and/or rent.  In addition,



    farmers often have multiple operating arrangements involving their



    own grains, those of landlords, and those where formal and informal



    partnerships exist.



 



        Recently, NASS has expanded the use of telephoning, including



    CATI to collect these data.  The primary reasons for change are



 






 



   inadequate budget and the need to reduce the time between initial 



    data collection and publication.  We suspected difficulty in using



    the telephone to collect some of these quarterly survey data.



    Obtaining accurate responses is difficult because of the detailed



     nature of these data and because the centralized (State) telephone
     interviewers often lack farm experience and familiarity with



    farm terms.  The reinterview study is our first attempt to measure



    response errors.



 



         You can find the use of reinterview methods in the literature



    for measurement of simple response variance (Bailar, 1968;



    O'Muircheartaigh, 1986) and correlated response variance (Groves



    and Magilavy, 1986), for example.  This response error study



    focused on measurement of the bias by treating the final reconciled



    response between the CATI and independent personal reinterview



    response as the "truth."  To obtain truth measures, experienced



    supervisory field enumerators reinterviewed approximately 1,000



    farm operations for the December 1986 Agricultural Survey.  The



    following tables contain the results for the grain stocks items



    (corn and soybean stocks).
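
         In the notation we adopt here (the study itself reports only
     the results), the bias of an item is estimated over the n
     reinterviewed operations as

     \[
        \hat{B} \;=\; \frac{1}{n} \sum_{i=1}^{n}
        \bigl( y_i^{\mathrm{CATI}} - y_i^{\mathrm{rec}} \bigr),
     \]

     where y_i^{rec} is the final reconciled ("true") response; a
     significantly negative \hat{B} indicates that the CATI mode
     underestimates the item.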



 



        Table I indicates that the difference in the CATI and final



    reconciled responses, "the bias," was significant for all but one



    item (soybean stocks in Indiana).  The direction of the bias



    indicates that the CATI data collection mode tends to underestimate



    stocks of corn and soybeans.



 



        The process of reconciliation identified the reasons for



    differences.  A summary given in Table 2 indicates that the



    largest share of differences (41.1%) could be related to



    definitional problems (bias-related discrepancies), and not those



    of simple response variance (random fluctuation).   Definitional



    discrepancies contributed almost half of the large bias.    About



    two-thirds of the definitional discrepancies had a relative



    difference (the reconciled response minus the CATI response divided



    by the CATI response) more than 25% or less than -25%.    In



    contrast, the differences due to rounding and estimating



    contributed less than 10% of the overall bias.  Almost all of the



    rounding and estimating relative differences were between -25% and



    25%.
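
     In symbols, the relative difference used here for unit i is

         RD_i = ( y_i^{rec} - y_i^{CATI} ) / y_i^{CATI} ,

     the reconciled response minus the CATI response, divided by the
     CATI response.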



 



 



 



 



 



 



 






 



 



 



 



 



[Graphic not reproduced in this text version.]



 



 TABLE I.  Estimates of Bias in CATI Collected Responses



 



 



 



 * Indicates the CATI and final reconciled response were



 significantly different at α = .05.



 



      These results suggest that we can reduce the bias in the



 survey estimates generated from the CATI telephone sample using a



 revised questionnaire design, improved training, or a shift in mode



 of data collection back to personal interviews.  Considering the



 constraints of time and budget, the change to additional personal



 interviews is unlikely.  Thus, the alternative is to use



 reinterview techniques to monitor this bias over time to determine



 whether the bias has been reduced through improvement in



 questionnaires or training.  If large discrepancies continue, the



 estimates for grain stocks can be adjusted for bias through a



 continuing reinterview program.  If the bias stabilizes, even at



 zero, periodic reinterview studies can validate a "constant" bias



 adjustment used in interim periods.
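
      The monitoring logic just described can be sketched as follows;
 the yearly figures and the stability rule are hypothetical
 illustrations, not NASS's actual decision procedure.

      # Sketch of the bias-monitoring logic described above.  The yearly
      # reinterview results and the stability rule are hypothetical.
      import math

      # (year, estimated bias, standard error) -- illustrative values only
      history = [(1987, -1.5, 0.5), (1988, -1.2, 0.6), (1989, -1.4, 0.5)]

      def pooled_bias(results):
          """Inverse-variance weighted mean of the yearly bias estimates."""
          weights = [1.0 / se ** 2 for _, _, se in results]
          mean = sum(w * b for w, (_, b, _) in zip(weights, results)) / sum(weights)
          return mean, math.sqrt(1.0 / sum(weights))

      def is_stable(results, z=1.96):
          """Crude check: every yearly estimate within z SEs of the pooled value."""
          mean, _ = pooled_bias(results)
          return all(abs(b - mean) <= z * se for _, b, se in results)

      bias, se = pooled_bias(history)
      if is_stable(history):
          # Bias appears constant: validate a fixed adjustment with
          # periodic reinterview studies in interim periods.
          print(f"apply constant adjustment {bias:.2f} (SE {se:.2f})")
      else:
          # Large discrepancies continue: adjust each period from the
          # continuing reinterview program instead.
          print("bias unstable; adjust from current reinterview data")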



 



 An Example -- Bias Measurement



 



      NASS conducts crop yield surveys in states which are major



 producers of field crops.  The survey data are used to forecast



 expected yield and production during the growing season and to



 estimate these values at harvest.



 



      Briefly, the survey design can be described as a multiple step



 sampling procedure.  Samples are drawn from an area frame to



 estimate acreage for harvest, followed by subsampling of fields and



 small plots to make measurements related to yield per acre.



 Detailed information on the area frame design is available in



 Fecso, Tortora and Vogel.  More detail on the crop yield surveys,



 called objective yield (OY) surveys, is in Matthews (1985), Reiser,



 Fecso and Taylor (1987), and Francisco, Fuller and Fecso (1987).



 



 






 



 



 






 



[Graphic not reproduced in this text version.]



 



 



       Several control procedures existed for the OY surveys.



  Supervisory enumerators visited the plots (approximately a 10



  percent subsample which included the first sample visited by each



  enumerator).  The field office survey statistician occasionally



  visited plots. Data are hand and computer edited.  Finally,



  periodic validation surveys, covering a subset of crops and states



  in a given year, were conducted to measure the overall bias of the



  survey estimate in the domain studied.



 



       These control procedures had shortcomings.  For example,



  visits by the supervisory enumerator served mostly as a retraining



  system; the data were not used to improve the estimates or to



  estimate biases.  Budget and staff reductions reduced the number of



  field visits by survey managers.  Edits have been changing.  New



  computer edits, and individualized recording forms created in some



  areas, have resulted in estimates which may differ from those based



  on the old editing procedures.  Finally, the expensive and



  administratively burdensome validation survey came under increased



  questioning.



 



       The validation survey had one major goal -- to measure the



  differences between the objective yield crop cutting and the



  farmer's harvest.  The validation surveys had clearly shown that



  the difference between the OY crop cutting and farmer's harvest is



  not equal to zero.  These studies found differences by crop, year,



  and state.  Since the validation surveys have answered the major



  question for which they were designed, we asked what purpose they



  would serve in the future.



 



       Our main consideration remained the assessment of the bias.



  Several concepts needed attention.  Was the overall bias consistent



  over the years?  Our data form a time series, especially when



  considered by the users; thus, knowledge of bias-induced level



  change is important.  Are the sources of bias changing?  Are there



  large enough bias changes to deserve extra concern?  Are there any



  needs for procedural changes to reduce specific bias sources, or do



  we only need to monitor the overall level of bias?  Finally, if we



  use overall bias measures to adjust survey values, are the biases



  within a specified tolerance?



 



       NASS currently conducts a redesigned validation survey for



  soybean OY.  This survey is done in all states in the OY sample



  program.  This design removed some unpopular aspects of the old



  validation surveys, including the concentration of work in one or



  two states and the variable workload resulting from changing states



  each year.  Our goal was to verify the approximate 6% bias



  adjustment suggested by the historic series of studies.  The



  current approach differs from prior studies.  We now combine sources



  of error rather than trying to measure specific components.  Thus,



  the results provide a basis for adjusting the survey for the many



 






 



  small sources of error found in prior studies.  These errors



   included: incorrectly measured row widths, field counts differing



   from lab counts, time lag bias due to the enumeration differing by



   several days from actual harvest, new planting patterns causing



   enumeration and imputation difficulties, enumerator fatigue errors,



   and plot location biases.



 



   The rationale for the redesign begins with our estimator of



   state yield, the mean of the sample field yields, which is



   basically unbiased, except that we do not have the true field



   yield, Y, but a sampled value, y.  This estimate can be modeled as



   follows:



 



[Graphic not reproduced in this text version.]
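
     The model itself survives only in the graphic; a minimal
   reconstruction consistent with the surrounding text (an assumption,
   not the paper's own notation) is

         y_i = Y_i + b + e_i ,    E(e_i) = 0 ,

   so that the sample mean \bar{y} estimates \bar{Y} + b, where b is
   the measurement bias the validation survey is designed to estimate
   (the roughly 6% adjustment discussed above).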



 



 



 



 



 



 






   



 



[Graphic not reproduced in this text version.]



 



 



         Three years of data from the validation survey have produced



    the following results:



 



           Year    Estimated Bias    Standard    Estimated Bias as Percent
                     in Bushels        Error         of the Estimate

           1987         2.2             .9                5.8
           1988         2.3             .8                7.6
           1989         3.2             .9                8.7
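
     As an arithmetic check (ours, not the paper's), the last column
    is the bias divided by the survey estimate,

         Bias % = 100 x \hat{B} / \hat{Y} ,

    which implies survey yield estimates of roughly 2.2/.058 = 38,
    2.3/.076 = 30, and 3.2/.087 = 37 bushels per acre for 1987-1989.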



 



 



         Thus, the studies validated the 6% adjustment of the survey



    data as reasonable.  Future research can determine the optimal use



    of the validation survey for adjustment.  We also need to assess



    the implicit missing at random assumptions.  We can get some ideas



    on the reasonableness of the assumption using farmers' reported



    yields to measure group differences.  We need the assumption that



    the biases measured by the validation survey are uncorrelated with



    the action of obtaining elevator yields.  This assumption is



    reasonable, but should be tested occasionally.  With the redesigned



    validation survey we have two of the three estimates (the OY yield



    estimate, the validation survey estimates of OY bias, and a



    nonresponse bias estimate).  These are the estimates of the major



    error components which are necessary to assess the accuracy of the



    between-year yield estimates.



 



 



    Conclusion



 



        Although the level of nonsampling error in establishment



    surveys was not directly measured in WP-15, nonuse of control and



    measurement techniques should not be interpreted as a lack of



    errors.   Is it time for us to regain the balance between the



    importance which we put on the elements of survey quality and our



    actual practice?  For too many years, emphasis in most government



    agencies has been on timeliness and resources (usually shrinking).



    It's time to shift more effort to relevance and accuracy issues.



    We might help ourselves by training users in survey quality



    concepts so they can help us prioritize our efforts and maybe lead



    the effort to secure more funding.  Our easiest beginning in this



    road to quality could start merely by publishing more of what we do



    know about the errors.



 



 



 



 






 



       Increased interest in organized quality efforts such as total



    quality management philosophies is promising.  Organizations need



    to ask questions such as:



 



    1. What measure(s) does top management use to quantify



       survey or organizational effectiveness? (Is it the same



       as the data users'?)



 



    2. How are these measures used to manage and plan for the



       long run?



 



       Agencies need to assess their training needs.  We will face at



   least some shortage of new hires with the survey research skills



   necessary.  Some predict that the shortage will be acute and go



   beyond survey skills to general quantitative skills.  Will agencies



   respond with creativity in developing staffing and training plans?



   We should do more to address this problem now.



 



       Finally, WP-15, actually all the working papers, needs to be



   more widely read. (Only a small percentage of the audience at the



   presentation had seen WP-15.) Agencies and users can benefit by



   identifying errors which were not previously considered and/or



   techniques which could be used.  I caution against being



   overwhelmed with the quantity of errors displayed in WP-15.  Don't



   worry that you can't eliminate or measure them all at once.  I



   doubt that you have all these errors.  Yet, don't be complacent.



   To improve survey quality you need a strategy.  The strategy should



   define a systematic approach to the improvement and measurement of



   the effects of existing error sources as well as proposed changes



   in the survey process.  Be flexible as you move along with the



   strategy, enjoying small successes as they come and avoiding the



   expectation of overnight miracles.



 



 



   References



 



   Bailar, B.A., (1968) "Recent Research in Reinterview Procedures,"



   JASA 63:41-63.



 



   Fecso, Ron, (1986) "Sample Survey Quality:  Issues  and Examples



   from an Agricultural Survey," Proceedings of The Section on Survey



   Research Methods, American Statistical Association.



 



   Fecso, R., R.D. Tortora and F.A. Vogel, "Sampling Frames for



   Agriculture in the United States," Journal of Official Statistics, Vol.



   2, No. 3, pp. 279-292, 1986.



 



   Fecso, Ron and Brad Pafford, "Response Errors in Establishment



   Surveys with an Example From an Agribusiness Survey," Proceedings of



   the Section on Survey Research Methods, ASA, 1988.



 



 



 






 



  Francisco, C., W.A. Fuller and R. Fecso, "Statistical Properties



   of Crop Production Estimators," Survey Methodology Vol. 13, No. 1,



   June 1987, pp. 45-62.



 



   Groves, Robert M. and Lou J. Magilavy, (1986)  "Measuring and



   Explaining Interviewer Effects in Centralized Telephone Surveys,"



   Public Opinion Quarterly, Vol. 50:251-266.



 



   Matthews, R.V., "An Overview of the 1985 Corn, Cotton, Soybean, and



   Wheat Objective Yield Surveys," USDA, Stat.  Rept.  Ser., Staff



   Report.  Nov. 1985.



 



   Office of Management and Budget, Quality in Establishment Surveys,



   Statistical Policy Working Paper 15, Washington, D.C., 1988.



 



   O'Muircheartaigh, Colm A., (1986) "Correlates of Reinterview



   Response Inconsistency in the Current Population Survey," Second



   Annual Research Conference, Bureau of the Census, March 23-26, 1985



   in Reston, Va.



 



   Pafford, Brad, (1988) "Use of Reinterview Techniques for Quality



   Assurance: The Measurement of Response Error in the Collection of



   December 1987 Quarterly Grain Stocks Data Using CATI," National



   Agricultural Statistics Service, Research Report, USDA.



 



   Reiser, M., R. Fecso and K. Taylor, "A Nested Error Model for the



   Objective Yield Survey," Proc. of Section on Survey Research



   Methods, ASA, 1987.



 



   United Nations, National Household Survey Capability Programme,



   Nonsampling Errors in Household Surveys, New York, 1982.



 



 



 



 



 



 



 



 






 



                               DISCUSSION



 



                              David A. Binder



                             Statistics Canada



 



 



       I would like to thank the organizers for inviting me as a



   discussant at this important session on Quality in Business



   Surveys.  Prior to these meetings, I reviewed once again the



   Statistical Policy Working Paper 15, "Quality in Establishment



   Surveys", and I would highly recommend it be read by both novices



   and experienced survey statisticians who deal with the design or



   analysis of business surveys.



 



       One clear fact which comes out of Working Paper 15 is that



   there are many issues and methods which are common to most federal



   business surveys.  Certain issues faced in business surveys are



   more difficult than in social and demographic surveys.  Part of



   this is due to the complex and dynamic structures within which the



   business community operates.  When designing and conducting such



   surveys, it is important to keep in mind the operational realities



   of the business world.



 



       Since there are many commonalities among business surveys,



   statistical agencies should pool their knowledge and expertise to



   take advantage of their combined experience.  For example, there



   are sufficiently many common practices for sampling, data



   collection, editing, estimation and dissemination of the results, that



   certain standards and guidelines could be developed among the



   agencies.  Sharing information and expertise is a worthwhile



   objective which meetings such as this can help accomplish.  Whereas



   legalities of data sharing pose some obstacles at present,



   hopefully these can be overcome in the longer term.



 



       There are, of course, many aspects to improving the quality of



   business surveys, including frame issues and non-sampling errors.



   The development of general purpose business frames can lead to



   sophisticated and expensive systems, especially with respect to



   development and maintenance.  This is because a general-purpose



   frame should reflect the realities of the operating structures in



   the business world and there must also be user-friendly interfaces



   with such a frame.  In practice, there is often a gap between



   conceptual frameworks and actual application.



 



 



   Quality of the Frame



 



       An important area of concern in the quality of business



   surveys is the quality of the frame itself.  Survey quality will



   depend on the quality of the frame information as well as the ease



   of accessibility to the frame data.  Frames can never be perfect.



   Some of the sources of error are:



 






 



  -   undercoverage, especially for births



   -   overcoverage, especially due to duplication and inclusion



       of out-of-scope units



 



   -   misclassification of industry code, employment size,



       other size measures, etc.



 



   -   identification of appropriate reporting units (collection



       entities) which reflects the operating structure of the



       business



 



      It is important to include in the development of a frame a



   program to measure the quality of the frame information.  This is



   particularly true when the frame will be used by a variety of users



   other than the developers themselves.  Examples of quality measures



   are (a short computational sketch follows the list):



 



   -   size of the backlog for SIC classification



 



   -   distribution of lag times for births and other updates to



       the frame



 



   -   errors resulting from cutoffs for multi-unit employers



 



   -   duplication



 



   -   matching errors
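
        A short computational sketch of two of these measures follows;
    the frame records and field names are hypothetical, not from any
    actual frame system.

        # Hypothetical sketch: computing a few of the quality measures
        # listed above from frame records.  Field names are illustrative.
        from datetime import date

        frame = [
            {"id": 101, "sic_coded": True,  "birth": date(1990, 1, 5),  "added": date(1990, 3, 1)},
            {"id": 102, "sic_coded": False, "birth": date(1990, 6, 10), "added": date(1990, 6, 20)},
            {"id": 102, "sic_coded": False, "birth": date(1990, 6, 10), "added": date(1990, 6, 20)},
        ]

        # Size of the backlog for SIC classification.
        backlog = sum(1 for unit in frame if not unit["sic_coded"])

        # Distribution of lag times (in days) between births and frame updates.
        lags = sorted((unit["added"] - unit["birth"]).days for unit in frame)

        # Duplication rate, based on repeated unit identifiers.
        ids = [unit["id"] for unit in frame]
        duplication_rate = 1 - len(set(ids)) / len(ids)

        print(backlog, lags, f"{duplication_rate:.1%}")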



 



     If the frame is to contain the most up-to-date information,



    there should be some facility for incorporating and verifying



    feedback from the surveys themselves.  This can lead to



    complications, where the information being derived from one survey



    may affect other surveys (e.g. a change in the relationships among



    multi-unit employers).



 



 



    Structure of the Frame



 



         If it is anticipated that the Business Establishment Listing



    (BEL) of the Bureau of Labor Statistics will be used by other



    agencies conducting business surveys, it should be noted that many



    of their needs cannot be met within the framework being discussed



    here.  The administrative world does not always correspond to the



    business world.  A listing which is useful for employment and



    related labor characteristics may not be suitable for surveys of



     economic production and other special characteristics.



 



        The structure of the BEL for multi-unit employers needs some



    clarification.  Whereas the worksite may be able to report



    employment data, it may not be able to report on profit and loss or



    balance sheet data.  Different reporting units (collection



    entities) may need to be identified for different surveys.  It



 






 



   cannot be assumed that the respondent will necessarily conform to



    your concepts.



    



        At Statistics Canada, we have developed a hierarchical



    structure of statistical entities for the larger businesses.  These



    are (i) the enterprise, where a full set of consolidated financial



    statements are available, (ii) the company which can report on



    profit and loss and other balance sheet items, (iii) the



    establishment, which can report on such items as value of output,



    cost of intermediate inputs, inventories, number of employees, and



     salaries and wages, and (iv) the location, which can report sales and



    number of employees.  This recognizes the relationship between the



    business world and the statistical needs for economic surveys.



    However, it is a complex structure to maintain.
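
         The four-level structure can be pictured as a simple nested
     data structure; a minimal sketch (names and fields are
     illustrative, not Statistics Canada's) follows.

         # Minimal sketch of the enterprise/company/establishment/location
         # hierarchy described above.  Names and fields are illustrative.
         from dataclasses import dataclass, field

         @dataclass
         class Location:            # reports sales and number of employees
             sales: float
             employees: int

         @dataclass
         class Establishment:       # reports output, inputs, inventories, payroll
             locations: list = field(default_factory=list)

         @dataclass
         class Company:             # reports profit and loss, balance sheet items
             establishments: list = field(default_factory=list)

         @dataclass
         class Enterprise:          # consolidated financial statements
             companies: list = field(default_factory=list)

             def employees(self):
                 """Roll employment up from the location level."""
                 return sum(loc.employees
                            for co in self.companies
                            for est in co.establishments
                            for loc in est.locations)

         firm = Enterprise([Company([Establishment([Location(1.0e6, 40),
                                                    Location(2.5e5, 12)])])])
         print(firm.employees())    # 52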



 



    Retrieval Systems



         Not only are frame maintenance procedures resource intensive,



     but effective retrieval systems can be quite complex and expensive



    to develop.  Quality improvements to business surveys through



    better quality frames can only be realized if the frame information



    is easily obtained both cross-sectionally and through time.



    Examples of some of the needs which are expressed by users of frame



    information are:



 



    -   linking of data through time



 



    -   historical files



     



    -   response histories



 



    -   linking of data within enterprises



 



    -   identification of seasonal and volatile firms



 



    -   having sufficient structure to roll up to enterprise and



        track changes in structure over time



 



    -   survey feedback (and verification)



 



    -   requirements for estimation (regression, ratio,



        composite, benchmarking, poststratification)



 



  Other Frame Considerations



 



       The needs of the frame will change depending upon the survey



    frequency and the reference periods.  For example, the units



    considered in-scope could vary according to whether the survey is



    monthly, quarterly or annual.



 






 



       Even with all the complexities I have mentioned regarding the



     development and maintenance of business frames, I would strongly



    encourage such development, with any deficiencies explicitly laid



    out.  One of the uses of a high quality frame is the ability to



    perform analyses of business demographics, showing behaviour of



    births, deaths, mergers and amalgamations, which is an important



    side benefit.



 



 



    Total Survey Error



 



        As was pointed out during the session, improving frame quality is



    only one of the many mechanisms to meet the overall objective of



    controlling survey errors.  Development of survey quality profiles



    has been mentioned as an important tool to monitor, control and



    manage surveys.



 



        Response errors should be a particularly important concern to



    the survey-taker.  However, response errors are often due to the



     survey instrument itself, rather than the respondent.  Recent



    experiences with cognitive methods have proven useful here.  Often



     there are trade-offs between ideal concepts and the respondents'



    ability to respond accurately.  For example, when asking a farm



    operator about value of equipment on land which he operates, he may



    prefer to report on equipment which he owns but which may be



    situated on another farm, rather than including equipment which is



    owned by someone else, but which is situated on his land.  This



    creates difficulties for the survey-taker who is trying to avoid



    coverage errors.  These are not easy problems to overcome, but the



    first step in all these endeavors is to recognize the problem and



    possibly measure its impact.  Without special studies, it would be



     difficult to assess the relative merits of coverage error on the



    one hand and response error on the other.



 



        In general, we need to concentrate on methods to synthesize



     all the errors into an overall measure of survey quality.  This



    would allow informed decisions to be made regarding the relative



     merits of improving one survey process over another.  If such a



    model existed, we could answer some common concerns such as the



    relative contribution of edit and imputation to the reduction in



    total survey error and whether simpler methods could achieve



    comparable results.



 



         One possibility would be to develop a microdata simulation



     database which incorporates as many of the known errors as possible.



    This database would consist of microdata which look like the real



    population.  Various models for response and nonresponse errors



    could be simulated and then the data would be processed using



    existing or proposed methods.  Since the original "true" data are



    known, we could assess the relative impacts of improving survey



     coverage versus using an alternative estimator versus adding more



    edits to the survey process, for example.
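
         A minimal sketch of this simulation idea (all distributions,
     rates, and the error factor are hypothetical) might look like:

         # Sketch of a microdata simulation database: generate a synthetic
         # "true" population, inject response and nonresponse errors, then
         # compare estimators against the known truth.
         import random

         random.seed(1)
         truth = [random.lognormvariate(3, 1) for _ in range(10000)]  # skewed population

         def observe(y):
             """Response-error model: unit nonresponse plus multiplicative error."""
             if random.random() < 0.15:                    # unit nonresponse
                 return None
             return y * (0.94 + random.gauss(0, 0.05))     # ~6% underreporting on average

         reports = [observe(y) for y in random.sample(truth, 500)]
         respondents = [r for r in reports if r is not None]

         true_mean = sum(truth) / len(truth)
         est_unadjusted = sum(respondents) / len(respondents)
         est_adjusted = est_unadjusted / 0.94              # constant bias adjustment

         for name, est in [("unadjusted", est_unadjusted), ("adjusted", est_adjusted)]:
             print(f"{name}: relative error {100 * (est - true_mean) / true_mean:+.1f}%")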



 






 



                                DISCUSSION



 



                             Charles D. Cowan



                      Opinion Research Corporation



 



 



   What These Papers Have in Common



 



        If there is a single message that comes through in both the



   papers being discussed, it is that:



 



        Avoidance and/or Control is the Best Approach in Dealing



        with Nonsampling Error.



 



        Quality is something that one builds into surveys and



   continues to monitor.  While one cannot completely avoid problems



   in surveys, it is markedly better to avoid or control a problem



   than it is to attempt to make an a posteriori correction to fix the



   problem.  Such a fix usually is based on a much smaller amount of



   information collected from a supplemental sample or survey and adds



   variance to the original survey estimates.  It is also usually the



   case that a fix introduced at the end of a survey only takes care



   of one problem and is not very cost efficient.



 



        In their paper, Tupek and MacDonald describe a process of



   expanding a sampling frame for business surveys that addresses



   several different sources of nonsampling error.  Their work with



   the sampling frame deals with coverage issues, timing issues,



   definitional problems in the surveys, estimation, use of



   administrative records for weighting and variance reduction, and



   other aspects of the conduct of business surveys.  Their approach



   is to improve the basic materials used for surveys to encourage



   more efficiency and accuracy at later stages.



 



        Pecso in his paper describes a process of measuring and



 controlling as many aspects as possible of the incidence of nonsampling



 error.  He also supports the idea that nonsampling error is best



   dealt with by avoidance, but is also realistic in suggesting that



   a catalog of problems is useful for two primary purposes: planning



   future surveys and providing documentation for users of the current



   effort.  This control process can be used to ensure that the data



   produced in a survey are of the best quality given the constraint



   that control is imposed as part of the process, since many types of



   nonsampling errors cannot be totally avoided.



 



 



   Specific Quality Issues for Business and Establishment Surveys



 



       As one reads and compares these papers, one is reminded of the



    fact that business and establishment surveys are different from



    household surveys in several key ways:



 



 






 



     1)  The availability of attributes on the frame and the use



          of this frame information at the unit level differs from



          what can be done in household surveys,



 



      2)  The surveys themselves make extensive use of records as



          a basis for reporting, and



 



      3)  The data to be collected in business and establishment



          surveys have a multilevel nature, meaning that information



          about the businesses is hierarchical and we are



          interested in the information at each level (e.g., Sears



          Headquarters, regional offices, distribution centers, and



          individual stores).



 



        These factors are crucial to the design of business and



   establishment surveys.  Use of information on the frame for design



   and use of records in collection makes it possible to improve the



   quality of these types of surveys relative to household surveys,



   but, this is counterbalanced to an extent by the complications



   introduced by the multilevel nature of the data to be collected.



 



         Tupek and MacDonald note in their paper that, for the surveys



    they conduct, establishments come from skewed populations, and



   having this information on the frame makes it possible to design a



   survey that is much more efficient, especially for multiple



   characteristics to be measured simultaneously.  However, reliance



   on this information in the frame makes the accuracy of frame



   information crucial at the individual unit level for both sampling



   and estimation purposes.  Their project on frame expansion and



   improvements has an impact in several areas.  The first is sample



   frame development, so that more business and establishments are



   represented.  This is broader than a coverage issue, since coverage



    is usually viewed as a problem that pervades an extant frame.



   Tupek and MacDonald address coverage issues in this way, but also



   include whole segments of the business population previously



   excluded from the frame.
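
         One standard way such frame size information is exploited for
    the skewed populations just mentioned is a certainty stratum plus
    a sampled remainder; the sketch below is illustrative (cutoff,
    counts, and rates are hypothetical), not the design of the paper
    under discussion.

        # Illustrative use of a frame size measure for a skewed population:
        # take large units with certainty, sample the rest.
        import random

        random.seed(2)
        employment = [random.lognormvariate(2, 1.2) for _ in range(5000)]  # size measure

        CUTOFF = 200.0
        certainty = [e for e in employment if e >= CUTOFF]   # taken with certainty
        remainder = [e for e in employment if e < CUTOFF]
        sampled = random.sample(remainder, 300)

        # Design-based estimate of total employment.
        weight = len(remainder) / len(sampled)
        estimate = sum(certainty) + weight * sum(sampled)
        print(f"{len(certainty)} certainty units; estimated total {estimate:,.0f}")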



 



         A second area impacted by the frame expansion and improvements



   project on which they report is the actual design of the sample,



   where the sample can be optimized for making different types of



   estimates using information available on the frame.  A third area



    impacted by the frame expansion and improvements is in data



   collection, and the final area is in estimation.  Tupek and



   MacDonald point out that the new frame encourages the conduct of



   new longitudinal surveys, the selection of sample at the unit of



   analysis (instead of collecting the information by proxy or



   sampling down to the unit of analysis after starting at a higher



   level in the hierarchy), improvement in response rates because of



   higher eligibility rates, savings in terms of time and effort



   expended on the survey, and improvement in weighting and ratio



   estimation procedures.



 



 






 



      Fecso takes a different approach to dealing with nonsampling



error.  He catalogs sources of nonsampling error, and his approach



is to detect, measure, and control the nonsampling error.  Many of



the sources of nonsampling error he lists are common to both



household and business surveys, but with business surveys he has a



variety of records, including past survey collections, available



for detection and measurement of nonsampling error.



 



      A primary concern for the use of records is the accuracy of



the data in the records, since the records themselves could be in



error.   Although not mentioned in the paper, some of the most



interesting work in health care surveys is modeling of nonsampling



error when hospital records and information based on patient recall



don't match and either is potentially wrong.  The same is true for



business surveys -- accuracy in the records systems is crucial for



detection and measurement of nonsampling error as part of a quality



management system for a survey.  Another factor related to accuracy



is the consistency of definitions used by different respondents.



If the data are accurate but based on different definitions, then



there is a problem in how the data might be used for detection and



measurement of nonsampling error.



 



 



Concerns with Business and Establishment Surveys Not Covered



 



      While both papers are excellent in the way they cover in depth



quality issues facing business and establishment surveys, they both



miss some salient points peculiar to these types of surveys.  The



first was mentioned earlier, namely that businesses are



hierarchical, which leads to some difficult questions regarding who



reports in these surveys, and how the various businesses relate to



one another (i.e., at what level do we define the unit of



 analysis?).  In terms of how units relate, an example was given



earlier for Sears, which owns not only Sears Retail, but also has



Allstate Insurance, a mailing service, regional offices, catalog



stores, and local retail stores.  Are we interested in these



surveys in getting reports from the lowest level in this chain?



How does Sears headquarters report exactly -- for itself as an



establishment with a certain number of employees, or does it



include all employees and sales at all locations?  If there is



confusion in reporting rules for a survey, we could wind up with



severe overcounting or undercounting of activities and personnel.



 



      Another issue has to do with the reporting of activities



within a firm.  In reporting mailing activities, for example, each



firm and each location of a firm will have some activities to



report.  To whom do we speak in the firm to get a complete picture?



There are separate operating units within firms, each with a



manager knowledgeable about his own unit's activities.  And there



are sometimes other units that assist in terms of technical or



operational support.  Do we talk to managers in both or all offices



 



 






 



or units, or is there a central source that can answer all



 questions knowledgeably and without duplication?



 



      There are two final concerns we have regarding quality in



 business and establishment surveys.  One has to do with the process



 of improving and expanding the frame for a business survey, which



 usually translates into adding smaller firms.  These firms are more



 likely to be related to other members of the population, and they



 are more prone to movement in and out of the population (births and



 deaths).  Because of these factors, they add a certain amount of



 instability to the estimation process.  This may be good or bad --



 on the one hand we have a more realistic representation of the



 population of businesses when we include more firms, but on the



 other hand for certain types of statistics we may be adding more



 variation without a real gain in forecasting or descriptive



 accuracy.  This problem could be labeled: "messiness at the edge".



 



      The other problem not addressed in either paper, and of



 particular concern in the Fecso paper, is that a large, well



 conceived and executed survey might not benefit from a



 nonresponse/nonsampling error correction that is estimated from a



 small, one-time experiment.  While in theory the idea of implementing



 research studies to monitor the quality of ongoing surveys is



 laudable and should enhance the quality of the surveys,



 implementation for Federal surveys often falls a bit short, with a



 simple, one-time study implemented to measure a particular problem.



 A small scale, high variance research study should be viewed as



 just that, and not a vehicle for making corrections to a



 multimillion dollar effort.  If the nonsampling error problem is



 sufficient to justify such an effort, and the nonsampling error



 cannot be dealt with as part of the design, then sufficient



 resources should be devoted to measurement and control to take care



 of the problem.  Essentially, the problem becomes one of design



 again, with focus on the proper allocation of resources between the



 survey and the experiment to fix the survey.



 



 



 Conclusions



 



      Both papers were excellent summaries of the state of the art



 for measuring and maintaining quality in Federal surveys of



 businesses and establishments.  Researchers involved in the design



 of either business or household surveys would benefit from studying



 and implementing the principles found in either paper.



 



 



 



 



 



 



 



 






 



                               Session 8



                         COGNITIVE LABORATORIES



 



 



 



 



 



 



 



 






 



 



 






 



      THE BUREAU OF LABOR STATISTICS' COLLECTION PROCEDURES



   RESEARCH LABORATORY: ACCOMPLISHMENTS AND FUTURE DIRECTIONS



 



                         Cathryn S. Dippo



                         Douglas Herrmann



                 U. S. Bureau of Labor Statistics



 



 



 I. Introduction



 



      The accomplishments of the Cognitive Aspects of Survey



 Methodology movement (Jabine, et al. 1984) have clearly been



 substantial.  This is especially true in Washington, where three



 Federal agencies (Bureau of the Census, Bureau of Labor Statistics



 (BLS), and the National Center for Health Statistics) have



 established laboratories.



 



      Consider the scope of BLS' survey research programs.  Most of



 the sampling units from which data are collected by or for BLS are



 establishments.  While approximately 60,000 households are



 questioned about labor force participation each month in the



 Current Population Survey (CPS), 340,000 establishments are being



 asked to report their payroll employment each month in the Current



 Employment Statistics Survey.  More than 200,000 price quotes are



 being collected each month from establishments in the Consumer



 Price, Producer Price, and International Price Index programs.



 Moreover, much of the data are currently being collected by mail,



 without person-to-person interaction.  In the future, more and more



 of the data will be collected with computer assistance, and the



 human-machine interface will take on added importance.



 Furthermore, in most establishment surveys, the needed data can be



 directly observed (e.g., consumer prices) or exist in records



 rather than in the memories of the respondents.  Even in household



 surveys, many respondents are being asked to recall not only



 autobiographical events, but also information that exists in



 household records and information about other members of their



 household.



 



      Thus, the mission of the Bureau requires the BLS laboratory to



 consider more than just questionnaires to be used with personal



 visit interviewing in the context of a household survey about



 autobiographical events.  The Bureau acknowledged this fact when



 selecting the name for its laboratory -- the Collection Procedures



 Research Laboratory (CPRL) -- which was established in 1988.  The



 basic goal of the CPRL is to improve through interdisciplinary



 research the quality of data collected and published by BLS.  As



 originally envisioned, all forms of oral and written communication



 used in the collection and processing of survey data are



 appropriate subjects for investigation, as are all aspects of data



 collection, including mode, manuals, and interviewer training.



 



 



 






 



        The CPRL's staff includes cognitive psychologists, social



  psychologists, sociologists, and a psychological anthropologist.



  For most of their projects, they work closely with the economists



  or program specialists responsible for defining the concepts to be



  measured by the Bureau's survey programs.  To augment staff



  resources, the CPRL has labor hour contracts with the Institute for



  Social Research at the University of Michigan and Westat, Inc.  The



  laboratory also does work under contract for other Federal agencies



  such as the Internal Revenue Service.



 



        Although the CPRL has only existed for two years, its research



  program has been both broad and prolific.  In section II, some



  accomplishments of the CPRL, are reviewed.  The discussion is



  organized within the framework of an information processing model.



  In section III, some directions for future research are described.



  The success of focusing on the cognitive system suggests that



  focusing on other behavioral systems may produce further gains in



  data quality through improved survey theory and practice.



  Moreover, the success of using laboratory techniques for



  investigating the data collection processes used in sample surveys



  leads us to believe the techniques can be useful in improving other



  aspects of survey design.



 



 



  II. Accomplishments to date



 



        The CPRL has integrated the cognitive approach into the



  Bureau's survey research program to good effect in many ways.



  Primarily, the laboratory has changed how data collection research



  is conducted at BLS.  Not only has the research conducted to date



  affected our understanding of the survey process, but the fact of



  its existence has heightened awareness throughout BLS of the need



  for a better understanding of all aspects of the data collection



  process (Norwood and Dippo in press).



 



        Some results of the CPRL's research efforts are presented here



  within the framework of an information processing model (Cannell



  et al. 1989; Tourangeau 1984) that has four distinct stages:



  comprehension, retrieval, judgment, and communication.  As applied



  to respondents, these stages refer to the comprehension of a



  question, retrieval of pertinent information, judgment about the



  accuracy of the information retrieved, and communication about this



  information within social and other restrictions imposed by the



  survey situation.  As applied to interviewers, these stages may



  refer to comprehension of the question, retrieval of appropriate



  ways to say the question aloud, judgment about whether the



  respondent has understood the question, and communication to ensure



  the question has been understood (such as by rereading it) or, if



  the question has apparently been understood, to indicate that



  another question is about to be presented.



 



 



 






 



  A. Comprehension



 



      Question comprehension clearly requires that the terms making



  up a question be correctly understood.  The accuracy of term



  comprehension has been shown by many psycholinguistic



  investigations to differ in certain ways.



 



      Multiple meanings of terms: A term may lead some respondents



  to answer inappropriately because it may convey a meaning different



  from that intended by the designer.  Research at BLS has



  accordingly attempted to identify terms with several meanings that



  are not made explicit by the phrasing of questions and might be



  likely to produce misinterpretations.  Since the issue of employment



  is of personal significance to most people, questions about



  employment status are likely to predispose respondents (especially



  the unemployed or those with insecure employment) to be influenced



  by social desirability when answering the CPS (DeMaio 1984;



  Edwards, Levine, and Allen 1989).  The misinterpretation of



  employment status terms may easily occur in a survey such as the



  CPS (Martin 1987).



 



       Accordingly, respondents' interpretations of two key terms on



  the CPS concerning unemployment status, "on layoff" and "looking



  for work," have been examined.  The CPS definition of unemployment



  refers to persons who were not employed during the survey week,



  were available for work, and had made specific efforts to find



  employment sometime during the prior four weeks.  Persons who are



  waiting to be recalled to a job from which they have been laid off



  need not be looking for work to be classified as unemployed.  As



  expected, research demonstrates that these terms are sometimes



  misinterpreted by laboratory respondents to the CPS.  Similar



  research into the effects of multiple meanings of terms has also



  been conducted for several sections of the Consumer Expenditure



  (CE) Interview Survey, including the sections on medical care, home



  purchase, and trip expenditures (Miller and Downes-LeGuin 1989).



  Since our results indicated that people interpret "payments" in



  different ways, the section on medical care expenditures has since



  been modified to avoid misinterpretations of this term.



 



       Diverse Meanings: Diversity of term meaning also may impair



  comprehension.  For example, in a recent pilot survey of business



  establishments, respondents were asked to report all "nonwage cash



  payments" paid to employees during the calendar year.  BLS defined



  the payments to include bonuses and awards, lump-sum, cash profit



  sharing, and severance payments, and nonregular commissions, but



  since this technical term probably was not too familiar to



  respondents, the meanings of "nonwage cash payments" can be



  expected to vary across respondents.  When the interpretations of



  this term by respondents were investigated, it was found that



  respondents interpreted "nonwage cash payments" in a diverse



  fashion.  Some interpreted it too broadly to include payments in



  kind, such as a new car (Boehm 1988), and some too narrowly to



 






 



 include only cash and not cashable checks (Phipps 1990).  Another



  group of respondents who had made such payments simply checked they



  had made no payments because of a lack of understanding of what the



  term included.  Respondent exclusion and nonreporting of payments



  were more serious comprehension errors than inclusion of



  inappropriate payments, contributing to underreporting.



 



        Format Properties: When respondents complete a survey form



  received in the mail, the format of the instrument may play a



  crucial role in the respondents' comprehension.  If the format does



  not make it clear what parts of the instructions are essential,



  respondents may overlook these parts and respond inappropriately.



  For example, in the Nonwage Cash Payments Pilot Survey (Phipps



  1990), instructions, definitions, and examples were on the back of



  a one-page questionnaire, for which two different layouts were



  used.  One layout required respondents first to provide an annual



  nonwage cash payment total and an annual payroll total, then answer



  a set of yes/no questions asking if they made specific types of



  nonwage cash payments.  The second layout placed the set of yes/no



  questions first, with the payments and payroll totals requested at



  the bottom of the page.  Reporters receiving the  second layout were



  much less likely to provide the annual payroll total, stating in



  retrospective interviews that they overlooked it or did not



  understand they were to provide it.  Thus, the layout of the second



  form, combined with a lack of instruction, caused an entire section



  of the form to be overlooked.  As expected, the format of a survey



  played an important role in the respondents' comprehension of



  survey items.



 



       The types of cues used on a self-administered form like an



  expenditure diary also can affect comprehension.  In developing a



  diary for recording clothing expenditures, alternative cueing



  levels were tested in a laboratory.  Results indicated that a



  shorter diary with multiple pages that repeated the general cues,



  e.g., buying clothes, was more effective than a longer, more



  structured version with specific cues.  Respondents were better at



  clarifying the domain of purchases to be recorded with the general



  cues than with the specific cues, i.e., the specific cues led them



  to restrict their comprehension of listed items more narrowly than



  intended.



 



 



  B. Retrieval



 



       Most Federal surveys require respondents to retrieve



  information about factual or autobiographical events.  Faced with



  the need to control data collection costs, the time period for



  which the events are to be recalled is often long.  For example,



  the reference period for the CE Interview Survey is three months.



  In the CPS, respondents may be asked questions about last week, the



  last four weeks, or the last time they worked, which could require



  recall for a long period of time.  (For further discussion of



 






 



memory retrieval errors in CE and CPS, see Dippo 1989 and Mullin



1990).



 



     Cues: Often a situation is inadequate in the cues it presents



for retrieval.  Alternatively, when enough appropriate cues are



brought forth, a person can retrieve the previously "forgotten"



memory.  While some information is probably lost from memory due to



diseases and environmental influences (such as alcohol), cues



clearly play an important role in retrieval.  Accordingly, several



investigations have attempted to increase response accuracy on



surveys by providing additional cues to retrieval, e.g., Lessler,



et al. (1989).  Still, it is important to recognize that some cues



can be misleading and prevent a respondent from retrieving



the appropriate information.  Cues facilitate only when they



correctly direct retrieval.



 



     In the Nonwage Cash Payments Pilot Survey, underreporting was



investigated by presenting cues to facilitate retrieval.  When



respondents (company representatives) were given specific cues



pertaining to bonus and award payments, recall of such payments was



11 percent higher than without cues (Phipps 1990).  Also, in the CE



Diary Survey, cues with varying levels of generality have been



tested.  For example, general cues included "beef (ground, roasts,



steaks, briskets, etc.)" and specific cues included "ground beef,



chuck roast, round roast, other roast, round steak, sirloin steak,



other steak, other beef and veal." Underreporting was greater with



general cues for certain items, particularly nonfood items.  On the



other hand, the level of reporting for many food items was not



affected by the type of cues (Tucker and Bennett 1988).



 



      Strategies:  To get accurate recall about the past, it is



necessary to get people to retrieve the mental records of what they



actually did.  Several strategies to get respondents to access



their memories of experiences have proved useful in our



investigations at BLS.  One strategy has respondents recall a



critical personal event that occurred in the reference period in



order to anchor the period.  A second strategy has a respondent



consult a calendar when attempting to recall.  A third strategy has



respondents decompose events recalled into smaller events to ensure



that what is being recalled is a real experience and not a



stereotypical schema.  Research funded by BLS has found that



respondents vary in the extent to which they employ the strategy



that they were instructed to use.  Only one-third of the laboratory



subjects instructed to use a decomposition strategy when responding



to questions on their hours worked used the strategy.  Also, the



vast majority of proxy respondents presented with this strategy



ignored it because they did not have the knowledge necessary to use



it.



 



      Expertise:  In a laboratory study of household respondent



pairs using the CPS questionnaire, proxy responses disagreed with



those of the self-respondent approximately one-third of the time



 






 



(Boehm 1989).  In another laboratory study, when respondents were



 instructed to use the decomposition procedure, the vast majority of



 proxy respondents ignored the procedure, since they did not have



 the knowledge necessary to use it (Edwards, et al. 1989).  Self-



 respondents were found to overreport and proxy respondents to



 underreport the hours worked.  Also, proxy respondents were more



 likely than self-respondents to make errors, and their errors



 tended to be larger (see also Tanur 1990).  As might be expected,



 proxies fail in areas they are less likely to know about.  For



 example, proxies underreport more when the person reported on



 worked weekends or worked extra hours.  Also, proxy error was



 greater when the respondent was unrelated to or from a different



 generation than the person to whom the data related (Edwards, et



 al. 1989).



 



 



 C. Judgment



 



      People may recall correctly but not realize the recalled



 information is correct.  They may recall correct information, know



 it is correct, but express it inappropriately because they



 misconceive how responses are to be expressed.  It was noted above



 that field research on the CE Diary Survey indicated specific cues



 were often more effective and led to less underreporting than



 general cues (Tucker and Bennett 1988).  Laboratory research has



 indicated that judgment is also a factor.  When given specific



 cues, laboratory subjects were sometimes unsure of where to record



 products on the form.  Whether this hinders reporting is still an



 open question, but the accuracy of reports is affected (Tucker, et



 al. 1989).  The specific cues also may make the task more onerous.



 



 



 D. Communication



 



      The importance of communication to cognition has long been



 recognized in social psychology and anthropology.  A considerable



 amount of survey research has shown that respondents' inclination



 to answer questions may be affected by the social desirability of



 the answers.   In some cases, respondents may be disinclined to



 answer because they do not want to share certain kinds of



 information.  In other cases, they may not want to present



 themselves in a bad light.  In other cases yet, they may want to



 adapt their response to what they perceive to be the expectations



 of the interviewer.



 



      While BLS has yet to complete an investigation of



 communication, it has recently begun several such investigations.



 First, the laboratory is conducting research into the



 psycholinguistic factors that persuade a respondent to provide



 confidential information to a survey (Herrmann, et al. 1990).  This



 research will indicate the degree of trust elicited by different



 protection terms (confidential, private, secret, concealed,



 






 



 nondisclosed).  Second, we are examining the influence of



  interviewer errors on the errors of respondents using techniques



  developed by Cannell (Cannell, et al. 1989).  For example, tape



  recordings of CE Survey interviews are being analyzed to determine



  whether the quality of answers produced by respondents varies with



  the quality of the interviewers' presentation of a question.



  Third, like other agencies we are investigating the use of



  computer-assisted telephone interviewing (CATI) for some BLS



  surveys.   Research is underway for the CPS, CPI-Housing, and



  Continuing Point-of-Purchase surveys to determine if people respond



  in the same manner in a computer-assisted telephone interview as



  they do in a personal interview.  It has been suggested that the



  personal interview ensures better attention from the respondent,



  but it has also been suggested that CATI elicits information that



  otherwise might not be disclosed because the respondent feels less



  personally involved when interacting with an interviewer on the



  telephone.   In various ways our research is addressing these



  alternative expectations about CATI.



 



 



  III.  Future directions



 



       Prior to the establishment of the laboratory, BLS sponsored a



  Questionnaire Design Advisory Conference to seek advice on the



  types of questionnaire research that should be undertaken for the



  CE and CPS (Bienias, et al. 1987).  The conference participants all



  advocated the incorporation of cognitive concepts into the BLS



  research program and suggested that research focus on the issues of



  respondent rules, respondent and interviewer roles, questionnaire



  form and content, and statistical estimation.



 



       In addition, our ongoing research program has taught us that



  many aspects of the data collection process require a broader



  integrated-systems approach rather than a cognitive approach to



  research.  The accuracy and efficiency of survey responses are



  affected not only by cognitive variables (e.g., abstractness of



  terms, retrieval cues) but also by other kinds of variables (e.g.,



  physiological, perceptual, emotional, motivational, social,



  societal, cultural, and economic; see Royce 1973).  In some cases,



  these variables affect responding because they interact with the



  quality of cognitive processes underlying responding.  In other



  cases, these other variables leave cognitions unaffected but



  instead interact with a respondent's inclination to report



  accurately about these cognitions.



 



 



  A. Looking beyond the cognitive approach



 



       An integrated-systems conception of cognition has been



  advocated increasingly in recent years by scholars in anthropology



  (Cole and Scribner 1974), psychology, and neuroscience.  Some



  noncognitive psychological and societal factors that may affect the



 






 



response process are: Physiological condition, perception,



emotional state, motivation, familial roles, and societal norms.



 



     Physiological condition:  The accuracy and efficiency of



cognitive responses are affected by the physical state of a



person's body (Squire 1987).  Physiological condition, as affected



by physical health, influences a person's ability to understand,



remember, reason, and analyze.  A variety of routine health



conditions (such as the common cold) may impair the accuracy and/or



efficiency of cognitive processes (Cutler and Grams 1988).



Cognitive processes are also impaired by commonly used



substances, such as coffee, tobacco, tranquilizers and



antidepressants, and even certain antibiotics.



 



     The CPRL has been sponsoring laboratory research on the



effects of computer-assisted personal interviewing (CAPI) on the



interviewer (Couper et al. 1990).  Although the studies have been



within the context of the Consumer Price Index survey, where



interviewers conduct interviews both on the doorstep of housing



units and walking the aisles in retail establishments, the



procedures developed, concerns raised, and results are generally



applicable.  For example, more than 40 percent of the 46



interviewers who volunteered to be laboratory subjects stated that



they had suffered neck, shoulder, and/or lower back problems in the



12 months prior to any contact with a portable computer.  Moreover,



approximately 75 percent of the subjects wore some form of



corrective lenses, with bifocals presenting particular problems for



interviewers trying to focus on the keyboard, screen, and



respondent.



 



     Perception: The quality of visual stimuli affects the ease of



reading and comprehension.  The role of perception is of special



importance in many Federal surveys where data are collected via a



self-administered form.  For these surveys, the perceptual



constructs may have significant effects on the quality of data.



Wright (1980) suggests classifying form-design issues into three



categories: the language of forms, overall structure, and the



substructures within the forms such as the questions themselves.



In addition, there are perceptual issues related to the appearance



of questionnaires, such as color and print font.



 



     The physical presence of a visual stimulus affects retrieval
processes more than merely thinking about or imagining the stimulus
does.  For example,



psychological research has found that the frequency with which
academics use external aids, such as files and piles of papers on
their desks, is positively correlated with scholarly productivity
(Hertel 1988).  Survey research indicates



that expenditure reporting increases when respondents use an
information booklet describing the types of items that belong to
the categories the interviewer reads aloud.  More



respondents appear to be willing to read the item lists than to



listen to an interviewer read the list to them.



 






 



       Respondents to the Occupational Safety and Health Survey face



  a very difficult task in deciding if an incident is an injury or an



  illness and if it is reportable or not.  Currently, respondents



  receive a 22-page set of guidelines.  Laboratory staff are now



  investigating different methods for communicating the decision



logic to respondents, e.g., flow charts or graphic representations



  of the decision paths.  In addition, a simple user's guide (no more



  than 10 pages) is being prepared for respondents who are new to



  OSHA recordkeeping.  Unlike the longer guidelines, this guide



contains background on the 1970 OSHA Act and provides examples of



  how to recognize, record, and report occupational injuries and



  illnesses.



 



       Emotional state:  Our cognitive ability to comprehend,



  retrieve, evaluate, and respond may be affected by our emotional



  state (Wolkowitz and Weingartner 1988), which in turn may be



  affected by recent events or prolonged stress.  Stress, a major



  factor moderating emotional states, has been associated with



  cognitive failures in everyday life.  Sometimes, emotional states



may prevent people from producing correct responses that they



  "know" at some level.  For example, despite decades of controversy,



  it is now generally accepted that sometimes people repress



  memories.



 



       Nontrivial levels of stress are currently experienced by



  interviewers.  With the change over the next decade to increased



  CATI, the possibility of increased interviewer stress is real.  In



  surveys like the CPS, the proportion of personal visit interviews



  will increase for most interviewers working in large metropolitan



  areas as many of their telephone interviews are transferred to a



  centralized CATI facility.  Concerns about personal safety and



  administrative pressures to maintain high response rates are but



  two factors which may contribute to increased interviewer stress.



  In a centralized CATI facility, interviewers know their work is



  constantly being monitored.  Recent news stories about the effects



  of constant observation and work quotas in the telephone industry



  indicate stress levels can be very high in these kinds of



  situations.



 



       Motivation: We know little about respondents' motivations for



  responding to survey questionnaires.  Census' recent experience of



  overestimating the mail-return rate in the decennial census is but



  one indicator of how little we know.  At BLS, those of us working



  on the CE Interview Survey constantly wonder why anyone would agree



  to an interview that is expected to last 2 hours.  To investigate



  survey respondent motivation, a large-scale research project on



  household survey response has been initiated by Robert Groves at



  the University of Michigan, sponsored by the Bureau of Justice



  Statistics, the Bureau of Labor Statistics, and the National Center



  for Health Statistics.  One part of the project is an examination



  of both interviewer (e.g., attitudes, behavior, and



  characteristics) and administrative (e.g., procedures, workload



 






 



 



levels, design parameters) influences on survey participation



(Groves and Cialdini 1990).  To examine the effects of
alternative forms of persuasive communication on sample attrition



rates and item response rates, BLS is conducting experiments using



appeals that stress the use of Current Employment Statistics data



by the trade associations representing the establishments (McKay



1990).



 



     Familial roles:  The roles people assume within the family



have been found in recent years to affect cognitive processes.



While it may be assumed in some surveys that people within a home



are equally able to answer questions pertaining to the household,



research shows that different family roles carry responsibility for



knowing about certain kinds of information.  For example, wives



tend to know more about the health and activities of children



whereas husbands tend to know more about how community activities



affect the household.  Single parents tend to know the information



possessed by both spouses in dual-parent households.



 



     With the prevalence of proxy reporting in most household



surveys, the importance of learning about what information is
exchanged within households, and how, should not be understated.



Recent research on proxy reporting in the CPS indicates adults may



be worse proxy reporters for youths than for other adults in a



household (Tanur 1990).  Moreover, the proxy reporting of job



search may be dependent upon the type of job search strategies



being used by youth.  As Tanur notes, there is no literature about



family communication patterns and the issue of who in the family



talks to whom about what.



 



     Societal norms: Cognitive performance is affected by groups



in several ways.  For example, people are disinclined to perform



memory tasks when the social stereotypes that apply to them



indicate that they cannot perform well, such as the stereotypes



associated with age or with gender.  Also, people will sometimes



knowingly give the wrong answer to a question because they



recognize that their answer is contradicted by the other members of



a group.



 



     Moreover, social pressures sometimes dispose people to



communicate falsely what they do or do not know in order to achieve



social goals.  For example, people may say they cannot recall some



event or information to avoid or speed up the questioning, or to



make a certain impression on the questioner.  We do know that



social desirability plays a role, but there has been little



research into understanding the role (DeMaio 1984).  We also know



that the mode of data collection appears to have an effect on data,



but we do not know why (Shoemaker, et al. 1989).  Recent research



by Suchman and Jordan (1990) shows clearly the influence of social



and cultural variables.



 



 



 






 



     Evidence indicates that members of all cultures can perform all
manner of cognitive tasks equally well if the environment has
provided the cultures equivalent education and experience.



 However, because cultures typically involve different educational



 systems, belief systems, and occupational opportunities, members of



 different cultures acquire different cognitive skills (Cole and



Scribner, 1974).  Thus, members of different subcultures of a



 multicultural society will interpret certain concepts differently



 and answer differently.



 



 



 B. Looking beyond the interviewing process



 



    The research laboratory and laboratory techniques can be used



 in a variety of survey design applications.  Just as the responding



 process is affected by noncognitive variables, the survey process



 consists of more than just question answering.  The entire survey



 design process, from defining the concepts to be measured through



 analyzing the data, involves the communication of concepts between



 people with different knowledge bases or an interaction between



 people and things.  The process can benefit from a broad range of



 interdisciplinary research including both cognitive and other areas



 of psychology, other behavioral sciences, and human neuroscience.



 



      The importance of the role of the interviewer has long been



recognized.  Data collection and training methods, such as
structured questionnaires and verbatim training, have been developed
in an attempt to control interviewer error.  Interviewer training
typically stresses the



 need for neutrality, the use of specified questionnaire wording and



 administration procedures, and appropriate probing techniques.



 Recognizing the importance of this source of error, many BLS-



 sponsored laboratory studies conducted in the last two years have



 focused on the interviewer.  These studies indicate the role of the



 interviewer can be studied effectively with laboratory techniques.



 Thus, it seems natural to expand our research in this area.



 



 



 IV. Summary



 



      As survey researchers, we really know very little about the



 psychological processes underlying interviewer and respondent



 behavior.  The few laboratory studies to date indicate the cognitive



 approach is very useful.  With this approach we are learning about



 the roles of comprehension, recall, judgment and communication in



 the survey response process.  Eventually, as we learn more, we can



 develop detailed models which questionnaire designers can use to



 assess new questions and forms for survey data collection.



 



       Just as the research to date has shown that the cognitive



 approach is effective, it has shown that a more broad-based



 approach is necessary.  Survey responses clearly emanate from all



 






 



behavioral systems within and outside the respondent.  An



 understanding of how responding is affected by the cognitive system



 is not enough.  A respondent's behavior is influenced by



 physiological, emotional, social, societal, and economic variables.



 A complete explanation of responding requires an understanding of



 all systems and how their influences are integrated overall to



 produce a response.



 



      The adoption of an integrated-systems approach would be a



 natural step in the evolution of survey science.  Consider the



 disciplinary history of economic statistics.  First, there were



 economists producing simple descriptive statistics.  The discipline



 of mathematical statistics was not really incorporated until



 probability sampling became the basis for sample designs.  Then



 came the advent of computers.  Just as we have expanded our use of



 statistical theory as applied to survey research beyond just



 sampling (e.g., to incorporating operations research techniques in



 sample design optimization and iterative methods such as raking in



survey estimation), survey research may progress further by making
use not only of cognitive psychology but also of knowledge of other



 psychological and sociopsychological systems.
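
     To make the raking reference concrete, the sketch below is a
minimal illustration of iterative proportional fitting, the algorithm
behind raking.  The respondent data, margin categories, and population
totals are hypothetical, invented purely for illustration; they are
not drawn from any BLS survey.

     import numpy as np

     # Hypothetical sample of six respondents with two margin variables.
     sex = np.array(["M", "F", "M", "F", "F", "M"])
     age = np.array(["<35", "<35", "35+", "35+", "<35", "35+"])
     weights = np.ones(len(sex))          # start from uniform weights

     # Assumed known population totals for each margin (hypothetical).
     margins = [
         (sex, {"M": 480.0, "F": 520.0}),
         (age, {"<35": 450.0, "35+": 550.0}),
     ]

     # Raking: repeatedly rescale the weights so that each margin's
     # weighted totals match the population totals; cycling over the
     # margins converges quickly for well-behaved tables.
     for _ in range(50):
         for var, targets in margins:
             for category, target in targets.items():
                 mask = var == category
                 weights[mask] *= target / weights[mask].sum()

     print(weights.round(1), weights.sum())   # totals now match margins

After convergence the weighted totals reproduce both sets of margins
simultaneously, which is the sense in which raking extends survey
estimation beyond pure sampling theory.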



 



 



 References



 



 Bienias, J., Dippo, C., and Palmisano, M. (1987), Questionnaire



 Design: Report on the 1987 BLS Advisory Conference, Washington, DC:



 U.S. Department of Labor, Bureau of Labor Statistics.



 



 Boehm, L. (1988), "CES Nonwage Cash Payment Prepilot Interviews,"



 Internal memorandum to Alan Tupek dated December 16, Washington,



 DC: U.S. Department of Labor, Bureau of Labor Statistics.



 



 Boehm, L. (1989), "The Relationship Between Confidence, Knowledge,



 and Performance in the Current Population Survey," in Proceedings



 of the Section on Survey Research Methods, American Statistical



 Association, in press.



 



 Cannell, C., Fowler, F., Kalton, G., Oksenberg, L., and Bischoping,



 K. (1989), "New Quantitative Techniques for Pretesting Survey



 Questions," in Bulletin of the International Statistical Institute,



 pp. 481-495.



 



 Cole, M. and Scribner, S. (1974), Culture and Thought: A



 Psychological Introduction, New York: John Wiley and Sons.



 



 Couper, M., Groves, R., and Jacobs, C. (1990, in press), "Building



 Predictive Models of CAPI Acceptance in a Field Interviewing



 Staff," in Proceedings of the 1990 Annual Research, Conference,



 Washington, DC: U.S. Department of Commerce, Bureau of the Census.



 



 



 






 



 Cutler, S.J. and Grams, A.E. (1988), "Correlates of Self-Reported



  Everyday Memory Problems," Journal of Gerontology, 43, 582-590.



 



  DeMaio, T. (1984), "Social Desirability and Survey Measurement: A



  Review," in Surveying subjective Phenomena, eds.  C. Turner and E.



  Martin, New York: Russell Sage.



 



  Dippo, C.S. (1989), "The Use of Cognitive Laboratory Techniques for



Investigating Memory Retrieval Errors in Retrospective Surveys," in



  Bulletin of the International Statistical Institute, Vol.  LIII,



  Book 2, pp. 363-382.



 



  Edwards, S., Levine R., and Allen, B. (1989), "Cognitive



Strategies for Reporting Hours Worked," in Proceedings of the



  Section on Survey Research Methods, American Statistical



  Association, in press.



 



  Groves, R.M. and Cialdini, R. (1990), "Toward a Useful Theory of



  Survey Participation," unpublished manuscript.



 



  Herrmann, D., van Melis-Wright, M., and Stone, D. (1990), "The



  Semantic Basis of Confidentiality," in Proceedings of the Section



on Survey Research Methods, American Statistical Association, to



  appear.



 



  Hertel, P. (1988), "External Memory," in M. Gruneberg, P. Morris,



  and R. Sykes (eds.), Practical Aspects of Memory, New York: John



  Wiley and Sons.



 



  Jabine, T., Straf, M., Tanur, J., and Tourangeau, R. (1984),



  Cognitive Aspects of Survey Methodology: Building a Bridge Between



  Disciplines, Washington DC: National Academy Press.



 



  Lessler, J., Salter, W., and Tourangeau, R. (1989).  "Questionnaire



  Design in the Cognitive Research Laboratory: Results of an



  Experimental Prototype," Vital and Health Statistics, Series 6, No.



  1 (DHHS Publication No. PHS 89-1076), Washington, DC: U.S.



  Government Printing Office.



 



  Martin, E. (1987), "Some Conceptual Problems in the Current



  Population Survey," in Proceedings of the Section on Survey Methods



  Research, American Statistical Association, pp. 420-424.



 



  McKay, R. (1990), "Application of Persuasive Communication



  Strategies to a Business Establishment Survey," in Proceedings of



the Section on Survey Research Methods, American Statistical



  Association, to appear.



 



  Miller, L. A. and Downes-LeGuin, T. (1989), "Reducing Response



  Error in Consumers' Reports of General Expenses: Application of



  Cognitive Theory to the Consumer Expenditure Interview Survey,"



  Advances in Consumer Research, in press.



 






 



Mullin, P. (1990), "Proposal for Laboratory Research on the



  Feasibility of an Extended Interview Period for the CPS,"



  unpublished memorandum to A. Tupek, in preparation.



 



Norwood, J. and Dippo, C. (in press), "Government Applications," in



  Questions about Questions: Memory, Meaning and Social Interaction



  in Surveys, New York: Russell Sage.



 



  Phipps, P. (1990), "Applying Cognitive Techniques to an



  Establishment Mail Survey," paper to be presented at the annual



  meeting of the American Statistical Association, Anaheim,



  California, August.



 



  Royce, J.R. (1973), "The Present Situation in Theoretical



  Psychology," in B.B Wolman (ed.), Handbook of General Psychology,



  Englewood Cliffs, NJ: Prentice Hall.



 



  Shoemaker, H., Bushery, J., and Cahoon, L. (1989, in press),



  "Evaluation of the Use of CATI in the Current Population Survey,"



  in Proceedings of the Section on Survey Research Methods, American



  Statistical Association.



 



  Squire, L. (1987), Memory and Brain, New York: Oxford University



  Press.



 



  Suchman, L. and Jordan, B. (1990), "Interactional Troubles in Face-



  to-Face Survey Interviews," Journal of the American Statistical



  Association, 85, 232-240.



 



  Tanur, J. (1990, in press), "Reporting Job Search Among Youths:



  Preliminary Evidence from Reinterviews," in Proceedings of the 1990



  Annual Research Conference, Washington, DC: U.S. Department of



  Commerce, Bureau of the Census.



 



  Tourangeau, R. (1984), "Cognitive Sciences and Survey Methods," in



  Cognitive Aspects of Survey Methodology: Building a Bridge Between



  Disciplines, T. Jabine, M. Straf, J. Tanur, and R. Tourangeau



  (eds.), Washington, DC: National Academy Press.



 



Tucker, C. and Bennett, C. (1988), "Procedural Effects in the



  Collection of Consumer Expenditure Information: The Diary



  Operations Test," in Proceedings of the section on Survey Methods



  Research, American Statistical Association, pp. 256-261.



 



  Tucker, C., Miller, L., Vitrano, F., and Doddy, J. (1989),



  "Cognitive Issues and Research on the Consumer Expenditure Diary



  Survey," paper presented at the annual American Association for



Public Opinion Research Conference.



 



Wolkowitz, O.M. and Weingartner, H. (1988), "Defining Cognitive



  Changes in Depression and Anxiety: A Psychobiological Analysis,"



  Psychiatry Psychobiology, 3, 1-8.



 






 



Wright, P. (1980), "Strategy and Tactics in the Design of Forms,"



Visible Language, XIV 2, pp. 151-193.



 



 



 



 



 



 



 



 






 



               THE ROLE OF A COGNITIVE LABORATORY IN A



                          STATISTICAL AGENCY



 



                           Monroe G. Sirken



               National Center for Health Statistics



 



 



 Introduction



 



    The statistical survey is an invention of the twentieth century.



 It produces a commodity, namely information, which many believe is



 the most important property in the modern world.  Our Federal



 establishment, for example, would be unable to function nearly as



 effectively without the information being produced by surveys that



 are conducted by the Federal agencies represented at this Seminar.



 The Congressional and Executive branches use Federal surveys to



 monitor the nation's well-being, to evaluate the government's



 social, health and economic programs, and to plan legislation



 involving the collection of billions of tax dollars and the



 disbursement of billions of benefit dollars.  Federal surveys could



 not have attained this level of acceptance and importance without



 the technological advances in survey methods that have occurred



 during the past half century.  However, we can hardly afford to be



 complacent.  As data producers, we are even more mindful than data



 consumers of the limitations of current survey technology.  We



 realize that further technological advances are essential to assure



that Federal surveys will meet the growing needs for more and



 better survey data.



 



      There have been two major technological advances in survey



 methodology during the past 50 years and I believe a third may be



 in the offing.  Each advance has introduced innovative technologies



 for improving the precision of the survey measurement process and



 was made possible by technology and theory transfers from the



 applied sciences.  The "sampling" revolution in survey methodology



 that began in earnest during the 1930's came about as a result of



 technology transfers from the statistical sciences, and produced



 substantial advances in survey sampling and estimation methods.



 The "automation" revolution had its onset in the late 1960's. it



 came about as a result of technology transfers from the computer



 sciences, and has produced substantial advances in the methods of



 compiling and processing survey data.  The "cognitive" revolution,



which, as some of us believe, got underway during the 1980's
[Jabine, 1989], was made possible by technology and concept



 transfers from the cognitive sciences.  Whether called a revolution



 or a movement, it has been introducing improved methods of



 designing data collection instruments and conducting questionnaire



 design research.



 



     Federal Statistical agencies were major players in the



 "sampling" and "automation" revolutions in survey technology.: Now



 they are playing a major role in the "cognitive" movement by



 






 



developing and applying cognitive laboratory techniques to find



 better solutions to survey response problems.  It is noteworthy



that the cognitive movement is confined neither to the U.S.
government nor to the United States [Jobe and Mingay, 1991].  This
paper,



 moreover, deals with only one part of the U.S. movement, namely,



 the work of the cognitive laboratory at the National Center for



 Health Statistics.  The paper briefly describes the history and



 programs of the NCHS Laboratory and outlines the Laboratory's



 benefits to survey research, cognitive psychology, and Federal



 statistics.



 



 History of the NCHS Laboratory



 



     Until 1984, the role of cognition in the survey measurement



 process was largely ignored in the survey research programs of the



 National Center for Health Statistics.  None of the earlier NCHS



 projects had been conducted in a cognitive laboratory, though one



study [Laurent, Cannell and Marquis, 1972] used psychological



 theories to guide the development of interviewer and questionnaire



 techniques.  Prior to 1984, survey response had been modeled as a



two-stage stimulus/response process with little attention paid to



 the effects that the respondents' mental processes had on the



 accuracy of their responses.  In accordance with this psychological



 paradigm, survey research investigated the error effects of survey



 instruments and procedures almost exclusively in field tests.



 Since these field tests sought to replicate the actual conditions



 of the survey, they provided little opportunity to investigate



 cognitive issues, such as the following:



 



 -     What kinds of cognitive processing modes and strategies



       do respondents use in answering survey questions?



 



 -     How do the cognitive processing modes and strategies of



       survey respondents affect the accuracy of their responses



       to survey questions?



 



      In 1984, with the support of an NSF grant, the NCHS embarked



 on a demonstration project that was motivated largely by the work



 of the Advanced Research Seminar on the Cognitive Aspects of Survey



 Methodology [Jabine, Straf and Tanur, 1984].  This project sought



 to demonstrate the utility of investigating the cognitive aspects



 of answering survey questions in a laboratory setting as a means of



 improving the design of Federal survey instruments [Sirken and



 Fuchsberg, 1984].  The project compared alternate versions of the



 dental supplement to the questionnaire of the 1986 National Health



Interview Survey.  One supplement was designed by the traditional



 field test method and the other by the proposed cognitive



 laboratory method [Lessler and Sirken, 1985].



 



      The rationale for the demonstration project as expressed in



 the NSF grant proposal [Sirken, 1984] was:



 






 



     "... because (1) questionnaire design is one of the



      weakest links in the survey measurement process, (2) past



      efforts to improve the quality of questionnaires have



      posed serious and difficult methodological problems, (3)



      the traditional field methods currently being used to



      improve questionnaire design are inadequate by themselves



      to handle many of these problems, and (4) complementary



      methodologies that are not subject to the weaknesses of



      traditional field methods need to be developed, it is



      [therefore] essential to investigate the potential of



      using the [combined] techniques of the statistical and



      cognitive sciences in a laboratory setting as a



      complementary methodology for improving questionnaire



      design..."



 



      The demonstration project was conducted in an



interdisciplinary mode and in close collaboration with university



scientists so that, as the NSF grant proposal noted, another



potential benefit was:



 



      "... it could go a long way in bridging the gap that



      exists between cognitive scientists [in] academia and survey



      statisticians in Federal Statistical Agencies..."



 



      This was critical to the ultimate success of the project



because it was felt that the gap between the disciplines had been



largely responsible for the delay in applying cognitive methods in



survey research.



 



      At the successful conclusion of the demonstration project in



1986, NCHS established, with the support of a second NSF grant, the



National Laboratory for Collaborative Research in Cognition and



Survey Measurement.  The National Laboratory's broad mission is to



promote and advance interdisciplinary research on the cognitive



aspects of survey methodology among Federal Statistical Agencies



and the nation's universities and research centers.



Interdisciplinary research with university scientists is promoted



by a Collaborative Research Program which awards competitive



research contracts and appoints visiting scientists.  Collaborative



research with other Federal Agencies is promoted by the



Questionnaire Design Research Laboratory which serves as the



workplace for NCHS and other Federal Agencies to conduct intramural



research [Royston, et al 1986].  The Collaborative Research Program



has been largely funded by NSF grants and the Questionnaire Design



Research Laboratory has been partially funded by reimbursable work



agreements with other PHS Agencies [Sirken, et al 1990].



 



 



Activities of the NCHS Laboratory



 



      Much of the work of the National Laboratory is based on a



cognitive theory of survey response errors that can be stated as



 






 



follows: "survey respondents carry-out a series of mental tasks in



 the interval between being asked a survey question and providing a



 response.  When these mental tasks pose serious mental burdens for



 respondents they are likely to cause response errors." This view of



 the survey response process stimulated the development of cognitive



methods for designing and pretesting questionnaires and for



 conducting questionnaire design research.  Developing and testing



survey instruments has short-term objectives, namely, to detect and



 revise the design flaws before the survey instruments are field



tested.  In contrast, questionnaire design research objectives are



 long term, namely, to improve the designs of the next generation of



 survey instruments.  These differences in objectives led to the



 development of distinctly different cognitive methods for



 developing and testing survey instruments and for conducting



 questionnaire design research.



 



 



 Developing and Pretesting Questionnaires



 



      The cognitive laboratory approach to developing and pretesting



survey questionnaires is based on the premise that difficult,
unreasonable, or impossible mental tasks implicit in some survey



 questions increase the likelihood of response errors.  For example,



 survey questions containing terms respondents do not understand,



 that are vague or ambiguous, that impose unrealistic demands on



 recall, that require complicated mental calculations, that contain



 too many elements for the respondent to think about simultaneously,



 that involve issues the respondent knows or cares little about, or



that ask for embarrassing or threatening information -- all impose



 cognitive burdens that are likely to result in invalid responses.



 



      The realization that questionnaires obtain poor quality data



 when they ask respondents to perform difficult, if not impossible,



 mental tasks led to the development of a battery of laboratory



 techniques for investigating the cognitive burdens posed by survey



questions [Bercini, in press; Royston, 1989], including think-aloud



interviews, in-depth probing, and focus group discussions.



 These techniques are not new to questionnaire designers [DeMaio,



 1983] but never before had they explicitly and systematically



 served as means of observing the manner in which respondents



 mentally process survey questionnaires and procedures.



 



      Intensive interviewing techniques detect questionnaire design



 flaws by observing the cognitive problems that result from these



 flaws.   Poor questionnaire designs may impose difficult mental



 tasks at any cognitive stage of the response process including



comprehending the questions, recalling or estimating the information
needed to answer them, and deciding whether or how to answer the
questions.  Identifying the underlying cognitive



 difficulties experienced by respondents facilitates the process of



 revising the questionnaires appropriately.



 



 






 



      Many questionnaire design problems detected and repaired by



 laboratory techniques are far less likely to be detected by



 traditional field testing methods.  Consider the following question



 which was proposed for the National Health Interview Survey (NHIS),



 "During the past 12 months, have you been bothered by pain in your



 abdomen?" When laboratory respondents were asked this question,



 most answered it readily with a "Yes" or a "No".  It was not until



 the laboratory interviewer probed into how respondents interpreted



 the term "abdomen" that it became apparent that respondents were



 unsure of what section of the body to include.  The interviews also



 determined that respondents had variable interpretations of the



 phrase, "bothered by," which in turn, affected whether they



 answered the question affirmatively or negatively.  Intensive



 interviewing methods not only revealed that the question was apt to



 result in response errors, but also the underlying cause of the



 problem.  When the cause of a question problem is understood, the



 solution is more likely to be found.  In this case, part of the



 solution was a respondent flash card that showed an outline of the



 torso with the abdominal area shaded in.



 



     Intensive interviews are conducted by laboratory-trained



 questionnaire designers with many years of survey research



 experience.  Paid subjects are recruited for the interviews.  The



 topic and target populations of the survey determine the criteria



 for subject recruitment.  Subjects are often selectively recruited



to include those who would be most burdened by the survey
questions or least successful in adopting effective mental



 strategies in answering the questions.  Laboratory testing is



 usually carried out in interviewing waves of 5 to 10 subjects at a



 time; the questionnaire is revised in consultation with the sponsor



 after each wave; and the testing continues until an acceptable



 version is obtained.  Typically, flawed questions undergo 2-4



 revisions before an acceptable version is ready for field testing.



 Field testing is essential in order to determine how the



 questionnaire will work under actual survey conditions.  Additional



 laboratory testing may be needed to evaluate the questionnaire



 revisions that are suggested by the field test.



 



      Depending on the complexity and scope of the questionnaire and



 on the number of conceptual problems associated with it, laboratory



 testing can be completed within several weeks or could span a



 longer period.  For example, projects that involve special subject



 recruitment and testing may require a lead time of about six months



 or even longer.  Also, laboratory projects are conducted



 collaboratively with survey sponsors and therefore involve frequent



 meetings to assure that the designed questionnaires satisfy the



 sponsors' research objectives.



 



 



 



 



 



 






 



Questionnaire Design Research



 



     Cognitive methods of conducting questionnaire design research



investigate why some survey questions and procedures pose cognitive



tasks that are difficult, unreasonable or impossible for



respondents to perform.  In the same way that much has been learned



in medicine by studying the cognitive aspects of amnesia and other



memory disorders, so it is hoped that much can be learned in survey



research by studying the cognitive aspects of questionnaires that



pose severe response burdens.



 



     Questionnaire research seeks to improve the design of the next



generation of survey questionnaires, especially those



questionnaires dealing with topics for which better quality survey



data are needed.  Causal relationships between the mental tasks



performed by respondents and the accuracy of their responses are



investigated in experiments.  These experiments may be conducted in



the cognitive laboratory or embedded in on-going surveys.  The



laboratory approach makes it possible to undertake many types of



complex experiments that would be administratively impossible or



prohibitively expensive to conduct as field experiments.  Embedding



cognitive experiments in on-going surveys makes it feasible to test



laboratory findings under actual survey conditions.



 



     Several features of cognitive laboratory experiments are



noteworthy.  They are interdisciplinary, involving the joint



participation of cognitive psychologists and survey researchers.



They generally involve testing questions that ask for the kinds of



information that typically is poorly reported in surveys.  They



investigate those mental tasks implied by the survey questions that



pose the greatest risks to accurate reporting.  For example, if the



question implied retrospective reporting, the focus would be on the



cognitive aspects of the memory tasks and if the question asked for



sensitive information the focus would be on the cognitive aspects



of risk taking under conditions of uncertainty.



 



     Generally, the subjects of laboratory experiments are



recruited from population frames that contain information needed to



validate the experiment's findings.  For example, the laboratory



subjects for experiments on retrospective reporting of medical



visits were selected from the files of a Health Maintenance



Organization, because the files provided access to the recruitment



of subjects with known health conditions and doctor visit patterns



[Means, et al, 1988].  Finally, the findings of the laboratory



experiments are interpreted in terms of their potential



contributions to cognitive theory as well as their implications for



improving the design of survey instruments.



 



     A recent project on dietary recall in nutrition surveys



illustrates some of the benefits of conducting experiments in a



cognitive laboratory.  This complex multi-experiment project,



involving randomization of subjects, diary keeping, and multiple



 






 



data collection sessions, could probably not have been undertaken



 as a traditional field experiment.  The project investigated the



 cognitive burdens posed by the kinds of questions that are asked in



household nutrition surveys [Smith, in press].  Generally, these



 surveys collect dietary histories, food frequency inventories, and



 data on food portion sizes.  Collecting these kinds of data imposes



 mental tasks involving free recall, frequency estimation, and



 magnitude estimation, respectively.  Separate laboratory



 experiments were designed and conducted to assess the ability of



 respondents to provide accurate information on each of these tasks.



 The laboratory subjects participating in these experiments kept



 food diaries so their subsequent responses to dietary



 questionnaires could be validated.



 



      For example, one of the nutrition survey experiments tested



 the effect of varying the portion size definitions on respondents'



 reports of the amount of food consumed.  For each listed food item,



 respondents indicated whether their typical portion was small,



 medium or large in comparison with a defined medium portion size.



 Surprisingly, the food consumption reports in the experiment were



 invariant to changes in the definition of medium portion size.



 These findings raise serious questions about the design of



 nutrition survey questionnaires and the quality of survey data on



food consumption that are based on portion size reports.



 



      Over the past several years, laboratory experiments have



 investigated the cognitive factors involved in responding to



 difficult-to-answer questions on a variety of health related topics



including utilization of health services, cigarette smoking



 histories, illegal drug use, chronic pain episodes, and chronic



 disease prevalence.



 



     A recent project on recall of doctor visits illustrates the



 benefits of embedding experiments in surveys.  This split-ballot



 experiment was embedded in the pilot study of the National Medical



 Expenditure Survey.  The experiment investigated the relative



 accuracy of retrospectively reporting doctor visits in a forward or



 in a backward temporal order [Jobe, et al, 1990].  It was suggested



by the findings of previous laboratory experiments, which indicated
that



 subjects varied in their preference between forward and backward



 recall order but that backward recall seemed to produce more



 accurate reporting [Loftus, 1985].



 



     The survey experiment assessed the accuracy of forward,



 backward and free recall reporting strategies by comparing the



 medical visits reported by each strategy with the visits listed in



 medical records.  The survey experiment did not confirm the



 findings of the laboratory experiments and showed little difference



 in accuracy between the alternative recall strategies.  It was



 concluded that there was no evidence to suggest that survey



instruments should be designed to favor any of the forward,
backward, or free recall strategies.
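
     As a minimal, purely illustrative sketch of how such a
record-check comparison can be scored: the three strategies are those
named above, but the visit dates, scoring rule, and variable names
below are hypothetical, not taken from the pilot study.

     # Hypothetical record-check scoring: for each recall strategy,
     # the share of record-documented visits the respondent reported.
     reported = {                       # visit dates reported, by strategy
         "forward":  {"01/12", "03/03", "04/20"},
         "backward": {"01/12", "02/15", "04/20"},
         "free":     {"01/12", "04/20"},
     }
     record = {"01/12", "02/15", "03/03", "04/20"}   # visits in records

     for strategy, visits in reported.items():
         accuracy = len(visits & record) / len(record)
         print(f"{strategy:8s} recall accuracy: {accuracy:.2f}")

     # Overreports (reported visits absent from the records) could be
     # scored analogously, e.g., len(visits - record) / len(visits).

Under this kind of scoring, "little difference in accuracy" means the
per-strategy proportions lie close to one another.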



 






 



      Cognitive experiments involving survey material, whether



  conducted in laboratories or embedded in surveys, are valuable for



  several reasons.  First, they provide in-depth knowledge about the



  cognitive processes respondents use in answering hard-to-answer



  survey questions.  In particular, they often identify the kinds of



  question approaches that pose response burdens.  And they suggest



  methods of designing the questionnaires to reduce the response



burdens and response errors.  Second, because validation
information is almost always collected (e.g., diaries, medical
record matches, and biochemical markers), the response error effects



  of different questionnaire designs and cognitive strategies can be



  assessed.  Third, the cognitive bounds on the abilities of



  respondents to perform specified kinds of mental tasks



  (comprehension, recall, etc.) posed by survey questions can be



  assessed.



 



 



  Benefits of the NCHS Laboratory



 



      The activities and programs of the NCHS cognitive laboratory



  during the past five years have benefitted survey research,



cognitive science and Federal statistics in a variety of ways.  Some



  of the benefits are briefly outlined in these summary remarks.



 



      Survey research has benefitted from the development of methods



  for investigating the cognitive aspects of the survey response



  process.  Intensive interviewing methods were perfected for



  designing and pretesting survey instruments in a laboratory



  setting, and experimental methods were perfected for conducting



  laboratory experiments and for embedding experiments in on-going



  surveys.



 



       Cognitive science benefitted from the opportunities afforded



  its scientists by the NCHS laboratory to participate in the



  interdisciplinary research projects in cognition and survey



  measurement.  Cognitive psychologists participating in these



  projects had opportunities to test cognitive theories with real



  world survey phenomena either in laboratory experiments or in



  experiments embedded in on-going surveys.  And it is believed that



  the gains in cognitive psychology will ultimately benefit survey



  research and the quality of Federal surveys.



 



       The activities of the NCHS laboratory fostered an appreciation



  and respect for the importance of conducting cognition and survey



  measurement research within and outside the Federal establishment.



  For example, the NCHS laboratory played a vital role in designing



  and testing NCHS survey instruments during the past several years,



  and it is being viewed increasingly as a PHS laboratory with a



  mission to service the needs of agencies throughout the Public



  Health Service.  As the first cognitive laboratory of its kind



  devoted to survey research, the NCHS laboratory served as a point



  of reference, if not the prototype, for the cognitive laboratories



 






 



that have since been established at other statistical agencies



 including the Bureau of the Census, Bureau of Labor Statistics and



 Statistics Sweden.  Information dissemination has always been a



high-priority activity, and during the past five years the NCHS



 laboratory staff and collaborators published nearly 50 reports, and



 presented more than 100 papers at meetings and conferences.



 



     Whether the existing movement in cognition and survey



 research, of which the NCHS laboratory is a part, will evolve into



 a full-fledged cognitive revolution with an impact equal to the



 sampling and automation revolutions remains to be determined.  We



will know that the cognitive revolution has occurred when it



 becomes apparent that the cognitive sciences are providing



 scientific support to survey response research comparable to the



 support the statistical and computing sciences have been providing



 to research in survey sampling and in the automation of survey



 data.



 



 



 References



 



Bercini, D.H. (in press).  Pretesting Questionnaires in the
Laboratory: An Alternative Approach.  Presented at the EPA/AOWNA
Symposium on Total Exposure Assessment Methodology.  Toxicology and
Industrial Health.



 



 DeMaio, Theresa J. (Ed.) (1983).  Approaches to Developing



 Questionnaires.  Statistical Policy Working Paper 10.  Statistical



 Policy Office, Office of Information and Regulatory Affairs, Office



 of Management and Budget.  Washington, D.C.



 



 Jabine, Thomas B. (1990).  Cognitive Aspects of Questionnaire



 Development.  Presented at the EPA/AOWNA Symposium on Total



 Exposure Assessment Methodology.  In print.  Toxicology and



 Industrial Health.



 



Jabine, T.B., Straf, M.L., Tanur, J.M. and Tourangeau, R. (Eds.)
(1984).  Cognitive Aspects of Survey Methodology: Building



 a Bridge Between Disciplines.  Washington, D.C. National Academy



 Press.



 



Jobe, J.B., White, A.A., Kelley, C.L., Mingay, D.J., Sanchez, M.J.,



 and Loftus, E.F. (1990).  Recall Strategies and Memory for Health



 Care Visits.  Milbank Memorial Fund Quarterly/Health and Society,



 68, 171-199.



 



 Laurent, A.C., Cannell, C. and Marquis, K. (1972).  Reporting



 Health Events in Household Interviews:  Effects of an Extensive



 Questionnaire and Diary Procedure.  Vital and Health Statistics,



 Series 2, No. 49 (DHHS Publication No. PHS 91-1079).  Washington,



 D.C., U.S. Government Printing Office.



 



 






 



Lessler, J.T. and Sirken, M.G. (1985).  Laboratory-Based Research



 on the Cognitive Aspects of Survey Methodology: The Goal of the



 National Center for Health Statistics Study.  Milbank Memorial Fund



 Quarterly/Health and Society, 63, 565-581.



 



 Loftus, E.F. and Fathi, D.C. (1985).  Retrieving Multiple



Autobiographical Memories.  Social Cognition, Vol. 3, pp. 280-295.



 



 Royston, P.N. (1989).  Using Intensive Interviews to Evaluate



Questions.  In F.J. Fowler, Jr. (Ed.), Health Survey Research



 Methods (pp. 3-7) (DHHS Publication No. PHS 89-3447).



Washington, D.C., U.S. Government Printing Office.



 



 Royston,  P.N., Bercini, D.H., Sirken, M.G. and Mingay, D. (1986).



 Questionnaire Design Research Laboratory.  American Statistical



 Association, 1986 Proceedings of the Section on Survey Methods



 Research, pp. 703-707.



 



 Sirken, Monroe G. (1986).  National Laboratory for Collaborative



 Research on Cognition and Survey Measurement.  Grant Proposal to



 the National Science Foundation.  Washington D.C.



 



 Sirken, Monroe G. (1984).  Laboratory Based Research on the



 Cognitive Aspects of Survey Methodology. Grant Proposal to the



 National Science Foundation.  Washington, D.C.



 



 Sirken, M.G. and Fuchsberg R. (1984).  Laboratory Based Research on



 the Cognitive Aspects of Survey Methodology.  In Cognitive Aspects



of Survey Methodology: Building a Bridge Between Disciplines.



 Washington, D.C. National Academy Press.



 



 Smith, A.F. (in press).  Cognitive Processes in Long-term Dietary



 Recall.  Vital and Health Statistics, Series 6, No. 4 (DHHS



Publication No. PHS 91-1079).  Washington, D.C., U.S. Government
Printing Office.



 



 



 



 



 



 



 



 






 



                            DISCUSSION



 



                          Elizabeth Martin



                      U.S. Bureau of the Census



 



       In their two papers, Monroe Sirken of the National Center for



  Health Statistics, and Cathryn Dippo and Douglas Herrmann of the



  Bureau of Labor Statistics, document the activities of the



  cognitive laboratories which were established in 1984 and 1988,



  respectively, at their two agencies.  The cognitive laboratories



represent a commitment to survey data quality which is a credit to



  the two agencies.  And Monroe Sirken and Cathryn Dippo, as two of



  the main instigators and initiators responsible for establishing



  the laboratories, deserve credit and appreciation for their effort



  and accomplishment.  The record of achievement by the two



  laboratories is a good one.  Dippo and Herrmann organize their



  paper around a clear and comprehensive discussion of the sources of



  cognitive problems which can introduce errors in the response



process; it is impressive how many of these problems have already



  been tackled in the BLS Collection Procedures Research Laboratory



  in its short history.  Excellent research on a range of topics is



  also being conducted at the NCHS National Laboratory for



  Collaborative Research in Cognition and Survey Measurement, though



  in his paper Sirken does not actually describe the research.  The



  NCHS lab lives up to the "collaborative" in its name; the number



  and caliber of academic researchers who have been involved in their



  projects are very high.



 



      The growth of laboratory-based research on cognitive aspects



  of survey methodology is described by Dippo and Herrmann as a



  "movement" and by Sirken as a "revolution."  These



  characterizations accurately reflect the enthusiasm and ferment of



  activity and new ideas in this area.  However, "revolution" may not



  be the most useful metaphor to describe how cognitive psychology is



  affecting (or, more importantly, should affect) survey research.



  In fact, the metaphor of "revolution" reflects and reinforces a



  weakness of the work currently going on in the new cognitive



  laboratories.



 



     By emphasizing discontinuity with the past, researchers are



  led to ignore relevant work which preceded many of the methods and



  ideas of the current "movement." Sirken characterizes survey



  research as (until recently) "based almost exclusively on the



  behaviorist paradigm" with "respondent's mental states...



  virtually ignored." This isn't accurate.  Survey researchers, at



  least those practicing in academic or commercial settings, have



  hypothesized about and investigated psychological states



  intervening between survey questions and respondents' answers at



  least since World War II. (Jean Converse's Survey Research in the



  United States:  Roots and Emergence, 1890-1960 provides a



  fascinating and useful history which traces the intellectual



 






 



origins of survey research.)  Much of this work is still very



 relevant, and should be built on rather than ignored.  For example,



 Dippo and Herrmann state that, "except for social desirability, the



 survey field is just beginning to investigate factors that affected



 communication of responses."  They would benefit from reviewing the



 survey literature on the topic of communication, beginning with



 Herbert Hyman et al's comprehensive, Interviewing in Social



 Research, published in 1954.  The methods used in the cognitive



 laboratories also have roots in the past.  For example, Naomi D.



 Rothwell used very similar methods to conduct research on



 questionnaire design at the Census Bureau during the 1960s and



 1970s.  It is a bit of an overstatement for Sirken to claim in his



 paper to have invented the cognitive laboratory, without



 acknowledging similar, earlier activities.



 



       In the field of survey research, there is a tradition of



 applying ideas from psychology to survey measurement issues.  For



the new work in the cognitive laboratories to advance the state of



 the art of survey measurement, it should build on this tradition.



 This would also increase its credibility to many survey



 researchers.



 



     A danger of the "revolution" metaphor is that it suggests a
philosophy of "out with the old, in with the new."  In some cases,



 this leads researchers to forget what they know about good survey



 practice.  Compared to a survey, the cognitive laboratories



 generally rely on more intensive, less structured interviews with



 smaller numbers of respondents.  This approach can be very



 informative about the nature and sources of cognitive errors in



 surveys.  However, the "samples" usually are very small and not



selected according to probability methods.  One must be cautious in



 drawing inferences from the results of most of the cognitive lab



 studies to date.  For instance, I think Dippo and Herrmann are



 overstating the case when they conclude that, "research done at BLS



 shows clearly that proxy recall is different than self recall, both



 in terms of amount and kinds of information recalled." Laboratory



 findings such as this are more usefully thought of as hypotheses



 which should be subjected to more rigorous testing in a sample



 survey, and/or experimentally.
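
     A worked illustration of why such caution is warranted, assuming
(purely for the sake of the arithmetic) simple random sampling and a
laboratory group of n = 10 subjects with an observed proportion
\hat{p} = 0.5:

     SE(\hat{p}) = \sqrt{\hat{p}(1 - \hat{p})/n}
                 = \sqrt{0.5 \times 0.5 / 10} \approx 0.16

so an approximate 95 percent confidence interval is 0.5 +/- 1.96(0.16),
or roughly 0.19 to 0.81.  An interval that wide cannot distinguish most
substantively interesting differences, quite apart from the
nonprobability character of laboratory recruitment.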



 



       It is important to keep in mind that standards of evidence and



 proof still apply to research conducted in the cognitive



 laboratories.  In some writings, the word "cognitive" is repeated



 so often as to suggest that the writer believes the word itself is



 sufficient to establish the merits of the research.  But the



 researcher is still obliged to make his or her case on the



 evidence.  For example, Sirken presents an example of a question on



 marijuana use which he says was improved by cognitive testing.  How



 do we know it is better?  He presents no evidence or logic to



 support his claim.  In the long run, if the cognitive "movement" is



 to be taken seriously, it must demonstrate, not simply assert, the



 



 






 



value of its products, and be wary of the temptation to oversell



itself.



 



     I believe there are two common goals behind the activities in



the cognitive laboratories.  One goal is to improve particular



survey measurements.  The second is to develop a theoretical



foundation (beyond sampling theory) for improved survey design.



The latter, broader aim requires that we develop better measures of



nonsampling errors, and a better understanding of the effect of



alternative survey designs on nonsampling errors.  Methods and



ideas from cognitive psychology are tools for achieving both



specific and general improvements, but are not an end in



themselves. Other social sciences (for example, social psychology)



also have relevant knowledge to contribute.



 



     With these goals (and the previously-stated cautions) in mind,



what then is new and revolutionary about the work being done in the



cognitive laboratories?  First, this research has yielded new



appreciation of the vulnerability of factual survey questions to



biases and errors.  I think it is fair to say that most government



statisticians and academic survey methodologists probably have



taken for granted the validity of simple factual questions.  The



research on problems of comprehension, recall and other cognitive



difficulties is contributing to a more sophisticated understanding



of how much we have yet to learn about the error properties of



survey measurements.  Second, and more important, the research in



the cognitive labs represents a new and more extensive set of



methods for pretesting survey questionnaires and procedures.  This



in itself is a great leap forward.  Traditionally, pretests of



survey questionnaires have been ad hoc and informal, based on



interviews with a few respondents and with no real guidelines



beyond common sense to decide when one has succeeded or failed.



The cognitive laboratories are changing that.  Close and in-depth



 examination of problems of respondent comprehension, recall, and



 judgment is shedding new light on the causes of these problems and



(better yet) new ideas about how to correct them.  The new methods



which are being used and developed in the cognitive laboratories



form a logical series of pretests prior to fielding a survey,



proceeding from intensive, informal interviews, to small-scale



experiments testing alternative questions or designs, to large-



scale field experiments.  In addition, as Cathryn Dippo points out



in her remarks, testing can be integrated into the main survey



itself, to provide ongoing information about nonsampling errors.



 The new methods thus make possible a more scientific and systematic



approach to pretesting, and they promise to yield improvements in



the quality of data collected by the federal government.
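
      As an illustration of the "small-scale experiments testing
 alternative questions" stage, the following sketch (in Python; the
 response counts are invented for the example) applies a Pearson
 chi-square test to a split-ballot pretest in which each half-sample
 receives a different wording of the same question:

     # Hedged sketch with invented counts; not from any actual pretest.
     def chi_square_2x2(a, b, c, d):
         """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
         n = a + b + c + d
         return n * (a * d - b * c) ** 2 / (
             (a + b) * (c + d) * (a + c) * (b + d))

     #                      "yes" "no"
     chi2 = chi_square_2x2(161, 89,     # wording A (n = 250)
                           128, 122)    # wording B (n = 250)
     print("chi-square = %.2f" % chi2)  # values above 3.84 (1 df, 5% level)
                                        # suggest the wordings elicit
                                        # different response distributions

 A result like this would flag the wording difference for testing at
 the next, larger stage of the pretest series.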



 



 



 



 



 



 



 






 



                            DISCUSSION



 



                           Murray Aborn



              National Science Foundation (retired)



 



     I am grateful to my co-discussant, Elizabeth Martin of the



 Census Bureau, for providing the perfect lead-in to my own



 commentary on the papers presented at this session.  Dr. Martin



 reminded us of the importance of viewing any disciplinary



 development from the perspective of its historical predecessors,



 and in this connection she succeeded in moving the advent of CASM 



 (Cognitive Aspects of Survey Methodology) -- writ large -- back 



 several decades from the year most commonly cited as the date of



 its birth -- namely, 1980.



 



     More consequential than revising our perception of the



 chronology of CASM (again writ large) is the difference Dr.



 Martin's remarks point up between the characterization of CASM in



 the paper presented by Cathryn Dippo and Douglass Herrmann of the



 Bureau of Labor Statistics, and the one presented by Monroe Sirken



 of the National Center for Health Statistics.  Dr. Martin's remarks



 implicitly characterize CASM as a reawakening of old concerns, and



  thus place her in strong agreement with Dippo and Herrmann's



 labeling of CASM as a "movement," in contrast with Sirken's



 labeling of CASM as a methodological "revolution." Indeed, there



 is much to support the view of CASM as a movement; for instance,



 the enthusiasm of its adherents and the growing frequency with



 which its ideology is being endorsed by sectors of the statistical



 community and users of statistical data generally who have



 heretofore tended to ignore the psychosocial underpinnings of



 survey-taking (see, for example, Suchman and Jordan, 1990).



 



     However, this does not mean that Sirken's description of CASM



 as representing a revolutionary development is totally incorrect.



 It may merely be premature, for the potential of CASM as a true



 breakthrough -- as a true revolution in survey research -- is



 clearly present in the programmatic and research agenda laid out



 for it in the seminal CASM document prepared by the National



  Academy of Sciences (see Jabine et al., 1984).  At the present



 time, only half the CASM prospectus is being actively pursued;



 namely, those objectives having to do with the adoption of certain



 recent advances in cognitive science into the survey design and



 instrumentation process.  What we have seen little of to date is



 action on those objectives having to do with the use of surveys as



 naturalistic test beds for laboratory-based theories of the



 functioning of the neuronal mind and, ultimately, the emergence of



 a new paradigm for social/behavioral research in which survey-



 taking plays an important role in understanding such basic



 cognitive phenomena as how the brain stores memories and how mental



 imagery influences perception and recall, and in which developments



 in cognitive science relating to such branches of the field as



 






 



natural language semantics are used to produce greatly improved



methods for achieving high-quality survey measurement.  In other



words, fulfillment of the "cognitive revolution" alluded to in



Monroe Sirken's paper is clearly in prospect, but is yet to



materialize.



 



      I shall have a bit more to say on this subject at the close of



my commentary; meanwhile, however, it is my opinion that much of



the force behind Dr. Martin's view of CASM as a reawakening of old



survey concerns -- as a "movement" more so than a "revolution" --



stems from the present truncated status of the programmatic agenda



initially prescribed for the field.  This gives CASM the appearance



of a one-sided effort to adopt, in fairly superficial terms, some



of the investigative techniques employed in recent laboratory-based



cognitive psychology, and incorporate them in the conventional



procedures for constructing and pretesting survey questionnaires.



Under such a perspective, not much may appear to have been added to



what has long been known to be of influence in survey responding,



and audiences such as the one attending the present session may



rightfully feel that CASM amounts to little more than another real-



life example of the familiar tale of "The Emperor's New Clothes"



 which, albeit a story from the literature of childhood, embodies a



profound adult theme concerning human gullibility and our tendency



 to accept uncritically what experts -- genuine and otherwise --



tell us is true, novel, or significant.



 



      Now, let me examine the Emperor's New Clothes proposition



against the CASM-engendered activities at the BLS and NCHS



laboratories reported in the papers by Dippo and Herrmann and by



Sirken.  Reducing a sample of these activities to their most



 generic properties (in the sense of survey factors which induce



 



 



 



 



 



 



 



 






 



 response error), I would break them down into the following



  classification:



 



 



      COLLECTION PROCEDURES                   QUESTIONNAIRE DESIGN
       RESEARCH LABORATORY                         LABORATORY
              (BLS)                                  (NCHS)

 - Question Ambiguity                   - Question Wording and Order
   (The extent to which a question        (The differential results induced
   may be interpreted in more than        by synonymous variation and
   one way.)                              rearrangement of sequence.)

 - Long-term Recall                     - Memorial Decay
   (The length of time over which         (The validity -- or veridicality
   the respondent is required to          -- of information supplied from
   retrieve from memory.)                 short- and long-term memory.)

 - Emotional Loading                    - Affective Sensitivity
   (The degree of psychological           (The likelihood that a question
   stress which a question may place      may be embarrassing or impinge
   upon the respondent.)                  upon the respondent's privacy.)

 - Subcultural Norms                    - Linguistic Complexity
   (Question comprehensibility            (The effect of grammatical
   across ethnic subgroups.)              construction on the respondent's
                                          ability to comprehend.)

 - Social Desirability                  - Lexical Level
   (The extent to which a question        (The extent to which a question
   is likely to elicit a normative        requires the respondent to have
   rather than an idiopathic              specialized -- in this case
   response.)                             medical -- knowledge.)



 



 



       Now, it is hard to believe that the many survey researchers



  trained in social psychology and cognate fields of social science



  are oblivious to influences -- such as those charted above --



  regardless of whether intellectual, technical, and/or cost factors



  make it impractical to subject such nonsampling sources of error to



  adequate control, or to estimate the proportion of total survey



  error due to their ubiquitous presence.



 



       To take the phenomenon of Social Desirability, for example, it



  does not require a social scientist to comprehend the universal



  tendency of people to present a societally acceptable facade when



  questioned about attitudes and behavior.  The popular press and



  many humorous books have for decades poked fun at surveys by



  ridiculing the informational value of asking such survey items as,



  "Do you bathe at least once a week?" or "Do you brush your teeth



  every day?"



 



 






 



      To take some other examples, did it require CASM to alert



 survey researchers to the difference in results when a question is



 phrased one way as opposed to another?  Or to the difficulty of



 most respondents to deal with questions presented in grammatically



 complex form?  Or to the impingement of certain areas of



 questioning on the sensitivity of respondents?  Or to memorial



 decay over time?  Or to a respondent's understanding of questions



 embodying medical terminology?



 



      I can't resist regaling the audience with a personal anecdote



 illustrating how ordinary, and even old-fashioned, if you will, is



 appreciation of the fact that few individuals not trained or highly



 educated in medicine can comprehend medical lexicography, and



 that one is apt to get ludicrous results from asking questions



 embodying medical terminology.



 



      More than 25 years ago, when employed at the National



 Institute of General Medical Sciences, I shared an office with a



 public health epidemiologist who had just returned from a tour of



 duty in Puerto Rico.  He told me of an effort to obtain data on the



 extent of interruption to normal life activities due to amoebic



 dysentery, which was then prevalent in most rural areas of Puerto



 Rico.  Having never before conducted a survey, his group of public



 health officials put together a series of questions utilizing such



 terms as diarrhea and defecation to get estimates of frequency.



 When the obtained results showed an average of only one to two



 bowel movements per day, the survey takers knew something was wrong



 and quickly realized that it was likely due to the language



 employed in identifying the disease.



 



      The Public Health people reran a small subsample of



 respondents using the term "bowel movement" in the questionnaire,



 and obtained a slightly higher, but still medically incredible,



 estimate of frequency.  Finally a native informant suggested that



 they phrase all questions pertaining to diarrhea in terms of La



 Mange or "The Curse" as it was known in the rural areas of the



 island, and when they did this, the average reported frequency



 shot up to a more medically believable 11 or 12 occurrences per



 day.



 



      If it is common knowledge that such variables as level of



 lexical comprehension, differences in subcultural norms, and the



 tendency to respond in socially desirable ways are sources of error



 in survey research, what, then, is truly new about the CASM



 movement?



 



      There are, to my mind, three major issues that have been



 brought to the fore by the CASM movement, coupled with the addition



 of new technical procedures which have proved powerful in cognitive



 research in psychology and artificial intelligence.  And, as I have



 mentioned before and will emphasize at the close of my remarks,



 there is the potential for bringing about a truly interdisciplinary



 






 



effort to understand just what goes on in the interactional



 dynamics between survey and respondent.



 



       The three major issues which have surfaced as a result of CASM



 are:



 



    1. A reawakening of the essential conflict between survey



       questionnairing and ordinary conversation owing to the



       need for artificially imposed standardized conditions of



       administration from the standpoint of survey statistics



       on the one hand, and the natural world existence of



       individual differences in mentality on the other.



 



    2. The extent to which laboratory-based treatments and



       results can be transferred to the field in the case of



       survey-taking.  This issue is of general importance to



       social science, as well as being particularly relevant to



       survey research insofar as the laboratory setting, which



       provides greater conditions of control and flexibility,



       creates possibilities for a more systematic approach to



       instrumentation, and hence to survey measurement.



 



    3. The degree to which the contemporary shift in the



       underlying paradigm of survey research's cognate



       substantive discipline -- i.e., psychology -- requires a



       realignment away from behaviorism and toward cognition.



       CASM represents a bold attempt to test this issue and



       assay its yield, but there has thus far been far too



       little involvement of cognitive psychology per se apart



       from the importation of certain investigative techniques.



 



       I by no means wish to detract from the accomplishments



 reported in the papers by Dippo and Herrmann and by Sirken based



 upon the importation of the techniques employed in contemporary



 cognitive psychology, into the innovative laboratory facilities now



  ensconced in two such prestigious governmental agencies as BLS and



 NCHS.  Much thought and expertise have been applied to the transfer



  of technology represented by the successful adoption of such



 cognitive probes and methods as: (1) Focus Groups; (2) Part-set



 Cueing; (3) Protocol Analysis; and (4) Think-aloud procedure.



 



       But in my opinion, this could be just the beginning of a truly



 revolutionary development in survey research and, through its



 influence, on social science more broadly.  The laboratory-based



 techniques and procedures you have heard presented at this session



 are derived from research begun in the early 1960's by Nobel



  Laureate Herbert Simon and Allen Newell that resulted in the General



  Problem Solver and led to the foundations of the field of



 artificial intelligence (Barr and Feigenbaum, 1982).   The more



  recent work of Simon (1987) shows the even greater



 potential of cognitive technology to uncover human information



 processing systems.



 






 



     However, there is reason to be both pessimistic and optimistic



about the future of CASM.  On the one hand, the statistical



framework of survey research -- the dominant framework for the



field -- is concerned with drawing inferences about populations --



about whether the sample of a population is large and



representative enough to permit accurate and valid conclusions to



be reached about the distribution of characteristics in the



population from which the survey sample was drawn.  On the other



hand, the cognitive framework is concerned with drawing accurate



 and valid inferences about individuals -- about respondent



"truthfulness," if you will.



 



     Therefore, one framework calls for instrumentation designed to



enhance person-to-person comparability, while the other calls for



instrumentation designed to enhance the assessment of person-to-



person variations on each survey variable.



 



     It is the work of the two survey/cognitive research



laboratories reporting here today that represents one of the two



reasons I find to be optimistic about the future of CASM.  Such



facilities offer the best opportunities for reconciling the



conflicting survey conceptual frameworks described above.



 



     The other reason I find to be optimistic lies in the



pronouncement appearing in a neuropsychological book which has



become a national bestseller in addition to its importance to the



scientific literature on brain-behavior relationships.  I refer to



-- and endorse to you as top-quality literature as well as a work



of cognitive science importance -- Oliver Sacks' The Man Who



Mistook His Wife for a Hat.  I close my remarks by quoting from a



passage in this work that, I believe, should stimulate cognitive



scientists to become fuller participants in CASM, recognizing that



survey centers and facilities are ideally suited to cognitive



 explorations and offer the prospect of a vital new interdiscipline.



 



     After presenting and analyzing the case of The Man Who Mistook



His Wife for a Hat, Sacks concludes, as I do here, that:



 



          cognitive sciences are themselves suffering from an



      agnosia similar to the one afflicting the man who mistook



     his wife for a hat.  That man may thus serve as a warning



     and parable of what happens to a science which eschews



     the judgmental, the particular, the personal, and becomes



     entirely abstract and computational (Sacks, 1987, p. 20).



 



     I hope that cognitive psychologists will take heed of Dr.



Sacks' warning and see the opportunity that survey research offers



to offset the present trend toward abstract computationalism.



 



 



 



 



 






 



References



 



1. Suchman, L. and Jordan, B., "Interactional Troubles in Face-to-



Face Survey Interviews," JASA, Vol. 85, No. 409, pp. 232-253, 1990.



 



2. Jabine, T., Straf, M., Tanur, J., and Tourangeau, R. (eds).



Cognitive Aspects of Survey Methodology: Building a Bridge Between



Disciplines, Washington, D.C.: National Academy Press, 1984.



 



3. Barr, A. and Feigenbaum, E.A., (eds.) The Handbook of Artificial



Intelligence.  Stanford, CA: HeurisTech Press, 2:184-192, 1982.



 



4. Simon, H., "The Steam Engine and the Computer: What Makes



Technology Revolutionary," EDUCOM Bulletin, 22(1):2-5, 1987.



 



5. Sacks, O., The Man Who Mistook His Wife For A Hat, New York:



Harper and Row, p. 20, 1987.



 



 



 



 



 



 



 



 






 






 



                      Reports Available in the



                          Statistical Policy



                         Working Paper Series



 



 



1.    Report on Statistics for Allocation of Funds (Available



      through NTIS Document Sales, PB86-211521/AS)



2.    Report on Statistical Disclosure and Disclosure-Avoidance



      Techniques (NTIS Document Sales, PB86-211539/AS)



3.    An Error Profile:  Employment as Measured by the Current



      Population Survey (NTIS Document Sales PB86-214269/AS)



4.    Glossary of Nonsampling Error Terms: An Illustration of a



      Semantic Problem in Statistics (NTIS Document Sales, PB86-



      211547/AS)



5.    Report on Exact and Statistical Matching Techniques (NTIS



      Document Sales, PB86-215829/AS)



6.    Report on Statistical Uses of Administrative Records (NTIS



      Document Sales, PB86-214285/AS)



7.    An Interagency Review of Time-Series Revision Policies (NTIS



      Document Sales, PB86-232451/AS)



8.    Statistical Interagency Agreements (NTIS Documents Sales,



      PB86-230570/AS)



9.    Contracting for Surveys (NTIS Documents Sales, PB83-233148)



10.   Approaches to Developing Questionnaires (NTIS Document



      Sales, PB84-105055/AS)



11.   A Review of Industry Coding Systems (NTIS Document Sales,



      PB84-135276)



12.   The Role of Telephone Data Collection in Federal Statistics



      (NTIS Document Sales, PB85-105971)



13.   Federal Longitudinal Surveys (NTIS Documents Sales, PB86-



      139730)



14.   Workshop on Statistical Uses of Microcomputers in Federal



      Agencies (NTIS Document Sales, PB87-166393)



15.   Quality in Establishment Surveys (NTIS Document Sales, PB88-



      232921)



16.   A Comparative Study of Reporting Units in Selected Employer



      Data Systems (NTIS Document Sales, PB90-205238)



17.   Survey Coverage (NTIS Document Sales, PB90-205246)



18.   Data Editing in Federal Statistical Agencies (NTIS Document



      Sales, PB90-205253)



19.   Computer Assisted Survey Information Collection (NTIS



      Document Sales, PB90-205261)



20.   Seminar on the Quality of Federal Data (NTIS Document Sales,



      PB91-142414)



 



Copies of these working papers may be ordered from NTIS Document



Sales, 5285 Port Royal Road, Springfield, VA 22161, (703) 487-4650.



 



 



"1"David A.Pierce is Senior Statistician, Micro Statistics



Section, Division of Research and Statistics, Federal Reserve



Board, Washington, DC 20551, and a member of the Federal Committee



on Statistical Methodology and its Subcommittee on Data Editing in



Federal Statistical Agencies. Any views expressed do not



necessarily reflect those of the Federal Reserve System.



 



2. The sampling design in the original CATI sample was stratified



simple random sampling.  The reinterview sample was a random sample



of CATI respondents within strata.  The bias was approximated by



expanding the difference in reconciled and CATI response at the



sample unit level.   
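
A minimal sketch of that approximation (in Python; the strata,
population counts, and response pairs are invented placeholders, not
the actual survey data) expands the within-stratum mean difference
between reconciled and CATI responses to stratum totals:

     # Hedged sketch with invented placeholder data, following the
     # description above: expand the reconciled-minus-CATI difference
     # within each stratum to approximate the total response bias.

     # stratum -> (population size N_h, reinterview pairs (reconciled, CATI))
     strata = {
         "stratum 1": (1200, [(3, 2), (5, 5), (4, 3)]),
         "stratum 2": (300, [(9, 9), (7, 8)]),
     }

     total_bias = 0.0
     for N_h, pairs in strata.values():
         mean_diff = sum(r - c for r, c in pairs) / len(pairs)
         total_bias += N_h * mean_diff   # expansion to the stratum total
     print("approximate total response bias: %.1f" % total_bias)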



 



 
