DATA PREPARATION AND PROCESSING PROCEDURES

Automated data collection procedures for the survey were introduced in NHANES III. In the mobile examination centers, data for the interview and examination components were recorded directly onto a computerized data collection form. With the exception of a few independently automated systems, the system was centrally integrated. This operation allowed for ongoing monitoring of much of the data.

Before the introduction of the computer-assisted personal interview (CAPI), the household questionnaire data were reviewed manually by field editors and interviewers. CAPI (1992-1994 only) questionnaires featured built-in edits to prevent entering inconsistencies and out-of-range responses. The multi-level data collection and quality control systems are discussed in detail in the Plan and Operation of the Third National Health and Nutrition Examination Survey, 1988-1994 (NCHS, 1994; U.S. DHHS, 1996).

All interview, laboratory, and examination data were sent to NCHS for final processing. Guidelines were developed that provided standards for naming variables, filling missing values and coding conventional responses, handling missing records, and standardizing two-part quantity/unit questionnaire variables. NCHS staff, assisted by contract staff, developed data editing specifications that checked data sets for valid codes, ranges, and skip pattern consistencies and examined the consistency of values between interrelated variables. Comments, collected in both interviews and examination components, were reviewed and recoded when possible. Responses to "Other" and "Specify" were recoded either to existing code categories or to new categories. The documentation for each data set includes notes for those variables that have been recoded and standardized and for those variables that differ significantly from what appears in the original data collection instrument.

While the data have undergone many quality control and editing procedures, there still may be values that appear extreme or illogical. Values that varied considerably from what was expected were examined by analysts who checked for comments or other responses that might help to clarify unusual values. Generally, values were retained unless they could not possibly be true, in which case they were changed to "Blank but applicable." Therefore, the user must review each data set for extreme or inconsistent values and determine the status of each value for analysis.

Several editing conventions were used in the creation of final analytic data sets:

1. Standardized variables were created to replace all two-part quantity/unit questions using standard conversion factors. Standardized variables have the same name as the variable of the two-part question with an "S" suffix. For instance, MAPF18S (Months received WIC benefits) in the MEC Adult Questionnaire was created from the two-part response option to question F18, "How long did you receive benefits from the WIC program?" using the conversion factor 12 months per year.

2. Recoded variables were created by combining responses from two or more like variables, or by collapsing responses to create a summary variable for the purpose of confidentiality. Recoded variables have the original variable name with an "R" suffix. For example, the place of birth variable (HFA6X) in the Family Questionnaire was collapsed to a three-level response category (U.S., Mexico, Other) and renamed HFA6XR. Generally, only the recoded variable has been included in the data file.

3. Fill values, a series of one or more digits, were used to represent certain specific conditions or responses. Below is a list of the fill values that were employed. Some of the fill values pertain only to questionnaire data, although 8-fill and blank-fill values are found in all data sets.
Other fill values, not included in this list, are used to represent component-specific conditions.

   6-fills = Varies/varied. (Questionnaires only)

   7-fills = Fewer than the smallest number that could be reported within the question structure (e.g., fewer than one cigarette per day). (Questionnaires only)

   8-fills = Blank but applicable/cannot be determined. This means that a respondent was eligible to receive the question, test, or component but did not because of refusal, lack of time, lack of staff, loss of data, broken vial, language barrier, unreliability, or other similar reasons.

   9-fills = Don't know. This fill was used only when a respondent did not know the response to a question and said, "I don't know." (Questionnaires only)

   Blank fills = Inapplicable. If a respondent was not eligible for a questionnaire, test, or component because of age, gender, or another specific reason, the variable was blank-filled. In the questionnaire, if a respondent was not asked a question because of a skip pattern, variables corresponding to the question were blank-filled. For examination or laboratory components, if a person was excluded by a defined protocol (e.g., screening exclusion questions) and these criteria are included in the data set, then the corresponding variables were blank-filled for that person. For home examinees, variables for examination components and blood tests not performed as part of the home examination protocol were blank-filled.

4. For variables describing discrete data, codes of zero (0) were used to mean "none," "never," or the equivalent. Value labels for which "0" is used include: "has not had," "never regularly," "still taking," or "never stopped using." Unless otherwise labeled, for variables containing continuous data, "zero" means "zero."

5. Where there are logical skip patterns in the flow of the questionnaire or examination component, the skip was indicated by placing the variable label of the skip destination in parentheses as part of the value label of the response generating the skip. For example, in the Physical Function Evaluation, the variable PFPWC (in wheelchair) has a value label, "2 No (PFPSCOOT)," which means that the next item for persons not in a wheelchair would be represented by the variable PFPSCOOT.

Variable Nomenclature

A unique name was assigned to every NHANES III variable using a standard convention. By following this naming convention, the origin of each variable is clear, and there is no chance of overlaying similar variables across multiple components. Variables range in length from three to eight characters. The first two variable characters represent the topic (e.g., analyte, questionnaire instrument, examination component) and are listed below alphabetically by topic.

For questionnaires administered in the household, the remainder of the variable name following the first two characters indicates the question section and number. For example, data for the response to the Household Adult Questionnaire question B1 are contained in the variable HAB1.

For most laboratory and examination variables, as well as some other variables, a "P" in the third position refers to "primary" and the remainder of the variable name is a brief description of the item. For instance, in the Laboratory Data File, information on the length of time the person fasted before the first blood draw is contained in the variable PHPFAST. The variable PHPFAST was derived as follows: characters 1-2 (PH) refer to "phlebotomy," character 3 (P) refers to "primary," and characters 4-7 (FAST) are an abbreviation for "fasting."

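The naming convention described above can be illustrated with a short Python sketch. It is not an NCHS tool: the names parse_variable_name, ParsedName, TOPIC_CODES, and HOUSEHOLD_CODES are hypothetical, only a handful of topic codes are copied in, and the choice of which codes count as household questionnaires is an assumption. Because the text says that "P" marks "primary" only for most laboratory and examination variables, the primary flag should be read as a hint rather than a guarantee.

```python
from typing import NamedTuple

# A few two-character topic codes copied from the CODE/TOPIC list;
# a real lookup would carry the full list.
TOPIC_CODES = {
    "HA": "Household adult questionnaire",
    "HF": "Household family questionnaire",
    "MA": "MEC adult questionnaire",
    "PF": "Physical function evaluation",
    "PH": "Phlebotomy data collected in MEC (e.g., questions)",
}

# Assumption: these codes are the household-administered questionnaires,
# whose names continue with the question section and number.
HOUSEHOLD_CODES = {"HA", "HF", "HS", "HY"}


class ParsedName(NamedTuple):
    topic_code: str   # characters 1-2
    topic: str        # description looked up from TOPIC_CODES
    primary: bool     # True when character 3 is read as "primary"
    remainder: str    # section/number or brief item description


def parse_variable_name(name: str) -> ParsedName:
    """Split a variable name according to the convention described above."""
    name = name.strip().upper()
    if not 3 <= len(name) <= 8:
        raise ValueError(f"expected 3 to 8 characters, got {name!r}")
    code, rest = name[:2], name[2:]
    topic = TOPIC_CODES.get(code, "unknown topic")
    if code in HOUSEHOLD_CODES:
        # Household questionnaires: remainder is section and question number.
        return ParsedName(code, topic, False, rest)
    if rest.startswith("P"):
        # Heuristic: treat a "P" in the third position as "primary".
        return ParsedName(code, topic, True, rest[1:])
    return ParsedName(code, topic, False, rest)
```

Under these assumptions, parse_variable_name("PHPFAST") yields topic code PH, the primary flag set, and the remainder FAST, while parse_variable_name("HAB1") yields topic code HA and the remainder B1 (section B, question 1).
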
CODE  TOPIC
AT    Alanine aminotransferase (from biochemistry profile)
AM    Albumin (from biochemistry profile)
AP    Alkaline phosphatase (from biochemistry profile)
AL    Allergy skin test
AC    Alpha carotene
AN    Anisocytosis
AA    Apolipoprotein (AI)
AB    Apolipoprotein (B)
AS    Aspartate aminotransferase (from biochemistry profile)
LA    Atypical lymphocyte
AU    Audiometry
BA    Band
BO    Basophil
BS    Basophilic stippling
BC    Beta carotene
BX    Beta cryptoxanthin
BL    Blast
BU    Blood urea nitrogen (BUN) (from biochemistry profile)
BM    Body measurements
BD    Bone densitometry
C1    C-peptide (first venipuncture)
C2    C-peptide (second venipuncture)
CR    C-reactive protein
UD    Cadmium
CN    Central nervous system function evaluation
CL    Chloride (from biochemistry profile)
CO    Cotinine
CE    Creatinine (serum) (from biochemistry profile)
UR    Creatinine (urine)
DM    Demographic
DE    Dental examination
MQ    Diagnostic interview schedule
DR    Dietary recall (total nutrient intakes)
EO    Eosinophil
EP    Erythrocyte protoporphyrin
FR    Ferritin
FB    Fibrinogen
RB    Folate (RBC)
FO    Folate (serum)
FH    Follicle stimulating hormone (FSH)
FP    Fundus photography
GG    Gamma glutamyl transferase (GGT) (from biochemistry profile)
GU    Gallbladder ultrasonography
GB    Globulin (from biochemistry profile)
G1    Glucose (first venipuncture)
G2    Glucose (second venipuncture)
SG    Glucose (from biochemistry profile)
GH    Glycated hemoglobin
GR    Granulocyte
C3    HCO3 (Bicarbonate) (from biochemistry profile)
HD    HDL cholesterol
HP    Helicobacter pylori antibody
HT    Hematocrit
HG    Hemoglobin
AH    Hepatitis A antibody (HAV)
HB    Hepatitis B core antibody (anti-HBc)
SS    Hepatitis B surface antibody (anti-HBs)
SA    Hepatitis B surface antigen (HBsAg)
HC    Hepatitis C antibody (HCV)
DH    Hepatitis D antibody (HDV)
H1    Herpes 1 antibody
H2    Herpes 2 antibody
HX    Home examination (general)
HF    Household family questionnaire
HA    Household adult questionnaire
HQ    Household questionnaire variables (composite)
HS    Household screener questionnaire
HY    Household youth questionnaire
HZ    Hypochromia
I1    Insulin (first venipuncture)
I2    Insulin (second venipuncture)
UI    Iodine (urine)
FE    Iron
SF    Iron (from biochemistry profile)
LD    Lactate dehydrogenase (from biochemistry profile)
L1    Latex antibody
LC    LDL cholesterol (calculated)
PB    Lead
LP    Lipoprotein (a)
LH    Luteinizing hormone
LU    Lutein/zeaxanthin
LY    Lycopene
LM    Lymphocyte
MR    Macrocyte
MC    Mean cell hemoglobin (MCH)
MH    Mean cell hemoglobin concentration (MCHC)
MV    Mean cell volume (MCV)
PV    Mean platelet volume
MA    MEC adult questionnaire
MX    MEC examination (general)
FF    Dietary food frequency (ages 12-16 years)
MP    MEC proxy questionnaire
MY    MEC youth questionnaire
ME    Metamyelocyte
MI    Microcyte
MO    Monocyte
MN    Mononuclear cell
ML    Myelocyte
IC    Normalized calcium (derived from ionized calcium)
OS    Osmolality (from biochemistry profile)
PH    Phlebotomy data collected in MEC (e.g., questions)
PS    Phosphorus (from biochemistry profile)
PF    Physical function evaluation
PE    Physician's examination
PL    Platelet
DW    Platelet distribution width
PK    Poikilocytosis
PO    Polychromatophilia
SK    Potassium (from biochemistry profile)
PR    Promyelocyte
RC    Red blood cell count (RBC)
RW    Red cell distribution width (RDW)
RE    Retinyl esters
RF    Rheumatoid factor antibody
RU    Rubella antibody
WT    Sample weights
SE    Selenium
SI    Sickle cell
NA    Sodium (from biochemistry profile)
SH    Spherocyte
SP    Spirometry
SD    Survey design
TT    Target cell
TE    Tetanus
TB    Total bilirubin (from biochemistry profile)
CA    Total calcium
SC    Total calcium (from biochemistry profile)
TC    Total cholesterol
CH    Total cholesterol (from biochemistry profile)
TI    Total iron binding capacity (TIBC)
TP    Total protein (from biochemistry profile)
TX    Toxic granulation
TO    Toxoplasmosis antibody
PX    Transferrin saturation
TG    Triglycerides
TR    Triglycerides (from biochemistry profile)
TY    Tympanometry
UA    Uric acid (from biochemistry profile)
UB    Urinary albumin
VU    Vacuolated cells
VR    Varicella antibody
VA    Vitamin A
VB    Vitamin B12
VC    Vitamin C
VE    Vitamin E
WC    White blood cell count (WBC)
WW    WISC/WRAT cognitive test
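
Returning to the fill-value convention listed under the editing conventions (item 3), the following Python sketch shows one way a user might separate fills from reported values when reading a fixed-width field. The names decode_field and FILL_MEANINGS are hypothetical. The sketch assumes that a fill occupies the full width of the field with a single repeated digit, which the documentation does not state explicitly, and it ignores the component-specific fill values mentioned there. A legitimate value can look like a fill (for example, 99 in a two-character field), so the documentation for each variable should be checked before applying a rule like this.

```python
from typing import Optional, Tuple

# General fill values from the list of editing conventions.
# 6-, 7-, and 9-fills apply to questionnaire data only; 8-fills and
# blank fills are found in all data sets.
FILL_MEANINGS = {
    "6": "varies/varied",
    "7": "fewer than the smallest reportable number",
    "8": "blank but applicable/cannot be determined",
    "9": "don't know",
}


def decode_field(raw: str, questionnaire: bool = True) -> Tuple[Optional[float], str]:
    """Return (value, status) for one fixed-width numeric field.

    Assumption: a fill is a single digit repeated across the whole field.
    Non-numeric content raises ValueError from float().
    """
    text = raw.strip()
    if text == "":
        # Blank fill: the respondent was not eligible or was skipped.
        return None, "inapplicable"
    fills_whole_field = len(text) == len(raw) and len(set(text)) == 1
    if fills_whole_field:
        digit = text[0]
        # Outside questionnaires, only 8-fills are treated as fills.
        if digit == "8" or (questionnaire and digit in FILL_MEANINGS):
            return None, FILL_MEANINGS[digit]
    return float(text), "reported"
```

For example, decode_field("8888") is classified as blank but applicable, decode_field("    ") as inapplicable, and decode_field("  12") as a reported value of 12. With questionnaire=False, a field such as "99" is kept as a reported value, because 9-fills are described as questionnaire-only.
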