I. Introduction
The primary objective of the National Household Survey on Drug Abuse (NHSDA) is to measure
the prevalence of use of illicit drugs, alcohol, and tobacco products, as well as the nonmedical use of
prescription drugs in the United States. The 1998 NHSDA is the eighteenth in the series, which began in
1971 and has been conducted annually since 1990.
The Population Estimates report is published by the Office of Applied Studies (OAS) within the
Substance Abuse and Mental Health Services Administration (SAMHSA). The report provides the drug
abuse prevention, treatment, and research communities, and other interested parties, with timely data on
current substance use prevalence measures. OAS issues a companion volume, the Main Findings report,
at a later date to present an expanded analysis of the data, including information on drug and alcohol use
trends; demographic correlates of use of illicit drugs, alcohol, and tobacco; patterns and problems of drug
use; and perceived harmfulness of drug use. OAS produces and distributes another NHSDA report
containing prevalence estimates when the survey results are first released each year. Special analytic
reports also are periodically produced on topics of current interest (e.g., drug use among employed
people, drug use and family structure).
Estimates presented in Sections III and IV of this report are based on a questionnaire and estimation
methodology introduced in 1994 and continued through 1998.¹ Due to the effect this new methodology
may have on the magnitude of estimates, comparisons with NHSDA data prior to 1994 should be made
with caution.²
SURVEY METHODOLOGY
The NHSDA is based on a stratified, multistage area probability sample. For the 1998 study, 139
primary sampling units (PSUs) were selected at the first stage of sampling. The 1998 national sample
comprised 115 PSUs (43 certainty PSUs and 72 noncertainty PSUs). The national certainty PSUs
have been included in the NHSDA since 1988, and the noncertainty PSUs are the same PSUs selected
since 1996. The 1998 national sample was supplemented by six noncertainty PSUs from the 14 Arizona
PSUs that were selected in 1997, plus two noncertainty PSUs from the four California PSUs that were
selected in 1997, and an additional 16 new noncertainty PSUs selected from 13 States to provide minimal
sample sizes for testing the small area estimation methodology in all States.
¹
See page 9 of the following: Office of Applied Studies, Substance Abuse and Mental Health Services
Administration. (1995). National Household Survey on Drug Abuse: Population estimates 1994
(DHHS Publication No. SMA 95-3063). Rockville, MD: Author.
²
For additional detailed information on the 1994 questionnaire modification and its implications, see the
following: Office of Applied Studies, Substance Abuse and Mental Health Services Administration.
(1996). The development and implementation of a new data collection instrument for the 1994 National
Household Survey on Drug Abuse (DHHS Publication No. SMA 96-3084). Rockville, MD: Author.
Within each PSU, area segments were selected with probability proportional to a composite size
measure that was designed to overrepresent concentrated Hispanic and black neighborhoods, as well as
younger individuals.³ Dwelling units were selected from each sample segment. The target population
included all 12-year-old or older civilian residents of households (including civilians residing on military
installations) and noninstitutional group quarters (e.g., college dormitories, homeless shelters, rooming
houses). Persons excluded from this target population included military personnel on active duty,
transient populations (such as homeless people who do not reside in shelters), and residents of
institutional group quarters (e.g., jails, hospitals). Data collection was continuous over the 1998 calendar
year.
Survey data were collected through personal visits to each selected residence. Introductory letters
were mailed to each residence and were used to explain the survey prior to the interviewer's visit. Upon
arrival, the NHSDA field representative conducted a short voluntary screening procedure with any
resident of the household 18 years old or older who was capable of providing information on the age,
race/ethnicity, gender, and marital status of each resident 12 years old or older. This information was
used in a random selection procedure to determine whether any resident members were eligible for an
in-depth interview (either one, two, or no individuals were selected). The interviewer had no control
over this selection procedure. The 1998 within-household person selection probabilities were based on
the race/ethnicity of the head of household and the ages of each household member. Selected individuals
then were asked if they would complete a voluntary interview. NHSDA field representatives conducted
the interviews using a paper-and-pencil questionnaire that included both interviewer-administered
questions and self-administered answer sheets (for collection of sensitive information). All screening
and interview responses were kept confidential.
In 1998, a total of 33,128 eligible dwelling unit members were selected for an interview; of these,
25,500 completed interviews (4,903 in California, 3,869 in Arizona, and 16,728 in the remaining States).
Total response rates for screening and interviewing were 93.0% and 77.0%, respectively.
PROCEDURES FOR DERIVING POPULATION ESTIMATES
For convenience, both rate estimates and corresponding population estimates are included in each
table. Population estimates are presented in thousands (e.g., a population estimate of 430 represents
430,000 persons). Each "observed estimate" is followed by its 95% confidence interval in parentheses.
Estimates are provided for each of the following time periods when use of illicit drugs, alcohol, and
tobacco occurred: (a) use in the lifetime (ever used), (b) use in the past year (used past year), and (c) use
in the past month (used past month), also referred to as "current use." These estimates have been
obtained by weighting the data to reflect current population totals for various demographic subgroup
populations.
³
In the interest of readability for this report, “white” is used to indicate “white, non-Hispanic,” and
“black” to indicate “black, non-Hispanic.”
Development of Weights
An analysis weight was calculated for each completed interview to reflect selection probabilities
and to compensate for nonresponse and undercoverage. A poststratification adjustment was made to
force the respondent weight totals to equal U.S. Bureau of the Census projections of the civilian,
noninstitutionalized population 12 years old or older. The poststratification totals⁴ were obtained from the
National Estimates and Population Projections Branch of the U.S. Bureau of the Census and classified
by age group, gender, race/ethnicity, and Hispanic origin. These poststratification totals were
appropriately adjusted using State-level population projections that also were obtained from the Census
Bureau. With the State-level demographic control totals so obtained, a three-level “State” variable (i.e.,
California, Arizona, and remainder of the United States) was used in the poststratification to produce
stable State-level estimates for California and Arizona. In 1998, regional control totals also were
obtained and used during poststratification. In general, the interview samples from each quarter were
poststratified to one-fourth of the projected population totals. These totals represent the population at the
midpoint of each quarter's data collection period (the 15th day of February, May, August, and November
1998). The resulting quarterly analysis weights sum to the average of the four quarter-specific
projections. The final analysis weight can be viewed as the number of population members that each
respondent represents.
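The poststratification step described above can be sketched as a ratio adjustment within cells. This is a minimal illustration only, not the actual NHSDA weighting program: the cell labels, weights, and projected totals are made up, and the function name `poststratify` and the dictionary keys are hypothetical.

```python
from collections import defaultdict


def poststratify(records, control_totals):
    """Scale weights so that each cell's weights sum to its control total.

    records: list of dicts with hypothetical keys "cell" and "weight".
    control_totals: dict mapping each cell label to its target total.
    """
    # Sum the current weights within each poststratification cell.
    cell_sums = defaultdict(float)
    for rec in records:
        cell_sums[rec["cell"]] += rec["weight"]

    # Multiply every weight by (control total / current weight total) for its cell.
    adjusted = []
    for rec in records:
        factor = control_totals[rec["cell"]] / cell_sums[rec["cell"]]
        adjusted.append({**rec, "weight": rec["weight"] * factor})
    return adjusted


# Hypothetical respondents from one quarter (made-up cells and weights).
quarter = [
    {"cell": "12-17/male", "weight": 900.0},
    {"cell": "12-17/male", "weight": 1100.0},
    {"cell": "18-25/female", "weight": 1500.0},
]
# Made-up population projections at the midpoint of the quarter.
quarter_projection = {"12-17/male": 10000.0, "18-25/female": 8000.0}
# Each quarter is poststratified to one-fourth of the projected totals.
quarter_controls = {cell: total / 4 for cell, total in quarter_projection.items()}

adjusted = poststratify(quarter, quarter_controls)
```

Because each of the four quarters is scaled to one-fourth of its own projection, summing the adjusted weights over the year gives the average of the four quarter-specific projections, as the text states.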
Tables 1A, 1B, and 1C in Section II present the sample sizes and U.S. population totals on which
the population estimates of drug use are based. The population figures are estimates of the civilian,
noninstitutionalized population and are generated by summing the individual final analysis weights of
the respondents belonging to each population category.
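As a small illustration of how population figures and rate estimates follow from the final analysis weights, the sketch below sums weights over respondents in a category. The sample records, the key names, and the helper names `weighted_estimates` and `in_thousands` are hypothetical and chosen only for this example.

```python
def weighted_estimates(respondents, used_key):
    """Return the weighted population total, number of users, and rate.

    respondents: list of dicts with a "weight" and a boolean use indicator.
    used_key: name of the (hypothetical) boolean indicator to tabulate.
    """
    population = sum(r["weight"] for r in respondents)
    users = sum(r["weight"] for r in respondents if r[used_key])
    return {"population": population, "users": users, "rate": users / population}


def in_thousands(count):
    """Express a population count in thousands, as in the report's tables."""
    return round(count / 1000)


# Made-up respondents: each weight is the number of people represented.
sample = [
    {"weight": 2000.0, "used_past_month": True},
    {"weight": 3000.0, "used_past_month": False},
    {"weight": 5000.0, "used_past_month": True},
]
estimates = weighted_estimates(sample, "used_past_month")
users_in_thousands = in_thousands(estimates["users"])
```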
Adjusting for Nonresponse Through Imputation
The population estimates in this report are based on either the total responding sample or all cases in
a subgroup, including some cases where missing data for some recency-of-use and frequency-of-use
variables were replaced with logically or statistically imputed values. The interview
classification "minimally complete" (a status necessary for a case to be included in the database) requires
that data on the recency of use of alcohol, marijuana, and cocaine be present. To determine case
completeness, an editing procedure was employed to replace missing data for these substances based on
information supplied by the respondent elsewhere in the questionnaire. After this editing, case
completeness was determined. When necessary, additional logical imputation also was done to replace
other inconsistent, missing, or otherwise faulty data.
After editing, any data still missing for recency-of-use and frequency-of-use questions (for drugs
other than alcohol, cocaine, and marijuana) were statistically imputed using the technique of sequential
hot-deck imputation. The first step in this procedure involves sorting the data file progressively using
data on recency of use of alcohol, marijuana, and cocaine; age; gender; Hispanic origin; race/ethnicity; and a State indicator variable (i.e., California, Arizona, or remainder of the United States). The hot-deck
imputation procedure replaces a missing item on a particular record by the last encountered nonmissing
response for that item (from the previous record) on the sorted database. The hot-deck imputation
procedure is appropriate for recency-of-use and most frequency-of-use variables because the level of
item nonresponse is low.
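The sequential hot-deck procedure described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the production NHSDA code: the variable names (`alc_recency`, `age_group`, `inh_recency`) and the function name are hypothetical, only two sort variables are used, and a record with no earlier donor in the sort order is simply left missing.

```python
def sequential_hot_deck(records, sort_keys, item):
    """Fill missing values of `item` with the last reported value in sort order.

    records: list of dicts; a missing response is represented by None.
    sort_keys: variables used to sort the file so neighbors are similar.
    item: the variable to impute.
    """
    # Sort progressively by the chosen variables (Python's sort is stable).
    ordered = sorted(records, key=lambda rec: tuple(rec[k] for k in sort_keys))

    last_value = None  # last reported (nonmissing) response seen so far
    completed = []
    for rec in ordered:
        filled = dict(rec)
        filled["imputed"] = False
        if filled[item] is None:
            # Borrow the donor value, if any donor has been seen yet.
            if last_value is not None:
                filled[item] = last_value
                filled["imputed"] = True
        else:
            last_value = filled[item]
        completed.append(filled)
    return completed


# Made-up records with coded (hypothetical) recency categories.
records = [
    {"id": 1, "alc_recency": 1, "age_group": 2, "inh_recency": 3},
    {"id": 2, "alc_recency": 1, "age_group": 2, "inh_recency": None},
    {"id": 3, "alc_recency": 2, "age_group": 1, "inh_recency": 4},
    {"id": 4, "alc_recency": 1, "age_group": 1, "inh_recency": 2},
]
completed = sequential_hot_deck(records, ["alc_recency", "age_group"], "inh_recency")
```

Sorting first means the donor is the nearest preceding respondent with similar characteristics, which is why the choice and order of sort variables matters.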
Missing data for the variables on frequency of use of alcohol, cocaine, and marijuana in the past 12
months were statistically imputed using a regression-based method of imputation. This imputation
procedure involves estimating a polytomous logistic model using a number of respondent characteristics.
The explanatory variables used in these models included those variables used in the recency-of-use hot-deck
imputation procedure, such as recency of use of alcohol, marijuana, and cocaine; age; gender;
Hispanic origin; race/ethnicity; and State. After the model parameters were estimated, the resulting
model was used to predict a categorical value for each frequency-of-use item nonresponse. The
model-based imputation procedure is appropriate for alcohol, cocaine, and marijuana frequencies for two
reasons: (a) the relative amount of nonresponse or faulty responses to these questions is larger than what
is observed for the recency-of-use and other frequency-of-use items, and (b) the model-based procedure
allows a greater number of statistically significant explanatory variables to contribute to imputing a
response compared to what is possible with the hot-deck method.
The main advantage of imputation is that it simplifies the calculation of estimates. Its use can
reduce the bias caused by missing data and thus improve the accuracy of estimates. In the 1998 NHSDA,
however, the potential impact of bias due to item nonresponse and the impact of imputation on the
estimates themselves were quite small because item nonresponse was less than 2% for most of the drug
use recency questions.
⁴
These 1998 population projections were based on the 1990 U.S. Census counts.
Sampling Error and Confidence Intervals
The NHSDA, like all sample surveys, has an inherent degree of statistical uncertainty based on the
sample design. NHSDA estimates are subject to uncertainties of two types: sampling errors and
nonsampling errors. Examples of nonsampling errors are recording mistakes, coding errors,
nonresponse, differences in respondents' interpretations of questions, and purposely false answers. The
effects of nonsampling errors on the estimates cannot normally be quantified; however, rigorous attempts
are made to minimize their occurrence through pretesting, interviewer training, interview verification,
coder training, coding verification, and other quality control measures.
Sampling errors denote the random fluctuations that occur in estimates based on samples drawn
from a population; such variations can be eliminated only by conducting a complete census. Using the
same procedures, different samples drawn from the same population would be expected to result in
different estimates. Many of these observed estimates would differ to some degree from the "true"
population value, and these differences are due to sampling error. The variance of an estimate is the
basic measure of this type of error.
To account for the complex features of the NHSDA sample design (such as unequal selection
probabilities, stratification, and clustering), the variance estimates of the NHSDA drug use statistics are
computed for this report using a survey data analysis software package called SUDAAN.⁵ Estimates of
means or proportions, such as drug use prevalence, take the form of nonlinear statistics where the
variances cannot be expressed in closed form. Variance estimation for nonlinear statistics in SUDAAN
is based on a first-order Taylor series approximation of the deviations of estimates from their expected
values. The resulting variance estimates are approximately unbiased for sufficiently large sample sizes.
For a given variance estimate, the associated design effect is the ratio of the design-based variance
estimate to the variance that would have been obtained from a simple random sample of the same size.
Because the combined design features of stratification, clustering, and unequal weighting are expected to
increase the variance estimates, the design effect should virtually always be greater than one. For
prevalence rates near zero, however, the variance-inflating effects of unequal weighting and clustering
are sometimes underestimated, resulting in design effects of less than one. Because the corresponding
variance estimates are then considered anomalously small, two other variance estimates are computed as
quality control measures. The first of these other variance estimates is based only on the stratification
and unequal weighting effects, and the second is based on simple random sampling. The variance
estimate used for obtaining confidence intervals is then the maximum of these three estimates.
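The quality control rule above (taking the maximum of three variance estimates) can be sketched as below. The numeric inputs are made up, the function names are hypothetical, and the simple random sampling variance is assumed here to be p(1-p)/n, which the report does not state explicitly.

```python
def srs_variance(p, n):
    """Variance of a proportion under simple random sampling (assumed p(1-p)/n)."""
    return p * (1.0 - p) / n


def design_effect(design_var, p, n):
    """Ratio of the design-based variance to the simple random sampling variance."""
    return design_var / srs_variance(p, n)


def variance_for_ci(design_var, strat_weight_var, p, n):
    """Use the largest of the three variance estimates for confidence intervals."""
    return max(design_var, strat_weight_var, srs_variance(p, n))


# Made-up example of a rare behavior where the design-based variance looks too small.
p, n = 0.002, 5000
design_var = 3.0e-7        # hypothetical full design-based estimate
strat_weight_var = 3.5e-7  # hypothetical estimate ignoring clustering

deff = design_effect(design_var, p, n)
chosen = variance_for_ci(design_var, strat_weight_var, p, n)
```

In this made-up case the design effect is below one, so the simple random sampling variance is the largest of the three and is the one carried forward.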
The 95% confidence intervals for the drug use proportions and corresponding population estimates
are constructed based on the logit transformation. Because the drug use proportions in the NHSDA are
frequently small, the logit transformation has been used for this report to yield asymmetric interval
boundaries. These asymmetric intervals are more balanced with respect to the probability that the
interval is above or below the true population value than is the case for standard symmetric confidence
intervals.
To illustrate the method, let
p = estimated proportion,
var(p) = variance estimate of p,
q =1-p,
L = logit of p = ln [p/(1-p)], where "ln" denotes the natural logarithm, and
var(L) = var(p)/(pq)² .
The approximate 95% confidence interval for L is then calculated as
(A, B) = (L - 1.96 (sqrt[var(p)] / (pq)), L + 1.96 (sqrt[var(p)] / (pq))),
where the quantity in parentheses that is multiplied by 1.96 estimates the standard error (SE) of L.
Applying the inverse logistic transformation to the confidence interval endpoints, A and B, yields a 95%
confidence interval for the proportion, P, as
(P_lower, P_upper) = (1 / [1 + exp(-A)], 1 / [1 + exp(-B)]),
where "exp" denotes the exponential function (the inverse of the natural logarithm). The lower and upper confidence interval endpoints
for percentage estimates are obtained by multiplying the lower and upper endpoints for proportions by
100. The confidence interval for the corresponding population estimate is obtained by multiplying the
confidence interval endpoints by the estimated number of individuals in the population subgroup
constituting the base or denominator of the associated proportion.
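The logit-based interval can be worked through numerically as in the sketch below. The proportion 0.33 echoes the marijuana example used elsewhere in this report, but the standard error (0.0071) and the population base (200,000 thousand) are hypothetical values chosen only for illustration, and the function name is an assumption.

```python
import math


def logit_confidence_interval(p, var_p, z=1.96):
    """Asymmetric confidence interval for a proportion via the logit transformation."""
    q = 1.0 - p
    logit = math.log(p / q)                  # L = ln[p / (1 - p)]
    se_logit = math.sqrt(var_p) / (p * q)    # square root of var(L) = var(p) / (pq)^2
    a = logit - z * se_logit                 # lower endpoint on the logit scale
    b = logit + z * se_logit                 # upper endpoint on the logit scale
    # Inverse logistic transformation back to the proportion scale.
    lower = 1.0 / (1.0 + math.exp(-a))
    upper = 1.0 / (1.0 + math.exp(-b))
    return lower, upper


p = 0.33
se_p = 0.0071  # hypothetical standard error, not taken from the report
lower, upper = logit_confidence_interval(p, se_p ** 2)

# Percentage endpoints: multiply the proportion endpoints by 100.
lower_pct, upper_pct = 100 * lower, 100 * upper

# Population endpoints: multiply by the subgroup base (hypothetical, in thousands).
base_thousands = 200_000
lower_pop, upper_pop = lower * base_thousands, upper * base_thousands
```

For a proportion below 0.5, the upper endpoint lies slightly farther from the estimate than the lower endpoint, which is the asymmetry the text describes.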
The precise interpretation of the 95% confidence interval is as follows: If repeated samples of
identical design are drawn from the population, and the sample estimate and corresponding upper and
lower confidence limits are calculated for each sample, then the true population value is covered by the
confidence intervals of, on average, 95 of 100 samples.
For tables in this report, each estimate of the number of users of the drug in the defined subgroup
(as well as its corresponding estimated percentage of the subgroup's total population) is accompanied by
an upper and lower confidence limit. For example, in the lower portion of Table 3A, the "observed
estimate" for the total number of people who have "ever used" marijuana is 72,070,000. The "lower
limit" is 69,122,000, and the "upper limit" is 75,080,000. The interpretation of these estimates is that one
can be 95% confident that the total number of people who have ever tried marijuana at least once in their
lifetime lies between 69,122,000 and 75,080,000, with the best 1998 NHSDA estimate being 72,070,000.
The corresponding percentage estimates for the lower and upper confidence limits are 31.6% and 34.4%,
respectively, with the best estimate being 33.0%.
As in other publications in the NHSDA series, estimates with low precision are not reported.
The criterion used for suppressing estimates is based on the size of the estimate and the relative standard
error (RSE) of the estimate. The RSE is defined as the ratio of the standard error of an estimate to
the estimate itself. Specifically, cell percentages and corresponding estimates of numbers of users are
suppressed if at least one of the following three criteria is met:
(1) p < .0005 or p ≥ .9995
(2) RSE[-ln(p)] > 0.175 when p < 0.5
(3) RSE[-ln(1-p)] > 0.175 when p > 0.5
where RSE[-ln(p)] is RSE(p)/[-ln(p)]. For computational purposes, this is equivalent to
[SE(p)/p] / [-ln(p)] > 0.175 when p < 0.5, and
[SE(p)/(1-p)] / [-ln(1-p)] > 0.175 when p > 0.5,
where SE(p) is the standard error estimate of p. The log transformation of p is used to provide a more
balanced treatment of measuring the quality of small, large, and intermediate p values. The switch to
(1-p) for p greater than 0.5 yields a symmetric suppression rule across the range of possible p values.
Because the sample sizes for subgroup populations are relatively large, low precision generally occurs
only for prevalence rates that are near either 0% or 100%.
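The suppression rule can be expressed as a short function. This is a sketch under stated assumptions rather than the report's own program: the function name is hypothetical, the upper bound in criterion (1) is taken as inclusive, and p exactly equal to 0.5 (which the listed criteria do not cover) is handled with the form for small p, since the two forms coincide at that point.

```python
import math


def is_suppressed(p, se_p, threshold=0.175):
    """Return True if an estimated proportion p should be suppressed."""
    # Criterion (1): estimates extremely close to 0 or 1 (upper bound assumed inclusive).
    if p < 0.0005 or p >= 0.9995:
        return True
    if p <= 0.5:
        # Criterion (2): RSE of -ln(p), computed as [SE(p)/p] / [-ln(p)].
        # Assumption: p == 0.5 is handled here; both forms agree at 0.5.
        return (se_p / p) / (-math.log(p)) > threshold
    # Criterion (3): RSE of -ln(1-p), computed as [SE(p)/(1-p)] / [-ln(1-p)].
    return (se_p / (1.0 - p)) / (-math.log(1.0 - p)) > threshold


# Made-up inputs for illustration.
common = is_suppressed(0.33, 0.0071)      # precise estimate of a common behavior
rare = is_suppressed(0.001, 0.0015)       # imprecise estimate of a rare behavior
mirrored = is_suppressed(0.999, 0.0015)   # mirror image of the rare case
tiny = is_suppressed(0.0001, 0.00001)     # fails criterion (1) regardless of SE
```

The mirrored case shows the symmetry the text describes: an estimate of 0.999 is treated the same way as an estimate of 0.001 with the same standard error.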
NHSDA drug use prevalence data are presented for each gender; four major age groups (12 to
17, 18 to 25, 26 to 34, and 35 years old or older); three major mutually exclusive racial/ethnic groups,
based on respondents' self-classifications (Hispanic in origin, regardless of race; white, not of Hispanic
origin; and black, not of Hispanic origin); and four geographic regions. (Those who did not identify
themselves as Hispanic, white, or black are included in the population totals, but separate estimates are
not presented for this "other" category because the sample size is too small [see Table 1B].) Tables are
presented separately for the total population, whites, Hispanics, blacks, and geographic region. The four
U.S. Bureau of the Census regions are Northeast, North Central, South, and West (see Exhibit 1). For
each drug, eight tables are arranged to facilitate group comparisons. Data for the estimated numbers of
users in subgroups are arranged in rows and presented by gender for each of four age groups. Data in the
remaining seven tables for each racial/ethnic or regional subgroup are presented first by age, then by
gender, and finally for the total population.
Time periods of use shown in column headings are "ever used," "used past year," and "used past
month." These categories are cumulative (i.e., those who have "used [in the] past month" also are
included in the "used [in the] past year" and "ever used" categories). Likewise, those who have "used [in
the] past year" are included in the "ever used" estimates.
Other than presenting results by age group and other basic demographic characteristics, no
attempt is made in this report to control for potentially confounding factors that might help explain any
associations observed. This point is particularly salient with respect to race/ethnicity, which tends to be
highly associated with socioeconomic characteristics. Also, the cross-sectional nature of the data
precludes any causal interpretations of observed relationships. Nevertheless, data presented in this report are useful for comparing demographic subgroups with respect to drug use rates, regardless of why they
differ.
Prevalence Estimates for Specific Drugs and Drug Classes
Section III presents the basic set of drug use prevalence estimates grouped by various drug
categories. The first drug category presented is "Any Illicit Drug," which includes any use of
marijuana/hashish; cocaine, including crack; inhalants; hallucinogens, including lysergic acid
diethylamide (LSD) and phencyclidine (PCP); heroin; and the nonmedical use of psychotherapeutics
(i.e., stimulants, sedatives, tranquilizers, and analgesics). Following the estimates for any illicit use,
tables are presented separately for various specific categories of illicit drug use, as well as for alcohol,
cigarettes, and smokeless tobacco. Less detail is shown for estimates of PCP use, LSD use, heroin use,
and needle use (see Tables 16 to 19) because the small number of respondents reporting these behaviors
resulted in low precision for most "used past month" estimates, as well as many other estimates.
Frequency of Drug Use Among Past Year Users
Data presented in Section IV are useful for identifying how often a drug is used. After results
from earlier surveys were published, the estimate of those who had used a drug in the past month was
cited by some readers as an estimate of the number of "regular users." This interpretation was
unsatisfactory because past month users include both those experimenting with the drug and
regular users. Therefore, information has since been collected on the frequency of drug use in the past
year.
Frequency of drug use during the past 12 months is classified into three categories: "at least
once," "12 or more days," and "51 or more days." The categories are cumulative; those using "51 or
more days" also are counted among the "12 or more days" and the "at least once" users. Similarly, those
using "12 or more days" also are counted among those who have used "at least once" in the past year. By
definition, estimates for those who have used “at least once” are equivalent to those who have “used past
year” in earlier tables.
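Because the frequency categories are cumulative rather than mutually exclusive, a single respondent can be counted in more than one column. A minimal sketch, with a hypothetical function name and made-up day counts:

```python
def frequency_categories(days_used):
    """Map days of use in the past 12 months to the cumulative table categories."""
    return {
        "at least once": days_used >= 1,
        "12 or more days": days_used >= 12,
        "51 or more days": days_used >= 51,
    }


# A respondent who used on 60 days is counted in all three categories.
heavy = frequency_categories(60)
# A respondent who used on 12 days is counted in the first two only.
moderate = frequency_categories(12)
# A respondent with no use in the past year is counted in none.
none = frequency_categories(0)
```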
CONSIDERATIONS IN INTERPRETING THE DATA
The estimates produced in this report should be viewed as approximations based on the best
available data. Readers are therefore cautioned to take the following points into account when using or
interpreting the data in this report:
- The value of self-reports obviously depends on the honesty and memory of
sampled respondents. Research has supported the validity of self-report data in
similar contexts.⁶,⁷ Although NHSDA procedures are designed to encourage
truthfulness and recall, as with all studies of this type, some underreporting or
overreporting may occur.
- NHSDA drug use prevalence estimates for specific subgroups are sometimes
based on modest to small sample sizes, which may lead to substantial sampling
error.
- Population projections prepared for the U.S. Bureau of the Census' Current
Population Survey (and used in weighting the 1998 NHSDA sample) are subject
to error, which increases with the age of the last census.
- The population surveyed consists of noninstitutionalized civilians living in
households, college dormitories, homeless shelters, rooming houses, and on
military installations. Therefore, this report does not present estimates for some
segments of the U.S. population that may contain a substantial proportion of
drug users, such as transients not residing in shelters (e.g., users of soup kitchens
or residents of street encampments) and those incarcerated in county jails or
State and Federal prisons.
⁶
Rouse, B.A., Kozel, N.J., & Richards, L.G. (Eds.). (1985). Self-report methods of estimating drug
use (NIDA Research Monograph No. 57, DHHS Publication No. ADM 85-1402). Rockville, MD:
National Institute on Drug Abuse.
⁷
Turner, C.F., Lessler, J.T., & Gfroerer, J.C. (Eds.). (1992). Survey measurement of drug use:
Methodological studies (DHHS Publication No. ADM 92-1929). Rockville, MD: National Institute
on Drug Abuse.