General Measures of Health for use in Health Interview Surveys and Censuses: the UK experience
Professor Howard Meltzer
Social Survey Division
Office for National Statistics
London, SW1V 2QQ
United Kingdom
Washington Group meeting
9 – 10 January 2003
Ottawa
1. Introduction
The aim of this review was to provide information that would assist in the design of surveys and the interpretation of survey results on the general health of the population resident in private households in the UK.
The project was empirically driven, being based around results from large population surveys in the UK. Several of these had been commissioned by governmental bodies over the previous decades. These included the General Household Survey (GHS), the Health Survey for England (HSE), the Health Education Monitoring Survey (HEMS) and a number of others.
Evidence from health surveys conducted outside central government was also considered and special attention was paid to ‘calibration’ exercises designed to throw light on the way questions about respondents’ general health are interpreted and answered by members of the public.
Whereas the review included many general health measures, this paper has extracted those sections concerned with measures of health covered by one or two questions which are generally applicable, cover a range of dimensions of health of importance to the public, and are simple to understand and use in sample surveys of the general population.
1.1 Use of general health measures
At a population level, general health measures
can be used to produce prevalence estimates and thus provide a method of
monitoring the population’s health and of assessing the likely demand for
health care and services.
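As a rough illustration of how a prevalence estimate might be derived from a yes/no survey item, the following Python sketch computes a proportion with an approximate 95% confidence interval. It assumes simple random sampling and uses the normal approximation; real surveys such as the GHS use complex sample designs, so this is a simplification. The function name and the counts are hypothetical and are not taken from any survey discussed here.

```python
import math

def prevalence_with_ci(cases, n, z=1.96):
    """Return (prevalence, lower, upper) using the normal approximation.

    Assumes simple random sampling, which is a simplification of real
    survey designs (no weighting, clustering or stratification).
    """
    if n <= 0 or not 0 <= cases <= n:
        raise ValueError("need n > 0 and 0 <= cases <= n")
    p = cases / n
    se = math.sqrt(p * (1 - p) / n)  # standard error of a proportion
    return p, max(0.0, p - z * se), min(1.0, p + z * se)

# Hypothetical counts, invented for illustration only.
p, lower, upper = prevalence_with_ci(cases=180, n=1000)
print(f"prevalence {p:.1%}, 95% CI {lower:.1%} to {upper:.1%}")
```

The width of the interval shrinks with the square root of the sample size, which is one reason why only a census, or a very large survey, can support estimates for small areas.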
When used to produce health measures at a national level, they may be collected either via a census or a sample survey. The use of the decennial census in the UK to collect health information has the advantage of near complete coverage of the whole population and, thus, the ability to provide estimates for small areas of the country.
The disadvantage of using a census, apart from the cost, is that space on the census form is limited, and the forms are normally completed by one member of the household, without the presence of an interviewer to probe for a full answer. In the UK, census forms are put through letter boxes by enumerators and returned by post. In addition, there are long periods between censuses, which means that information collected on them will become outdated.
Because of these disadvantages, the monitoring of the nation’s health is normally carried out by the inclusion of health measures in continuous or frequently repeated surveys of the general population, using face-to-face interviewing. Therefore the main focus of this paper is the measurement of health in the face-to-face sample survey context. However, the national Censuses for 1991 and 2001 are included here because they include general health questions and therefore have an important place in health monitoring strategy.
2. Conceptual issues
2.1 Why are questions about ‘general health’ included in surveys?
Single questions or brief sets of questions about general health are frequently included both in specialised health surveys and in general surveys of the population. Three main needs seem to underlie this popularity.
The first need is to control both the burden on
respondents and the cost and complexity of surveys by minimising the number of
questions on any one topic that has to be included in a questionnaire. A single
question providing an indicator of general health is cheap and may appear
straightforward to interpret. Simplicity is an important advantage,
particularly in the case of large surveys that present and compare results for
many sub-samples and over time. If survey respondents are willing and able to
answer these simple-seeming questions, why incur more expense and complication
by asking more?
The second need, which is implicit in all quantitative survey work, is to derive a simple indicator (or small set of indicators) to subsume the detail which emerges when a person is questioned in depth about something as complex as his or her state of health.
A third reason for including general health measures in surveys may be as a relatively straightforward way of estimating the ‘burden of ill health’ in the population. Here there may be a subtext which defines ‘ill health’ as ‘that which requires input from the health services’. Without such an indicator there is no simple way of using a continuous or repeated health survey to answer the question ‘How well are we doing in our efforts to improve health in the population?’ Of course, the fact that such a measure would be very useful does not, of itself, imply that it should not be subject to serious methodological scrutiny.
2.2 The concept of ‘general health’
The concept of ‘general health’ may at first sight appear rather straightforward and commonsensical. In everyday conversation we often address to a friend or acquaintance such questions as ‘How are you these days?’. Sometimes this is mere politeness, but sometimes we actually expect, and appear to receive, more or less informative answers bearing on the person’s general health.
However, in asking and answering such questions we seldom give conscious attention to such issues as:
· whether we mean the individual to take account of stable long-term conditions or disabilities, or only of recent or acute episodes of ill health;
· whether they are giving appropriate and consistent weightings to different aspects of health, mental and physical;
· whether or not we expect them to ‘discount’ health problems associated with advancing age.
In short, we seldom ask ourselves whether or not the person we are talking to is drawing the same conceptual boundaries around the idea of ‘general health’ as we do ourselves, or as other persons to whom we might address the same question. Interview respondents probably take formal interview questions more seriously than casual conversational enquiries, but the evidence suggests that terms within the conceptual domain of ‘health’ are unlikely to be interpreted very consistently either across different individuals or within the same individual over time.
Even if respondents appear to understand consistently what they are being asked to do in providing an assessment of their general health, there is no way, without special cognitive studies, that we can assess whether the response given is based on careful and comprehensive thought and the application of ‘reasonable’ standards of judgement, or not. The exception perhaps occurs in cases of gross discrepancy, such as where persons who on objective evidence appear to be very ill give responses suggesting that they have no health problems.
In other words, there can be no ultimate “gold standard” that can be applied to questions inserted in health surveys to distinguish “correct” from “wrong or misleading” responses to questions asking for a self-assessment of health.
2.3 Cognitive tasks required by general health questions
To arrive at a single, summary answer to a question about his or her general health the respondent must, in theory, strike a weighted average of how they feel they stand on different dimensions of health (the weights representing the importance to them, personally, of the different dimensions). In trying to understand what really goes on in respondents’ minds as they take in a question about their general health, decide what is required by way of an answer and apply standards and judgement to their personal experience in order to produce a response, our main source of information has been the relatively small amount of “cognitive” question testing work that has been done on the way general health questions are answered.
Indicators of different aspects of health may, perfectly legitimately, move in different directions over time. For example, a person’s mobility may improve while their sight (or digestion, or depression, or migraines) gets worse; and differential movement of indicators will often be observed at the population as well as at the individual level.
3. Methodological issues
In order to assess the quality of data derived from the responses to survey questions, a well-established list of criteria is available to social researchers.
3.1 Validity
Validity is the degree to which a measure captures the concepts it is intended to measure and is not systematically affected by other, irrelevant variables. Also, the same concept needs to be measured in the same way, and using the same standards, for all respondents. There are several different ways of assessing validity.
Face validity appeals to semantic or observational judgements of whether the measure being evaluated appears to capture what it is intended to capture. For example, the response category ‘Yes’ to the question ‘Do you suffer from any longstanding illness or disability?’ has face interpretability.
Criterion or external validity makes comparisons with other sources. Criterion validity looks for appropriate correlation between the measure being evaluated and some other independent and trusted measure or classification of the same concept. For example, correlation with clinical diagnosis of severe arthritis might be used to validate a questionnaire item on long-standing health problems. However, it would be very surprising if a measure with face validity as a measure of long-standing health problems did not produce some degree of positive correlation with diagnosed arthritis. To be convincing as validation of the measure, a very strong positive relationship would need to be shown.
Construct validity is assessed by testing theory-based predictions of the pattern of statistical relationships between the measure being evaluated and other, conceptually-related measures. For example, a variable said to measure “social isolation” might be predicted to correlate more highly with “living alone” and “mobility problems” than with “digestive problems”. Again, it is not enough for the predicted pattern of correlation to be present and statistically significant on large samples. To be convincing the observed differentials need to be quite large.
Predictive validity is
assessed by testing theory-based predictions of how health-related outcomes
(for example, hospitalisation or death) should vary for cases having different
scores on the measure. Once again, almost any measure claiming to detect
serious ill-health should be associated with a higher-than-average chance of
early death. To provide convincing proof of the validity of the measure the
prediction achieved needs to be striking, even
after controlling for other variables such as age.
3.2 Freedom from overall bias
The idea of freedom from overall bias is linked to, but not the same as, the idea of ‘validity’ and also to that of ‘sensitivity’. Clinical examination of a representative sample of the population would probably show that ‘perfect health’, like ‘very poor health’, was relatively rare, though the health defects suffered by many of the population would no doubt be relatively trivial or latent (such as, for example, unfitness and obesity due to lack of exercise or poor diet, which is known to be a predictor of serious diseases in middle age). Therefore it could be argued that a questionnaire measure of general health that suggested that the majority of the population had no health defects either is biased in an ‘optimistic’ direction or, alternatively, is insensitive to real differences in health within that part of the population which is free of major health problems.
3.3 Sensitivity
Measures should be sufficiently sensitive to differences in health states. A measure needs to be able to detect changes over time, or mean differences between groups, in the aspect of health that it is intended to measure. This sensitivity should ideally be standard over the whole range of the underlying health variable, so that there are neither “ceiling effects” (loss of sensitivity in distinguishing “very good” from “good” health), nor “floor effects” (loss of sensitivity in distinguishing “very poor” from “poor” health).
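One crude way to screen for possible ceiling or floor effects is to look at how heavily responses pile up in the best and worst answer categories. The Python sketch below does this for a three-category item. It is only a first check under stated assumptions: the 50% threshold is arbitrary, the function names are hypothetical, and the responses are invented for illustration. A concentration of answers in one extreme category does not by itself prove loss of sensitivity.

```python
from collections import Counter

# Answer categories ordered from best to worst, as in the GHS question.
GHS_CATEGORIES = ["good", "fairly good", "not good"]

def extreme_shares(responses, categories):
    """Return the share of responses in the best and worst categories."""
    if not responses:
        raise ValueError("responses must not be empty")
    counts = Counter(responses)
    n = len(responses)
    return counts[categories[0]] / n, counts[categories[-1]] / n

def flag_extremes(responses, categories, threshold=0.5):
    """Flag possible ceiling/floor effects.

    The threshold is an arbitrary assumption chosen for illustration.
    """
    top, bottom = extreme_shares(responses, categories)
    return {"ceiling": top > threshold, "floor": bottom > threshold}

# Hypothetical responses, invented for illustration only.
sample = ["good"] * 60 + ["fairly good"] * 30 + ["not good"] * 10
print(extreme_shares(sample, GHS_CATEGORIES))
print(flag_extremes(sample, GHS_CATEGORIES))
```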
3.4 Freedom from bias between sub-groups
If our aim is to monitor the health of all sections of the general household population, it is important that the criteria above apply equally to all subgroups of the population, so that the health of particular subgroups is not spuriously represented as being better or worse than that of other subgroups. In other words, measures must be equivalent in their meaning and interpretation for all members of the population. A degree of random variability in how individuals interpret and answer survey questions is tolerable, but systematic relative bias in the way questions are interpreted and answered (say) by men versus women, or by younger people versus older people, undermines the aims of monitoring (with a view to determining health policy priorities), unless it can be corrected for in some way.
3.5 Reliability
The term ‘reliability’ is here used in the technical sense which distinguishes it from validity. Ideally it is assessed by special test-retest studies. A reliable measure is one that is not subject to excessive random variability in the results it obtains at the individual level. The presence of such measurement variance has the same effect on survey estimates as a reduction in sample size.
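The equivalence between measurement variance and a smaller sample can be made concrete under a simple classical measurement model, which is an assumption introduced here and not part of the original review. If each observed score is a true score plus an independent random error, the variance of a sample mean from n respondents is (true variance + error variance) / n. An error-free measure would achieve the same variance with only n multiplied by the reliability ratio, true variance / (true variance + error variance). The function names and figures below are hypothetical.

```python
def reliability(true_var, error_var):
    """Share of observed variance that is true-score variance."""
    if true_var <= 0 or error_var < 0:
        raise ValueError("need true_var > 0 and error_var >= 0")
    return true_var / (true_var + error_var)

def effective_sample_size(n, true_var, error_var):
    """Sample size an error-free measure would need for the same precision.

    Assumes simple random sampling and independent random measurement
    error (a classical test theory model, assumed for illustration).
    """
    return n * reliability(true_var, error_var)

# Hypothetical figures: error variance is a quarter of the true variance.
n_eff = effective_sample_size(n=10_000, true_var=1.0, error_var=0.25)
print(f"effective sample size: {n_eff:.0f}")
```

In this invented example a reliability of 0.8 makes a survey of 10,000 respondents about as precise as an error-free survey of 8,000.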
3.6 Portability
Measurement instruments used in monitoring must not be prone to relative bias in their application. For flexibility in developing a health monitoring strategy it may be desired to mount a measure on different survey vehicles and to vary the precise questioning context or use of proxy responses.
It is therefore very desirable that a measure should be portable between surveys in the sense that it will produce the same results, irrespective of whether it is included on a dedicated health survey or a multi-purpose survey (freedom from context/order bias). The ideal measure should also be independent of mode of administration (whether the measure is administered by telephone or face-to-face, for example) and use of proxy responses. Given the fact that health means in the population tend to change rather slowly and that small changes are therefore of interest, lack of portability in measures may have serious consequences in causing statistical artifacts that may be mistaken for true changes or differences in general health.
3.7 Practicality
While the criteria already discussed are of prime importance in scientific and methodological terms, a criterion which in practice tends to outweigh them is practicality. In survey contexts this concerns:
a) the length of time it takes to administer and complete the survey questionnaire module concerned and hence the associated operational and opportunity costs;
b) whether the results will slot readily into an existing time series and offer scope for useful comparisons;
c) whether survey respondents seem able and willing to provide answers to the questions involved without any untoward reaction (acceptability);
d) the cost and complication of processing the resulting data;
e) the suitability of the results for presentation in descriptive survey reports.
3.8 Stability of measurement performance over time
To fulfil the purpose of monitoring health over time, it is important that the format and wording of the measure used, and the way in which they relate to what we intend to measure, be invariant over time. Then one can be confident of interpreting a change over time in the measure as indicating a real change in the population’s health, not a change in expectations or in the relative weight of different components of the measure.
The ideal measure would be one that also provided
continuity with available time series, thus extending the current monitoring
data rather than having to start a new series. However, some doubts arise over
whether the health standards applied by respondents do remain stable over time.
For example, the aims of monitoring over time are undermined if what
respondents in one year understand and intend (on average) by the response “My
health is good” is different from what respondents understood and intended by
the same response five years earlier.
3 The self-assessed general health question.
3.1 Use of the question in the UK
Surveys in the UK which have included a question on self-reported general health include:
The Health Survey for England;
The General Household Survey;
The Health and Lifestyles Survey;
The Health Education Monitoring Survey;
The ONS Psychiatric Morbidity Surveys;
The National Child Development Survey;
The 1970 British Cohort Survey;
The Allied Dunbar National Fitness Survey;
The Health Education Authority’s Today’s Young Adults Survey.
3.2 Range of questions used in the UK.
The Health Survey for England uses the following question, which is recommended by the WHO Regional Office for Europe as an instrument for collecting internationally comparable data for measuring progress towards achieving WHO-Europe Health for All targets.
Use of this question therefore provides a basis for international comparisons of self-assessed health, although respondents’ understanding of what constitutes ‘good’ or ‘bad’ health will be influenced by cultural and historical contexts.
[*] Now I would like to ask you some questions about your health. How is your health in general? Would you say it was ...
RUNNING PROMPT
1 very good
2 good
3 fair
4 bad
5 or very bad?
The General Household Survey (GHS) has included a single-item question since 1976 and therefore offers twenty-five years of annual estimates; the Health Survey for England has included a question since its inception in 1991.
The General Household Survey uses the following question which, unlike that used by the HSE, specifies a time period.
[*] Over the last 12 months would you say your health has on the whole been...
1 good
2 fairly good
3 or not good?
Questions on other surveys ask respondents to
compare themselves with others; the question used by the Health and Lifestyles
Survey, for example, asked respondents to say how good their health was ‘for
someone of your age’.
Although self-assessed health is often measured
by a single item, there is widespread evidence that this question nevertheless
covers several dimensions of health, and that people implicitly go through a
process of considering and weighing these dimensions when answering the
question.
3.3 Cognitive testing of the question
Respondents to the 1984 Health and Lifestyles
Survey, for example, were asked what they understood by the term ‘health’:
among the aspects which they mentioned were absence of disease, functional
ability, and fitness (both physical fitness and psychological well-being). Also
identified were a ‘moral’ dimension, whereby health depended on will-power,
self-discipline and self-control; health as healthy behaviour (being a
non-smoker or non-drinker, taking exercise); and health as a ‘reserve’ which
could be diminished by neglect and accumulated by good behaviour (Blaxter,
1990).
Cognitive work carried out for the pilot phase of the 1997 Health Education Monitoring Survey (HEMS) identified very similar themes. Respondents interpreted ‘health in general’ as absence of ill-health, the ability (or not) to lead a normal life, a state of mind, and physical fitness. Participants in the 2001 Census question-testing programme also referred to frequency of doctor consultations, whether or not people were absent from school or work because of ill-health, and whether or not they were taking medication.
3.4 Socio-economic differences
Many questions on self-assessed health were
specifically designed for inclusion in surveys of the general population. As
single items, they take very little time in an interview or when a respondent
is self-completing a questionnaire. There is evidence (Calnan, 1987) that those
with higher levels of education are able to produce more elaborated definitions
of health; there may be therefore systematic differences between social groups
in their understanding of questions and hence in the meaning of their answers.
Blaxter (1990) believes that this distinction
does not hold when people are encouraged to elaborate on their ideas in an
in-depth interview, but warns that respondents do not have the time to do this
in most surveys. It may be that less well-educated respondents are more likely
to draw on narrower concepts of health in the survey setting.
3.5 Validity
Self-assessed general health has been shown by
studies in several countries to be a good predictor of mortality. In the UK, a
follow-up study to the Health and Lifestyles Survey (HALS2) showed that, after
the existence of a serious disease, self-reported poor health was one of the most powerful predictors of
mortality. Among those who said in their 1984 interview that they had no
serious disease, men at all ages who assessed their general health as ‘fair’ or
‘poor’ were twice as likely as those who rated it as ‘excellent’ or ‘good’ to
die in the seven years between the initial and the follow-up study. For women,
self-assessment was a good predictor only for those aged 55 or over (Blaxter
& Prevost, 1993).
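A comparison of the form ‘twice as likely to die’ is a relative risk: the proportion dying in one group divided by the proportion dying in a reference group. The Python sketch below shows the arithmetic only. The counts are hypothetical, chosen so that the ratio comes out at two, and are not the HALS figures; the function name is also an assumption. The sketch is an unadjusted ratio and makes no allowance for age or other controls.

```python
def relative_risk(deaths_group, n_group, deaths_reference, n_reference):
    """Ratio of the death rate in one group to that in a reference group."""
    if n_group <= 0 or n_reference <= 0:
        raise ValueError("group sizes must be positive")
    if deaths_reference == 0:
        raise ValueError("reference group has no deaths; ratio undefined")
    risk_group = deaths_group / n_group
    risk_reference = deaths_reference / n_reference
    return risk_group / risk_reference

# Hypothetical counts, invented for illustration only:
# 40 deaths among 500 rating their health 'fair' or 'poor',
# 60 deaths among 1,500 rating it 'excellent' or 'good'.
rr = relative_risk(40, 500, 60, 1500)
print(f"relative risk: {rr:.2f}")
```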
Similar studies in Sweden (Sundquist and Johansson, 1997), the USA (Berkman and Syme, 1979; Idler et al. 1990) and France (Grand et al. 1990) have shown similar results. The Swedish study had a very large sample of almost 40,000 respondents. It found that poor self-reported health status was a significant risk factor for mortality for men and women of all ages, when the effects of age, marital status and low socio-economic status (measured by educational level and tenure status) were controlled for.
The validity of questions on self-reported health has also been tested by comparing them with other measures of health. In an analysis of the 1984 Health and Lifestyle Survey results, Blaxter (1990) constructed a health index based on four dimensions: the presence or absence of disease, the presence or absence of illness (as measured by reported symptoms), fitness and unfitness, and a measure of perceived well-being. The presence or absence of disease was partially validated by nurse assessments and by details of medication reported by respondents. The fitness/unfitness dimension was based on physiological measurements such as Body Mass Index, blood pressure and respiratory function. Blaxter found a high level of agreement between self-reported general health and the index at the two ‘extremes’; that is, those whose measured health was best and worst (as measured on the four dimensions) were most likely to give an appropriate self-assessment.
Self-assessed health has also been shown to be
associated with doctor consultation rates, with the mean rates of consultation
increasing as self-perceptions of health deteriorate. However, Blaxter (1985)
found that, once social class was taken into account, self-assessments and
consultation rates were clearly associated only for those belonging to the
manual social classes; she suggests that not consulting is part of the
definition of being in good health for these groups.
Evidence suggests that there is an overall
tendency for respondents to give positive rather than negative assessments of
their health, but as with other measures discussed here, there are systematic
variations between the assessments given by people in different social groups.
Evidence from a number of surveys suggests that older people have lower
expectations of health, and are more likely to make a positive assessment of
their health than a younger person with similar illnesses or symptoms might;
they consider themselves healthy despite the difficulties associated with
ageing. Similarly, people with a disability can give assessments of their
health as good, ‘despite the disability’. Those in families where the head of
household is defined as belonging to the manual social classes are more likely
to make a more pessimistic assessment than objective measures suggest is
appropriate (Blaxter, 1990).
People in different social groups also emphasise
different dimensions in their definitions of health; functional ability is more
likely to be mentioned by older people, and fitness by younger people.
Psychosocial well-being is stressed more by people in the middle years, by
women and by more highly educated respondents.
3.6 Reliability
Data from the 1997 HEMS (Bridgwood et al. 1998)
indicate that individual changes in self-rated health are associated with
objective changes in health. The 1997 survey was a follow-up, in which
respondents who were first interviewed in 1996 were interviewed for a second
time in 1997. As well as being asked about their health, they were also asked
whether they had experienced one or more of a series of events in the last
year. Those who reported a serious illness, injury or operation since their
first interview were three to four times as likely as other respondents to give
an assessment of their health in 1997 which was more than one category ‘poorer’
than in 1996.
Blaxter (1990) warns that people are often inconsistent in their assessments of their own general health. One of the reasons for this may lie in the answer categories available. The cognitive work carried out for the 2001 Census and the 1997 HEMS pilot explored respondents’ understanding of the different answer categories. The ‘fairly good’ category in the GHS question and the ‘fair’ category in the HSE question were least easy to define; ‘fairly good’ was considered to be a vague term, while ‘fair’ was seen as an average of good and bad days. Those who described their health as ‘fair’ in the 1996 HEMS were most likely to have changed their assessment; fewer than half used the same description in 1997. Similarly, about one in six of those who described their health as ‘good’, and more than a quarter of men and more than two fifths of women who said it was ‘bad’ in 1996, opted for ‘fair’ in 1997. If the term ‘fair’ is difficult to define clearly, then it is perhaps not surprising that some respondents change their assessments over time. Similarly, some respondents had difficulty distinguishing between the ‘very good’ and ‘good’ categories in the HSE question; some movement between these two categories is therefore perhaps to be expected.
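Movement between answer categories across two interviews of the same respondents can be summarised by the proportion giving the same answer and by Cohen's kappa, which discounts the agreement expected by chance from the marginal distributions. The Python sketch below is one possible way of computing both; it is not the method used in the HEMS analysis, the function name is hypothetical, and the paired answers are invented. With interviews a year apart, disagreement mixes real change in health with inconsistency of response, so a low value cannot be read as unreliability alone.

```python
from collections import Counter

def agreement_and_kappa(wave1, wave2):
    """Return (proportion agreeing, Cohen's kappa) for paired answers."""
    if len(wave1) != len(wave2) or not wave1:
        raise ValueError("need two non-empty lists of equal length")
    n = len(wave1)
    observed = sum(a == b for a, b in zip(wave1, wave2)) / n
    m1, m2 = Counter(wave1), Counter(wave2)
    # Chance agreement: product of the two marginal shares, summed over categories.
    expected = sum(m1[c] * m2[c] for c in set(m1) | set(m2)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa undefined when all answers fall in one category")
    return observed, (observed - expected) / (1 - expected)

# Hypothetical paired answers from ten respondents, for illustration only.
first = ["good"] * 4 + ["fair"] * 4 + ["bad"] * 2
second = ["good", "good", "good", "fair",
          "fair", "fair", "good", "bad",
          "bad", "fair"]
observed, kappa = agreement_and_kappa(first, second)
print(f"same answer: {observed:.0%}, kappa: {kappa:.3f}")
```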
3.7 Ease of interpretation
Responses to questions on self-reported general
health offer a simple summary measure with an intuitively comprehensible
meaning, which can be used to compare different social and health status
groups. They give an overall summary assessment of health, although it is
difficult to know whether any differences in reported health for a given
population over time are real differences or a difference in the relative
weight attached to the component dimensions of health, particularly as these
dimensions are implicit rather than explicit. When analysing differences
between social groups, it should be borne in mind that there are systematic
differences in the dimensions which respondents have in mind when making an
assessment of their own health, and in the extent to which these assessments
correlate with more objective measures.
4. Long-standing and limiting long-standing illness questions
4.1 Use of the question in the UK
Surveys in the UK which have included a question on longstanding illness include:
The Health Survey for England;
The General Household Survey;
The Health and Lifestyles Survey;
The Health Education Monitoring Survey;
The ONS Psychiatric Morbidity Surveys;
The National Child Development Survey;
The Survey of the Physical Health of Prisoners;
The National Survey of Sexual Attitudes and Lifestyles;
The 1991 and 2001 Censuses.
Questions on long-standing illness or disability have been included in the General Household Survey since 1971 (with a separate question on limiting long-standing illness since 1974), with a break in 1977 and 1978, which provides time series data spanning a period of over 25 years. A question on limiting long-standing illness was included in the Census for the first time in 1991, and repeated in 2001, in part to obtain a better indicator of the likely need for health services for small areas than could be produced from survey data.
4.2 Range of questions used in the UK.
The Health Survey for England, the General Household Survey, and many other surveys, use the following question:
[*] Do you have any long-standing illness, disability or infirmity? By long-standing I mean anything that has troubled you over a period of time or that is likely to affect you over a period of time?
1 Yes
2 No
The GHS also asks whether the condition is a limiting one:
[*] Does this illness or disability (Do any of these illnesses or disabilities) limit your activities in any way?
1 Yes
2 No
The 1991 Census used the following question:
Do you have any long-term illness, health problem or handicap which limits your daily activities or the work you can do? (Include problems which are due to old age)
1 Yes
2 No
The 2001 Census used a slightly different question:
Do you have any long-term illness, health problem or disability which limits your daily activities or the work you can do? (Include problems which are due to old age)
1 Yes
2 No
These core questions are sometimes supplemented with further questions on Activities of Daily Living (OPCS, 1994), by a checklist of symptoms (Health Promotion Trust, 1987), or by asking “What is the matter with you?” (General Household Survey).
The question asking for details of illness is
sometimes asked only as a courtesy with no intention of analysing the
responses, as in most years of the GHS; at other times, interviewers are asked
to probe the nature of the self-reported illness or disability fully. This was
done in 1988, 1989, 1994 and 1996 for the GHS, for all years of the Health
Survey for England, for the first Health and Lifestyles Survey and for the
Survey of the Physical Health of Prisoners.
The dimensions of health covered by the questions are not explicit, but there is some evidence that they measure physical morbidity more successfully than psychiatric morbidity.
Answers to these questions are used to produce
estimates for the prevalence of self-reported long-standing and limiting
long-standing morbidity among people living in private households. Long time
series, such as those produced by the GHS, provide a point of comparison for
local, ad hoc or irregular surveys. International comparisons are possible, as other
countries use similar questions, although prevalence estimates will be
influenced by cultural understandings of illness, disability and normal activities. The data have also been
used to produce estimates of Healthy Life Expectancy (Bone et al. 1995b) and
combined with other measures, including more objective measures such as blood
pressure and lung function, to produce a summary scale of health (Blaxter,
1987).
4.3 Use in surveys of the general population
Questions on long-standing illness and disability
are short and easy to administer and therefore take little interview time. They
can, however, be sensitive to changes in question wording and to mode of
administration. For example, the overall prevalence of limiting long-term
illness as measured by the 1991 Census among those resident in private
households was 12%, significantly lower than the estimate of 18% from the 1991
GHS. The authors of the 1992 GHS report argue that differences in methodology
accounted for some of the difference; the census information was collected by
self-completion, usually by one member of the household and related to one
night in April, while for the GHS all adult members of the household are
interviewed individually by a trained interviewer and fieldwork goes on throughout
the year (Thomas et al. 1994). The change in wording to include reference to ‘the work you can do’ may also have contributed to the discrepancy.
A comparison of responses to the Census question
and to an identical question on the 1991 Census Validation Survey (CVS) found a
‘gross error’, that is the proportion of times the answers to the two studies
were different, of 4.9% (Heady et al. 1996). Higher estimates of prevalence
were obtained in the Census Validation interview than from completed Census forms.
The authors of the CVS report point out, however, that the differences may
reflect genuine changes in health between the Census and the
survey, or lack of knowledge on the Census form-filler’s part about the health
of other members of the household. The comparison between the Census and GHS
questions, together with several other studies, shows that quite small
differences in survey design, question wording and possibly question order
appear to influence response (OPCS, 1994). In this regard some of the most
prominent effects are:
·
Surveys which attempt to measure both limiting
and non-limiting chronic illness with one question tend to produce lower
overall estimates of prevalence than those which ask two separate questions.
·
Asking whether respondents ‘have’ a long-standing
illness produces higher estimates than asking whether they ‘suffer’ from an
illness; some people may answer ‘no’ to the latter on the grounds that they are
not actually suffering (Goddard, 1990).
·
Asking whether an illness limits activities
compared with ‘people of your age’ produces lower estimates than asking whether
it limits them ‘in any way’; it is believed that elderly people in particular
would say no because most of their contemporaries were as limited in their activities
as they were (OPCS, 1975).
·
Using a checklist of symptoms stimulates
reporting (Blaxter, 1987). One advantage of a checklist is that it provides all
informants with a common frame of reference; it is possible, however, that it
might produce overestimates of prevalence as informants who are not sure
whether they have a condition might include themselves (Goddard, 1990). A
checklist cannot be used to produce accurate prevalence estimates for more
serious diseases as sufferers are more likely than others to be in hospital or
unavailable for interview (Blaxter, 1987).
·
Analysis of GHS data suggests that asking
informants for full details of their illness before they are asked whether the
illness limits their activities might result in lower estimates of limiting long-standing illness or
disability. The authors of the 1988 report suggest that some informants may be
reluctant to say that an illness limits them when interviewers know what it is;
they also note, however, that unexplained fluctuations in the levels of
self-reported limiting illness were a feature of GHS data throughout the 1980s
(Foster et al. 1990).
·
Asking interviewers to use directed probes,
rather than generalised ones, can result in marginally more codeable conditions
being reported. The cognitive question-testing carried out for the 1997 HEMS
pilot found that respondents were able to define the terms ‘illness’ and
‘disability’ without difficulty, but that some had difficulty in understanding
‘infirmity’. For some respondents, infirmity was synonymous with old age.
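The ‘gross error’ measure quoted above from the Census Validation Survey is simply the share of paired cases in which the two sources disagree. The following is a minimal sketch of that calculation, not the CVS authors' own procedure; the function names and the ten paired answers are hypothetical and chosen only for illustration. It also shows a net difference in prevalence, since gross error alone does not say in which direction the answers moved.

```python
def gross_error_rate(census_answers, validation_answers):
    """Share of paired cases whose answers differ between the two sources."""
    if len(census_answers) != len(validation_answers):
        raise ValueError("answer lists must be paired one-to-one")
    if not census_answers:
        raise ValueError("at least one paired case is required")
    discordant = sum(a != b for a, b in zip(census_answers, validation_answers))
    return discordant / len(census_answers)


def net_difference(census_answers, validation_answers, positive="yes"):
    """Prevalence in the validation source minus prevalence in the census."""
    n = len(census_answers)
    census_prevalence = sum(a == positive for a in census_answers) / n
    validation_prevalence = sum(b == positive for b in validation_answers) / n
    return validation_prevalence - census_prevalence


# Hypothetical paired answers to a limiting long-term illness question
# (invented for illustration; these are not CVS data).
census = ["no", "no", "yes", "no", "no", "yes", "no", "no", "no", "no"]
validation = ["no", "yes", "yes", "no", "no", "yes", "no", "no", "no", "no"]

print(f"gross error: {gross_error_rate(census, validation):.1%}")
print(f"net difference: {net_difference(census, validation):+.1%}")
```

A positive net difference corresponds to the pattern reported above, in which the validation interview yields higher prevalence than the completed forms.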
4.4 Validity
Assessments of the validity of questions on
long-standing illness or disability have been based on comparisons with
standardised mortality ratios (SMRs), the results of clinical examination and
doctors’ reports. They show a high level of agreement for overall prevalence,
although the level of agreement varies for specific conditions and for
different social groups. Commentators note that discrepancies do not
necessarily indicate that data from self-reported sources is inaccurate;
informants may not have brought a condition to the attention of a doctor,
medical records could be inaccurate, doctors may not have informed patients of
their diagnosis, and lay descriptions may differ from those given by doctors
(White, 1995).
A comparison of age-standardised ratios for
overall prevalence of self-reported chronic sickness and standardised mortality
ratios, carried out for the first GHS in 1971, showed that for males, with the
exception of Scotland, regions where SMRs were higher than expected also had
higher than expected age-standardised ratios of long-term illness. This was
also true for limiting long-standing illness and disability. There was less
apparent correspondence between the two measures for females (OPCS, 1975). A similar
comparison carried out at local authority level on 1987 Census test data showed
correlations of 0.80 for men and 0.82 for women between all-cause mortality (as
measured by standardised mortality ratios) and limiting long-standing illness
(Charlton et al. 1994).
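The source does not spell out how the age-standardised ratios were calculated. A common approach, and the one normally used for SMRs, is indirect standardisation: the cases observed in an area are divided by the number expected if national age-specific rates applied to the area's own population. The sketch below assumes that method; the rates, population counts and observed total are hypothetical.

```python
def standardised_ratio(observed_cases, local_population, reference_rates):
    """Indirectly standardised ratio: 100 * observed / expected cases.

    expected = sum over age groups of (local population * reference rate).
    A value above 100 means more cases than the area's age structure predicts.
    """
    expected = sum(
        local_population[age] * reference_rates[age] for age in local_population
    )
    return 100 * observed_cases / expected


# Hypothetical national prevalence of long-standing illness by age group.
reference_rates = {"16-44": 0.10, "45-64": 0.25, "65+": 0.50}

# Hypothetical regional sample sizes by age group, and observed cases.
region_population = {"16-44": 500, "45-64": 300, "65+": 200}
observed = 270

ratio = standardised_ratio(observed, region_population, reference_rates)
print(f"age-standardised illness ratio: {ratio:.0f}")
```

Computing the same kind of ratio for mortality and for self-reported illness in each area gives two series on a common scale, which is what allows them to be correlated across regions or local authorities.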
Interview data from the 1984 Health and
Lifestyles Survey yielded an estimate of 30% overall prevalence of
self-reported long-standing illness; information collected from respondents
during a subsequent nurse session, which included recording details of
medication, increased this estimate by only two percentage points (Blaxter,
1987).
Evidence from several sources indicates that
these questions underestimate the prevalence of long-standing illness and
disability among the elderly; for example, a proportion of informants who
reported difficulties with Activities of Daily Living nevertheless said they had
no chronic illness or disability. Even when there is no reference to ‘people of
your age’, it appears that elderly people regard limitations in their daily
activities, particularly difficulties with eyesight and hearing, as a normal
part of growing old, not as evidence of illness or disability (Martin et al.
1988).
However, when the data from the 1991 Census
Validation Survey were analysed, it was found that the proportion of those with
a disability who reported a long-standing condition actually increased with
age; the overall underestimation of chronic conditions among the elderly arose
because the number who are disabled is much
greater among the elderly than other age-groups,
so that a slightly smaller proportionate under-recording produces a much larger
absolute effect (Heady et al. 1996).
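The arithmetic behind this point can be made concrete. In the sketch below all figures are hypothetical: the older group has a lower rate of under-recording among disabled people, but because many more of them are disabled, the number of missed cases per 1,000 people is far higher.

```python
def missed_cases(group_size, disability_rate, under_recording_rate):
    """Disabled people in a group who do not report a long-standing condition."""
    disabled = group_size * disability_rate
    return disabled * under_recording_rate


# Hypothetical figures per 1,000 people in each age group (not CVS data).
younger = missed_cases(1000, disability_rate=0.05, under_recording_rate=0.30)
older = missed_cases(1000, disability_rate=0.40, under_recording_rate=0.25)

print(f"younger group: about {younger:.0f} missed cases per 1,000")
print(f"older group:   about {older:.0f} missed cases per 1,000")
```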
Supplementing the questions on long-standing
illness with questions on Activities of Daily Living and on eyesight and
hearing, as is done periodically on the GHS, is one way of improving estimates
of prevalence for the elderly, as the estimates from the two different measures
could be cross-referenced at the case level.
When comparing self-reported morbidity among
different groups in the population, it must also be remembered that some people
are more troubled by a certain kind of symptom than others, and that the need
to limit activities will depend on what people usually do (Bennett et al.
1996). Informants may also vary in the amount of information they choose to
give or in their knowledge of the extent and nature of their ill-health
(Blaxter, 1990).
Comparisons have been made for estimates for
specific conditions, as well as for overall prevalence. Blaxter (1990) found an
80% agreement between self-reported data and clinical assessments on the
presence or absence of specific chronic conditions. The majority of the serious
conditions which were reported were treated (and therefore presumed to be
medically diagnosed); conditions which were most likely to be untreated were
conditions such as varicose veins, migraine, haemorrhoids and ‘back trouble’.
Those belonging to a non-manual social class were more ready to declare a
chronic condition, even if it was not functionally troublesome or accompanied
by symptoms. Informants in manual social classes, particularly men, were likely
to say they had a named disease only if it was actually troublesome; this was
particularly true for mental disorders. Very few of those with a severe
condition said it did not affect their lives (Blaxter, 1990). Analysis of the
1987 Census Test results showed the highest correlations at Local Authority
level between named conditions and standardised mortality ratios were for
circulatory diseases (Charlton et al. 1994).
4.5 Reliability
There is little or no data on how well the
questions on self-reported health problems or disability perform using a
test-retest methodology. There is some evidence on reliability, however, from
the 1997 HEMS; respondents who reported a serious illness, injury or operation
in the life events section of the interview were twice as likely as others to
give an assessment of self-reported morbidity which was poorer in 1997 than in
1996 (Bridgwood et al. 1998).
4.6 Ease of interpretation
Data from the GHS enable trends over time to be
measured; these show year-to-year fluctuations, but the overall trend for both
long-standing and for limiting long-standing illness and disability is upwards.
Caution needs to be exercised, however, when interpreting changes in the
prevalence of self-reported morbidity as changes over time may reflect changes
in people’s expectations of health as well as the prevalence or duration of
sickness (Bennett et al. 1996).
5. Empirical comparisons
5.1 Introduction
Because we are dealing with surveys of the
general population, we rarely have objective data on the health status of
individuals in the sample. Thus, we must rely on self-reported measures of
health status to evaluate other self-reported measures of health status, a
circularity which it is hard to avoid when using general population survey
data.
5.2 Context effects
There are a number of reasons why questions which
aim to measure the same concept produce different estimates for the same
population; even a relatively small difference in the wording of the question
or of the response categories, as on the self-assessed health questions on the
HSE and GHS, can have a significant
effect. Consistency of results across surveys cannot, however, be guaranteed,
even if identical questions are used, because of the context in which the
questions are asked. There is a substantial body of methodological and survey
literature demonstrating such context effects for a wide range of different
types of questions. Secondary analysis of the HSE, GHS and other surveys
provides evidence of the scale of context effects for three of the general
health measures under consideration: self-assessed general health,
long-standing illness and limiting long-standing illness.
Thus, it can be seen that identical questions do
not produce identical estimates – although any differences tend to be small.
Differences could emerge for a number of reasons; if, for example, the surveys
had differing approaches to the taking of proxy information, or if they were
affected by different types of non-response bias. On the three surveys
analysed, however, questions on health would not be answered by proxy as they
are opinion questions, and, in general, all three have similar characteristics
of non-response (younger adults tend to be under-represented). It is quite
possible, therefore, that the observed variation may occur because of the
context in which the questions are asked. It might be expected that there would
be a difference between answers to questions asked on a general survey, and those asked on a specific health
survey, but there was also a difference between the two general surveys, the
GHS and the Omnibus. Despite both of these surveys covering several different substantive
topics, they are quite different in their actual content. The GHS carries
relatively long question modules on major aspects of a person’s life, such as
housing tenure, education and employment, while the Omnibus carries a selection
of much shorter modules that could be on a wide variety of topics. It may be
that the latter survey does not encourage as much consideration of health
issues
before the answer is given, but this can only be
speculation.
5.3 Service use
One way of validating health measures is to
examine how they relate to use of health services. In this section results from
the 1996 GHS are used to show the relationship between the health measures
included in that survey, and whether or not a doctor had been consulted in the
two weeks prior to interview. There is, of course, no reason to expect that all
those reporting a health problem will
have consulted a doctor recently, particularly if the health problem is of a
long-standing nature, but the proportions of those who have consulted do give
some indication of the validity of the measure.
22% of
men and 30% of women with a long-standing illness or disability had consulted a
doctor in the two weeks before interview, while slightly higher proportions of
those with a limiting long-standing illness had done so. A better predictor of
doctor consultation (though not necessarily ill health) appears to be the
question on self-assessed general health. Around a third (35%) of men and
two-fifths (42%) of women who said that their health in the last 12 months had
not been good, had consulted a doctor in the previous two weeks. A fifth (19%)
of men and a quarter (24%) of women who said that their health had been fairly
good had consulted a doctor, while only 9% of men and 14% of women who reported
good health had done so.
Thus, for these three general health measures
(all asked on the GHS in 1996), the expected relationship between poor health
and doctor consultations was observed. However, of all three measures, the
presence of a long-standing illness or disability showed the weakest
association.
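One simple way to summarise the gradient for self-assessed health is to compare consultation rates at the two ends of the scale. The sketch below uses the 1996 GHS percentages quoted above; the choice of a rate ratio and a percentage-point difference as summaries is an assumption made here for illustration, not an analysis reported in the review. The same comparison cannot be made for long-standing illness from the figures given, because the consultation rate among those without such an illness is not quoted.

```python
# Percentage consulting a doctor in the previous two weeks, by self-assessed
# general health over the last 12 months (1996 GHS figures quoted in the text).
consulted = {
    "men": {"good": 9, "fairly good": 19, "not good": 35},
    "women": {"good": 14, "fairly good": 24, "not good": 42},
}


def consultation_ratio(rates, exposed="not good", baseline="good"):
    """How many times more likely the exposed group was to have consulted."""
    return rates[exposed] / rates[baseline]


def consultation_gap(rates, exposed="not good", baseline="good"):
    """Difference in consultation rates, in percentage points."""
    return rates[exposed] - rates[baseline]


for sex, rates in consulted.items():
    print(
        f"{sex}: ratio {consultation_ratio(rates):.1f}, "
        f"gap {consultation_gap(rates)} percentage points"
    )
```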
The ability of health measures to predict use of
health services is clearly important from a policy perspective. An instrument
which discriminates well between those likely and those unlikely to use health
services would clearly be of benefit for planning for future demand. It should
be borne in mind, however, that the associations discussed here are not really
predictive relationships in this sense. The use of services reported here
refers to GP consultations prior to completion of the general health measures.
It is just as plausible that consulting a doctor affects how one subsequently
rates one's general health as that poor self-rated health leads to the
consultation. In order to assess the ability of general health measures to
predict future service use, a longitudinal design would be necessary.
5.4 Distribution of self-reported general health by age and sex
A direct comparison between the prevalence of
self-reported good health, as measured by the HSE on the one hand, and the GHS
on the other, is not possible because of these differences in response scale
format. 'Good' health is normally derived on the HSE by combining the
categories 'very good' and 'good'; this category almost certainly includes some
of those who would rate their health as 'fairly good' in response to the GHS.
More than three-quarters of respondents to the 1993 and 1996 HSE, for example,
rated their health as 'very good' or 'good', compared with between half and
three-fifths of GHS respondents who chose the 'good' option. This in itself
shows that how people rate their health depends crucially on how the question
is framed. In addition, the GHS question specifies a time period, 'in the last
12 months', while the HSE does not.
All surveys, however, show a similar pattern of
association between self-reported
general health, age and sex. Men were
consistently more likely than women to say that their health was good, although
the differences were not significant on the two Health Surveys for England.
Similarly, all the surveys showed a strong relationship between self-reported
health and age, with the proportion of respondents who reported being in good
health declining with age.
Differences between the proportions of men and
women who said they had 'bad' or 'very
bad' health on the HSE, or 'not good' health on the GHS, were small and not
always statistically significant. Not surprisingly, however, the likelihood of
reporting poor health increased with age on all surveys.
5.5 Self-reported long-standing illness, disability or infirmity by age and sex
All surveys showed a clear association between
the prevalence of long-standing
illness, disability or infirmity and age. Below
the age of 55, between a fifth and two-fifths of respondents reported a chronic
condition; among those aged 55 and over, between a half and two-thirds said
that they had such an illness, disability or infirmity. The prevalence of
long-standing illness as estimated by the HSE was higher than for the GHS;
authors of previous HSE reports have suggested that respondents to a health
survey may be more likely than those participating in a general survey to
report an illness due to the subject matter of the questionnaire
stimulating them to think more closely about all
aspects of their health. On those surveys which included a question on limiting
long-standing illness, between a tenth and a half of respondents said they had
such a condition, the proportion increasing quite steeply with age. On the 1994
GHS, for example, 10% of men and women aged 16-24 said they had a limiting
illness, compared with 44% of men and 48% of women aged 75 and over.
5.6 Association between self-reported health and long-standing illness
All five surveys under consideration included
questions on self-reported general health and self-reported long-standing
illness although, as noted earlier, the wording of the question on general
health and the response categories used varied across surveys. All surveys show
an association between the two measures, with respondents reporting good health
much less likely than those whose health is not good to report a long-standing
illness or disability. Thus, for example, only 19% of respondents to the
1994 GHS with 'good' health said they
had a chronic illness or disability, compared with 86% of those with 'not good'
health. Similarly, 97% of respondents to the 1996 HSE with 'bad' or 'very bad'
health reported a long-standing complaint, compared with 28% of those whose
health was 'very good' or 'good'.
While this represents a high degree of congruity
between these two instruments, it should be noted that a significant minority
of respondents whose self-reported health was ‘good’, nevertheless said they
had a chronic illness or disability, suggesting that the two questions are
measuring somewhat different aspects of health. The key to this difference
probably lies in the fact that the self-rated general health question contains
an implicit valuation component while the long standing illness question does
not. Therefore, while someone may report having a long standing illness the
same person may nevertheless report their general health as being very good,
because they may see the long standing illness as minor or unproblematic (e.g.
minor skin complaints or correctable visual problems).
6. Implications of the UK review on the Minimum European Health Module
The first question in the Minimum European Health Module is:
How is your health in general?
Very good
Fair
Bad
Very bad
This question is very similar to that used in the
national health survey in England. Therefore all the comments made above in
terms of how it is administered, how it is understood and how it is answered
are relevant.
The second question in the Minimum European Health Module is:
Do you have any long-standing illness or health problem?
Yes
No
There is no comparable question to this in the UK surveys. All the UK surveys add the words “disability” or “infirmity” or give a reference period. The expression “health problem” only occurs in the census questions, not in the large population sample surveys.
The third question in the Minimum European Health Module is:
For at least the past six months, have you been limited in activities people usually do because of a health problem?
Yes
No
This question asks respondents to concentrate first on the last six months, then on whether they have had a limitation in activity during this time, then on whether it is an activity people usually do, and finally on whether it is a result of a health problem. In the UK, questions on limiting long-standing illness, in population surveys at least, tend to ask this as two questions: first to establish whether there is a problem and then to establish its consequences in terms of limitations in activity. In these questions, and in the census question, which does put the concepts together, the focus is on limitation in the respondent's own activities (including, in the census, the work that they can do), not on a comparison with what people usually do.
7. Conclusions
Subjecting any survey question to rigorous conceptual and methodological scrutiny is bound to throw up inconsistencies in interpretation and response. This is especially apparent when the task relates to asking people to rate or evaluate their health.
All the evidence from the UK experience suggests that, at the most fundamental level, it is important to have the same question wording across surveys, and that the person answering should be helped to understand what we mean by health, perhaps by means of a preamble. This would help overcome the problems of differential response by age, sex and level of education. We should also be aware that when we compare data collected in different contexts, by subject and by proxy, and with different modes of administration, these differences have an effect on responses.
References
Bennett N. et al (1995) Health Survey for England 1993, London: HMSO
Bennett N, Jarvis L, Rowlands O, Singleton N & Haseldon L. (1996) Living in Britain: results from the 1994 General Household Survey, London: HMSO
Blaxter M. (1987) ‘Self-reported health’ in The Health and Lifestyles Survey London:
Health Promotion Research Trust
Blaxter M. (1990) Health and Lifestyles. (London: Routledge)
Bone M. (1995) Trends in dependency among older people in England, London: HMSO
Bone M, Bebbington AC, Jagger C, Morgan K & Nicolaas G. (1995) Health expectancy and its uses. London:
HMSO
Bowling A. (1991/1997) Measuring Health: a review of quality of life measurement scales. Milton
Keynes: Open University Press
Breeze E. et al (1994) Health Survey for England 1992 London: HMSO
Bridgwood, A. (1993) Baseline ‘93: health status and performance
Bridgwood A & Malbon G. (1995) Survey of the Physical Health of Prisoners
1994. London: HMSO
Bush JW, Chen MM, Patrick DL (1972) Social
Indicators for Health Based on Function Status and Prognosis. Proceedings of the American Statistical
Association Social Statistics Section: 71.
Cadman D, Boyle MH, Offord DR, Szatmari P,
Rae-Grant NI, Crawford J, Byles J (1986) Chronic illness and functional
limitation in Ontario children: findings of the Ontario Child Health Study CMAJ 135(7):761-7
Krosnick J (1999) Survey Research, Annual Review of Psychology 50, 537-567.
Department of Health (1992) The Health of the Nation: a strategy for health in England London:
HMSO
Donovan, JL, Frankel SJ and Eyles JD (1993)
Assessing the need for health status measures, Journal of Epidemiology and Community Health, 47,158-162.
Foster K, Wilmot A & Dobbs, J. (1990) General Household Survey 1988 London: HMSO
Franks P, Gold MR and Clancy CM (1996) Use of care and subsequent mortality: the importance of gender. Health Serv Res Aug;31(3):347-63.
Goddard E. (1990) ‘Measuring morbidity and some of the factors associated with it’, in
Health and Lifestyle surveys: towards a common approach: report of a
workshop held on 7 November 1989 organised by the HEA and OPCS. (London: HEA
and OPCS)
Goddard E & Savage D. (1994) General Household Survey: People aged 65 and over: GHS No. 22
Supplement A. London: HMSO
Grand A, Grosclaude P, Bocquet H, Pous J,
Albarede (1990) Disability, psychosocial factors and mortality among the
elderly in a rural French population Journal
of Clinical Epidemiology 43(8):773-82.
Kind, P (1995) Measuring the reliability of individual
assessments of the life quality associated with health states. Survey Methods Centre Newsletter Vol. 15
No 2.
Lawton MP & Brody EM (1969) Assessment of
older people: self-maintaining and instrumental activities of daily living, Gerontologist, 9, 179-186.
Long A (1993) General Health Measures - an introduction to multidimensional profiles, Paper prepared for the sub-group of the Chief Medical Officers’ Health of the Nation Survey.
Sundquist J & Johansson SE (1997) Indicators
of socio-economic position and their relation to mortality in Sweden. Social Science and Medicine 45(12), 1757-66
The Health and Lifestyle Survey (1987) London: Health Promotion Research Trust
Thomas M, Goddard E, Hickman M and Hunter P
(1994) General Household Survey 1992 London:
HMSO
Thomas R & Purdon S (1994) Survey Methods Centre Newsletter, 14(2) National Centre for Social Research 1994
White A (1995) Measuring subjective health status. (Unpublished paper: Social
Survey Division).
White A et al (1993) Health Survey for England 1991 (London: HMSO)