IRE
Information Retrieval Experiment
Retrieval system tests 1958-1978
chapter
Karen Sparck Jones
Butterworth & Company
(1) that artificial indexing languages do not perform strikingly better than
natural language;
(2) that complex structured descriptions do not perform strikingly better
than simple ones;
(3) that the number of searching keys is more important than their individual
quality;
(4) that the characterization of queries is more important than that of
documents;
(5) that formal properties of the data may be turned to advantage, as in
weighting schemes.
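The weighting idea in conclusion (5) can be illustrated by a minimal sketch of collection frequency (inverse document frequency) weighting, which uses a purely formal property of the data, how many documents a term occurs in, to favour rarer and hence more discriminating search keys. The function name and the toy collection below are illustrative, not drawn from any particular test described here.

```python
import math

def idf_weights(doc_term_sets):
    """Inverse document frequency: terms occurring in fewer
    documents receive higher weights.

    doc_term_sets: list of sets, each holding one document's terms.
    """
    n_docs = len(doc_term_sets)
    df = {}                      # document frequency per term
    for terms in doc_term_sets:
        for t in terms:
            df[t] = df.get(t, 0) + 1
    # Classic form log(N / n_t): a term present in every
    # document contributes nothing to discrimination.
    return {t: math.log(n_docs / n) for t, n in df.items()}

docs = [{"retrieval", "system"},
        {"retrieval", "test"},
        {"retrieval", "index"}]
w = idf_weights(docs)
# "retrieval" occurs in all three documents: weight log(3/3) = 0
```

The point of the sketch is only that such weights are computed from the collection itself, with no appeal to human judgements of term quality.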
But of course, as these statements all refer only to mechanism variables,
they can have real meaning only by being related to their environment of
data parameters; and the main failure of information retrieval research has
been in determining those environment properties significant for system
operation and in establishing the relationship between data and mechanism
variables. Cleverdon120 in 1971 maintained that 'it is, in theory, possible to
design and operate a system that will achieve a given satisfactory
performance, at the least possible cost, in a particular environment'. But he
also observes that while it is possible, in any given situation, to design an
effective system, 'a problem that is still unsolved is how it is possible to
predicate exactly what a situation will be . . . Designing for the hypothesised,
but probably non-existent, "average" user, we may produce systems that
satisfy no-one'. (pp. 67-8)
Some advance in this area since 1971 can in fact be detected: a good deal
of rather crude evidence about systems has been gathered; and some system
models have been proposed which have stood up to initial testing, for
example the Robertson121 and van Rijsbergen110 probabilistic theories. But
it remains the case that our ignorance is large: to take a conspicuous instance,
we have virtually no information about the real recall levels of large online
search systems, or about real recall for many retrieval schemes investigated
by research workers.
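The flavour of the probabilistic theories mentioned above can be conveyed by a sketch of the Robertson-Sparck Jones relevance weight, which scores a term by contrasting its incidence in relevant and non-relevant documents; the 0.5 smoothing constants follow the commonly used form of the formula. The parameter names and the example counts are illustrative assumptions, not figures from any test reported here.

```python
import math

def rsj_weight(r, n, R, N):
    """Robertson-Sparck Jones relevance weight for one term.

    r: relevant documents containing the term
    n: documents in the collection containing the term
    R: total relevant documents
    N: total documents in the collection
    The added 0.5s smooth away zero counts.
    """
    return math.log(((r + 0.5) / (R - r + 0.5)) /
                    ((n - r + 0.5) / (N - n - R + r + 0.5)))

# Hypothetical counts: a term in 20 of 1000 documents,
# but in 8 of the 10 known relevant ones, is weighted
# far more heavily than one spread evenly through the file.
concentrated = rsj_weight(8, 20, 10, 1000)
diffuse = rsj_weight(1, 100, 10, 1000)
```

Such weights presuppose estimates of the very quantities, notably recall-related relevance counts, that the text notes are largely unknown for operational systems, which is one reason initial testing of these models remained limited.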
12.9 The current state of retrieval system understanding
After an evaluative survey of the retrieval test literature, van de Water et
al.122 concluded that the standards and content of tests were slightly higher
than those found in a survey carried out five years earlier, but that information
science was nowhere near established as a science. This is certainly true; but
perhaps this is aiming too high too soon. A more reasonable question is
whether retrieval research has any more modest, but nonetheless material,
achievements to its credit.
The best way of answering this question is to ask whether there have been
any research results which have been applied to operational systems. Even
allowing for some delay, one would hope that after five or ten years good
research results could have had operational outcomes.
Cleverdon123 considered this question in 1976. Looking at the historical
development of retrieval systems, he asked whether some more conspicuous
research projects had contributed, either positively or negatively, to the