IRE
Information Retrieval Experiment
Ineffable concepts in information retrieval
chapter
Nicholas J. Belkin
Butterworth & Company
Karen Sparck Jones
All rights reserved. No part of this publication may be reproduced
or transmitted in any form or by any means, including photocopying
and recording, without the written permission of the copyright holder,
application for which should be addressed to the Publishers. Such
written permission must also be obtained before any part of this
publication is stored in a retrieval system of any nature.
interdependent. Whether one wishes to design experiments to study one of
the concepts as such, or to be able to control for their effects in a system test,
or to use one or more of them as an explanation for system performance,
sorting them out from one another always remains a difficulty. For instance,
the concepts of information need and of desire always must underlie any
satisfaction concept. Thus, any discussion of relevance depends upon some
idea of information need. This means that, in any test, the relationship of the
relevance concept being used to the information need concept underlying it
ought to be made explicit, and the implications of the relationship should be
at least mentioned. The questions used in information retrieval tests also
depend on the concept of need, usually assuming only topic-related issues.
But question representation also depends upon the system used for text
representation, which in turn depends upon concepts of information or
aboutness. And in those cases in which the question is used as the basis for
judging relevance, relevance becomes dependent upon the question concept,
and the circle is completed.
It is obviously necessary to eliminate or control for these interactions in
testing situations, and this is indeed possible, but with some attendant costs.
For instance, in the Cranfield experiments the questions were purposefully
composed without any underlying information need, thereby allowing the
experimenters to use a relevance concept based solely on topic relations
between question and text. In this way, the systems for text and question
representation could be evaluated without reference to information need or
desire, the user-related concepts. The cost for having been able to disentangle
these concepts lies in the assumptions that questions without needs are the
same as questions with needs and that performance judged by topic-based
relevance is at least directly related to performance judged by need-based
relevance.
In experiments designed to investigate one of these variables per se, the
problems of interdependence are perhaps somewhat easier to control for.
But, as always, questions of relevance must depend upon concepts of need
and of information or aboutness, and empirical investigations of information
are difficult to separate from the states of knowledge of the subjects receiving
the information. There are, unfortunately, no good and general rules or
techniques for isolating one of these variables in any given testing situation.
The usual solutions have been to hold as many variables as possible as
constant as possible, or to establish control groups for variables that cannot
be held constant. Usually, these designs require strong assumptions about
the nature of the interactions among the variables, and about the variables
themselves. The most general suggestion that one can make in these cases is
that the best way to discover the least obtrusive and confining design is to
make these basic assumptions explicit, and on their basis to establish the
control structure required. The next section will discuss some possible
operational definitions for these ineffable concepts, and also their problems.
4.4 Inference chains and operational definitions
A fundamental difficulty with these concepts is that they are very basic
indeed. This means that theories about them are often very general, and that