IRE
Information Retrieval Experiment
Gedanken experimentation: An alternative to traditional system testing?
chapter
William S. Cooper
Butterworth & Company
Karen Sparck Jones
refer to such data-gathering as an experiment, since the aim is merely to
obtain crude estimates of certain statistics rather than to test anything. Here
is the kind of small-scale empirical investigation by which many conventional
'What-works-best?' tests might well be replaced, if indeed there is to be any
experimentation at all beyond gedanken experimentation.
11.4 Further remarks
These examples by no means exhaust the possibilities inherent in an approach
based on probability and utility theory and gedanken or small-scale
experimentation. There are ways of combining gedanken weighted indexing
with gedanken weighted requesting, of constructing thesauri which weight
relationships among terms probabilistically by thought experiment, of
translating boolean requests into probabilistically weighted ones, and so on.
One of the most far-reaching advantages of the probabilistic approach to
system design is that it provides a natural means of combining large numbers
of weak clues. Many kinds of evidence could be brought to bear in ordering
system output that are not exploited in conventional systems, but which it
would be natural to utilize in a probabilistic system. Among them are the
many kinds of relatively weak clues available even before a request is
received, e.g. document recency, citedness, language, level of technicality,
form of publication, and so on. These could all be used with low weights as
part of the probability computations and would for many kinds of requests
be apt to bring about greatly improved retrieval. Known-work searches on
the basis of non-standard clue-types constitute another possible application [12].
There is much scope for further investigation in this area.
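
As a minimal sketch of how many weak clues might be combined, the following Python fragment adds the logarithm of a likelihood ratio for each clue to the prior log-odds of relevance. The conditional-independence assumption, the function names `combine_clues` and `rank_documents`, and all numerical values are illustrative assumptions rather than anything specified in this chapter; in practice the ratios would come from gedanken or small-scale estimation.

```python
import math


def combine_clues(prior, likelihood_ratios):
    """Return a posterior probability of relevance.

    prior: estimated probability of relevance before any clue is considered.
    likelihood_ratios: one value per clue,
        P(clue | relevant) / P(clue | not relevant).
    Assumption (hypothetical): clues are conditionally independent.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        if ratio <= 0.0:
            raise ValueError("likelihood ratios must be positive")
        # A weak clue has a ratio close to 1 and so shifts the odds only slightly.
        log_odds += math.log(ratio)
    return 1.0 / (1.0 + math.exp(-log_odds))


def rank_documents(prior, clues_by_document):
    """Order document identifiers by decreasing combined probability."""
    scored = {
        doc: combine_clues(prior, ratios)
        for doc, ratios in clues_by_document.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


# Illustrative values only: recency, citedness and language as weak clues.
example = {
    "doc_a": [1.3, 1.2, 1.1],
    "doc_b": [0.9, 1.0, 1.1],
}
ranking = rank_documents(0.05, example)
```

No single clue here is decisive, yet their product moves `doc_a` ahead of `doc_b` in the output ordering, which is the sense in which low-weight evidence can accumulate.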
11.5 Summary
When a retrieval system design is explicitly probabilistic or utility-theoretic,
its parameters are endowed with a clear meaning which makes their
estimation a fit subject for gedanken experimentation or in some cases
small-scale statistical estimation techniques. Since by virtue of the
statistical theory embodied in them such systems are known a priori to make
optimal or near-optimal use of the data at their disposal, comparative tests among whole
systems of this kind may be largely replaceable by tests of the accuracy of
their associated input data estimation methods, or in obvious cases by simple
judgements of which of these estimation methods is probably most accurate.
This suggests as potentially advantageous an approach to information
retrieval research which (1) emphasizes the discovery of explicitly
probabilistic or utility-theoretic retrieval system designs; (2) emphasizes the
development of improved input estimation methods including gedanken
experimentation techniques; and (3) de-emphasizes the role of traditional
comparative system tests in favour of restricted data-gathering aimed at
measuring error of estimation in the input data.
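
One way to picture point (3) is a small comparison of two estimation methods against frequencies observed in a restricted sample. The sketch below uses mean squared error as the error measure; that choice, the function name `mean_squared_error`, and every figure shown are assumptions made for illustration and are not drawn from the chapter.

```python
def mean_squared_error(estimates, observed):
    """Average squared difference between estimated and observed probabilities."""
    if len(estimates) != len(observed) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((e - o) ** 2 for e, o in zip(estimates, observed))
    return total / len(estimates)


# Hypothetical data: observed frequencies from a small sample, and the
# probabilities that two estimation methods assigned to the same parameters.
observed_frequencies = [0.10, 0.40, 0.75]
method_a = [0.15, 0.35, 0.70]
method_b = [0.30, 0.20, 0.50]

error_a = mean_squared_error(method_a, observed_frequencies)
error_b = mean_squared_error(method_b, observed_frequencies)

# The method with the smaller estimation error would be preferred.
preferred = "method_a" if error_a <= error_b else "method_b"
```

The comparison is made between estimation methods, not between whole retrieval systems, which is what keeps the data-gathering restricted in scale.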
Gedanken experimentation, as opposed to actual data-gathering, is apt in
general to be most valuable where decisions must be taken quickly,
frequently, and with a minimum of fuss. Indexing and request-weighting