IRE
Information Retrieval Experiment
Actual tests part
Butterworth & Company
Karen Sparck Jones

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, including photocopying and recording, without the written permission of the copyright holder, application for which should be addressed to the Publishers. Such written permission must also be obtained before any part of this publication is stored in a retrieval system of any nature.

Part 3
Actual tests

In this section individual tests are described and evaluated. Where Part 1 discussed general problems of testing and Part 2 those of types of test, the intention here is to focus on individual experiments and what can be learnt from them. The chapters taken together show what the substantive concerns and methodological approaches of retrieval system tests in the last twenty years have been, and so throw light on the state of the art from a third point of view. In particular the tests described indicate how, and how far, testing experience has grown in the field as a whole.

The four chapters illustrate the themes of the previous sections in different ways. In the first chapter by Sparck Jones the whole development of retrieval system testing, and specifically experimentation, over the last two decades is surveyed, with references on the one hand to major tests and on the other to average or typical ones. Sparck Jones' second chapter is devoted to the Cranfield Project tests, particularly Cranfield 1 and 2, which are widely held to have been exceptionally important. These key tests are presented chiefly in their own terms, and looked at through the eyes of contemporaries. The chapter thus brings out the contribution of Cranfield 1 and 2 to retrieval experimentation at the time that they were conducted; but their longer term contribution to retrieval system testing is also considered.

In the third chapter Evans gives a full-dress account of a single experiment studying the performance effects of different search strategies. This is a detailed case study showing how a single test was originally approached, and how it appears retrospectively.

Finally, the largest and longest retrieval system testing project, the Smart Project, is reviewed by Salton. As system testing is expensive and individual tests are too often incomparable, there are great advantages in facilities for long-term testing. Salton's discussion of the way these facilities were provided and used by the Smart Project, and also of the disadvantages which may accompany the advantages, provides a conclusion for the book which reiterates the point made in the Introduction: namely that retrieval system testing is difficult.