IRE
Information Retrieval Experiment
The pragmatics of information retrieval experimentation
chapter
Jean M. Tague
Butterworth & Company
Karen Sparck Jones

where N(R) and N(C) are the numbers of levels in the row and column treatments, respectively, N(Q) is the number of queries, and m is a whole number.

If the columns in one of the above two Latin squares represent search order, the rows searchers, and the table entries queries, then one can see that the square ensures no two languages will be searched in the same query order. Tables of Latin squares of order up to 12 by 12 may be found in Fisher and Yates¹⁹.

A Graeco-Latin square is obtained by combining two Latin squares in such a way (called orthogonally) that each treatment combination occurs the same number of times. This is sometimes useful if there are two factors (for example, indexing language and searcher) as well as a sequence effect. An example of a Graeco-Latin square, in which rows represent searchers and columns search order, is:

          1       2       3
    s1    Q1g1    Q2g2    Q3g3
    s2    Q2g3    Q3g1    Q1g2
    s3    Q3g2    Q1g3    Q2g1

Such designs are useful but not always available if the dimension is greater than 5.

Although Latin and Graeco-Latin squares ensure that each treatment occurs at each position in the order of administration, they do not totally control sequence effects in the sense that all possible m! sequences of m treatments are observed. This could be achieved by introducing sequence as a fully-fledged factor in a three-factor experiment.
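The two properties just described can be checked mechanically. The following is a minimal sketch in Python, not part of the original chapter: it encodes the 3 by 3 Graeco-Latin example (reading the middle entry of row s2 as Q3g1, which is an assumption about the printed table), tests that the query and language components are each Latin and that every query-language pair occurs exactly once, and then counts how many of the m! possible language orderings actually appear across searchers. The function names are illustrative only.

```python
from itertools import permutations

# Rows are searchers s1..s3, columns are search order 1..3.
# Each cell is (query set, indexing language).
# Assumption: the middle cell of row s2 is read as (3, 1), i.e. Q3g1.
SQUARE = [
    [(1, 1), (2, 2), (3, 3)],
    [(2, 3), (3, 1), (1, 2)],
    [(3, 2), (1, 3), (2, 1)],
]


def is_latin(grid):
    """True if every symbol occurs exactly once in each row and column."""
    n = len(grid)
    symbols = set(grid[0])
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in grid)
    cols_ok = all(
        {grid[r][c] for r in range(n)} == symbols for c in range(n)
    )
    return rows_ok and cols_ok


def is_graeco_latin(square):
    """True if both components are Latin and all n*n pairs are distinct."""
    n = len(square)
    first = [[cell[0] for cell in row] for row in square]
    second = [[cell[1] for cell in row] for row in square]
    pairs = {cell for row in square for cell in row}
    return is_latin(first) and is_latin(second) and len(pairs) == n * n


def observed_sequences(square, component):
    """Set of orderings of one component (0 = query, 1 = language) by row."""
    return {tuple(cell[component] for cell in row) for row in square}


def all_sequences(m):
    """All m! possible orderings of treatments 1..m."""
    return set(permutations(range(1, m + 1)))


if __name__ == "__main__":
    seen = observed_sequences(SQUARE, component=1)
    print("Graeco-Latin:", is_graeco_latin(SQUARE))
    print("language orderings observed:", len(seen), "of", len(all_sequences(3)))
```

Under the stated reading of the table, the square passes the Graeco-Latin check, yet only three of the 3! = 6 language orderings are observed, which is the gap that motivates treating sequence as a factor in its own right.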
For example, with three indexing languages, there are six possible orderings in which searches can be carried out. Thus, to maintain the Latin square design, one needs six rather than three searchers and six rather than three query sets. The design might be as follows:

    order    123    132    213    231    312    321
    s1       Q1     Q2     Q3     Q4     Q5     Q6
    s2       Q2     Q1     Q4     Q5     Q6     Q3
    s3       Q3     Q4     Q6     Q1     Q2     Q5
    s4       Q4     Q3     Q5     Q6     Q1     Q2
    s5       Q5     Q6     Q1     Q2     Q3     Q4
    s6       Q6     Q5     Q2     Q3     Q4     Q1

This design does not control effects due to query sequence. In this and previous designs, random sequencing with respect to queries should be carried out by the searchers. This factor, too, can be formally controlled, but the price would be high in terms of additional complexity of the design and resulting sample size.

Latin squares can be repeated as many times as needed. For example, if the pool of queries was 72 rather than 18, then one might use four Latin squares rather than a single one. Additional squares may be the same as the initial one, others randomly selected, or others obtained by permuting the columns of the first square (balanced squares).

To analyse a Latin square in order to test for differences in the criterion I