IRE
Information Retrieval Experiment
The pragmatics of information retrieval experimentation
chapter
Jean M. Tague
Butterworth & Company
Karen Sparck Jones
where N(R) and N(C) are the numbers of levels in the row and column
treatments, respectively, N(Q) is the number of queries, and m is a whole
number. If the columns in one of the above two Latin squares represent
search order, the rows searchers, and the table entries queries then one can
see that the square ensures no two languages will be searched in the same
query order. Tables of Latin squares of order up to 12 by 12 may be found in
Fisher and Yates¹⁹.
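As a small illustration of the defining property of a Latin square, the following Python sketch builds a square by cyclic shifts and checks that every treatment occurs exactly once in each row and each column. The function names `cyclic_latin_square` and `is_latin_square` are hypothetical, and the cyclic construction is only one convenient way of producing a square; it is not necessarily how the squares in the text or in published tables were obtained.

```python
def cyclic_latin_square(n):
    """Return an n x n Latin square with entries 1..n, built by cyclic shifts."""
    return [[(row + col) % n + 1 for col in range(n)] for row in range(n)]


def is_latin_square(square):
    """True if every symbol 1..n occurs exactly once in each row and column."""
    n = len(square)
    symbols = set(range(1, n + 1))
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


# Rows stand for searchers and columns for search order (an assumed labelling).
square = cyclic_latin_square(3)
for searcher, row in enumerate(square, start=1):
    print(f"s{searcher}:", " ".join(str(treatment) for treatment in row))
```

Any square that passes `is_latin_square` places each treatment once at each position in the order of administration, which is the property the design relies on.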
A Graeco-Latin square is obtained by combining two Latin squares in such
a way (called orthogonally) that each treatment combination occurs the same
number of times. This is sometimes useful if there are two factors (for
example, indexing language and searcher) as well as a sequence effect. An
example of a Graeco-Latin square, in which rows represent searchers and
columns search order, is:
        1       2       3
s1      Q1g1    Q2g2    Q3g3
s2      Q2g3    Q3g1    Q1g2
s3      Q3g2    Q1g3    Q2g1
Such designs are useful, but one is not always available when the dimension
is greater than 5 (no Graeco-Latin square of order 6 exists, for example).
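The orthogonality condition can be checked mechanically. The sketch below separates the 3 by 3 example into its query square and its language square, reading the entries of row s2 as Q2g3, Q3g1, Q1g2 (an assumption about the intended table), and verifies that every query and language pair occurs exactly once. The helper names are hypothetical.

```python
from itertools import product

# Query and language components of the 3 x 3 example
# (rows = searchers, columns = search order).
queries = [
    [1, 2, 3],
    [2, 3, 1],
    [3, 1, 2],
]
languages = [
    [1, 2, 3],
    [3, 1, 2],
    [2, 3, 1],
]


def is_latin_square(square):
    """True if every symbol 1..n occurs exactly once in each row and column."""
    n = len(square)
    symbols = set(range(1, n + 1))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


def are_orthogonal(a, b):
    """True if superimposing a and b yields each ordered pair exactly once."""
    n = len(a)
    pairs = {(a[r][c], b[r][c]) for r, c in product(range(n), repeat=2)}
    return len(pairs) == n * n


print(is_latin_square(queries), is_latin_square(languages))
print(are_orthogonal(queries, languages))
```

A square is never orthogonal to itself for n greater than 1, so `are_orthogonal(queries, queries)` is a quick negative check.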
Although Latin and Graeco-Latin squares ensure that each treatment
occurs at each position in the order of administration, they do not totally
control sequence effects in the sense that all possible m! sequences of m
treatments are observed. This could be achieved by introducing sequence as
a fully-fledged factor in a three-factor experiment.
For example, with three indexing languages, there are six possible
orderings in which searches can be carried out. Thus, to maintain the Latin
square design, one needs six rather than three searchers and six rather than
three query sets. The design might be as follows:
                      order
        123    132    213    231    312    321
s1      Q1     Q2     Q3     Q4     Q5     Q6
s2      Q2     Q1     Q4     Q5     Q6     Q3
s3      Q3     Q4     Q6     Q1     Q2     Q5
s4      Q4     Q3     Q5     Q6     Q1     Q2
s5      Q5     Q6     Q1     Q2     Q3     Q4
s6      Q6     Q5     Q2     Q3     Q4     Q1
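The count of six orderings, and the Latin property of the six-searcher design, can be confirmed with a short Python sketch. It reads the table entries as query sets Q1 to Q6 (an assumption about the intended table), and the names `orders`, `design` and `schedule` are hypothetical.

```python
from itertools import permutations
from math import factorial

# The m! orderings of m = 3 indexing languages, in lexicographic order.
languages = (1, 2, 3)
orders = ["".join(str(g) for g in p) for p in permutations(languages)]

# Query set assigned to each searcher (row) under each ordering (column).
design = [
    [1, 2, 3, 4, 5, 6],
    [2, 1, 4, 5, 6, 3],
    [3, 4, 6, 1, 2, 5],
    [4, 3, 5, 6, 1, 2],
    [5, 6, 1, 2, 3, 4],
    [6, 5, 2, 3, 4, 1],
]


def is_latin_square(square):
    """True if every symbol 1..n occurs exactly once in each row and column."""
    n = len(square)
    symbols = set(range(1, n + 1))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


# Look-up from (searcher, language ordering) to the query set to be searched.
schedule = {
    (f"s{s}", order): f"Q{design[s - 1][c]}"
    for s in range(1, len(design) + 1)
    for c, order in enumerate(orders)
}

print(orders)
print(is_latin_square(design))
print(schedule[("s3", "231")])
```

Each searcher thus works through all six language orderings, and each query set is searched under every ordering by a different searcher.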
This design does not control effects due to query sequence. In this and
previous designs, random sequencing with respect to queries should be
carried out by the searchers. This factor, too, can be formally controlled, but
the price would be high in terms of additional complexity of the design and
resulting sample size.
Latin squares can be repeated as many times as needed. For example, if
the pool of queries were 72 rather than 18, then one might use four Latin
squares rather than a single one. Additional squares may be copies of the
initial one, may be randomly selected, or may be obtained by permuting the
columns of the first square (balanced squares).
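One way of producing the additional squares is sketched below: a pool of 72 queries is split into four pools of 18, and each further square is obtained by randomly permuting the columns of the first. The 3 by 3 base square, the consecutive query numbering and the fixed random seed are assumptions made for illustration, and `permute_columns` and `replicate_squares` are hypothetical names.

```python
import random


def is_latin_square(square):
    """True if every symbol 1..n occurs exactly once in each row and column."""
    n = len(square)
    symbols = set(range(1, n + 1))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


def permute_columns(square, rng):
    """Return a copy of the square with its columns in a random order."""
    order = list(range(len(square[0])))
    rng.shuffle(order)
    return [[row[c] for c in order] for row in square]


def replicate_squares(base, n_squares, seed=0):
    """Return the base square followed by column-permuted copies of it."""
    rng = random.Random(seed)  # fixed seed so the design can be reproduced
    return [base] + [permute_columns(base, rng) for _ in range(n_squares - 1)]


base = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]
query_ids = list(range(1, 73))  # a pool of 72 queries
pool_size = 18                  # queries used by one square
pools = [query_ids[i:i + pool_size] for i in range(0, len(query_ids), pool_size)]
squares = replicate_squares(base, len(pools))

for pool, square in zip(pools, squares):
    print(f"queries {pool[0]}-{pool[-1]}:", square)
```

Permuting the columns of a Latin square always yields another Latin square, so each replicate keeps the property that every treatment occurs once in each row and column.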
To analyse a Latin square in order to test for differences in the criterion