IRE Information Retrieval Experiment
The pragmatics of information retrieval experimentation (chapter)
Jean M. Tague
Butterworth & Company
Karen Sparck Jones

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, including photocopying and recording, without the written permission of the copyright holder, application for which should be addressed to the Publishers. Such written permission must also be obtained before any part of this publication is stored in a retrieval system of any nature.

Decision 9: How to analyse the data?

estimators, the sample mean and the sample proportion, when the sample size is large. It can be shown that, in both cases, these estimators are normally distributed and that the means are equal to the corresponding population values. The standard deviations, which are the standard errors, can be approximated by the following formulas:

standard error of the sample mean: $s/\sqrt{n}$, where $s$ is the sample standard deviation and $n$ is the sample size;

standard error of the sample proportion: $\sqrt{p(1-p)/n}$, where $p$ is the sample proportion and $n$ is the sample size.

The standard error, by itself, does not mean much. It is more usefully employed in setting up confidence intervals. A confidence interval is an interval or range on either side of the sample estimator which contains the population value with a given confidence. Confidence is usually expressed as a percentage between 0 and 100. A 95 per cent confidence interval, for example, is an interval determined in such a manner that 95 times out of 100 it will contain the population value. With large samples, a 95 per cent confidence interval for the population mean or population proportion will be approximately 2 standard errors (more exactly 1.96) on either side of the sample value. A 99 per cent confidence interval will be approximately 3 (more exactly 2.57) standard errors on either side of the estimator.
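The large-sample formulas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`se_mean`, `se_proportion`, `confidence_interval`) are hypothetical helpers introduced here, not part of the chapter, and the sample figures (60 satisfied respondents out of 100) are made up for the demonstration. The multipliers 1.96 and 2.57 are the ones quoted in the text.

```python
import math


def se_mean(s, n):
    """Standard error of the sample mean: s / sqrt(n)."""
    return s / math.sqrt(n)


def se_proportion(p, n):
    """Standard error of the sample proportion: sqrt(p(1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)


def confidence_interval(estimate, std_error, z=1.96):
    """Large-sample interval: estimate plus or minus z standard errors.

    z = 1.96 gives a 95 per cent interval and z = 2.57 a 99 per cent
    interval, using the multipliers quoted in the text.
    """
    half_width = z * std_error
    return (estimate - half_width, estimate + half_width)


if __name__ == "__main__":
    # Hypothetical sample, for illustration only: 60 of 100 respondents.
    n = 100
    p = 60 / n
    se = se_proportion(p, n)
    low, high = confidence_interval(p, se, z=1.96)
    print(f"standard error = {se:.4f}")
    print(f"95 per cent interval = ({low:.3f}, {high:.3f})")
```

Passing `z=2.57` instead of the default widens the interval to the 99 per cent level. The sketch relies on the same large-sample normal approximation as the text and should not be used for small samples.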
For example, suppose that, in a survey of users' attitudes to an online retrieval system, it was found that 96 out of the 120 users surveyed were satisfied with the service they received. The standard error is thus given by

$$\left[\frac{(96/120)(1 - 96/120)}{120}\right]^{1/2} = 0.036$$

A 99 per cent confidence interval for the satisfied proportion of the population would be

$$\frac{96}{120} \pm 2.57(0.036) = (0.707,\ 0.893)$$

These methods assume, also, a large population. Other methods must be used to set up confidence intervals for small populations.

Tague and Farradane¹⁵ have shown that if one estimates system recall and system precision by searching random samples of $n$ queries against $m$ documents and calculates estimated recall, say, by using microaveraging to obtain a sample estimator $p$ of recall, the estimator will have a standard error of approximately

$$\left[\frac{p(1-p)}{mny}\right]^{1/2}$$

where $p$ is the average system or population recall, $y$ is the average system generality, $m$ is the size of the database, and $n$ is the number of queries in the sample. This may be approximated by { Zi[OCRerr]=i a[OCRerr][OCRerr][OCRerr]=1 C[OCRerr][OCRerr]112