Introduction

1. Why a History of the Discovery of the Top Quark?

Gone are the days when one can justify a minute examination of some scientific experiment intended for an audience of, say, philosophers, historians, and sociologists, by lamenting the deplorable lack of attention that experimentation has received from these fields and simply announcing, "It’s time someone did something about this." Starting with Ian Hacking’s (1981), a number of authors from various disciplines have done something about this, producing a diverse and contentious literature addressing a number of significant philosophical problems (the state of this literature was reviewed a few years ago in Franklin (1993)). High energy physics in particular has been the subject of a number of substantial studies (see Franklin (1986, 1990), Galison (1985, 1987), Pickering (1981, 1984a, 1984b)). The presentation of another such study, this time examining the discovery of the top quark by the Collider Detector at Fermilab (CDF) collaboration, demands an answer to the question, Why produce another detailed examination of an experiment in high energy physics?

First, the present study, unlike almost all previous studies of experimental practice and reasoning, both puts the question of establishing evidential relationships at the center of attention, and seeks to analyze that question in terms of the theory of hypothesis testing most often employed in particle physics experiments (and in many other fields of research): the Neyman-Pearson theory. In doing this, I make use of the model for interpreting such tests that has been proposed by Deborah Mayo, most recently in her (1996).

Others have addressed the question of evidence in examining historical experimental episodes, but apart from Deborah Mayo (see, e.g., her (1996, 278–92), which focuses on the 1919 eclipse tests of Einstein’s theory of gravitation), these have not employed the resources provided by Neyman-Pearson theory in analyzing the evidential relationship. Allan Franklin, for example, in his (1986 and 1990), uses a subjective Bayesian analysis that rests on principles inconsistent with Neyman-Pearson theory. Still others have looked to experimental activity with the aim of theorizing the relationship between data and theory on a larger scale, without explicitly addressing the problem of evidence. Thus, Galison (1987, esp. 243–62) identifies "levels of commitment" that constrain the inferences drawn from experiment. Galison’s analysis is essentially historical, identifying long-, middle-, and short-term constraints on the beliefs that researchers form. He makes the valuable point that none of these levels is the exclusive domain of theory, but that researchers may also have relatively long-term commitments to, for example, particular experimental techniques. Likewise, Ackermann (1985) is interested in the relationship between data and theory in general terms, arguing for a thesis of dialectical accommodation between theory and data, or rather, between theory and "scientific facts [which are] . . . publicly negotiated provisional fixed points for scientific argument about the significance of theory" (Ackermann 1985, 34). Neither Galison nor Ackermann, however, provides an analysis of the evidential relationship that obtains when an experimenter believes that certain data actually provide a reason to believe a particular theory or hypothesis.

There has also been a proliferation of sociological theories of scientific research (Bloor 1991, Collins 1992, Knorr-Cetina 1981, Latour 1987, Latour and Woolgar 1986, Pickering 1981, 1984a, 1984b, 1992, 1995, Pinch 1986). Many of these have drawn on detailed studies of historical episodes, and some, particularly Pickering’s, have examined large collaborations in high energy physics. In these works, when the problem of evidence is given consideration, it is usually theorized in "social-constructivist" terms, according to which evidence claims rest on no other basis than the negotiated consensus among parties concerned that the data at hand is to be regarded as supporting a particular theoretical claim (in some accounts the agonistes are not traditional sociological units of human collectives but "networks" (Latour 1987) or "agencies" (Pickering 1995) that are not constituted exclusively by human participants). While such a perspective may be of some use for purposes of sociological theory, the present work seeks to address the question of the conditions under which it is reasonable to believe that an evidential relationship does, in fact, exist. I make a working assumption that an account can be given of this relationship in terms of objective features of the experimental episode, although, as will be seen, I do not exclude the social context from those objective features.

Second, there is the question of scale. While the above-mentioned studies of high-energy physics examined some very large experimental collaborations, the scale of "big physics" is growing rapidly, and the CDF collaboration is at least three times larger than any collaboration studied in these previously published works. This issue of "scale" applies not only to population, however. The detector that CDF used in their experiments was correspondingly more complex (not to mention simply bigger) than those used by the earlier groups. Perhaps most importantly for present purposes, the analysis that was used on the data, with its multiple search modes and vertiginously detailed triggers and event-selection "cuts," was extraordinarily complicated, necessitating the contributions of many different physicists to assemble it. Much of my interest in pursuing this project was to look at this analysis and, regarding it as itself a product of the members of CDF, in the same way that the detector was such a product, to trace some of the history of how it came into being.

Why should scale matter? In 1435, Alberti, the great art theorist of the Italian Renaissance, advised aspiring painters, "I prefer you to practise by drawing things large. . . . In small drawings every large weakness is easily hidden; in the large the smallest weakness is easily seen" (Alberti 1966, 94). Although I have not proceeded by looking for weaknesses in particular, I did wish to see what kinds of details might emerge from an examination of experimentation on a very large scale—details that might get lost in a smaller group. In this, I was helped by two features of the CDF collaboration: the ubiquity of meetings and the requirement of consensus prior to publication. These factors resulted in disagreements over details of the analysis being worked out quite publicly in discussions involving many interested parties. Furthermore, the compromises and agreements that were reached in such meetings were subjected to the scrutiny of a very large number of people. Thus, issues that might have been quickly worked out between a couple of researchers—or even in the mind of a single investigator—in smaller experiments, were made highly visible and a matter of record.

Philosophically speaking, I was interested in these details of the resolution of disagreements because of what they might tell me about the nature of evidence. That is, I was not simply interested in how as a matter of fact physicists settled their disputes, but in how they reasoned in reaching consensus on a claim of a particular sort: that they did, indeed, have evidence of the existence of the top quark. Here three questions turned out to require resolution, and to be related to one another in some interesting ways: (1) How is the data to be analyzed? (2) What should be reported as the outcome of the experiment? (3) Is that outcome evidence for the theoretical claim that the top quark exists? Chapters four and five address these problems.

Another reason for studying this particular episode in the history of high energy physics is simply that the top quark discovery itself was one that had been pursued for a long time, and was regarded by physicists as an important discovery. This in itself gives the episode a certain amount of historical interest. It was also an experiment that made use of enormous resources, largely in the form of support from the federal government, through the Department of Energy in particular. Although it is not my aim in this work to address the question of the appropriateness of allocating such large amounts of money to such undertakings, I do believe that those who wish to consider such questions should have available to them at least some knowledge of just how the activity that these funds support is carried out. For both of these reasons, I detail more of the history of this discovery, and of the CDF collaboration, than I make use of philosophically.

 

2. A Word on Method

When I began research on this project, I had no philosophical thesis in mind. (More accurately, any philosophical arguments that I might have been contemplating proved to be dead ends before I even started research. Although I did write a dissertation proposal, the ideas in that proposal are completely unrelated to the finished product.) What I did have was an interest in how the top analysis was put together, how the journal articles announcing the discovery were put together, and how the strength of the evidence was evaluated.

I began collecting information very informally, in a few phone calls to Henry Frisch and Alvin Tollestrup, two veteran CDF members who had played important roles in the collaboration from the very beginning, and in meetings with Bruce Barnett, who had joined the collaboration when Johns Hopkins joined in 1989. This was followed by a trip to Fermilab in May 1995, at which time I did considerable documentary research in the Fermilab archives and library, and was introduced to a few more members of the collaboration. I returned to Fermilab in June 1995, this time with a more careful interviewing plan in mind. These conversations were not recorded, although I took detailed notes. On this trip I talked to most of the physicists with whom I later recorded oral history interviews. At this time I also talked to some physicists from the D0 collaboration. By October 1995 I had secured a research grant and the use of a tape recorder from the American Institute of Physics, and I was able to record a new round of interviews, which this time included some physicists I had not talked to previously, but also included nearly all of those I had talked to on the previous trip. (These tape recordings are in the archives of the American Institute of Physics, accompanied by detailed notes on their content.)

In recounting the history of the discovery of the top quark, I have relied, as much as possible, on the tape recordings produced in the last round of interviews. In most instances I have been able to check the information from one interview against documentary sources and other interviews. Where accounts of incidents have varied from one source to another, I have presented the conflicting versions together. I have also made some use of the unrecorded conversations. In most instances, however, these conversations have been used as sources for methodological opinions of particular physicists rather than matters of fact regarding historical events. I benefited on several occasions from e-mail messages sent to me by CDF physicists, and I have used these documents for both methodological opinions and matters of fact.

So much for historical method. The question of philosophical method remains. In particular, the reader may wish to know the relationship between the historical and philosophical parts of this work. The first, and easiest, answer is that the history of the discovery of the top quark served as a heuristic for my philosophical interests. That is, the history suggested to me what the most interesting philosophical problems would be: they were the problems that bothered the physicists.

There is, however, a more general problem of the relationship between the history and the philosophy of science, which is the subject of an extensive debate. On the one hand there are naturalizers who hold that philosophical positions concerning what constitutes good scientific method are empirical generalizations from the observed record of successes and failures (with various interpretations given for what constitutes success and what failure) in the history of science (see Callebaut (1993), Churchland (1990), Downes (1993), Giere (1988), Langley, et al. (1987), Laudan (1987, 1990), Laudan, et al. (1986, 1992), McCauley (1992), Nickles (1987a, 1987b, 1989), Schmaus (1996), Stump (1992), Thagard (1988)). On the other hand, critics of this thoroughly naturalized approach maintain that some methodological norms must be presupposed before the enterprise of naturalized methodology can begin, and that hence it cannot be the case that the support for all methodological claims is of an empirical nature, grounded in the history of science (see Doppelt (1990), Howson (1990), Kaiser (1991), Klein (1992a, 1992b), Siegel (1989, 1990, 1996)).

My approach has been to avoid this (to my mind unresolved) dispute as much as possible. I do not regard the general philosophical framework that I have adopted (described in chapter three) as a generalization from historical episodes, and in fact it has not been my aim in this work to give a general argument in favor of that framework. I do, however, think that one test of a philosophical claim about science is that it should not be contradicted by prima facie rational procedures employed by scientists. I have not assumed that every inference or belief formed by every scientist at CDF in pursuit of the top quark is rational (an assumption that would be immediately called into question by the prevalence of disagreements amongst these physicists). My approach has been instead to ask (particularly in chapter five), in those cases in which some aspect of the experimental results was a subject of debate in deciding whether or not CDF had evidence for the existence of the top quark, what the grounds for the disagreement were. I then considered, without assuming one side or the other to be right, what the reasons might be for taking those grounds to be relevant to the assessment of this evidence claim. That is, I sought to locate some epistemological rationale for the considerations that were being brought to bear on the argument, such that the rationale had some fairly intimate relationship with the terms in which the dispute itself was conducted. I then traced the consequences, for the nature of scientific evidence, of the fact that such considerations were relevant to the assessment of evidence claims.

In other words, I did not assume that any particular scientist was at any particular moment being rational while his disputants on the other side were irrational, but I did adopt, as a working hypothesis, the assumption that particle physicists are en masse rational, in the sense that the kinds of things they take to be relevant to the evaluation of their experiments really are relevant to that task. The rationale that I identify relates closely both to the terms in which the debates over evidence were actually conducted, and to the aims that they were collectively pursuing, in particular the desire to avoid erroneously claiming to have discovered the top quark.

 

3. Summary of the Text

The text is divided into two main sections. Chapters one and two recount the history of the CDF collaboration and of the top analysis and discovery. Chapters three, four, and five discuss various philosophical issues that arise from that history.

In chapter one, I discuss the history of CDF from its inception in 1977 up to June of 1992, when the data-taking run began that culminated in the paper announcing evidence of the existence of the top quark. Much of the history of this period is the history of the building of the CDF detector and of the CDF collaboration. This is not intended as a thorough institutional history of CDF (I pay less attention to organizational relationships, funding arrangements, and relationships to government and industry, for example, than such a history would require). My aim in this chapter, instead, is to address, in part, a question that is posed in a rather striking way by the case of CDF. Scientific activity often gets sorted into three basic categories: formulating theories, performing experiments, and reasoning about the relationship between theories and experimental results. But for at least ten years, CDF consisted of a large number of physicists who were not doing any of these things. So what were they doing, and how did that activity relate to the three activities mentioned above? The quick and easy answer is that they were getting ready to experiment. But I argue that their activity is more accurately characterized, not as preparing for any particular experiment, but for a general experimental programme, characterized in terms of performing certain kinds of basic measurements of interest for the theories they were seeking to test.

In chapter two I recount the heady days of the top discovery, from 1992 to 1995, at both CDF and D0, although the focus remains on CDF. This chapter relates more directly to the philosophical discussions in the remaining chapters, as it outlines the various debates that took place regarding the proper way to analyze the top search data, the proper information to include in reporting the results of that analysis, and the proper way to characterize the results of that analysis vis-à-vis existence claims concerning the top quark. Much of chapter two therefore concerns the question, What, in CDF physicists’ own terms, was relevant to the question of whether they had found evidence for the top quark or not? I conclude chapter two with a speculation about large collaboration research. I hypothesize that collaboration researchers may be developing a new kind of methodological sensibility that regards the collaboration’s own means of negotiation and decision-making as forming part of the experimental procedure, and hence as a process to be monitored for its reliability and susceptibility to error.

In chapter three I outline the model of scientific experimentation and evidential reasoning that I will employ in chapters four and five. The model of experimentation used is a "hierarchy of models" that represents experimental procedures in terms of formal models. Here I draw on the work of both Patrick Suppes (1962) and Deborah Mayo (1996). This model not only displays general patterns of experimental inference, but facilitates greater precision in discussing the many aspects of the top quark experiments. I employ the model to give a simplified representation of a major component of the top quark experiments—the so-called counting experiments. I then outline the "error-statistical" model of reasoning from experimental tests that Deborah Mayo has proposed, and which I use in analyzing the inferential practices of CDF physicists in the context of the top quark search. I illustrate this model with some hypothetical examples, as well as with the reasoning employed by CDF to rule out alternative hypotheses in claiming that their results are evidence for the top quark.

In chapter four, I examine the logic of a counting experiment in general, and then argue that in the case of CDF’s top search, this experimental test violated the model of experimental testing advocated by the hypothetico-deductive view of hypothesis testing. In particular, I argue that the prediction based on the hypothesis that there is no top quark (i.e., the "null" hypothesis that CDF sought to test) was not deduced from that hypothesis together with auxiliary assumptions not derived from the experiment itself. Instead, that prediction was inferred inductively from data produced in the experiment. In making this argument, I seek to extend to a new kind of case an argument presented by Peter Achinstein in his (1991), based on a discussion of J. J. Thomson’s experiments with cathode rays.

Chapter five addresses the methodological problem of "tuning on the signal," which CDF physicists took to be of concern in the top search. (This argument is presented, much abridged, in my (1996), but the exposition here is clearer and avoids an error that appears in the shorter version.) I give an informal account of the problem, describing how it arose as a point of contention concerning one part of the top search analysis, then seek to locate the underlying rationale for the condemnation of the practice of tuning on the signal. I note a similarity between this methodological prohibition and some versions of the "novelty" requirement advocated by some philosophers of science, but argue that the real rationale lies in the requirement that the test to which a hypothesis is subjected must be "severe" (in the sense articulated by Deborah Mayo in her (1991) and (1996)), in order for the hypothesis to be evidentially supported by that test. I also argue that a subjective Bayesian account of this methodological stricture cannot be provided for the general case, given the standard solution to the problem of "old evidence." Finally, I consider some consequences of the prohibition on tuning on the signal, and of the way in which CDF resolved its concerns about this problem, for the notion of evidence. I argue that in some cases the social structure and decision-making process within the collaboration become evidentially relevant considerations, making them, in effect, a part of the method of hypothesis testing that must be evaluated in drawing evidential conclusions.