RESEARCH PROGRAM: 1. / BIOMEDICAL KNOWLEDGE REPRESENTATION Submitted to the National Library of Medicine February 1979 Department -_-. ersity '1 Computer Science ; - -- --__ (y--Y;* TABLE OF CONTENTS PART A - AIXINISTPATIVE PAGES Section Page I. FESEARCI! OBJECTIVES . . . . . . . . . . . 2 II0 BUDGETS . e . . . o z . . * e a . . - 3A 1I.A. Administrative and Core Research . . . . . 3B 1I.B. Project 1 . . 0 . a . . . . . . . 3D I1.C. Project 2 . . . . . . . . . . . . 3F 1I.D. Project 3. . . . . . . . . . . . . 3I? III. ELQGETSI)TES .=...*... .'.. - 35 IV . CUPRICULP VITAE . . . . . . . . . . . . 4 Secrion PART B - FESEPFCH PLAN 1. TXE PROGRA&' i?ROPOSED . . . . . . . . . I.A. Rationale for the Program . . . . . I.E. Resources that exist to aid this project I.C. Significance . . . . . . . ?? o PROJECT I --CCDIFICATION AND USE OF ?IEDIC>L II./. 11-B. iI.C. KKOIvXEDGE Ilr ONCOLOGY Tntroduction . . . Specific z!i.ns . . . Efethods . . . I Page III. III. IV* V. VI. VII. PROJECT 2 -- A WRKBENCH FOR FNOWL.EDGE REPRESENTATION . . . . . . . . . 37 111-A. Objectives of the Research and their Significance . . . . . . . . . . . 37 1II.B. Packground and rationale . . . . . . . . 3g 1II.C. Methods of procedure . . . . . . . . . 41 PROJECT 3 -- CODIFICATION AND USE OF MEDICAL KNOWLEDGE IN CLINICAL LABORATORIES . . . 67 1II.A. Objectives . o o 0 1II.B. Background erd- ratronal'e 1II.C. Methods . . . , CCRE RESEARCH . . . . 1V.A. Objectives of Research . 1V.B. Background and Rationale . 1v.c. Methods of Procedure . . 1V.D. Significance . . . . FACILITIES AVAILABLE . . . . V.A. Hardware . . . . V.B. Software and Personnel . COLLABORATIVE ARRANGEMENTS . . PRINCIPAL INVESTIGATOR ASSURANCE VIII. APPENDIX A . . . . IX. APPENDIX B . . . . x +. APPENDIX c . . . . REFERENCES . . . . . . . ii ....... 69 ....... 70 ....... 73 ....... 92 ....... 92 ....... 93 ....... 99 ....... 106 ....... LO7 ....... 107 ....... 107 ....... 109 ....... 110 ....... 111 ....... 144 ....... 188 ....... 204 Qverv iew Sec. I. I. THE PRCGZW PRWXED We begin this proposal with a description of program contemplated, with rationale and justification and a description of resources and facilities already for the purpose. the broad of need, available Herein we propose a knowledge representation, five-year program of research on arCI the various problems associated with it in the design of knowledge-based computer programs, The Stanford University group will work collaboratively with a group from the University of Missouri's Health Care Technolcgy Center; under the direction of Dr. Donald Lindberg. The program will be under the general direction of Professor Edward Feigenbaum of Stanford, who presently serves of SUKEX-AI?Jl , also as t-he Principal Investigator research the NIFl-sponsored National Computer Resource fcr on the application of Artificial te&nigues to medicine and biology. Intelligence (AI) This Resource will serve the computer needs of the proposed program. 1 The proposed program consists of four activities: three projects and a core research activity. Projects One representation, &and Three address the problems of knowledge acquisition, and utilization in medical/hospital settings. specific In Project One, the clinical setting is the Gncolouy Dav Care Clinic. The task that provides specificity and direction to the research is the construction of a consultation system rqard ing experimental protocols clinic outpatients, .This project and selection of therapy for is lti by Professor E.H. Shortliffe of the Stanford Medical School, the original developer of the ?4YCIN program for consultations disease diagnosis and therapy. regarding infectious In Project Two, the transfer of such expertise to other places and to other medical agdications can be viewed as the primary goal. One powerful way of clrmulating the concepts and methods of an emerging branch of Computer Science is them in war king to cumulate software widely shard. This project packages that widely applicable and aims at developing a number of s'ucn packages or "tools", constituting a ccmputer-proqrzm "workbench" for fur thar research on and application of knowledge-based sys te.m.s . The pckzges emerge as generalizations of work done in the tzsk-specific orojects;con=titute result therefrom; kd serve ." a verv to amplifv and tangible type of accelerate future 1 sec.1. Gverv iew efforts. This project is under the direction of Professors Bruce Buchanan and Douglas Lenat of Stanford. In Project Three, the setting is the Clinical Laboratory and the task is one of acquiring and representing the medical expertise that allows the laboratory expert (e.g. the Laboratory Director) to interpret test results and discuss these with the patient's clinical physician. This is the inter-university collaboration headed by Dr. Lindberg. An important subgoal of this project is the transfer of the Stanford expertise in knowledge based systems research to the Missouri Center. The Core Research Activity will' investigate a variety of fundamental research questions whose answers will shape present and future developments in knowledge representation research. Such questions involve formalisms and data structures for representing various types of knowledge: various methods-some automatic, some interactive- for acquiring new knowledge in systems: new inferential methods for putting this knowledge to work; strateay-knowledge representations for reasoning about the domain specific knowledge; and so on. The Core Research Activity is under the direction of Professor Feigenbaum, Douglas Lenat of Stanford. Lastly, it is an objective disseminate the of the overall program to finding s of the research, and to provide training opportunities to others. This objective will be accomplished through publications, presentations of research results at scientific meetings, by making room in the ocerational sites arid the core activity for visiting scientists an; trainees, and by - participation in a special annual meeting. The meeting to discuss our research and similar projects in this field will either be a oart of or be coordinated with the annual artificial intelligence-in medicine meetings is, at Rutgers University. That in years when the Rutgers meeting agenda and housing facilities can accommodate this group and its audience, we will join with Putgers. In years when this is not possible, we will sponsor a separate meeting objectives of this program. addressed to the four principal The administrative arrangements for the Program will be these : The Principal investigators of the various program activities will collectively constitute an Executive Committee for the Program, under the chairmanship of the Program Director. The Executive Committee will meet routinely by telephone- conference and occasionally face-to-face. 2 Gverview Sec. 7:. An -Advisory Group will be formed, consisting of colleagues at other institutions who share our motivations and scientific interests, This group will advise the Elvecutive Committee on major decisions and will offer peer review as necessary. The kernel of the Advisory Committee will be drawn from the membership of the SUMM-AIM Advisory Committee (for which Dr. Lindberg is currently chairman). I.A. Rationale for the Program -- I.A.l. Cvhat do we mean bv knowledge? -- e-4 Computer scientists have long recognized that a computer is a general symbol-manipulating aevice. Arithmetic constitutes a special case of this capability-the manipulation of those symbols that are numbers. In this proposal we will be discussing non-numeric symbol manipulation by computers. non-numeric computation, In thinking about it is useful to think about: a. inference methods (as O?poSed to calculation and algorithms) !I. qualitative "lines of reasoning" (as quantitative formulations) oppsd to C. symbolic facts (not merely numeric formulas) parameters and d. decision rules of expertise and judgment (as opposed to mathematical decision rules) The use of the term "knowledge" in this prooosal is intended to cover both (c) and (d) above. In common us&e, the term "knowledge" does not usually include (d), because such judgmental and experiential knowledge is largely tacit knowledge and therefore not recognized (i.e. the knowledge is "nrivate" and the expert is not aware of what he/she knows and 1s using in problem-solving). The knowledge i- is unwilling to share 3 private not because the expert it, but because he/she is Iunable to discover and verbalize it. Sec. I.A. Overview It is central to our view that knowledge of "expertise"- such knowledge-the medicine and science, is critical for competent practice in in fact constituting the bulk of the knowledge employed in such practice. We view as a matter of great importance that such knowledge be codified and given a concrete (and at least semi-formal) representation, so that it can be used, stored, transmitted to others, taught. analyzed, discussed, and Every activity of this proposed program is aimed at developing the scientific concepts and methods by which this can be most expeditiously, carefully, and usefully done. Symbolic computation, though general and powerful, has hardly begun to be exploited in real applications. The specialty within Computer Science that has studied complex methods of symbolic computation is "Artificial Intelligence Research." I.A.2. Some Relevant Global and Local P-P History Early work in artificial intelligence aimed toward the creation of generalized problem solvers. Work on programs like GPS [by Newell and Simon] and theorem proving , for instance, was inspired by the apparent generality of human intelligence and motivated by the belief that it might prove pssible to develop a single program applicable to all (or most) problems. While this early work demonstrated that there was a general purpose large body of useful subqoals, techniques (such as Froblem decomposition into and heuristic search in its many forms), these techniques did not by themselves offer sufficient power for expert levels of performance. Recent work has instead focused on the incorporation of large amounts of task specific knowledge is what have been called "knowledge-based" systems. Rather than non-specific problem solving power, knowledge based systems have emphasized high performance based on the accumulation of large amounts of knowl&ge about a single domain. A second successful focus in work on intelligent systems has been the emphasis on the utility of solving "real world" problems, rather than artificial problems fabricated in simplified domains. This is motivated by the belief that artificial problss may prove in the long run to be more a diversion than a foundation for further work, and by the belief that the field has developed sufficiently to provide tec.hniques that can aid working scientists. While artificial problms may serve to isolate and illustrate selected aspacts of a task, solutions developed for those selected aspects often do not generalize well to the complete problem. 4 Gverview Set I.A. There are numerous current examples of successful systems embodying both of these trends, systems which apply task-specific knowledge to real world problems. The following are synopses of a variety of knowledge-based systems developed by the Stanford participants in this program over the past thirteen years: DSNDRAL,: An intelligent assistant to an analytic and structural chemist. molecules from It infers the structures of complex organic structural constraints. These constraints are either supplied interactively by the user knowledge and intuition, from his "private" or are inferred instrument data, such as mass spectral data, automatically from resonance data, etc. nuclear maqnetiz For those families of molecules for which the knowledge base has been carefully elaborated, the DEXDRAL, program performs at levels equalling or exceeding the best human experts. The DENDRAL program now has a significant user community in university laboratories and in being used to solve difficult real problems. 'industry, and is Meta-DENDPAL: This program is focused on the problem of elaborating DENDPAL's knowledge base for specific families of compounds. It infers an fra9mentation rules) empirical theory (a of the mass spectrometry body of of specific families from record& mass spectral data. It has not only "rediscover&" rules previously acquired from chemists, but has discovered novel r*ules for certain families-rules that have recently warranted publication in the chemical literature. WKIN: This program is an intelligent assistant to a physician diagnosing infectious diseases. In conjunction with its diagnoses, it recommends therapeutic action. It is capable of explaining its line-of-reasoning in any (and varying) level of detail to the user in English. It can accept new decision rules from the user in English. It keeps an updated model of its own knowledge base, which it uses to critique the introduction of new rules into the system. It is capable of acquiring and usinq measures of the uncertainty of the knowledge, and produces a "believability" index with each inference, i.e., it is capable of awroximate implication. A version called FZYCIN, sans infectious disease knowledge, has been developed to extend the use of the system to other domains. :!IA.s?: Project scientists working in a classifid tnvironment led the development of a signal-understanding proqrzm for continccus surveillance inter=+ of certain objects Pie of military k"C. program ran successfully in a number of highly Sec. I.A. Overview varied test situations, and is being further develo@ in a currently-funded ARPA program. incremental hypothesis formation The program used a design for that was a modification of the EIJZARSAY design for the CNCJ speech-understanding system. Symbolic knowledge from a number of sources was used to aid the interpretations of the primary signal data. Time-dependent analysis was novel in this system and played an important role. AM: This remarkable program conjectures mathematical concepts. "interesting" Its knowledge base (usually private) encompasses the knowledge of a mathematician as to what constitutes an "interesting" construct in mathematics. Starting with the simplest set-theory concepts, and hundreds of rules defining "interestingness" of mathematical concepts, it has conjectured such concepts as addition, factorization, primes, multiplication, unique factorization into primes (the fundamental theorem of arithmetic), and an almost unstudied concept in number theory called "maximally divisible numbers." MOEEN: (under development) This program is being design& to be an intelligent assistant to an experimental molecular geneticist in formulating plans for laboratory experiments involving the manipulation of short DNA strands with restriction enzymes. The program is concerned with representing knowledge about planning and with the automatic formulation of plans to the level of detail demanded by the user. The program's knowledge must be represented at various levels-biological, genetic, topological, and chemical-and these levels must be incorporated into the reasoning. CRYSALIS: Crystallographic Image Interpretation: (under development) This program is being designed ambiguous, to interpret incomplete thretiimensional image data obtained in x- ray crystallography of protein structures. The image input data is the so-called electron density map and the answer desired is an approximately correct protein molecule (or portion thereof). As with HASP, many sources of symbolic data interpretation of the primary signal data. supprt the The HASP progrmn organization has been imported as a test of its generality. The interpretation problem is difficult because the best wavelength available (x-rays) is too long to resolve atoms and interatomic separations; hence the need for additional sources of symbolic knowledge, e.g., the amino acid sequence of the protein. PUFF: This program interprets data from the pulmonary function testing laboratory and provides for the Lab Director an interpretive summary of findings regarding airways obstruction, lung restriction, and the degree of severity: subtyoe, such as bronchitis: the corroborating evidence and its weight; treatment 6 Cverview Set I.A. reconmendations;etc. This knowledge-based system was built in collaboration with a pulmonary physiologist at Pacific Medical Center, and is in routine daily use. VM: A program that offers the attending physician or nurse interpretations of streams of data monitored from a patient in Intensive Care; signals alarm conditions due to unexpected patient condition or possible instrument malfunction: and offers advice regarding the manaqement of the patient's ventilator machine assistance. This is another collaboration with Pacific Medical Center. SACON: A M!KIN-like consult&ion system that advises a structural engin eer on the analysis plan necessary to compute the multitude of structural engineering design parameters needed for building a complex structure (such as an airplane wing or an off- shore oil drilling platform or a building). Interactively, in consultation, the user supplies the design snecifications. The system was built in collaboration with struct&al engineers at the IMARC Analysis Corporation. It was built rapidly using the EWCIN package discussed later. In short, as the capsule sketches above indicate, the main themes of our work involve: the acquisition and maintenance of knowledge bases; the utilization of this knowledge in a variety of ways for data interpretation, problem solvinq, and planninq; and the representation of this knowledge for computer inference. I.A.?. Knowledge Representation Issues and Cesiqns--the MYCIN Experience -- In lieu of further general discussion of knowledge representation, we have chosen to explicate in some depth our viewpoint and methodology by drawing upon the experience in design and development of just one of our programs, the well- known consultation system MYCIN. For us, this work has been seminal; hence the discussion of it that follows generalizes to most of the other Stanford-based efforts mentioned above. I.A.3.a. Backqround Several computer programs ha-ve 'teen written that 3ttempc to mod21 a physician's decision makinq processes. Some of 9-l=+ .-"O have stressed the diaqnostic process itself [27],[17]; others 7 Sec. I.A. Overview have been designed principally for use as educational tools [31],[36],[56]; while still others have emphasized the program's role in providing medical consultations [4] ,[29],[51],[57]. Actually, these applications are inherently interrelated since any program that is aimed at diagnosing disease has potential use for educating and counselling those who lack the expertise or statistical data that have been incorporated into the program. Consultation programs often include diagnosis as a major component, although their principal focus involves interactive use by the physician and/or the determination of appropriate advice regarding therapy selection. In general, the educational programs designed for instruction of medical students and other professionals have met with more long-term success [60] than has been the case for the diagnostic and consultation programs. The relative success in implementing instructional programs may result because they deal only with hypothetical patients as part of an effort to teach diagnostic and therapeutic concepts, whereas the consultation programs attempt to assist the physician in the management of real patients in the clinical setting. A program making decisions that can directly affect patient well-being must fulfill certain responsibilities to the physician if he is to accept the computer and make use of its knowledge. Physicians will, in general, reject a computer program designed for their use in decision making unless it is accessible, easy to use, forgiving of noncrucial errors from nonexpert typists, reliable, and fast enough to facilitate the physician's task without significantly prolonging the time required to accomplish it. They also require that the program function as a tool to the physician, not as an all-knowinq machine that analyzes data and then states its inferences as dogma without justifying them. Those who design computer programs to give advice to physicians must devise solutions to these requirements in an effort to combat the current lack of acceptance of computer-aided diaqnosis by the medical profession [14],[24]. The physician is most apt to need advice from such a program *&en an unusual diagnostic or therapeutic problem has arisen. Rowever, he may be unwilling to experiment with a program that does not meet the general requirements outlined above. Considerations such as those mentioned here have in large part motivated the research of our group over the last half- decade. We felt it was important to devise a consultation program that was (1) useful, (2) educational when appropriate, (3) able to explain its advice, (4) able to understand and 8 Cverview Set I.A. respond to simple questions stated in natural language to acquire new knowledge interactively, and (6) $1L5' ,z"E modified easily. Although we recognized that this list of design considerations was somewhat idealistic in light of the state of the art in computer science, useful set of long-range goals. we did feel that it provided a The program we developed, known as MYCIN, has had considerable success in achieving many of the goals stated.. The current research proposes to build on the MYCiN experience, both by expanding the basic computer science methodology to deal with recognized problems as yet unsolved, and by implementing a consultation system in a clinical setting where its usefulness and acceptability to physicians can be assessed. I.A.3.b. The KKIN Proqram w- As medical knowledge has expanded in recent decades, it has become evident that the individual practitioner can no longer hope to acquire enough expertise to manage adequately the full range of clinical problems that will be encounter& in his practice. Thus when a patient's problem clearly falls outside the area of the attending physician's expertise, consultations from experts in other subspecialties have become a well accepted part of medical practice. Such consultations are acceptable to doctors in part because they maintain the primary physician's role as ultimate decision maker. involves a dialog between The consultation generally the two physicians, with the exoert explaining the basis for his advice jlustification of points and the nonexpert seeking he finds puzzling or questionable. 2 consultant who offered dogmatic advice he was unwillinu tg discuss or defend would find his opinions were seldom sou9ht.- Fig. 1 shows a schematic view of the consultation process. Appendix A shows a detail&i typescript of a sample consultation. The physician nonexpert gives information about his patient to the expert in response to questions and, in return, receives advice and explanations. Thus there are actually three kinds of information flow between the physician and his consultant. The .NYCEJ program models the consultative process by attendin to all three kinds of information. It is our conviction that programs which ignore the explanation pathway will fail to be accepted by physicians because they will see in such systems too severe a departure from the human consultation process (in which the primary physician is providec! with sufficient information to allow him to decide whether to follow the offered advice). Sec. I.A. UJerview Figure 1 - Information Flow Eetween Physician And Consultant iYCIN is a LISP program designed to serve as a clinical consultant on the subject of therapy selection for patients with serious infections. The program may be envisioned as interposed between the expert and nonexpert in much the way that the large box is positioned in Fig. 1. The difference is that the human expert can offer only general knowledge to the program, not patient-specific decisions. The program thus becomes the decision maker, using general medical knowledge from experts to assess a specific patient and to give advice plus explanations for its judgments. Fig. 2 details the organization of .YYCIN relative to the human consultation process depicted in Fig. 1. As before, the nonexpert offers data about his patient and in return receives both advice and, when desirti, information via one of two internal explanation mechanisms (the general question-answerer or the reasoning-status checker). The basis for all decisions is domain-specific knowledge acquired from experts (static knowledge). A group of computer programs (the rlule interpreter) Cverview Set I.-A. uses this knowledge, and data about the specific patient, to generate conclusions and, in turn, simultaneously keeps a thersputic advice. It record of what has happened, and tnis record is available to the explanation routines if the physician asks for justification or clarification of some conclusion that the program has reached. Although complicated, Fig. 2 is somewhat the following discussion should interrelationships among clarify the in the diagram. the various system components depicted Furthermore, Appendix A gives detailed examples of all the features described below. Knowledge Representation Static Knowledge Static knowledge refers to all data that are constant in the program and unchanging from one consultation to the next, Facts About The Domain. requirfi simple statements of Much of the kr;owledge NYCIN fact about the domain. These ran ncmcrall~7 hn rnnracantlyl 3.z -At+Cr;,.tn-T.!e4.-L ..-7 ..^ L-1-7 ^_ Cverview Set I.-A. uses this kncwledge, and data about the specific patient, to generate conclusions and, in turn, simultaneously keeps a thersputic advice. It record of what has happened, and tnis record is available to the explanation routines if the physician asks for justification or clarification of some conclusion that the program has reached. Although complicated, Fig. 2 is somewhat the following discussion should interrelationships among clarify the in the diagram. the various system components depicted Furthermore, Appendix A gives detailed examples of all the features described below. Knowledge Representation Static Knowledge Static knowledge refers to all data that are constant in the program and unchanging from one consultation to the next, Sec. I.A. Overview ,a-------------- --------------a- \ /---- I l . , I I I PROOU I I I II I'- RIN 1 1 INFERFNCES J ; I I : I i -w-w. 7 I 1 STATUS f .--00, : - -N-w-. MYCIN \k YSlClAfi/ KNOWLEDGE-BASED VEJ- PRODIXTION SYSTETL~ Figure 2 - Schematic Description Of MYCIN Related To Fig. 1 -- Production Rules. (Appendix A - Section I) In addition to simple facts, MYCnequires jLdgmenta1 knowledge acquired from experts and available for use in analyzing a new patient. Judgmental knowledge in bNYCIX is expressed as production rules [iSI which define certain preconditions (the PREXISE) that allow a conclusion to be reached (the ACTION) with a specified degree 12 Overview set I.A, of confidence (the "certainty factor" [49]). Although such rules are stored as LISP list structures, a series of routines is available for translating them into English. For example: PREMISE: ACTION: If the stain of the organism is gramneg, and the morphology of the organism is rod, and the organism is anaerobic, Then there is suggestive evidence (.7) that the identity of the organism is bacteroides. Note that the purpose of this rule is the determination of organism identity. Rules are classified and accessed in accordance with their purpose as described blow. Dynamic Knowledge Dynamic knowledge refers to all change from one run of the program to data that are variable and the next. Data About The Patient - Acuuired From The User. MYCIN asks questions ofhe user, driven by a ?eZZni~lgorithn * described below. These questions generally ask the user to fill in the "value" in an attribute-cbject-value triple (eg., "V&at is the patient's name?" ), or to give the truth value of a predicate (eg. I "Is the patient a compromised host?"). Thus these data may be represented, once acquired, in precisely the way that facts about the domain are represented in the static knowldge base (see above) . Data ;"lbout The Patient - Generated & E Program. V&en the preconditions-% the PREMISE of a rule are found to hold, MYCI% executes the ACTION portion of the rule and generates a new "fact" which can, once again, be represented as an attribute- object-value triple. As mentioned above, conclusions may also have a confidence value associated with them, thereby requiring that the triple be expanded to a quadruple: the identity of ZGANIS~4-1 is bacteroides, with - certainty factor of 3.i (IDSNTITI ORGWISX-1 3ACTEECIDES .7) 13 Sec. I.A. Gverv iew Predicates may be similarly expanded. generalizing Furthermore, by this scheme to include representation of data acquired from the user, the physician may be asked to express his confidence in the answer he gives when MYCIN asks a question. Maintenance Of A Record Of The Consultation. A history of the consultation isthe third variety of dynamic knowledge. The details of representation need not be described here, but these data include records of which rules succeeded, which rules were tried but failed, how specific decisions were information was used, and why questions were asked. made, how The Production System The Rule Interpreter -- This series of routines analyzes rules in the static knowledge base, determines whether they apply to the patient under consideration , and if so draws the conclusions delineated in the ACTION portions of the rules. This process would quickly become unmanageable as system knowledge grew if there were not a mechanism for selecting only the most relevant rules for * patient. This is accomplished by a goal-oriented approa~hg~~~ we have described in detail [50],[51]. Briefly, as the rule interpreter examines the PREMISE of a rule, it notes whether the relevant data needed to determine the truth of each precondition are already known. If not, it digresses to examine those rules which make conclusions about the data that are needed by the first rule. The PREMISE conditions of those rules may, in turn, invoke additional rules, and in this way a relevant to the first rule is formed. reasoning network Since rules are classified according to their purpose, as previously described, it is easy to identify all rules that may aid in determining the truth of a specific precondition. The entire process is initiated by invoking a specific "Goal Rule" which defines MyCIN's task and is the only rule necessarily invoked for every consultation. tjhen MYCIN can find no rules for determining the truth of a preccndition, it asks the user for the relevant data. If the physician does not know the information either, the invoking rule is simply ignored. Maintenance Of - Initiative In The Hands Of The Physician -m-e- As was discussed above, a physician is not likely to accept a system such as MYCIN if the program simply asks a series of 14 Overview set I.A. questions and then presents a piece of dogmatic advice as it terminates execution. The production system has therefore been provided with a series of "interrupts" that allow the physician to digress with questions of his own or to demand justification for the line of questioning on which ?lYCIN has embarked during the consultation. Whenever the program asks a question, the user can temporarily refuse to answer and instead call on the explanation capabilities described in the next section. Explanations The Reasoning-Status Checker (RSC) (Appendix A - Section IV) - This component of the explanation system deals with most questions that arise during the consultation session itself. Eecause the context of current reasoning about the patient is well-defined, the physician can be given a great deal of information on the basis of a few simple commands that do not require natural language processing. These commands are briefly described below: the details of their implementation have also been documented [48]. As shown in Fig. 2, the reasoning status checker (FGC) uses only the knowledge base of rules and the current record of the consultation: the general question-answerer (GQA) described below, on the other hand, has access to G static and dynamic knowledge. The WEi Command. Whenever XYCIN asks a question, the physic= .e prefer not to answer initially and instead to inquire about the reasoning underlying the questioning. Thus he may simpiy respnd with the command PiHY (i.e., "Why do you think that the information you are requesting may be useful?"). Since all questions MYCIN asks are generated by rules, and since the rules are selected according to their ourcOse as previously mentioned, an English language translatixthe rule under consideration generally serves as an adequate response to the NHY query. The F&C therefore responds by displaying the current rule. In addition, it places an identifying number before each of the preconditions in the ?REMISE and indicates whether the condition is (a) already known to be true, or (b) Still under investigation (note that one of the latter group of preconditions will have generated MYCIN's current question to o he user). The physician can in turn inquire why the displayed rule was selected by asking 9.. a second time, and the RSC will accordingly dispiay the next rule in the reasoning network. `The fmi coiTfman3. As mentioned above, ss"r?en ?lYCIN displays a 15 Sec. I.A. Overview rule in response to the WHY conunand , it labels each precondition in the PREMISE with a unique number. The physician may then respond to the displayed explanation by entering How followed by one of the identifying labels. If the reference condition is one that MYCIN has already concluded to be true, the RSC assumes that the physician is asking "HCN did you decide that the specified precondition is true?" and answers by citing the relevant rules used to make the decision. If, on the other hand, the cited condition has not yet been fully investigated, MYCIN assumes the physician is asking "HCX will you decide if the specified precondition is true?" and responds by citing the rules it intends to try, only some of which may actually succeed. The General Question-Answerer (GQA) (Appendix A - Section V) - The general question-answerer (CGA) is a more comprehensive explanation system which, at any time during or after the consultation session, has full access to all static and dvnamic knowlezlge in MYCIN (Fig. 2). Since it cannot make simple assumptions based on context, as the RSC can do, the CQA must accept and answer questions expressed in natural language. MYCIN's rule-based knowledge representation schme, and some techniques borrowed from early work in computational linguistics I131 ,[30] ,147], permit a straightforward but powerful approach to interpreting simple English questions without contending with several of the complex problems of natural language understanding. The details of this approach have been documented WI. Questions About Static Knowledge. The ability to retrieve informatlon fro-ecstatic knowledge base gives the GQA a tutorial capability. Since the static-knowledge-is acquired from experts, the GQA can essentially act as an intermediary between an expert and a physician seeking general information about the infectious disease field. The user might ask simple questions of fact (eg., "Fyhich culture sites are normally sterile?") or questions regarding judgments stored in rules. Questions of the second variety are termed "rule-retrieval" questions because they may be answered simply by identifying and displaying English versions of relevant rules from the knowledge base. Retrieval may be keyed to the rule PREMISE (eg., "How do you use the gram stain of an organism?"), the ACTICN (eg., "When do you decide an organism might be a streptococcus ?"), or to both the PREXISE and ACTICN (eg., "Do you ever use the morphology of an organism to determine its identity?"), Furthermore, a guestion may deal with a specific rule (q., "ivhat is rule037?") . Xote that none of these rules refers to a specific patient or consultation and thus requires no access to the dynamic knowledge base (Fig. 2). 16 &erview Set I.A. Questions About Dynamic Knowledge. Although the RSC permit~es~rding the dynamic knowledge base, its scope is limited by the context of the current question being asked by MYCIN. If the physician wishes to ask more general uuestions regarding the status of MYCIN'S reasoning, or if he &.hes to review the prcgram~s decisions after the consultation is complete and MYCIN is no longer questioning him, the GQA gives him free access to all information about the specific consultation. Once again, the user might ask simple questions of fact (eg., "From what site was culture-2 obtained?") or questions regarding the basis for Z4YCIN.s jtigments. The second variety is again a rule- retrieval question, but is keyed to the consultation record in dynamic data rather than to the knowledge base of rules in static data (see Fig. 2). Thus guestions may again reference the PREMISE (eg. "How did you use the gram stain of organism-l?"), the ACTION (eg., What makes you think that organism-2 might be a streptococcus?"), or both (eg., "Did you use %he morphology of organism-l to determine its identity?") . sate that these guestions parallel the examples given in the previous section but that they are consultation-specific and thus request the retrieval not of all relevant rules, but only those that were actually used sucZE%fully in the specified context. Finally, one may again wish to ask about a specific rule (eg., "Did you use ruleW37 when considering organism-l?"). Knowledge Acquisition The only component of Fig. 2 not yet discussed is the crucial step of acquiring domain-specific knowiedge from experts and coding it for storage in the static knowledge base. Cohen LMYCIN was first being developed, such knowledge was acquired by extensive meetings during which infectious disease experts and computer scientists discussed specific patients and attempt& to analyze and extract the individual facts and rules that they were using. Recently extensive work has been devoted to the problem of automating the knowledge acquisition process in sessions involving clinical experts interacting with lMYCIN directly (Appendix A - Section IX). This problem has been the subject of a doctoral dissertation by one member of our group [15]. Certainty Factors Efforts to develop techniqes for modeling clinical decision makinu have had a dual motivation, clinical significant Their potential e has of course been apparent. The design of such programs also has required an analytical approach tc medical reasoning that has in turn led to a distillation of decision criteria that in some cases had never been explicitly stated before. It is a fascinating and educational process for experts 17 Sec. I.A. Overview to reflect on the reasoning steps that they have always used when providing clinical consultations. Several programs have successfully modeled the diagnostic process [27],[28],[55]. Each of these examples has relied upon statistical decision theory as reflected in the use of &yes* Theorem for manipulation of conditional probabilities. Use of the theorem, however, requires either large amounts of valid background data or numerous approximations and asscnnptions. The successful performance of Gerry and Barnett's early program [27], for example, and a similar study by Warner using the same data [55], depended to a large extent upon the availability of good data regarding several individuals with congenital heart disease. Gorry [28] has had similar access to data relating the symptoms and signs of acute renal failure to the various potential etiologies. Although conditional probability provides useful results in areas of medical decision making such as those mentioned, vast portions of medical experience suffer from so little data and so much imperfect knowledge that a rigorous probabilistic analysis, the ideal standard by which to judge the rationality of a physician's decisions, is not possible. It is nevertheless instructive to examine models for the less formal aspects of decision making. Physicians seem to use an ill-defined mechanimn for reaching decisions despite regarding the interrelationships a lack of formal knowledge are considering. of all the variables that they This mechanism is often adeguate, in well- trained or experienced individuals, to lead to sound conclusions on the basis of a limited set of observations. We have examined the nature of such nonprobabilistic and unformalized reasoning processes, have considered its relationship to formal probability theory, and have proposed a model whereby the incomplete medicine might be quantified. "artistic" side of the practice of We have had to develop this model of inexact reasoning in response to MYCIN's needs: i.e., the goal has been to permit the opinion of experts to become more generally available to nonexperts. The model is, in effect, an a@roximation to conditional probability. Although conceived with MYCIN*s problem area in mind , it is potentially applicable to any domain in which real world -.knowledge must be combined with expertise before an informed opinion can be generated. The model has been described in detail [75] and is based upon a scheme of weighted numbers we call "certainty factors". .M.though the model has been implemented in the NYCIN system, and in ENYCIN (see below) , and although it has allowed the program to demonstrate impressive decision making performance, we still recognize many problems with the formalism. The model has generated considerable attention in the literature [l] and many important suggestions for further research have been forthcoming. 18 Overview Evaluations Of MYCIN's Performance - set I.A, Work on MYCIN to date has concentrated on the infectious disease subfields of bacteremia and meningitis. Formal evaluations have been undertaken which show that MYCIN compares favorably with infectious disease experts in selecting therapy for patients with bacteremia 1621 or meningitis [63]. However, we have not undertaken a clinical implementation of MYCIN yet, and do not intend to do so in the near future. The reasons for this decision are reason that we have important in that they explain part of the turned from infectious diseases at this time, to oncology on the First, we have felt it is crucial that PinCIN not be $aced wards for clinical use if it does not already compare favorably with other forms of consultative advice available to primary care physicians. We have learned that this requires that KKIN know about essentially all major infectious disease subfields since the various Gease syndromes interrelate clinically in such important ways. In our evaluations of the progr~, it has tended to be in those cases in which a concomitant infection existed at some other site that MYCIN has failed to perform adeguately. Yet the time required for us to develop the required knowledge bases infections, endocarditis, pneumonia, and pelvic for genitourinary infections wculd necessarily be at least as long as the period it has acquire system o ? required to and test the knowledge of bacteremia and meningitis. We therefore anticipate a considerable ,period of time before the program will be able to provide consistently reliable infectious disease consultations and hence be ready for ward implementation. There are other problems as well that have been brought out by the complex decisions involved in infectious disease therapy selection. First, the truth model we have devised (see discussion of certainty factors above) has several reccgnize? inadequacies that will require further research and testing. Secondly, no computer-based decision making program with which we are familiar variables, has adeguately managed time relationships amongst and ?JYCIN is no exception. We see the need for continued research into the ways in which the production rule formalism can be suitably adapted to accommodate the need to rsFresent time dependencies in clinical reasoning and to use scch dqendencies to make aF?ropriate decisions. For exam$e, trends in a fever or white count over time may be much more imprtant in assessing an infected patient's illness than the actual values of these parameters at t.!!e precise time when the consultation is king reqestti. 19 Sec. I.A. Overview Finally, in order to expand MYCIN-s infectious disease knowledge into new problem areas, improved capabilities for knowledge acquisition would be extremely useful. Although we have made important initial steps in the development of this kind of complex capability [15], there is clearly much more to be done before an infectious disease expert who is a computer novice will be able to comfortably interact at a computer terminal in order to "teach" MYCIN the infectious disease judgmental knowledge that it needs to know. I.B. Resources that exist to aid this project p-e-- The research work proposed herein will not stand alone or apart from other research already under way in the two sites. The personnel and facilities in place at the University of Missouri#s Health Care Technology Center are described later in the appropriate Project section. At Stanford there is an interlocking set of existing grants and contracts supporting the work of a large group of scientists and students, the Heuristic Programming Project of the Stanford Computer Science Uepartient. This group has, over the years, produced the various systems summarized earlier. Historically the most significant sources of funding have been: 1. contracts from the Defense Advanced Research Projects @m-q, the leading government agency for funding artificial intelligence research. 2. grants from the Biotec.hnology Resources Program of XIH for the SUMEX-AIM computer facility, without which it would have been very difficult to accomplish what was accomplished. The other grants have had a short-term character. Some have been renewed, others not. The proposed NL24 grant is important to this complex of funding not only because it represents a significant amount of funding but most importantly because it represents stable funding over a five year period. It, therefore, like the ARPA funding, will constitute the stable base of support that will allow the work to advance steadily without personnel and funding fluctuations. The XLJ4-sponsored work will, in turn, benefit from Gverview Set I.B. the other supported work in the usual coordinated and synergistic way that significantly amplifies the effect of the NLJ4 support. The grant for the SUMEX-AIM computer resource ends in mid- 1981. There is no reason now to believe that at renewal time the grant will face trouble. However such large facilities grants are always subject to a great deal of pressure, not always from peer- review. The need to service the research activities of an ongoing five year NLM research project will definitely add strength to the renewal application. Finally, a resourc e of the greatest significance for the success of this work are the collaborative links that we have built over the years with medical scientists and clinicians at the Stanford Medical Center, the Pacific &adical Center, and the University of Missouri. It takes years to make such links work smoothly, but the resource is indispensable to a project on biomedical knowledge representation. I.C. Significance Collectively, we stand on the threshold of a new era in our understanding of the nature of medical and scientific knowledge, its distribution, and its effective use, Superficially, the cause of this has been the emergence of electronic symbol- processing and digital communication. .%re substantially, the reason for optimism is the emergence of knowledge-based computer systems research and application as a viable scientific and technical discipline. We are now beginning to understand in a scientific and technical way what practitioners have always understood about their fields of learning and practice: that the bulk of the knowledge they employ is not the knowledge of textbooks and journals, but the informal and judgmental !knowl&ge gained from long experience and practice. This knowledge is almost never codified, but is passed from mentor to apprentice by long periods of training and interaction, such as the internship, residency, and the Ph.D. graduate program In the last decade there have been significant demonstrations that such heuristic knowledge can be explicated, representti , and put to use. Needed is an interdisciplinary tem consisting of computer scientists, domain specialists, and various computer programs and compJter-orient& methodology. Sec. I.C. Overview Once explicated, this knowledge can participate in the ordinary processes of emulation of understanding in a field. For example, it can be subject to further analysis and be the basis for empirical studies and experimental investigation. It can be criticized by peer review. And it can be taught, or disseminated by library methods (electronic or otherwise). In addition, the formal knowledge of a field can be coupled to the informal knowledge to produce computer programs that act as "intelligent agents" to assist practitioners in solving large numbers of routine problems, and even scme of the more difficult problems, with which they are faced. Some methods of computer- based inference are available today to do this, and more are coming as research in this area matures. "active knowledge" The concept is one of available to work for users, in contrast to the passive knowledge of texts and articles (knowledge which is useless until "discovered" by the practitioner through library search and reading). Such a prospect is not visionary. It demands our immediate attention, We have known for many decades that computers are general symbol processing devices, not merely calculators. We have known for two decades how to program them to infer lines-of- reasoning through complex problems of a symbolic nature. In the last decade we have learned how to make such reasoning powerful and useful-by supplying such programs with considerable bodies of knowledge about the problem domains. And we have had to learn how to represent the knowledge. Now microelectronics has brought the time of low-cost computing upon us. The electronic processing necessary to make the pwer of symbolic computing available to a wide connnunity will be available. 'We should not allow ourselves to drop behind in the developent of the concepts and methods necessary for the emergence of the applications. There are also roles for knowledge-based symbolic computing that are visionary, but must be explored. The kind of "active" knowledge we have been discussing can be used to assist in the discovery of new knowledge. of new knowledge is a The very human process of discovery slow and halting process at best, done by very few and marked by very rare bursts of creative insight. It now seems possible (even plausible) that models of certain kinds of discovery can be formulated that will systematize for computer ap@ication the intertwined activities of inferential search and literature (i.e. knowledge) search, The Meta-DENDRAL program (that has formulated new rules of fragmentation in mass spectrometry) and the Futl program (that conjectured some not-so- new objects and theorems in number theory) are demonstrable precursors of this type of knowledge-acquiring program, 22 Overview Set I.C. living We envision a National Library of Medicine that will be a library of the knowledge of medicine and biology, not merely the repsitory of texts, journals, and articles and not merely the immense file of their electronic images terminals. available at 23 sec. II. Project 1 II, CODIFICATION .AND USE OF MEDICAL KNcwLEM;E IN oNcoLcm e -- - 1I.A. Introduction II.A.l. Objectives The long term objective of our research effort is the development of tools for the representation and use of medical knowledge in computer-based clinical consultation systems. Such systems will provide useful assistance to primary care physicians while incorporating features that heighten the acceptability of the systems to their intended users. We also wish to increase our understanding of the logic of medical diagnosis and therapy planning through this work. To that end we propse a five year research effort with the following goals: (1) to demonstrate that a rule-based consultation system with explanation capabilities can be usefully applied and gain acceptance in a busy clinical environment: (2) to improve the tools currently available, and to develop new tools, for building knowledge-based expert systems for medical consultation; (3) to establish both an effective relationship with a (3) to establish both an effective relationship with a specific group of physicians, and a scientific specific group of physicians, and a scientific will together will together foundation, that foundation, that facilitate future research and implementation of facilitate future research and implementation of computer-based tools for clinical decision making, computer-based tools for clinical decision making, The basic research will build on our group's prior experience with a computer-based consultant, termed MYCIN, that uses production rule symbolic reasoning techniques to assist in therapy 'selection for patients with serious infections. The domain we have selected for the first clinical implementation of these techniques is the management of research therapy protocols for cancer outpatients at Stanford Medical Center-s new oncology day-care center. 24 Project 1 Set II.A. II.A.2. Background This research builds on a long history of work on the MYCIN and EWCIN projects directed principally by Shortliffe and Buchanan. -Many of the persons developing those systems will be involved with the research proposed here. ,These two projects are described elsewhere and thus need not be described here as well. II.A.2.a. Stanford Division Of Oncology - In the past decade chemotherapv has assumed a more important role in the treaiment of patients with cancer. Some 2,000 patients are under the direct care of the five facultv physicians of Stanford's Division of Cncolcgy in the Department of Medicine. Most patients are receiving care on an outnatient basis, either at the Debbie Probst Oncology Day Care Center in Stanford Hospital or at the Division's twice-mreekly clinic at the Palo Alto Veterans Administration Hospital. Altogether, about 9,000 outpatient visits are made to the Division physicians each year. Effective management of cancer often involves more than one therapeutic technique. Increasingly, the initial course of treatment utilizes a combined modality approach. radiation may be Surgery and/or remaining c,ancer. follow4 by chemotherapy to control any However, chemotherapy alone may be curative in some cases. Rafined programs (protocols) have been developed for the administration of radiation and chemotherapy for many forms of cancer. The Division has had particular success with those used qainst Hodgkin's disease (the sixth most commcn cancer) and other lymphomas. In designing and carrying out individual programs of treatment, the physicians of the Division of Oncology work closely with Stanford specialists in other areas, particularly radiotherapists, surgeons, pathologists, diagnostic radiologists, pharmacologists, and immunologists. Stanford's expertise in these many discipline s contributes to the high level of care received by patients 'm the Division of Oncology. The Division is of course aiso involved in educating and training physicians on all levels, from medical students to . * practicing physicians. Among the trainees are nine clinical feiiows in oncology who participate actively in both clinical research and patient c2re. Five physician specialists and pr:vate physicians are involved directly with patient c3re in the 25 Sec. 1I.A. Project 1 Debbie Probst Day Care Center. Numerous others participate in the protocol studies. The Division of Cncology also firmly be1 ieves that excellence in patient care and in teaching programs is best achieved where there is a continuing pursuit of new knowledge. Each of the six full-time faculty members in the Division is actively engaged in cancer research. The clinical research efforts are concerned with the refinement and development of more effective methods of treatment. New chemotherapy is being sought and tested. Better combinations of chemotherapy, and of chemotherapy with other methods (=rgery, radiation, immunotherapy) , are also being developed. Debbie Probst Oncology Day Care Center The Division's new, modern, outpatient clinic was designed in response to the physical and emotional needs of cancer patients undergoing chemotherapy. Located on the lower level of the Stanford Bospital, it is designed as a self-contained unit, convenient and comfortable for both patients and attending medical personnel. Three kinds of treatment rooms are provided, including some for observation or for lengthy (six tb eight hours) infusions that formerly had required hospitalization. Efficient service to patients is facilitated by a television monitoring system (see discussion of Motorola system below) , a computer-based medical record system (see discussion of TOD below), and facilities for preparing chemotherapy, analyzing blood , and viewing x-rays. Information Display System When the Q~~ology Day Care Center was designed, plans ware made for an automated scheduling and information display system. This system was developed in conjunction with the Botorola company and is now in operation in the clinic. The microprocessor-based systa signals alphanumeric VideO information to remote locations via video cables. scheduling secretaries keep appointment records on an associated floppy disc, and on any given day four video display monitors in the oncology conference room are used to display the day's schedule, relevant lab test results for the outpatients being seen that day, room assignments, and the name of the oncologist who will be attending each patient. At present all data are entered by secretarial personnel and there is no hands-on interaction between the physicians and the small computer. 26 Project 1 Tiiie-Or iented Databank (`KID) Set 1I.A. For the last several years the Division of Oncology has also been using the time-oriented record keeping system (`KID) originally designed by Dr. 3. Fries for use in the Stanford Innnunolcgy Clinic [25], [58]. The data and all TOD programs are stored in the Stanford campus computer facility, an IBM 370/168. The emphasis in the design of the TOD system has been the analysis of large amounts of data on a body of similar patients, not on interactive record keeping in the clinical setting itself. Thus there are large amounts of data on Stanford oncology patients, stored by dates of clinic visits, kept on this distant computer for retrospective analysis. TOD provides several programs for statistical analysis of correlations, assessing prognosis by attribute matching, and assisting with other tasks that have traditionally required arduous chart review. Since the data are not currently being used for the care of individual patients, there *aY be a time lag of weeks before transcriptionists extract the relevant data from paper-based oncology outpatient charts and enter them into the TOD databank. Oncolouy Treatment Protocols As mentioned above, the Division of Oncology is active in clinical research and has .mny patients being treated under research protocols. There are currently about 30 operational protocols, about half of which are active in the sense that several patients are enrolled in the treatznent plan at any given time. Nany of the protocols are designed and overseen by Stanford oncologists, but there are also cooperative studies involving Stanford and several other institutions. In many cases, the cooperative studies are overseen by the Northern California Oncology Group (XIX) which has its headquarters very near the Stanford campus. Each protocol is described by a lengthy article, of ten 45-60 pws r that explains the justification for the therapeutic aperoach, outlines criteria for patient selection for the st*3y, describes therapeutic options, and details the specific chemotherapy doses, dose modif ication, and laboratory and clinical data that must be obtained on each visit. It is quite impossible for any single individual to know the details of all 30 protocols. This is a particularly great problem because the physicians seeing oncology outpatients illClUd2 fellows, residents, and m& ical students; these individuals have limited oncology experience and, in the case of house staff and students, generally rotate through oncoloqy for only 4-8 weeks at a time. (See [?l] for discuss ion of one approach which emphasizes use by primary care physicians, but has not emphasized a well-designed human interface.) 27 Sec. I1.A. Project 1 II.A.3. Rationale The rationale for the proposed research has largely been described in previous sections. In short, there has been limited success of statistical, data retrieval, and decision analysis programs in dealing with the judgmental knowledge of expert physicians and the uncertainties of medical data. We have made encouraging strides in the development of symbolic reasoning techniques for application to clinical decision making and believe that the time is now appropriate for the clinical implementation of such a system. to assess the power only then will it be possible of capabilities which have been designed to make consultation systems acceptable to physicians. recognize that the short term impact of such systems Although we is limited by the current state of the art in computer science, the impetus for appropriate basic research and development of new interactive techniques will come largely through the lessons learned in undertaking'clinical implementations. exist that have potential Since techniques already for considerable short-term clinical impact, we believe it is now appropriate to spend part of our time on a project for clinical use. Although our interest is in the development of systems for offering 3 practitioners, kind of subspecialty expertise to primary care the initial application selected has been the management of complex therapy protocol information in an outpatient oncology clinic. This domain was selected for a number of reasons: (1) There are large amounts of information in the protocols but relatively little inferential complexity; those problems that have prevented us from attempting clincal implementation of the MYCIN System for infectious diseases can therefore largely be avoided. (2) There is a small core of faculty members and oncology fellows who are largely responsible for patients in the day care center. Hence a relatively small number of individuals will need to be introduced to the consultation system, and their continuing roles in the clinic will heighten their comfortable with computer-based techniques. chances of becoming (3) There is already an awareness of, and involvement with, computers in the Oncology Day Care Center (in the form of the information display system previously described and associated video display monitors). Thus, although there is not yet hands- on computer use by oncologists in the clinic, computer-related hardware is evident and accepted by the clinicians at the outset 28 Project 1 Set 1I.A. of the proposed research. Many fellows and faculty also use the TOD system for clinical research and thus have limited, but very positive, experience with computer use. (4) Although the application of symbolic reasoning techniques to the protocol management problem will not tax many of the capabilities we have developed in the MYCIN context, it is precisely this simplicity which makes the problem appealing as a first clinical venture. If the information handling task can be implemented relatively easily within the EMYCIN formalism, as we expect it can, then we will be able to concentrate initially on issues of making the system's reasoning and 'knowledge base understandable as well as making the system's interaction acceptable to physicians. (5) The initial investment in establishing a role for interactive computing in the oncology outpatient setting at Stanford will have considerable potential for facilitating interactions between our protocol management system and the Division of Oncology's current computer-related efforts (the information display system, and the time-oriented databank). We envision some challenging extensions to the consultation program whereby physicians interacting with the protocol management system may simultaneously 'benefit from direct connections between our computer and the other oncology systems. 29 Sec. 1I.B. Project 1 1I.B. We propose core research as well as new demonstrations of the clinical usefulness of present capabilities developed under LMYCIN research. As has been discussed, we have identified an important clinical problem in the outpatient oncology clinic at Stanford, and have begun a collaboration with members of the oncology division to develop and implement a Protocol Management System (PMS) for use in the oncology clinic. Our proposal is to demonstrate that computer-based reasoning and interactive techniqes developed during MYCIN research can be effectively applied to an important clinical problem, namely the management of oncology protocol data. The infectious disease domain with which we have been involved to date involves complex reasoning and computing problems that we feel prevent the short term development of a clinically useful infectious disease consultation system. The oncology problem, on the other hand, involves large amounts of knowledge but rather simple reasoning that current techniques should be able to manage effectively. The complexities of infectious diseases, however, have provided a particularly appropriate domain for devising new computing approaches while analyzing clinical reasoning. These difficult problems remain major research interests of our group. Fje propose spending approximately half our time continuing to work on basic tools for expert medical consultation systems , using the current content of the infectious disease knowledge base without any efforts to extend its scope in the short term. Specifically, our aims during the five years of propsed research are: Artificial Intelligence Objectives (1) To implement and evaluate recently developed techniques designed to make computer technology more natural and acceptable to physicians; (2) To extend the methods of rule-based consultation systems to interact with a large database of clinical information; 30 Project 1 Set 1I.B. (3) To continue basic research into the following problem areas: mechanisms for handling time relationships, technigues for quantifying uncertainty and interfacing such measures with a production rule methodology, interactively approaches to acquiring knowledge from clinical experts. These are some of the problems we have identified that have prevented the MYCIN infectious disease application from being clinically implemented as yet. Gncology Clinic Objectives We plan to develop and implement a Protocol Nanagement System (PI%), for use in the oncology day care center, with the following capabilities: (1) To assist with identification of current protocols that .msy apply to a given patient; (2) To assist with determining a patient's eligibility for 3 given protocol; (3) To provide detail& information on response to guestions from clinic personnel; protocols in (4) To assist with chemotherapy dose selection and attenuation for a given patient; (5) To provide reminders, at appropriate intervals, of follow-up tests and films required by the protocol in which a given ,Datient is enrolled; (6) To reason about managing current patients in light of stored data from previous visits of (a) the individual patients (b) the aggregate of all "similar" patients. .Mvantages over present paper-based protocol files: (1) Can be kept readily accessible and upto-date; (2) Can provide customized patient-yecific calculations and advice not possible with 3 manual system; 31 Sec. 1I.B. Project 1 (3) May be augmented to provide important additional capabilities once interfaced with a patient data base (e.g., the time-oriented data bank [TOD] already used for retrospective data analysis by the oncology division) ; (4) can provide customized explanations of protocol information and the specific recommendations made by the management system; (5) Can improve the quality of clinical research by encouraging enrollment of all patients in an appropriate protocol, and assuring that necessary data are obtained to assure uniformity of information on patients in the individual study groups; (6) Can improve the quality of patient care by: (6a) Saving time by making protocol information easily available, thus decreasing the waiting time patients must now occasionally sustain while physicians track down necessary protocol information; (6b) Making certain that important tests are done to screen for potentially serious toxicity of the powerful agents used in cancer chemotherapy. 32 Project 1 Set I1.C. 1I.C. b!ethods II.C.1. Overview Our general approach to the research will be to emulate the organizational and technical framework used during development of several interdisciplinary computing efforts involving Stanford's Heuristic Programming Project (HPP), of which Prof. Buchanan is co-director. The cohesiveness of project workers has always been facilitated by a weekly group meeting in addition to smaller working sessions at other times. science At group meetings both computer and clinical personnel have opwrtunities to present their work and give and receive suggestions regarding further efforts. We believe it is important that the physicians and computer scientists get to know each other and their motivations for involvement in the project very well. computer scientists working on For example, the deal about infectious diseases, MYCIN have all learned a great and some have even taken formal courses in microbiology at the medical school. clinicians have been encouraged to understand Similarly, the the prcqram in depth and even to try some programming. We would expect similar relationships to develop among the computer scientists and oncologists working on the proposed research. can both computer Only in this way science and clinical concerns be taken adequately into account duringTptem design and implementation. In addition to the development of the EFlS for the oncolouy clinic, we anticipate continued research into the basic science issues discussed previously. As has been noted, we have already identified several problems that must reasoning be solved before complex programs such as MYCIN can be made available for clinical use. We also anticipate that work in the oncolcuy domain will uncover new problems, not previously encountered, that may require significant modification or redesign of the DIYCIN formalism. interrelated efforts: Thus we envision two parallel but highly (1) development of the PMS for the oncology clinic, using EMYCIX and writing new production r-ules to embody the protocol knowledge that will be needed for consultation sessions: (2) continued mapping of basic science research, from t!he core research section of this program, into the doma iii in order to facilitate oncology problsn complex decision making and acceptable consultations in the clinical setting. 33 sec. 1I.C. Project 1 II.C.2. Oncology Protocol Management System The first year of research on the PMS will be spent developing the program before it is made available in the clinic. Years 2-3 will be devoted to revisions and extensions of the protocol management system in light of initial experience with a knowledge base about oncology. Years 4-5 will be devoted to revisions and extensions of the basic methodology, as well as of the working system, to facilitate use of a clinical data base for patient management in oncology and related disciplines. We expect that the five years will be spent as follows: (1) We will beg' m by selecting the 2 or 3 most frequently used oncology protocols (e.g., Hodgkin's Disease, oat cell carcinoma of the lung, non-Hodgkin's lymphoma) . The extensive knowledge in these documents will be extracted by the oncologists working closely with those who know the EMYCIN formalism well. Although much of the knowledge can be represented in typical EMYCIN production rules, we anticipate that some information may be best contained of 'the schemes. in alternate representation We therefore expect that new techniques for interfacing FHYCIN production rules with tabular data structures may be necessary. or algorithmic Most problems that will arise along these lines should develop during codification of the first few protocols: since the protocols all follow a similar structured format, it is unlikely that new problems will arise when the 29th or 30th protocol is being considered. (2) EMYCIN's knowledge acquisition capabilities remain somewhat rudimentary (see next section), so we expect that most new rules will be explicitly written by members of the research group. (3) Specific attention will be given to extracting knowledge regarding patient eligibility for a protocol, tests and films needed at various stages of treatment, therapeutic alternatives available, and patient-specific indications for modifying or withholding therapy. the protocol details We recognize that these are that are often most difficult for the oncologists to remember or to extract easily from a lengthy written protocol (an up-to-date copy of which may not even be readily available in the clinic). (4) Once the knowledge internal testing by has been codified, we will beqin knowledge interfacing the new production rules and structures interest will with the E?lYCIN program. be the Of particvular adequacy of EMYCIN'S exolanation capabilities when interfaced with this new knowledge has;. 34 Project 1 Set 1I.C. (5) Wdifications will be made to the EWXN system in response to suggestions made by the clinicians working on the project as they gain experience with its caoabilities. Of primary concern will be an assurance that the hum& interface is sufficiently comfortable that the other Division oncologists will be willing to experiment with the system once it is introduced in the clinic. (6) After these first few protocols are operationally managed by the PMS as described, the system will be introduced in the Oncology Day Care Center. to the clinic Orientation sessions will be given oncologists, and suggestions for further refinements solicited. (7) The next 3-5 therapy protocols will then be added to the system, with appropriate when a new protocol notification to clinic physicians is available for Pp!S access. (8) Based on the experience gathered in ctiifying the first several protocols, developed. a protocol-entry system with editor will be This should greatly facilitate the entry of the remaining protocols, which we anticipate should be fully codified by the end of year 2. (9) Anticipating an interfac earlier, plus progress Q with the TOD system described in the basic research that we will be undertaking simultaneously, we will next begin to related data in TOD format within the PMS. store patient- Much of the information in the 'POD Databank is also required by the PMS, so there would be minimal if any additional effort rqdired of the PMS user. (10) Assuming a breakthrough in the representation and management of time-dependent variables, we would anticipate that the PMS capabilities would be greatly augmented by access to patient data stored in TOD format. During Years 4-5 we would attempt to begin the implementation of this kind of interface between TOD and the P?4S. All research described J&ove muld occur on a research computer that could not guarantee reliable service to the oncology clinic. Xe therefore recognize that we cannot initially undertake any tasks crucial to clinic or Division oneration. The clinic must be abie to ccntinue to function even when our tool is unavailable for scheduling or hardware reasons. 35 Sec. I1.C. Project 1 Therefore, when the PMS is ready to progress into a more integral role in clinic operation, we would anticipate, in a separate proposal, the need for a dedicated machine to permit reliable clinic service. We recognize that many of the most interesting and challenging decision making tasks, including those related to the use of symbolic reasoning techniques in conjunction with large databases, can not be made available to clinicians without a dedicated computer, but that this is beyond the scope of the present proposal. 36 Project 2 III. ; 4 WORKBENCH FOR KNXLEEGE REPRESENTATI~ 1II.A. Objectives of the Research and their Significance A- m- Our primary strategy for conducting our investigations has been to allow the problem to condition the choice of scientific paths to be explored. Projects One and Three, dealing with problems in oncology outpatient consultations and with the clinical laboratory , are the newest examples. i We are also motivated, however, by the importance (to us and others) of generalizing our techniques and systematizing our methodology. This is a normal part of the activity of cumulating the results in our science, in which the experiments we choose to generalize upon are the experimental systems we construct for different domains of knowledge. In Computer Science , one effective method of cumulating our growing understanding is construct software packages that are the working manifestation of what we believe we have come to understand. These packages allow us to transfer into tomorrow's yesterday o s "exper imental technique" "tool" for accelerating the research. These pat kages also ai 1OW investigators in other institutions to build rather directly upon tile results of our work, thereby amp1 ifying the science as a whole. It is particularly appropriate to clmulate our knowledge as software packages in the SVYIEX-AIN community in which the users share the same computer and system. We have sought to extract from our various projects the uniformities that have general applicability; to eliminate the ad-hoc features that accrue in any large-scale programming effort; and to build helpful II front-end" inter faces that will allow others to couple smoothly to our work. A number of such packages are beginning to emerge. We prowse to continue their development and test; and to merge them appropriately into a larger software system that (for lack of a better term) we refer to as the "knowledge representation war kbench" . The Stanford group is fortunate to have the collaboration of the Missouri group to act as a test-and-evaluation site for this workbench concept. It is expected that much of the research of ?rojec t Three will be done c&g the emerging "wor!&ench". 37 Sec. 1II.A. Project 2 We propose the following major objectives: 1. To develop AI technology as software packages that solve general classes of problems. 2. To actively disseminate the technology by publication and by encouraging pilot projects using the technology. 3. To apply these packages to medical applications forming collaborations over time as opportunities arise. 1II.B. Background and rationale Artificial intelligence research at the Heuristic Programming Project has concentrated on programs having real- world applications. Each program has been a case study for representing and manipulating the task-specific knowledge for an application. Feigenbaum 1221 has described this case study approach as essential in building a science for "knowledge engineering". Because the cases have been carefully chosen, the experience from this approach has accumulated. For example, the GA1 program [53] was developed recently for inferring DNA structures from enzyme digest data. This prcgram used the Generate-and-Test paradigm - in which the combinatoric output of a complete and canonical generator of possible structures is limited by pruning rules which use the digest data. That basic approach was pioneered by the DENDPAL [ll] program ten years ago. With DENDRAC as an example, the development of this analogous program was completed in only tm months. This example shows how the accumulation of theory speeds the development of new AI programs. Significantly, the Heuristic Programing Project has also accumulated methods - in the form of software packages which can perform specific symbolic computations. These packages are the state-of-the-art tools for applied artificial intelligence. A trained "knowl&ige engineer" can combine these packages to create computer prcgrams for new applications - without having to re-program the solution of standardized subproblems which have been solved before. 38 Project 2 Set 1II.B. E4YCINl is an example of such a package. It is the domain independent core of the MYCIN [51] program for the diagnosis of infectious diseases. E4YCIN provides a framelark for building consultation programs in various domains, It uses a production rule mechanism and backward-chaining control structure during the solution phase and has dialogue production rule knowledge base. facilities for acquiring a EDIYCIN is the PUFF system for An example of an application of disorder. diagnosing pulmonary function PUFF was the product of a collaboration with the Pacific Medical Center in San Francisco. the first version of PCTFF was built in the following way. One hundred cases, carefully chosen to span the variety of disease states were used to extract 55 rules. The knowledge base was created with E?JYCIN and then tested with 150 additional cases. -Agreement between PUFF and the human sxpert was excellent and a later version of PUFF is now in routine use at PK. The first version of PUFF was created in less than 50 hours of interaction with experts at PK and with less than 10 man-weeks of effort engineers. by the 'knowledge Other applications of E4YCIN will be discussed in the Section 1II.C.. The example shows that methods, in the form of usable computer packages, have. now been developed. reflect the commonalities we These packages now perceive applications. among separate They are the recently available tools of applied artificial intelligence - programs providing practical symbolic methods for common problems. Our current repertoire of "methods" packages also include the Unit Package, and AGE-l. The EMYCIN program, as discuss&! above, is based on production rule technology and has been successfully appliecl to diagnosing pulmonary function disorders and consulting on structural analysis in application. an engineering The Unit Package [52] is based on the so-call& "frames" approach and is being applied to experiment planning in molecular genetics. The AGB-1 program is based on the HEARSA,Y 1201 "cooperating knowledge sources" model and is the product of experience with the SU/X and SU/P [43] programs. New applications are currently being developed for each of these packages. Heiser and Brooks at the California at University of Irvine are using EMYCIN to ?ieVdOp a psychopharmacolcgy consultant, termed HBAOMED [34]. Blum f5] has proposed using the Unit Package in a systan which will combine StatiStiCZii methods and artificial intelligence tec;hniques to perform studies on a clinical database. SS7erai other applications have been proposed and are under consideration. I_----- -- 'The nane " EqYCIN" comes from "essential XKIN", the !QJCIN reasoning framework without any domain-specific knowl+e. 39 Sec. 111-B. Project 2 We propose to continue the development and application of these packages and to develop new ones as results become available from core research. 111.8.1. Relating the Workbench to Core Research -- Over the five year course of this research, there will be a movement of topics from core research into developed packages for the workbench. Our overall strategy has two main thrusts: 1. To expand the problem solving capabilities of the workbench by developing more sophisticated methods of symbolic reasoning. 2. 'Ib expand the capabilities of existing packages following core research in other topics - knowledge acquisition, knowledge integration, tutoring, and explanation. This mode of research reflects a bias towards the creation of systems to perform specific tasks. First an approach to problem solving is developed and tested in a task domain. Then research in other topics follows. Three methods of problem solving are discussed in this proposal and elaborated in the following. The simnlest of these is a backwards chaining approach - exemplified in D4YCIN - which links together the premises and conclusions of rules to construct a direct line of reasoning. The next level of sophistication in these packages is represented in the AGE-l which is based on the HEARSAYII [20] architecture. AGE-l allows (1) both data-driven and goal-driven reasoning and (2) reasoning at different levels of abstraction. This architecture has been used effectively by Stanford researchers in a signal-processing application [431* Providing other AI capabilities - such as explanation or knowledge acquisition - is more difficult in AGE-1 than in ENYCIN. The next level of sophistication appears in a proposed "planning package" which is expected to grow out of on-going research in the MOXEN project. This approach to planning formalizes the selection of what to do next as a choice in any of several problem-solving "spaces". The viability of the latter problem- solving method is still being tested and essentially none of the other system capabilities have been developed. The following is a list of several AI issues discussed in this proposal. These will be explored within some formalisms 40 Project 2 Set 1II.B. already developed by us, EMYCIN, AGE-l, and the Unit Package 2 as well as new formalisms,e.g., the Planning Package as the need arises. The planning package is expected to materialize at the end of some core research which is currently in progress. Problem Solving Knowledge Acquisition Explanation Tutoring iKnowledge Compiling Time-Dependence Meta-Knowledge 1II.C. Nethods of procedure - This section describes our plan for creating an integrated collection of well-designed software packwes, which can be combined by a knowledge engineer to meet the needs of a specific application. In this section we will show examples of each of the packages and discuss the nature of their applications. We will also discuss the work proposed for further developing the packages. There is a great deal of overlap in the proposed work among the packages. While the packag es problem solving and differ reflect different approaches to analogous lines in their state of develoment, of research are proposed in each. The PYCIN px kage , which is the most developed, uses approach to problem solving the the simplest and has the broadest range of proposed wxxk following several lines of core research. .&s discussed already in Section III.B.l., similar lines of development are planned later in the grant period for the other pat kages . III.C.1. EMYCIN The BHYCIN ("Essential NYCIM") project is an attemnt to provide a framework for building consultation progr,zns in various domains. It uses the domain-independent comwnents of the LWCI~ 'The Unit Package is a pa=- does not provide ar,y -=ive representation package and used, software for problem-solving. however, as a It is being representation medim for the Planning Psck~ge and can also 'be used in conjunction with AGB-1. Sec. 1II.C. Project 2 system, notably the production' rule mechanism and backward- chaining control structure. Then for each consultation domain, particular the system builder supplies the rules and parameters of that domain to produce a functioning program. Work on the ElMYCIN project is devoted to providing a useful environment for the new system builder, with emphasis on speeding the acquisition and debugging of the knowledge of the new domain. 1II.C.l.a. An Example of EMYCIN - - - The PUFF Application h-P The PUFF system for the interpretation of laboratory measurements from the pulmonary function laboratory. The EMYCIN system was used as base upon which 60 production rules concerning the presence of pulmonary disease were created. The data from over 100 cases were used to create the rules by the pulmonary physiologist in cooperation with the biomedical engineers who instrumented the laboratory and Stanford computer scientists who had previous experience with the MYCIN program. of Figure 1 shows several rules created during the development the system. These rules are used to create a complete report including the input measurements, historical information, and the measurement interpretation. Figure 2 shows a copy of this report. IF 0 < DLCO < 80 (DLCO is the measurement of diffusion capacity for Carbon Monoxide) THEN "Low diffusing capacity indicates loss of alveolar capillary surface which is " IF 70 <= DLCO < 80 THEN "mild" IF 60 <= DLCO < 70 THEN "moderate" IF 0 <= DLCO < 60 THEN "severe" IF The severity of obstructive airways disease of the patient is greater than or equal to mild, and The degree of diffusion defect of the patient is greater than or egual to mild, and The total lung capacity measured by the body box (TLCB) is greater than 110 percent of predicted, THEN "The low diffusing capacity, in combination with obstruction and a high Total Lung Capacity, would be consistent with a diagnosis of emphysema." The subtype of obstructive airways disease is emphysema. Figure 1. Typical PUFF interpretation rules. Conclusions are made for internal system use and for inclusion in the summary. 42 Project 2 Set 1II.C. PRES%YTERIAN HOSPITAL OF PMC DOE JANE 5S2 CLAY AND BUCHANAN, BOX 7999 P336666. SAN FRANCISCO, CA. 94120 DR. =ITH, JOHN PUWONARYFUNCTION LAB KJl' 56.7 KG, HT 166 CM, AGE 58 SEX F SMOKING 40 PK YRS,CIG 1.0 PK QUIT O,PIPE 0 QUIT 0, CIGAR 0 QUIT 0 DYSPN.EA-W/MILD-MOD. EXER, COUGH-NO , SPUTUM-LT 1 TBS, MEDS-YES REFERRAL DX-CORONARYAPTERY DISEASE , PRE OP ****+t*tt***f*****************X**ST DAm 10-26-78 PREDICTED POST DILATION (+/-SD) OBSER(%PRED) OBSER(%PRED) INSPIR VITAL CAP (WC) L 3.1(0.4) 3.0 ( 98) RESIDUAL VOL (RV) L 2.1(0.3) 3.0 (140) 3.5 (166 'Y?ITAL LUNG CAP mJJ3 L 5,2(0.7) 6.0 (116) 6.5 1 (125 RV/TLC % 40. 49. 53, FORCED EXPIR VOL(FEV1) L 2.6(0.3) 2.1 ( 81) 2-l t 84) FORCED VITAL CAP (FVC) L 3.1(0.4) 2.9 ( 95) 3.3 ( 98) FEWFVC % 83. 70. 71. FORCED EXP Fm 200-1200L/S 4.2(0.8) 4.5 4.4 FORCED EXP FLCW 25-75% L/S 2*9(0.7) 1.5 1.5 FORCED INS FLOW 200-1200L/S 2.9(0.6) 2.9 2.9 AIRWAY RESIST(RAW) (TLC= 6.0) l.l(O.5) 1.6 (WZI) 1.4 DF CAP-HGB=14.4 (TLC= 5.3) 25. 17.2 ( 68) ( 69%IF TLC= 5.2) *******************~***********************~******************* INTERPRETATION: Elevated lung volumes indicate overinflation. In addition, the RV/TLC ratio is increased, suggesting a mild degree of air trapping. Forced vital capacity is normal but the FEVl/FVC ratio is obstruction of a mild degree. reduced, suggesting airway Reduced mid-expiratory flow indicates mild airway obstruction. Obstruction is indicated by curvature in the flow-volume loop of a small degree, Following bronchodilation, the expired flow shows slight improvement. This is confir!& by the lack of change in airway resistance. The low diffusing capacity indicates a loss of alveolar capillary surface, which is moderate. CONCLUSIONS: The low diffusing capacity, in combination with obstruction and a high total lung capacity would be consistent with a diagnosis of emphysema. The patient-s airway obstruction may be caused by smoking. Discontinuation of smoking should help reiieve the symptoms. PUDCWRY FUNCTION DIAaOSIS: 1. Mild Obstructive Airways Disease. Emphysematous type. Figure 2. Robert Fallat, M.D. Sample PC'FF Report 13 Sec. 1II.C. Project 2 1II.C.l.b. Applications of EMYCIN w- To date, EMYCIN has been successfully applied at Stanford to the domains of pulm nary function (PUFF) [37] and structural analysis (SACON) [3]. 9 EKKIN is also being applied to clinical psychopharmacology [34] at the University of California at Irvine. 1II.C.l.c. Proposed Work for EMYCIN --- SYSTEX-BUILDING MOLS 1) Acquisition of Knowledge - Acquire the 'SACON (Structural Analysis Consultation): The purpose of the consultation ' 1s to provide advice to a structural enuineer regarding the use of a structural analysis program called~KAX, The LMARC program uses finite-element analysis technigues to simulate the mechanical behavior of objects. The engineer typically knows what he wants the MARC program to do, e.g. examine the behavior of a specific structure under expected loading conditions, but does not know how the simulation program should be set up to do it. The MARC program offers a large (and, to the novice, bewildering) choice of analysis methods, material properties, and geometries that may be used to model the structure of interest. The user must learn to select from these options an appropriate subset that will simulate the correct physical behavior, preserve the desired accuracy, and minimize the (typically large) computational cost. The goal of the SACON program is to bridge this gap, by recommending an analysis strategy. This advice can then be used to direct the MARC user in the choice of specific input data, e.g. nmerical methods and material properties. The performance of the SACON program matches that of a human consultant for the limited domain of structural analysis problems that was initially selected. To bring the SACCN program to its present level of performance, about two man-months of the expert's time *ere reguired to explicate his task as a consultant and formulate the knowledge base , and about the same amount of time implementing and testing the rules (this estimate does not include the necessary time devoted to meetings, formulation, demonstrations and report writing). problem 44 Project 2 Set i1I.C. framework, vocabulary, and decision rules of the domain from the expert. 2) Rule Checking - Check syntax and semantics of new rules and check for possible conflict with existing rules. 3) Alternative Models for Reasoning under Uncertainty - Provide the system builder with a fixed set of alternative methods for propagating degrees of certainty in the reasoning chains. a~ Time-Dependent Features - Enable the systemi to make use of parameters whose values change with time. 5) ?leta Knowledge - Add capabilities for using meta-rules and other meta-level knowledge. In addition, we propose extending the power and flexibility of the present system in the following ways: K?MAI?J-mDE?ENDENT CONSUL,TATION SYSTEM 1) Answering Questions - Incorporate guestion- answering capabilities into the system. 2) Tutoring - Couple the system to a tutoring program to teach the contents of the knowledge base. Many of these items involve substantial research before we understand the best way to add them to the program or even what, precisely, needs to be added. We present below our best ideas on the approach we will take, but wish to emphasize that the nature of the solution may change as our research progresses. The products of the research will be presented in scientific papers and in an intagrated ccmputer program that can be useJ by scientists to encode their own 'knowltige of their domains for reasoning about difficult problems. 45 Sec. 1II.C. Project 2 1II.C.l.d. Acquisition of Knowledge - The preliminary facilities for acquiring knowledge (called TEIFESIAS [Davis76]) developed in the context of the MYCIN application will be incorporated into EBIYCIN for use by experts when building any consultation system. This facility will allow an expert to specify the major parameters of a consultation. Then, following a consultation, the system will show the expert the values of these parameters, and ask for verification that they are correct. If the values are not correct, the system will explain to the expert the line of reasoning that led to the incorrect values. This allows the expert to pinpoint an error in the system-s rule set, which the expert can then repair by adding, deleting, or modifying rules. In addition to incorporating the existing rule-acquisition facility, we plan to automate the aquisition of a large portion of the initial knowledge that is required in building a consultation system. The system will prompt an expert through an intermediary for the conceptual framewOrk, vocabulary, and major lines of reasoning of the domain before any rules are enter&. The conceptual framework includes the definition and hierarchy of objects or states that will be used to structure the reasoning process (called the "context tree") as well as the attributes and values of these objects that will be used for writing rules. Nlrmerous internal pointers needed for correct associations among concepts will be set up autcnnatically at this time. Improvements to Teiresias - The TEIPESIAS facility, for interactively debugging the rule base, is most useful when the knowledge base is reasonably well developed and the necessary changes to the rule and parameter base are small. This facility is currently being improved primarily by using the existing question-answering system to explain the systemas lines of reasoning [48], and by using a new English parser based on a semantic grammar to understand any rule additions or changes from the expert [8]. An EXYCIN sketchpad As a result of our recent etiperience eliciting - arule base for structural mechanics [31, we have found it useful to characterize the knowledge acquisition process as occzurring in a nWer of distinct phases. The first phase corresponds to making initial decisions about the typical advice the consultant will give and the major reasoning steps the consultant will use. 46 Project 2 Set II1.C. This is followed by an extended period of defining Farameters and objects and then, using developing this initial domain vocabulary, a substantial Fortion of the rule base. This process, lasting approximately 2 months in the structural analysis case , captures enough domain expertise to allow the consultation system to give advice on the large number of common cases. In the final phase, further interactions with the expert tend to refine and adjust the established rule base, primarily to handle more obscure or complicat4 cases. Future research on knowledge acquisition will explore the design and implementation of interactive facilities to 'be usti during the early phases of the knowM!ge base design. In particular, methods will be developad for rapidly acquiring and manipulating definitions of the context tree of objects, their major parameters, as well as the major problem solving strategies to be used by the consultant. During the initial passes at defining objects, the system muld begin to acquire some detail about the actual methods (the rule sets) that will be used to reason about the major parameters of the consultation. For each of these parmeters the expert typically knows what major factors and subgoals will be relevant to concluding the parameter. These factors can be specified by the expert, but need not be acquired in detail until the system actually must begin gathering the rules for determining these imyrtant parameters. In this manner, the expert can be free to concentrate on the more general process without having to aspects of the problem solving be bothered with the spzification of detail. Using the EMYCIN sketchpad, the expert and intermediary would develop and acquire substantial Iprtions of the knowledge base and an explicit representation of the overall reasoning strategy that the program will use to advise about the user's problem. This framework and knowledge of overall strategy can be used later to motivate explanations of the system*s lines of reasoning produced by the question-answering system, We intend to investigate ways that this knowledge about the major parameters could be used by TEIEESIE (during the later phases of the knowledge acc@sition process) particular, to explain how and why a incorrect conclusion was made. Rule Caecking Sec. 1II.C. Project 2 While the production rule format permits any executable LISP expression as the premise or action of a rule, not all LISP forms make reasonable rules. Common syntactic errors include misspellings, misplaced argments, parenthesis errors and incorrect classification of the rule; such errors generally result from inaccurately inputting the rule, and if left undetected, may cause the rule to fail, or even cause runtime errors. Seznantic errors can result if a new rule is inconsistent with existing rules, or is incomplete, failing to take into account all the factors necessary for the conclusion. We plan to do extensive checking of each new rule entered into the system. We hope thereby to catch most errors at rule entry time, rather than finding them during later consultation runs when it is harder to (a) isolate the effects of a faulty rule and (b) correct any problems which result. Syntactic checking is fairly straightforward, The rule checker needs to know about the syntax of each argument to the predicates which make up a rule. form of predicate templates, This knowledge exists in the which have long been used by other parts of the system to "read" rules. The rule checker's use for them is, in effect, to make sure the rules are 'readable". For example, the template for the predicate SAME is (SAME CNTXT PAIMVALUE), for which a typical instance from the infectious disease domain might be (SAME CNTXT IDENT E.COLI). 'Ihe rule checker knows from this that a call to SAME should have three arguments: the first must be a legal "context atom", i.e., a variable used to select a binding in the context tree, the second must be a parameter, and the third must be a legal value for that parameter. If any of these is incorrect, the error is easily detectable, and in many cases correctable. Simple spelling errors may be corrected by invoking INTERLISP*s spelling corrector , using an appropriate spelling list: e.g., for the PAR4 slot use the list of all parameters, for the VALUE slot use the list of values legal for the parameter appearing in the PEml slot. (typically a Transposed arguments and spurious extra argluments result of parenthesis errors) detected by checking against the template. are also easily Another common syntactic error is incorrect classification of a rule, i.e., specification of what type of context apply to. it may In many cases it is possible for a rule checker to completely determine the correctly classification, simply by observing which parameters appear in the rule and comparing with the known structure of the context tree. At worst, the checker 48 Project 2 SPC 1II.C. could narrow down the possibilities to a small set of nodes of parallel structure. More subtle errors arise from fundamental "semantic" errors in a new rule, and the processing required to detect such errors is correspondingly more complex. error is inconsistency of a new One major type of semantic rule with existing rules. One rule might subsume another, i.e., another. one premise is For example, with the two rules implied by A -> x A & B -> X, the first subsumes the second, The error here is that if the second rule succeeds, the first will also, and the information A is contributing twice to the conclusion X. Our mcdel is predicatd on certainty factor rule premises being independent; subsumption is a blatant violation of that assumption. -Another possibility is that one rule might contradict another rule or rules. This is trickier. rules Certainly the two A -> X A -> 'X contradict each other. fairly unlikely: more But such obvious contradictions are subtle interactions can occur. For example, given a set of rules A->B, B->C A -> D, D -> -C it is difficult to determine whether there is a contradiction except in the special case that all the rules have definite conclusions (CF=1.0). But if the confidence attached to those conclusions is less than definite, there may be no direct contradiction at all, merely conflicting tendencies, perfectlv admissible under our certainty factor model. 'We investigate means of plan to analyzing rules to uncover contradictions, measure how great a conflict may exist, pssible to determine if the conflict is a real problem. and ways -Another type of semantic error may occur if a rule fails to take into account all the information relevant to a conclusion. The system can sometimes detect this by means of r1ul.e models, which currently consist of statistical observatiz of the correlation of parameter occurTences in existing rules [15]. These rlule models are constructed automatically by reading the rlules * As a tyflical use also mention , if rules mentioning parameter x usually Tarameter confirmation of a y, then the system might new rule which considers request increase the only x. ric.hness of the rule model language, ;Ve plan to to enable better semantic checking of the user*s rules, especialiv during early acquisition phases, when there do not exist sufficient rules to Zorin useful rule RlOdeiS on purelv statistical arou.rd-- 2 A L. 19 sec. II1.C. Project 2 For example, the user might wish to describe in some brief fashion the sort of rules he is about to enter, and the system could then make sure the rules are actually consistent with the userOs model, 1II.C.l.e. Alternative Models for Uncertainty Reasoning- under The method developed for ranking NYCIN's hypotheses based on measures of certainty is an approximate method. It developed from a pragmatic need for measuring the degree of confirmation of a hypothesis based on several non-independent (partially overlapping) pieces of evidence. The certainty factor (CF) model discussed above is a means of combining single "certainty factors" associated with each inference to arrive at a reasonable measure of how strongly the evidence supports each hypothesis. It is reasonably simple to understand. However, its main drawback lies in the difficulty of associating a CF with a single rule. Because the rules are not independent, the CFs are also not independent. This means that adding a new rule involves looking at similar rules in order to decide how high the CF ought to be set. For some experts (or problem areas), CFs seem to be more difficult to use than for others. Thus we propose to offer the system builder a choic e of evidence accumulation methods. One of them will be the CF scheme already in use. A second will be the likelihood ratio scheme used in the PROSPECTOR system [la], although that requires storing two measures with every inference: P[H/E] and P[H/-E]. A third method will be a very simple additive measure with thresholding , as proposed by one of the physicians working with MYCIN. In this model, measures of positive and negative evidence are added and subtracted into a total for each hypothesis, with action taken on the hypotheses in the end that lie above the threshold. Under other funding we are exploring other relationships between evidence and hyptheses. As measures are found that can be fit to new problem-areas we will find ways of adding them to the set of available confirmation methods. The impxtant pint 5ere is to give the systPJn builder a choice of evidence accxmulation sche-nes, any of which can !x usefi in EM!KIN. 50 Project 2 Set 1II.C. Time-Dependent Features A consultation system built under the current design of EMYCIN takes a snapshot of the available information about a case and makes a one-time evaluation of the situation. In cases where the nature of the diagnosis or repair is strongly dependent on an understanding of the proces? of failure over time, this static approach to the problem is inadequate. No provision is made in the present system for considering the same case later when more several days information is available or when the values of some parameters have changed. The system also lacks a mechanism for dealing with parameters whose values vary with time. In many domains, time considerations may be crucial to the solution of even the simplest problem. For example, it might be critical to track the values of various parameters over a period of time, or to check what value existed at a particular time in the past. In order to increase the number of domains in which EWKIN systems will be useful, we plan to add two new features. The first is a "restart" mechanism that will allow a user to run a follow-up consultation on a stored case, adding information that has become available since the - original consultation, and correcting old answers that are no longer accurate. The second is to expand the syntax and semantics of rules to deal with values of parameters changing over time. Follow-up Consultations The builder of an EMYCIN system should be able to specify which parameters are likely to change for a given case from one consultation to the next. In a follow-up consultation, the system should summarize its knowledge of the case and do the following three things: 1) ask whether new information is available for any of the parameters which are subject to change, and prompt for the new answers; 2) ask whether values are known for any of the paremeters whose values were UN'KNW at the time of the previous consultation, and prompt for the new answers; 51 Sec. 1II.C. Project 2 3) allow the user to specify changes which may have occurred in the values of any other ¶meters ( viz., change). those which do not usually EWending the Rule Syntax and Semantics to Deal with Time Relations - - -w -- The builder of an ENYCIN system should be asked to classify parameters according to their stability over time. classification scheme is shown below. A possible 1) Constant - value is always the same (e.g., Name and Sex of medical patients) 2) Regularly changing - new value is available at regular intervals; there will be several values stored for the parameter, each with a time (e.g., barometric pressure at a certain city) 3) Gradually refined time, - value is likely to change over from unknown to uncertain to definite (e.g., Identity of an organism growing on a culture plate) Parameters of the first type are the typical case that Dl!KIN now handles. For the second type, a time must be kept with each value-CF pair. The third type of parameter will typically change from one consultation to the next, and previous values will be discarded as new information becomes available. New PREMISE and ACTION functions must be defined so that EMYCIN rules can handle time-varying parameters. Functions will be needed to test and conclude (a) the value of a parameter at a given time, (b) the duration of a particular condition (e.g., it has been raining for three hours), and (c) trends in the values of numeric parameters (e.g., the volume of water in the tank has increased within the last hour). As we test EMYCIN in different domains, we may discover other types of tests and conclusions that must be made on time-dependent parameters. Add Capabilities for Using Meta-Rules and other Neta-Level Knowlez~ -- Cur preliminary research with meta-level knowledge [E] as 52 Projet 2 Set II1.C. well as, our preliminary experience with the GUIDQN tutorial program has shown the importance of acquiring, using and teaching structural and strategic meta-knowledge, as well as the domain rules. Structural meta-knowledge provides a framework that sets the context for domain rules, rules memorable to a student. and in tutoring helps make the It might include patterns and principles that are made specific by groups of rules. Strategic meta-knowledge constitutes planning knowledge for using the rules to solve different problems [10]. This meta-knowledge is written as meta-rVules and takes the form of strategies and domain-dependent diagnostic reasoning approaches for efficient consideration of a case. In o'ur work with EMYCIN, we will explore various kinds of structural and strategic meta-knowledge that is appropriate to the production rule representation and useful for explaining decisions made by the program (to a consultation user or a student). We will start by implementing in EXYCIN the capabilities for using the meta-level 'knowledge described by Davis: meta-rules to be used for pruning and reordering the object-level rules, and meta-level models of rule sets that aid in debugging (and tutoring) the domain knowledge. Experience with MYCIN programs like HEADMED and PUFF will provide us with particularly useful case forms of meta-knowledge. stdies of possible Incorporating Question-Answering Facilities into the System In order to make the questions-answering facility available to an EXYCIN consultation system, the system must be provided with a dictionary of synonyms and a list of definitions of the important concepts in the its domain of dictionary will contain expertise. The common synonyms in the domain, pointers between English words and parameters, and common phrases in the domain that can be given a single specified meaning. We will provide a facility for automatically constructing a dictionary from the parameters in the knowledge base. The system builder will also be able to add synonyms and fill in parts of the dictionary that cannot be created automatically. This should provide all the information necessary for answering standard questions about the consultation system. The kinds of questions that the system will be able to answer are: 1) the value of a parzmeter sec. II1.C. Project 2 2) how a parameter was used or concluded in the consultation 3) how a parameter is used or concluded in general 4) how a rule was used in the consultation 5) why a question was asked during the consultation 6) the translation (into English) of a rule 7) the definition of a concept These question forms. types will be recognized in a variety of For example, all of the following will be taken to be equivalent ways of asking for the value of a parameter 1) What is the value of X? 2) Is Y the value of X? 3) WhatisX? 4) Do you know what X is? The major benefits of providing these capabilities are that the user of a consultation system can understand the reasoning and the designer of the system can find the sources of reasoning errors. Coupling a Tutorial Svstem to EMYCM --- Work on the idea of automatic "Transfer of Expertise" from a human expert to a program [22], [15] has led to important advances in the representation of knowledge within the program. These advances have allowed the systems to explain their reasoning process to users, thus providing the basis for a tutor ial Frcqrmn. We have been building an intelligent computer aided instruction (ICAI) program [12] that guides a subject throFh problems in a complex domain with the goal of transferring the system's knowl&.ge of the domain to the student. 54 Project 2 Set III.C, Current ICAI techniques like planning mcdelling the the discourse, student, and teaching problem solving strategies all take a natural form in our system. In turn, the system serves as an excellent environment for experimenting with unsolved problems in the design of computer-based tutoring. We have demonstrated the feasibility of using the MYCIN knowledge base for teaching as well as for consultation, and this aspect of our research will be continuing during the grant period under separate funding4. We have not yet demonstrated the generality of the tutorial program GUIWN, in other domains; but avoid& introducing we have meticulously any domain-specific knowledge into GUIDON's control structure and teaching strategies. We believe that its design is as general as MYCIN's. Thus, all that is needed for tutoring in another domain will be (a) domain rules for EFIYCIN to use on cases which GUIDON can discuss and (b) domain specific meta-level knowledge that would be useful rules. for teaching these Moreover, we must keep the tutoring stritegies of GUIDON coupled to the representation of EMYCIN systems that we wish to tutor. III.C.2. Am-1 The basic idea behind AGE-i is to generalize the ideas found in specific problem-solving systems and make them available in a package - hence the name AGE, for "Attempt to GRneralize". AGE-l takes an active role in assisting a knowledge engineer in constructing a performance system. The soecific model that is incorporated in AGE-l - the model" - "cooperating knowledge sources was pioneered in the HEARSAY11 system ([20], speech understanding. [33]) for It was further developed by Stanford researchers in two data interpretation problems - SU/X and SU/P (otherwise known as HASP and CRYSALIS) [43]. III.C.2.a. Exampies from AGE-l m- The CRYSALIS program [19] is a 'knowldge-based program being developed in collaboration with the C,iiifornia at San Diego. University of from X-ray crys Its task is to infer protein structure tallcgraphy data. This program was deveiow in -- -- 4Joint proposal to Office of Naval Research, Personnel and Training Division and Advanced Research Projects Agency. Sec. 1II.C. Project 2 close collaboration with the AGE group at Stanford and has been using a very similar problem-solving model. Currently the top- level of CRYSALIS is being rewritten using the AGE-l package. Examples from the CRYSALIS program are used below to illustrate the problem-solving model in AGE-l. The Problem-Solving i%del AGE-l uses a uniform multi-level data structure, termed the "blackboard", to hold the status of the system. In CRYSALJS, the blackboard is used to hold various crystallographic data and structural hypotheses. Separate hierarchically organized panels of the blackboard correspond to "electron-density" space and "protein-model" space. These correspond roughly to data space and hypothesis space except that the electron density space has two levels of hypotheses above the electron density data. The protein-model space describes the three-dimensional structure of the protein at different levels of abstraction from the atomic level to the large-scale structural features like "beta-sheets". Skeletal Level Stereotypic Level / (backbone - graph (helices, beta-sheets) I of density nodes) I I I I Nodal Level I Superatomic Level I I (high intensity points) I (Side chains, proline) I I Parametric Level I Atomic Level I (electron density data) I (C,N,Fe etc.) I I I Electron Density Space Protein Model Space A set of procedures termed knowledge sources (KSs) are used to form and link the hypotheses on these panels. In the CRYSALIS application, these knowledge sources include such domain specific operations as skeletonization, helix identification, sidechain identification, bond rotation, sequence identification, cofactor identification, and heavy atom identification. The knowledge sources are expressed as production rules. AGE-l provides a framework for coordinating the activity of the KSs mixing goal- driven and data-driven reasoning as it searches for solutions. If the KSs had been perfect, the coordination could have be 56 Project 2 Set 1II.C. directed in a goal-driven manner analogous to the production rules in EMCIN. However, because of gaps in the theory and implementation of the individual KSs and noise in the data, they are individually incomplete and errorful. Like the BEARSAYII system, AGE-1 `uses an algorithm - and test paradigm - a version of the hypothesize which emphasizes cooperation (to help with incompleteness) and cross-checking (to help with errorfulness) . During the hypothesize part of the cycle, a KS can add a hypothesis to the blackboard: during the test part of the cycle, a KS can change the rating of a hypothesis in the blackboard. This process terminates when a consistent hypothesis is generated satisfying the requirements of the overall sdution or when knowledge is exhausted. In AGE-l, the hypothesize-and-test paradigm is formalized as a control structure with three levels. The first level is the hypothesis-formation level. KSs on this the blackboard panels. level make changes to In the hypothesize and test paradigm, they put hypotheses on the blackboard and test the hptheses of other KSs. A rating is associated with each hypothesis to store the overall judgment. Inmediately above the hypothesis-formation level is the KS-activation level which contains two KSs. The KSs are called the "event-driver" and the "expectation-driver" and correspond to data-driven and goal-driven policies for activating 'KSs on the first level. The highest level of KSs is called the strategy level. is to a solution, This level must decide (1) how close the system (2) how well the KSs on the second level are performing and (3) when and where to redirect the focus-of- attention in the data space. KSs on this level can invoke KSs on the second level. This problem-solving method is more complex and more general than the backward-chaining approach used in EMYCIN. It is designed to tolerate errorfulness in the data and in the KSs and allows the inferences direction. to be run opportunistically in either It also allows the inferences to be run at several levels of abstraction. Using AGE-l to Build a Knowledge-based System -p--e The purpose of the AGE-1 systsrn is to assist a computer scientist at building a problem-solving system. AGE-1 is intended to speed up process task when the task domain can be cast in the model of cooperating knowledge sources. To this end, AGE-1 has several software subsystems - a "TUTC)R" subsystem ai-ld several knowledge acquisition subsystems. The TUTOR is a module for the unfamiliar user brhich helps 57 sec. 1I.C. Project 2 him create an application program. It guides the user through a topdown design of his system by presenting him with a list of topics and subtopics at each level. Canned text is available for explaining the choices at each level. A "browse" option is available for random perusal of the topics and subtopics. Knowledge about the parameters of the application program is acquired by the DESIGN subsystem. The DESIGN subsystem provides the user with choices at each phase of the construction of the application program. This construction involves choices for hypothesis structure, rule acquisition, goals, ard expectations. Thus, the domain dependent particulars for each of the components of the aslication program are asked about in turn. 'KS 1. 2. 3. 4. 5. For example, the following items must be acquired for each preconditions inference levels links hit strategy local variable bindings The acquisition of each of these items is further broken into the most primitive elements. The DESIGN module has a "guided" approach for the novice and an "unguided" approach in which an expert calls for the knowledge acquisition functions quickly and directly. III.C.2.b. Applications of AGE-l -- The CRYSALIS example illustrates the most comprehensive application of AGE-l. AGE-l has also been used on an experimental basis to create a version of RJFF Section 1II.C.l.b. and on some cryptography problems (simple code-breaking). These applications have been used for testing the tutorial and knowledge acquisition components of AGE-l. 58 Project 2 See 1II.C. III.C.2.c. Prowsed Work for AGE-l Ye- In the current version of AGE-l, the DESIGN module provides choices and explains them with canned text. AGE-1 does not build up its own knowledge of the user*s application - only a knowledge of the design choices that the user makes. It does not make inferences about the relationships between design choices - so that it does not infer choices for-the user even when one set of choices implies another set. We plan to move toward a system where AGE-1 will ask the user about the domain and play a more active role in making the design decisions. This means that AGE-l must have a model of "how to build a system' and that we must encapsulate the reasons behind the design choices. Our plan is to begin to capture this information in the form of production rules which relate the form of the domain knowledge to the design choices of AGE-1 to a prediction of the performance consequences in the application program being built. Accompanying this effort we would like to beuin construction of two explanation subsystems - one for explaining the activity in the design phase and one for explaining performance of the application system. We expect to build on the explanation work in the MYCIN system for this. In the long term, we also plan some work on knowledge compiling. Our plans for this in the EMYCIN system have already been discussed. There is some experience in compiling the knowledge of a cooperating knowledge source system - notably the ??Y [39] system which can be seen as a "compiled" approach to the task performed by HEARSAYII. Pl!uch more work is neded before this could be done automatically. III.C.3. The Unit Package m- The Unit Package is a frame-structured representation q&em developed as a tool for building knowledge bases in the PQLGEN project. Unlike EXYCIN and AGE-l, the Unit Package provides no problem-solving framework. Eowever, the Unit Package can be used as a passive representational medium in conjunction with specific problem-solving approaches. Two approaches to experiment planning are being developed in this way as part of research in the XXGEN project. The Unit Package is also accessible frcm within the AGE-1 package. The Unit Package builds on a substantial amount of war!< (both here and elsewnere) sec. 1II.C. Project 2 on frame-structured languages. A comprehensive description of this work is available as a tec,hnical report [52] which is included with this proposal. Knowledge in the Unit Package is organized in a semantic network of nodes and links. Following other mrk on frames [42], the nodes are called "units" [6] and the links are called slots. The major software components of the Unit Package are (1) an interactive editor for adding new information or modifying existing information, (2) a set of routines for matching and manipulating descriptions, and (3) a set of access functions which maintain network relations (such as inheritance of properties) and provide an extended address space to hold the semantic network. III.C.3.a. Examples from the Units Package --- The Unit Package is a fairly extensive set of software for defining the symbolic entities of a domain. of conventions and methods for defining It provides a number standard kinds of relationships between the symbols. domain There are three main steps building a knowledge base for a with the Unit Package. The typical user of the Unit Package is a computer scientist, although four geneticists on the MOLGEN project routinely use the Unit Package. The main steps are using the interactive editor are as follows. (1) Define the symbols of the domain. These symbols take the form of units as illustrated below. (2) Define the operations which manipulate these symbols. Operations are proc&lural knowledge in the form of production rules or LISP functions. (3) Define an approach for problem solving. The steps are not necessarily performed in this order or by one person. In an evolving knowledge base, the user uses the editor both to create new symbols and to modify old ones as his Iunderstanding improves. The expertise to define all of these things may be spread over several pople working on a cmon 'knowledge base, 65 Project 2 Set 1II.C. "Specialization" is a relation which is indicated by a user when he defines a symbol. It is used to indicate subclasses among concepts - e.g., g is a the unit for the restriction enzyme Eco specialization of the unit for general restrict= eny which is a.spe+lization of the unit for endonuze;; is a specialization for the unit for nuclease ana General properties of a class are inherited specializations. This is by it; formalized in part by having descriptions in slots of those units that correspond to classes. These descriptions delineate legal values for the corresponding slots in specializations of the class. Descriptions can be progressively tightened as one proceeds down a specialization hierarchy. This feature makes the process of specialization correspond to the addition of non-contradictory new Iunits. A specialization (or generalization) knowledge to concepts from a molecular genetics knowledge base hierarchy of is illustrated below. LAB+BJECT ANTIBIOTIC AMNCGLYCOSIDE KANAMYCIN NEKXYCIN BETA-LACTAM -WICILLIN *.. . . . ENZYME LIGASE . . . NUCLEASE ENDCNUCLEASE RESTRICTION-ENZYXE ALU1 ASUl 1.1 Symbols in the Unit Package are organized in a generalization hierarchy. This hierarchy indicates "inheritance paths" by which symbols accqire the attributes of their generalizations. Each of the symbols in a knowledae base is defined in terms Of " slots" . A unit corresponds approximately to a property list 61 Sec. 1II.C. Project 2 except that (1) the structure of a slot has several explicit fields for information about such things as modes of inQeritance and datatype and whether the value is stored or computed and (2) the value of a slot can be a description of a value. The following figure illustrates tno units of different complexity. NAME: IXCUMENTATION : SITE-E: 3 `-END: 5'-END: MODE: OPTIMAL-PH: . . . Endonuclease A nuclease that cuts internally in a DNA structure. One of (MONO, STICKY-HEXA, FLUSH-HEXA, PENTA, STICKY-TETRA, FLUSH-TETRA) One of (P, OH) One of (P, OH) One of (Precessive, Non-precessive) RANGE (0 14) . . . NAME: Rat-Insulin-Problem ~UMENTATION: This unit gives the parameters of an experiment for cloning the gene for rat-insulin. GENE: RAT-INSULIN GENE-PRECURSOR: RAT-INSULIN-RNA ORGANISM: A Bacterium Default: E.COLI VECTOR: A Vector GOAL: A Lab-goal with STATE = A Culture with ORGANISMS = A Bacterium with EXOSOMES = A Vector with HAS-GENES = RAT-INSULIN CONDS = (PURE? ORGANISMS) Two units from a MOIGEN knowledge base. Each unit is organized as a list of slots. The slots are filled with values or descriptions of values. These units are examples of Wsymbols" from the `molecular genetics domain. While the Unit Package is not a problem-solving program, it does provide a large number of routines for creating and matching units in a knowledge base. , modifying, These routines are called by problem-solving programs in the MOLGEN project which are currently being tested. Some of the built-in features - such as the generalization hierarchy and symbolic descriptions - seem to be especially useful for problun-solvers that work with 'See the technical report for details. 62 Project 2 Set 1II.C. abstractions. For a discussion of other features of the Unit Package - such as the various modes of inheritance, set notation, or the attachment of procedural knowledge - the reader is referrf4 to the enclosed technical report. III.C.3.b. Applications of the Units Package -e- !mLGEN = Planning Experiments in Molecular Genetics - Molecular genetics is a rich and rapidly growing science. Several aspects of molecular genetics make it attractive as a task domain for artificial intelligence. It is a young science and new tec'hniques and ideas are developed regularly. This makes it attractive for studying the process of discovery ([38], [23]). It is a la&oratory science and experiments are clearly defined in terms of laboratory steps and results. This makes it attractive for studying the processes of planning and plan debugging. Finally, many kinds of knowltige are used in molec*ular genetics. This motivates work on representation in the Unit Package, Planning research in MOEEN has focused on two broad classes of experiments - structural synthesis and structural analysis. The synthesis experiments use various laboratory techniques to build DNA structures. Analysis experiments use various laboratory techniques to identify an unknowl str'uct'ure. An analyst seeks to discriminate 'between competing hypotheses for the structure of a sample. Other Applications In the past few months , several other projects have begun to use the Unit Package as a representational medium. Dr. Blum [5] is using it in an application which will combine statistical methods and AI methods for performing studies on a clinical data bank at Stanford, The Unit Package is being used to represent a set `of medical models to permit a more sophisticated interpretation of patient record data in the data base than is possible using statistical methods alone. The Unit Package is also being used in a mathematical application at Stanford and is being application at the PAND corroration. tested for a planning Other expected over the course of this grant period. applications are 63 sec. 1II.C. Project 2 III.C.3.c. Proposed Work in the Units Package p-e- The propsed Wrk on the Unit Package may be divided into two main categories - representational work and research-related work. Barring surprises from the emerging applications of the Unit Package, most of the work on representational machinery is finished. There are a few outstanding tasks such as (1) generalizing the concept hierarchy to be a concept graph so that units can have more than one generalization and scane more flexible forms of inheritance. (2) providing Since the Unit Package became operational in June 1977, the rate of change to the system itself has slowed dramatically. This reflects the need for a stable system for development of applications and the fact that the Unit Package has found an important niche for the applications in the Heuristic Programming Project. This standstill in development also reflects the current interests of the research group - which is to work on the problem-solving applications of the Unit Package. A great deal more development will become important as this work is completed. For example, the Unit Package provides a substantially richer descriptive language for concepts than is available in MYCIN or ENYCIN. It lacks, however, substantial facilities for knowledge acquisition - beyond a simple interactive editor. As applications of the Unit Package develop, an increased need for a stronger user interface is expected - incorporating such things as the natural language interface (BAOBAB [a]). Another line of development is the development of standard relationships which appear inmany domains. The Unit Package currently provides only a very small set of built-in relationships - such as generalization and specialization - which are utilized by the semantic network processing functions. Creating additional relationships is part of the knowledge- engineering task of applying the Unit Package to a task domain. Scme of these relationships - such as "part-of" or "abstraction- Of" - seem to appear in many domains. To the extent that these relationships have general utility and can be standardized, they will be made part of the initial 'mowledge base for new applications - thus expanding the apparent power of the Unit Package and reducing the effort of starting new applications. III.C.4. Long Term 'Work and New Packages -y-e The development of packages over the next five years will be opportlunistic - relying on the most usable results from core research in artificial intelligence. Thus, while the following 64 Project 2 Set i1I.C. ideas indicate only our best current ideas for continued development. III.C.4.a. Planning Package One of the areas in which we see future mrk is in the general area of planning. The artificial intelligence research on this problem is currently being performed in the domain of ex-per iment planning in molecular genetics. Some interesting ideas are just beginning to emerge from this work which, if successful, could become the basis of a ,"plar~ni.ng package". This research is investigating the viability of a new approach to planning called "orthogonal planning", The thrust of this approach is to take the elements of a planning out of a "planning algorithm" and put them into explicit "planning spaces" . Rxplicit planning operations such as refinement (mapping from abstract to specific) and evaluation and subgoal proposing are expressed as operators in a planning space. Different combinations of these operators can be arranged to create top-down (goal-driven) planning, bottom-up (opprtunistic) planning, and various hybrid methods. The planning research seeks to find general methods for deciding when to apply these different planning operators in order to plan flexibly and effectively. Currently ten planning operations have been formalized in the planning space and four strategic operations have been formalized in a overseeing "stratqy space". TSis approach is being tested in the domain of experiment planning in molecular genetics and uses the Unit Package for representing the symbols and operations in all of the spaces. III.C.4.b. kage Pat Time-Oriented Knowledge Representation One important topic in computer-based diagnosis and therapy programs is the representation of knowledge about situations that are changing over time. Most current programs have concentrated on the interpretation of a single instance in the course of the patient *s disease process. As the &Datient status changes over time, a program must be able to modify its representation to conform to t`he new situation. The ability to represent trends in t-he health of the patient is an important part of the diagnostic process. Creation of a package that supports the representation of 65 Sec. 1II.C. Project 2 changes over &me will be important for applications based on clinical data bases. These data bases typically contain the results of a variety of tests which were administered at each patient visit to the clinic. The problem of interpretation of updated test results has also come up in each of our current applications, for example, initially negative culture results that grow out a particular pathogen after several days in our infectious disease program or the comparison of new pulmonary test results with the previous findings. No general purpose approach has been incorporated into these programs. A program for a particular dynamic clinical setting - interpreting measurements from the intensive care unit has been developed at the Heuristic Programming Project. That program, named the Ventilator Manager (VM) [21] , is able to evaluate a stream of thirty measurements provided on a 2 - 10 minute basis by a computer-based physiological monitoring system. The system: (1) provides a summary of the patient physiological status appropriate for the clinician: (2) recognizes untoward events in the patient/machine system and provides suggestions for corrective action; (3) suggests ad j ustments to ventilatory therapy based on long-term assessment of the patient status and therapeutic goals; (4) detects possible measurement errors: and, (5) maintains a set of patient specific expectations and goals for future evaluation, Removing the the basic assumption about the regularity of the changes in the ICU setting is the major area of research in the development of this package. A typical problem is the interpretations of a series of test values that are higher than normal over several testing instances. Specializti knowledge about the typical rate of change of the underlying disease process is necessary to determine whether these values represent a trend. The representation of dynamic settings also requires a model of the stages of the disease and treatment process that best characterize the clinical status of the patient, Often a particular value of a measurement takes on entirely different interpretations based on the current context. For example, the meaning of critical measurements one hour after surgery compared to the same measurement after three days of recovery. A rudimentary model of this type based on various therapeutic regimens is built into the ICU measurement interpretation system. Additional work in required in the generalization of this type of modeling process, 66 Sec. I I I Codification and Use of Medical Knowledge from Clincial Laboratories XDXINISTICiTIVE INFJXMATION ONLY Project 3 1. TITLE OF PROPOSAL /Do nor rxcwd 53 rypewrilw s&wcwJ FOR FIRST lZ.MONTH FERIOC Columbia, MO. 65211 Stanford, California Technology Center `1 . tiesearch lnvolvmg Human Subleas l&t lnrtrucaonsl 8. Inv~nrlom (Rcnmd Applrcmrr Only . see Inrrrocrlonrl A.aNO 8-a YES Approved: A.m NO 8.0 YES - Not previously reported CC YES - Pending Arview oat* C.aYES - Prewously reported TO BE CDXPLETED BY RESPONSleLE AOMINISTRATIVE AUTHORITY fIrems 8 rhroup7 r3 rnd 1501 9. APPLICANT ORGANIZATION(S) lSn lrarrucrrons~ 11. TYPE OF ORGANIZATION ICheck o pplrcablc rtcd The Curators of the Univers ity of Missouri 0 FEDERAL GJ STATE 0 LOCAL OOTHER (smif~l . 215 University Hall IVPT< I tv Columbia, MO. 65211 12. NAME. TITLE. ADDRESS. AN0 TELEPHO~JE r~uf.tUER Of OFFlCtAL IN BUSINESS OFFICE WHO SHOULD ALSO 8E NOTIFIED IF AN AWARD IS MADE H. Kent Shelton Asst. Vice President Financial Services 215 University Hail IQ NAME. TITLE. ANDTELEPHONE NUMBER OF OFFICIALIS) SIGNING FOR APPLICANT ORGANIi!ATION(S) H. Kent Shelton Asst. Vice President Financial Services Columbia, MO. 65211 Telephone Number 3 1 rJ-88J-?5 1 ? 13. IOEtJTIf? OkGm!rkLiI~r TO rlECEIvE CREDIT FOR INSTITUTIONAL GRANT PURPOSES KSS ln~rf~crionrl Graduate School Sec.111 PROJECT 3. Codification and Use of Medical Knowledge from Clinical Laboratories ADMINISTRATIVE INFORMATION ONLY RESEARCH OBJECTIVES NAME AND ADDRESS OF APPLICANT ORGANIZATION University of Missouri-Columbia NAME, SOClAL SECURITY NUMBER, OFFICIAL TITLE. AN0 DEPARTMENT OF ALL PROFESSONAL PERSONNEL ENGAGED ON PROJECT, BEGINNING WITH PRINCIPAL I Donald A. B. Lindberg, M.D. Director, Health Care Technology Center and Information Science Grou * Robert Abercrombie, Ph.D. ~o~~%%al Fe1 low Information Science Group Paul Blackwell, Ph.D. $ P?ofessor of Computer Science Lamont Gaston, M.D.,$- ' Professor of Pathology Lawrence Ki ngs 1 and, W. B. Stewart, M.D.? Senior Electronics Technician, Information Science Group Professor of Pathology, Director of Laboratories Henry Taylor, M.D. ; ,rofessor of Pathology John Townsend, M.D.7 Professor and Chairman, Department of Patholoqy TITLE OF PROJECT John Yio Ph.D., 227 68 0029, Post Doctoral Fellow, Information Science fr (NOT TO EXCEED 101 IN YOUR ABSTRACT. A. Objectives 1. To represent within a computer-based information system the knowledge and procedures of the clinical laboratory expert. 2. To determine how to implement this information system such that benefits result to the clinical laboratory service which are measurable in terms of: a. Increased quality of laboratory determinations b. Reduced costs to the laboratory and/or the institution C. Increased access to pertinent information by laboratory data providers and users. 3. To determine how to interface this information system with the hospital and clinic services such that benefits result in actual patient care. We propose to seek "process" measures rather than `!outcome" measures, 4. Using this operational testbed to shed light upon certain important questions basic to artificial intelligence in medicine research. These objectives will be pursued by construction of a knowledge representation system for the domain of the clinical laboratory expert. Subject matter expertise will be provided by directors of the clinical laboratories of the University of Missouri Medical Center. Fundamental artificial intelligence methodology and special- ized computational facilities will be provided by the SUMEX Laboratory and the Department of Computer Science at Stanford University. Management and interfacing of the project and site-testing will be provided by the Health Care Technology Center at the University of Missouri-Columb.ia. 68 Project 3 Sec. 1II.A. PRQJECT 3: The Clinical Laboratory Expert Project III. A. Objectives 1. To represent within a computer-based information system the knowledge and procedures of the clinical laboratory expert, 2. To determine how to implement this information system such that benefits result to the clinical laboratory service which are measurable in terms of: (a) Increased qua1 i ty of laboratory determinations (b) Reduced costs to the laboratory and/or the institution (c) Increased access to pertinent information by laboratot=\r data provi'ders and users. 3. To determine how to interface this information system with the hospital and clinic services such that benefits r&ult in actual patient care. We propose to seek "process" measures rather than "outcome" measures. 4. To seek through this operational testbed to shed light "upon certain important questions basic to artificial intelli- gence in medicine research. These include the following: (a) How best to retain the power of symbolic representa- tions traditional to Al techniques while at the same time obtaining the benefits of the numerical methods which are traditional to fields such as laboratory management? (b) How best to set up an information system so as to accommodate to the endless stream of changes which occur In the operating environment of a system such as the clinical laboratory? (c) How to improve, and hopefully optimize, the interface hQ Sec. 1ll.B. Project 3 of the knowledge engineer and the subject matter expert, in this case the clinical laboratory expert7 111.8. Background and Rational Use of artificial intelligence techniques, especially the recent focus on formal representation of the knowledge of experts, is the latest and. most promising of applications of the computer to medicine. It is already clear that the techniques are powerful and that the proof-of- concept and feasibility .phases of medical applications have been success- fully passed. This technique has been shown feasible in the areas of infectious disease (Shortliffe et al., 1973), glaucoma management (Weiss, Kulikowski, Safir, 19781, patient present illness (Pauker, Gorry, Kassirer, Schwartz, 1976), and in the general differential diagnosis in internal medicine (Lawrence, 1978). In many ways the Al techniques are still in development, but the real question remains: in what areas of medicine are they most usefully going to be employed? Some raise the question, in which areas would such techniques even be accepted? The clinical laboratories offer the very best application sites for exploring Al techniques as a basis for biomedical information systems. The following observations support this contention: 1. The clinical laboratories were the first sites for successful implementation of computer-based information systems of any kind (Hicks, 1969; Lindberg, 1965, O'Kane, Haluska, 1977). 2. There are a host of current computer systems al ready disseminated in this field which form a basis for advanced technological developments, Project 3 Sec.1II.B. 3. Clinical laboratory services constitute a major part of hospital expenses (estimates vary from 25-40'8). 4. Clinical laboratories, for the most part, are administered by professional medical personnel who have training in technological matters, including hardware and .information systems, and who therefore are 1 ikely to be receptive to advances in this kind of methodology. 5. There is an expertise in clinical laboratory operation and interpretation which is recognized by medical specialty training. 6. Knowledge in this field is plentiful; and expertise takes the form of a multitude of.`tiny empirical pieces of information, which await unification into an overall information framework. This situation is compatible with the way in which formal knowledge systems have been built for other Al appl icat ions. 7. On the other hand, the field does offer an advantage in another (almost counter) sense: namely, that there are true and realistic models of the basic data generating sources. For example, one knows quite surely that impedance transients in a Coulter Counter are caused by particles, and that these particles are (for the most part) erythrocytes. Likewise, the concept of "serum electrolytes" is known to have a solid basis: namely, that there are actual, Immutable ions of sodium, potassium, chloride, and bicarbonate (and C02) within the serum. Furthermore, chemical laws describe the relationship between many blood constituents. Curiously, the chemical laws are not used ordinarily as the 71 Sec. 111.8. Project 3 basis of laboratory management, and only partially as a basis for test interpretation and subsequent patient management . The chemical laws and the physical models are , however, a potential advantage in building advanced information systems. 8. The clinical laboratory offers a setting which is receptive to and safe for development of new information systems, yet which also offers a home base for extension out toward the more purely clinical setting. The meeting ground of the two is clear: it is the interpretation of the results of laboratory measurements. For these reasons, we feel that clinical iaboratories are in general a potentially fruitful place for Al in medicine applications. There are reasons which make us think that the particular laboratories and group at the University of Missouri are a good choice among those institutions with excellent clinical laboratory programs. 1. The school has a long history in lab system developments. The first automated lab system in the country was built here ln 1962 and has operated continuously since then. 2. The system incorporates all clinical laboratories and all test results. 3. These results are in computer processible form, indeed are reported through the computer systems. Consequently test data is accessible. 4. Experts in clinical laboratory medicine are members of the team who propose to build the Clinical Laboratory Expert system. 5. The project is sponsored by the health Care Technology 72 Project 3 I I I. Sec.II1.C. Center, which has ample experience and capability in the management and conduct of multi-disciplinary technical projects. The Center management review of all projects includes participation of an evaluation team with members from opetatlons research, medico! sociology, economics, health services management, and meeicine. 6. Most important of all, we have a plan to accomplish the system building, and we have predecessor systems to build on and to compare with. C. Methods of Procedure We propose to grow the information system beginning with a nidus or model system and to expand the scope of the system by adding to it information and values from,additional areas. That is, our strategy will be to begin with what is clearly feasible, to build our collaborative patterns about an early success, and then to expand in a systematic fashion to more ambitious goals. We feei this is not only a good general management strategy but the best way to build programming systems too. Fvant;ldl Iv. for instance it LJocld be desirable for the svstsm tn be able to learn from the data. First, however, the system must be given the logic by which laboratory data are evaluated and understood. WC plan for development of the system in four phases. Phase One: incorporate the medical logic which takes into account the information which is available within the laboratory Itself: e.g. test results, quality control results, methodological lnformat ion. Phase Two: Incorporate the additional medical iogic which takes Into account information about the patlent: first simple aspects such as gender, age, race; then more complex concepts such as drug therapy, 73 Sec.iI1.C. Project 3 operative status, clinical service assignment and provisional diagnosis. Phase Three: incorporate medical logic which includes concerns for hospital function. Phase Four: incorporate medical logic which attempts to link to considerations which are outside the hospital Setting. Following is a more detaiIed description of the phased development. Phase I. The aspect of the lab results which is of primary concern within the laboratory hinges upon quality control considerations. These are the first logical aspects which must be represented. We are referring initially to thinking wh;ch currently goes on strictly in the laboratory, previous to release of a test resui t. Subsequently, there may or may not be significant discussion between the laboratory director and the clinician concerning further lab work and/or clinical concerns. Previous to this stage, however, there is a great deal of evaluation done now within the lab and based on laboratory on only partially clinical grounds. Not enough evaluation of this sort is possible with today's high volume instruments. This function can be greatly enhanced by advanced computational techniques. We would plan to introduce knowledge into the system along the following lines: 1. Knowledge of the labs selected (likely we would start with hematology and clinical chemistry) 2. Knowledge of what tests are done, what methods are used, what parameters are estimated, what units are used. It should be noted that there are often multiple extant methods for a single determination, as wei 1 as multiple laboratory locations throughout the institution at which it might be 74 Project 3 Sec.tl1.Z. done a Methodology and unitage change COntinuallY- Since a referral-type laboratory may do 3,000-5,000 different determinations, it is a serious problem to choose a representation which will be amenable to the endless updating 3. Knowledge of the kinds of patients and hospital locations. 4, Logic permitting an initial evaluation of the test result for credibi 1 ity. This natural iy includes arithmetic ranges, formats, etc. 5. Logic permitting evaluation taking into account other results from examinations performed as a battery. An example is the well known relationship between hemo- globin and hematacrit. 6. Logic permitting evaluation of test result taking into account laboratory qual i ty control procedures and records. We have recently completed an evaluation of the proposed 6~11 statistic for control based on a weighted-moving- average of mean corpuscular hemoglobin concentration, which is a slight but still insufficient improvement on the traditional method. This is an example of the need to bring numerical methods into alignment with the symbolic logic. In essence, this asks the general question, is it likely the result is valid con- sidering the quai ity of the particular "run" or batch which produced the result? The outcome of ail the laboratory logic is the resolution of the following questions: a, Should the test be repeated using the same blood sample? b, IS the issue important enough (or specimen identification SUffiCientiy questionable) that a new specimen must be obtain?,: 75 Sec. I I I .C. Project 3 from the patient? c. Should the result be reported to the clinician and to the chart with some kind of qualification attached? d. Is there a quality control problem in the laboratory which requires immediate action? t. Is there a breakdown in the clinical procedure (ordering,specimen collection, etc.) which requires imnediate adtion? Phase I I. There are a number of clinical but relatively elementary considerations which may be taken into account within the- laboratory -- and which certainly should be taken into account by the knowledge-based system we propose. Examples are: 1. Logic permitting evaluation of test results taking into account basic information about the patient, i.e., aget race, sex, and ward location. 2. Logic permitting evaluation of test results taking into account previous test results in the same patient. These pieces of information are often of critical importance in evaluating the credibility or significance of laboratory reports. Normal ranges, for example, vary for some tests with age, race, and sex, Previous results on a patient, to take another example, may be the first clue to a mismarked specimen: the blood-from-the-wrong- patient blunder which is so fundamental a problem for all Iabora tor ies . 3. Logic permitting evaluation of test results taking into account the general nature of the putative diagnosis (e.g., admitting diagnosis or treatment regimen). Project 3 Sec.II1.C. It should be noted here that we are not proposing that the system permit or encourage that clinical knowledge of the patient influence the test result, but only the interpretation of the result and the handling of the specimen. A general diagnosis or even a treatment regimen can greatly influence these matters. Plasma specimens from patients on oral anti- coagulants, for example, usually should not yield normal prothrombin times; indeed for these patients, `:normal" is abnormal and dangerous. The implication here is for interpre- tation of the result, and when to report an "abnormal i ty" through-: the stat or emergency systems. Similarly, patients with leukemias, especially under chemotherapy, often have remarkedly elevated uric acids which have nothing to do with the usual reasons for hyperuricacidemia. The issues which are relevant at the patient or the clinician`s level hinge upon matters of test interpretation, the possibility of needing to order further tests, the possibility of new diagnoses. There is obviously an immense amount of logic which concerns laboratory test interpretation in the context of.all of the possible clinical diagnoses and management problems. \Ie are not proposing to include this mountain of knowledge, which really pertains more reasonably to programs such as Myer's INTERNIST System. We propose to stop with knowledge which might reasonably be construed to represent the conversation of the laboratory director with the patient's clinical physician. It is difficult to specify stage when we are only proposing ion of our intent might be provided precisely this cut-off at the the system. The best indicat by an example. 77 Sec. II I.C. It frequent1 y happens that the lab director and a clinical hematologist wil 1 discuss a set of lab findings for a patient Project 3 (with or without the question of errors in the findings) up to the point at which it is clear that the findings support the interpretation "iron deficiency anemia". This stage of reasoning represents a kind of intermediate between findings and diagnosis which Al systems sometimes call a concept. The semantic network system of Kulikowski, Amarel and Weiss, for instance, has such "concepts" within its logic. From the point of view of the logic we propose to write, this interpretation would be a proper termination,wholly supportedby lab findings but requiring more clinical information about the patient than is obtainable from such paper systems as lab requisitions. The cause of the iron deficiency anemia would remain for another system to take up. iological There are a host of such intermediate pathophys concepts which constitute a kind of proper frontier lab reasoning and more purely clinical reasoning. between clinical In practical terms, the resolution frequently is reached either by a telephone conversation between the lab director and the clinical physician, or by personal contact on such an occasion as rounds. We are not eager to automate the personal contact, although time does not permit enough of these discussions to occur; we would like to automate at least the decision to make the telephone call or appointment. Most test results, even batteries of results do not permit an interpretation at the laboratory level, In some cases, we feel the logic could take us further, The most extreme case and the most complete logic we feel would end with a tentative patho- physiologic concept (such as anemia) and in selected important cases a decision on the part of the computer system to recommend the lab director call the clinician. Because of the limitations of 78 Project 3 Sec.II1.C. time, this is not a minor decision. Only the most important cases should be selected for such conferences, whether telephone or in person. A system with full and explicit logic should form a good basis for such a decision. Furthermore, previous experience has shown us that even our non-Al current lab monitoring systems must bring together all pertinent (available) information about a patient before bringing the abnormal report to the attention of the user. This simple assembling of data aids current decision making; we anticipate thatassembly based on a more extensive logic will prime a clinically useful discussion. Phase It I. Logic relevant to hospital function primarily concerns institutional patterns. This includes changes in laboratory patterns, timeliness of ,. reporting, distribution of costs among services and patients, and examination of interactions between procedures. For example, do screening batteries including such tests as LDH's result in an inappropriate number of repeat kinetic enzyme studies? These matters are derivative measures of institutional function which are the natural by-products of semantic understanding of the laboratory transactions. They wou Id not be examined until after the more fundamental logic in Steps I and II had been dealt with. Phase tV, Logic which links to considerations outside the hospSta1 environment. it Is difficult to detail these linkages ab initio, They are made - up potentially of at least two separate concerns: derivation of facts of general scientific interest; and the provision of linkages to educational functions. it must be emphasized that firm promises for such accomplishments 79 Sec.tI1.C. Project 3 cannot be made. Still, one should point out some potentially important implications outside the immediate hospital realm, and should attempt to make the connections. A more or less modest scientific fact which could with luck result from the studies is the long awaited mUltiVariate normal for application to multi-channel screening (Letotte,l977; Grams, l?Y'?). Building of instructional systems is beyond the scope of the present proposal, but provision of the connections is an inherent part of our plan. Good Al systems are (partly) characterized by their ability to defend their decisions. That is, a classification or advice provided from such an automated system can be challenged, and it can be expected recapitulate the rules or criteria which produced its the the system can conclusion. It is users outside the precisely this ability which should allow potential laboratory to benefit directly from the existence of such a knowledge-based system. Me would hope to allow for this educat by-product usage by providing suitable means to challenge and converse wTth the system. iona a0 Project 3 Sec.II1.C. System building We have given thought to the architecture of the proposed system. It should be emphasized that this project is a long term development in an area of fundamental importanle to medicine: namely, the knowledge which surrounds clinical laboratory testing. We feel that there exists an adequate base of expertise in this field at the University of Missouri, acknowledging of course that we would utilize the full resources of the published literature and that the knowledge and logic of the system would be subjected to outside review by consultants as each major step was taken. We do not, nowever, have an adequate experience in work in artificial intelligence techniques per se to undertake the project alone. It is clear that this competence exists in the group at Stanford. We feel we have a sufficiently good working relationship with Professor Feigenbaum and his colleagues that a joint develop- ment will be successfully concluded. The form of the actual computer representation has not been selected. Our lab systems have used table driven assembly code for years. The HCTC is collaborating with clinicians at UMC and computer scientists at Rutgers to create a rule-based rheumatology consultant. We wish to explore with Dr. Feigenbaum the possible appropriateness of the imputational "blackboard" of the Hearsay system. The knowledge-based system to incorporate clinical laboratory expertise will be built on the SUMEX machine via the existing time-sharing network. We have used terminal connections to SUMEX for five Years in Connection with operation of the AIM network, the SUMEX Cxecutive Committee, and smaller experimental projects, The communications are sufficient to support development of such 81 Sec. I I I.C. Project 3 a system, At the same time, we recognize that it is inappropriate (and probably impossible) for the SUMEX computer complex in California to support a real-time service activity in hissouri. Fortunately this is not necessary. Testing of the model in its sequential versions against actual lab data in batches or bench- mark sets can easily be done on a periodic basis. This will not be a problem. Even the status of the quality control results can be accessed and included in the model's operation in this fashion. Since alI.ttansactions are recorded, one can accurately recreate "real time" for any moment. The issue of implementation of the full model in a real laboratory setting is a separate problem. The system has not yet been built, so we can't say what kind of computer would be needed to run it. If We are correct in assuming, like other systems, that a part of a PDP-10 is capable of running the model, then it is not unreasonable to expect our laboratories to acquire this level of computer support. .The current lab systems are using a combination of two PDP-12's, an IBM System 7, substantial services of an IBM 370/158 (which is being replaced by an Amdahl machine), and several microprocessors, including M6800's and LSI-11's. All this does not add up to an AI machine, but we don't want it to yet. There is a commitment to having computing gear at UMC, and in most large clinical laboratories, At the same time, one must acknowledge that the five year duration of the project will doubtless see a continued reduction in the cost of computing 9-5 as well as a continuation of the advances in hardware which will have made Al techniques more realistic in the past. Machines equivalent to DEC PDP-10's may well come to be offered for small amounts of money in microforms. This kind of breakthrough is. not necessary in order C r .us to moye dvepnto an iFI-baljed svstem. What is necessary I+ -that a2 Sec. i i I .I:. the system work well and be able to keep UP with the changes in laboratory procedures which have plagued and almost destroyed previous systems. Our institution is currently supporting six full time programmers in a vain attempt to keep rigid old programming systems current with methodological and adtiinistrative changes. If the Al techniques succeed in producing a competent flexible software system, we feel that ongoing personnel savings will offset even large one-time hardware costs. While the major model system is being built, we wi 11 naturally implement as improvements whatever parts of the logic are reasonable and feasible on the existing hardware, This is not difficult to imagine, because the current system is somewhat distributed already. It is through this means that we would expect to identify and hopefully to achieve cost savings and quality improvements. We assume that the major advances would come through implementation of the full new system. These should be calculated ahead of time. If the savings and improvements are "there", the project will have been successful and the system will be implemented as a whole at UMC and elsewhere. -33 Sec.1ll.C. Project 3 Concepts to be included There are certain genera1 concepts which are suffused throughout all elements of laboratory practice. These will necessarily be incorporated in all phases of the proposed development. These concepts include the following: 1. Statistical significance of testing, including sensitivity - specificity of tests. This orientation is inherent in lab work. Recent reports (Casscel 1 s, Schoenberger Graboys, ,978; Ransohoff and Feinstein, 1978) indicate that it is not well understood by the clinical users of laboratory services. 2. Related to this idea is the concept of normal, which is very much dependent upon each particular laboratory, and even upon specific methodologies. The knowledge of normal ranges regarding the methodology and regarding age, sex, race, and special circumstance? of the test population must be firmly associated in the system with each test specification, The system must be able to defend its interpretations, and hence to inform the user of the laboratory's assumptions and adjustments to methodology. 3. The concept that automatic error detection is the essential first step before interpretation of results is attempted, and that the attempt at error detection must be vlgorous , With the present systems we are able by careful after-the-fact daily checking to recognize and correct errors in data which have passed through the computer checks and have actually been reported to the patient's chart, Two and one half percent of results are in error, Of these 0.5% (ln retrospect) actually represen: a4 Project 3 true techn ician or technologist method0 logical errors. The remainder are a very mixed bag of clerical and administrative errors. Our performance (which is probably good compared with many wholly manual or semi-automated labs) Is the result of incorporating extensive computer editing of the data. We long ago, for example, incorporated seif- check digit identification for patient and specimen numbers, since we had shown that this category alone accounted for half the errors detected by an earlier system (Lindberg, Sec.II1.C. Schroeder, Row Additional have been deve 1 and, Saathoff, 1969). empirical methods of pattern recognition oped for error deletion, and will be incorporated in the proposed system. These include analysis of electrolyte patterns, creatinine and others (Lindberg, 1968) . The current daily Abnormal Value Rounds in the laboratories will provide an ideal work setting for the model development and testing. Presently lab reports are transmitted by and reviewed by the several computer systems. Special cases, according to adaptive algorithms, are selected by the systems for review daily by the chairman of the Department of Pathology, Or. Townsend, and his residents and staff. They currently accept or reject the computer judgments based on their own internalized judgments and upon additional data about the patlents which is obtained by going to see the patient and/or the chart. It is this logic which should be represented in the new programs. a5 Sec. I I I.C. Project 3 4. Multi-step testing is a practice which has b&en common to labs for decades. The logic is not always made explicit to the user, and we feel there is an advantage in doing so. The classic example is the serological test for syphi1,is. Formerly, laboratories did a VDRL (for sensitivity), followed in the positive cases by a Mazzini (for specificity), Currently these have been replaced by the rapid plasma reagin test and the fluorescent treponema antigen test. The salne practice is .followed (appropriately) with many clinical enzyme tests such as CPK and LDH, their kinetic counterparts and their iso-enzyme extensions. Even more dramatic is the multi-step or branching tree. logic which is used by coagulation laboratoriesand thespecial immunology laboratories. The questions to be addressed by the system include: what test should be done first? What is available locally? What subsequent test.to do, dependent upon what initial results? What statistical significance do the results have? What further testing could be done7 If this involves a remote referral lab, how is the service obtained? Essentially, this logic is quite subject matter dependent. It is specific to the limited domains, but because of this, also quite synonymous with expert behavior. 86 Project 3 Sec.lI1.D. I I I.D. Significance The significance of a successful outcome would be: I. Advances in basic knowledge representation techniques 2. Formal and public representation of a major field of medical expertise which will be of interest to all fields of medicine, health care, and information science. 3. Advances in techniques for remote collaboration on information system development. That is, we would be much further aiong on knowing how to share rare computational facilities and unique computer science competence with a broader, perhaps even national, medical community. 4. improved understanding of evaluation of advanced health care technology. The significance of a less than complete success would be lessened. Undoubtedly some of the representation and testing would be accomplished, since we will commence with the easiest part. If one's success were limited to this, the results would be of real importance but of interest primarily to laboratorians and computer scientists. These are an important part of the audience, but not the only ones we see for the complete system. The "downside risk", in other words, is minimal. 87 Sec. I I I.E. Project 3 1II.E. Facilities available The Health Care Technology Center can house tne computer component of the project at the University of Missouri-Columbia. Space is available in a modern office building. The Center provides library facilities, computer laboratory facilities, telecommunication, etc. The Department of Pathology will be providing access to the working laboratories as required. These include Hematology, Chemistry, Microbiology, Clinical Microscopy, Coagulation, Immunology and Anatomical Pathology services for the University Hospital (440 beds), a simi lar arrangement' for the adjacent Harry S Truman Memorial Veterans Medical Center (426 beds), the Mid-Missouri Mental Health Center (175 beds), and Rusk Rehabilitation Center ( 100 beds). The combined laboratories process 2,100,053 procedures a year. Computer hardware per se includes 6 DEC LSI-11's; 3 M68OO systems; 2 DEC PDP-12's (tapes, disks, terminals);DEC PDP11/34; IBM System 7; and multiple direct connections to the University Network IBM 370/i58 and 370/168 (both to be replaced by Amdahl gear). The members of the Health Care Technology Center include 45 faculty from I4 University departments in 6 schools of the Columbia campus. The professional staff of the Department of Pathology includes 29 faculty and 20 residents and fellows. Only a subset of the faculty are planned as active members of this project team, but all are interested in the success of the venture and all are available as needed for help on specific knowledge areas within their own subspecialties, 88 Project 3 Sec. I I i .F. rfr.F. Collaborative arrangements The system would be developed jointly with members of Computer Science at Stanford and the Health Care Technology Center at the University of Missouri-Columbia, Computer support for the model system would be provided by the SUMEX computer facility. This is an NIH supported national resource. Use of local computers at UMC for data gathering, analysis, test implementation would be provided free of charge. An exception is minor maintenance charges for HCTC equipment. Telecommunications for approved projects are provided by the SUMEX contract with TYMNET and ARPANET. Access to Net nodes is provided by UMC WATS lines. In addition, the project would budget funds to provide for frequent travel between the two schools. Results of the project are to be published. Stanford University is viewed as the primary submitter of the proposed program project, with the University of Missouri-Columbia supporting the application and taking responsibility for the Laboratory Expert Project. Doctor Feigenbaum is the Principal Investigator for the program project. Doctor Lindberg is viewed as Director of the Laboratory Project. 89 PROJECT 3: REFERENCES I. Shortliffe, E.H., Axline, S. G., Buchanan, B.C., Merigan, T.C. and Cohen, S. N., "An Artificial Intel I igence Program to Advise Physicians regarding Antimicrobial Therapy". Computers and Biomedical Research, 6 (1973):1-17. 2. Weiss, S., Kulikowski, C. A.. Safir, A. "Glaucoma Consultation by Computer". Computers in Biology and Medicine,8 (1978): 25-40. 3. Pauker, S. G., Gerry, G. A., Kassirer, J. P., Schwartz, W. B. "Towards the Simulation of Clinical Cognition: Taking a Present Illness by Computer". American Journal of Medicine,60, (J une, 1976): 981-996. 4. Lawrence, S. V. "Internist: Computer Program Expressing Clinical Experience and Judgment of a Master Internist Constitutes a Unique Resource!`. Forum on Medicine (April 1978): 44-47. 5. Hicks, G.P.Evenson, M.A., Gieschen, H. M., Larson, F.C. "On Line Data Acquisition in the Clinical Laboratory". Comou ters i n I I I (Stacey and Waxman ) New York: Biomedical Research Vol. Academic Press, 1969, pp. 15-53. - 6. Lindberg, D. A. B.: "Collection, Evaluation and Transmission of Hdspi tal Laboratory Data" SymDosium (1965): White PI Proceedinqs 7th IBM Medic4 ains, New York, IBM, 1,965. 7. O'Kane, K. C., Haluska, E. A. "Perspectives in Clinical Computing". In Advances in Computers, I6 (1977): Academic Press, 161, 8. Lezotte, D. C. "A Multivariate Laboratory Data Analysis System: Introduction". Journal of Medical Systems, 1, No. 3 (1977): 293-98. 9. Grams, R. R. System "Progress Toward a Second Generation Laboratory lnformat ion (LI S)". Journal of Medical Systems,(l) No: 3, (.1977):263-74, IO. Casscells, W., Schoenberger, A., Graboys, T., "Interpretation by Physicians of Clinical Laboratory Results". New Ensland Journal of Medicine 299, No.18 (November 1978): 999~lOOl. Il. Ransohoff, 0. F., Feinstein A. R., "Problems of Spectrum and Bias in Evaluating the Efficacy of Diagnostic Tests". New England Journal of Medicine 299, No. I7 (October 26, 1978): 926-30. 12. Lindberg, D.A.B., Schroeder, J.J., Jr., Rowland, L.R.,. Saathoff, J., "Experience wi th a Computer Laboratory Data System". In Strandjord, J. (ed), Hkltiple Laboratory Screening. Academic Press, New York, 1969, 245-55. Project 3 Project 3 The undersigned agrees to accept responsibil ity for the scientific and technical conduct of the research project and for provision of required progress reports if a grant is awarded as the result of this application. Sec. Iv. CORE FEZARCH Core Research N.A. Objectives of Research The long term goal of artificial intelligence research at the Heuristic Pragrarraning Project (HPP) is to understand and build knowledge-based "intelligent agent" programs. Over the past decade we have studied such systems in the context of scientific and medical applications where human expertise for solving the problems was evident and where the difficulty of the problem seemed to lie just outside the boundaries of current AI methods. Because of the complexity of the applications, a significant part of the effort has been to make the expert knowledge of the problem explicit and to represent it appropriately in a knowledge base. This perspective has focussed attention on four areas for research: (1) (2) (3) (4) F&presentation - designing the symbolic structures for modeling the knowledge about a problem. Presently this phase is carried out by the system buil.ders; we intend to codify the knowledge used to make such decisions, both as an aid to the system builders and ultimately to enable the programs themselves to choose appropriate representations. Reasoning - modeling the appropriate inference mechanisms for a problem and building systems that incorporate those models. Knowledge acquisition - designing systems that acguire knowledge by corrrnunication with human experts. Multiple uses of knowledge - designing systems that use the symbolic representation of the domain knowledge for additional purposes such as consensus building (accommodating conflicting advic e from experts whose competence may be egual but whose "styles" vary), tutoring of human students by employing the knowledge base (both the information it contains and the way it is organized), and explanation (constructing a chain of rules which satisfactorily rationalize the system*s behavior to an observer. 92 Core Research Set IV.B. IV.B. Background and Rationale Artificial intelligence research at the Heuristic ProgrmrPning Project has utilized medical and scientific problems to focus the research effort. For many different applications over the last decade this has led to a cycle of research as follows: 1. Form a collaboration with a scientist to mrk on a specific problem in a challenging and interesting area. 2. Propose a,method for representing and manipulating the domain knowledge. This involves acquiring both formal and informal knowledge and developing a knowledge-based reasons with that knowledge. system that 3. Test the system. limits. In this phase the metbod.is pushed to its The relationship between the design and the performance of the system is used as the basis for future development. Both success and failure of a system can lead to further research steps. When a system fails to solve a problem, the seeds for further research can sometimes be found in the reasons for failure. Gn the other hand, when a knowledge-based system is successful, the desire to use it effectively uncovers a nlsnber of additional needs. Thus, intelligence many of the topics of artificial - such as the ability of a program to acquire knowledge, or to explain its reasoning knowledge base - , or to manage updates in a have grown out of programs that were at first successful only at problem solving. From this experience has come not only a set of approaches to building intelligent SF-=, but also a broader understanding of what intelligent systems should be like. The following sections discuss the background information about each of our major research areas. We will outline the progress that has been made on this topic and identify the major technological tools. Then in Section 1V.C. we will discuss our perception of the outstanding research issues and how we plan to approach them. IV.B.1. Representation 93 Sec. IV.B. Core Research One of the trends in our work has been to develop general purpose approaches for representing a broad range of knowledge in a knowledge base. This is illustrated by the Unit Package that has been developed for the MOLGEN project([40],[531) for experiment planning in molecular genetics. In the figure below are two units frgn a MOLTEN knowledge base. The first unit represents the restriction-enzyme EcoRl: the second unit represents a problem-solving goal for an experiment. NAME: SITEI"IPE: 3*-END: S.-END: NODE: MOIJAT : SUBSTRATE: RECCGNITICN-SITE: EJCORl STICUY-mxA OH P NON-PRECESSIVE 28500 DNA 12345678 G AATT C C TTAA 16 15 14 13 12 11 10"9 NAME: STATE : CONDS : LAB-GOAL-1 ACULTURE with ORGANISMS = ABACTERIUMwith EXOSONES = A VECTOR with GENES = RAT-INSULIN (PURE? ORGANISMS CLJLTURE) The usual way of using the Unit Package is to define general knowledge before specific knowledge. For example, general knowledge about enzyme, nuclease, and restriction enzymes would be entered before the specific knowledge about a particular restriction enzyme like EcoRl. The Unit Package is designed to encourage the use of description, such as the description of a culture in the second unit above. These descriptions are used for checking new information as it is entered and for pattern- matching operations that are part of a reasoning step. Reference [52] describes the Unit Package and compares it to other work on representation. *The examples above have illustrated the representation of "object-centered" or "noun-like" knowledge. Every reasoning program also contains a representation of the inferential 34 Core Research Set IV.B. knowledge. In the first version of the DENDPAL program, this kind of knowledge was represented as a program. This choice of representation had the consequence that a chemist could not enter new knowledge into the program (because he could not be presumed to be an expert programmer). Also, since the program structures were not understandable by the program itself, facilities for explanation of DENDRAL's reasoning had to be built into each part of the program. In the MYCIN program [51], developed more recently, the inferential knowledge was moved out of the program and into a knowledge base represented as production rules. This representation, because it was closer to the experts' representation than DENDPAL code was, allowed us to develop programs that could acquire rules from physicians. It also allowed the system to generate its own explanations bv examininu the rules it had used: Production rules-illustrate many of the themes which run through our work on representation. (1) Explicitness - Knowledge is encoded in a knowledge base and not just in programs. (For example, production rules are used to make inferential knowledge explicit:? The distinction between knowledge being in a program or in a knowledge base is a crucial one, for our purposes. Information encoded as a program can be run, and initially coded, more easily and quickly. However, as the program grows, it becomes more and more difficult to add new knowledge : its relationships to all the other knowledge must be considered and prograrraned explicitly. The latter methcd, storing knowledge in a separate data structure, a "knowledge base", enables the pieces of knowledge to be accessed and manipulated just like data. While their use, their running, may be somewhat slower, the system builder can now enter data in modular fashion, without much concern for the rest of the items in the knowledge base. He can give the system the knowledge it needs to reason about its own knowledge base. (2) Modularity - Knowledge is encoded in independent "chunks" as far as possible. (Production rules can be added or deleted from a knowledge base to change its problem-solving behavior.) The concepts chosen to represent the chunks of knowledge are those which are natural and useful to a domain expert. This is useful both if the expert is to input rules directly, and if he is to be convinced by the system*s explanation of its behavior. (3) Uniformity - Knowledge is represented so that it can be manipulated by general purpose programs. (Production rules and frames are two of the uniform methods for which we have general purpose processing routines. ) 95 Sec. IV.B. Core Research Cur perception of the outstanding research issues in representation is discussed in Section IV.C.l.. As canbe seen from the examples above, how knowledge is to be used is important in determining how it should be represented. With more uses for knowledge - explanation, tutoring, problem-solving - come more constraints on its representation. IV.B.2. Reasonina The first step in creating a problem-solving system is to develop and test a method for reasoning. In the DENDRAL program[ll] for inferring chemical structures from mass spectrometry data, the reasoning framework that we tested was called the Generate-and-test paradigm. This consisted of (1) al exhaustive generator of all pssible solutions (chemical structures) and (2) a set of pruning rules which used the mass spectrometry data to eliminate inconsistent answers. One of the issues that became relevant in studying this reasoning framework is the combination of possibly contradictory evidence. Data in many problems is incomplete and errorful; there is seldom a perfect match between an internal model and empirical data. Even if DENDRAL had a parfect model of how mass spectrometry data corresponds to chemical structures, the data from any particular run of a mass spectrometer are erroneous with respect to both extraneous and missing data. InDENDRAL, an overall domain- specific matching function was used which reflected a priori probabilities of errors in the data. Recently we have rcxamined this problem in the context of the GA1 prcgram[53] which solves an analogous problem from molecular genetics. For the MYCIN program we used backwards-chaining as a reasoning framework. This method develops a line of reasoning by chaining together MYCIN's inference rules (production rules) backwards from the goal of making the diagnosis towards the available evidence. This particular reasoning framework has proved especially convenient for developing computer explanations of the program's reasoning. To deal with imperfect evidence and inexact rules of inference, a mathematical model of certainty based on numeric "certainty factors" was developed. This constitutes a model of "plausible reasoning". In order to test the NYCIN approach in other domains, a domain independent package, EYYCIN (for "Essential MYCIN") has been created and is being utilized in other amlications discussed elsewhere in this proposal. When MYCIN is chaining back through its inference rules and discovers a need for information that cannot be inferred, it stops and asks for it. This approach is appropriate only when 96 Core Research Set IV.B. there is a way of supplying data as needed by the reasoning progrm. For some applications, such as signal interpretation, it is better for the program to make use of whatever itknows, because there is little chance that specific items of information can be suppliedon demand. Further limitations of a simple backwards-chaining model are (1) it is unidirectional, hence cannot mix top-down and bottom-up processing and (2) it is exhaustive, hence less efficient than approaches that reason hierarchically by mrking with abstractions. An alternative reasoning model which does not have these limitations is the "cooperating knowledge sources" model developed for the HEARSAY11 [201 system and incorprated in our AGE-I program. This model consists of (1) the "blackboard", a global data structure which holds the system's hypotheses, and (2) a set of "knowledge sources" (KSs) which contain the inference rules for the system. Because of gaps in the theory and implementation of the individual KSs and noise in the data, the KSs are individually incomplete and errorfill. A version of the "hmthesize and test" cooperation (to help overcome paradigm is used which emphasizes and data) incompleteness in both knowledge and cross-checking (to help correct errors). During the hypothesize part of the cycle, a KS can add a hypothesis to the blackboard: during the test part of the cycle, a KS can change the rating of a hypothesis in the blackboard. This process terminates when a consistent hypothesis is generated satisfying the requirements of the overall solution or when knowledge is exhausted. The power of the blackboard - over, say, a uniform QA4 assertional net - is its structure: it is n- dimensional, where the dimensions have some meaning (time, level of abstractness, geographic location, etc.). Hence each rule can know what part(s) of the blackboard to monitor, and each hypothesis is carefully placed at a meaningful spot on the blac.kboard, This is a simple modelling of the domain. but pwerful tyypa of analcgic Iwo research programs based on #is paradigm have been developed by our group 1431. One is the CRYSALIS program for interpreting x-ray crystallography data and the other is a military signal interpretation program. In these prcgrrms the HFARSAY rrcdel was extended by (1) extending the blackboard to allow for several independent hierarchical relationships among data and hypotheses and (2) extending the control structure. In each of the examples above, our study of reasoning methods always starts in the context of a problem in a scientific or medical domain. for further We then generalize the method and package it testing in other domains. When a framework for reasoning works well enough, research on other artificial intelligence topics, such as explanation or knowledge 97 Sec. IV.B. acquisition, often follows. Our perception of open research issues in reasoning methods is discussed in Section IV.C.2.. Core Research IV.B. 3. Knowledge Acquisition and Management One characteristic of the domain problems we have studied is their requirement for a substantial amount of domain expertise. Goldstein addressed this point in 1261: may there has been a shift in paradigm. The fundamental problem of understanding intelligence is not the identification of a few Ipwerful techniques, but rather the question of how to represent large amounts of-knowledge in a fashi& that permits their Zfective -i use and interaction. This shim based on azsof exoerience with programs that relied on uniform search or logistic techniques that proved to be hopelessly inefficient when faced with complex problems in large knowledge spaces. The relevant problem solving knowledge includes much formal and informal expertise of the domain expert; it also includes many mundane facts and figures that make up the elementary knowledge of the domain. Before a computer system can solve problems in the domain, this information must be transferred from the expert to the computer. Over the last decade, there has been some encouraging progress along this dimension. In DENDRAL, the rules of inference about mass spectrometry had to be put in machine form, but knowledge acquisition by the program from the chemist was beyond our technology. Knowledge was added by a painstaking process in which a computer scientist together with a chemist learned each other's terminology and then wrote down the chemical rvules for the simplest kinds of chemical compunds. Then the computer scientist entered the rules into the computer and tested them and reported the results back to the chemist. The reward for this effort over several years was a program with expert- level performance. It is interesting to compare the knowledge acquisition effort of the DENDRAL program with that of a more recent program 98 Core Research Set IV.B. - PUFF, the system for diagnosing pulmonary function disorder. In contrast with DENDRAL, PUFF was created in less than 50 hours of interaction with experts at PMC and with less than 10 man- weeks of effort by the kno&adge engineers. Part of this tremendous difference in development time is due to the fact that the domain of pulmonary function is much simpler than mass spectrometry. However, the main reason that the development was so rapid is that PUFF was built with the aid of an interactive knowledge engineering tool, EMYCIN. When knowledge engineers at the Heuristic Programming Project started the PUFF project, they already had a reasoning framework in which to fit the problem and an "English-like" language for expressing the diagnostic rules. The facilities that make ETMYCIN such a powerful tool are the direct result of the core research over the last five years on the MYCIN program. Another dimension of progress closely related to knowledge acquisition is knowledge management, that is, management of the global structure of a knowledge base. A knowledge base is more than a set of isolated facts: its elements are related to one another. In the DENDRALprogram, all of the knowledge was represented as programs and LISP data structures. If changing one part of the programmeantthatanother part had to be changed as well, the programmer had to know that. As programs or knowledge bases get large, this kind of effort becomes substantial. A system becomes too large to maintain when no one can remember all of the interactions and every change introduces bugs. TEIRESIAS[lS] extends the idea (developed init'ally in automatic programming research) that a system can i ai substantially in identifying sources of errors and can take on scme of the responsibility for making changes. Research issues in knowledge acquisition and management are discussed in Section IV.C.3.. IV.C. Methods of Procedure - We are interested in exploring the effects of new ideas about knowledge based programming on a variety of systems to effectively test the generality of these ideas. Each of the topics in the core research area will be developed in the context of more than one example program (see discussions of Projects l- 3) o The expert systems developed at the Heuristic Programming Project over the last decade can be used as tools for the 99 Sec. lS7.C. Core Research development of the core research topics. Each of the biomedical domains has particular aspects that can be utilized in this work: the IWLGEN program for molecular genetics research has methods for representing experiment planning, the MYCIN program for infection disease diagnosis and therapy has a well developed rule set, the PUFF program for pulmonary function test interpretation has a small rule set, and the VM program for interpreting physiological measurements from the Intensive Care Unit has a knowledge base that emphasizes knowledge that changes over time. rV.c.1. Representation In Section IV.B.l. we traced our work from specialized representations as in the DENDRAL program to representations of more general applicability - such as our production rule and frame methodology. Today's representation systems, even the "general" ones, do not solve all of the problems that we are encountering in our research. In most science, methods which are general are also weak. There seems always to be a need to tailor aspects of a representation to particular problems. The following representation issues stand out in our mrk: Time-based 'knowledge Several problems which we are working on involve situations that evolve over time. In the Ventilator Management (W program [21], time enters as instrument data that varies over time. The program must correctly track the stages of treatment on the treatment machines. In the RXprogram [S] for reasoning from time-based clinical data bases, statements about disease and treatment of patients need to be adequately quantified over time. In the MYCIN [Sl] work, we want the system to be able to resume a consultation session about a patient and appropriately @ate new knowledge about the patient as treatment progresses. In the rWLGEN project [40], the experiment planning program must plan a sequence of steps. It must predict how the laboratory objects will be changed over time as the manipulations proceed. The basic issues common to these projects are (1) time-specified reference to objects and (2) tracking causal changes on objects over time. difficult, triile these problems do not seem conceptually they do require extensions to the representational tools which we have available. Grain Size in Complex Systems P-e 100 Core Research set rv.c. Among the virtues of production rules ' are (1) their modularity allows easy addition and modification of inferential knowledge and (2) they can be written in such a way that their grain size seems appropriate for explanation systems. As we move toward hierarchical reasoning methods the grain size of individual production rules seems too .snal.l for coherent explanations. Just as the reasoning methods work with abstractions to reduce the combinatorics, explanations of this should also be abstract. art. At present, the problem of factoring knowledge is an opaque When a frame-structured representation is used, a knowledge engineer makes decisions about what facts to group together. This decision takes into account indexing during problem solving and the interactions among items in the knowledge base. In hierarchical reasoning methods knowledge is viewed with a varying grain size; it starts with an astract conceptualization at the beginning of problem solving and moves toward finer detail as the solution proceeds. Although we have some understanding of how to organize a bcdy of knowledge hierarchically, much tvJork remains to be done to make the best use of that organization during knowledge acquisition and problem solving. Matching representation methods to problems In our current systems, a knowledge engineer must learn the particulars about a problem and then pick or appropriate representation. develop an We Vauld like to extend current AI ideas in the design of a system which takes more responsibility for choice of representation. Such a system will select or modify its representations combining the knowledge of the limits and &vantages of representations needs. with the knowledge of its own Iv.c.2. Reasoning In Section IV.B.2.j we traced our research on methods of reasoning fram the Generate-and-Test paradigm (DENDRAL, GAl), to bac.kwards chaining (MYCIN, EMYCIN, -PUFF); to knowledge sources model (CRYSALIS, HASP, AGE-l). we discuss core issues related to these reasoning as some ideas for new models. the cooperative In this section models as well Incomplete &asoning %ee [lS] for a discussion of different ways of using this formalism. 101 sec. ls7.c. Core Research One of the themes in all of our methods of reasoning is the treatment of inexact and incomplete knowledge. One of the difficulties which we have perceived in MYCM*s simple Cl? model is that the representation is inadequate for discriminating between (1) absence of evidence and (2) evidence of absence. This example illustrates how the needs of the reasoning program have to influence the fundamental representations used in the system. Reasoning with Abstractions The availability of the Unit Package [52] has broadened our capabilities for representing abstractions. For example, an organism can be variously described as "a bacterium", "E.coli I(- 12", "a bacterium that is grampositive", or even "a bacterium with a vector which has the rat-insulin gene". A reasoning program can use the descriptions available in the Unit Package as abstractions in its reasoning process. We are currently using this idea in the MOLGEN project for reasoning. about experiment planning. Orthogonal Planning One of the themes in our representation work is to make knowledge explicit for general processing. We have carrid this theme into an exoerimental framework for reasoning being developed currently *&I the MOLGEN project. The idea is to make the reasoning explicit in the operations, which are carried out by a planner, knowledge base. These operators then implicitly define an abstract "planning space", Our hope is that this will provide a computer with a planning method more powerflul and flexible than previous hierarchical planning methods. The feasibility of this approach is currently being tested. Matching Reasoning ~Yethods to Problems One of our long term goals in developing and Iunderstanding reasoning methods is to develop a theory for matching reasoning methods to problems. Such a program would combine knowledge of the limitations of available reasoning frameworks with the needs of an application to aid in the design of a knowledge based system. We have started on this problem with the research of the AGE project within the HPP. 102 Core Research set Iv.c. IV.C.3. Knowledge Acquisition and Management In Section IV.B.3., we traced our work on knowledge acquisition from the DENDRAL program, where knowledge was acquired by a knowledge engineer and then programmed into the system, to the PUFF example where the EMYCIN package greatly accelerated the creation of a consultation system for pulmonary function diagnosis. Three Phases of Knowledge Aquisition As a result of our recent experiences with the SACCN program [3], we have found it useful to characterize the knowledge acquisition process as occurring in three distinct phases. We have done the most research on the third phase and plan to work our way towards the first phase. (1) Framework Identification. making The first phase corresponds to untlal decisions about the typical advice the consultant will give and the major consultant will use. reasoning steps the (7.1 Acquisition of Fundamental Concepts. This is followed by an extended peril of defining parameters and objects. These objects form the fundamental vocabulary of the domain. Using this initial domain vocabulary, a substantial portion of the rule base is developed. This process, captures enough domain expertise to allow the consultation system to give advice on the large n&er of common cases. (3) Acquisition in a Well-Developed Knowledge Base. final phase, f&gr In the ' interactions with the eet tend to refine and adjust the established rule base, primarily to handle more obscure or complicated cases. In this phase, the system can draw on examples from the knowltige base to guide the acquisition process. Previous work on the ~TEIRESIAS program [15], which explored one possible method for handling the "final phase", the basis for our research in knowledge acquisition. will provide of the acquisition task This phase utilizes the large bcdy of knowledge to set the appropriate context for understanding new facts. Consistency 103 sec. Iv.c. Core Research Developing an understanding of the automatic management of knowledge during and after its acquisition is an important aspect of our research aims. The knowledge base consists of the totality of concepts and relations between concepts that have been presented to the program. We will investigate methods for determining the consistency of the aggregate knowledge base. The quality of the knowledge base is improved through experimentation. Cases are run (for medical domains) by selecting a diverse set of patients and comparing the results to the conclusions of our expert. When the results don*t match, the knowledge base must be updated to account for those changes. Ttio operations are important for this process: (1) the ability to determine the piece or pieces of knowledge that must be changed and (2) determining that changing the knowledge to correct the results on one patient will not produce incorrect results when applied to another patient. Another possibility is to identify and;' in effect, live with inconsistency, just as people apparently do. Predominantly rational behavior may be evinced by a system which does not satisfy consistency requirements. The key test is whether the elimination of any "inconsistent" rule makes the system behave better or worse in the long run. This is closely tied to consensus-formation, as discussed in the next section. N.C.4. Multiple Uses of a Knowledge Base --- We are exploring many additional uses of the knowledge base beyond the performance aspects for which we acquired the knowledge. Three areas are of interest: using the knowledge for explanation of the reasoning steps of the program, using the knowledge for intelligent teaching about the domain, and using the knowledge base as a vehicle for building consensus among exparts. ESrplanation The use of explicit inference rules in a knowledge base has made it possible to generate an explanation of the programs' reasoning steps. While this has been achieved in the "backwards chaining" reasoning model, it is more difficult in the reasoning methods which reason hierarchically. We will examine methods for modifying the level of explanation based on the abstractions used by the program and a model of the user. 184 Core Research set Iv.c, Tutoring The act of explaining the knowledge has led to the problem of using the knowl&ge base for tutoring purposes. Our initial expriment with this in the MYCM framework [12] demonstrates the potential educational value of this use of the knowledge base. Under another proposal (pending to CNR & AReA) we will be exploring strategies for presenting the contents of a knowledge base represented as a set of rules. Here we propose to extend those methods for relating to the user the contents of knowledge bases stored in other representations. Consensus Building We prolpse to investigate awroaches for building consensus among experts. Because the strength of consultation programs will in large part lie with their ability to pool knowledge from several sources, it is important to recognize apparent differences of opinion among experts and to assist, when possible, with arriving at a consensus. This represents another version of the consistency checking problem: comparing the ramifications of multiple versions of knowledge and providing the capability to guide an interaction in which such differences are "ironed out". Of course there may be times when both versions of the knowledge may need to be stored and appropriam flagged so that users can select which experts' opinion they will follow during a consultation. reasoning (e.g., The exprts may wish to select a stvle of empirical vs theoretical), rather thana particular individual's set of rules. itself may be able Ultimately, the system to choose from differing advice in its knowledge base. All of these areas require some aqmentation to the knowledge base to provide the causal reasoning steps upon which the knowledge is tied. This allows a program to explain why a particular rule was written in addition to telling how the rule was used to make a particular conclusion. Similar needs have been shown in the use of a rule base for tutoring and for determining consensus among experts [37]. Often, a rule will be put into the system cast in a much more specific form than that to which the knowledge truly applies. One task to investigate is how to generalize to just the proper level. More complex still are the subtle changes that accompany a rule as it is generalized (e.g., changing certainty factors). 105 Sec. lS7.D. Core Research IV.D. Significance The significance of this work is twofold: 1. Understanding how to represent inexact and incomplete knowledge symbolically so that a system can perform complex intelligent processes - like .diagnosis and explanation. This work expands the boundaries of what we understand how to do with computers. 2. Investigating the fundamental questions that underlay the development of domain-independent tools of AI discussed elsewhere in this proposal, Gne of our ultimate goals is to understand the techniques employed in building such programs. It has always been difficult to determine if a particular problem-solving -method used in a particular knowledge-based program is domain-specific or whether it can generalize easily to other domains. In current knowledge- based programs, the domain knowledge and the manipulation of it using AI techniques are often so intertwinti that it is difficult to uncouple them, to make a program useful for another domain. This long range goal, then, is to isolate AI techniques that are general, to determine the conditions for their use: to build up a knowledge base about AI techniques themselves. We will carry out our research with this question in mind: what are the criteria determining whether a particular problem-solving framework and representation system is suitable for a particular application? 106 Facilities v. FACILITIES AVAILABLE set v V.A. Hardware Al.1 computixq work will be carried out initially on the SUMEX facility, a dual processor DElC XI-10 system running TENBX. The system is located at Stanford, but is supported by NIH under grant RR-6785 as a national resource for the study of applications of artificial intelligence to problems in biology and medicine. It has available a wide variety of advanced programming languages (e.g., INTERLISP, SAIL), and support programs (e.g., text editors), as well as powerful file handling and storage management capabilities. Resources available at no cost to this program incllude CPU usage and disk storage, while access is via local dial-up lines and three networks (TYMNET, TELENET,andAFtP~), Within the next 18 months the SUMEX installation is also scheduled to receive a PDP-20/20 system that will be interfaced with the currently existing PDP-10. The new machine is intended for service-related applications of artificial intelligence to medicine, and some of our programsr once operational, would most appropriately be run on this machine. The machine will be used by other projects, however, and may occasionally be scheduled for sole use by one of these. Thus SUMEX can make no commitment to provide scheduled service to medical personnel wishing to use the programs routinely. The PDP-20/20 hence will function as a prototype for the kind of dedicated small machine that may eventually operate in the clinic. V.B. Software and Personnel Our proposal is to build on the knowledge representation and control techniques developed during work on the MYCIN, Molgen, PUFF, and AGE systems in the Heuristic Programming Project. New programs and data structures will, of course, be required. Starting with existing software packa9es, however, is a considerable advantage over developing the software - and design experience - de novo. The base language will continue to beINTERLISP. -- In addition to the computing ,zower and the large collection of existing software , access to the SUMEX system also offers the 107 Sec. V.B. Facilities benefit of being a part of the SUMEX-AIM community, The SUMEX user cornunity inclties a wide range of researchers in artificial intelligence united by a number of common interests. We have found our interchanges with them in the past to be very useful, and expect this to continue. lfj8 Collab. Arr. Sec. VI. VI. coLriJABoF!ATIvE ARRANGEMENTS Formal collaboration with Dr. Lindberg's group at the University of Missouri is informal exchange. the natural result of many years of The formal arrangement between the two institutions is that Dr. Lindberg's project will be funded as a subcontract from Stanford, with budget as indicated in the budget section. There is a long history of successful collaboration between the Stanford Medical School and the Computer Science Deprtment. The SUMEX Computer Facility is a physical demonstration of this collaboration, while the large number research publications is more evidence. of interdisciplinary In part, this is due to the physical proximity of the two groups; but more importantly, it is due to common interests and common goals. The SUMEX facility itself has removed many of the communication barriers which often halt interdisciplinary research. 109 sec. VII. VII. PRINCIPAL, INVESTIGA!IOR ASSURANCE P. I. Assur ante The undersigned agrees to accept responsibility for the scientific and technical conduct of the research project and for provision of required progress reports if a grant is awarded as the result of this application. JibI. 30, f 97% Date &bieG' k& &bv- V, I Principal Investigator 1lB Appendix 2. VIII. APPEFDICES VI1II.A. APPEhTIP_ A -- Annotated 1!YCIN Typescript -- In the following pages we have included many detailed examples of the t!YCI?? program in operation. These exemplify both the accomplishments and the limitations of the work we have done so far. Although we are not proposing expansion of the program's infectious disease knowledge at this time, these examples should help illustrate the kinds of capabilities that we intend to develop in a system for oncology protocol management. The examples in this appendix include the following: Section I - A sample production rule, translated into English. Section II - Instructions printed for new users if they request assistance when trying KCIN for the first time. Section III - Free-text case summary that may be entered by a physician for purposes of case identification in the future. Section IV - Detailed example of a consultation session for a patient with meningitis; the WRY and HOW commands of the reasoning-status checker @SC) are also demonstrated. Section V - Interactive session with the general question answerer !COA) regarding the consultation session in Section IV. Section VI - Example of ?!-YCXN's ability to assist with antibiotic dosage modification in renal failure patients; note that the program can also explain its decisions at this specialized task. Section VII - Example of a graphical option we have developed which pernits interested physicians to display a chart estimating the steady state blood levels of an antibiotic at a variety of regimens for modified dose or dosing interval. Section VIII - Example of a subsystem of FTPCTB in which the user can circumvent much of the extensive consultation session demonstrated in Section IV. If a physician is relatively certain of the infection and organisms to be treated, he may specify these as shown and MYCIN will simply assist with therapy selection. 111 Sec. VIII .A. Appendix 1?. Section IY - Example of MICIN's ability to rerun previously stored patients and to interact with an expert when a problem in performance is identified- rote that MYCIY and the expert have a "discussion" in which a missing rule is identified. The physician tells HYCIN the missing rule (in English) and the program translates it into its internal LISP representation. The case is then run again to see if the performance improves with the new rule in place. 112 Appendix A. Set VIII.>. T -0 Sample Rule with Addditional Stored Information RLTLE3GG -a---- [This rule applies to all cultures and suspected infections, and is tried in order to find out about the organisms (other than those seen on cultures or smears) which might be causing the infection] If: 1) The infection which requires therapy is meningitis, and 2) The patient does have evidence of serious skin or soft tissue infection, and 3) Organisms were not seen on the stain of the culture, and 4) The type of the infection is bacterial Then: There is evidence that the organisms (other than those seen on cultures or smears) which might be causing the infection is staphylococcus-coag-pos f.75) streptococcus-group-a t.5) Author: YU Literature: G. Karalazin "Sickle-cell anemia - CLINICAL F!ANIFESTATIONS IN 100 PATIENTS" AmJEledSci 264:51 1975. E. Barret-Connor "Acute pulmpnary disease and sickle-cell anemia" PRRD 104:155, Aug 1071. M. Robinson "Pneumococcal meningitis in sickle-cell anemia"NEJH 274:1@06 1966. 113 Sec. VII1.A. Appendix A. MYCIN 3-Jun-7g . . . Special options (type ? for help): ** Instructions? (Y or N) ** YES This is a computer program named mCIN that has been designed to advise you regarding an appropriate choice of infectious disease therapy. It is understood that you have a patient who may have an infection. Please answer the following questions, terminating each response with RETURN. To correct typing errors, use the DELETE key to delete single characters, W to delete a word, and C to delete the whole line. If you are not certain of your answer, you may modify the response by inserting 2 certainty factor (a number from 1 to 10) in parentheses after your response. Absolute certainty (10) is assumed for every unmodified answer. It is likely that some of the following questions can not be answered with certeinty. You may change an answer to a previous question in two ways. If the program is waiting for a response from you (that' is, has typed rc**,, ), enter CRANGE followed by the number(s) of the question(s) whose answers will be altered. You may also change a previous answer at any time (even when the program is not waiting for a response from you) by typing F (Fix), which will cause the program to interrupt its computation and ask what you want to change. (If the response to F is not immediate, try typing the RETURN key in addition.) Try to avoid going back because the process requires reconsidering the patient from the beginning and therefore may be slow. Note that you may also enter UNK (for UNKown) if you do not know the answer to 2 question, ? if you wish to see a more precise definition of the question or some examples of recognized responses, ?? if you want to see all recognized responses, the word RULE if you would like to see the decision rule which has generated the question being asked, the word WRY if you would like to see a more detailed explanation of the question, or the letters CA if you would like to interrupt the consultation in order to ask questions regarding the decisions made so far in the consultation. If you are ever puzzled about what options are available to you during a consultation, enter the word HELP and a list of options will be listed for you. Sample Response [user input follows the "**"I Does the patient have a risk factor for tuberculosis? **? One or more of the following are considered risk factors for tb: a) positive PPD (STU), b) history of close contact with a 114 Appendix ,1. Oec VII1.A. person having active tb, c) household member with a past history of active tb, d) chest X-ray showing apical scarring, e) granulomas seen on biopsy of any organ tissue. expected responses are: YES N@ Enter HELP for user options. ** Y-ES sLl-m?AJY: (type ctrl-0 to abort printout) UNH - answer not known 3 - Repbrases the question and gives examples of recognized responses ?? - prints a list of all recognized responses RULE - prints the current decision rule CA - program enters question-answering mode CHAPGE - go back and re-request answer to question number COr!MENT - enter comments about any aspect of the system's performance. Your comments will be forwarded to those in charge of the l!YCINIprogram. WHY - gives high-level explanation of the current reasoning chain that provoked this question. HOW - explains HOW the system will achieve 2 goal referred to by number in a previous eXQlan2tiOfl. EXPLAIN - provides a more detailed explanation of a previous answer given by a WPY command. FORGET - resets the explanation of the reasoning chain back to the lowest level, as if you never used the WPP/EXPLAIN commands. STOP - Kalts the program without completing the consultation. SAVE - Halts the program, saving the current patient on a disk file, retrievable at a later date. HELP - Prints this list 115 Sec. vII1.A. Appendix A. III. User's Description of Patient This will be patient 538 If you would care to enter a brief summary of PATIENT-538's case, to be used for easy future reference to this file, please do so now (type as many lines as required, terminate with a control-Z): ** 34 yr old white male with severe burn. After 15 days ** of hospitalization pt became disoriented and febrile. On 16th ** hospital day LP performed- PYN 100, PKNX99, Pro 2S0, Glu 5 (periph ** blood glu unk). WBC 1900 with 88X PMN and 51 bands. Pt had drug- ** induced aplastic anemia. Current med at time of LP: Cephalothin. ** Final org: Pseudomoas-aeruginosa. Correct Rx: Ideal would have been ** Carb + Tobra (or Genta). Same as pt 504 except gram stain of CSF ** omitted. z ** 116 Appendix P. Set !`I I I . .: . TV. Infectious Disease Consultation Session [User's responses follow double asterisks] --------PP_TIENT-53e-------- 1) Patient's name: ** PT53g 2) Pge: ** 34 YEARS 3) Sex: ** MAEL =MALE 4) Race: ** InY li o e. WHY is it important to determine the race of PATIENT-538?1 [l.Ol The race of a patient is one of the 5 standard parameters that identify a patient. This may be useful later in the consultation. For example, it is relevant in determining the organisms (other than those seen on cultures or smears) which might be causing the infection. [back to question 4...1 ** CAUCASIAN Please supply one answer under each heading. If your response is not understood, you will be prompted by the standard questioning method. (enter ? for normal prompting, * for questions that are not applicable). Please enter the results of positive cultures using the following table. site 5)**D@NE culture date date1 method! number collected positive of collection [urine,sputum only1 Please enter the results of pending cultures in the following table. site 6)** CSF culture date stain1 method of! number collected examined collection [urine,sputum only! 1234 30-Jan-77 YES * stain of morph1 smear [type DONE if no organism was seen on the smear1 7)**DONE site g)**DONE cul date ~011. smear examined?1 ~011. method1 Please enter negative cultures in the following table. 117 16) Please give the date on which clinical evidence (symptoms, signs, or laboratory tests) of the meningitis first appeared. ** 2?-Jan-77 22:15 17 ) Does Pt53P have an abnormal chest X-ray? ** NO 18 ) Does Pt538 have one or more of the following risk factors for tb: a) positive intermediate strength PPD, b) history of close contact with a person with active tb, c) a household member with a past history of active tb, d) apical scarring on chest xray, e) history of granulomas on biopsy of liver, lymph node, or other organs.? ** NO 151) Does Pt538 have any cutaneous lesions or rash on physical examination? ** NO Appendix A. Sec. VII1.A. site !?)**DONE culture date] method ofi number collected collection [urine,sputum only1 10) Do you suspect Pt53S may have an infection at a site from which you have not obtained culture specimens? ** NO Please enter information on current antimicrobials. drug name II)** CEPFALOTHIN 12)**DONE route of date) administration started IV 23-Jan-77 Please enter information on prior antimfcrobials. drug name 13)**DONE route of administration date started date1 stopped [Considering organisms (other than those isolated from cultures) which might be present...] 14) Has Pt538 recently had symptoms of persistent headache or other abnormal neurologic symptoms (dizziness, lethargy, etc.)? ** YES 15) Has Pt538 recently had objective evidence of abnormal neurologic signs (nuchal ridigfty, coma, seizures, etc.) documented by physician observation or examination? ** YES The CSF cultures will be considered to be associated with XFNINGITIS. 20) Does Pt538 have evidence of ocular nerve dysfunctfon? ** NO 21) Is Pt53S a compromised host (e.g. alcoholic, sickle-cell-disease, immunosuppressed)? * YES 22) Is Pt538 immunosuppressed (e.g. corticosteroid therapy, cytotoxic drug therapy, radiation therapy, leukemia, lymphoma)? ** NO 23) For how many days has Pt538 had abnormal neurologic signs? ** ?.2 HOURS 24) For how many days has Pt538 had neurologic symptoms? ** 1 DAYS 25) H2s Pt538 had recent evidence of tender or enlarged salivary glands? ** NO 26) Has Pt538 been exposed to any contagious disease recently (e.g. meningococcal disease, mumps) ? ** NO Please enter CSF findings in the following table CSF Percent Protein CSF Peripheral! WBC Pm Glucose Glucose 27)** 100 99 280 5 CJKNOWN 28) Do you know the results of a recent CBC of Pt538? ** YES Please enter CBC results in the following table WBC Percent Percent1 in 1000's PMN BANDS 2Q)**- 1.9 UNKNOWN UNKNOWN 30) Has Pt538 had any injury or insult to, or defeEt in the central- nervous-system (excluding neurosurgery)? ** NO 31) Has Pt538 ever undergone any type of surgery? ** YES 32) Does Pt538 have a history of NEUROSURGERY? ** NO 23) Does Pt538 live in a crowded environment? 4" MC 34) Do you suspect recent alcoholic history in Pt538? ** NO 35) Is meningitis a hospital-acquired infection? ** -cJffi [i.e. WHY is it important to determine whether the infection was acquired while the patient was hospitalized?] [2.0] This will aid in determining the organisms (other than those seen on cultures or smears) which might be causing the infection. It has already been established that c2.11 the infection which requires therapy is meningitis, and 12.21 organisms were not seen on the stain of this pending csf culture, and L2.31 the type of the infection is bacterial, and 12.41 neurosurgery is not one of the relevant items from the history of pt538, and L2.51 the age of pt538 is greater than 10 days Therefore, if [2.6] the infection was acquired while the patient was hospitalized 110 Sec. VII1.A. P.ppendi:s ,! . then there is evidence that the organisms (other than those seen on cultures or smears) which might be causing the infection is e.coli (.75) staphylococcus-coag-pos (.3) pseudomonas-aeruginosa (.3) klebsiella-pneumoniae (-5) Author: Yu [RULE5451 [back to question 35...1 ** HOW 2.3 [i.e. HOW was it established that the type of the infection is bacterial?] The following rules concluded about: the likelihood that the type of the infection is bacterial Rule Yes No i2.71 RULE148 (.19) L-2.81 RULE500 (051) l2.91 RULE501 C.95) f3.01 RULE502 C.97) l3.11 RULE526 C-98) f3.21 RULE504 (.?7) f3.31 RULE524 C-98) Thus, is has been established that the type of the infection is bacterial. [back to question 35...] ** HOW 2.5 ri .e. HOW was it established that the age of pt538 is greater than 10 days?] In answer to question 2 you said that the age of pt538 is 34 years Thus, it has been established that the age of pt53& is greater than 10 days. [back to question 35...] ** y 36) Is Pt538's clinical history consistent with EPIGLOTTITIS? ** NO 37) Is Pt53S's clinical history consistent with OTITIS-EIEDIA? ** NO 38) Has Pt538 ever undergone splenectomy? ** NO 30) Is Pt538 a bum patient? ** WHY [i.e. wl?Y is it important to determine whether PATIENT-538 has been seriously burned?] L4.01 This will aid in determining the organisms (other than those seen on cultures or smears) which might be causing the infection- 12c Appenc!iu .A. Set -KII.A. It has already been established that l4.11 the infection which requires therapy is meningitis, and [4.2] organisms were not seen on the stain of this pending csf culture, and I4.31 the type of the infection is bacterial Therefore, if l4.41 pt53E has been seriously burned then there is suggestive evidence C-5) that pseudomonas-aeruginosa is one of the organisms (other than those seen on cultures or smears) which might be causing the infection Author: Yu [RULE5781 [back to question 39...1 ** WiIy [i.e. WHY is it important to determine the organisms (other than those seen on cultures or smears> which might be causing the infection?] [S.G] . . . in order to determine the therapeutic regimen of PATIENT-53g. If L5.11 information has been gathered about organisms isolated from the patient, organisms noted on smears of the patient, negative cultures of the patient, suspected infections without microbiological evidence, current drugs of the patient and prior drugs of the patient, and r5.21 an attempt has been made to deduce the organisms which require therapy, and I5.31 You have given consideration to organisms (other than those seen in cultures or smears) which might be present, and found at least one organism which requires therapy. then determine the best therapy recommendation from amonethe drugs likely to be effective against the significant organisms [RULE0921 [back to question 39...] ** Y I have found evidence (RULE545 RULE-557) that treatment should cover for E.coli causing meningitis. I have found evidence (RULE545 RULE557 RULE578) that treatnent should cover for Pseudomonas-aeruginosa causing meningitis. I have found evidence (RULE545 RULE5571 that treatment should cover for Klebsiella-pneumoniae causing meningitis. I have found evidence (RLZE507) that treatment should cover for Diplococcus- pneumoniae causing meningitis. 40) Does Pt53$ have focal neurological signs? ** UNKNOWIT [Determining which drugs are desirable for use against the E.coli...] [Determining which drugs are desirable for use against the Pseudomonas- aeruginosa...] [Determining which drugs are desirable for use against the Klebsiella- pneumoniae...] 121 Sec. VII1.A. Appendix A. [Determining which drugs are desirable for use against the Diplococcus- pneumoniae...] l?y therapy recommendations will be designed to treat for organisms that either are very likely or, although less likely, would have a significant effect on therapy selection if they were present. It is important to cover for the following probable infection(s) and associated organism(s): INFECTION-l is MENINGITIS + l?.COLI [from clinical evidence only] + PSEUDOMONAS-AERUGINOSA [from clinical evidence only] + ELEBSIELLA-PNEUMONIAE [from clinical evidence only] + DIPLOCOCCUS-PNEDMOBIAE [from clinical evidence only] 41) Does Pt538 have a clinically significant allergic reaction to any antimicrobial agent? ** NO [Considering AMPICILLIN-AKD-GENTAMICIN for use against INFECTION-l...] 42) Do you have reason to suspect that Pt538 may have impaired renal function? ** NO 43) Pt538's weight in kilograms (or PCUXDS): ** 70 [REC-11 vy preferred therapy recommendation is as follows: In order to cover for Items (1 2 3 4>: Give the following in combination: 1) AKPICILLIN Dose: 3.5g (28.0 ml) q4h IV [calculated on basis of 50 mg/kg 2) GENTAMICIN Dose: 119 mg (3.0 ml, 8Omg/2ml ampule) q8h IV [calculated on 1.7 mg/kgl plus consider giving 5 mg a24h Intrathecal Comments: Monitor serum concentrations basis of Since high concentrations of penicillins c2n inactivate aminoglycosides, do not mix these two antibiotics in the same IV bottle. Do you wish to see the next choice therapy? ** YES Which, if any, of the above drugs do you wish eliminated from consideration? ** AMPICILLIN [Considering GENTAMICIN for use against INFECTI@f~-l...! [Considering PENICILLIN for use against INFECTION-l...] [REC;2] Next best therapy recommendation: 122 Appendix A. In order to cover for Items <1 2 3~: Give: GEXTAMICIN Dose: 119 mg (3.0 ml, 80ng/2ml ampule) qSh IV [calculated on bas 1.7 mglkgl plus consider giving 5 mg q24h Intrathecal Comments: Honitor serum concentrations In order to cover for Item <4>: Give: PENICILLIN Fee VTI1.A. is of Dose: 3.5 million units (7.0 ml) q4h IV [calculated on basis of 500C0 units/kg] RECOMMENDATION-2 is equivalent to RECOMMEh?>ATION-1. Would you like to see details of this comparison? ** YES Looking at the coverage of the most likely items, we see that: RECOKKENDATIOM-1 consists of 1 first choice drug (&!PICILLIN for ITE'H-1 and ITEM-4) and 1 third choice drug (GENTAMICIN for ITM-1, ITEM-2 and ITEM-3) RECOYMEKDATION-2 consists of 1 first choice drug (PENICILLIN for ITRM-4) and 1 third choice drug (GEMTAMICIN for ITE-2, ITEH-1 and ITEX-3) ORGANISMS RECOMMENDATION-1 RECOMMENDATION-2 most likely' Drug -- Choice Ranking Drug -- Choice Ranking ITEi?-1 AMPICILLIN-AND-GENTAMICIN - 1st GENTAMICIN - 3rd ITEM-2 GEMTAMICIN -- 2nd GEXTAMICIN -- 2nd IT?B-3 GENTAMICIN - 3rd GENTAMICIN - 3rd ITES-4 AMPICILLIN -- 1st PENICILLIN -- 1st (The desirability of a drug is defined to be its lowest ranking for the items it covers.) Do you wish to see the next choice therapy? ** NO 123 Sec. VIII-A. Appendix A. V. Answers to User's Questions about this Consultation and About the Rnowledge Base Please ask any simple-English auestions regarding the above consultation session or the overall knowledge of the consultation system. When discussing the consultation above, please refer to organisms, cultures, etc. by their numeric names (eg. ORGANISM-l). Questions which were asked during the consultation may be specified by the numbers that preceded them above, (e.g., 'WHY DID YOU ASK QUESTION l?'). Control-O will abort type-out, or 'cancel' the current question. Type HELP for a list of the options available. Enter the word STOP when you have no further questions. Type BELP for the list of question options. ** HELP PA-OPTIONS: (Type ctrl-0 to abort printout) ADVICE - prints suggestions for using the @A system effectively QT - prints examples of question types recognized by the system EQ - prints an explanation' of the specified question(c) from the consultation, e.g., EC 8 13 LP - prints list of attributes associated with object type, e.g., LP CULTURE. (Object types are patient, infection, culture, organism, and drug.) PR - prints specified rule(s), e-g., PR 373 49 87 2 EXPLTHEF - prints a short description of the therapy program REC - prompts user for alternative therapy RECommendation and compares it to mycin's recommendation RX - prints treatment lists for specified organisms causing the indicated infection, e.g., Ry MENINGITIS EMTEROCOCCUS J-USTIFICATIOW - when rules are printed out, include their justifications LITERATURE - when rules are printed out, include their literature references NOJUST - stop giving justification and/or literature with rules TERSE - print rules in terse format VERBCSE - turns off terse printing of rules STOP - escape from QA-module HELP - prints this list ** JUSTIFICATION ** LITERATURE ** WFIY DID YOU TREAT E.COLI? ITEX-1 is E.coli associated with meningitis. I decided to treat ITI?!-1 on the basis of clinical evidence only. The following rules concluded about: the likelihood that e.coli is one of the organisms (other than those 124 Appendix A. Set V1II.P. seen on cultures or smears) which might be causing the infection RULES45 was used to conclude that e.coli is one of the organisms (other than those seen on cultures or smears) &ich might be causing the infection C.74). The last question asked before the conclusion was made was 35. RULE557 was used to conclude that e.coli is one of the organisms (other than those seen on cultures or smears) which might be causing the infection (.?3). The last question asked before the conclusion was made was 27. The parameters most frequently used in these rules were: the infection which requires therapy whether a smear of the culture was examined whether organisms were seen on the stain of the culture the type of the infection Would you like a summary of ALL the parameters used by these rules? ** YES Parameter (rules with this parameter) the white count from the patient's peripheral CBC (in thousands) (557) whether the results of a recent CBC of the patient are available (557) whether the infection w2s acquired while the patient was hospitalized (545) the age of the patient (545) the relevant items from the history of the patient (545) the type of the infection (545, 557) whether organisms were seen on the stain of the culture (545, 557) whether a smear of the culture was examined (545, 557) the infection which requires therapy (545, 557) ** WBY DIDN'T YOU GIVE TETRACYCLINE FOR E.COLI I?J REC-l? Shall I describe the therapy algorithm for you?** YES DESCRIPTION OF THERAPY SELECTION [Type cntrl-0 to abort printout] MYCIN is designed to provide optimal therapy recommendations. The program uses the following three steps: The first step ranks the drugs for each organism that requires therapy. Rules for doing this take into account invitro (lab) sensitivities and the patient's response to prior and current therapies. In the absence of other information, default lab sensitivities are used. Preference is given to current therapies if certain criteria are satisfied. Next , 1 or 2 drugs are selected to cover for the organisms which are most likely to be causing the infection. As many of the "less likely" organisms as possible are treated, but they are disregard altogether if 3 or more drugs are necessary. Throughout, preference is given to drugs that appear higher in the treatment lists. (These lists can be examined via the RX option.) Finally, the set of drugs is tested for patient factors that preclude their administration, such as allergies, age, and renal function. TETRACYCLINE was not prescribed for ITEM-1 TETPACYCLINE was not prescribed for ITE?!-1 in RECOXl+EPTDATION-1 because: 125 Sec. VII1.A. Appendix A. TETP.ACYCLIPE was discounted for ITEM-1 because there is evidence that this e.coli is not sensitive to it. Would you like to see some details?** NO ** HOW DO YOU USE THE TYFE OF THE INFECTION? The following rules use: the type of the infection L54, 578, 574, 572, 570, 565, 564, 559, 557, 553, 551, 549, 547, 545, 543, 541, 539, 537, 535, 533, 531, 529, 523, 520, 519, 517, 515, 513, 511, 507, 300, 289, 288, 287, 280, 174, 167 The parameter most frequently concluded by these rules is: the organisms (other than those seen on cultures or smears) which might be causing the infection Would you like a summary of ALL the parameters concluded by these rules? ** NO Which of the rules do you wish to see? ** 2SO RULE2P.O v-w- [This rule applies to all cultures, an d is tried in order to find out about the organisms (other than those seen on cultures or smears) which might be causing the infection] If: 1) The infection which requires therapy is meningitis, and 2) The type of the infection is fungal, and 3) Organisms were not seen on the stain of the culture, and 4) The patient is not a compromised host, and \ 5) The patient has been to an area that is endemic for coccidiomycoses, and 6) The race of the patient is one of: black asian indian, and 7) The cryptococcal antigen in the csf was not positive Then: There is suggestive evidence (.5) that cryptococcus is not one of the organisms (other than those seen on cultures or smears) which might be causing the infection Author: YU Justification: Dark-skinned races, especially Filipino, Asian, and Black (in that order) have an increased susceptibility to coccidiomycoses meningitis. Literature: Stevens, D.A. et al. Hiconazole in Coccidiomycosis. Am J lled, 60:191-202, Feb 1976. * HOW DO YOU TREAT PNEL'MCCCCCAL PNEUMONIA The MYCIN system currently can treat the following infections: BACTEPE?!IA MENINGITIS CYSTITIS. Because the knowledge base is incomplete, all other infections will be treated like bacteremia. For treatment of DIPLOCOCCSS-P?!!ENIAE in BACTEREKIA 126 Appendix A. Set VIII.>. The drugs of choice (ranked by relative efficacy, assuming sensitivity) are: 1st choice: PENICILLIB AMPICILLI?? 2nd choice: NAFCILLIN CEPRALOTRIN ERYTRROMYCIN CLINDAMYCIN 3rd choice: TETUCYCLINE CRLORAMPHENICOL VAMCOMYCIN Duration of therapy: 28 days ** UPAT DOSE OF ERYTRROMYCIN DO YOU RECOK'KEXD? 'i&en ERYTRROMYCIN is given for BACTER~IA, the recommended dosage is: if age is > 12 yrs then 12 me/kg q6h IV else if age is > .Og yrs then 5 mg/kg q6h IV else i-f age is > .02 yrs then 5 mg/kg q12h IV [Different dosages are recommended for different infections.1 ** ROW DO YOU TREAT CYSTITIS In general, treatment for an infection depends on the causative organism. Please include in your question the organism of interest. ** STOP Sec. VII1.A. Appendix A. VI. Dose Calculation Consultation Session MTCIN 3-Jun-78 . . . Special options (type ? for help): ** DOSE Instructions? (Y or N) ** YES This program makes dosage recommendations for a patient with an infectious disease. You will be asked a few facts about the patient whfch will be used to determine the appropriate dosages. Since the recommended dose of an antibiotic may differ with infection, you will first be asked to enter the infection for which the drugs are to be given. Then you will be prompted for an antibiotic and will be told what dosage to give. You may then request dosing information for another antibiotic (type RETURN if you are not interested in other drugs). You may also request dosing information for antibiotics to treat another infection (type RETURN when you are done). --------------------------------------------------------------------------- -------PATIENT-541-------- I) Patient's name: ** OSCAR LOVEJOY 2) Age: ** 34 3) Sex: ** M 4) Do you have reason to suspect that Oscar Lovejoy may have impaired renal function? ** Y 5) What is the most recent creatinine clearance in ml/min which you feel is a true representation of Oscar Lovejoy's renal function? ** u 6) Vhat is the most recent serum creatinine of @scar Lovejoy (in mg/10Oml)? ** 1.9 7) What is the previous serum creatinine of Oscar Lovejoy (in mg/lOQEl)? ** 1.8 8) Number of days between Oscar Lovejoy's two most recent serum creatinines: ** 2 9) Oscar Lovejoy's weight in kilograms (or PC!UNDS): ** 70 10) Oscar Lovejoy's height in centimenters (or IXCRES): ** 175 Estimated body surface area is 1.9 sq. meters. Dosage adjustments will be based on the calculated creatinine clearance of 42.7 ml/nin/l.73 sq. meters (adjusted to average body surface area.) Infection: XENINGITIS Drug: GENTAMIC IN 128 Appendix A. Set VII1.A. After a loading dose of: 112 mg (2.8 ml, SOmg/2ml ampule) IV [calculated on basis of 1.6 mg/kgl, give: 70 mg (1.8 ml, SCmg/2ml ampule) qgh IV [calculated on basis of 1.0 mg/kgl plus consider giving 5 mg q24h Intrathecal Or, after a loading dose of: 140 mg (3.6 ml, gOmg/2ml ampule) IV [calculated on basis of 2.0 mg/kgl, give: 119 mg (3.(! ml, 8Omg/2ml ampule) ql4h IV [calculated on basis of 1.7 mg/kg] plus consider giving 5 mg q24b Intrathecal [normal dose is (1.7 mg/kg q8h IV plus consider giving 5 mg q24h Intrathecal)] New interval (or CR) ** 12 After a loading dose of: 133 mg (3.4 ml, gOmg/2nl ampule) IV [calculated on basis of 1.9 q/kg], give: 105 mg (2.6 ml, POmg/2ml ampule) q12h IV [calculated on basis of 1.5 mg/kgl plus consider giving 5 mg q24h Intrathecal Drug: CHLORAXPHENICCL Give: 1.75g (17.6 ml) q6h IV [calculated on basis of 25 mg/kg! Drug: Infection: Please ask any simple-English questions regarding the above consultation session or the overall knowledge of the consultation system. When discussing the consultation above, please refer to organisms, cultures, etc. by their numeric names leg. ORGANISM-l). Questions which were asked during the consultation may be specified by the numbers that preceded them above, (e.g., 'WRY DID YOU ASK QDESTICN 11'). Control-Q will abort type-out, or 'cancel' the current question. Type HELP for a list of the options available. Enter the word STOP when you have no further questions. Type HFLP for the list of question options. ** JUSTIFICATION ** LITERATLRE ** HCW DID YOU CALCULATE THE DOSE FCR GENTAMICIN? When GENTAMCIN is given for MENINGITIS, the recommended dosage is: if age is > 2 yrs then 1.7 mg/kg q8h IV plus consider giving 5 mg a24h Intrathecal else 2.3 mg/kg q8h IV plus consider giving 2.5-4 mg/day intrathecal 129 Sec. VII1.A. Appendix A. The nornal dose for Oscar Lovejoy is: 119 mg (3.0 ml, 80mg/2ml ampule) q8h IV [calculated on basis of 1.7 mg/kgl plus consider giving 5 mg q24h Intrathecal. GENTAMICIN is excreted by the kidneys, so its dosage must be modified in renal failure. The following table shows how the patient's renal function was determined: Identifier Value Definition SCRl 1.9 SCR2 1.8 CCR 46.9 CCr(f) 42.7 CCr(n) 100 CCr(min) 80 the most recent serum creatinine of Oscar lovejoy (in mg/lOOml) the previous serum creatinine of Oscar lovejoy (in mg/lOCml) Oscar lovejoy's most recent creatinine clearance (in ml/min) Oscar lovejoy's creatinfne clearance, adjusted for normal body surface area (ml/min/1.73 sq. meters) the average normal creatinine clearance for Oscar lovejoy's age group the minimum normal creatinine clearance for Oscar lovejoy's age group Since CCr(f) < CCr(min), PATIENT-541 has renal impairment. To modify the dosage of GENT,&ICIN, the following pbarmacokinetic parameters were used: Identifier Value Definition a 7.0 the percentage hourly loss of a drug due to nonrenal processes b .22 the slope of the plot of the elimination rate constant of a drug versus the patient's creatinine clearance Q .57 the drug's elimination rate fraction: elimination rate of the drug as a fraction of the drug's 'normal' elimination rate. Q was calculated using the formula: Q= [a + b * CCr(f)l / [a + b * CCr(n)l My first dosage recommendation decreased the amount of GEPTMICIN and left the dosing interval unchanged. The new amount was calculated by multiplying the old amount by Q. The second recommendation increased the dosing interval and left the amount of GENTIXICIN unchanged. The new interval was calculated by dividing the old interval by (3. ** HOW DID YOU DETERMIXE THE PATIENT'S BODY SURFACE AREA? RLZElO4 was used to conclude that Oscar lovejoy's estimated body surface area 130 is 1.0. Tee last question asked before the conclusion was made was IO. ** PR 104 RlJLE104 ---s-w [This rule applies to any patient, and is tried in order to find out about the patient's estimated body surface area] If: 1) The weight of the patient is known, and 2) The height of the patient is known Then: Use,Boyd's algorithm to calculate body surface area from weight and height [Boyd, E. The Growth of the Surface Area of the Puman Body, 1935 Qp 1 - 133.1 Author: Wraith Justification: A comparative evaluation with statistical analysis of the Boyd method with the DuBois formula demonstrated that the Boyd formula is more accurate for abnormal body types. [Sendray J et.al., Determination of human body surface area from height and weight, Applied Physiology, 7(1):1-12, July lQ54.1 Literature: Boyd, E, The growth of the surface area of the human body, c. 1935, pp 1-133. 131 Sec. VII1.A. Appendix A. VII. Graph of Blood Levels of Anitbiotics The graph below provides an estimate of the steady state blood levels of gentamicin over time for the three regimens suggested by Mycin. A graph of this form may aid the physician in the selection of the most appropriate regimen. T i III e i n Blood level of CENTAMICIN [mcg/mll 1.7 2.7 3.7 4.6 5.6 6.5 +----f---3---3---3----c----e----t---3-----4---4 I DB I D B I D B I D B I 2.5@+ D B I ! D B I D B I I D B IDB I I 5.00+ IDB I 1 * B I DIB I I D * I I D B 1 f 7.5c+ DB * i DB 11 B 1 I D I B I D I B I / D 1c.o0+ B I I D / BB 1' I D I D IB I IB 1 1 D D 12.50+ I ID D B II1 B * B 11 D I B I JJ I B I 15.00+ D I I! I I D D I B I B I / i B B D I D I 17.50+ B DI I /B DI * DI I B 1 DI B 1 * 20.00+ B I DI HIC ICZY D - drug level for modified dose [ 7C.0 mgm every S.C hours1 132 Appendix A. $ec iTI11 1 . . . I - drug level for modified interval. [ 119.0 mgm every 14.0 hours1 B - drug level for modified interval and dose 1 98.0 mgm every 12.@ hours.1 SC - minimum inhibitory concentration [mcg/mll TL - toxic level [mcgjmll * - indicates overlapping curves 133 Sec. VI1T.A. Appendix A. VII. Therapy Recommendation Consultation Session MYCIN ?-Jun-78 . . . Special options (type ? for help): ** R.EC Instructions? (Y or N) ** YES This program recommends therapy for a patient with infectious diseases. You will be asked a few facts about the patient which will be used to determine the dosages and appropriateness of giving certain antibiotics. Then you will be prompted for the infection(s) that you want treated and the organism(s) which you feel may be responsible for each infecton. MYCIN cannot make a recommendation without knowing the infection(s) and organism(s) to be treated. Other information is requested for the purpose of refining therapy selection, but is not essential; you may answer UNKNOWN to any of these questions. Case 2, AA, 47-24-31, WC, A 3 year old female with one day history of headache, lethargy, fever to 104. L.P.: Protein 25, glucose 95 110 cells with 962 polys. Pe: Meningismus, petechiae on arms and shoulders. 28-NOV-77 11:05:51 [consultation of 23-JUN-76 12:OOJ --------PATIENT-683-------- 1) Patient's name: (first-last) ** PT683 2) Age: ** 3.5 3) sex: ** FEMALE 4) Is Pt683 currently receiving therapy with any antimicrobial agent? ** NO 5) During the present illness, has Pt683 been treated with an antimicrobial agent which is no longer being administered? ** NC Infection: MENINGITIS Organism: VIRUS [Determining which drugs are desirable for use against the Virus...] Are there any other organisms which might be causing this meningitis? ** YES Organism: HEMOPHILUS-INFLUENZAE 6) Has the lab reported the in-vitro antibiotic sensitivities of the hemophilus-influenzae causing the meningitis (IT'EM-2)? ** N [Determining which drugs are desirable for use against the Hemophilus- influenzae... 1 7) Does Pt6S3 have a clinically significant allergic reaction to any antimicrobial agent? 134 Appendix A. Zec VII1.A. ** NO Are there any other organisms which might be causing this meningitis? ** YES Organism: NEISSERIA-MENINGITIDIS S) Has the lab reported the in-vitro antibiotic sensitivities of the neisseria-meningitidis causing the meningitis (ITEM-3)? ** N [Determining which drugs are desirable for use against the Neisseria- meningitidis...] Are there any other organisms which might be causing this meningitis? ** NO Are there additional infections for which you want therapy recommendations? ** NO INFECTION-l is MENINGITIS i- . . . VIRUS Supportive therapy. + HEMOPHILUS-INFLUENZAE + NEISSERIA-MENINGTTIDIS [Considering AMPICILLIN-AND-CHLORAMPHENICOL for use against INFECTION-!...] 9) ?&at is the most recent creatinine clearance in ml/min which you feel is a true representation of Pt683's renal function? ** UNKNOWN 10) What is the most recent serum creatinine of Pt6.83 (in ng/1@0ml)? ** UNKNom! 11) Pt683's weight in kilograms (or POUNDS): ** 17.7 [REC-11 My preferred therapy recommendation is as follows: In order to cover for Items <2 3>: Give the following in combination: 1) AMPICILLIN Dose: S86 mg (7.0 ml) q4h IV [calculated on basis of 50 mg/kgl [The dose should be modified in SEVERE renal failure. I can adjust the dose only if a creatinine clearance or a serum creatinine is obtained, and the patient's weight is known.! 2) CHLCRAMPHENICOL Dose: 443 mg (4.4 ml) q6h IV [calculated on basis of 25 og/kgl Please ask any simple-English questions regarding the above consultaticn session or the overall knowledge of the consultation system. When discussing the consultation above, please refer to organisms, cultures, etc. by their numeric names (eg. @RGANISM-1). Ouestions which were asked during the consultation may be specified by the numbers that preceded them above, (e.g., 'WHY DID YOU ASK OUFSTION l?'). Control-O will abort type-out, or 'cancel` the current question. Type HELP for a list of the options available. Enter the word STOP when you have no further questions. Type HELP for the list of question options. 135 Appendix A. Sec. VII1.A. ** JUSTIFICATION ** LITERATURE ** VKY DID YOU GIVE AMPICILLIN AND CBLORPl+PHENICOL IN REC-l? Shall I describe the therapy algorithm for you?** E?O AMPICILLIN-AND-CRL@RAMPHEFICOL was not prescribed for ITW-1 or ITEM-3 For ITEM-1 no drug was prescribed. For ITEM-3 AMPICILLIN was prescribed. AMPICILLIN-AND-CRLOIWQHENTCOL was prescribed for YE?!-2 in RECOMKEhQATION-1 because: - AMPICILLIN-AND-CHLORAMPHENICOL is used for treating hemophilus- influenzae in meningitis -- AMPICILLIN-AND-CHLCRPHENICOL was considered for treating ITEM-2 because it is more appropriate than AMPICILLIN for this patient. [RULEO66] -- ITEM-2 is sensitive to AMPICILLIN-AND-CHLORAMPHE~ICOL -- there were no contraindications for it -- it was part of the recommendation which covers for all of the items, using the fewest number of drugs. Vould you like to see some details?** YES The following decision(s) are relevant: The drugs considered for the hemophilus-influenzae ranked according to desirability are ampicillin-and-chloramphenicol (.95) chloramphenicol c-89) carbenicillin c.79) tetracycline c-76) erythromycin t-75) cephalothin (.7) [RULE101 RULE0661 The drugs to which the hemophilus-influenzae is sensitive are ampicillin- and-chloramphenicol (1.0) chloramphenicol (1.0) carbenicillin (l.C) tetracycline (1.0) erythromycin (1.0) cephalothin (1.0) [RLZE098 RLZE~451 For the conclusion concerning ampicillin-and-chloramphenicol see RULE445. The meningitis is one of the infections for which ampicillin-and- chloramphenicol is appropriate [RULE4411 ** PR 66 RULEC66 ---a-- If: 1) Ampicillin is one of the drugs considered for the item ranked according to desirability, and 2) The diagnosis of the infection is meningitis, and 3) The age of the patient is greater than or equal to 1 week, and 4) The identity corresponding to the item is hemophilus-influenzae, and 5) The item is one of the most likely (top cluster) identities causing an infection in the patient, and 6) It is not known whether ampicillin is one of the drugs to which the item is sensitive in vitro, and 7) A: The patient is not allergic to one or more B: Chloramphenicol is not one of the drugs to allergic antibiotics, or which the patient is 136 Appendix A. Set VII1.A. Then : It is definite (1.p) that the combination of ampicillin with chloramphenicol is a more appropriate therapy for use against the item Author: Wraith Justification: Due to the increasing number of Ampicillin-resistant Hemophilus influenzae isolated in the last few years a combination of ampicillin and chloramphenicol is recommended until sensitivities of the organism are knotm. At that time, chloramphenicol should be discontinued unless the organism is ampicillin-resistant, in which case, ampicillin is discontinued. Literature: I. Katz SL: Ampicillin-resistant Eemophilus influenzae type I?: A status report. Pediat. 55:66, 1975. 2. Katz SL et.al.: Ampicillin-resistant strains of Eemcphilus influenzae type b, Pediat. 55:145, 1075. 3. Nelson JD: Should ampicillin be abandoned for treatment of Remophilus influenzae disease? JAMA 229:322, 1974. ** REC For each item in turn, enter the drug you would have prescribed in RECOI%EKDATION-1. In order to minimize the number of drugs in your recommendation, you nay not want to prescribe therapy for every item. Items which represent the most likely organisms are indicated with a plus sign (+). + ITEM-1 -- the virus ** i= ITEM-2 -- the hemophilus-influenzae ** CARBENICILLIN + ITDI-3 -- the neisseria-meningitidis ** CARBENICILLIN [Checking for contraindications...] [Considering CARBENICILLIN for use against INFECTION-l...] [No contraindications found...] [Now comparing vour prescription to ?!YCIN's...] . ORGANIS% XXIX's regimen "most likely" Your regimen Drug -- Choice Ranking Drug -- Choice Ranking ITM-2 AMPICILLIN-AND-CHLORAMPREMICOL -- 1st CARBENICILLIN -- 3rd ITEX-3 AMPICILLIN -- 1st CARBENICILLIX -- 2nd 137 Set VII1.A. Appendix A. (The desirability of a drug is defined to be its lowest ranking for the items it covers.) Your recommendation for the most likely organisms(s) is not optimal, since your prescription of 1 third choice drug (CARBENICILLIN for ITEM-3 and ITEM-2) should be avoided. [You may refer to your regimen as RECOMMENDATIOM-2 in later questions.] ** STOP .:aScnci:; is. I . IX. Running a Consultation of a Stored Patient in Summary Form with Rule Acauisition to Correct Diagnosis Case 10, TS, WC, o-49-13-77, A 73 year old female with rheumatoid arthritis, congestive heart failure, and gastritis on chronic prednisone 15 mg/day and coumadin. Admitted for a GI bleed. Progressive obtundation began while in hospital. LP revealed: protein 158, glucose 20, 38 cells with 34 % Polys, 66% lymphs. Gram stain and India Ink prep revealed budding yeast-like cells. Treatment: Begun on Amphotericin B IV and IT as well as S-fc. Final dx: Cryptococcal meningitis. ------------------------------------------------------------------------------ 29-NOV-77 01:45:12 [consultation of 9-OCT-76 12:00] Pt709 is a 73 year old female, Caucasian. Patient-709 is not an alcoholic. Patient-709 is a compromised host. Patient-709 is immunosuppressed. Patient-709 does not live in a crowded environment. Past Medical History: Patient-709 is not allergic to one or more antibiotics. Patient-7C9 has not undergone surgery. Patient-709 does not have a tb risk factor. Patient-700 has not recently been exposed to a contagious disease. Recent Medical History: The csf has not been tested for cryptococcus antigen. Patient-709 has not shown symptoms of mumps= Otitis-media is not one of the diagnoses which are consistent with the patient's clinical history. Epiglottitis is not one of the diagnoses which are consistent with the patient's clinical history. Patient-709 has not had an injury or insult to, or defect in the CNS. Patient-709 has had recent neurologic signs. The duration of the neurological signs is 4 days. Patient-709 has had recent neurologic symptoms. The duration of the neurological symptoms is 2 days. Physical: The weight of PATIENT-709 is 68.1 kgms. The height of PATIENT-709 is 165.1 ems. Patient;709 is febrile. Patient-709 has not been seriously burned. Patient-709 does not have a rash or cutaneous lesions. Patient-709 has not focal neurological signs. Patient-709 does not have evidence of ocular nerve dysfunction. Patient-709's X-ray is abnormal. Active-tb is not one of the diseases which the patient's chest X-ray suggests. Lobar-pneumonia is not one of the diseases which the patient's chest X-ray suggests. 139 Sec. VII1.A. Appendix A. CULTURE-l: This pending csf culture was drawn on 9-OCT-76 12:OO. A smear of this pending csf culture was examined. The stain of ORGANISM-l is indiaink. The CSF cultures will be considered to be associated with MENINGITIS. The number of days since clinical evidence of the infection was first noted is 4 days. [Considering organisms (other than those isolated from cultures) which might be present...] Lymphoma is not one of the diagnoses of the patient. Leukemia is not one of the diagnoses of the patient. The CSF protein is 158. The WBC from the CSF is 38. The percent of pmns in the csf is 34. The csf glucose value is 20. The simultaneous blood glucose value is 95. The results of a recent CBC of pt709 are available. The white count from pt7CO's peripheral CBC (in thousands! is 9.4. Pt709 has received corticosteroids just prior to or during this infection. The infection was acquired while the patient was hospitalized. I have found evidence (RULE564) that treatment should also cover for VIRUS causing YENIPGITIS. [Determining which drugs are desirable for use against the virus...] Xy therapy recommendations will be designed to treat for organisms that are either very likely or, although less likely, would have a significant effect on therapy selection if they were present* It is important to cover for the following probable infection(s) and associated organism(s): INFECTION-1 is ?IENI?IGTTIS + . . . VIRUS [from clinical evidence only1 Supportive therapy. Please repeat the LP in 8 - 24 hours for cultures (bacterial, fungal, Tb, and viral), cells, and chemistries. Blood cultures should be drawn. Is this correct? ** NO I was unable to make any conclusion about the identity of ORGANISM-l Is that correct? ** NO Should it have been possible to conclude a value? ** YES Vhich one(s)? ** CRYPTOCOCCUS Let's look at the CRYPTOCOCCCS . . . Appendix A. SEC VI'1 .A+. Do you want to 11 track doqm the problem yourself, or 21 allow me to guide the debugging, or 31 ignore it for now and fix it another time? Type 1, 2 or 3 *Jr 2 The following applicable rule(s) were unable to conclude about whether the identity of ORGANISM-l is cryptococcus RULE378 --s--w If: 1) The infection which requires therapy is meningitis, and 2) The stain of the organism is acidfast Then: 1) There is suggestive evidence (-7) that the identity of the organism is mycobacterium-tb, and 2) There is evidence that the organisms (other than those seen on cultures or smears) which might be causing the infection is coccidioides c-.&3) cryptococcus c-.8) Clause 2 of RULE378 [the stain of the organism is acidfastl was already known to be false for ORGANISM-l, so the rule was never tried. Is this rule correct? ** YES Should its premise have succeeded for this case? ** NO Is there a rule missing? ** YES Ok, then I need a rule that would allow me to deduce that the identity of ORGANISM-l is cryptococcus Would you like to give me such a rule now? ** YES Good... The new rule will be called RULE260 If: 1 - THE INFECTION WHICH REQUIRES THERAPY IS MENINGITIS and 2 - THE STAIN IS INDIAINK and 3 - Then: 1 - THE IDENTITY IS CRYFTOCOCCUS (.7) and 2 - This may take a few moments. This is my understanding of your rule: RULE260 -s-e-- If: 1) The infection which requires therapy is meningitis, and 2) The stain of the organism is indiaink Then : There is suggestive evidence (.7) that the identity of the organism is cryptococcus Okay? (Yes or No) ** YES Sec. VI1I.P. Appendix A. This rule may be best described as a rule which: l- applies to all organisms 2 - applies to organisms from positive cultures 3 - applies to organisms seen on smears of specimens for pending cultures ** 1 Please type your last name- BENNETT Please describe briefly why you added (or changed) this rule. Type control-Z when done. ** THE SYSTEM FAILED TO CONCLUDE CRYPTOCOCCUS FOR AN INDIAINK STAIN ARD ** ?+'ENINGITIS AS THE INFECTION ** RULE260 has now been added to the knowledge base. I will now re-run the consultation to check the effectiveness of your new rule. Get comfortable, this may take a few moments. -------------------------------------------------------------------------------- 2%NOV-77 C1:53:52 [consultation of 9-OCT-76 12:00] The CSF cultures will be considered to be associated with ?fENINGITIS. [Considering organisms (other than those isolated from cultures) which might be present...] I have found evidence (RULE564) that treatment should also cover for VIRUS causing SNINGITIS. [Determining which drugs are desirable for use against the cryptococcus...] [Determining which drugs are desirable for use against the virus...] Ely therapy recommendations will be designed to treat for organisms that are either very likely or, although less likely, would have a significant effect on therapy selection if they were present. It is important to cover for the following probable infection(s) and associated organism(s): INFECTION-1 is MENINGITIS c CRYPTOCOCCUS [ORGANIS+11 . . . VIRUS [from clinical evidence only1 Supportive therapy. [My recommendations will cover for all of the most likely items (indicated by a "+" above). Less likely items will be covered unless this would 142 Appendix A. Set VIII-A. require an excessive number of drugs.] Well, it appears that things are straightened out. Is this correct now? ** YES Good, then the problem's been fixed. . ..contfnuing with therapy selection now. [Considering 5-FLUOROCYTOSINE-AND-AHPUOTERICIN-B for use against IWFECTTCN-1 . ..] Estimated body surface area is 1.8 sq. meters. Dosage adjustments will be based on the calculated creatinine clearance of 85.4 ml/min/1.73 sq. meters (adjusted to average body surface area.) Formal doses of all drugs will be given I XEC-11 Ky preferred therapy recommendation is as follows: Give the following in combination: 1) 5-FLUOFOCYTOSINE Dose: 2.568 (l@ 250mg-capsules) q6h PO [calculated on basis of 37.5 mgkgl Comments: Gradually increase dose from 12 mg/@ q6h to the above dose as tolerated. 2) AMPROTERICIN-B Dose: 20 mg (200.0 ml) q24h IV [calculated on basis of .3 mg/kgl Comments: Begin with 1 mg IV on day 1, 5 mg on day 2, 10 mg on day 3, and then gradually increase dose by 5-10 mg/day until the above dose is administered. Infuse slowly over 3-6 hrs. Plus, consider giving C.Smg intrathecally every other day. Greater than 15 mg intrathecally is rarely indicated. Please obtain samples for fungal, TB, and viral cultures, cytology, VDRL (blood and CSF), coccidiodes complement-fixation (blood and CSF), cryptococcal antigen (blood and CSF), viral titers (blood and CSF). An infectious disease consult may be advisable. Elood cultures should be drawn. 143 Appendix R APPENDIX B: ~~CVLEDGE ENGTITEEF.I>TG FOR mrm~ DECISION FJKIPTG: X Peview of Computer-Eased Clinical Decision Aids 1 IBTTiODUCTION As early as the 1950's ft was recognized that computers could conceivably assist with clinical decision makirg [57], and both physicians and computer scientists began to analyze medic21 diagnosis with a view to the potential role of automated decision aids in that domain [55]. A variety of techniques have been applied to computer-eided clinical decision raking, accounting for at least 600 references in the clinical 2nd computing literature C1041. In this article we review severs1 bethodologiss and attempt to identify tbe important issues thtt eccount for both the multiplicity of approaches to the problem and the limited clinical success of most of the systems developed to date. Although there have been previous reviews of computer-aided diagnosis [42!, (SGI, IIOGI, our emphasis here trill be somewhat different. ,- We will focus on the representation and utilization of knowledge, termed "knowledge engineering," and the inadequacies of data-intensive techniques which have led to the exploration of ncvel symbolic reesoning approaches during the last decade. 1.1 Beasons For AttenDtins Cornouter-Aided Medical Decision Hakina It is generally recognized that accelerated growth in medFca1 knowledge has necessitated greeter sub-specialization among physicians and more dependence upon assfstance from other experts when 2 patient presents with a conplex problem outside one's own area of expertise. The prinary care physician who sees the patient initially has thousands of tests available with a wide range of costs (both fiscal and physic211 and potential benefits (i.e., arrival et a correct diagnosis or optimal therapeutic management). Ever. the experts in a field may reach very different decisions regarding the msnagenent of a specific case [l??]. Diagnoses that are made, and upon which therepeutic decisions are based, hzve been shown to vary widely in their accuracy 1221, L771, I831 - Furthernore, medical decision making has traditionally been learned by medical students in 2n unstructured way, largely through observing and emulating the thought processes they perceiae to be used by their clinical mentors [Gel. Thus the motivations for attempts to understand and automate the process of 144 Sec. I IXTRODVC?ION clinic21 decision making have been numerous [LO61. They are directed both 2t d!esel'S case because it allowed them to perform tasks that they would previously not have been able to undertake at all. -- Fetrospective review of cases thet were treated et tbc referral center, -- but bzithout the use of the protocols,%howed 2 16X rate of variance from the rznagement guidelines specified In the algorithms; there was no such variance when the protocols were utilized directly. Thus algorithms nay be effective tools for the administration of complex specialized therapy in circumstances SUCK 2s those described. 2.3 DLscussFon of tbe Yethodologv -e Although clinical algorithms 2re emong the most widespread 2nd accepted of the decision aids described in t?is article, the sinplicity of their logic cakes it cl.ear why the tecknique cannot be effectively 2ppl.ied in most medic21 dorains. Decision points in the algorithms are generally binery (i.e., a given sign or symptom is or is not present), and there tend to be neny circumstances that c2n arise for which the user Ls advised to consult the supervisi np physician (or specialist). Thus the conplex decision tasks are left to experts, and there is generzlly no formal algorithn for managing the case from that point on . It is precisely the sinpllclty of the algorithmic logic, and the supervising expert "esc2pe valve", which has permitted nany 2lgorithns tc be Sec. 2 Clinical Aigotithms and Putomation represented on one or tuo sheets of paper and has obviated the need for direct commuter use in most of the systems. The contributions of clinical algorithms to the dfstribution 2nd delivery of kealtb care, to the trainirzg of paramedics, and to quality care audit, have been intpressive and substantial. Powever, the methodology is not suitable for extension to the complex decision tasks to be dfscussed in the following sections. 3 Databank Analysis for Proenosis and T'herany Selection 3.1 Caemfex.7 Automation of medical record keeping and the development of computer-based patient databanks have been major researc?. concerns since the earliest deys of medical computing. ??ost such systems have attempted to avoid direct interaction between the computer and the physician recording the data, with the systems of Yeed [115!, 11161 and Gteenes [I21 being notable exceptions. Although the earliest systems were designed merely as record-keeping devices, there have been several recent atterzpts to create programs that could also provide analyses of the information stored in the computer databank. Some ezrly systems l321, [471 had retrieval modules that identified all patient records matching 2 Eoolean combination of descriptors; however, further analyses of these records for decision making purposes was left to the investigator. Weed has not stressed an analytical component fn his automated problem-oriented record f1161, but others have developed decision a',ds which Llse medical record systems fashioned after his 1961. The systems for databank analysis all depend on the development of 2 cotnplete and accurate medical record system. If such a system is developed, a number of additional capabilities can be provided: (1) correlations among variables can be calculated, (2) prognostic indicators can be measured, and (3) the response to various therapies can be compared. A physician faced with 2 complex management decision can look to such 2 system for assistance in identifying patients in the past who had similar clinical probleas and can then see how those patients responded to varfous therapies. A clinical iavestigator keeping the records of hFs study patcents on such 2 system can utilize the prcgram's statistical capabFlities for data analysis. pence, although these applicat'ons are Inherently data-intensive, :he kinds of "knowledge" generated by? . specialized retrieval and statistical routices can provide valuable 151 Appendix B Sec. 3 Databank Analysis for Prognosis and Therapy Selection assistance for clinical decision Fakers. Fcr esmple, they c2n help physicians 2void the inherent biases that result when the indi*6due! pr2ccitioner bases his decisions pri,narily on his own anecdotal experience wfth one or trro patients having a rare disease or compl2x of symptoms. There are many excellent prcgraas in this category, one of which is discussed in some detail in the next section. Several others warrant mention, however. The HELP System at the University of Ut2.h [lO?], 11111, [112] utilizes a large data file on patients in the Latter-Day Saints Hospital. Clinic21 experts formulate specizlized "PELP sectors" which are collections of logical rules that define the criteria for a particular medical decfsion. These sectors are developed by an interactive process F;hereby the expert proposes important criteria for a given decision and i,.s provided with zctual data regarding that criterion based on relevant petients and controls fron the computer dstebank.. The criteria in the sector are thus adjusted by the expert until adequate discrimination is made to justify using the sector's logic as a decision tool". 7'h.e sectors are then utilized for a variety of tasks throughout the hospital. jnother system of !nterest is that of Feinstein et al. 2t Y2ie (171. TI-ey had specific petient nanagement decisions in mind when they developed their interactive system for estimating prognosis and guiding management in patients with lung cancer* Similarly, Rosatf et al. have developed 3 system at Duke Eniversity which utilizes a large databank on patients who have undergone coronary arteriograpby [821. Xew patients can be matched against those In the databank to help determine patient prognosis under a variety of management alternatives. 3.2 Example One of the most successful projects in this category is the P.PN??S systerr! of Fries [?Q]. The approach was designed originally for use in an outpatfent rheunatologv clinic, but then broadened to a general clinical database system (TOD) [MS;, [IL?] so that it became transferable to clinics in oncolcgy, metabolic disease, cardiology, endocrinology, and certain pediatric subspecielties. All clinic records are kept in a flow-cherting format in which a column in a large table indicates a Specific clinic visit and the rows indicate the relevant clinical partmeters that are bei_ng followed over time. ------------------------------------------------- 'This P recess might be seen as a tool to assist vith the forr.ulatFon cf cli,nical a por%thms ;Ls discussed In t5.e pre-iious sectton. hother apprztch u s I n p dttabank anaiysis for algorithm development is described in 1261. 152 Sec. 3 Databank Analysis for Prognosis and Therapy Selection '9:ese charts are naintairred by the physicians seeing the patient in clinic, and the new colum of data is later transferred to tbt? COEQUt??I Gatabank by a transcriptlonist; in this way tine-criented data on all patients are kept current. ?he defined database (clinical parameters to be followed) is determined by clinical experts, and in the case of rheumatic diseeses has now been standardized on a national scale [?61. The infomation in the databank can be utilized to create a prose summary of the patient's current status , and there are graphical capabilities which can plot specific parameters for a patient over time tll81. Eowever, it is in the analysis of stored clinical experience that the system has its greatest potential Utility [211. In addition to performing search and statistical functions such as those developed in databank systezz for clinical investigation [L5], !5?], 1??A!IS ofzers a prognostic analysis for a new patient when a management decision is to be made. Using the consultative services of the Stanford Immunology Division, an individual practitioner Eay select clinical indices for his patient that he would like oatched against other patients in the databank. Eased on 2 to 5 such descriptors, the conpuikr locates relevant prior patients and prepares 2 report outlining their prognosis with respect to a variety of endpoints (e.g., death, development of renal failure, arthritic status, pleurisy, etc.). Thetapy recorz!endations are 21~0 generated on the basis of- a response index that is calculated for the matched patients. A prose case analysis for the physician's patient can also be generated; this readable document summarizes the relevant data from the databank and explains the basis for the therapeut'c recommendation. The rheunatologic databank generated under M.BIS has now been expanded to involve a nation21 network of immunologists who are accunulating tine-oriented data on their patients. This national project seeks in part to accumulate a large enough databank so that grcups of retrieved patieots will be sizable and thus control fcr some observer variability and make the system's recommendations more statistically defensible. 3.3 Discussion of the Yethodologv The databank analysis systems descrl bed have pcwerful capabFlities to offer to the Fr.divldual clinical decision maker - Furthermore, medical computir,g researchers recognize the potential value of large databanks Fn suppcrting many of the other decision making approaches discussed in subsequent sections. There 153 Appendix B Sec. 3 Databank Analysis for Prognosis and Therapy Selection are important 2ddition21 issues regarding databank systems. ho\.-ever, which are discussed belo%:- (1) Data 2cquiSltiOn remains a major problem. >!any systems have avoided direct physician-computer interaction but have then been faced with the expense 2nd errors of transcription. The developers of one well accepted record system still express their desire to implement a direct interface with the physician for these reasons, although they recognize the difficulties encountered in encouraging hands-on use of 2 computer system by doctors [l@C]. (2) Analysis of data in the system can be conplicated by missing values that frequently occur, outlying values, and poor reproducibility of data across time and among physicians. (3) The decisicn aids provided tend to empbesize patient mznagemcnt rat5er than diagnosis. Feinstein's system [L7] is only useful for patients with lung c2ncer, for example, and the PRAXIS (TCD) prognostic routines, which are designed for patient manegement, assume that the patient's rheumatologic dfagnosis is already known. (4) There is co formal correlation between the way expert physicians approach patient management decisions and the way the programs arrive at recommendations. Feinstein and Koss felt that the acceptability of their system would be limited by a purely statistical approach, and they therefore chose to mimic human reasoning processes to a large extent 1531, but their approach appears to be an exception. (5) Data storage space requirements c2n be large since the decision aids of course require a comprehensive medical record system as a basSc component. Slamecka has distinguished between structured and empirical approaches to clinical consulting systems 1961, pointing out that databanks provide a largely empirical basis for advfce whereas structured approaches rely on judgmental knowledge elicited from the literature or the minds of experts. It is fmportant to note, however, that judgmental knowledge is itself based on empirical. information. Even the expert "intuitions" that many researchers have tried to capture are based on that expert practitioner's own observations and "data collectionn over years of experience. Thus one might argue that large, complete, and flexible databanks could form the basis for large amounts of judgment21 knowledge that we now have to elicit from other sources. Some researchers have indicated a desire to experiment with methods for the automatic generatfon of medical decision rules from databanks, and one component of the Set o 3 Databank -4nalysis for Prognosis and Therapy Selection researcn on SI amecka's YApIS system is apparently pointed in that direction [?6J. Indeed, some of the mst exciting and practical uses of large databanks may be found precisely at the interface with those knowledge engineering tasks that have most confounded researchers in medical symbolic reasoning [S] - 4 Mathematical Models af Physical Processes 4.1 Overview Pathophysiologic processes can be well-described by mathematical formulae in a limited number of clinical problem areas* Such donains have lent themselves well to the development of computer-based decision aids since the Lssues are generally well-defined. The actual techniques used by such program tend to reflect the details of the individual applications, the most celebrated of which have been in pharmacokinetics (specifically digitalis dosing), acid- base/electrolyte disorders, and respiratory care [63]. Cne or two cooperating experts in the field generally assist with the definition of pertinent variables and the mathematical characterization of the relationships anong them. Often an interactive program is then developed which requests the relevant data, makes the appropriate Computations, and provides a clinical analysis or recommendation for therapy based upon the computational results. Soaze of the programs have also involved branched-chain logic to guide decisions about what further data are needed for adequate analysis5. Program to assist with digitalis dosing have progressed to the inclusion of broader medical knowledge over the last ten years. The earliest work was Jellif fe's 1431 and was based upon his considerable experience studying the pbamacokinetics of the cardiac glycos ides - His computer program used mathematical formulations based on parameters such 2s therapeutic goals (e.g., desired predicted blood levels), body weight, renal function, and route of administration. In one study he showed that computer recoumendations reduced the frequency of adverse digitalis reactions from 35% to 12% 1441. Later, another group revised the Jelliffe model to permit a feedback loop in which the dfgitalis blood levels obtained with initial doses of the drug were considered --------------------------------------------- 5"Branched-chain" logic refers to mechanisms bv which portions decision network can be considered or ignored depending bpon the data on a of a cas2. For example, in an acid-base program Yen the anion gag night be calcu ated and a branch-point could then determine whether the pathway for analyzing an elevatad anion F 2p would be required. If the gap were not elevated, that whole portion of the ,ogic network could be sk:pped. Sec. 4 Yathenatical Models of Physical Processes Appendix 0 in subsequent therapy recommendations 1721, [e9J. More recently, a third group in 3oston, noting the insensitivity of the'first two apprcaches to the kinds of nonnuzeric observations that experts tend to use in modifying digitalis therapy, augmented the pharmacokinetic model with a patient-specific model of clinical status [3LJ. Running their system in a monftori.ng mode, in parallel with actual clinical practice on a cardiology service, they found that each patient in the trial in whom toxicity developed had received more digitalis than would have been recommended by their program. 4.2 Example Perhaps the best known program in this category is the interactive system developed at Boston's Beth Israel hospital by Bleich. Originally designed as a program for assessment of acid-base disorders [21, it was later expanded to consider electrolyte abnormalities 2s well I31, [4J. The knowledge in Bleich's program is a distillation of his own expertise regarding acid-base and electrolyte disorders. The system begins by collecting initial laboratory data from the physician seeking advice on 'a patient's ganagenent. Eranched-chain log%c is triggered by abnormalities in the initial data so that only the pertinent sections of the extensive decision pathways creeted by Bleich are explored. Essentially all questions asked by the progrzm are numerical laboratory values or "yes-no" questions (e.g., "Does the patient have pitting edema?"). Depending upon the complexity and severity of the case, the program eventually generates an evaluation note that nay vary in length from 2 few lices to several pages. Included are suggestions regarding possible causes of the observed abnormalities and suggestions for correcting them. Literature references are also provided. Although the program was made available at several East Coast institutions, few physicians accepted it 2s an ongoing clinic21 tool. Bleich points our t-hat part of the reason for this was the system's inherent educational impact; physicians simply began to anticipate its analysis after they had used it a few tines [3j. Yore recently he has been experimenting with the program operating 2s a monitoring system6, thereby avoiding direct interaction with the physician. The system's lack of sustained acceptance by physicians is probably due to more than tts educational iqact, however. For exemple, there is no feedback in the system; every patient is seen as a. new case and the program has no concept ------------------------------------------------- 6?ersonal coutuunication with Dr. Blefch, lP7.5. Sec. 4 ?!athenatical Models of Physical Processes of folloving 2 patient's response to prfor therapeutic measures. Furthermore, the program generates differentia, 1 diagnosis lists but does not pursue specific etiologies; this can be particularly bothersome when there are multiple coexistent disturbances in a patient and the program simply suggests parallel lists of etiologies without noting or pursuing the possible interrelationships. Finally, the system is highly individualized in that it contains consideration of specific relationships only when Bleich specifically thought to include them in the logic network. Cf course human consultants also give personalized advice which may differ from that obtained from other experts. P.owever, a group of researchers in Britain 1791 who analyzed Bleich's program along with four other acid-base/electrolyte systems, found total agreement among the programs in only 20X of test cases when these systems were asked to define the acid-base disturbance and the degree of compensation present. Their analysis does not reveal which of the programs reached the correct decision, however, and it may be that the results are more an indictment of the other four programs than a valid criticism of the advtce from BleiCh'S acid-base component. 4.3 D%scussion of the ?Zethodoloeies -- The programs mentioned in this section are very differene in several respects, and each tends to overlap with other methodologieg we have discussed. Eleich's program, for example, is essentially a complicated clinical algorithm interfaced with mathematical formulations of electrolyte and acid-base pathophysiology. As such it suffers from the weaknesses of all algorithmic approaches, most importantly its highly structured and inflexible logic which is unable to contend with unforseen circumstances not specifically includad in the algorithm. The digitalis dosing programs all draw on cathemtical techniques from the field of biomedical modeling (not discussed here), but have recently shocn more reliance on methods from other areas as well. In particular these have included symbolic reasoning methods that allow clinical expertise to he captured and utilized in conjunction with mathematical techniques [311. The Boston group that developed this most recent digitalis program is interested in similarly developing an acid-base/electrolyte system so that judgmental knowledge of experts can be interfaced with the mathematical models of pathophysiology7. -----------------------------------------~---- 'Personal communication, lQ78, xrith ?rof. Peter Szolovfts. Sec. 5 Statistical Pattern Yatching Techniques Appendix B 5 Statistical Pattern Yatching Techniques 5.1 Overview Pattern matching techniques define the mathematical relationship between measurable features and classifications of objects 1121, (461. In medicine, the presence or absence of each of several signs and symptoms in a patient may be definitive for the classification of the patient as "abnormal" or into the category of 2 specific disease. They are also used for prognosis [II, or predicting disease duration, time course, and outcomes. The'se techniques have been applied to a variety of medical domains, such 2s image processing and signal analysis, in addition to computer-assisted diagnosis. In order to find the diagnostic pattern, or discriminant function, the method requires a training set of objects, for which the correct classification is already knob-n, as well as reliable values for their measured features. If the form and parameters are not known for the statistical distributions underlying the features, then they must be,estimated. Parametric techniques focus on learning the parameters of the probability density functions, while non-parametric (or "distribution-free") techniques make no assumptions about the form -of the distributions. After training, then, the pattern can be matched to new, unclassified objects to aid in deciding the category to which the new object belongsg. There are numerous variations on this.general methodology, most notably in the mathema:ical techniques used to extract characteristic measurements (the features) and to find and refine the pattern classifier during training. Por example, linear regression analysis is a commonly used technique for finding the coefficients of an equation that defines a recurring pattern or category of diagnostic or prognostic interest. Recent work emphasizes structural relationships among sets 0, 6 features more than statistical ones. Three of the best known training criteria for the discriminant function are: (a) Eayes' criterion: choose the with incorrect diagnosesg; function that has tbe mfnimum cost associated (b) clustering criterion: choose the function that produces the tightest clusters; Cc) least-sauared-error criterion: choose the function that minimizes the squared differences betueen predicted and observed measurement values. I----------------------------------------------- 81t is uossible to detect patterns, even .without a known classification for objects in the training set, with so-called "unsupervised" learning techniques. Also, it is possible to work with both numerical and non-numerical measurements. gSee Section 5 for further discussion. 158 Sec. 5 Statistical Pattern Yatching Techniques Ten cotrmonly used mathemetical roodels based on these criteria have been shoc;r? to produce renarkzbly similar diagnostic results for the same data [7J. 5 .? Example There are numerous papers reporting on the use of pattern recognition methods in medicine. Armitage [ll discusses three examples of prognostfc studies, with an errphasis on regression methods. Siegel et al. [271 discuss uses of cluster analysis. One recent diagnostic application using Bayes' criterion 1671 classifies patients having chest pains into three categories: Dl: acute myocardial infarction MI); D2: coronary insufficiency: and D3: non-cardiec causes of chest pain. The need for early diagnosis of heart attacks without laboratory tests is a prevalent problem, yet physicians .ere known to misclassify about one third of the patients in categories D1 and D2 and about S@Z of those in D3. In order to determine the correct classification, each patLent in the training set was classified after 3 days, based on laboratory data including electrocardiogran (ECG) and blood data (cardiac enzymes). There remained some uncertainty about several patients with "probable HI." Seventeen variables were selected from many: 9 features with continuous values (including age, heart rates, .white blood count, and hemoglobin) and g features width discrete values (sex and 7 ECG features). The training data were measurements on 247 patients. The decision rule was chosen using Bayes' theorem to compute the posterior probabilities of each cZagnostic class given the feature vector X. (X = [x 1, x 2, . . . , x 17}.'C. Then a decision rule was chosen to minimize the probability of error, that is, to adjust the coefficients on the feature vector X LI such that for the correct class Di: P(TQ~X)=~X (P(D1iX1, P(D2IXL P (D31X)) The class conditional probability density functions must be estimated initially, and the performance of the decision rule depends on the accuracy of the assumed model. Using the sarre 247 patients for testing the approach, the trained ------------------------------------------- lOThe posterior probabi is the probability lity of a diagnostic class, represented as P(D; IX), feature vector X h that a patient falls in diagnostic category Di given thaf the as been observed. llSee [56J for their medical irrport. a study Fn which the coeff icfents are reported because of 159 Sec. 5 Statistical Pattern Matching Techniques Appendix B classifier averaged gOI correct diegnoses over the three classes, using only data available at the tine of admission. Physicians, using more data than the coaputer, averaged only 50.5X correct over these three categories for the same patients. Training the classifier with a subset of the patients, and using the remainder for testing, produced nearly 2s good results. 5.3 Discussion of the Hethodoloev -- The number of reported medic21 applications of pattern recognition techniques is large, but there are also numerous problems associated with the methodology. The most obvious difficulties are choosing the set of features i.n the first place, collecting reliable measurements on a large sample, and verifvina the iKlitia1 Cl2SSifiCations among the training data. Current techniques are inadequate for probler=s in which trends or movement of features are important characteristics of the categories. Also the problems for which exfsting technfques are accurate are those that are well characterized by a ~1~211 number of features ("dimensions of the space"). - As with all technfques based on statistics, the a of the sample used to -- define the categories is an important consideration. AS the'number of important features 2nd the number of relevant categories increase, the required size of the trzining set also increases. In one test [7], pattern classifiers trained to discriminate among 20 disease'categories from 50 symptoms were correct 512 - 6&Z of the time. The same methods were used to train classifiers to discriminate between 2 of the diseases, from the same 50 symptons, and produced correct diagnoses 922 - ?gx of the time. The context in which 2 local pattern is identified raises problems related to the issue of utilizing medical knowledge. It is difficult to find and use classifiers that are best for 2 small decfsion, such as whether an are2 of an X- ray is inside cr outside the heart, and integrate those into a global. classifier, such as one for abnormal heart volume. Accurate application of 2 classifier in 2 hospital setting also requires that the measurements in that clinical environment are consistent with the measurements used to train the classifier initially. For example, if diseases 2nd symptoms are defined differently in the new setting, or if lab test values are reported in different ranges - or different lab tests used -- then decisions based on the classification are not reliable. "attern recognition techniques are often misapplied in medical. domains in Sec. 5 Statistical Pattern Matching Techniques which the assumptions are violated. Some of the difficulties noted above are avoided in systems that integrate structural knowledge jnto the numerical methods and in systems that integrate human and machine capabilities into single, interactive systems. These modificatfons will overcome one of the major difficulties seen in completely automated systems, that of provfding the system with good "intuitions" based on an expert's 2 priori knowledge and experience [461. 6 Bavesian Statistical Approaches 6.1 Overview Xore work has been done on Bayesiaa approaches to computer-based medical decision making than on any of the other methodologies we have discussed. The appeal of Eayes' Theorem T2 is clear: it potentially offers an exact method for computing the probability of a disease based on observations and data regarding the frequency with which these observations are known to occur for specified diseases. Tn several domains the technique has been shown to be exceedingly accurate, but there are also several limitations to the approach which we discuss below. In its sinplest formulation, Bayes' Theorem can be seen as a mechanism to calculate the probability of a disease, in light of specified evidence, from the a uriori probability of the disease and the conditional -- probabilities relating the observations to the diseases in which they may occur. For example, suppcse disease Di is one of 2 mutually exclusive diagnoses under consideration and E is the evidence or observations supporting that diagnosis. Then if P(Di) is the 2 priori probability of the &th disease: P(D$E) P(Di) P(E/Di) 2 PUIj) P(EiDj) j.1 The theorem can also be represented or derived in a variety of other forms, including an odds/likelihood ratio fo~ulation. We cannot include such details here, but any iniroductory Statistics book or Lusted's classic volume [Sal presents the subject in considerable detail.. -------------------------------------------- 1' Lalso often referred to as Bayes' rule, discriminant, or criterion 161 Sec. 5 Dayesran Statistic21 Approaches Appendix B r?co n g the most c OQEonly recognized problems with the utilization of a Eayesian 2pproPch is the large amount of date required to determine all the conditional probabilities needed in the rigorous appl!.cation of the formula. Chart review or computer-based analysis of large dat2banks occasionally allows most of the necessary conditional prcbabilities to be obtained. A variety of additional assumptions must be made. For example: (1) the diseases under consideration are assumed mutually exclusive and exhaustive (i.e., the patient is assumed to heve one of the 1 diseases, (2) the clinical observations are assumed to be.conditionally independent over a given diseasel3, and (3) the incidence of the symptoms of 2 disease is assumed to be stationary (i.e., the model generally does not allow for changes in disease patterns over time). One of the earliest Bayesian programs was Warner's system for the diagnosis of congenital heart disease [!07!. Ee compiled dats on 83 patients and generated a symptom-disease matrix consisting of 53 symptoms (attributes) and 35 disease entities. The diagnostic performance of the computer, based on the presence or absence of the 53 symptoms in a new patient, was then compared to that of two experienced physicians. The progran was shown to "reach diepncses tith 217 accuracy equal to that of the experts. Furthermore, system performance t~2.s shown to improve as the statistics in the symptom-disetse matrix stabilized with the addition of increasing numbers of patients. in 1068 Gorry and Barnett pointed out that Warner's program had required making 211 53 observations for every patient to be diagnosed, a situation which would not be realistic for many clinic21 applications. They therefore utilized a modification of Sayes' Theorem in which observations are considered sequentially. Their computer program analyzed observations oue at a time, suggested which test would be most useful if performed next, and included termination criteria so that a diagnosFs could be reached, when appropriate, without needing to make all the observations [281. Decfsions regarding tests 2nd termination were made on the basis of calculations of expected costs 2nd benefits at each step :',n the logical process14. Using the sane symptom-disease matrix developed by Warner, they were able to sttain equfvalent diagnostic 13The purest form of Eaves the order in which evidence* is obtained, Theorem atto;; conditional dependencies, and analysis. Eowever , the number of explicitly considered 1-n the is so unwieldv that conditional required conditional Fndependence of probabilities observations, and the order of observations, is generally assumed [101]. non-dependence on l&See the decision theory discussion 53 Section 7. 162 Sec. 6 Bayesian Statistical Ppproaches performance using only 6.9 tests on aver2ge15. They pointed out that, because the costs of medical tests may be significant (in tems of pztient discomfort, tfre expended, and financial expense), the use of inefficient testi,r?g sequences should be regarded as ineffective diagnosis. Warner has also more recently included Gorry and Barnett's sequential diagnosis approach in an application regarding structured patient history-taking [IlO!. The medical computing literature now includes many exzmples of Bayesian diagnosis programs;most of which have used the nonsequentfal approach, in addition to the necessary assumptions of symptom independence and mutual exclusivity of disease as discussed 2bove. One particul2rly successful research effort has been chosen for discussion. 6.2 Example Since the late 1960's deDomba1 and associates, at the University of Leeds in England, have been studying the diagnostic process and developing computcr- based decision aids using Bayesian probability theory. Their area Of investigation has been gastrointestinal diseases, o;piCI>T `S developers felt that barriers to acceptance were largely conceptual 2nd could be counteracted in large part if a system were perceived as a clinic21 tool rather than a dogmatic replacement for the prin2ry physician's own reasoning. Kccwl ed ge of infectious disezses is representedin ZCI?? 2s production rules, each containing a "packet" of knowledge obtained from colleboratjng espsrts [95]21. A production rule is simply a conditional statement which relates observations to associated inferences that may be drawn. For example, a ?!YCIV rule might state that "if a bacterium iS a gram positive coccus growing in - chains, then it is apt to be 2 streptococcus." M'CIN's power is derived from such rules in a variety of ways: (1) (21 (3) (4) the it is the pro ram that determines which rules to use and how they chained toget.er to make decisions about a specific case2L; F should be the rules can be stored in a machine-readable format hut translated into English for display to physicians; by removing, altering, or adding rules, the system's knowledget~~ructures can be rapidly modified without explicitly restructuring , entire knowledge base; 2nd the rules themselves can of teo form a coherent explanation reasoning if the relevant ones are translated into English and of system response to a user's question. displayed in Associated with all rules and inferences are numerical weights reflecelng degree of certainty associated with then. These numbers, termed certainty factors, form the basis for the system's inexact reasoning in this complex task ---------------------------------------------- ZIProduct+cn rules methodologv freouently emploved in [Oj and effectively appli%etz other scientific-problem domains [iji. AI research 22Tbe control structure utilized is :erned "goal-oriented" and is similar to the ccnsequest-thecren methodolcgy used is Hewitt's ?L'LiKXEB [371. 174 Sec. 8 Symbolic Reasoning Approaches dcnain [?6]. They 2110~ the judgmental knowledge of experts to be captured in ruI2 form and then utilized in a consistent fashion. lhe YYCIX System has been evaluated . regarding its perfornacce at therapy selection for patients with either septicemia [123] or nenlngitis (1121. The prcgran performs comparably with experts in these tvo task domains, but as yet it has no rules regarding the other infectious disease problem areas. Further knowledge base development will therefore be required before YYCIX is made available for clinical use; hence questions regarding its acceptability to physicians cannot yet be assessed. However, the required implementation stages have been delineated [%I, attention has been paid to all the design criteria mentioned above, and the program does have a powerful explanation capability [eel. 8.3 Discussion of the E"ethodolocv -- Sydolic reasoning techniques differ from the other methodologies mentioned in this article in that the computer techniqt?es thcmselves are 2s yet experimental 2nd rapidly changing. Vhereas the comput&ions involved in Bayes' '="r.eorea, for example, involve straightforward applicstion of computing techniques already well-developed, basic researchers in computer science cant inue to develop new methodologies for knowledge representation, language understanding, heuristic search, and the other symbolic reasoning problems we have mentioned. Thus the AI programs tend to be developed in highly experimental environments where short term practical results are often unlikely to be found. 'The programs typically require large amounts of space and tend to be slow, particuhrly in time-sharing environments. As has been true for most of the methodologies discussed, AI researchers have still not developed adequate methods for handling concurrent diseases, assessing the time course of disease, nor acquiring edequate structured knowledge from experts. Furthermore, inexact reasoning techniques tend to be developed and justified largely on intuitive grounds. Despite these sfgnificant limitations, the techniques of artificial intelligence & provide a way to respond to many of Gerry's observations regarding the inadequacies of prior methodologies as described above [301. There are now several programs responsive to his criticisms. Szolovits and Pauker have recently reviewed some applications of AI to medicine and h2ve atter?.pted to weigh the successes of this young field agacnst the very real 175 Sec. 8 Symbolic P.easoninp Approaches Appendix B problem thee lie ahead (lQl1. They identify several serious deficiencfes of current systems- For example. termination crrterfa are still poorly understood. Although INTEEIFIST can diagnose sinultaneous diseases, it also pursues all ebbnormal f indir.gs to completfoc, even though a clinician often ignores minor unexplained abnormali ties if the rest of a patient's clinical status is well understood. In addition, although some of these programs now cleverly mimic some of the reasoning styles observed in ez.z.erts [143,[48], it is less clear how to keep the systems from abandoning one hypothesis and turning to another one as 900~ as new information suggests another possibility. Programs that operate this way appear to digress from one topic to another - a characteristic that decidedly alienates a user regardless of the valic!itp of the final diagnosis or advice. 9 Conclusicns This review has shown that there sre two recurring issues to confront in considering the field of computer-based clinical decision caking: (I) Bow can we design systems that reach better, more reliable decisions in 2 broad range of applications, and (2) Bow can we more effectively encourage the use of suth systems by physicians or other intended users? We shall summarize by reviewing these points separately. Performance Issues Central to assuring a program's adequate performsace is a matching of the nos t appropriate technique with the problem domain. I!e have seen that the structured logic of clinical algorithms can be effectively applied to triage functions and other primary care problems, but they would be less naturally matched with complex tasks such as the diagnosis and management of acute renal failure. Good statistical data may support an effective Bayesien program in setrings where diagnostfc categories are smll in number, non-overlapping, and well-defined, but the lack of higher level docain knowledge limits the effectiveness of the Rayesian approach in more complex patient management or diagnostic environments. A mathematical approach may support decision making in certain veil-described fields in wb%Ch observations are typically quantified, and related by functional. expressions. These examples, and others, demonstrate the the need for thoughtful. consideration of the technique most appropriate for managing a clinical problem. In general the sirplest effective methodology Js Sec. 9 Conclusions to be preferred, but acceptability issues must also be considered as discussed belcv. It is also always appropriate to ask whether computer-based approaches are needed at all for a given decision making task. The clinical algorithm developers, for example, have almost uniformly discarded the machine, and Schwartz et al. pointed out that a useful decision analysis can often be accomplished la a qualitative manner using paper and pencil [87]. Finally, it is important to consider the extent to which a program's "understanding" of its task domain will heighten its performance, particularly in settings where knowledge of the field tends to be highly judgmental and poorly quantified. We use the term "understanding" here to refer to the degree of judgmental or structural knowledge (as opposed to data) that is contained in the program. Analyses of human clinical decfsion making [141, [4g? suggest that as decisions move from simple to complex, a physician's reasoning style becomes less algorithmic and more heuristic, with qualitative judgmental knowledge and the conditions for invoking it coming increasingly into play. It is likely that medical computing researchers will similarly have to become "!;nowledge engineers" ill the sense that they will look for effective ways to natcb the knowledge structures that they use to the cOmplexity of the tasks they are undertaking. Accentability ISSueS . A recurring observation as one reviews the literature of computer-based medical decision making is that essentially none of the systems has been effectively utilized outside of a research environment, even when -- i_cs performance has been shown to be excellent! --P-b This suggests that it may be an error to concentrate our research effort primarily on improving the decision making performance of computers when there is evidently much more required before these systems will have clinical impact. It iS tempting to conclude that the biases of medica!. personnel against computers are so strong that systems will inevitably be rejected, regardless of performance, and in fact there are some data to support this view (991. However, we are beginning to see examples of applications in which initizl resistance to automated techniques has gradually been overcome through the incorporation of adequate syster! benefits [113]. Perhaps one of the most revealing lessons on this subject is an observation Sec. 9 Conclusions Appendix t) regarding the system of ?fesel et al. that tre described earlier (641. Despite documented physician resistance to clinical algorithms in other settings [?&I, the physicians in Yesel's study eccepted the guidance of protocols for the management of chemotherapy in their cancer patients- It is likely that the key to acceptance in this instance is the fact that these physicians had previously had no choice but to refer their patients with cancer to the tertiary care center in Birmingham where all complex chemotherapy w2s administered. The introduction of the protocols permitted these physicians to undertake tasks that thev had previouslv been unable to do, and it simultaneously allowed maintenance -- ---- of close doctor- patient relationships and helped the patients avoid frequent long trips to the center. The motiv2tion for the physician to use the system is clear in this case- It is reminiscent of Rosati's assertion that physicians will fir st welcome ccmputer decision aids when they become aware that colleagues who are using the machine have 2 clear advantage in their practice [811. A heightened awareness of "human engineering" issues among medical computing researchers is also apt to help improve acceptance of computers by physicians. Fox has recently reviewed :tis field in..detail [ ie]. The issues rtnge from the mechanics of interaction at a computer terminal to program charscteristics designed to make the system appeer as a tool for the physician rather then a dogmatic advice-giving machine. Adequate attention must also be given to the severe time constraints perceived by physicians. Ideally they would like programs to take no more t'ne 4. than they currently spend when accomplishing the same task on their own. Time and schedule pressures are similarly likely to explain the greater resistance to automation among interns and residents than among medical students or przcticing physicians in Startsman's study [99]. Finally it must be noted that acceptability issues should generally be considered from the outset in 2 system's design because they nay dictate the choice of methodology as much 2s the tzsk domain itself does. The role of formal knoeledge structures to facilitate expl2nation capabilities, for example, fmy argue in favor of using symbolic re2soning techniques even when a somewhat less complex methodology might have been adequate for the decision task. In summary, the trend towards increased use of knowledge engineering tech.niques for clinical decision programs has been in response to desires for both improved performance and improved acceptance of such systems. As greater Sec. 0 Conclusions evner:cnce 1s gained v!ith these techniques and they become better known throughout the medical coo~uting ccmmunity, it is likely thet we will see increasingly powerful unions between symbolic reasoning and the alternate nethodolcgies we have dFscusaed. Cne lesson to be drawn lies in the recognition that there is basic computer science research to be done in medical computing, and that the field is more than the application of established computing techniques in medical domains. Ackncwledsments Fe wish to thank R. Blum, L. Fagan, J. King, J. Yunz, 6. Sax, and G. Wiederhold. for their thoughtful advice in reviewing earlier drafts of this paper. 179 Sec. References Appendix El References 1. 2. 3. 4. 5. 6. 7. e. 9. 10. 11. 1' -. 13. 14. 15. Armitage, P. and Gehan, E.A. "Statistical methods for the identification and use of prognostic factors." lnt. J. Cancer, 13, pp. 16-36, (1974). -- evaluation of acid-base 689-1696 (1369). disorders." J. Bleich, H.L. 147 (1071). "The computer as a consultant." K. Eng. J. Fed. 284, pp. 141- base disor!&;." Aner. J Blefch, "Computer-based consultation: electrolyte and acfd- . Fed 53, pp. 285-291 (1972). -A- Blur?, F.L. and Wiederhold, G. " Inferrin f * knowledge fr;m clinical data ~a~~:,~t~l,~infot~~~ni~~e~,~ToZ~~tif~E~: Intelligence, Proc. 2nd .J.nn. ;1,". k3=3& Q . -tip' 1. , Washington D.C., ?:ovemberlm Buchanan, B.G. and Feigenkaum, E.A. "Dendral and Fl?;$i)applications dimension. yeta-Dendral: Artificial Intelligence 11, pp. 5-24 . Croft D.J. "Xs computerized diagnosis possible?" Comn. pp. 331-367 (1972). Biomed. * 5, Cumberpatch, J. and Heaps, H.S. "A disease-conscious method for sequential diamosis by use of disease probabilities without assumption of symptom independence." Int. J. Biomed. Comout. 7, pp. 61-78 (197G). -- Davis, R. and King, J. "An overview of production systems." In ?"achine Representation of Knowledge (E.W. Elcock and D. yichie, eds.), Wuey, LYib. - Hew York: deDomba1, F.T., Leaper, D-J., Staniland, J-R., et al. aided diagnosis of acute abdominal pain." Brit. Med. J. 2, pp.-- "Corn uter- --- Q 13 ,872). deDomba1, F.T., Lea et, D.J., Horroc!TER: A L Anal,.,s i s (Ustna Schemata) mot. rf7.D. ansueze tar Provino Urssertation, Lheorems ant LYaninulating " 1 cepartment rnstitute of Technology, Cambridge, ?!ass., i5f Xathenatics, X&,"z&Eus%!t$ 1072. Eorrocks, J.C., aided diagnosis: McCann, A.P., Staniland, J.R., et al. descri tion of an operational experience wit rl adaptable "Computer- 2,C34 cases." Brit. Yed. J. system, and --- 2, pp. 5-9 (1072). Eorrocks, J.C., and deDomba1, of dyspepsia." Amer. J. Diges. Dis. 2C,;4T14C6 (',,ymputer-aided diagnosis -- Eoward, R. A. (ed.). "SueciaJ Issue IEEE Transactions on Svstems, Science and CvbernzFics, Dec'sion vo 1 ?*?o???? o ? ???? ssc-4 (3)) Sept., "Decision in medicine', (1975). (editorial). L Enp..J, &;quez, J.A. _ .: Charles C. Connuter Dia nos+s & Diagnostic b!ethods, aomes, + Springfield, Jelliffe R . W . Buell, J. Ealaba R. et al. for digitalis d;sage regimen;." >!ath.'Biosri. 9, pp. 179-163 (P970) o "A corn uter program Jelliffe, R.W., Buell, J., and Kalaba, R. "Reduction of digitalis toxicity bv computer-assisted glycoside dosage regimens.,' Anns. 891-906 (1972). Int. Ked. -- 77, PP- Johnson, D.C. and Sarnett, G.O. "XEDIDF@ - a Coma. Pros. +n Biomed. 7, pp. l?l-2Cl (1977). medical information system.', --L Eanal, L.N. "Patterns in Pattern Reco nition: 1?68-1974," Information Theorv, vol. X-20, no. 5 (9974). IEEE Trans. on --- Yarpinskf, and E.H.S. and Eleich, R.L. "MISAR: a miniature information storage retrieval system." Comu. Eioced. Res. 4, pp. 655-660 (1971). Kassirer, J.?. and Gorry , C-.A. "Clinical behavioral analysis." Arms. Int. ?!ed. 89, pp. 245-255 problem (1078). solvfng: a --- 182 Sec. Peferences Appendix B 40. fG. 51. 52. 53. 54. 55. 56. 57. 58. 50. 60. 61. 62. 63. 56. Kteinnuntz, B. and &Lean, digita.?conputer." Behav. Sci. 13, pp. "Dia nostic P'sis-eo (19%8,. interviewing by Knapp, R-G., Levi, S., Lurie, D., and Vestphal, H. r( A conputer-generated diagnostic decision guide: a conparison of statistical diagnosis and clinical diagnosis." Conput. Biol. Med. 7, pp. 222-2?@ (1077). -- Komoroff, A-L., Black, V.L., Flatley, M., et al. "Protocols for physician of diabetes and hypertension." 2 Eng. 3. Med. -- Korein,,, record, (1971). J-9 Lyman, M., and Tick, J.L. " The ;zmp$erized medical Bulletin Kew York Academy of Medicine, -- . , pp. 824-826 Koss, N. and Feinstein, A.R. "Computer-aided 2 prognostic algorithm." Arch. Intern. Ved. 127 rognosis: II. development of , pp. 448-459 (1971). kea+pes, P.J., Borrccks, J.C., * .-. Couputer-assisted diagnosis Staziland, J.F., and abdoninal deDoEba1, 'estinetes' provided by clinicians." Brit. !!ed. J. 4, pp. --e 350-E (L97p Ledley, R.S. and Lusted, L.B. medical diagnosis." Science 130,?-21 (1959). "Reasoning foundations of Levi, S., Frant, J.R., Westphal,t Y.C., and Lurie, D. decision guide - optimal discriminations "Development of a statistical analysis.' Yeth. Inform. Xed. for meningitis determined by 15 (21, 87-90 (1976). Lipkin, H. and Rard' J.D. 1' Wechanical correlation of data in differential dia nosis of hemato ogic diseases." J. Amer. Med. Assoc. 166, pp. 113-125 (1958). --- Lusted, L.B. Introduction To Medical Decision Yakinq. Charles C. Thomas, lYb8. Springfield, Ill.: Yabry, J.C., Thompson, F.K., data management and Ropwood, Y.D., and Baker, W.R. "A ps;;;o:VP; analvsis system (CLINFO): svsten description : experience." In YEDIKFC 77, Amsterdam: North-iiolland Publishing Co., 1977, pp. 71-75. XcDonald, C., Bhargava, B., and Jeris, D. "A clinical (CIS ) for ambulatory care," Proc. information svsten of the 1975 NCC vol. h4 (1975) pp. 749-756 B---p AFIPS Press, ?!cMeil, B.J., Keeler, E., and Adelstein, S. J. "Primer of medical decision making." N. Enc. on certain elen?ents J. Yed. 293, pp. 211-225 (1975). -- McNeil, B-J. and Adelstein, S.J. "Ceterminin the value of diagnostic and screening tests." J. Yucl. Ked. 17, pp. --- 439-4&8 (1977). ??enn, S-J., Barnett, G.O., Schmechel, D., et al. "A computer progra;25o assist in the care of acute respiratory failure." J. Amer. Med. Assoc. pp. 30%312 (1973). --- -,, Yesel, E., for cancer Wirtschafter, D.D., Carpenter, J-f., et al. chemotherapy Clinical algorithms and oncology centers. - systems for coumunity-based consultant-extenders Veth. Inforn. Med. 15, pp. leg-173 (1976). 183 Sec. References Appendix 8 65. 66. 67. 68. 69. 70. 71. -,l IL. 73. 74. 7s. 76. 77. 78. 79. PC. ??Tordyke, I?..?., Kulikowski, C.X., and Kulikowski, C.W. ;f,z;ghods fir the automated d:',agnosis of thyroid dysfunction. 1 .I . Res. I F?* 374-389 (1971). ?!orusI,s, M.J. and Jacquet, J.P. mathenatical "Diagnosis. I. 172 (1075). models for diagncsis. Symptom nonindependence in Fes. P, pp. 156- Corn- Biomed. Patrick, E.A. "Pattern Recognition in and Cvbernetics Pevietj, 6, p. 6 (1977). Medicine," Systems, & Pauker, S.G. and Kassirer, J.P. 'Therapeutic decision making: a cost- benefit analysis." N. Enc. J. Med. 293, pp. 229-234 (1975). -- Pauker, S.G., Gerry, G-A-, Kassirer, J.P., and Schwartz, W.B. "Towards the simulation of Prer. clinical co nition: taking J. Wed. 6@:981-996 (8976). a present illness by computer." --- Pauker, S.G. "Coronary the use analysis." +.nns. Int. Ved. 85, arter;-l6syffejy; of decision -- pp. Pauker, S-P. and Pauker, S.G. 'Prenatal diagnosis: oenetic counseling using decision analysis." Yale J. a directive aeoroach to ?1?77). Biol. liled. -- -- >D,275-2S9 Peck, C.C., Sheiner, L.P., Martin, C.Y. theraFy." 8. Eng. J. "ed. 229, pp. - -- "C@mputer-assisted digoxin PI berger, P E.V. "Clinical application of a second generation e 5?7- 608 (1975% ectrocardio raphy computer program." Amer. J. Electrocardiolopv -- 35' pp. Pople, H.E.,'yyers, J.D. and Filler, R.A. logic for internal medicine.' Proc. 4th "DIAL@!: A model of diagnostic Int. Joint. Conf. on Artif. Intell., MIT, Cambridge, Mass., 1T - -- ----- PO le, H. SO Ving: P "The formation of comoosite an exercise in synthetic +easonio hv,o;h,;zes in diagcostic prcblem of 5th Intl Joint Conf on Prtlf. -- Intelligence, Cambridge, Mass, i%j7,- lu?O=rU3T;- -- Prutting, J. "Lack of correlation between zntcmortem postmortem diagnosis." N.Y. J. Med. 67, pp. 2081-2@84 (1967). and Pm- Raiffa, F. Decision Analvsis: Introductory Lectures on Choices Under Uncertainty- P .eading, ?lass.: Adalson Wesley, l>bti. - Richards, P. of patients and Goh, A.E.S. assistance in the treatment with acid-base and "Computer &sterdnn: electrolyte disturbances." MEDIXFO 77, Forth-Polland Publishing Company, 1077, pp. 407-410. Rodnick, J., and Wiederhold, G. "Review of automated ambulatorv medical record sys tens : charting servic;s that are of essential benefit physician," UCDINFO 77, Amsterdam: Xorth-Rolland Publishing Co., to the 5157-961. 1077, ?? * 184 Sec. FL. P", . e3. 84. 85. 87. P8. SO. 91. 92. 93 e al. OS. 06. References Appendix B ?osati,,l R-A., and Stead, E.A. the future. Uallace, F.G., Arch. Irtern. Med. 131, ?F* 28S-227 (1973). "The way of Rosati, R.C., ?!creer, J.F., Starrer, C-F., et al. "A. new information systen for medical practice." Arch. intern. F'ed. 135, pp. lOI;- 1024 (1975). Rosenblatt, F.B., Ten?, P.K., and Kerue, S. "Dia nostic as determined by pest-mortem examination." Proe. E accuracy in cancer (1073). lin. Cancer 5, pp. 71-EC --- Rubin, A-D. and Risley, J.F. "The PROPEET system: in providing a computer resource to sc:!entists." experiment North-Holland Publishing Co., 1377, pp. 77-81. HEDINFO 2, Pmsterdarr: Safran, C., Tsichlis, P.N., Blurnine, A.Z., and Desforges, J-F. "Diagnostic F lenning ' .odgkins' d?~~~fe." co~~~:~~-~as'~~'d2`26-243` (1977). decision making for patients with Ecboolmsrn, H. and Fernstein, L. "Commuter use in diagnosis, prognosis, 2nd therapy." Science 200, pp. 926-031 (1978). Schwartz, W.B., Gorry , G.A., Kassif;er, J.P., and '$a~l;;ion analysis and clinical judpent. her. J. Es~ff~~,/,$~ yed 55, pp. -- - _ . Scott, >..c. Clancev "Exolaration laoabilitie; of w., Davis, R., and knowle+oe-besed lS79. Shortliffe,,l E.H. J. Cotzputational Lineuistics, >!ccroficKe 62, reduction system. Pmer. Sheiner, L.R., Halkin, E., Peck, C., et digoxin therapy." Arms. Int. Med. 82, pp. "IF. roved computer-assisted --- 6%627 (f?:5). Sherman, K., Xeiffen, E., and Komoroff, A.L. "Ambulatory care systems." In Problee-Directed and yedical Information Svstemso$:.F. or .: Intercontznental lVec!lcal i%ooK Lorporatlon, f. Driegs, ed.), New 9 PP. 143-li!. Sbimur2, X. and sune' .I' pp. 125-h. Shortliffe, . . "Learnin Proc. P Int,. procedures in pattern classifiers - introducrioa Joint Conf. on Pattern Recognition, Kyoto, 1978, --- E.H:, Axline, S.G., Buchanan, B-G., and- Cohen, S.N. "Design . . conslaer2tlons zor therapeutics." Proc. a &"~::" to provide consultations in clir.?cal Cieeo Eionedical Diego, Calif., F?%?Xruarym4- - Cv~~osiu~, 311-310, Sen Chortlifqe E.E. and Davis R. Gf knowie?ge-based exper; "Some consideretfons for the iaplemental:j.on. systems." STGdRT December 1975. XeWsletter, No. 55, 9-12, Shortliffe, E-E., and Buchanan, E.G. "A node1 of inexact reasoning in medicine." >?ath. Biosci. 23, pp. 251-379 (1975). Shortliffe, E.P. Copauter-??ased ElsevierfSorth Polland, icfib. ?!edical Consultations: ?!YCTy, New York: sec. References Appendix 8 97. sox, H.C:, sox, C.P., and Tompkins, R.K. "The training of gh;si$;;ns 2ssistants: the use of a clinical algorithm system." ?!. Eng. J. L_ a , pp. f?lF-824 (1973). e 88. Sridharan, V.S. Guest editorial. Artificial Ictellizence 4 (1078). 11, pp. l- O?. Startsmen, T-S., and Robinson, R.F. "The attitudes medical and aramedical personnel towards 2lg-527 (1972). computers." Coma. Biomed. &. -5, PP* lo@. Stead, W.V., Brane, R-G., Hammond U.E., et al. "A computerized obstetric medical record." Obstet. & Gyn. 4!, pp. SC2-5C9 (1977). 101. Szolovits, P. and Pauker, S.G. medical diagnosis .I' "Categorical and probabilistic reasoning in Artifici21 Intelligence 11, pp. 115-144 (197&J. 102. Taylor, T.R. 216-224 (1076). "Clinical decision analysis." F?eth. Inform. Med.. IS, pp. 103. Vickery, D.?!. "Corn uter support of paramedical personnel: au2lity control. " MPDIZFO 2, Amsterdan: Forth-Holland Publ!~~i~~ck$~n~f f974, pp. 221-287. 1@4. Waener, G., Tautu, P., and TYolber, U. "Problems of 2 bibliography." Yeth. Info. Yed. 17, pp. 55-74 (J97F). medical diagnosis: --- 105. Walsh, B.T., Bookhein, W.N., Johnson, R.C., et al. "Reco nition of streptococcal pharyngitis in adults." Arch. Int. Yet!. 13S, pp* 14%3Il4?7 (1975). --- lC6. Wardle, A. 2nd V2rdle, L. "Corn uter-aided of research." Meth. Info. Med. 17, pp- T S-2P (1978). diagnosis: 2 review -- lC7. WarneT, H.R., Toronto, A.F., and Veasy, L.G. Bayes Theorem for comnuter diagnosis of Arms. ET-Y. 4cad. Sci. 115, pp. 558-567 (1?64). congenital "Fxperiencc with heart disease." 108. Warner, E.F. "Experiences with computer-based patient & An2lgesia Current Researchers 47, pp. 453-461 (196g)- monitoring." Aries. 1LQ. \Jarnef, H.R., Rutherfz;:, B.D., and,,Eo;;~~ens, E. "A sequential a `(y07$;s tory taking dragnosis. preach . Eioaed. Res. 5, pp. ET 36-262 ? ? o 111. Warner, H.P., For an, !? J-D., Pr or, T-A., et al. %* "HELP - a self-improving system for medic2 decision ma xg." YEDIFFO 7(r, Amsterdam: F?orth-kolland Publishing Company, 1974. 112. Warner, F.B. Rnowledge sectors the AELP svstem." Proc. of for logical processing cf atient data in 2nd. A-no. Cvw. on Comnuter P Yedical u, IEEE, T;asn. XC.,(l'?ie), pp.-7X4. pplications in - 113. Watson, F.J. "Fedical stsff response to 2 medic21 information system with 186 Sec. Feferences Appendix R direct husician-computer interface." North-Holland Publishing CorQary, !07$. mI!r?TFC ih, Q. 2?9-3c2, ~l?StCrdX!!: 1iG. Pechsler, P. "A fuzzv a flroach to 191-1c3 (FE76). medical Fioned. Cmn. 7, pp. diagcosis." Int. J. -- llj. Heed, L.L. "Medical records that guide and teach." N. Ens. J. Med. 278, QQ* 5?3-5"?,652-65: (1968). -- 116. Weed, L.L. "Problem-oriented medical records." In Problem-Directed and Medical Information Svstems (M.F. Driggs, ed.), 'PIedlcal Book Gorporam973. New York: IntercontinenTZI 117. GTeiss, S-M., "A model- based method Ful~~;wski, C.A., Amarel, SD and Safir, A. computer-aided medical Artificial Intelligence 11, pp. 145-172 (1878). decision-making." 118. lJey1, S., Fries, .3., Wiederhold, G. "iz modular selz- c?fzs;fbing clinical databank and Gernano, F. system. ?I Conp. Eiomed. ti P, QQ. 27F-293 - . Ll?. Wiederhold, G., Fries, J.F., and Ueyl, S. "Structured oroanization cf clinical databases," 47%&SS Proc. of the lo75 NCC AFIPS Press vol. 24 (1475) QQ. ---Lv' !2C. yi;;ton, P.E. Artificial Intellicence, Beading, Fass. : Addison-ireslev _ 1 - 1. L2.1. Wortman, P.M. "Medical diagnosis : cit.? information approach." Connut. Biomed. Res. 5, pp. 315-328 (1972). Qr'scessing 122. Yu, P.L., Fa an, L.M., Wraith, S.M., et al. antimicrobia f selection - a 'Computer-based consultaticn in ccmoarative evaluation by experts." Stanford University School of Medicine. Submitted for publication, Movember lo78. 1 23. Yu, V.L., Euchanen, B.G., Shortliffe, E-H., et al. 'I 4 t: n evaluat',cn of the i: erforr.ar?ce of a i iomed., 1970. computer-based consultant." To aQQe2r in Comnut. Prop-. -- 124. Zadeh, L.A. "Fuzzy sets." Tnfcrmation and Control 8, pp. 338-353 (1965). 12.5. Zoltfe, N., Eorrocks, J.C., and assisted diagnosis of deDomba1, F.T. dyspepsia - report on transferabilitv of "Computer- 2 system, with em hasis on early diagnosis of gastric cancer." Yeth. Lhform. & 16, QQ. PO-92 (1077). 187 Appendix C TUE ART OF ARTIFICIAL INTELLIGEYCE: X. Themes and caee studies of knowledge eneineering Edward A. Feigcnbaum Department of Computer Science, Stanford University, Scanford, California, 94305. Abstract The knowledge engineer practices the art of bringing the principles and tools of AL research CO bear on difficult applications problems requiring experts' knowledge for their solution. The technical issues of acquiring this knowledge, representing it. and using it appropriately to construct and explain lines-of-reasoning, are important problems in the design of knovledge- based systems. Various systems that have achieved expert level performance fn scientific and medical inference illuminate the art of knowledge engineering and its parent science, Artificial Intelligence. INTRODUCTION: AN EXlUlPLE This is the first of a pair of papers that vi11 exnnlne emerging themes of knovledge engineering, illustrate them with case studies dravn from the vork of the Stanford Heuristic Programming Project. and discuss general issues of knovledge engineering art and practice. Ler me begin vith an example nev to our workbench: a system called PUFF. the early fruit of a collaboration betvean our project and a group dC the Pacific Medical Center WC) in San Francisco. A physician refers a patient to PnC's pulmonary function testing lab for diagnosis of posstble pulmonary function disorder. For one of the tests, the pacienr inhales and exbalcs a fev t imea in a tube connected to an instrument/computer combination. The instrument acquires data on flow rates and volumea, the so- called flov-volume loop of the patient's lungs and airvays. The coapucer maasures cettaln parameters Of the CUNC and presents them to the diagnostician (physician or WFF) for interpretation. The diagnosis is made along these 1 ines : normal or diseased; restricted lung disease or obstructive airvays disease or a combination of both; rhe severity; the likely disease type(s) (e.g. emphysema, bronchitis. etc.); and ocher factors important for diagnosis. PUFF is given not only the measured data but also certain items of information from the pattent record, e.g. sex. age. number of pack-years of cigarette smoking. The task of the PUFF system is to infer a diagnosis and print it out in English in the norma medical summary tform of the interpretation expected by the referri-rg physician. Everything PUFF knovs about pulmonary function diagnosis is contained in (currently) 55 rules of the IF...THEN... form. No textbook of medicine current 1 y records these rules. They constitute ehe partly-pub] ic, partly-private knovledge of `* an expert pulmonary physiologist at PMC. and vere extracted and polished by project engineers vorking intensively vitb the expert over a period of time. Here is an example of a PUFF rule (the unexplained acronym refer to vorfous data measurements): ------------------------------------------- RULE 31 SF: 1) The severity of obstructive airways disease of the patient is greater than or equal to mild, and 2) The degree of diffusion defect of the patient is greater than or equal to mild. and 3) The tlc(body box)observed/predicted of the patient is greater than or equal to 110 and 4) The observed-predlctcd difference in rv/tlc of the patient is greater than or equal to 10 THEN: 1) There is strongly suggestive evidence (.9) that the subtype of obstructive airways disease is emphysema. and 21 It is definite (1.0) that "CAD. Diffusion Defect. elevated TLC, and elevaLed RV together indicate emphysema." is one of the findings. ----------------------------------------------- 188 Appendix C Onr hundred cases. carefully chosen co span thr Klriccy of disease scotes oith sufflcienc rxrnplary tnformatton for each, were used to exccacc the 55 rules. As the knowledge emerged, it was represented in rule form. added co the system anti rested by running additional cases. The expert was sometimes surprised, sometimes frustrated, by the occasional gaps and inconsistencies in the knowledge. and the incorrect diagnoses chat were logical consequences of the extstlng rule set. The interplay between knowledge engineer and expert gradually expanded rhe set of rules to remove most of these problems. As cumulation of techniques in the art demands and allows. a new tool was not invented when m old one vuuld do. The knowledge engineers puIIed out of their toolkit a version of the M'YCIN system (to be discussed later). with the rules about infectious diseases removed. and used it as the inference engine for the PUFF diagnoses. Thus PUFF, like HYCIH. fs a relatively simple backuard- chaining inference system. It seeks a valid line- of-reasanfng based on its rules and rooted in the tnstrumenc and patient data. With a little more work at fitting some existing tools together, PUFF will be able co explain this line-of-reasoning. juac as NYCIN does. As it is, PUFF only prints out the final tncerprecation. of which the following is an example: PAT I ENT DATA : The degree of dyspnea: HODERATELY-SEVERE The severity of coughing: MILD Sputum production MODERATELY-SEVERE ihe number of pack-years of smoking: 4 referral diagnosis: BRONCHITIS KVC/IVC-predicted: 80 RVJRV-predicccd: 191 FVC/FVC-predicted: 87 TLC(body box)observed/predicted: 127 Predicted FEVl/FVC: 83 TLC(DLCO)observed/predicted: 83 FEVllFVC ratio: 50 RV/TLC Observed-Predicted: 21 HHF/hHF-predicted: 19 the DLCOIDLCO-predicted: 48 The slope (FSO-obs-F25-obs)/FVC-obs: 19 DECREE OF OBSfRUCTIVE AIRWAYS DISUSE: om degree by SLOPE: (~~DE~ATELY-SEVERE 700) OAD degree by MMF: (S&WE 900) OAD degree by FEV1: (MODERATELY-SEVERE 700 1 Ft!lhL OAD DECREE: (MODERATELY-SEVERE 910) (SEVERE 900) so conflfcc. Final degree: (YODEPATELY-SEVERE 910) Obstruction fs indicacrd by curvature of the flow-volume loop. Forced Vital Capacity is normal and peak flow rates are reduced. suggesting airway obstruction. Flow race from 25-75 of expired volume is reduced, indfcating severe airway obacruccion. MD. Diffusion Defect, elevated TLC, and elevated RV together indicate emphysema. OAD. Dfffusion Defect. and elevated RV indicate emphysema. Change in expired flow rates following bronchodllation shows chat there is reversibility of airway obstruction. The presence of a productive cough is an indication that the MD is of the bronchitic type. Elevated lung volumes indicate overinflation. Air trapping is indicated by the elevated difference between observed and predicted RV/TLC ratios. Improvement in airway resistance indicates some reversibility of airvay Airway obstruction'ls consistent virh the patient's smoking history. The airway obstruction accounts for the patient's dyspnea. Although bronchodilators were not useful fn this one case, prolonged use may prove to be beneftcial to the patient. The reduced diffusion capacfcy indicates atrvay obstruction of the mixed bronchitic and emphysematous types. Low diffusing capacity indicates loss of alveolar capillary surface. Obstructive Airvays Disease of mixed types 150 cases not studied during the knowledge acquisition process were used for a test and validation of the ruIe set. PUFF inferred a diagnosis for each. PUFF-produced and expert- produced interpretations were coded for sracistical analysis to discover the degree of agreement. Over various types of disease states, and for two conditions of match between human and computer dfagnoses ("same degree of severity" and "vfthin one degree of severity"), agreement ranged between approximately 902 and IOOI. The PUFF story fs just beginning and 0111 be cold perhaps at the next IJCAI. The surprising punchline to my synopsis is that the current state of the PUFF system as described above was achieved in less than 50 hours of interaction with the expert and less than 10 -n-weeks of effort by the knowledge engineers. We have learned much in the Appendix C p.lac decade of the art of engineering knouledgc- based incelligcnt agenta! In the raataindct of this essay, I vould like to discuss the route that one research group, the Stanford Heuristic Programming Project, hns taken, illustrating progress vith case o tudiea, and discusming tbemea of the uork. 2 ARTIFICIAL INTELLICXNCE b KNOULEDC& EXINEEIIINC~ The dichotomy that vsa used to classify rhe collected vwr= la the volume Computers and Thought still characterizes veil the motlvetlons end research efforts of the AI community. First, there are some vho work toward the construction of intelligent artifacts, or seek to uucover principles, methods. end techniques useful in such construction. Second, there are chose who view artificieI intelligence ea (to use Newell's phrsaa) "theoretical psychology." sacking explicit and valid informscion prOCSSSing models of human thought. For purposes of this essay. I wish to focus on the motivations of the first group, these days by far the larger of the two. I label these motivations "the lots: ligent agent vievpoint" end here is my understandlug of thet vlevpolot: "The potential wea of computers by people to accomplish rash can be `one- dlmenaion.alised' into a apec trum representing the nature of instruction that must ba given the computer to do its job. Call it the URAL-TO-HOW spectrum. At one extreme of the spectrum. the user supplies his lntelllgmce to instruct the -china vith precision axacrly XOY to do hie Job, step-by-step. Progress in Computer Science ceo be seen ss atepa svsy from the extreme `HOW point on the spectrum: the familiar peooply ?? o aaembly languagea ( subroutine librarice, compilers, extensible languagea, etc. At the other extreme of the spectrum la the user vlth his real problem (WUT he wishes the computer. ee him instrument. to do for him). He aspires to colmmfcetc UllAT he veota dooe In a langusge that ia comfortable to him (parhepa English); vie cmicatfon modes that ere coovenleot for bin (lncludlng perheps. aperch or pieturen); with some gemerallty, some vagueness. imprecision. even error ;` without having to lay out lo det'ail all necessary subgoals for adequate performance - with reasonable ssaurence that he is addressing en InteIIlgent agent that La using knowledge of hia vorld to understand his intent. to fill 10 his vagueneaa, to sake spaciflc his abstractions, to correct his errors, LO discover appropriate suhgoela. and ultimately to translate WHAT ha really wants done into processing steps that define HOU it shall be done by .a real computer. The research activity aired at cresting computer programs that act iia "intelllgeot egentsv nesr the MlAT end of the UDAT-To-HOW spectrum COO be viewed ss the long-range goal of AI research." (Felgenbaua, 1974) Our young acieoce 1s still more art than science. Art: "the principlea or methods governing soy craft or branch of learning." Art: "skilled uorkmsnship, execution. or agency." These the dictionary teaches us. Knuth tells us that the endesvor of computer programming is an arc. in Just these vays. The art of constructing lntelllgent agents is both pert of and an o xtenalon of the programming art. It la the art of building complex computer programs that represent and reason with knowledge of the world. Our art therefore lives in symbiosis with the other uorldly arts, whose practitioners -- axparts of their srt - hold the knowledge ve need to construct fntelllgent agents. In moat "crafts or branches of lesroing" what us cell "expertise" is the essence ot the art. And for the domains of knowledge that w touch with our art. it is the "rulea of expertise" or the rules of "good Judgment" of the expert practitioners of that domaio chat we seek to transfer to our programs. 2.1 Lessons of the Past TWO insights from previous vork are pertinent to this essay. The first concerna the quest for generality and power of the inference engine used fn the perfornsnce of intelligent acte (what Nlnsky and Papert [see Goldstein and Paperr, 19771 have labeled "the paver strategy"). UC must hypothesize from our experience to date that the problem aolvviog paver exhibited In an lntelllgent agent's perfomnce is primarily a consequence of the specialist's koovledge employed by the agent. and only very secondarily related to the generality end power of the inference aethod employed. Our agents oust ba bowledge-rich, even if they are methods-poor. In 1970. reporting the firet msfor sunnary-of-results of the DWDBAL program (to be dlacuaaad lster). we addressed thia i.aa~e as folloua : "...general problereolvera are too weak to be used as the bash for buiIding high-perfornsnca systems. The behavior of the best general problem-solvers ve know, human problem-solvers, Is observed to be nak and shallow, except in the area* in which the human problan-solver is a specialist. And it la observed that the transfer of expertise becveen specialty 190 Appendix C areas is slight. A chess master 1s unlikely to be an expert algebraist or P" experr mass spectrum analyst, etc. In this view. the expert Is the specialist, with a specialist's knowledge of his sees rnd specialfst's heurist:c5." (Feigenbauo. methods and Buchanan and Lederberg. 1971. p. 187) Subsequent evidence from our laboratory and aI others has only confirmed this bclfef. AI researchers have dramatically shffted their vfew on generality ahd power in the psar decade. la 1967, the canonical question about the PENDW program was: "It sounds like good chemistry. but what does it have to do with AI?" In 1977, Goldacain and Paperr write of a paradigm shtft in AI: "Today there has been a shift in paradigm. The fundamental problem of understanding Intelligence is OOC the identification of a fcv powerful techniques, but rather the question of bov to represent large amounts of knouledge in a fashion that permit5 their effective use and interaction." (Goldstein and Papert, 19771 The second insight from past work concerns the nature of the knowledge that an expert brings to the performance of a task. Experience has shown us that this knowledge is largely heuristic kmnrledge. experiential. uncertain - mstly "good guesses" and "good practice," in lieu of facts and rtg0r. Experience has also taught us that much of this knowledge fs private to the export. not because he is unvillLng CO share publicly how he performs. but because he is unable. He know more tbsn he is avare of knoving. Imy else is the Pha. or the Internship a gufld-like apprenticeship co a presumed "master of the craft?" What the masters really knov is not written in the textbooks of the masters.] But we have learned also that this private knowledge can be uncovered by the careful, painstaking analysis of a second p=rty v or sometimes by the expert himself, operating in the context of a large number of highly specific performance problems. Finally, we have learned that expertise is multi- faceted, that the expert brings to bear many and varied sources of knowledge in performance. The approach to capturing his expertise must proceed on many fronts 5imultaneousiy. 2.2 The Knowledge Eoqineer The knowledge engineer is chat second party jurt discussed. [An historical note about the tern. In the mid-60s. John McCarthy, for reasons obvious from his work. had been descrfbing Artificial Intelligence as "Applied Epistemology." `&en I first described the DENDRAL program to Donald Hichle in 1968, he remarked that it was "epistemological engineering," a clever but ponderous aad unpronounceable turn-of-phrase that I slmplffled into "knovledge engineering."] She (in deference to my favorite knowledge engineer) works intensively with an expert to acquire domain-specific knovlcdgc and organize It for use by a program. Simultaneously she is matching the tools of the AI workbench to the task at hand -- program organi2atlon8, methods of symbolic inference, techniques for the structuring of symbolic information, and the like. If the tooi fits, or nearly fits, she uses it. If not, necessity mothers AI invention, and a oew tool gets created. She builds the early versions of the intelligent agent, guided always by her intent chat the program eventually achieve expert Leveis of performance in the task. She refines or reconceptualiaes the system 5s the tncreasing amount of acquired ,\novledge eau5es the AI tool to "break" or slow down intolerably. She 0180 raffnes the human interface to the intelligent agent with several aims: co make the system appear "comfortable" to the human user in his linguistfc transactions vith it; to make the system's inference processes understandable to the user; and to make the assistance controllable by the user vhen, in the context of a real problem, he has an insight that previously was not elicited and therefore not incorporated. In the next section, I vish to explore (in summary form) some case studies of the knovledge engineer'5 art. 3 CASES FROH THE KNOWLEDGE ENGINEER'S WORKSHOP I will draw material for this section from the work of my group at Stanford. Much exciting work la knovledge engineering is gofng on elsewhere. Since my intent is not to survey literature but to illustrate themes, at the risk of appearing parochial I have used as ca5e studies the wrk I knov bert. w collaborators (Professors Lederllerg and Buchanan) and I began a series of projects, initially the development of the DENDRAL program, in 1965. We had dual motives: first, TV study scientific problem solving and discovery, particularly the processes scientists do use or should use in fnferring hypotheses and theories from empirical evidence; and second, to conduct this study in such a way that our experimental programs would one day be of use to working, scientists. providing Lntelligent assistance on feportant and difficult problems. By 1970. we and Appendix C our co-workers had gafned enough experience that we felt comfortable in laying out a program of research encompassing work on theory formation. knowledge ucllFratton. knowledge acqutoltion, explanation, and knowledge engineering techniques. Although there were some surprises along the way (lfke the All program). the general lines of the research sre proceeding as o nvfsioned. THEMES As a road map to these case studies, it is useful to keep in mind certain major themes: Cenerstion-and-test: Omnipresent tn our exneriments is the "classical" neneration-and- test framework that has been the hallmark of AI programs for tvo decades. This is not a consequence of s doctrinatre attitude on our part about heuristic search, but rather of the usefulness and sufficiency of the concept. Sftuation -> Action Rules: Ue have chosen to represent the kno-o,ledge of experts in this form. Making no doctrinaire claims for the universal applicability of this representation, we nonetheless point to the demonstrated utility of the rule-based representation. From this representation flov rather directly msny of the characteristics of our programs: for example. ease of modification of the knowledge, case of explanation. The essence of our approach is that a rule must capture s "chunk" of domain knowledge char is meaningful, in and of itself, to the domain specialist. Tbucr our rules bear only a historical relationship to the production rules used by Newell and Simon (1972) vhich we view as "machine-language programming" of a recognize -> act machine. The Domain-Specific Knowledge: It plays s critical role in organizing and constraining sesrch. The theme is that in the knowledge is the power. The interesting action srises from the knowledge base, not the inference engine. We use knowledge in rule form (discussed above), in the form of inferentially-rich models based on theory, and in the form of tableaus of symbolic data and relationships (i.e. f rawlike structures). System processes are made to confons to nstural and convenient representations of the domain- specific knowledge. Flexibility to aodifv the knovledKe base: If the so-es1 led "grain sire" of the knowledge representation is chosen properly (i.e. small enough to be comprehensible but large enough to be mesningful to the domain epecislist), then the rule-based approach allows great flexibility for adding. removing. or changing knowledge in the system. Line-of-reasoning: A csncrsl organizing principle in the design of knovledge-based intelligent agents is the naintenance of a line-of-reasoning that is comprehensible to the domain specialist. This principle is, of course, not a logical necessity, but seems to us to be an engineering principle of major importance. Multinle Sources of Knowledge: The forwcion and maincensnce (support) of the line-of-reasoning usually require the integration of many disparate sources of knowledge. The representational and inferential problems in achieving s smooth and effective integration are formidable engineering problems. Explanation: The ability to explain the line-of- reasoning in a language convenient to the user is necessary for application sad for system development (e.g. for debugging and for extending the knowledge base). Once again, this is an engineering principle, but very important. Uhat constftutcs "an explanation" is not a simple concept, and considerable thought needs to be given, in each case, to the structuring of explanations. CASE STUDIES In this section I vi11 cry to illustrate these themes with various csse studies. 3.1 DENDRAL: Infertine Chemical Structures 3.1.1 Historical Note &gun in 1965. this collaborative project with the Stanford Hass Spectrometry Laboratory has become one of the longest-lived continuous efforts in the history of AI (a fact that la no spa11 way has contributed to irs success). The basic framwrk of generation-and-test and rule-based representation has proved rugged and extendable. For us the DENDRAL system has been a fountain of ideas. many of which have found their way, highly metamorphosed, into our other projects. For exsmple. our long-standing commitment to rule- based represenrstioas srose 0oC of our (successful) attempt to head off the imminent ossification of DHNDRAL caused by the rapid accumulation of new knowledge in the system around 1967. 3.1.2 Task To enumerate plausible structures (atom-bond graphs) for organic molecules, given two kinds of information: analytic instrument data from a mass spectrometer and a nuclear magnetic resonance spectrometer; and user-supplied constraints on the answers, derived from any other source of knowledge (instrumental or contextual) available to the user. Appendix C 3.1.3 ReprCSCntaLions Chemical structures are represented as nodr link graphs of atoms (nodes) and bonds (links). Constraints on search are represented as subgraphs (acomic configurations) to be denied or preferred. The empirical theory of mass spectronecry is represented by a set of rules of the general form: Sttwtfoa: Parricular atomic configuration (subgraph) ( Probability, P, I of occurring V Action: Fragmentation of the particular configuration (breaking links) Rules of this forn are natural and expressive to mess speccrometrists. 3.1.6 Sketch of Hethod DENDRAL's inference procedure 1s a heuristic search that takes place in three 5cages. without feedback: plan-generate-test. "CenVate" (a program called CONCEN) is a generation process for plausible structures. ICS foundation is a combinatorial algortthm (vith aathemaeically proven properties of complecene55 and non-redundant generation) that can produce all the topological ly legal candidace 5trUCturss. Constraints supplied by the user or by the "Plan" process prune and seer the geaeracion to produce the plausible set (i.e. chose satfafyiag the constraints) and not the enormous legal set. "Test" ref fnes the o ??o?????* of plausibility. discardfng less worthy candidates and rank-ordering the remainder for examination by the user. "Test" first produces a "predicted" set of instrument data for each plausible candldete. ustng the rules described. It then evaluates the wrrh of each candidate by comparing its predicted data vtth the actual input data. The evaluation is based on heuristic criterfa of goodness-of-fit. Thus, "test" selects the "best" explanation5 of the data. "Plan" produces direct (i.e. not chained) tnf o ??*?? about like Ly subrcructure In the molecule from patterns in the data chat are indicative of the presence of the substructure. (Patterns in ehe data trigger the left-hand-sides of substructure rules). Though composed of many atoms whose interconnections are given. the substructure can be manipulated as atom-like by "generate." Aggregating many units entering tnto a combinatorial process into fewer htgher-level units reduces the size of the combinarorial search space. "Plan" sets up the search space so aa to be relevent co the input data. "Generate is the inferencs cacticlan; `Plan" is the inference strategist. There is a separate "Plan" package for each type of instrument daca, but each package passes substructures (subgraphs) to "Generate " . Thus, there is 5 uniform interface between "Plan" and "Gearrace." User-supplied constraints enter this fnterface. directly or from user-assist packages. tn the form of substructures. 3.1.5 Sources of Knovledse The various sources of knowledge used by the DENDRAL system are: Valences (legal connections of atoms); stable and uastable configurations of atoms; rules for m5ss speccrometry fragmencacions; rules for NMR shifts; expert's rules for planning and evaluation: user-8uppL ied constraints (contextual). `* 3. X.6 Results DgNDRAL's structure elucidation abilftles are. paradoxically. both very general and very narrow. Ia general, DENDRAL handles ail molecules, cyclic and tree-like. In pure 3tructure elucidation under constraints (without instrument data).CONCgN is unrivaled by human performance. In structure elucidation with inacrumenc data, DLNDRAL's performance rivals expert human performance only for a smail number of molecular families for which the program has been given 5pecialisc's knowledge. namely the families of interest to our chsmfsc collaborators. I will spare this computer science audience the list of name5 of these families. Uithia these areas of knovlsdge-intensive specialization, DENDRAL -s perfomnce is usually not only much faster but also more accurate than expert human performance. The statement just made summarlses thousands of runs of DENDRAL on problems of interest to our experta, their colleagues, and their studencs. The results obtained. along with the knowledge chat had to be given to DERDRAL to obtain them, are published in major journals of chemistry. To date, 25 papers have been published there, under a aerie title "Applications of Artificial Intelligence for Chemical Inference: " (see references). The DKNDRAL system fs in everyday use by Stanford chemists, their collaborators at ocher universities and collaborating or otherwise interested chemists in fndustry. Users outside Appendix C I;t:rnford BCL'PS.s the syacrn ""CT rommerr i 3 1 compuc*r/commcinicacions network. Thr prob I c-es chry are solving are often dlff lcul t and novrl . The Pri tfsh povcr"mmc is currently supporting work at Edinburgh aimed at transferring DENDRAL to induscrlal user cnmmuntttes in the UK. 3. I. 7 Discussion Representaclon and extensibility. The wpresentatlon chosen For the molecules, ronscraincs, and rules of instrument data interpretation Is sufficiently claw CO that used hY chemists In thinking about scructurc elucidation that the knowledge base has been extended smoothly and easily. mostly by chemists themselves in recent years. Only one major reprogramming effort took place In the last 9 years -- when a new generator was created to deal vlth cyclic ~cr~~c~re~. Representation and the Integration of multiple sources of knowledge. The generally difficult problem of fnteRratinR various sources of knowledge has been made easy in DENDRAL by careful engineerinK of the representations of objects, constraints, and rules. We insisted on a C"lll?O" language of compatlblIity of the representations vfth each other and with the 1"ference processes: the language of molecular structure expressed as graphs. This leads to a stralqhtforward procedure for addlng a new source of knowledge, say. for example, the know1 edge associated with a new type of instrument data. The procedure is this: write rulea that describe the effect of the physical processes of the Instrument 0" no1 ecu1 es using the situation -> action form with molecular graphs on both sides: any special Inference process ustng these rules must pass its results to the generator only(!) in the common graph language. It is today vldely believed in AI that the use of many diverse sources of knowledge In problem solving and data interpretation has a strong effect on quality of performance. HOW stronq Is. of course, domain-dependent. but the impact of bringlnq just one additional source of knowledge to bear on a problem can be startling. In one difficult (but not unusually difficult) mass spectrum analysis problem*, the program using its mass spectrometry knovledge alone would have generated an impossibly large set of plausible candidates (over I.25 million! 1. Our engineering response to this was to add another source of data and knowledge, proton NHMR. The addition on a simple interpretive theory of this !DfR data, from which the proDram could infer a few additional ronstraints. reduced the set of olauslble candidates to one, the riRht structure! This was not an isolated result but shoved up dozens of times In subsequent analyses. ------------------ o the analysis of an acyclic amine vith formula C2OKLW. DENDRAL and data. DEKDRAL's robust mode1 s (topologlca1. chemical. instrumental) permit a strategy of finding solutions by generating hypothetical "correct answers" and choosing among these with critical tests. ThlS strategy 1s opposite to that of piecing together the Implications of each data point to form a hypothesis. UC call DENDRAL's strstegy largely model-driven, and the other data-drive". The consequence of having enough knowledge to do model-driven analysis is a large reduction in the amount of data that must be examined since data fs being used mostly for verification of possible answers. In a typical DENDRAL mass spectrum analysis, usually no more than about 25 data points out of a cyptcal total of 250 points are processed. This important point about data reduction and focus-of-attention has been discussed before by Gregory (1968) and by the vision and speech research Rroups, but Is not widely understood. Conclusion. DERDRAL was an early herald of AI's shift to the knovledge-based paradigm. It demonstrated the point of the primacy of domaln- specific knowledge in achieving expert lavels of performance. Its developmew t brought to the surface important problems of knowlcdee representatiofl. acquisition. and USC. It shoved that, by and large, the AI tools of the first decade were sufficient to cope wfth the demands of a complex scientific problem-solving task,Or were readily extended to handle unforsecn difflcultles. It demonstrated that AI's conceptual and programming tools were capable of producing programs of applications interest, albeit in narrow special ties. Such a demonstration of competence and sufficiency was Important for the credibility of the AI field at a critical juncture in its history. 3.2 HETA-DENDRAL: inferring rules of mass snectrometry 3.2. I Historical note The META-DERDRAL program is a case study tn automatic acquisition of domain knowledge. It arose out of our DENDRAL vork for two reasons: first, a decision that vlth DENDRAL we had a sufficiently firm foundatlon on which to pursue our lonR-standing interest in processes of scientific theory formation; second, by a recbgnltion that the acquisition of domain knovledRc was the bottleneck problem In the building of applications-orirnted Intelligent agents. X2.2 Task MS'TA-DEDDRAL's job is LO infer rule? of fragmentation of molecules in a mass spectrometer for possible later use by the DE?IT)RAI. performance Appendix C proqram. The inference is co be made from acrusl spoccra recorded from know molecular structures. The oucpuc of the SYSLWI is the set of frn~mcntation rules discovered, summary of the evidence tuppartlnR each rule, and a summery of contra-indicating widener. User-supplied constraints cm also be input to force the form of rules along desired lines. 3.2.3 Represencactons The rules arc, of course, of the same form as used by DENDRAL char was described earlier. 3.2.4 Sketch of Vethod URA-DEBDRAL. like DEh3RAL. uses the rcnrration-and-test framework. The process is orA.xnited fn chrre staRes: Reinterpret the data an-' summari*e evidence (lrrSuM) ; generace plausible candidates for rules WJLECEN); test and refine the set of plausible rules (RUtMOD). INTSLM: gives wery data point in wery spectrum an interpretation as a possible (highly specfficl fraReentacion. It then sumaarizcs stntiscically the "ueiRht of evidence" for fragmentations and for atomic configurationa that cause these fragmentations. fhue. the job of INTSLD4 is to translate data to DENDRAL subgraphs and bond-breaks, and co summarize the evidence JccordlnRly. RULECEN: conducts a heuristic search of the space of all rules chat are legal under the DENDBAL rule syntax and the user-supplied constraints. Xc searches for plausible rules, i.e. those for which posirlve evidence exists. A search path is pruned when there is no evidence for rules of the class just qeneraccd. The search tree begins vi th the (single1 most general rule (loosely put, "anything" fragment9 from "anything") and proceeds level-by-level toward more detailed specifications of the "anything." The heuristic stopping criterion measures whether a rule being generated has become too specific. in particular vhccher tt is applicable to too few molecules of the input set. Slmi1arly there is a criterion for decidinR vhecher an emerRing rule is too Rsncral. Thus. the output of RULECEN Is a set of candidate rules for which there is positive evidence. RULE%OD: tests the candidate rule set using more complex criteria. includlnR the presence of neRatfve evidence. It removes redundancies in the candidat* rule set ; merges rules that are supported by the same evidence: tries further special izacton of candidates to remove negative iCVl4CnCe; and tries further Reneralizstion chat preserves posirivc evidence. 3.2.5 Results HFTA-DENDRAL produces rule sets chat rival in quality those produced by our collaborating experts. In some rests, HETA-DENDRAL recreated rule sets that ve had previously acquired from our experts during the DENDRAL project. In d more stringent test involving members of a family of complex r ingcd molecules for vhich the mass specrral theory had not been completely worked out by chemists, META-DENDRAL discovered rule sets for each subfamlly. The rules were judged by experts to be excellent and a papcr describing them vaa recenciy published in a maj or chemical journal (Buchanan. Smith, et al. 1976). In a test of the generality of the approach, s version of the META-DENDRAL program 1s currently being applied to the discovery of rules for the analysis of nuclear mognecic resonance data, 3.3 WCIN and TEIRESIAS: Medical Diagnosis 3.3.1 Htstorfcal not; HYCIN orlglnated in the Ph.D. thesis of E. Shortliffe (now Shortlfffc. M.D. as well). Ln collaboration vith the Infectious Disease group at the Stanford Medfcal School (Shorcliffe, 1976). TEIRESIAS, the Ph.D. thesis vork of R. Davis., arose from issues and problems indicated by the MYCIN project but generalized by Davis beyond chr bounds of 1976). medicaln~~~~n~:::ee~pl~~ea:f~ns W:vls, Ocher In progress. 3.3.2 raJks The WCIN performance cask is diagnosis of blood infections and meningitis tnfectlons and the recommendation of drug treatment. !iYCIN conducts a coneultacion (in English) with a physic ian-user about a patient case, constructing lines-of- reasoning leading to the diagnosis and treatment plan. The TEIRESIAS knovledge acquisition task can be described as follow: In rhe context of a particular consultation. confront the expcrc vith a diagnosis vith which he does not agree. Lead him systematicaLly back through the line-of-reasoning that produced the diagnosis CO tho point ac vhich he indicates the analysis wane awry. Incerncc with the expert co modify offending rules or to acquire new ruies. Rerun the consul tat ion to test the soIution and gain the expert's concurrence. 195 Appendix C 3.3.3 IF Rcpresentattons: NYCIN's rules are of the form: THEN Here is an example of a HYCIN rule for blood fnfections. RULE 85 IF: 1) The rice of the culture is blood. and 2) The gram acain of the organism is gramncg , and 3) The morphology of the organism ia rod, and 4) The patient is a compromised host THEN: There is suggestive evidence (-6) that the identity of the organism is pseudomonas-aeruginose THIRESIAS allow the representation of HYCIN-like rules governing the use of ocher ru1es.i.e. rule-baaed strategies. Aa example follows. 3.3.5 Language of Intetac tion The language used looks like it raight be English but is actually the dialect "DocLor-ese" ured by members of the tribe of healing arts practitioners. Reasonably simple I anguage processing methods suffice. When ambiguities of interpretation are encountered, they are Ted back to the user for decisions about meaning. 3.3.4 Sketch of method MYCIN employs a generation-and-test procedure of a familiar sort. Tbe generation of steps in the line-of-reasoning is accomplished by backward chaining of the rules. An IF-side clause is either immediately true or false (as determined by patient or test data entered by the physician in the consultation); or is to be decided by subgoaling. Thu.. "test" ia interleaved with "generation" and serves to prune out incorrect lines-of-reasoning. Each rule supplied by an expert has o ssocieted with it a "degree of certainty" representing the expert's confidence in the validity of the rule (a number from I to LO). HYCIB usea a particular ad-hoc but simple model of inexact reasoning to cumulate the degrees of certainty of the rules used In an tnference chain (Shortliffe and Buchanan, 1975). It follows that there may be a number of "aomevhat true" lines-of-reasoning -- some indicating one diagnosis, some indicating another. All (above a threshold) are used by the system as sources of knovledge indicating plausible lines- of -reasoning. TEIRESIAS' rule acquisition process is based on a record of tWCIN's search. Rule acquisition is guided by a set of rule models that dictate the form and indicate the likely content of new rules. Rule models are not given in advance, but are inferred from the knowledge base of existing rules. IF: 1) the patient is a compromised host, and 2) there are ruler vbich mentioa in their premise pseudomonas 3) there are rules vhich mention in their preoiee klebsiellra THEN: There is suggestive evidence (.4) that the former should be done before the latter. 3.3.6 Example diagnosis and treatment plan The folloving is an example of a diagnosis reported by RYCIN to one of its users foll,ovlng a consultation. An example consultation vi11 not be ahovm. but one can be found in Shortliffe's book (Shortllffe, 1976). --------- m-s- Fly therapy recommendations vi11 be designed to treat for organisms that are either very likely or, although less likely, would have a significant effect on therapy selection if they were present. It is important to Appendix C cover for chr followtnR probable infection(s) and associated organism(s): INFECTION-l is CYSTITIS PSEUDOIYONAS-CEPACIA [ORCANISl#-51 LISTERIA [ORGANISM-21 INFECTION-3 is RACTgRDfIA WTEROCCCCUS [ORGANI%!- 1 [REC-II tly preferred therapy recommendation is as folloue: In order to cover for Items ~1 2 3 4 5X Give the following in combination: I ) KANAMYCIN Dose: 7.50 mg (7.5 =&kg) q12h IF! (or IV) for 28 days Comments: Modify dose in renal failure 2) PENICILLIN Dose: 2,500.OOO units (25000 unitm/kg) q4h IV for 28 days --e-------w-- 3.3.7 Results MYCIN ts currently bcfng prepared for clinical evaluation. In a prellmtnary evaluation. done outside of the clfaical setting, o panel of expert judges uas asked to rate MYCIN's perfomance along a number of dimensions (e.g. infecting organism ldentif ication, organism siRniflcance1. In 90X of the casem submitted CO the judges, a majority of the judgca raid thet the program's decisions uere the-some-as or as-good-as the decisiona they would have made. 3.3.8 Exulaaacfon MYCIN can expose its line of reasoning by offering explanattons of various kinda. `Why" and "How" questions can be asked by the user during the consultation. "my" is the user's usy of inquiring. `Why do you vaac to know that?" "Row" is his uay of asking, "Hou did you conclude that...?" In a post-consultation, he can similarly inquire about how the diagnostic conclusiona uere reached, and inquire about the reasons why certain ocher conclusions verc ruled out. Examples of the XYCIN-TEKRESIAS explanation capabilities follow: IlYCXN: Is the patient's fllness with ORGANISM-I a hospital-acquired infection? USER: UKY TEIRFSIAS: (i.e. WRY is it imporcrnt to determine whether or not the infection with ORGANISM-l uas acquired while the patient uaa hospitalized?) Il.01 This ~111 aid in determining the category of ORGANISM-I. It has already been o acablished that Il.11 the gram stain of ORGANISM-1 is gtameg. and Il.21 the morphology of ORCANISR-1 is rod. cud [I.31 the aerobicity of ORCANISH-I is facultatlve Therefore. if 11.41 the infection with ORGANISM-I was not acquired while the patient was hospitalized then there is strongly suggestive evidence (.8: that the category of ORGANISM-1 is enterobscterirceae Also: there is weakly suggestive evidence (. 1) that the identity of ORGANISU-1 is pseudomonas-acruginosa [RULE 371 USERR: KOU 1.3 TEIRESIAS: [i.e. HOU vas it established that the aeroblcfty of ORCXNISH-I ts facultative?I The following uere wed: f3.11 RULE 27 indicated there is strongly suggestive evidence (.8) that rhe aerobicity of ORGANISR-1 1s facultativc 13.2) RULE 27 indicated there is weakly suggestive evidence (-2) that the rerobicity of ORCANISX-I is anaerobic Since this gave a cumularive CF of c.8) for facuitaclve, and c.2) for anaerobic, i.t has been established that the aerobicity of ORGARISH- is faculcative. Appendix C USER: WHY DIDN'T YOU CONSIDER STREPTOCOCCUS AS A WSSIBILITY FOR ORGANISM-1 7 TEIRESIAS: The following rule could have been used CO detetmiue that the fdentlfy of ORCANISM-1 was strepcococcw: RULE 33 But claue 2 ("the morphology of the organlao IS coccus") WAS already know co be false for ORCANISP!-I, so the rule ves never tried. 3.3.9 Discussion A fev short comments aimed nt knowledge engineering are in order. English is the language moat natural snd comfortable for out experts and users. and van therefore the language chosen for interactive consultation, explan8tion. and external representation of the rules (the internal format is INTERLISP). This situation is not peculiar to doctors; in most mew of applfcat1on of intelligent agents I believe that Eaglieh (i.e. natural langusge) vi11 be the language of choice. Programming an English language proceesor aad front-end to such SyBtcw is not . sc4ry enterprise because: a) the domain is specialized. so that possible interpretations are constralned. b) specialist-talk is replete with stsndard jargon and stereotyped veye of expressing knowledge and queries - just right for text templates, simple gr-rs and other simple processing schemes. cl the ambiguity of interpretation resulting from simpple schemes caz~ be dealt with easily by feeding back Cntcrpretatlous for confirmation. If this is done tith s pleasant "I dido't quite understand you..." tone. it is not irritating to the user. English msy be exactly the wrong language for representation and interrctioo in 80-e dorrfns. It would be svkvard. to sey the lust. to represent DE24DUAL's chemical atructuren and knovledfle of MAW apectronetry in English, or to interact about these vlth a user. Staple crplsnation schenee hsve been .s part of the AI scene for a number of years and 4r'c not hard to implement. Rcaily good models of what o xplanstion is as a traauction between user and agent, vf th programs to implement these models, vi11 be the subject (I predict) of much future research in AI. Without the o xpLanaCion capability, I asert, user acceptance of WIN would have been nil. and there would have been a greatly diminished effectiveness and contribution of our experts. HYCIN wss the first of our progrsos that forced ua to deal with uhst ve bad always understood: thst experts' knovledge is uucertafn and that our Inference engines had to be msde to reason titb this uncertainty. It is leas importaot that the inexact reasoning scheme be formal, rigorous, md uniform thro it is for the scheme to be natural to and eaolly underataudable by the experts and users. All of cheat points can be summarized by saying that HYCIN snd its TEIRESIAS sdjuact are exper1mmts in the design of a see-through system, whose represeatatioas and processes are almost transparently clear to the domain specialist. "Almost" here is equivalent to "with a few minutes of introductory description." The various pieces of MTCIN - the b&ward chaining, the English traiasac tious , the explanations. etc. - are each simple fn concept and realization. But there are great virtues to simplicity in syetcr design; and vieved as s total intelligent Agent system. HYCIN/TZIRESIAS'is one of the best engineered. 3.4 SU/X: signal understanding 3.4.1 Historical note su/x ia a system design that vas tested in 00 application vhose details arc classified. &cause ot thin. the easuiug discussion vi11 appear considerably less concrete and tangible thro the preceding CS.C studies. This system design vss done by H.P. Nii and PC, and vss strongly influenced by the QLU Hesrmy II system design. 3.4.2 Task SU/X'U task la the formation and continual updatf ng , over long periods of time, of hypotheses shout the identity, location, and velocity of objects in s physical apace. The output desired is a display of the "curreoc best hypotheses" with full explanation of the support for each. There are two types of input data: tha primary signal (to be understood); and suxiliary symbolic data (to supply context for the understanding). The primary signals are spectra, represented as descriptions of the spec:ral lines. The various spectra cover the physics1 space vith some spatial overlap. Appendix C 3.4.3 Represencactons The rules give" by the expert about objects. their behavior. and the interpretation Of SfgM1 data from them are all rcpreaented in the situation -> action form. The "slt"atiom3" constitute 1,WOktng conditions and the "actiona" are processes that modify the current hypotheses, P-t unresolved issue5, recompute evaluations. etc. The expert's knowledge of how to do analyafs in the task is also represented in rule form. There strategy rules replace the normal exccutfve program. The situation-hypochaeis fa repreaaoted a8 a node-link graph. tree-like in chat it has distinct "levels," each representing a degree of abstraction (Or Za88reg~CiOll) that is natural to the expert in his understanding of the domaio. A node represents an hypothesis; a link to that node represents support for that hypothesis (as in HEARSAY 11. "support from above" or "support f roa below"). "tower" levels art concerned with the specif its of the signal data. "Higher" levels represent symbolic abstractions. 3.4-h Sketch of method The altuation-hypothesis 15 formed incrementally. As the situation unfolds over time, the triggering of rules modifiet or discards existing hypotheses, adds new ones, or changes support values. The situation-hypothesis is a common wrkspace ("blackboard." in HEARSAY jargon) for all the rules. In general, the incremental steps toward a more complece and refined o ituacioa-hypothceis can be viewed as a sequence of local generate-and-test activities. Some of the rules are plausible move generators. geoeracing either nodes or links. Other rules are evaluators. testing and modifying node descriptiona. In typical operation. aev data is submit ted for processing (say. N tint-units of neu data). Thfs inftiaces a flurry of rule-triggerlogs and consequently rule-actions (called "events"). Some tvencs are direct consequences of the data; other avents arise io a cascade-like fashion from the triggering of rules. huxflisry symbolic data also cause events, usually affecting the higher levels of the hypothesis. As a consequence, aupport- fro-above for the lower level ptoceaaes is made available; aad expecrrtioar of possible lower level events can be formed. Rventually all the relevant rules have their say aod the system becomes quiescent, thereby triggering the input of new data to cc-energize the inference activity. The ayatem uses the almplifying strategy of Mincainfag only oae `*best" situation-hypothesis PC any moment, modifying it incrementally as required by the changing data. Thls approach is made feasible by several characteristics of the dcmain. First, there is the strong conclnuity over time of objects and their behaviors ,(specifically, they do not change radically over time. or behave radically differently over short periods). Second, a single problem (identity, location and velocity of a ptrtlcular aat of obj ecca) persists over numerous data gathering periods. (Coapere this to speech understanding in which each sentence is spoken just once, and each presents a neu and different problem.) Finally. the syscen's hypothesis is typically "IIlEJosc right." in part becsuse it gets numerous opportunities to refine chc solution (i.e. the numerous data gathering periods), and la part because the availability of many knouledga sources tends to over-derernine the solution. As a result of all of thatt, rhc current best hypothesis changes only slwly vith time, and hence keeping only the current best is a feasible approach. Of latereat are the time-based events. These rule-like expressions, created by certain rules, trigger upon the paasaqs of specified amounts of time. may implement various "wait-and-see" strategies of anelysis that are uaaful in the domaia. 3.4.5 Results In the teat application. using signal data generated by a simulation program because real data uas not available, the program achieved expert level6 of performance over a *pan of test problems. Some problems wre difficult because there vao very little primary signal to auppor t inference. Others were difficult because too much signal induced a plethora of alternatives with much ambiguity. A difitd SU/X design is currently being used as the basis for an application to the interprctatlon of x-ray crystallographic data, the CRYSALIS program mentioned later. 3.4.6 Discussloo The role of the auxtliary symbolic sources of data is of critical importance. They supply a symbolic model of the erlstlng situation that Is used to generate txpactationt of events to be observed in the data stream. This allovs flow of inferences from higher levels of abstraction to lover. Such a process, 50 familiar to AL researchera. apparently is al?ESt unrecognized 0-s 5igTlal processing engineers. In the application task, the expectation-driven analysis is essential in controlling the combinatorial procesaiog explosion at the lover levels,exactly the explosion chat forces the traditional afgM1 processing angineers to aeek out the largest possible number-cruncher for their vork. The de8lgU of appropriate explanations for the user takes an interesting twist in SU/X. The Appendix C situation-hypothcals unfolds piecemeal over time,, but rhe " appropriate" explanation for the user is one chat focueea on individual objects over time. Thus the appropriate o xpIenatinn muet be synthesized from a history ol all the events that led up to the current hypothesis. Contrast this virh the HYCIN-TEIRESIAS reporting of rule invocatioee in the eonetructioa of a reeeoning chain. Since its knowledge beee and its auxiliary eymbolic data give ?? o model-of-the-eituetion that StrO&y coastreins interpretation of the primary data Stream. swx Is relarivelp unperturbad by o rrorful or missing date. These data conditions merely cause fluctuations in the credibility of individuel hypotheses and/or the creation of the `*"aIt-end-see" events. SU/X can be (but has not yet been) ueed to control o enaore. Since its rulce specify what types end vel"ee of evidence ere nece8eary to establish support, nnd since it Is constantly processing a complete hypothesis structure. It can request "critical readings" from the sensors. In general, this a1 lows an efficient use of Umited sensor bandwidth and data acquieirion processing capability. 3.5 OTHER CASE STUDIES Space does not ellov more than just e brief sketch of other interesting projects that have been completed or are in progress. 3.5.1 A?+: mathematical discovery AM is o knowledge-based system thet conjectures interesting concepts in elemantary mathematics. It is a diecoverer of interesting theorem8 to prove, not a theorem proving program. It was conceivad and executed by D. Lenat for his Ph.D. thesis, and Is reported by him in these proceedings ("An Overview of An"). API's knWltdgt 18 beeicelly of CM types: rules thet S"g8t`t possibly interesting new concepts from previously coojectured concepts; and rulae that evaluetc the mathematical "lntereetingne*e" of a conjecture. These rules attempt to capture the expertire of the profeeeionel mathemeticlan at the teak of mathematical diecovery. Though LeneC ie not a profeeeionel mathematieien. he vae able successfully to eerve as his owe expert in the building of this program. A?4 conducts a heuristic aearch through the space of concepte treatable from its rules. Its basic frnmewrk is generetion-end-test. The generation is plausible uove gtwrecion, se indicated by the rules for formation of new concepts. The test I` the evaluation of "incereatingness." Of particular note is the method of test-by-example that lends the flavor of scientific hypothesis testing to the caterprlee of mathematical discovery. Initialized tith concepts of elementary set theory, it conjectured concepts in elerencary aumhtr theory, euch as "add." "multiply" (by four dietlact paths!), *primes," the unique fectorlution theorem, and o concept o imllar to prima` but previouely not much studied called %exlmally divlelble numbers." 3.5.2 HOLCEN: planning experimaate in molecular genetics UOLGW a collaboration with the Stanford Genetics Depertmenc, is wrk La progress. HOLCEN'e caek le to provide intelligent advice to a molecular geneticiet on the planning of experimmcs involving the manipulation of DNA. The geneticist hae various kinds of laboratory technique6 available for changing DNA material (cute. joins, insertions, deletione, and so on); techniques for determining the biological coneequencee of the changes; various instruments for meeeurlng effects; various chemical methods for inducing, facilitating. or Inhibiting changes; end many other'toole. NLGEN will offer planning aeelatance in organizing and sequencing such tooie to accomplish an experimental goal. In o dditioa HOIXEN will check ueer-provided experiment plans for fceeiblllt~; and its knovledge baee will be a repository for the rapidly expanding knowledge of this specialty, available by interrogation. Currant efforts to tngiMSr a knouledgc-base management eymtes for HOLGEN are described by Kertin et al la a paper In these proceedings. This aubeyetem uses and o xteude the techniques of the TEIRESIAS system diecueeed earlier. In HOLCEN the probIeo of integration of many divcree eourcee of knowledge is central since the essence of the experiment planning process is the successful merging of biological, 8tnetiC, chemical, topological, and inetrumcnt knovlcdge. In MOUXN the problem of rtprtaenting processes is also brought into focus since the expert's knowledge of txperlmmcel strategies -- proto- plane - wet also be represtated and put LO USC. 3.5.3 cR=ALIs: fnftrr%nR DrGtein StrUCtUre fKOm electron density maos CRYSALE, too, is uork in progress. Its task is to hypothesize the Structure of a protein from a map of electron density that is derived from x- ray crystallographic data. The map is three- dimensional. and the contour information is crude and highly ambiguous. Interpretation is guided and eupporttd by auxiliary information, of which the amino acid sequence of the protein's backbone is the moat important. Density map interpretation 200 Appendix C is 5 protein chemist's art. Ae always, capturing this arc in heutiscic rules and putting it to use vtch an inference engine f4 the project's Soal. The iofcrcace engine for CRYSALIS is a modification of the SU/X system daafga described above. `The hypothesis formation process must deal vtch mauy levels of pomsibly useful aggregation sod abstraction. For example. the map itself can be vieuctd as consisting of "paake." or "peaks and vslleys." or "skeleton." The protein model has "atoms," "amide planes," "amino acid sidechains," and eves massive substructurea such am "hmlfces." Proteiu moleculea are so complex that ? o ????????? generation-and-test strategy like DENDRAL'e is not feasible. Incremental piecing together `of the hypothesis using region-growing methods is lUCC4U*ry. me CRYSALIS design (alias SU/P) is descrtbd in a recent papr by Nfi nnd Feigmbauo fL97f). 4 SUM?LUY OF CASE STUDIES Some of the themes preseated earlier need no recapiculacioo. but I wish to revisit three here: genrratton-and-teat; situation -> action rules; and explanations. A.1 Generation aad Test Aircraft come ia a vide variety of sixes, shapee, and functional designs and they are applied in very many weye. But almost all thet fly da 50 because of the unifying physical principle of Lfft by airflow; the others are deaoribed by exreption. So it is with intelligent agent pmg rams and, the informatioa processing psychaLoglscs tell us. vfth people. One unifying prfnciple of "intelligence" is gmeration-amd- cast* No wonder that it ham been so thoroughly atudlkd in AI resrarch! m the case studies. gerraration is ~ifeeced in a variety of forms aed processing sch4mmm. There are legal move generators defined formaLly by a generating algorithm fDENDRAL's graph generating algorithm): or by a Logical rule of iaference WCIN'a backvard cbainiog). When Lega move generation is not possible or not o fffctenr. them are plausible move generators (u in SJ/X and AHI. Soretire gsamratioa is tnccrhaved vith testing (ss in HYCIM, SU/X. and AX). fn o*e caee, all generation precedee testing CDESDRAL). One case (~A-DENDRAL) is mixed, vith some tasting caking place during generation. some after. Test also shows great variecy. There are ;;;`a tests (NXIN: "15 the organism aerobic?"; : "H5e a spectral line appeared at position P?") Some teecs are complex heuristic evaluations (AU: "Is the new concept `interesting'?.`; MOWEN: "Vi 11 the reaction actually take place?") Someclmes a complex test can involve feedback to modify the object being tested (as in MRA- DENDRAL 1. `fbe evidence from our came studlea supports the aeeer'tioa by Newell and Simon chat generation- and-test is s lav of our science (New11 and Simon. 1976). 4.2 Situation - > Action rules Situatioo 0, Action rules are u4ed to repree4nt experts' knoulcdga in all of the ease reudies. Alwaye the situation part fndicatee the specific conditions under which the rule is relevant. The action part can be simple (KYCIN: conclude preeence of particular organism; DENDRAL: conclude break of particular bond). Or it can be quite complex WOLCZN: an experienclal procedure). The overriding consideration in making desigu choices is that the rule form chosen be able to represent clearly and directly what the expert wishes to upress about the domain. As illustrated. this may neceesitace a wide variation in rule eyntax snd samantics. From a study of all the projects, a regularity emerges. A salient feature of the Situation -> Action rule technique for representing expert's knowledge is the modularfty of the kwvladge base, with the concomitant flexibility to add or change the kmowledge easily am the experts' understanding of the domain changes. Nere too one must be pragmatic, not doctrinaire. A technique such am this can not represent modularity of knowledge Ff that modularity does not exist in the domain. The virtue of this techofque is that it serves as a frsmeuork for discovering vhat modularity exists in the domalu. Discovery may feed beck to ceuee reformulation of the knowledge toward greater modularity. Finally, our csee studies have shovn that strategy knovledge can be captured in rule form. x0 TEmEsL4s, the matarules capture knowledge ol hov CO deploy domain knovledge; in SU/X, the strategy rdee rapreaenc the experts' knovledge of "bov to awlyfe" Lo the domain. 4.3 Explamation Nest of the programs, and 111 of the more recent ones o ???? available an explanation capability for the user, bc he end-user or system developer. Our focus on end-users in applications domains hoe forced attention to human eagineering issues. in particular makfag the need for the explanation capability lmperatfve. The Intelligent Agent viewpoint seems to us to demand that the agent be able to explain its activity; else the question arises of rho is in 281 Appendix C conLro1 of the apent's activity. The issue is not a...7deaic or philosophical. xc is an snglncering f ssue that has arisen in medical and military appl icocions of intelligent agencri. and will ppJern future acceptance of AI wrk In app!icotions areas. And on the phtloaaphical level one might aVeIl argue that there is a moral impentiue to provide accur*tc explanationa to end-users whose lntuitiona about our systems are almost nil. F1Cl*Lly. the o xpJ anation capability is needed as part of the concerted attack on rhe knowLedge acquistcioa problem. Explanation of the reasoning process is cenrral to the interactive transfer of expertise to the knowledge base, and ic is our most powerful tool for the debugging of the knowledge base. 5 EPILOGUE What we have learned about knovledge en,gineerlng goes beyond what is discernible in the behavior of our case study programs. In the next paper of this two-part serfea. I will raise and discuss many of the general concerns of knowledge rn*ineers, including these: Uhac constitutes an "application" of AI tee hniques? There is a difference betvcen a serious application and an application-flavored toy problem. What are some criteria for the judicious selection of an application of AI techniques? What arc some applications areas wrrhy of serious attention by knowledge engineers? For example. applications to science. to signal interpretation. and to human interaction oith complex systems. HOW to find and fascinate an Expert. The background and prfor training of the expert. The level of commitment that can be elicited. Designing syutems that "think the way I do." Sustaining attention by quick feedback and incremental progress. Focusing attention to data and specif fc problems. Providing oays to express uncertainty of expert knovledge. The side benefirs to the expert of his investment in the knovled~e engineering activity. Gaining consensus among experts about the knowledge of a domain. The consenmr, may be a more valuable outcome of the knovledge engineering effort than the building of the program. Problems faced by knowledge o *??*???? today: The lack of adequate and appropriate computer hardvare. The difficulty of export of systems to end-users, caused by the lack of properly- sized and -packaged combinations of hardware and software The chronic absence of cumulation of AI techniques in the form of software packages that can achieve vide use. The shortage of trained knowledge engineers. The difficulty of obtaining and sustaining funding for jncerestiog knovlcdge engineering projects. 6 ACKNOULEDCHENT The wrk reported herein has received Jong- term support from the Defense Advanced Research Projects Agency. The National Instftutes of llealch has supported DENDRAL, PIETA-DEHDRAL, and the SUWX-AIM computer facility on which ve compute. The National Science Foundation has supported research on CRYSALXS and MLCEN. The Cursau of Health Sciences Research and Evaluation has supported research on BYCTN. I am grateful to these agencies for their continuing support of our wrk. 2 wish to axprcss my deep admiration and thanks to the faculty, staff and students of rhe Heuristic Programming Project, and to our collaborators in the various worldly arcs. for the creativity and dedication that has made our work exciting and fruitful. My particular thanks for aaeiatance in preparing thfs manuscrfpc go to Randy Davis, Penny Nii, Reid Smith. and Carolyn Taynai. 202 7 REFEREHCES General FetRenbaum, E.A. "Artificial Intellfgencc Research: What is it? Vhat has it achfeved? kbere is it going?," invited paper, Symporium on Artificial Intelligence. Canberra, Australia. 1974. Caldscetn, I. and S. Papert. "Artificial Intelligence. hww3e. and the Study of Knowledge." Coqnicive Science, Vol.1. No.1. 1977. Gregory, R., "on How so Little Information Controls so Yuch Behavior ," Bionics Research Report No. 1. Machine Intelligence Department, Untversity of Edinburgh, 1968. Newell, A. and H.A. Simon, Human Problem Solving. Prentice-Hall, 1972. NevelI. A. and H.A. Simon. "Computer Science as Empirical Inquiry: Symbols-and Search,*' Corn AW -A' 19. 3. ?larch, 1976. DENURAL and !!ETA-DENDRAL FelRenbaum. E.A.. Buchanan, B.C. aad J. Lederberg, "On Generality and Problem Solving: a Case Study Usfng the DENDRAL Program." Machine InteIligence a. Edinburgh Univ. Press, 1971. Buchanan, B.C., Duffield. A.H. and A.V. Robertson, "An Application of Artificial Iotelligence to the Interpretation of Mass Spectra." HaSS Spectrometry Technioues and Applications, C.x Hilne. Ed.. John Uiley b Sons, Inc., p. 121. 1971. ?!lchie. D. and B.C. Buchanan, "Current Status of the Heurfstic DENDRAL Program for Applying Artificial Intelligence to the Interpretation of bss Spectra," Computers for Spectroscopy, R.A.C. Carrington, ed., London: Adam Hilger, 1970. Buchanan. B.C., "Scientific Theory Formation by Computer," Nato Advanced Study Institutes Series, Series E: Applied Science, l&:515, Noordboff- Leyden, 1976. Buchanan. B.C.. Smith. D-8.. White, U-C., Critter, R-J., Feigenbaum. LA.. Lederberg. J. and C. Djerassi. "Applicattona of Artificial Intelligence for Chemical Inference XXII. Automatic Rule Formation in Mase Spectrometry by Means of the Heta-DENDRAL Program." Journal of the ACS. 98:6168. 1976. HYC!N Short1 iffe, E. Computer-based Medical Consul- tations: HYCIN, Nev York, Elsevier. 1976. Davis. R. . Buchanan, B.C. and E.H. Shortliffe. "Production Rules as a Representation for a Knowledge-Based Consultation Program," Artificial IntelIiRence. 8. 1, February, 1977. Shortlfffe, E.H. and B.C. Buchanan, "A Model of Inexact Reasoning in Medicine." HathematicaI Biosciences, 23:351, 1975. TEIRESIAS Davis, R., "Applicationa of Heta Level Knowledge to the Construction, Maintenance and Use of Large Knowledge Bases," Memo HPP-76-7, Stanford Computer Science Department, Stanford, CA. 1976. Davis. R.. "Interactive Transfer of Expertise I: Acquisition of Nev Inference Rules," these Proceedings. Davis, R. and B.C. Buchanan, "Metn-Level Knowledge: Overview and Applications." these Proceedings. Nit, H.P. and E.A. Feigenbaum, "Rule Based Understanding of signals." Proceedtnqe of the Conference on Pattern~Directed Inference Systems, 1977 (forthcoming), also Hemo HPP-77-7, Stanford Computer Science Department, Stanford, CA, 1977. Lenat, D.. "Au: An Artificial Intelligence Approach to Discovery fn Mathematics ae Heuristic Search," Memo BPP-76-8. Stanford Computer Science Department. Stanford, CA, 1976. MOLCEN liartIn, N., Friedland. ?? o king, J.. and tit Stefik, "Knowledge Base Management for Experiment Planning tn holecular Genetics." these Proceedings. CRYSALIS Engelmore, R. and H.P. Nii, "A Knowledge-Based System for the Interpretation of Protein X-Ray Crystallographic Data," Memo HPP-77-t. Department of Computer Science, Stanford, CA. 1977. References 1. 2. 3. 4. 5. 6. 7. a. 9 _ . Adams, J.B. A probability model of medical reasoning and the MYCIN model. Xath. Biosci. 32,177-186 (1976). Anderson, R.H., Gallegos, M., Gillogly, J.J., Greenberg, R o , and Villanueva, P.. RITA Reference Elanual, Report R- 1808~mm, The Rand Corporation, Santa Monica, CA., September 1977. Bennett J.S., Creary L.G., Engelmore R-E., Melosh R.B., A Knowledge-based Consultant for structural analysis, forthcoming. Bleich, H.L. The computer as a consultant. New Eng. J. Xed. 284,141-147 (1971). Blum, Robert L. and FJiederhold, Gio: Inferring Knowledge from Clinical Data Banks Utilizing Techniques from Artificial Intelligence. "Proc. 2nd Annual Symp. on Comp. Applic. in Med. Care," pp. 303-307, IEEE, Washington D.C., Nov. 5-9, 1978. Bobrow D-G., Winograd T., An Overview of KRL, a Knowledge Representation Language, Cognitive Science 1:l (1077). Eobrow D-G., Winograd T., Experience with KRL-0, One cycle of a knowledge representation language, Proceedings of the 5th International Joint Conference on Artificial Intelligence, Cambridge, Mass. (August 1977). Bonnet A., BAOBAB, A parser for a rule-based system using a semantic grammar, Technical Report HPP-78-10, Heuristic Programming Project, Stanford California (September 1978). Erown, J.S., Steps toward a Theoretic Foundation for Complex, Knowledge-Based CAI. BBN No. 3135. 10. Erown, J.S., Collins, A., and Earris, G. 11. 12. 13. 14. 15. 16. 17. la. 19. 20. Artificial Intelligence and Learning Strategies. To appear in Learning Strategies (ed. Harry O'Neil), Academic Press, New York, 1978. Buchanan, Bruce G. and Feigenbaum, Fdward A. DEPDRAL and Meta-DENDRAL: Their Applications Dimension, Artificial Intelligence, 11:5 (1978). Clancey, W. "The Structure of a Case Method Dialogue", to appear in Int. Jnl. of Man Machine Studies, Fall, 197&. Colby, K.M., Weber, S., and Hilf, F. Artificial paranoia. Artificial Intelligence 2,1-25 (1971). Croft, D.J. Is computerized diagnosis possible? Comput. Biomed. Res. 5,351-367 (1972). Davis, R. Applications Of Meta Level the Construction, Maintenance, Knowledge To And Use Of Large Knowledge Bases. Doctoral dissertation, Stanford University ; Memo HPP-76-7, Stanford Computer Science Department, 1976. Davis, R. and King, J. An overview of production systems. Machine Intelligence 8: Machine Representations of Knowledge (eds. E.W. Elcock and D. Michie), John Wiley, April 1977. de Dombal, F.T., Leaper, D.J., Staniland, J.F., McCann, A.P., Horrocks, J.C. Computer aided diagnosis of acute abdominal pain. Brit. Pled. J. 11,9-13 (lQ72). Duda, R. O., Hart, P., Nilsson, N. & Sutherland, G. "Semantic network representations in rule-based inference systems", in Pattern Directed Inference Systems cede.. Waterman and Hayes-Roth), Academic Press,New York, 1078. Engelmore R.S., Nii H-P., A knowledge-based system for the interpretation of protein x-ray crystallographic data, Heuristic Programming Project Memo HPP-77-2 (February 1977). Erman L.D., Lesser V.R., A multi-level organization for problem solving using many, diverse, cooperating sources of 21. 22. 23. 24. Friedman, R.B. and Gustafson, D.H. Computers in clinical medicine: a critical review. Comput. Biomed. Res. 10,199-204 (1977). 25. Fries, J. Time-oriented patient records and a computer data-bank. J. Amer. Med. Assoc. 222,1536-1542 (1973). 26. Goldstein, I., Papert, S. Artificial Intelligence, Language, and study of knowledge. Cognitive Science 1:l (1977). 27. 28. 20. knowledge, in Proceedings of the 4th International Joint Conference on Artificial Intelligence, Tbilsi, Russia (1075). Fagan L.Y., Ventilator Manager: A program to provide on- line consultative advice in the intensive care unit, Heuristic Programming Project Fern0 HPP-78-16 (Working Paper) , Computer Science Department, Stanford University (September 1978). Feigenbaum E.A., The art of artificial intelligence: I. Themes and case studies of knowledge engineering, Proceedings of the 5th International Joint Conference on Artificial Intelligence, Cambridge, Nass. (August 1977). Feitelson J., Steflk M., A case study of the reasoning in a genetics experiment, Heuristic Programming Project Report 7748 (working paper) ,Conputer Science Department, Stanford University (April 1977). Gorry, G-A. and Barnett, G.O. Experience with a model of sequential diagnosis. Comput. Biomed. Res. 1,49@-507 (1968). Gorry, G.A., Kassirer, J.P., Essig, A., and Schwartz, W.B. Decision analysis as the basis for computer-aided management of acute renal failure. Amer. J. Med. 55,473- 484 (1973). Gorry, G.A., Silverman, H., and Pauker, S.G. Capturing clinical expertise: a computer program that considers clinical responses to digitalis. Amer. J. Med. 64,452-460 (1978). 206 30. 31. 3' b. 33. 34. 3s. 36. 37. 38. 39. Green, P.F., Wolf, A.R., Chomsky, c., and Laughery, K. BASEBALL: An automatic question-answerer. In Computers and Thought (eds. E.A. Feigenbaum and J. Feldman), pp. 207-216, !rcGraw-Hill, San Francisco,l963. Harless, W.G., Prennon, G.G., Marxer, J.J., Foot, J.A., Wilson, L.L., and Miller, G.E. CASE - 2 natural language computer model. Comput. Biol. Med. 3,227-246 (1373). Hart, P.E. Progress on a computer-based consultant. AI Technical Note 99, Stanford Research Institute, Menlo Park, CA., January 1975. Hayes-Roth F., Lesser V.R., Focus of attention in the HEARSAY-II speech understanding system, Proceedings of the 5th International Joint Conference on Artificial Intelligence, Cambridge, Mass. (August 1977). Heiser J.F., Brooks R.E., Ballard J.P., (IProgress Peport: A Computerized Psychopharmacology Advisor", Proceedings of the 11th Colegium Internationale NeuroPsychonharmacoloeicum. Vienna, 1978. Feiser, J.F. and Brooks, R.E. A computerized psychopharmacology advisor. Proceedings of the 4th Annual AIM Workshop, Rutgers University, June 1978. Hoffer, E.P. Experience with the use of computer simulation models in medical education. Comput. Biol. Med. 3,269-279 (1973). Kunz J.C., Fallat R-J., McClung D-H., &born J-J., Votteri B.A., Nii H.P., Aikins J-S., Fagan L-M., Feigenbaum E.A., A physiological rule based system for interpreting pulmonary function test results, Heuristic Programming Project Memo HPP-78-19, Stanford University, 1076. Lenat D.B., The ubiquity of discovery, Artificial Intelligence 9:3 (1977). Lowerre B-T., The HARPY speech recognition systeu, Doctoral thesis, Department of Computer Science, Carnegie-Mellon University (April 1976). 207 4c. 41. 42. 43. 44. 4s. Pauker, S.G., Gorry, G-A., Kassirer, J.P., and Schwartz, W-B. Towards the simulation of clinical cognition: taking a present illness by computer. Amer. J. Med. 60,981-996 (1976). 46. Pople, H.E., Myers, J-D., Miller, R.A. DIALOG (INTERNIST): 2 model of diagnostic logic for internal medicine. Proceedings of the 4th International Joint Conference on Artificial Intelligence, pp. 849-855, Tbilisi, Russia, 197s. 47. 48. Martin N., Friedland P., King J., Stefik M., Knowledge Base Management for Experiment Planning, Proceedings of the 5th International Joint Conference on Artificial Intelligence, Cambridge, Mass. (August 1977). Mesel, E., Wirtshcafter, D.D., Carpenter, J.T., Durant, J.R., Henke, C., and Gray, E.A. Clfncial Algorithms for Cancer Chemotherapy - Systems for Community-Based Consultant-Extenders and Oncology Centers. Meth. Inform. Med. 15:3, 168-73 (1976). Minsky M., A framework for representing knowledge, in The psychology of computer vision, (ed. P. Winston), McGraw- Hill, New York (1975). Nii H.P., Feigenbaum E.A., Rule-based understanding of signals in Pattern-Directed Inference Sys terns (eds. Waterman and Hayes-Roth), Academic Press, f!ew York, 1978. Osborn, J.J., Kunz, J.C., and Fagan, L.M. PUFF/VM: interpretation of physiological measurements in the pulmonary function laboratory and the intensive care unit. Proceedings of the 4th Annual AIM Workshop, Rutgers University, June 1978. Quillian, M.R. Semantic memory. In Semantic Information Processing (ed. M. Minsky), pp. 227-270, M.I.T. Press, Cambridge, MA., 1968. Scott, A.C., Clancey, W.J., Davis, R., and Shortliffe, E.H. Explanation capabilities of knowledge-based production systems. Amer. J. Computational Linguistics, Microfiche 62, 1977. 49. so. 51. 52. 53. 54. 5s. 56. 57. 58. 59. Shortliffe, E.H. and Buchanan, E.G. A model of inexact reasoning in medicine. Math. Biosci. 23,3Sl-379 (1975). Shortliffe, E.H., Davis, R., Axline, S.G., Buchanan, B.G., Green, C-C., and Cohen, S.N. Computer-based ccnsultations in clinical therapeutics: explanation and rule-acquisition capabilities of the MYCIN system. Comput. Biomed. Res. 8,303-320 (1975). Shortliffe, E.H. Computer-Based Medical Consultations: MYCIN. ElsevierjNorth Holland, New York, 1976. Stefik M., An examination of a frame-structured representation system, Stanford Heuristic Programming Project Memo HPP-78-13 (working paper) (September 1978). Stefik M., Inferring DNA structures from segmentation data, Artificial Intelligence 11 (1978). Van Melle, W. Would you like advice on another horn? MYCIN project internal working paper, 8tanford University, Stanford, California, December 1974. Warner, H.R., Toronto, A.F., and Veasy, L.G. Experience with Bayes' theorem for computer diagnosis of congenital heart disease. Arms. N.Y. Acad. Sci. 115,558-567 (1964). Weinberg, A.D. CA1 at the Ohio State University College of Medicine. Comput. Biol. Med. 3,299-305 (1973). Weiss, s., Kulikowski, C. A., and Safir, A. Glaucoma consultation by computer. Comput. Biol. Med. 8,2S-4P (1978). Weyl, S., Fries, J., Wiederhold, G., and Germano, F. A modular self-describing clinical databank system. Comput. Biomed. Res. 8,279-293 (1975). Woods, W.A. et al. The lunar sciences natural language information system: final report, BBN Report 2378, Bolt, Beranek and Newman, Cambridge, MA., June 1972. 209 6C. Wooster, H. and Lewis, J.F. Distribution of computer- assisted instruction materials in biomedicine through the Lister Hill Center Experimental Network. Comput. Biol. Med. 3,319-323 (1973). 61. Wortman, P.M. Medical diagnosis: an information processing approach. Comput. Biomed. Res. 5,325-328 (1972). 62. Yu, V.L., Buchanan, B-G., Shortliffe, E.H., Wraith, S-M., Davis, R., Scott, A.C., and Cohen, S.N. Evaluating the performance of a computer-based consultant. To appear in Computer Programs in Eiomedicine, 1978. 63. Yu, V.L., Fagan, L.M., Wraith, S.M., Clancey, W.J., Scott, A.C., Hannigan, J., Blum, R.L., Buchanan, B.G., and Cohen, S-N. Computer-based consultation in antimicrobial selection - a comparative evaluation by experts. Submitted for publication, September 1978. 210 The appropriate programmatic and administrative personnel of each institution involved in this grant application are aware of the NIH consortium grant policy and are prepared to establish the necessary inter-institutional agreement(s) consistent with that policy. Page 211