RESOURCE-RELATED RESEARCH
COMPUTERS AND CHEMISTRY
(RR-00612 RENEWAL APPLICATION)

Submitted to the
BIOTECHNOLOGY RESOURCES BRANCH
OF THE
NATIONAL INSTITUTES OF HEALTH

December, 1973

School of Medicine
Stanford University

DEPARTMENT OF HEALTH, EDUCATION, AND WELFARE
PUBLIC HEALTH SERVICE
GRANT APPLICATION

SECTION I

TO BE COMPLETED BY PRINCIPAL INVESTIGATOR (Items 1 through 7 and 15A)

1. TITLE OF PROPOSAL (Do not exceed 53 typewriter spaces): Resource Related Research - Computers and Chemistry (RR-00612 renewal)
2. PRINCIPAL INVESTIGATOR
   2A. NAME (Last, First, Initial): Djerassi, Carl
   2B. TITLE OF POSITION: Professor of Chemistry
   2C. MAILING ADDRESS (Street, City, State, Zip Code): Department of Chemistry, Stanford University, Stanford, California 94305
   (See instructions): Department of Chemistry
   2F. MAJOR SUBDIVISION (See instructions): School of Humanities and Sciences
3. DATES OF ENTIRE PROPOSED PROJECT PERIOD (This application): FROM [blank] THROUGH [blank]
4. TOTAL FOR ENTIRE PROPOSED PROJECT PERIOD: $1,350,795.00
5. TOTAL DIRECT COST, FIRST 12-MONTH PERIOD: $276,197.00
PERFORMANCE SITE(S) (See instructions): Department of Genetics, Department of Chemistry, and Department of Computer Science, Stanford University
Research Involving Human Subjects (See instructions): A. NO / B. YES - Approved / C. YES - Pending Review
Inventions (Renewal Applicants Only - See instructions): A. NO / B. YES - Not previously reported / C. YES - Previously reported

TO BE COMPLETED BY RESPONSIBLE ADMINISTRATIVE AUTHORITY (Items 8 through 13 and 15B)

APPLICANT ORGANIZATION(S) (See instructions): Stanford University, Stanford, California 94305
IRS No. 94-1156365
Congressional District No. 17
TYPE OF ORGANIZATION (Check applicable item): FEDERAL / STATE / LOCAL / OTHER (Specify): Private, non-profit University
NAME, TITLE, ADDRESS, AND TELEPHONE NUMBER OF OFFICIAL IN BUSINESS OFFICE WHO SHOULD ALSO BE NOTIFIED IF AN AWARD IS MADE: K. D. Creighton, Deputy Vice Pres. for Business & Finance, Stanford University, Stanford, California 94305
10. NAME, TITLE, AND TELEPHONE NUMBER OF OFFICIAL(S) SIGNING FOR APPLICANT ORGANIZATION(S)
    Telephone Number: (415) 321-2300, x2551
13. IDENTIFY ORGANIZATIONAL COMPONENT TO RECEIVE CREDIT FOR INSTITUTIONAL GRANT PURPOSES (See instructions): School of Humanities and Sciences
    C/O Sponsored Projects Office
    Telephone Number(s): (415) 321-2300, x2883
14. ENTITY NUMBER (Formerly PHS Account Number): 458210

CERTIFICATION AND ACCEPTANCE. We, the undersigned, certify that the statements herein are true and complete to the best of our knowledge and accept, as to any grant awarded, the obligation to comply with Public Health Service terms and conditions in effect at the time of the award.
(Signatures required on original copy only. Use ink. "Per" signatures not acceptable.)
SIGNATURE(S) OF PERSON(S) NAMED IN ITEM 10; DATE

NIH 398 (FORMERLY PHS 398) Rev. 1/73

RESEARCH OBJECTIVES

NAME AND ADDRESS OF APPLICANT ORGANIZATION: Stanford University, Stanford, California 94305

NAME, SOCIAL SECURITY NUMBER, OFFICIAL TITLE, AND DEPARTMENT OF ALL PROFESSIONAL PERSONNEL ENGAGED ON [...]
[...] Professor of Chemistry, Department of Chemistry; Joshua [...] Professor of Genetics, Department of Genetics; Edward Feigenbaum [...] Computer Science, Department of Computer Science; Bruce [...] Research Computer Scientist, Department of Computer Science; [...] Research Associate, Department of Genetics; Dennis Smith, [...] Department of Genetics; Natesa [...] Department of Computer Science; Harold Brown [...] Associate, Department of Computer Science; Geoff Dromey, SS# at a later date, Department of Computer Science.

TITLE OF PROJECT: Resource-Related Research -- Computers and Chemistry

USE THIS SPACE TO ABSTRACT YOUR PROPOSED RESEARCH. OUTLINE OBJECTIVES AND METHODS. UNDERSCORE THE KEY WORDS (NOT TO EXCEED 10) IN YOUR ABSTRACT.
The objectives of this research program are the development of innovative computer and biochemical analysis techniques for application in medical research and closely related aspects of investigative patient care. We will apply the unique analytical capabilities of gas chromatography/mass spectrometry (GC/MS), with the assistance of data-interpreting computer programs utilizing artificial intelligence techniques, to investigate the chemical constituents of human body fluids in a variety of clinical contexts. Specific subtasks of this program include: 1) the application of artificial intelligence (AI) techniques to programs capable of interpreting mass spectra from basic principles as well as extending mass spectral theory by analysis of solved spectrum-structure examples; 2) the extension of GC/MS data systems to provide stand-alone capabilities for collecting low and high resolution mass spectral and metastable ion data; 3) the application of GC/MS and AI techniques to the biomolecular structure elucidation problems of a large number of collaborators; and 4) the extension of artificial intelligence techniques to an interactive system for computer assisted structure elucidation based on a variety of data.

SECTION II - PRIVILEGED COMMUNICATION
DETAILED BUDGET FOR FIRST 12-MONTH PERIOD
FROM: 5/1/74   THROUGH: [blank]

PERSONNEL (salary / fringe benefits / total): 163,935 / 28,962 / 192,897
CONSULTANT COSTS: -0-
EQUIPMENT
  Equipment Purchase (First Year Items Only):
    DEC GT-40 Display Terminal: 13,400
    PDP 11/20 Upgrade: 34,000
  Equipment Maintenance:
    PDP-11 (DEC Contract): 4,200
    MAT-711 (Parts, etc.): 6,292
SUPPLIES
  Electronics Supplies: 4,400
  GC Supplies - Columns, Liquid Nitrogen, stock, etc.
  Data Recording Media
PATIENT COSTS (See instructions): -0-
ALTERATIONS AND RENOVATIONS: -0-
OTHER EXPENSES (Itemize)
  Publications, telephone, office supplies, postage: 4,000
  Computer Terminal Lease (4): [amount illegible]
  Computer Usage - 370/158 (First Year Item Only): 5,000
TOTAL DIRECT COST (Enter on Page 1, Item 5)
INDIRECT COST (See Instructions): 56% S&W*; 47% TDC*
  DATE OF DHEW AGREEMENT: June 26, 1973
  WAIVED / UNDER NEGOTIATION WITH:
  *IF THIS IS A SPECIAL RATE (e.g. off-site), SO INDICATE.

DETAILED SALARY DATA
NIH GRANT #RR-00612, 5/1/74-4/31/75

                              % Effort    Salary    Fringe Benefits     Total
PRINCIPAL INVESTIGATORS:
  C. Djerassi                     10        -0-          -0-             -0-
  J. Lederberg                    10        -0-          -0-             -0-
  E. Feigenbaum                   10       2,910          514           3,424
RESEARCH ASSOCIATES:
  B. Buchanan (1)                 50       7,000        1,237           8,237
  A. Duffield                     25       6,195        1,094           7,289
  D. Smith                       100      16,200        2,862          19,062
  N. Sridharan                   100      16,050        2,835          18,885
  H. Brown                       100      16,200        2,862          19,062
  G. Dromey                      100      15,500        2,738          18,238
PROGRAMMERS:
  W. White                       100      14,400        2,545          16,945
  R. Tucker                      100      14,100        2,491          16,591
SENIOR RESEARCH ASSISTANT:
  A. Wegmann                     100      15,000        2,650          17,650
ELECTRONICS ENGINEER:
  N. Veizades                     60      11,670        2,062          13,732
GLASS BLOWER/MACHINIST:
  E. Steed                        25       4,410          779           5,189
RESEARCH ASSISTANTS:
  L. Masinter                    100       5,070          895           5,965
  M. Stefik                      100       4,915          868           5,783
  To Be Appointed                100       4,915          868           5,783
SECRETARIAL SUPPORT:
  K. Wharton                     100       9,400        1,662          11,062
TOTAL:                                  $163,935      $28,962        $192,897

(1) Dr. Buchanan's salary charges do not begin until 9/1/74, at which time his NIH Research Career Development Award expires.
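The fringe benefit column in the salary data above appears consistent with the staff benefit rates quoted in the budget justification (17% for 5/74-8/74 and 18% for 9/74-4/75). The Python sketch below is an illustrative check only and is not part of the original application. It assumes the two rates are weighted by months over the 12-month budget year; the names `blended_rate`, `fringe` and `TABLE_ROWS` are hypothetical. Under that assumption the computed figures agree with the listed ones exactly for most rows and to within a dollar for the rest.

```python
from fractions import Fraction

# Assumption: 4 months (5/74-8/74) at 17% and 8 months (9/74-4/75) at 18%,
# weighted by months over the 12-month budget year.
RATE_EARLY, MONTHS_EARLY = Fraction(17, 100), 4
RATE_LATE, MONTHS_LATE = Fraction(18, 100), 8


def blended_rate() -> Fraction:
    """Month-weighted staff benefit rate for the first budget year."""
    total_months = MONTHS_EARLY + MONTHS_LATE
    return (RATE_EARLY * MONTHS_EARLY + RATE_LATE * MONTHS_LATE) / total_months


def fringe(salary: int) -> int:
    """Fringe benefits for a salary, rounded to the nearest dollar."""
    return round(salary * blended_rate())


# (salary, listed fringe benefits) for every non-zero row of the salary table.
TABLE_ROWS = [
    (2910, 514), (7000, 1237), (6195, 1094), (16200, 2862),
    (16050, 2835), (16200, 2862), (15500, 2738), (14400, 2545),
    (14100, 2491), (15000, 2650), (11670, 2062), (4410, 779),
    (5070, 895), (4915, 868), (4915, 868), (9400, 1662),
]

if __name__ == "__main__":
    print(f"blended rate = {float(blended_rate()):.4%}")
    for salary, listed in TABLE_ROWS:
        computed = fringe(salary)
        print(f"{salary:>6}  listed {listed:>5}  computed {computed:>5}  "
              f"diff {computed - listed:+d}")
```

The exact-fraction arithmetic avoids floating-point noise when comparing against the printed dollar amounts; the small remaining differences presumably reflect how the original figures were rounded.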
SECTION II - PRIVILEGED COMMUNICATION
BUDGET ESTIMATES FOR ALL YEARS OF SUPPORT REQUESTED FROM PUBLIC HEALTH SERVICE
DIRECT COSTS ONLY (Omit Cents)

DESCRIPTION                    1ST PERIOD*   2ND YEAR   3RD YEAR   4TH YEAR   5TH YEAR       TOTAL
PERSONNEL COSTS                   192,897     210,611    225,129    240,630    257,383   1,126,650
CONSULTANT COSTS
  (Include fees, travel, etc.)      -0-         -0-        -0-        -0-        -0-         -0-
EQUIPMENT                          58,100      11,770     12,947     14,241     15,665     112,723
SUPPLIES                            9,600       6,920      7,612      8,370      9,207      41,709
TRAVEL - DOMESTIC                   1,200       1,320      1,452      1,597      1,757       7,326
TRAVEL - FOREIGN                    -0-         -0-        -0-        -0-        -0-         -0-
PATIENT COSTS                       -0-         -0-        -0-        -0-        -0-         -0-
ALTERATIONS AND RENOVATIONS         -0-         -0-        -0-        -0-        -0-         -0-
OTHER EXPENSES                     14,400      10,340     11,374     12,511     13,762      62,387
TOTAL DIRECT COSTS                276,197     240,961    258,514    277,349    297,774   1,350,795

* Same as detailed budget. The 2nd through 5th years are additional years of support requested (this application only).

TOTAL FOR ENTIRE PROPOSED PROJECT PERIOD (Enter on Page 1, Item 4): $1,350,795

REMARKS: Justify all costs for the first year for which the need may not be obvious. For future years, justify equipment costs, as well as any significant increases in any other category. If a recurring annual increase in personnel costs is requested, give percentage. (Use continuation page if needed.)

See following pages for budget justification.

Budget Justification

The availability of existing equipment - including the mass spectrometer and SUMEX computer - avoids the need for requesting funds for major laboratory items and substantial computing costs. Thus, the major expense in the resulting budget is for personnel. We feel that the personnel listed here are necessary to carry out the research, as justified below. Recurring costs are about $227,000 per year. First year expenditures are higher to provide the instrumentation necessary for mass spectrometry service in the first year. We are requesting funds for five years to coincide with the funding of the AIM-SUMEX
resource, to which we hope to make significant contributions.

This budget overlaps slightly with the budget for the Genetics Research Center (J. Lederberg, Principal Investigator). Dr. Alan Duffield's 25% salary budgeted here is covered by the other budget. (There, 100% of his salary is budgeted.) 10% of Ms. Annemarie Wegmann's salary is covered there (with 100% of her salary budgeted here). These are the only overlapping items. We have no official notification of Genetics Center funding; if the present proposal is successful, the Genetics Center budget will be adjusted accordingly.

In the five-year budget, salaries are increased by 6% per year and staff benefits are computed at 17% for the period 5/74-8/74, 18% for the period 9/74-4/75, and are increased 1% per year thereafter, based on current University projections. Other budget categories are increased by 10% per year to account for inflation.

Personnel:

BRUCE G. BUCHANAN
Dr. Bruce Buchanan holds an NIH Research Career Development Award to work on applications of artificial intelligence to health-related problems, including theory formation by computer. His work on those aspects of this grant is thus consistent with the Development Award. Half-time support is requested after the third year of the Development Award (starting September, 1974) to cover the contingency that the award will not be extended to the full five years. These funds will be returned if the Award is extended.

DENNIS H. SMITH
Dr. Dennis H. Smith has been a member of the DENDRAL project since July, 1971. He has been responsible for the MS and its computer support, and has been involved in the application of the AI programs to structural studies of biomedically important compounds, primarily steroids.
These responsibilities will continue in the future, with particular emphasis on providing the mass spectrometer/AI program link to the user community and its mass spectrometry and general structure elucidation needs, and in providing the necessary chemical knowledge and input for development of the computer programs and user interfaces for the proposed computer assisted structure elucidation effort.

ALAN DUFFIELD
Dr. Alan Duffield is the senior scientist in charge of the mass spectrometry facilities of the GRC. Because of his expertise in the analysis of mass spectra from various fractions of human body fluids, he will provide the link between the structure elucidation techniques of this proposal and other scientists with similar problems. The GC/HRMS facilities are also expected to provide service to the Genetics Center for high resolution analysis of compounds isolated from body fluids.

NATESA SRIDHARAN
Dr. Sridharan will be responsible for developing interface routines that allow new researchers to make use of the structure elucidation programs. We expect these routines to accept information about a research problem, in semi-formal terms, and translate it into a format the program can use. They should be complete enough so that individual researchers do not need to know about the inner workings of the programs. In addition, he will continue to help Dr. Brown and Mr. Masinter with development of the cyclic generator program. (Within a few days of this writing, Dr. Sridharan has decided to take a leave of absence. During his absence we will recruit another Research Associate to perform his duties.)

HAROLD BROWN
Dr. Harold Brown's knowledge of graph theory and combinatorial mathematics is essential to the development of the cyclic structure generator. Many problems with development and implementation of this program have required sophisticated, new mathematical solutions worked out by Dr. Brown.
For example, generating the dictionary of cyclic graphs and assembling substructures involve problems in graph theory that Dr. Brown is currently working on. Dr. Brown has submitted a proposal to the NSF to cover his salary for this research. If that grant is awarded, funds requested here for his salary will not be needed.

GEOFF DROMEY
Geoff Dromey is a chemist with strong computer science interests who has been associated with the project since September, 1973. He has become familiar with many aspects of the DENDRAL performance programs and will be expected to help outside researchers use those programs. In addition, he will be developing new programs, such as the program for molecular ion determination from mass spectra.

WILLIAM C. WHITE
Mr. William White provides high-level programming support for the theory formation programs, including helping to devise new programs in response to new research problems as well as implementing them. He wrote almost all of the LISP code for the INTSUM program, for example, and is currently responsible for the RULEGEN program.

MS. ANNEMARIE WEGMANN
Ms. Annemarie Wegmann is the Senior Research Assistant in charge of the GC/HRMS system. She was formerly head of Hewlett-Packard's Palo Alto gas chromatography applications laboratory and has been responsible for the operation of the GC/MS system since the delivery to our laboratory of the MAT-711 (November, 1971). Her technical ability is absolutely essential to the continued operation and development of the mass spectrometry facility.

Messrs. Veizades and Steed will assist part time in maintaining the GC/MS system. Mr. Veizades is an Electronics Engineer who is responsible for the electronic and mechanical systems as well as providing the necessary voltage read-out and control development for the metastable analysis data system.
Mr. Steed is a Research Engineer responsible for the system glasswork and vacuum system maintenance.

ROBERT TUCKER
Mr. Robert Tucker implements and maintains the computer programs for data acquisition and reduction of MS data. This includes translating existing PL/ACME into FORTRAN and PDP-11 assembly language. In addition, he will be responsible for improving these programs for repetitive HRMS scans, implementing the multiplet resolution algorithm and the software necessary for semi-automated collection of metastable ion data.

LARRY MASINTER
Mr. Larry Masinter, Research Assistant, will continue to work with Drs. Lederberg and Brown on the development of the cyclic structure generator. His LISP expertise has been an invaluable resource for every member of the research team.

MARK STEFIK
Mr. Mark Stefik, Research Assistant, combines two years of experience on the ACME/MS data acquisition system with a long-term commitment to computer science. He has developed interactive library search capabilities for the mass spectrometer and will continue to improve them. His knowledge of the data acquisition computer programs will be very valuable in assisting initial translation of those programs into FORTRAN (from PL/ACME code) for the extended PDP-11/20 system.

RESEARCH ASSISTANT - unnamed
We have interviewed two prospective Research Assistants, both of whom have broad chemical experience and strong computer science interests. We request funds to hire one of them to provide additional links between computer science techniques and structure elucidation problems.

One full-time secretary is necessary for the secretarial support of this number of scientists. Ms. Kathleen Wharton is now with the Computer Science group.

As discussed in the text (Section III.[subsection illegible]), in the first year we plan to augment our existing PDP-11/20 computer (4k memory) to allow its operation as a stand-alone data system.
We plan to add 15k of memory ($3,000), a floating point arithmetic unit ($7,500), an industry compatible tape drive ($9,300), a disk drive ($10,500), a low speed communications interface ($1,000), and a bootstrap loader and clock ($1,200). These devices together with state sales tax total to $34,000. The prices quoted are representations of the most cost-effective suppliers of the respective devices we have been able to locate. We will continue to review the market before implementation to maximize technical and cost performance.

As stated above, we plan to provide interface programs to provide the communication link between the users and the programs. The universal language of molecular structure is diagrammatic representation of the structures, drawn usually in two dimensions (or as two-dimensional representations of three dimensional information). Therefore, we feel that a graphics terminal such as the DEC [...]

[...] Publications

13. Thermal Fragmentation of Quinoline and Isoquinoline N-Oxides in the Ion Source of a Mass Spectrometer. Acta Chem. Scand., 26, 2423 (1972). By A. M. Duffield and O. Buchardt
14. Applications of Artificial Intelligence for Chemical Inference. VIII. An Approach to the Computer Interpretation of the High Resolution Mass Spectra of Complex Molecules. Structure Elucidation of Estrogenic Steroids. J. Amer. Chem. Soc., 94, 5962 (1972). By D. H. Smith, B. G. Buchanan, R. S. Engelmore, A. M. Duffield, A. Yeo, E. A. Feigenbaum, J. Lederberg and C. Djerassi
15. Mass Spectrometry in Structural and Stereochemical Problems. CCXIX. Identification of a Unidirectional Quadruple Hydrogen Transfer Process in 7-Phenyl-hept-3-en-2-one O-Methyl Oxime Ether. Org. Mass Spectr., [volume and page illegible] (1972). By R. J. Liedtke, Y. M. Sheikh, A. M. Duffield and C. Djerassi
16. An Automated Gas Chromatographic Analysis of Phenylalanine in Serum. Clinical Biochem., [volume illegible], 166 (1972). By E. Steed, W. Pereira, B. Halpern, M. D.
Solomon and A. M. Duffield
17. Pyrrolizidine Alkaloids. XIX. Structure of the Alkaloid Erucifoline. Coll. Czech. Chem. Commun., (1972). By P. Sedmera, A. Klasek, A. M. Duffield and F. Santavy
18. Mass Spectrometry in Structural and Stereochemical Problems. CCXXII. Delineation of Competing Fragmentation Pathways of Complex Molecules from a Study of Metastable Ion Transitions of Deuterated Derivatives. Org. Mass Spectr., 7, (1973). By D. H. Smith, A. M. Duffield and C. Djerassi
19. Chlorination Studies I. The Reaction of Aqueous Hypochlorous Acid with Cytosine. Biochem. Biophys. Res. Commun., 48, 880 (1972). By W. Patton, V. Bacon, A. M. Duffield, B. Halpern, Y. Hoyano, W. Pereira and J. Lederberg
20. A Study of the Electron Impact Fragmentation of Promazine Sulphoxide and Promazine using Specifically Deuterated Analogues. Austral. J. Chem., 26, (1973). By M. D. Solomon, R. Summons, W. Pereira and A. M. Duffield
21. Spectrometrie de Masse. VIII. Elimination d'eau Induite par Impact Electronique dans le Tetrahydro-1,2,3,4-naphtalenediol-1,2. Org. Mass Spectrom., 7 (1973). By P. Perros, J. P. Morizur, J. Kossanyi and A. M. Duffield
22. The Determination of Phenylalanine in Serum by Mass Fragmentography. Clinical Biochem., submitted for publication (1973). By W. E. Pereira, V. A. Bacon, Y. Hoyano, R. Summons and A. M. Duffield

SECTION II - PRIVILEGED COMMUNICATION
BIOGRAPHICAL SKETCH
(Give the following information for all professional personnel listed on page 3, beginning with the Principal Investigator. Use continuation pages and follow the same general format for each person.)

NAME: Dennis H. Smith
TITLE: Research Associate
BIRTHDATE (Mo., Day, Yr.): 11/12/42
PLACE OF BIRTH (City, State, Country): New York
PRESENT NATIONALITY (If non-U.S. citizen, indicate kind of visa and expiration date): USA
SEX: Male
EDUCATION (Begin with baccalaureate training and include postdoctoral)
(institution and location; degree; year conferred; scientific field)
  Massachusetts Inst.
  of Technology, Cambridge, Mass.; S.B.; 1964; Chemistry
  University of California, Berkeley, Berkeley, California; Ph.D.; 1967; Chemistry
HONORS: Alfred P. Sloan Foundation Scholarship; NASA Predoctoral Traineeship; Phi Lambda Upsilon, Sigma Xi
MAJOR RESEARCH INTEREST: Mass Spectrometry and A.I. in Chemistry
ROLE IN PROPOSED PROJECT: Research Associate
RESEARCH SUPPORT (See instructions): N/A
RESEARCH AND/OR PROFESSIONAL EXPERIENCE (Starting with present position, list training and experience relevant to area of project. List all or most representative publications. Do not exceed 3 pages for each individual.)
  1971-Present  Research Associate, Stanford University, Stanford, Ca.
  1970-1971     Visiting Scientist, University of Bristol, Bristol, England
  1967-1970     Assistant Research Chemist, University of Calif. at Berkeley, Berkeley, Ca.
  1965-1967     NASA Pre-Doctoral Traineeship, University of Calif. at Berkeley, Berkeley, Ca.
Publications: See attached list.

PUBLICATIONS: D. H. SMITH

1. H. G. Langer, R. S. Gohlke and D. H. Smith, "Mass Spectrometric Differential Thermal Analysis," Anal. Chem., 37, 433 (1965).
2. S. M. Kupchan, J. M. Cassady, J. E. Kelsey, H. K. Schnoes, D. H. Smith and A. L. Burlingame, "Structural Elucidation and High Resolution Mass Spectrometry of Gaillardin, a New Cytotoxic Sesquiterpene Lactone," J. Amer. Chem. Soc., 88, 5292 (1966).
3. D. H. Smith, Ph.D. Thesis, "High Resolution Mass Spectrometry: Techniques and Applications to Molecular Structure Problems," Dept. of Chemistry, University of California, Berkeley, California (1967).
4. H. K. Schnoes, D. H. Smith, A. L. Burlingame, P. W. Jeffs and W. Döpke, "Mass Spectra of Amaryllidaceae Alkaloids: The Lycorenine Series," Tetrahedron, 24, 2825 (1968).
5. A. L. Burlingame, D. H. Smith and R. W. Olsen, "High Resolution Mass Spectrometry in Molecular Structure Studies, XIV.
Real-time Data Acquisition, Processing and Display of High Resolution Mass Spectral Data," Anal. Chem., 40, 13 (1968).
6. A. L. Burlingame and D. H. Smith, "High Resolution Mass Spectrometry in Molecular Structure Studies II. Automated Heteroatomic Plotting as an Aid to the Presentation and Interpretation of High Resolution Mass Spectral Data," Tetrahedron, 24, 5749 (1968).
7. W. J. Richter, B. R. Simoneit, D. H. Smith and A. L. Burlingame, "Detection and Identification of Oxocarboxylic and Dicarboxylic Acids in Complex Mixtures by Reductive Silylation and Computer-Aided Analysis of High Resolution Mass Spectral Data," Anal. Chem., 41, 1392 (1969).
8. The Lunar Sample Preliminary Examination Team, "Preliminary Examination of Lunar Samples from Apollo 11," Science, 165, 1211 (1969).
9. S. M. Kupchan, W. K. Anderson, P. Bollinger, R. W. Doskotch, R. M. Smith, J. A. Saenz Renauld, H. K. Schnoes, A. L. Burlingame and D. H. Smith, "Tumor Inhibitors, XXXIX. Active Principles of Acnistus arborescens. Isolation and Structural and Spectral Studies of Withaferin A and Withacnistin," J. Org. Chem., 34, 3858 (1969).
10. A. L. Burlingame, D. H. Smith, T. O. Merren and R. W. Olsen, "Real-time High Resolution Mass Spectrometry," in Computers in Analytical Chemistry (vol. 4 in Progress in Analytical Chemistry series), C. H. Orr and J. Norris, Eds., Plenum Press, New York, 1970, pp. 17-38.
11. The Lunar Sample Preliminary Examination Team, "Preliminary Examination of Lunar Samples from Apollo 12," Science, 167, 1325 (1970).
12. D. H. Smith, "Mass Spectrometry," Chapter X in Guide to Modern Methods of Instrumental Analysis, T. M. Gouw, Ed., Wiley-Interscience, New York, 1972.
13. D. H. Smith, R. W. Olsen, F. C. Walls and A. L. Burlingame, "Real-time Mass Spectrometry: LOGOS--A Generalized Mass Spectrometry Computer System for High and Low Resolution, GC/MS and Closed-Loop Applications," Anal. Chem., 43, 1796 (1971).
14. A.
L. Burlingame, J. S. Hauser, B. R. Simoneit, D. H. Smith, K. Biemann, N. Mancuso, R. Murphy, D. A. Flory and M. A. Reynolds, "Preliminary Organic Analysis of the Apollo 12 Cores," Proceedings of the Apollo 12 Lunar Science Conference, E. Levinson, Ed., M.I.T. Press, Cambridge, Mass., 1971, p. 1891.
15. D. H. Smith, "A Compound Classifier Based on Computer Analysis of Low Resolution Mass Spectral Data," Anal. Chem., 44, 536 (1972).
16. D. H. Smith and G. Eglinton, "Compound Classification by Computer Treatment of Low Resolution Mass Spectra - Application to Geochemical and Environmental Problems," Nature, 235, 325 (1972).
17. D. H. Smith, N. A. B. Gray, C. T. Pillinger, B. J. Kimble and G. Eglinton, "Complex Mixture Analysis - Geochemical and Environmental Applications of a Compound Classifier Based on Computer Analysis of Low Resolution Mass Spectra," Adv. in Org. Geochem., 1971, p. 249.
18. D. H. Smith, B. G. Buchanan, R. S. Engelmore, A. M. Duffield, A. Yeo, E. A. Feigenbaum, J. Lederberg and C. Djerassi, "Applications of Artificial Intelligence for Chemical Inference, VIII. An Approach to the Computer Interpretation of the High Resolution Mass Spectra of Complex Molecules. Structure Elucidation of Estrogenic Steroids," J. Amer. Chem. Soc., 94, 5962 (1972).
19. D. H. Smith, A. M. Duffield and C. Djerassi, "Mass Spectrometry in Structural and Stereochemical Problems, CCXXII. Delineation of Competing Fragmentation Pathways of Complex Molecules from a Study of Metastable Ion Transitions of Deuterated Derivatives," Org. Mass Spectrom., 7, 367 (1973).
20. P. Longevialle, D. H. Smith, H. M. Fales, R. J. Highet and A. L. Burlingame, "High Resolution Mass Spectrometry in Molecular Structure Studies, V. The Fragmentation of Amaryllis Alkaloids in the Crinine Series," Org. Mass Spectrom., 7, 401 (1973).
21. B. R. Simoneit, D. H. Smith, G. Eglinton and A. L. Burlingame, "Applications of Real-time Mass Spectrometric Techniques to Environmental Organic Geochemistry, II. San Francisco Bay Area Waters," Arch. Env. Contam. and Tox., 1, 193 (1973).
22. D. H. Smith, B. G. Buchanan, R. S. Engelmore, H. Adlercreutz and C. Djerassi, "Applications of Artificial Intelligence for Chemical Inference, IX. Analysis of Mixtures Without Prior Separation as Illustrated for Estrogens," J. Amer. Chem. Soc., 95, 6078 (1973).
23. D. H. Smith, B. G. Buchanan, W. C. White, E. A. Feigenbaum, J. Lederberg and C. Djerassi, "Applications of Artificial Intelligence for Chemical Inference, X. INTSUM - A Data Interpretation and Summary Program Applied to the Collected Mass Spectra of Estrogenic Steroids," Tetrahedron, 29, 3117 (1973).
24. G. Loew, M. Chadwick and D. H. Smith, "Applications of Molecular Orbital Theory to the Interpretation of Mass Spectra. Prediction of Primary Fragmentation Sites in Organic Molecules," Org. Mass Spectrom., 7, 1241 (1973).

SECTION II - PRIVILEGED COMMUNICATION
BIOGRAPHICAL SKETCH

NAME: Sridharan, Natesa S.
TITLE: Research Associate
BIRTHDATE (Mo., Day, Yr.): 10/2/46
PLACE OF BIRTH (City, State, Country): Madras, India
PRESENT NATIONALITY (If non-U.S. citizen, indicate kind of visa and expiration date): India; 5/73 - U.S. permanent residence
SEX: Male
EDUCATION (Begin with baccalaureate training and include postdoctoral)
(institution and location; degree; year conferred; scientific field)
  Indian Institute of Technology, Madras, India; Bachelor of Technology; 1967; Electrical Engineering
  State University of New York, Stony Brook; M.S.; 1969; Computer Science
  Ph.D.;
  1971; Computer Science
HONORS: University Fellow, 1968-1971, SUNY Stony Brook; Graduate Assistant, 1967-1968, SUNY Stony Brook; Siemens' Award (awarded for top rank in Electrical Engineering), 1967, IIT Madras; National Merit Scholarship, 1963-1967, IIT Madras
MAJOR RESEARCH INTEREST: Computer Application in Chemistry and Medicine
ROLE IN PROPOSED PROJECT: Research Associate
RESEARCH SUPPORT (See instructions): ----
RESEARCH AND/OR PROFESSIONAL EXPERIENCE (Starting with present position, list training and experience relevant to area of project. List all or most representative publications. Do not exceed 3 pages for each individual.)
  1971-present  Research Associate, Heuristic Programming Project, Stanford University
  1970-1971     Consultant, IAC Computer Corp., Long Island, N.Y.

Sridharan, N.S., "An Application of Artificial Intelligence to Organic Chemical Synthesis", Doctoral Thesis, State University of New York at Stony Brook, 1971.
Sridharan, N.S., "Search Strategies of Organic Chemical Synthesis", Third International Joint Conference on Artificial Intelligence (3IJCAI), Stanford, 1973.
Sridharan, N.S. (co-author), "Heuristic DENDRAL: Analysis of Molecular Structure", Proc. NATO Advanced Study Institute, Amsterdam, 1973.
Sridharan, N.S. (co-author), "Heuristic Theory Formation", Machine Intelligence, Volume 7, Edinburgh, 1972.

SECTION II - PRIVILEGED COMMUNICATION
BIOGRAPHICAL SKETCH

NAME: Brown, Harold D.
TITLE: Associate Professor
BIRTHDATE (Mo., Day, Yr.): July 12, 1934
PLACE OF BIRTH (City, State, Country): South Bend, Indiana
PRESENT NATIONALITY (If non-U.S. citizen, indicate kind of visa and expiration date): U.S.
SEX: Male
EDUCATION (Begin with baccalaureate training and include postdoctoral)
  University of Notre Dame, Notre Dame, Indiana; year conferred 1963; Mathematics
  Ohio State University, Columbus, Ohio; year conferred 1966; Mathematics
  (No Baccalaureate Degree)
HONORS: Summa Cum Laude - Notre Dame
MAJOR RESEARCH INTEREST: Applied Discrete Mathematics - Computer Science
ROLE IN PROPOSED PROJECT: Research Associate
RESEARCH SUPPORT (See instructions): Principal Investigator, NSF-GP-16793 (Expires March, 1974); Pending Proposal NSF (Proposed starting date September, 1974)
RESEARCH AND/OR PROFESSIONAL EXPERIENCE (Starting with present position, list training and experience relevant to area of project. List all or most representative publications. Do not exceed 3 pages for each individual.)
  Visiting Associate Professor, Computer Science, Stanford University, 1971-72, 1973-present
  Associate Professor, Mathematics, Ohio State University, 1966-
  Visiting Professor, Mathematics, Rhein.-Westf. Tech. Hoch., Aachen, 1972 and 1973
  Visiting Member, Courant Institute, New York University, 1967-65
  Instructor/Assistant Professor, Assistant Chairman, Mathematics, Ohio State U., 1963-65
  Assistant to the Chairman, Mathematics, University of Notre Dame, 1960-63
  Director or Associate Director, NSF-SSTP, 1964-70

Vitae Page 2

Near Algebras, Ill. J. Math. 12 (1968), pg. 215.
Distributor Theory in Near Algebras, Comm. Pure App. Math. [volume, year and page illegible].
An Algorithm for the Determination of Space Groups, Math. Comp. 23 (1969), pg. 499.
Some Empirical Observations on Primitive Roots, with H. Zassenhaus, J. Number Theory 3 (1971), pg. 306.
A Generalization of Farey Sequences, with K. Mahler, J. Number Theory 3 (1971), pg. 364.
Basic Computations for Orders, Stanford CS Memo STAN-CS-72-208.
An Application of Zassenhaus' Unit Theorem, Acta Arith. XX (1972), Pg. 154.
Integral Groups I: The Reducible Case, with J. Neubüser and H. Zassenhaus, Numer. Math. 19 (1972), Pg. 3%.
Integral Groups II: The Irreducible Case, with J. Neubüser and H. Zassenhaus, Numer. Math. 20 (1972), Pg. 22.
Integral Groups III: Normalizers, with J. Neubüser and H. Zassenhaus, Math. Comp. 27 (1973), Pg. 167.
Constructive Graph Labeling Via Double Cosets, with L. Hjelmeland and L. Masinter, Discrete Math., in press, and Stanford CS Memo STAN-CS-72-318.
An Algorithm for the Construction of the Graphs of Organic Molecules, with L. Masinter, Discrete Math., in press, and Stanford CS Memo STAN-CS-73-261.
The Crystallographic Groups of 4-dimensional Space, with J. Neubüser, H. Wondratschek and H. Zassenhaus, Wiley-Interscience, in press.

SECTION II - PRIVILEGED COMMUNICATION
BIOGRAPHICAL SKETCH

NAME: DROMEY, Robert Geoffrey
TITLE: Research Associate
BIRTHDATE (Mo., Day, Yr.): 11/21/46
PLACE OF BIRTH (City, State, Country): Castlemaine, Victoria, Australia
PRESENT NATIONALITY (If non-U.S. citizen, indicate kind of visa and expiration date): Australian, J-1 Visa, Exp. 10/8/74

EDUCATION (Begin with baccalaureate training and include postdoctoral)
INSTITUTION AND LOCATION / DEGREE / YEAR CONFERRED / SCIENTIFIC FIELD
Swinburne College of Technology, Melbourne, Australia / Diploma of Appl. Chem. / 1968 / Chemistry
La Trobe University, Melbourne, Australia / Ph.D. / 1973 / Molecular Science

HONORS
CSIRO Postdoctoral Studentship
Commonwealth Postgraduate Research Scholarship
Walter Lindrum Memorial Scholarship

MAJOR RESEARCH INTEREST: Artificial Intelligence Techniques to Biomedical and Chemical Problems.
ROLE IN PROPOSED PROJECT: Research Associate
RESEARCH SUPPORT (See instructions): ----

RESEARCH AND/OR PROFESSIONAL EXPERIENCE
1973  DENDRAL Project, Stanford University, Computer Science Department
1973  Software Development for Graphics Systems, La Trobe University, Computer Centre
1969-73  Construction, development and applications of an on-line photoelectron spectrometer, La Trobe University, Chemistry Department
1969-73  Application of Deconvolution Techniques to the Processing of Experimental Data.

Publications:
"Deconvolution and Its Application to the Processing of Experimental Data", International Journal of Mass Spectrometry and Ion Physics, 1970, 5. (co-author).
"Inverse Convolution in Mass Spectrometry", International Journal of Mass Spectrometry and Ion Physics, 1971, 6. (co-author).
"A Combined Time Averaging-Deconvolution Technique Applied to Electron Impact Ionization Efficiency Curves", International Journal of Mass Spectrometry and Ion Physics, 1971, 5. (co-author).
"The Perfect Direction and Velocity Focus at 254°34' in a Cylindrical Electrostatic Field", Review of Scientific Instruments, 1973, 44. (co-author).
"Detection of Spin-Orbit Splitting in the Photoelectron Spectrum of O2+ by Deconvolution", Chem. Physics Letters (in press), 1973. (co-author).
"The Effect of Finite Line Widths on the Interpretation of Photoelectron Spectra", Journal of Electron Spectroscopy (accepted for publication). (co-author).
"An On-line Ultraviolet Photoelectron Spectrometer for High-Resolution Studies of Molecular Structure", Australian Journal of Chemistry (accepted for publication). (co-author).
"Photoelectron Spectroscopic Correlation of the Molecular Orbitals of the Alkanes and Alkyliodides", Journal of Molecular Structure (submitted for publication). (co-author).
"Comparison of the Photoelectron Spectra and the Photoionization Efficiency Curves for the Alkyliodides", Transactions of the Faraday Society (submitted for publication). (co-author).
"A Convolution-Deconvolution Algorithm Using Fast Fourier Transforms", Decuscope, 1973 (in press).

RESEARCH PLAN

BIOMOLECULAR CHARACTERIZATION: ARTIFICIAL INTELLIGENCE
A Program of Resource-Related Research

   A. Objectives
   B. Background and Rationale
   C. Relationship to AIM-SUMEX and the Genetics Research Center
SPECIFIC AIMS
METHODS
SIGNIFICANCE OF PROPOSED RESEARCH
FACILITIES & EQUIPMENT
BIBLIOGRAPHY
Table 1
Figures 1-3
Appendix A: Letters of Interest
Appendix B: 1973 Annual Report to the NIH

I. INTRODUCTION

This renewal application is intended to sustain and augment the capabilities of the mass spectrometry (MS) program which has served as a major institutional resource at Stanford for some years. With previous support from NASA and NSF it has made possible a highly interdisciplinary set of research projects ranging over: artificial intelligence (AI) in biomolecular characterization, natural product chemistry, clinical biochemical studies on steroids, and the mechanisms of molecular fragment formation in mass spectrometry. While the facility equipment for mass spectrometry has been funded mostly by other agencies, connected research programs embrace several NIH research projects as well. In addition, this activity was closely coupled with the ACME Medical School computer resource (1966-1973) and will have similar associations with the new AIM-SUMEX computer resource recently funded by the BRB (see Section I.C).
Previous support reflects the diversified facets of this interdisciplinary research. NASA has supported projects in new instrumentation, including the initial mass spectrometer-computer link; NSF has supported chemical research; and ARPA has supported our artificial intelligence research and initial application to mass spectrometry. Overall cutbacks have forced NASA to reduce funding for this area of research despite their interest. Under ARPA support to Drs. Feigenbaum and Lederberg for AI research, the DENDRAL program became recognized as one of the most successful AI applications programs. However, ARPA is chartered to fund frontier computer science research and no longer provides funds for the DENDRAL applications programs. ARPA has indicated a reluctance to continue funding to this group for the theory formation work in chemistry, although we expect to continue to receive ARPA support for more theoretical aspects of our research program (e.g., automatic programming).

We previously submitted a comprehensive proposal to the NIH (RR-00785, 3/25/73) which included an application for the AIM-SUMEX computing resource and a renewal of the existing DENDRAL grant (RR-00612). This proposal was approved for 5 years by the National Advisory Research Resources Council. Certain reservations were, however, communicated to us: they concerned especially what we must agree was an ambitious effort to close the control loop for "intelligent automation", whose costs overreached the immediate utility of the expected result. During subsequent discussions with the Biotechnology Resources Branch, taking into account the council review and a number of diverse policy issues, we agreed administratively to segment the two components of the original proposal. The AIM-SUMEX portion of the original proposal (excluding DENDRAL) was recently funded for 5 years as a national resource for artificial intelligence in medicine.
The present proposal for resource-related research in biomolecular characterization and artificial intelligence is an elaboration of the DENDRAL portion incorporating intensive reexamination and revision of the previous proposal. With the differentiation of priorities represented by AIM-SUMEX, the Genetics Research Center (GRC), and continuing work on artificial intelligence under Dr. Feigenbaum's leadership, the present renewal application places more emphasis than heretofore on real-world oriented applications. Correspondingly, we have agreed that it is now more appropriate that Dr. Djerassi should be designated as Principal Investigator in this phase of our work. As outlined in section B.2, the interests and responsibilities of Professors Djerassi (Chemistry), Feigenbaum (Computer Science) and Lederberg (Genetics) have been closely interdigitated. With their further connections with many colleagues, these programs enjoy a high degree of university-wide participation. For example, the Genetics Department is also closely affiliated with Biology, Biochemistry, Pediatrics, Psychiatry and Medicine through joint appointments or joint research projects or both. This breadth would be difficult to obtain except at a few institutions where the medical school is both academically and geographically integrated with the university to the degree that characterizes the Stanford University environment.

GLOSSARY OF ABBREVIATIONS

ACME - Advanced Computer for Medical Research (NIH-funded computer resource, 1968-1973)
AI - artificial intelligence
AIM-SUMEX - A comprehensive computer resource intended to serve the national requirement for artificial intelligence in medicine. This will be implemented at the Stanford University facility called AIM-SUMEX
ARPA - Advanced Research Projects Agency of the Department of Defense
BRB - Biotechnology Resources Branch
13CMR - carbon-13
magnetic resonance
GC - gas chromatography or gas chromatograph
GRC - Genetics Research Center (Stanford, J. Lederberg, Principal Investigator; NIGMS-approved and awaiting funding. Grant #P01-GM 20832-01)
HRMS - high resolution mass spectrometry
IR - infra-red
IRL - Instrumentation Research Laboratory (Stanford Genetics Department)
LRMS - low resolution mass spectrometry
MCD - magnetic circular dichroism
MS - mass spectrometry or mass spectrometer
NASA - National Aeronautics & Space Administration
NMR - nuclear magnetic resonance
NSF - National Science Foundation
ORD - optical rotatory dispersion
PL/ACME - a modified version of the PL/1 computer language (for the Stanford ACME computer facility)
SUMEX - Stanford University Medical Experimental Computer Resource (NIH-funded computer resource, 1973-1978)
UV - ultra-violet

A. OBJECTIVES

Core Research. The funds now applied for would permit 1) the continued funding of the MS laboratory as a biomolecular characterization resource; 2) advancement of laboratory instrumentation capability in specific areas of GC-HRMS and the exploitation of metastable peak analysis; 3) the further development of AI computer techniques to match the instrumentation. This work will emphasize practical utilization for applications in biomolecular characterization connected with other on-going biomedical research programs. It will include, for example, a) the analysis of mixtures by GC/MS; b) metastable peak analysis for difficult problems of pure compounds and of mixtures not readily separable by GC; c) optimized data analysis for characterization of MS peaks; and d) heuristic analysis of spectra for the molecular ion composition.

Our project is the only systematic effort, to our knowledge, currently underway in this country for computer assisted structure elucidation. Subsequent to our early publications, an intensive program has been mounted in Japan in similar areas.
This situation may be contrasted with computer assisted organic synthesis, an area receiving considerable attention from several research groups. These capabilities can be beneficially provided to a wider community via the AIM-SUMEX resource. Research on the emulation of human intellect by computer programs will undoubtedly influence the efficiency with which chemical research can be applied to ever more complex problems of health, e.g., intermediary metabolism and its pathologies; environmental influences on health; the development and critical validation of new therapeutic agents.

The achievement of these objectives depends on the continued maintenance and development of the DENDRAL AI programming system (see below). The advent of the AIM-SUMEX facility will remove some of the serious computational limits on the exercise of this system that have delayed recent progress.

Education. In our university setting, pre-doctoral and post-doctoral education of course constitutes a part of our mission. As far as is practically possible, research participation in the DENDRAL program has been coupled with dissertation work by graduate students and post-doctoral research experience respectively. Examples of people (and their research area) whose education has been enhanced in this way are the following:

Graduate Students: J. Simek, pedagogical aspects of the structure generator; Rai Lee Tan, synthesis of new estrogen compounds; H. Eggert, 13CMR of amines and steroidal ketones; C. Van Antwerp, 13CMR of steroidal alcohols; C. Farrell, theory formation from mass spectral data; L. Masinter, development of the structure generator; M. Stefik, AI applications to chemistry.

Postdoctoral Fellows: G. Dromey, theory formation from analytical data; P. Sritter, mass spectral fragmentation of biologically active steroids; R. Carhart, analysis of 13CMR spectra by DENDRAL-like programs; S. Hammerum, development of better fragmentation rules for progesterones.
Formal organization. This project has been a long-term commitment of Djerassi, Lederberg and Feigenbaum functioning in effect as co-investigators. We coordinate our activities with day-to-day contacts in the pursuit of convergent research objectives. In the light of the extension of our collaborative activity during the last two years, we are now organizing a formal advisory group to include, in addition to ourselves, H. Cann, J. Barchas, and E. Van Tamelen. This group will advise the principal investigators on the direction of the program with respect to allocating available facilities and seeking out and helping other collaborators. This designation simply recognizes the fact that many of our colleagues have already been engaged in relevant collaborative research with us.

An MS resource has recently been funded at the University of California/Berkeley, under the direction of Dr. A.L. Burlingame. Drs. Djerassi and Burlingame have recently engaged in some collaborative research which was made more successful by the sharing of facilities and expertise available at one institution but not at the other. We would hope to maintain and strengthen these contacts to avoid unnecessary duplication of effort. We plan to discuss with Dr. A.L. Burlingame the most appropriate procedures for coordinating the related activities of our respective programs at the University of California/Berkeley and Stanford. This may take the form of reciprocal membership in advisory committees.

The "hardware resource" to which this application is pegged has been identified as the MS facility. While these instruments alone represent an investment of over $300,000, funded previously by several agencies, they do not represent the most important resource. We would use this designation instead for the working team led by the principal and co-investigators.
The skills embraced by this group include, as mentioned, computer science, structural organic chemistry, molecular biology, instrumentation engineering and a wide range of other disciplines. They are represented not only in the principal professors but in a diversified and accomplished professional research staff (see Budget Justification). The program for which funds are now requested is the vital means by which the interests of this group can be sustained in a coordinated effort that would be very costly both in funds and in time if it had to be reconstructed from scratch. Without the financial support now requested, this line of collaborative research will have to be abandoned, and with it a unique style of interdisciplinary collaboration, and the MS facility will be terminated.

B. BACKGROUND AND RATIONALE

1. The Structure Elucidation Problem

a) The General Problem. Analysis of molecular structure is a major activity in our program of resource related research. For the specific task of elucidating molecular structures, i.e., the topology of atom-to-atom connectivities, analysts utilize a mixture of information derived from chemical procedures and spectroscopic techniques. Each item of information, if not redundant or uninterpretable, contributes to the solution of the problem. Chemists draw upon a tremendous body of specific knowledge about the task area (e.g., clinical chemistry, biochemistry), molecular structure, spectroscopic techniques, etc., in order to piece together this information and infer the structure of molecules. These features, and the relative simplicity of the final concept of a structure, make the problem particularly well-suited for applications of the techniques of AI to assist research workers performing the task.

b) Djerassi's Laboratory. Professor Djerassi has been concerned with structure elucidation problems since the beginning of his chemical research.
His activities at Stanford have been concerned heavily with the application of particular spectroscopic techniques to structural studies of biomedically important compounds. These techniques include optical rotatory dispersion (ORD) and, more recently, magnetic circular dichroism (MCD) (both of them supported initially by the NIH). Since 1961 he and his group have also been concerned with MS because of the power of the technique, in terms of specificity and sensitivity, as an analytical tool for structure elucidation. Four books and approximately 250 articles on MS have been published by him and his colleagues.

The technique of MS does not suffice for all structure determination problems, but it is a very powerful tool in areas where there exists a body of knowledge about the MS behavior of related molecules. When sample size is limited MS may well be the only technique that can be utilized. The recent availability of high resolution mass spectrometers has made HRMS the technique of choice for many applications because under ideal conditions the exact mass number uniquely specifies the empirical formula of a molecule or fragment. On a parallel course, the technique of GC/MS, routinely available with low resolution mass spectrometers (GC/LRMS), has revolutionized investigations wherever complex mixtures are encountered. All of the above considerations argue that an extension of MS at Stanford to provide routine GC/LRMS and GC/HRMS analyses would be the next logical step to assist researchers depending on this facility for solutions of their structure elucidation problems.

2. Historical Background

a) Mass Spectrometry Laboratory. Prior to the existing DENDRAL grant, the groundwork was laid for computerization of the existing mass spectrometers, an Associated Electrical Industries MS-9 high resolution mass spectrometer and an Atlas CH-4 low resolution mass spectrometer.
This work, supported primarily by NASA via the Instrumentation Research Laboratory (IRL) in the Department of Genetics, resulted in link-up to the then existing ACME computer facility via a PDP-11 mini-computer which acted as a buffer between the spectrometers and ACME. Initial data acquisition and reduction programs were written for the system and utilized on a limited basis. The funding of the DENDRAL proposal, NIH grant RR-612 (May 7, 1971-present), in conjunction with additional resources provided by the IRL resulted in a major improvement to these capabilities. The fruits of these efforts are described under section I.B.3 (below).

b) Summary of Early DENDRAL Development. In 1964, Lederberg devised a notational algorithm for chemical structures (termed DENDRAL) that allowed questions of molecular structure to be framed in precise graph-theoretic terms (Refs. 1,3-5,12). He also showed how to use the DENDRAL algorithm to generate complete and irredundant lists of structural isomers (Refs. ????). In 1965-66 Lederberg and Feigenbaum began exploring the idea of using the isomer generator in an artificial intelligence program - searching the space of possible structures for plausible solutions to a problem much as a chess-playing program searches the space of legal moves for the best moves (Refs. 7,12). This approach guarantees that every possible solution to a problem is considered - either implicitly, as when whole classes of unstable structures are rejected, or explicitly, as when complete molecules are tested for plausibility. In either case, an investigator easily determines the criteria for rejection and acceptance and knows that no possibilities have been forgotten. This approach also guarantees that structures appear in the list only once - that automorphic representations of the same complex molecule have not been included. In both these respects the computer program has an advantage over manual approaches to structure elucidation.
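The two guarantees just described, exhaustiveness and irredundancy, can be illustrated with a small sketch. It is not the DENDRAL algorithm: it is a brute-force stand-in, restricted to acyclic carbon skeletons, that assumes every labeled tree can be enumerated through Pruefer sequences, that a whole class of structures (any skeleton containing a carbon of valence above four) can be rejected from the sequence alone, and that automorphic duplicates can be merged through a canonical string. The function names (prufer_to_edges, canonical_form, alkane_skeletons) are hypothetical and chosen only for this illustration.

```python
from itertools import product


def prufer_to_edges(seq, n):
    """Decode a Pruefer sequence into the edge list of a labeled tree on n nodes."""
    degree = [1] * n
    for v in seq:
        degree[v] += 1
    edges = []
    for v in seq:
        # Attach the smallest remaining leaf to the next sequence entry.
        leaf = min(u for u in range(n) if degree[u] == 1)
        edges.append((leaf, v))
        degree[leaf] -= 1
        degree[v] -= 1
    u, w = [x for x in range(n) if degree[x] == 1]
    edges.append((u, w))
    return edges


def canonical_form(edges, n):
    """Return a string that is identical for all relabelings of the same tree."""
    adj = {i: [] for i in range(n)}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    # Locate the one or two center nodes by repeatedly stripping leaves.
    remaining = set(range(n))
    deg = {i: len(adj[i]) for i in range(n)}
    leaves = [i for i in remaining if deg[i] <= 1]
    while len(remaining) > 2:
        new_leaves = []
        for leaf in leaves:
            remaining.discard(leaf)
            for nb in adj[leaf]:
                if nb in remaining:
                    deg[nb] -= 1
                    if deg[nb] == 1:
                        new_leaves.append(nb)
        leaves = new_leaves

    def encode(node, parent):
        # Sorting the child encodings removes any dependence on node labels.
        children = sorted(encode(c, node) for c in adj[node] if c != parent)
        return "(" + "".join(children) + ")"

    return min(encode(c, None) for c in remaining)


def alkane_skeletons(n, max_valence=4):
    """Exhaustive, duplicate-free set of acyclic carbon skeletons with n atoms."""
    if n == 1:
        return {"()"}
    seen = set()
    for seq in product(range(n), repeat=n - 2):
        # Implicit rejection of a whole class: a node's degree is one more than
        # its number of occurrences in the sequence, so no tree need be built.
        if any(seq.count(v) + 1 > max_valence for v in set(seq)):
            continue
        # Explicit test: build the candidate and keep one copy per canonical form.
        seen.add(canonical_form(prufer_to_edges(seq, n), n))
    return seen


if __name__ == "__main__":
    for n in range(4, 8):
        print(n, len(alkane_skeletons(n)))  # 2, 3, 5, 9 skeletons for C4 to C7
```

The counts printed for four to seven carbons agree with the known numbers of butane, pentane, hexane and heptane isomers. A practical generator would construct canonical structures directly rather than filtering a brute-force enumeration, which grows far too quickly to be useful beyond very small molecules.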
c) Initial collaboration with Djerassi (Refs. 14,15,19,20,21,22,24). Lederberg and Feigenbaum realized that (a) only through application to real problems could the AI approach be materially advanced and critically evaluated, and (b) MS appeared to be a fruitful applications area. MS appeared to be an excellent problem area because of the close relationship between spectral fragmentation patterns and molecular structure for many classes of molecules. Djerassi's interest and expertise - and daily interaction between members of his group and the AI group - led to a series of joint publications describing the approach and initial results of the programs. The success of these collaborative efforts led to the proposal to the NIH for initial funding to extend these efforts.

d) Efforts Under NIH Funding for DENDRAL (Refs. 25-41). The initial funding by NIH provided the opportunity to upgrade the instrumentation and computer programs. In particular we were able to mount a concerted project on both the analysis of mass spectra of biomedically important compounds and the mathematical aspects of molecular structure. Progress reports to the NIH describe this research in detail. The most recent annual report appears in Appendix B. A series of publications directed to audiences both in computer science and chemistry are listed in the bibliography. The following section (Section 3) summarizes the capabilities for structure elucidation which, in themselves, constitute an important result of past work.

e) Related Research. An important side effect of the DENDRAL project is the extent to which additional research was inspired and carried out to fill gaps in existing knowledge. This research, not supported by the DENDRAL grant, has been beneficial to on-going DENDRAL work, and vice-versa. Publications which have arisen from this research are listed in the bibliography (Refs. 58-70).
A brief review of these publications should indicate the need for precise specification of the knowledge elicited from chemists and used in computer programs. As an example, consider the description and application of an early algorithm for generation of cyclic structural isomers (31). This paper considered the problem of spectroscopic differentiation of isomers of C6H10O. Unsaturated ethers fall in one of the classes of isomeric compounds which must be considered, but the MS of unsaturated ethers had not been investigated systematically. This work was subsequently carried out in Professor Djerassi's laboratory independently of DENDRAL support, but of benefit to DENDRAL (62). Other examples will be found in the Bibliography (Refs. 58-70).

3. Existing Capabilities

We have worked to develop distinctive capabilities for molecular structure elucidation, bringing together a high quality HRMS system and AI programs applied to biomolecular characterization. The feasibility of our analytical approach has been demonstrated in several problem areas, based upon the development both of an MS system and a general set of computer programs for use in new areas. The principal capabilities are summarized below. These are now in being and were developed primarily under NIH funding to this project, with additional support supplied by ARPA and NASA in specific areas. (These agencies have reduced funding levels for this work because overall cutbacks have forced NASA to cut out this area of research despite their interest, and ARPA is chartered to provide funds for frontier computer science research but not for applications. Thus the NIH is the principal source of support for future development of applications programs in the interdisciplinary area of artificial intelligence/health related chemical problems.)

a. HRMS System and Coupled GC/LRMS System.
We have coupled the NIH-supported Varian-MAT 711 High Resolution Mass Spectrometer with a Hewlett Packard Gas Chromatograph and demonstrated its utility for GC/LRMS analysis of such difficult analytic problems as the free sterols (i.e., not derivatized) isolated from marine and other sources. Advanced data reduction techniques for this instrument were written for use with the ACME computer system (360/50) and now exist in Stanford's new 370/158, which continues to support the PL/ACME language. GC/HRMS scans on extracts from urine and amniotic fluid demonstrated this system's capability to provide high quality mass measurements on complex mixtures obtained from biological sources. An example of one GC/HRMS run on the amino acid fraction of amniotic fluid is presented below (Sec. III.D).

b. DENDRAL Structure Generator (Refs. 1-6,14,31,37,38,40,41)
The DENDRAL Structure Generator program accomplishes exhaustive and irredundant generation of isomers, with and without rings. This program guarantees consideration of every candidate structure - either implicitly, as when whole classes of structures are forbidden, or explicitly, as when individual compounds in a class are specified. It corresponds to the "legal move generator" of computerized chess playing and other heuristic programs.

c. DENDRAL Planner (Refs. 25,28,33)
We have written a very general set of computer programs for determining structural features from analytical data in well-defined areas. Such general planning programs have been written for low and high resolution MS, interpreted proton NMR spectroscopy and 13CMR data.

d. INTSUM (Refs. 26,29,34,35)
INTSUM is a computer program that aids in finding interpretive rules for MS. The program interprets a large collection of MS data according to criteria specified by the investigator. Then it summarizes the data to show which of the possible interpretations seem most plausible.

e. RULEGEN (Refs.
26,35)
RULEGEN is the current rule generation program that suggests various rules of interpretation for the MS data summarized by INTSUM. Although not finished, the program can provide useful assistance in practical theory formation.

f. Ancillary Techniques
1. The MS facility provides other types of experiments in MS, including ultra-high resolution measurements (masses determined via peak matching), defocussed metastable ion determinations (Barber-Elliott technique) and low ionizing voltage experiments. These data are utilized by both scientists and programs where appropriate.
2. Additional computer programs provide added problem-solving assistance.
   a. Predictor program for predicting major features of mass spectra.
   b. Programs for drawing and displaying chemical structures.
   c. Subroutines developed in conjunction with or existing as parts of the Structure Generator for problems of partitioning, construction of vertex-graphs, and constructive graph labelling. These can be applied to answer certain questions of isomerism which do not require the complete generator. For example, the labelling algorithm can list all structures resulting from substituting sites of a carbocyclic skeleton with stated numbers of different functional groups.

g. Other Spectroscopic Techniques
Available to us are the facilities of Professor Djerassi's Laboratory for work requiring additional spectroscopic data. Also available on a fee for service basis are extensive spectroscopic facilities (NMR, I.R., and U.V.) of the chemistry department. These would be utilized for collecting additional data on particular structure problems and gathering data on known compounds (particularly in the area of 13CMR) as the AI programs become knowledgeable about other spectroscopic information.

h. Chemical Facilities
The staff and facilities of the chemistry department represent substantial synthesis capabilities and general chemical know-how. This resource can be called upon to provide
assistance in synthesis of model or labelled compounds, derivatization of mixtures, and so forth. For example, a graduate student in chemistry is presently engaged in thesis research dealing with the laboratory synthesis of a new estrogen metabolite strongly suspected to be a component of certain pregnancy urines. The previously proposed structure of this compound was one of the candidate structures inferred by the planner in a study of estrogen mixtures (11-dehydroestradiol-17-alpha, ref. 33).

4. User Community

Economic utilization of existing and proposed facilities can be realized by sharing them with a community of users. Lacking supplementary funds that would be needed for a comprehensive, major service facility, this community will include the following groups, but will be informally available to others.

A. Stanford Community

i) Stanford Chemistry Department (except for Hodgson, all are heavily supported by the NIH in their research efforts). Letters of interest are attached to the proposal in Appendix A.

Prof. C. Djerassi - steroids, marine sterols
Prof. W. Johnson - steroids
Prof. E. Van Tamelen - steroids, triterpenoids, other natural products
Prof. H. Mosher - natural products (e.g., marine toxins)
Prof. K. Hodgson - biological ligands, ligand-metal complexes
Prof. J. Collman - cytochrome P450 models

ii) Stanford Medical School Collaborators
The following research projects in the Stanford Biomedical Community will furnish samples for mass spectrometric analysis under the present proposal. Attached to this proposal (Appendix A) are copies of the letters of interest in the proposed facility received from the principal investigators of these grants.

Dr. James R. Trudell, Department of Anesthesia, Stanford University School of Medicine. Drug metabolite identification in humans.
Dr. Irene S. Forrest, Biomedical Research Laboratory, Veterans Administration Hospital, Palo Alto. Drug metabolite identification in humans.
Dr. T.
Rabinowitz and D.I. Wilkinson, Department of Dermatology, Stanford University School of Medicine. Prostaglandins.
Prof. Eugene D. Robin, Department of Respiratory Medicine, Stanford University School of Medicine. Ratio of NAD+/NADH in cells by measuring ratio of oxidized to reduced redox pairs.
Dr. Leo E. Hollister, Veterans Administration Hospital/Department of Medicine, Stanford University School of Medicine. Metabolism of marihuana.
Dr. Hiram ??. Sern, Pharmacy Department, Stanford University Hospital. Drug identification.
Dr. Sumner M. Kalman, Department of Pharmacology, Stanford University School of Medicine. Drug and drug metabolite identification.
Dr. Jack Barchas, Department of Psychiatry, Stanford University School of Medicine. Neurotransmitters and related compounds in man.
Dr. Keith A. Kvenvolden, Chemical Evolution Branch, NASA Ames Research Center, Mountain View, Calif. Amino acids, acids in geochemical samples, structure of products formed from electrical discharges in gas mixtures.
Dr. William P. Fair, Department of Urology, Stanford University School of Medicine. Identification of the prostatic antibacterial factor; polyamines (putrescine, spermine, spermidine) in body fluids of patients with prostatic carcinoma.

Besides the user projects just summarized, other major prospects are in sight. At the time of writing, the chair of pharmacology is vacant. Conversations with the leading candidate have indicated a deep-seated interest in GC/HRMS as the principal analytical tool for broad ranging studies of drug metabolism in man.

B. Extramural Users

The development of the techniques of ORD, MS and MCD at Stanford has been paralleled with extensive sharing of these resources nation- and world-wide via collaborative research efforts, without any additional funding. Experience has shown that discretionary selection of problems, rather than provision of routine service, results in better utilization of our people and instrumentation resources.
We would extend this provision of services, including available computer programs, to a limited number of extramural users. Note, for example, our successful collaboration with Professor Adlercreutz, Meilahti Hospital, University of Helsinki, on the identification of estrogens from body fluids utilizing the MS planning program (ref. 33).

5. Relationship to AIM-SUMEX and the Genetics Research Center

The present application is strengthened by two research projects related to, but not overlapping, the proposed research of this grant.

1) AIM-SUMEX (NIH RR-00785, Oct. 1, 1973, thru July 31, 1978, Principal Investigator, J. Lederberg). This is a resource grant to establish a national facility for applications of artificial intelligence in medicine (AIM). Our own use of this facility will include SUMEX PDP-10 computer time and file storage necessary to run the DENDRAL artificial intelligence programs. This support will be furnished without charge to the present proposal. It represents an annual investment of about $100,000 in computer time equivalent value. The AIM-SUMEX computing facility is shared equally between a national user community (AIM) and a Stanford Medical School community. The DENDRAL research will be supported out of the Stanford portion. The AIM service will be administered under the policy control of a national advisory committee and will be implemented over a national computer network. AIM-SUMEX provides the means for members of the national user community interested in structure elucidation to access the DENDRAL programs.

2) Genetics Research Center (NIH P01-GM 20832-01 - approved by the NIGMS Council, awaiting funding, Principal Investigator, J. Lederberg). This research proposal is a comprehensive grant which would support interdepartmental research at the Stanford Medical School in Medical Genetics, Pediatrics and other clinical applications.
A section of that proposal concerns the use of GC/LRMS for screening body fluids for evidence of inborn errors of metabolism. (This project grew out of the initial DENDRAL grant, one of the research goals of which was the analysis of body fluids using [illegible].) This research on inborn metabolic errors will be conducted jointly in the Stanford Departments of Genetics and Pediatrics using existing equipment (Finnigan 1015 Quadrupole mass spectrometer, Varian Aerograph GC and a PDP-11/20 based data [illegible]). We appreciated the value of GC/HRMS analyses of selected extracts of body fluids (i.e., those containing metabolites not identified by routine GC/LRMS data) when formulating the Genetics Research Center proposal. Accordingly, a small amount of funding was there requested for recording selected GC/HRMS data on the GC/Varian MAT 711 mass spectrometer in the Department of Chemistry. If these funds are awarded, we will negotiate with NIH a suitable elimination of this minor overlap with the present budget.

II. SPECIFIC AIMS

The specific aims enumerated in this section will be pursued in the highly inter-disciplinary manner that has characterized the DENDRAL project from the start of its NIH support. The aims are not disjoint, but interactive and inter-dependent. For example, the power of MS and, potentially, other spectroscopic techniques, can be enhanced by the use of computer programs to perform various aspects of structure elucidation and theory formation. From the standpoint of computer science, one measure of the utility of techniques of artificial intelligence is how well they perform in real-world applications. It is necessary in the development of these programs to have a source of data and informed, involved team-mates able to criticize methods and results.

The aims are elaborated in the methods section. We have attempted to keep the proposal to a readable length. Therefore, some detail has been omitted.
However, many details can be found in the bibliography and we are prepared to provide additional information during the site visit.

1. Enhance the power of the MS resource.

The existing MS resource, together with computer programs which exist or which are proposed (see Aim 2, below), is capable of solving some of the structure elucidation problems of the user community given computer support for data collection and reduction. We refer specifically to the areas of GC/LRMS and routine, batch HRMS samples. We believe that many of the problems of the user community require more powerful techniques (see Section III). These techniques, specifically GC/HRMS and semi-automatic metastable defocussing, can be provided with a minimum of cost and effort, thus enhancing considerably the capabilities of the resource.

Our first aim is to provide the resource with adequate computer support (replacing the previous ACME system) to enable collection and reduction of mass spectral data including low and high resolution scans and data on defocussed metastable ions. We propose to develop this computer support in the ways described below (these aims are written to include the work necessary to implement the extended PDP-11/20 computer system. A description of the rationale for this choice is provided in Section III.A and the specific augmentations in the Budget Justification).

A) Convert existing, proven data acquisition and reduction programs from the PL/ACME language into Fortran, consistent with [illegible] assembly language programs for data acquisition and instrument control. These programs will be written in Fortran to enhance compatibility with the computer systems of other users of such packages.

B) Modify these programs, as required, to handle acquisition and reduction of frequent or repetitive HRMS scans with selected instrument performance feedback to the operator, and to take advantage of the expanded capab[...]

[...] A formal proof will be devised as well.
This algorithm represents one very powerful approach to the problem of implementation of constraints, as discussed in the following paragraph.

The generating programs will be modified to allow isomer generation within constraints. Different kinds of constraints can be inferred from different kinds of spectroscopic data. We intend to give the program knowledge of a variety of these.

The Planner programs that infer constraints from mass spectrometry data will be broadened to include additional knowledge about the spectral behavior of classes of compounds of relevance to the NIH-sponsored research of the user community. In addition, we will add the capability for utilization of information about chemical isolation procedures (e.g., one expects acidic and neutral compounds in solvent extraction of acidified body fluids) and relative GC retention times (e.g., to admit the possibility of homologous series).

We propose to implement a more general method for inferring the identity of the molecular ion whether or not this appears explicitly in the spectrum. This information is important for the successful operation of the structure generator and the Planner. We want the program to use whatever information is available and not depend, as it currently does, on having knowledge of the structural class together with inference rules for that class.

Interface routines will be written to make it easier for other scientists to use these programs. We have to wait for an interactive system before starting this: AIM-SUMEX will be ideal. Input/output routines will be crucial to easy use of the system. However, we also want to give users the facility to understand the system's reasoning steps so they can take advantage of it.

In addition to making the computer programs available through AIM-SUMEX, we would like to translate parts of the LISP code into another language - for reasons of both efficiency and exportability.
We have talked with computer professionals at IBM Research Center about using the APL language. FORTRAN, ALGOL and PL/1 are other languages whose merits for our purposes we will explore.

We wish to continue a low level of effort on computer programs that interpret other kinds of spectroscopic data. Planning programs similar to the MS Planner could be written for automatic analysis of data from other spectroscopic techniques (e.g., IR, UV), as we have illustrated for 13CMR (ref. [illegible]).

The structure generator's view of chemical structure is topological and is presently unconstrained by bond lengths and angles. Because stereochemical considerations are frequently important in structure elucidation, we propose to begin consideration of stereochemistry in the structure generation and evaluation processes.

A program with detailed knowledge about information obtainable from various spectroscopic techniques could be written to examine a list of candidate solutions and propose experiments necessary and sufficient to distinguish among them. The program would represent an extended Predictor (e.g., ref. 27). We have a first version of a program that suggests "crucial" metastable peaks to be sought in order to distinguish among candidate structures. Work on this program will continue at a low level of activity, possibly expanding into areas other than MS.

One topic we will continue to pursue is our collaborative effort with Dr. Gilda Loew, Genetics Department, on the potential application of molecular orbital theory to prediction of mass spectra (ref. 71).

Theory Formation Programs:

The rule formation program (RULEGEN) will be extended so that it can search a larger space of rules. Present a priori constraints on the rule generation give us a search reduction from tens of millions to a thousand possible rules. Even though search heuristics now allow efficient search of these possibilities, we want to be able to deal with much larger spaces efficiently, as when the number of primitive predicates is drastically increased.

The RULEGEN program will be modified so that complex fragmentation and rearrangement processes are manipulated nearly as easily as simple fragmentations. The program currently finds fragmentation rules involving one or two bonds, possibly followed by hydrogen migration. In the case of cyclic systems such as estrogens, however, the program must be able to work with sets of three or more bonds in some cleavages.

Interactive programs will be provided on AIM-SUMEX for the investigator to query the rule generation program. For example, many questions now arise about the program steps by which the program infers the rules it suggests as explanations of the regularities. Why, for example, was some particular rule not considered plausible?

New data will have to be selected in order to test the rules and to differentiate among competing rules. We will write a program that suggests new experiments (i.e., new data to obtain), depending on the nature of the existing rules.

The test phase of the theory formation program will be written as an evaluation function of each rule against new data. Insofar as any new experiments are "crucial" experiments, the evaluation function may merely reject a proposed rule. Mostly, however, rules will have to be evaluated against new data along many dimensions: frequency, strength of evidence, uniqueness, simplicity, and the like.

We wish to experiment with the whole theory formation program to determine the critical aspects of our design. For example, (1) how sensitive is the program to discrepancies, inconsistencies and errors in the data? (2) how well can the program find rules within a slightly different model of chemistry?
(3) how well can the program perform with one pass through the data, or several passes? and (4) how critical are the principles of theory formation?

3. Apply the structure elucidation techniques - both instrumentation and computer programs - to biomedically relevant compounds.

Our own interests are in elucidating the structures of, and understanding the MS of, marine sterols, hormonal steroids, and compounds isolated from human body fluids that can be associated with genetic disorders (from research in the GRC). In addition, we will be working closely with members of the Stanford Medical School and Chemistry Department - in particular those mentioned above (Section I.B.4) - on their structure elucidation problems in which MS will be used.

Although most users expect to require HRMS and GC/HRMS data, some of their problems will be attacked utilizing GC/LRMS techniques and library search through (usually) restricted libraries of mass spectral data. We propose to investigate some extensions to the technique of library search (see Methods) to complement our existing and planned DENDRAL programs. We plan to continue our exchange of mass spectral data and library search information as we have previously done with Dr. S. Markey (University of Colorado Medical School) and Dr. F. W. McLafferty (Cornell University).

As in the past, attention to new biomedical research problems will lead to increased capabilities in the computer programs. We require close communication with the people engaged in the research so that the programs actually assist the researcher while increasing in power. Collaborative proposals have come out of such past DENDRAL sponsored work, for example, large portions of the GRC proposal and a proposal for 13CMR research.
We envisage the interaction and collaboration with the user community to involve the following:

a) In all cases, we plan close cooperation with the users in all aspects of the problem. Although the basic isolation procedures are the problem of each investigator, his knowledge of the available facilities and their limitations can be an important aid to sample preparation and analysis of the results. This is particularly true for collaborators who are unfamiliar with the techniques of HRMS (e.g., sample size and resolving power necessary to separate the mass doublets that can be realistically expected in different contexts).

b) The needs of the user community will be varied. Drs. Duffield and Smith will, in collaboration with the users, determine the kinds of MS experiments which will be most useful, considering sample complexity, stability, quantity, and so forth. We wish to utilize fully the existing resource and our proposed extensions, bringing to bear on a problem any technique which is appropriate and can be provided. This will include the full scope of available experimental techniques in MS (LRMS, HRMS, GC/LRMS, GC/HRMS, metastable defocussing, ultra-high resolution mass measurements) and available computer programs (see below).

c) Many problems will be amenable to treatment by computer programs which exist or which will be developed, for example, structural isomer problems or HRMS interpretation on compounds in a well-understood class. We will take the responsibility for utilizing these programs where appropriate to assist in structure elucidation problems. We will instruct members of the community in use of the programs when programs are used routinely by collaborators.

Molecular structure elucidation entails the intelligent and patient application of a large body of knowledge to each specific problem.
The importance and relative difficulty of the problem impel us to seek the powerful assistance of computer programs to help chemists in their analyses. It is unlikely that such programs will ever replace chemists, especially because computer programs are readily written only to focus on rather narrow aspects of problems. Nevertheless, our past research is reasonably forwarded as a demonstration of the computer's ability to assist in practical biomolecular characterization although this was a spinoff from theoretically oriented research.

In order to meet the major objectives of this proposal we will focus our attention primarily on structure elucidation of biomedically important compounds through MS and AI. However, many of the computer programs can already use information from other analytical techniques. So we want to be able to think of structure elucidation in the context of an ensemble of analytical capabilities.

A. Enhancing the Power of the Mass Spectrometry Resource

We have developed a significant resource consisting of instrumentation (the Varian MAT-711 and ancillary equipment) and computer programs for instrument evaluation, data acquisition and reduction. Routine reduction of high resolution mass spectra to elemental compositions and ion abundances without human intervention provides the capability for efficient handling of large volumes of high resolution mass spectra (such as will result from GC/HRMS runs). The development of the GC and of the GC/MS combination is in the excellent hands of Ms. Annemarie Wegmann, who is responsible for operation of the complete system. We now have more than two years of operational experience with the MS, the GC and related equipment under a wide variety of experimental conditions.

None of the resource-related research discussed in this proposal can be carried out without significant quantities of mass spectral data.
The existence and extensions of the MS resource, the development of computer techniques and the applications to biomedical problems demand an efficient mechanism for acquisition and reduction of MS data, and eventual transmission of the data to the SUMEX resource. Thus, operation of the MS requires substantial computer support to deal with the large volumes of data produced by the system at high data rates. We feel that a properly configured system of hardware and software should provide, at a minimum, the following capabilities:

1) Detailed evaluation of the condition and performance of the MS prior to recording data on valuable samples, with feedback to the operator.

2) A coordinated system of hardware and software for signal conditioning, peak detection and peak analysis.

3) Data reduction techniques based on a computed (not theoretical) model of the MS, including peak shapes, mass/time function, and resolving power as a function of mass.

4) Peak profile analysis for multiplet [...]

[...] HRMS analysis and [illegible] where this efficiently responds to local needs.

Many members of the user community will require, in addition to GC/HRMS, HRMS analysis of relatively pure compounds or mixtures of small numbers of compounds. We will provide this capability on an interim basis, using Stanford's IBM 370/158 system while the PDP 11/20 system is being upgraded.

We were able, using the ACME computer facility, to start evaluating the operation of a GC/MS system at high mass resolutions. These experiments were hampered somewhat by the limitations of the computer system used to acquire the data (only occasional, single scans were possible); they were necessarily discontinued (as well as all HRMS operation!) upon the termination of ACME. We do have, however, some benchmark figures for the performance of the proposed system.
Mixtures of fatty acid esters (e.g., methyl palmitate and methyl stearate) gave good quality mass measurements (±10 ppm) over a dynamic range of 100:1 for sample sizes of the order of 0.5-1.0 micrograms/component during 19 sec/decade in mass scans (resolving powers 5,000-8,000).

We are haltingly continuing our evaluation of the GC/HRMS system even without a data system, making measurements on individual ions of the mass standard and known materials in the GC effluent. These data can be approximately translated into expectations during dynamic scanning. We have performed an extensive series of measurements on both methyl stearate and cholesterol (not derivatized), the latter compound being more representative of our current research problems. These measurements tend to confirm the preliminary data described above. Firmer data will be available subsequent to the submission of this proposal.

We propose to operate our existing GC/MS system under high resolution conditions aiming toward optimization of resolving power, scan rates and GC and molecular separator operating conditions to determine the maximum usable sensitivity of the system.

We recognize that the ultimate sensitivity will not approach that attainable by photographic methods of recording; we feel that the ability for on-line operation and evaluation of the operating conditions of the MS partially offsets the sensitivity disadvantages. We realize that some structure elucidation problems will not be amenable to study because of the sensitivity limitations; we feel, however, that many problems of interest to the user community can be studied effectively with this performance capability.

Rather than propose a research program to increase the sensitivity of high resolution mass spectrometers (e.g., McLafferty, et al., Anal. Chem., 44, 2282 (1972), dynamic [illegible] of peaks; Jet Propulsion Laboratory - chemical multiplier emission/detector arrays, private communication to T. Rindfleisch), we propose to identify our limitations and, with our collaborators, use discretion in selecting and preparing samples. Further accelerations of technical capability to meet the state of the art in sensitivity will require investments in hardware that can be better justified at a later stage of a successful facility program. Meanwhile, other laboratories can be expected to make significant contributions to this important problem. Practical regard for budget limitations is the main reason we do not press this issue ourselves at the present time.

Significant improvements in sensitivity (with only small decreases in mass measurement accuracy) can be achieved by operating the MS at reduced resolving power, coupled with intelligent analysis of the resulting data to detect and resolve the potentially greater number of overlapping peak envelopes. This proposal is not entirely new (e.g., see Smith, et al., Anal. Chem., 43, 1796 (1971); Burlingame, et al., in "Computers in Analytical Chemistry," C.H. Orr and J.A. Morris, Ed., Progress in Analytical Chemistry, Vol. 4, Plenum Press, New York, N.Y., 1970, Chap. III). We can, however, significantly extend these earlier techniques by utilization of our multiplet resolution algorithm. This algorithm, embodied in a computer program, has been shown to increase the effective resolving power of the MS up to a factor of three. It bases its operation on a dynamic model of peak shape computed directly from the data. For computational efficiency and to avoid spurious information, this algorithm would be best implemented as a post-processor, basing its search for multiplets on the results of prior elemental composition determination.

The ability to detect and analyze for unresolved peaks is mediated by consideration of the mass measurement accuracy of an MS system. These systems are capable of determining peak positions (and thus masses) to a small fraction of the peak width.
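The relation between the mass separation of two overlapping ions and the resolving power needed to separate them can be illustrated with a short calculation. The sketch is not part of the DENDRAL or data-reduction software: it assumes standard monoisotopic masses for H, C, N and O, takes resolving power as M divided by the mass difference, and uses hypothetical function names. The C3 versus H4O2 pair serves as the worked case.

```python
# Monoisotopic masses (u) of the most abundant isotopes. These are standard
# reference values supplied for illustration, not figures from the proposal.
MONOISOTOPIC_MASS = {"H": 1.007825, "C": 12.000000, "N": 14.003074, "O": 15.994915}


def exact_mass(composition):
    """Exact mass of an elemental composition given as {element: count}."""
    return sum(MONOISOTOPIC_MASS[el] * n for el, n in composition.items())


def doublet_difference(a, b):
    """Signed exact-mass difference (a minus b) between two compositions."""
    return exact_mass(a) - exact_mass(b)


def required_resolving_power(mass, delta):
    """Resolving power M / |delta M| needed to separate two peaks at mass M."""
    return mass / abs(delta)


def ppm_error(measured, true):
    """Mass measurement error expressed in parts per million."""
    return (measured - true) / true * 1e6


if __name__ == "__main__":
    delta = doublet_difference({"C": 3}, {"H": 4, "O": 2})
    print(f"C3 vs H4O2: delta = {delta:+.4f} u")
    needed = required_resolving_power(150.0, delta)
    print(f"resolving power needed at mass 150: {needed:.0f}")
```

With the same formula, a separation of 0.03 u at mass 150 corresponds to a resolving power of 5000.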
The high accuracy of such measurements (± 2-10 ppm) can, in fact, be utilized to detect and "resolve" multiplets in instances where the unresolved species are known precisely (see Burlingame, et al., ref. above, [illegible] vs. 13C doublet detection and resolution). For instances where the heteroatom content of a molecule is known or where the possibilities are reduced severely by chemical, spectroscopic and mass measurement heuristics, there may be a range of possible overlapping ions resulting from fragmentation of [illegible]. These potential overlaps may be computed and then used (in combination with the known resolving power and mass measurement accuracy of the MS and the measured mass of the peak, assuming it was comprised of only one type of ion) to direct the multiplet resolution program.

As an example, we have computed the possible mass doublets for various ranges of compositions (Lederberg, et al., to be published). A sample table for C, N, O =< 4 is appended (Table 1). Only 28 of the 364 possibilities are shown, namely those whose [illegible] .03 and would be fairly easy to resolve, requiring 1/5000 resolution at MW=150. At the other extreme, 5 doublets show e < .01 (CN4 vs. H4O4; C2H2O vs. N3; C2N2 vs. H4O3; C3N vs. H2O3; and C4 vs. H2NO2) which would demand special treatment for resolution. The 10 doublets for which .01 =< e =< .03 pose the interesting challenges for tradeoff of resolution vs. sensitivity in the context of given problems. For example, if N is absent, the only ambiguities are C3 vs. H4O2 (e = -.02) and C4 vs. O3 (e = .015).

Much as we would wish always to have unambiguous empirical formulas for all ions, HRMS remains a valuable tool despite these limitations. As shown by these examples, even moderate resolution reduces the number of candidates to a manageably small number of alternatives. Contextual and internal data (within the spectrum) can be used to trim these further at two levels: (a) pooling of [illegible] statistics to sharpen decision probabilities on the presence of heteroatoms -- the fragments are subsets of the molecule and (b) the assemblage of candidate solutions under each of the alternative formulae. Manifestly, computer processing can sort out branches of decision trees that would soon exhaust human patience. These heuristics are built into the DENDRAL programs (solutions based on fragmentation theory), but are also applicable to table look-up approaches.

We (ref. 29, 33), and others (e.g., H.-K. Wipf, et al., J. Amer. Chem. Soc., 95, 3369 (1973)) have illustrated the importance of meta[...]

[...] the history of a sample is known, so as to restrict the potential classes of compounds and for classes where the rules of MS fragmentation are well understood, the program's performance matches that of trained mass spectroscopists; the program also offers some advantages in its exhaustive and rapid analysis of the data. Many structure elucidation problems of the user community fit into this category and existing resources can fulfill these needs, whether man- or computer-implemented. MS cannot solve all structure elucidation problems, however. In such cases, recourse is to other spectroscopic techniques if sample size permits. As described in the introductory section, diverse information is pieced together to achieve a solution. Interactive computer programs can assist in segments of this procedure, with the advantages of exhaustive evaluation of the data and the molecular structures suggested by these data.

In our own and in planned collaborative work, we will call upon the extensive facilities of the chemistry department for acquisition of additional spectroscopic data. These services are financed by fees, paid from existing research grants of the user community. There are sufficient documented examples of structure elucidation problems to obviate the requirement for extensive use of these additional facilities in development of the programs.

On the other hand, the intensive pursuit of mechanized "intelligence" in the domain of MS requires more than availability of public MS data. It requires the collaboration of skilled chemists actively engaged in practical MS research and, at the same time, committed to the exploration of innovations in the application of AI to the solution of the problems.

As in the past, we will develop the computer programs through close collaboration among Drs. Duffield and Smith (and other members of their groups) and the program designers and programmers. For us, this means daily consultation for discussion of strategy, extensions to the program, and solutions to new problems. In particular, we propose to continue software development (on the AIM-SUMEX facility) as follows:

1) The recently completed structure generating algorithm will be the core of our efforts to assist in structure elucidation. The structure generator can guarantee that the correct solution is somewhere in the list of possibilities. Additional programs, such as the Planner, allow us to avoid exhaustive generation in practice. Some parts of the cyclic structure generator program have not been extensively tested yet, and these tests will be the first task to complete.

2) The structure elucidation task is strongly directed toward rejection of whole categories (e.g., compound classes) of solutions as quickly as possible by using as much knowledge about the chemical history or characteristics of a sample as is available. Details of spectroscopic data then define the molecular framework more precisely. Each step in this procedure represents the application of constraints on the set of possible solutions.
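The idea that each piece of knowledge acts as a constraint narrowing the set of possible solutions can be sketched in a few lines. This is only an illustration under simplifying assumptions: it works on elemental compositions rather than on the molecular structures handled by the structure generator, the element limits are arbitrary, and the two constraints (the usual valence limit on hydrogen count for neutral C, H, N, O molecules, and absence of nitrogen) are chosen for the example. None of the names come from the DENDRAL programs.

```python
from itertools import product

# Integer (nominal) masses of the elements considered in this toy example.
NOMINAL_MASS = {"C": 12, "H": 1, "N": 14, "O": 16}


def candidate_compositions(nominal_mass, max_c=4, max_h=10, max_n=2, max_o=3):
    """Enumerate every C/H/N/O composition within the limits with this nominal mass."""
    found = []
    for c, n, o in product(range(max_c + 1), range(max_n + 1), range(max_o + 1)):
        # The hydrogen count is fixed by the remaining mass once C, N, O are chosen.
        h = nominal_mass - (NOMINAL_MASS["C"] * c
                            + NOMINAL_MASS["N"] * n
                            + NOMINAL_MASS["O"] * o)
        if 0 <= h <= max_h:
            found.append({"C": c, "H": h, "N": n, "O": o})
    return found


def within_valence_limit(comp):
    """Constraint: a neutral molecule has at most 2C + N + 2 hydrogens."""
    return comp["H"] <= 2 * comp["C"] + comp["N"] + 2


def no_nitrogen(comp):
    """Constraint assumed for the example: the sample is known to contain no N."""
    return comp["N"] == 0


def apply_constraints(candidates, constraints):
    """Apply constraints in order; return survivors and the count after each step."""
    counts = [len(candidates)]
    for constraint in constraints:
        candidates = [comp for comp in candidates if constraint(comp)]
        counts.append(len(candidates))
    return candidates, counts


if __name__ == "__main__":
    survivors, counts = apply_constraints(
        candidate_compositions(60), [within_valence_limit, no_nitrogen]
    )
    print("candidates remaining after each step:", counts)
    print("survivors:", survivors)
```

For nominal mass 60 with the default limits, the candidate count falls from 9 to 8 to 3 as the two constraints are applied in turn.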
Computational efficiency demands that these constraints be applied early in the generation process when the structure generator is utilized.

We have made some effort to examine the kinds of constraints used by scientists engaged in structure elucidation. We have begun designing strategies so that these constraints can be brought to bear on the structure generator. Some of these strategies involve minor changes to the existing program; others require significant extensions of existing generating functions. One approach which seems particularly attractive to us is presently under development. This approach will utilize the existing structure generator, with some modifications, to generate a dictionary of cyclic skeletons up to those containing a maximum of twelve tertiary vertices. The dictionary will be a complete, irredundant list of ring systems which contain no multiple bonds and no cut-edges (acyclic parts). This dictionary will be organized and [...]

Theory Formation

An important aim of this project is to improve the existing theory formation capabilities and thus provide more assistance to scientists investigating regularities within classes of compounds. This is a theory formation task at a very pragmatic level. The MS theory that the program attempts to find is of the same form as the one practicing mass spectroscopists use for structure elucidation. Thus, resulting pieces of theory are extensions to both the scientists' theory and the computer's theory of the discipline.

To improve this program we need to complete the Plan-Generate-Test program that has been started (as described in the appended annual report) and tune it over many test cases. We also wish to make the programs interactive and easy to use so that they are more readily accessible. This can be done when the programs are transferred to the AIM-SUMEX facility.
We plan to apply the theory formation proqram to two different kinds of data: (a) the data collected in the interest of gnderstandinq the mass spectronetrg of a particular class of conpnlinstosterones, We propose t:, continue to use the INTSUP! program in its present form and as i+ is improved in support of these st.udic?s. I'ha qsnerator of rules that we POW have, 9ULEGEN, does a credible iob of explaininq the reqularities summarized by XNTSUE1. It has foun3, for example, the well-known alpha -cl?avaqe fragmentation PlY3,?2SS an1 beta cleavaqe followed by rearrangement in the low r?sol,lti2n data for fifteen aliphatic amines. The program will be extended in two important ways to increase its utility: (i) the proaram needs to be able to work with an increase3 number of '=szrintive predicates in the qeneration of rules, and (ii) it 1 ._ neeas to h? qiver? a more flexible representation of complex fraqmFntati.on mechanisms so that it can m3ce easily find rules involving note than two bonds. We will continua workinq with low resolution P!S data of the 753-200 nonofunctional aliphatic compounds studied previously in t-he context of the performance proqram. These compounds are dell-under-stood and thus provide a go03 test of the program*s -Ffectiven2ss. _ Tn order to insure generality in the theory Formation nroqrams, we will also test the system aqainst the high rnsoluti.on mass spectra of the 68 estrogrnic steroids. Since they 3re also w211-understood, these compounds will show how well the nroqram can deal with complex ring systems, multifunctional Z3ntP5;1Il?3, cleawaqes involvinq more than two bonds, and high resolution data. "he existinq proqrar?s arc in qsnd working order - within definite Linits - s3 we expect to apply them to new sets of data from the SS 13Sosatory as interest arises. For example, as the high and 10~ resolution KS from marine sterols are collected we expect to use INTSUP! 
and RULEGEN (at least) to assist in the interpretation and generalization of these data. Since these problems will advance the state of knowledge of MS, it is not correct to look on them as test problems. However, in the past the programs developed most rapidly when they were applied to unsolved problems of interest to our colleagues in the chemistry department.

For development of the interactive programs, we will rely heavily on the criteria of acceptability by Stanford users. The programs themselves will be written in INTERLISP on the SUMEX computer. Initially, we will provide interactive access to the control parameters of the programs in order to allow users to tailor their runs to their immediate interests. Later we hope to expand these to allow interrogation of the programs with respect to both contents of the results and the program's reasoning steps.

3. Applications to Biomedical Problems

We can immediately offer to the user community the Planner, for analysis of HRMS in terms of molecular structure. The program is insensitive to the source of the MS data, and we foresee significant use of the program for analysis of spectra of mixtures without prior separation and spectra from the GC/HRMS facility without additional programming effort. Examples of application areas are summarized below.

We wish to exploit our existing capabilities for the analysis of biological mixtures without prior separation (ref. 33). This approach will prove particularly useful in studies of mixtures which are difficult to separate and analyze by GC. Phytoecdysones related to ecdysone, an insect molting hormone, present such a problem. GC of these compounds is very difficult, although high-pressure liquid chromatography has recently been used to carry out separations. This class of compounds represents an interesting and valuable test case for our combined MS and computer techniques, particularly the specification and subsequent acquisition of metastable defocussing data for precise linking of parent and fragment ions in the spectrum of a complex mixture (refs. 28, 33). Model compounds, mixtures and current structure elucidation problems are available (Nakanishi, Columbia; Takemoto, Tohoku University, Sendai, Japan). Although most users cannot be completely specific as to the nature of their future structure elucidation problems, we feel that several of these problems can be handled by such an approach.

As the structure generator and its extensions are developed further, we foresee continuing use of an interactive version applied to specific problems of the user community. As an example, the work in collaboration with the GRC project will involve studies of several classes of compounds extracted from human body fluids (e.g., aromatic and aliphatic acids, various classes of bases, amino acids and carbohydrates) which contain representatives varying by substitutions about a small number of molecular skeletons. The generator can define all isomers which must be considered as possible solutions.

For those problems which are amenable to attack by library search procedures, e.g., screening of GC/LRMS runs of marine sterols to weed out known compounds, we propose to use these procedures and to investigate extensions to them. Using a procedure related to that described by McLafferty [K.-S. Kwok, et al., J. Amer. Chem. Soc., 95, 4185 (1973)], we seek to determine from modified library search techniques the known structures which yield similar spectra. Utilizing the DENDRAL structural manipulation routines, we would then seek to determine those related structures (whose spectra are not in the library) which are possible solutions. A library, including Wiswesser Line Notation names, exists (F. W.
McLafferty, private communication) and would be of some utility in this work.

The MS facility in conjunction with our programs will be used in studies of the following nature:

1) Prof. Djerassi - we plan use of the MS facilities and computer programs in ongoing research connected with existing NIH-supported studies on steroids and marine sterols and continued collaboration with Prof. Adlercreutz on estrogen mixtures isolated from body fluids. Further collaboration with Prof. Adlercreutz will be on structural studies of new estrogen metabolites whose presence in mixtures has been inferred through our previous collaborative efforts. The work on marine sterols presently utilizes GC/LRMS and frequently laborious separation procedures to isolate individual fractions for HRMS analysis. GC/HRMS will be a significant assistance in this effort. We plan MS studies of known marine sterols (utilizing INTSUM) to derive fragmentation rules, which will then be used in the Planner to aid structure elucidation of new compounds. We also plan further work on extensions of MS theory in the steroid field, initially focussed on additional biomedically important classes of steroids related to the pregnane (progesterones) and androstane (testosterones) skeletons. This work is currently being carried out by Dr. Smith in collaboration with two visiting senior scientists (Dr. Roy Gritter, Dr. Geoff Dromey) currently on sabbatical leave fellowships.

2) Chemistry Department Collaborators - as indicated by the responses summarized in the letters of interest (Appendix A), there is significant interest in use of the MS facility by other NIH-supported members of the chemistry department. All those listed are familiar with the technique of MS as applied to structure elucidation problems. Most have used MS frequently, particularly Prof. Van Tamelen in his studies of the cyclization of squalene and related studies in the terpenoid and steroid field.
The interests of these collaborators are generally in HRMS and GC/HRMS, with occasional use of other capabilities of the system. The types of compounds studied by this group and an indication of the amount of use expected are summarized in the letters of interest.

3) Genetics Research Center (GRC): (Profs. J. Lederberg, H. Cann; Dr. A. Duffield) The body fluids analyzed by GC/LRMS to date include urine, blood, amniotic fluid and cerebrospinal fluid. Each body fluid is fractionated into the following compound classes:
a) organic acids and neutral compounds
b) amino acids
c) carbohydrates
which after appropriate derivatization are analyzed by the GC/LRMS/computer system. A library of known LRMS will serve as the primary means of identifying metabolites from their experimentally recorded LRMS.

In those instances where the LRMS is insufficient for metabolite identification, GC/HRMS data will be necessary to determine the composition of all ions in its mass spectrum. These data will greatly enhance the prospects of identifying the metabolite in question. It is known (on past performance) that if a compound is present in body fluids at the level of 1 microgram per GC peak then good quality HRMS will be recorded (ion amplitude dynamic range of 1:100, mass accuracy of ±5 ppm) using the Varian MAT 711 mass spectrometer. If the GC peak of interest contains insufficient material for a HRMS scan then preparative GC could be used to concentrate that portion of the chromatograph effluent prior to GC/HRMS.

Prior to the demise of the ACME computer system (July 31, 1973) we developed a GC/HRMS system and applied it to the analysis of extracts from body fluids. The following example represents results obtained with this system during its development. The example used was a routine analysis and was run to determine the capability of the overall system during its development and not as an unknown sample of extreme interest.
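The link between mass accuracy and the determination of ion compositions mentioned above can be illustrated with a short sketch. It is not the program used in the laboratory; it is a minimal brute-force search, restricted to C, H, N and O, for compositions whose monoisotopic mass falls within a given tolerance of a measured mass. The function names, the element limits and the sample mass are assumptions chosen for illustration, and the sketch ignores the electron mass and performs no valence check.

```python
# Monoisotopic masses (u) of the most abundant isotopes.
MASS = {"C": 12.000000, "H": 1.007825, "N": 14.003074, "O": 15.994915}


def ppm_to_mmu(mass, ppm):
    """Convert a relative tolerance (ppm) at a given mass to millimass units."""
    return mass * ppm * 1e-3


def candidate_compositions(measured_mass, tol_mmu, max_n=4, max_o=6):
    """Return (C, H, N, O, error_mmu) tuples lying within tol_mmu of the mass.

    max_n and max_o are illustrative search limits, not values from the text.
    """
    tol = tol_mmu / 1000.0
    hits = []
    max_c = int((measured_mass + tol) // MASS["C"])
    for c in range(max_c + 1):
        for n in range(max_n + 1):
            for o in range(max_o + 1):
                heavy = c * MASS["C"] + n * MASS["N"] + o * MASS["O"]
                remainder = measured_mass - heavy
                if remainder < -tol:
                    break  # adding more oxygen can only overshoot further
                # Hydrogen is the only element left, so its count is fixed.
                h = max(round(remainder / MASS["H"]), 0)
                error = heavy + h * MASS["H"] - measured_mass
                if abs(error) <= tol:
                    hits.append((c, h, n, o, round(error * 1000.0, 3)))
    return hits


if __name__ == "__main__":
    measured = 46.0419  # hypothetical measured mass, close to C2H6O
    for tol in (ppm_to_mmu(measured, 5), 4.0):
        print(f"tolerance {tol:.2f} mmu:", candidate_compositions(measured, tol))
```

For this hypothetical mass, a 5 ppm window leaves only C2H6O, while a fixed 4 mmu window also admits H4N3, which a valence check would then have to reject. Tighter mass accuracy therefore reduces the number of compositions that later stages must consider.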
The total ion plot recorded during the lifetime of the GC/HRMS analysis of an amniotic fluid is reproduced as Figure 1. A complete high resolution scan was recorded on each of the peaks shown in Figure 1. Filing time of the time-shared ACME computer system did not allow the system to operate in a repetitive scan mode. For the sake of brevity only the GC/HRMS scan (# 1594, Figure 3) corresponding to the glutamic acid N-TFA O-n-butyl ester derivative is reproduced. (The corresponding GC/LRMS scan is Figure 2.) The scan time per decade of mass was 10.5 seconds, the resolution 6,500 and the matching tolerance for the assignment of empirical composition was set to 4 mmu. The results show that the system was capable of accurate mass measurement with a dynamic range in ion amplitude of about 33:1 in this instance.

The cessation of computer support for the GC/HRMS system prevented a HRMS analysis which was crucial to the identification of a metabolite present in a body fluid. Since that time, however, several instances have arisen where GC/HRMS data would have been collected in an effort to identify metabolites not previously seen.

The sample throughput in the GRC project with existing personnel is expected to approach 5 to 7 body fluids per week (15-21 GC/LRMS fractions to be run in the Genetics Department per week). On average GC/HRMS would be required on 1 - 2 samples per week.

The research interests of the Medical School collaborators relative to the proposed MS resource are summarized in the letters of interest (Appendix A). The MS services required by this community will include GC/LRMS (Forrest, Sera, Kalman for drug and drug metabolite identification, Rabinowitz and Wilkinson for prostaglandin identification, Robin for identification of oxidized/reduced redox pairs, Hollister for marihuana metabolites, Barchas, neurotransmitters, Fair, polyamines and the prostatic antibacterial factor in urine); GC/HRMS (Trudell, drug metabolite identification, Kvenvolden, structure of amino acids and related compounds plus samples as required from interests described under GC/LRMS).

In those instances where the biological extract contains insufficient material for a GC/HRMS scan, preparative GC, using existing instrumentation within the chemistry department, can be used to concentrate the material prior to the GC/HRMS analysis. If the material of interest is obtained relatively pure by this technique then HRMS analysis using direct sample insertion into the ion source would be utilized.

As mentioned above, several of the computer programs have immediate utility for assisting with structure elucidation problems. For example, the Structure Generator program can answer structural isomerism questions independently of mass spectrometry (e.g., to provide lists of isomers in conjunction with isomer interconversion problems such as carbonium ion rearrangements). Because the program will be able to generate complete lists of isomers with (or without) some specified structural features, a researcher can have confidence that no possibilities have been overlooked.

Some interest in the structure generator has been expressed by representatives of the pharmaceutical industry. The generator could be used to suggest complete sets of structural alternatives for possible synthesis, once a physiologically active congener has been identified. In more general terms, the structure generator can be highly suggestive of new, unexplored areas of synthetic organic chemistry. For example, the generator has been used by a graduate student in chemistry, Mr. Jan Simek, to identify the space of possible Diels-Alder condensation products consisting of six atoms of any combination of carbon, nitrogen, oxygen, and sulfur in a six-membered ring with one double bond.
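The notion of a complete, irredundant enumeration can be made concrete with a small sketch. The following is not the DENDRAL structure generator and does not reproduce Mr. Simek's results; it is a simplified illustration that lists the ways of placing C, N, O and S around a six-membered ring with one double bond, counting each arrangement once regardless of the direction in which the ring is read. The rule that only C or N may sit at the double bond is an assumption made for this example, and hydrogens, stability and synthetic accessibility are ignored.

```python
from itertools import product

ATOMS = "CNOS"
DOUBLE_BOND_ATOMS = "CN"  # assumption: divalent O and S cannot carry the ring double bond


def canonical(ring):
    """Pick one representative of a ring and its mirror image.

    The double bond is fixed between positions 0 and 1, so the only ring
    symmetry that preserves it is the reflection 0<->1, 2<->5, 3<->4.
    """
    mirrored = (ring[1], ring[0], ring[5], ring[4], ring[3], ring[2])
    return min(ring, mirrored)


def enumerate_rings():
    """Exhaustive (every labelling visited) and irredundant (mirror images merged)."""
    seen = set()
    for ring in product(ATOMS, repeat=6):
        if ring[0] not in DOUBLE_BOND_ATOMS or ring[1] not in DOUBLE_BOND_ATOMS:
            continue  # constraint applied before the candidate is stored
        seen.add(canonical(ring))
    return sorted(seen)


if __name__ == "__main__":
    rings = enumerate_rings()
    print(len(rings), "distinct ring skeletons under these assumptions")
    print("first few:", ["".join(r) for r in rings[:5]])
```

A list produced this way could then be compared against a catalogue of known ring systems to find entries with no literature precedent.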
A literature search through the Ring Index revealed that many of the ring systems have never been reported.

Structure elucidation is an important and difficult problem for biomedical scientists. Many of them lack the detailed chemical background necessary to be efficient in this endeavor. Generally speaking, they also lack the frequently complex and expensive equipment (e.g., high resolution mass spectrometers) to provide spectroscopic data to assist them in solving problems of molecular structure. We plan to provide the chemical and analytical expertise to facilitate the solution of their structural problems.

This research aims at providing more powerful techniques for determining molecular structures than are now routinely available. In particular, we have proposed (a) providing extended MS services as a means of collecting powerful analytic data for scientists; (b) developing (and extending) sophisticated computer programs to assist with the interpretation of the data from mass spectrometry and elsewhere; (c) developing (and extending) novel computer programs to assist with formulation of the rules of interpretation; and (d) applying these state-of-the-art techniques to problems of biomedical relevance. Our research group is thus dedicated to a broad-based attack on the applications of structure elucidation to biological and biomedical problems.

The proposed research not only holds promise for significant long-term advances, it can have immediate benefits as well. Many members of the biomedical community at Stanford have called upon the MS laboratory for assistance in the past and will continue to do so in the future. The proposed resource will provide the conduit for a substantial increase in the utilization of MS within the Stanford biomedical community.
The ability of the proposed resource to interpret the experimental data it generates (enhanced by the close proximity of the resource and biomedical community) should result in a successful program of interdisciplinary research.

HRMS is an important source of data for these problems, and GC/HRMS is still more important. Previous investment by the NIH in the Varian MAT-711 HRMS system at Stanford can be utilized now and built upon for the future. Continued operation of the GC/MS system will give the Stanford community access to state-of-the-art spectroscopic techniques and to professional mass spectroscopists who can help with ongoing problems.

The computer programs themselves constitute a unique resource for assisting with the structure determination. The previous NIH grant supported development of the programs. In part, we are requesting funds to exploit these programs.

One of the most significant aspects of this work is its interdisciplinary view of solving molecular structure problems by intelligently directed search of the space of chemical graph structures. As a result of posing the structure determination problem in this framework, we have been able to further the knowledge about structure elucidation in at least three ways. First, some of the knowledge used by analytical chemists has been made more precise for use in a computer program. Second, codifying such knowledge for the computer has led to the discovery of new research areas to extend our existing knowledge of MS. Several publications listed in the bibliography (Refs. 42 and following) are reports of exactly this kind of research. Finally, the computer's systematic search through the space of possible structures gives the practicing scientist the confidence that no structures were merely overlooked.
The efficiency of the program depends on the exclusion of many whole classes of compounds, but the computer will have rejected those classes using precise, explicitly stated criteria.

Our recent work on finding MS interpretation rules (theory formation) can provide additional unique capabilities for assisting with the problem solving. We wish to continue this research because it offers hope for a solution to the problem of furnishing real-world knowledge to computer programs -- in particular to the computer programs that assist with structure elucidation. This is a pressing problem in current AI research. High performance programs, of which DENDRAL is most often cited, derive their power from large stores of knowledge. Yet there are no routine methods for infusing such systems with knowledge of the task domain. We believe our research in theory formation holds a key to the solution of this problem.

V. FACILITIES & EQUIPMENT

The Stanford Mass Spectrometry Laboratory will provide MS services on the Varian MAT-711 mass spectrometer coupled with a Hewlett-Packard gas chromatograph (Model 76108). As service instruments for more routine mass spectral analyses, the laboratory has MS-9 and CH-4 mass spectrometers. Data reduction is currently provided on Stanford's IBM 370/158 computer in conjunction with a front-end PDP-11/20 data acquisition computer. (The PDP-11/20 presently has only the capability for buffering peak profile data between the mass spectrometer and the IBM 370/158 computer at the Stanford Computer Center.) An alternative to buying time on the 370/158 is proposed and discussed in the budget justification.

The AI programs will be run on the NIH-sponsored AIM-SUMEX computer facility (a PDP-10 computer with the TENEX operating system, 192K words of memory, and adequate peripherals for our purposes). Running these programs on SUMEX will incur no charge.

A. DENDRAL PUBLICATIONS

(1) J.
Lederberg, "DENDRAL-64 - A System for Computer Construction, Enumeration and Notation of Organic Molecules as Tree Structures and Cyclic Graphs", (technical reports to NASA, also available from the author and summarized in (12)).
(1a) Part I. Notational algorithm for tree structures (1964) CR.57029
(1b) Part II. Topology of cyclic graphs (1965) CR.68898
(1c) Part III. Complete chemical graphs; embedding rings in trees (1969)
(2) J. Lederberg, "Computation of Molecular Formulas for Mass Spectrometry", Holden-Day, Inc. (1964).
(3) J. Lederberg, "Topological Mapping of Organic Molecules", Proc. Nat. Acad. Sci., 53:1, January 1965, pp. 134-139.
(4) J. Lederberg, "Systematics of organic molecules, graph topology and Hamilton circuits. A general outline of the DENDRAL system." NASA CR-68899 (1965).
(5) J. Lederberg, "Hamilton Circuits of Convex Trivalent Polyhedra (up to 18 vertices)", Am. Math. Monthly, May 1967.
(6) G. L. Sutherland, "DENDRAL - A Computer Program for Generating and Filtering Chemical Structures", Stanford Artificial Intelligence Project Memo No. 49, February 1967.
(7) J. Lederberg and E. A. Feigenbaum, "Mechanization of Inductive Inference in Organic Chemistry", in B. Kleinmuntz (ed) Formal Representations for Human Judgment, (Wiley, 1968) (also Stanford Artificial Intelligence Project Memo No. 54, August 1967).
(8) J. Lederberg, "Online computation of molecular formulas from mass number." NASA CR-94977 (1968).
(9) E. A. Feigenbaum and B. G. Buchanan, "Heuristic DENDRAL: A Program for Generating Explanatory Hypotheses in Organic Chemistry", in Proceedings, Hawaii International Conference on System Sciences, B. K. Kinariwala and F. F. Kuo (eds), University of Hawaii Press, 1968.
(10) B. G. Buchanan, G. L. Sutherland, and E. A. Feigenbaum, "Heuristic DENDRAL: A Program for Generating Explanatory Hypotheses in Organic Chemistry". In Machine Intelligence 4 (B. Meltzer and D.
Michie, eds) Edinburgh University Press (1969). (Also Stanford Artificial Intelligence Project Memo No. 62, July [...])
(11) E. A. Feigenbaum, "Artificial Intelligence: Themes in the Second Decade". In Final Supplement to Proceedings of the IFIP68 International Congress, Edinburgh, August 1968 (also Stanford Artificial Intelligence Project Memo No. 67, August 1968).
(12) J. Lederberg, "Topology of Molecules", in The Mathematical Sciences - A Collection of Essays, (ed.) Committee on Support of Research in the Mathematical Sciences (COSRIMS), National Academy of Sciences - National Research Council, M.I.T. Press, (1969), pp. 37-51.
(13) G. Sutherland, "Heuristic DENDRAL: A Family of LISP Programs", Stanford Artificial Intelligence Project Memo No. 80, March 1969.
(14) J. Lederberg, G. L. Sutherland, B. G. Buchanan, E. A. Feigenbaum, A. V. Robertson, A. M. Duffield, and C. Djerassi, "Applications of Artificial Intelligence for Chemical Inference I. The Number of Possible Organic Compounds: Acyclic Structures Containing C, H, O and N". Journal of the American Chemical Society, 91:11 (May 21, 1969).
(15) A. M. Duffield, A. V. Robertson, C. Djerassi, B. G. Buchanan, G. L. Sutherland, E. A. Feigenbaum, and J. Lederberg, "Application of Artificial Intelligence for Chemical Inference II. Interpretation of Low Resolution Mass Spectra of Ketones". Journal of the American Chemical Society, 91:11 (May 21, 1969).
(16) B. G. Buchanan, G. L. Sutherland, E. A. Feigenbaum, "Toward an Understanding of Information Processes of Scientific Inference in the Context of Organic Chemistry", in Machine Intelligence 5, (B. Meltzer and D. Michie, eds) Edinburgh University Press (1970), (also Stanford Artificial Intelligence Project Memo No. 99, September 1969).
(17) J. Lederberg, G. L. Sutherland, B. G. Buchanan, and E. A. Feigenbaum, "A Heuristic Program for Solving a Scientific Inference Problem: Summary of Motivation and Implementation", in R.
Banerji & M.D. Mesarovic (eds.) Theoretical Approaches to Non-Numerical Problem Solving, New York: Springer-Verlag, 1970. (Also Stanford Artificial Intelligence Project Memo No. 104, November 1969.)
(18) C. W. Churchman and B. G. Buchanan, "On the Design of Inductive Systems: Some Philosophical Problems". British Journal for the Philosophy of Science, 20 (1969), pp. 311-323.
(19) G. Schroll, A. M. Duffield, C. Djerassi, B. G. Buchanan, G. L. Sutherland, E. A. Feigenbaum, and J. Lederberg, "Application of Artificial Intelligence for Chemical Inference III. Aliphatic Ethers Diagnosed by Their Low Resolution Mass Spectra and NMR Data". Journal of the American Chemical Society, 91:26 (December 17, 1969).
(20) A. Buchs, A. M. Duffield, G. Schroll, C. Djerassi, A. B. Delfino, B. G. Buchanan, G. L. Sutherland, E. A. Feigenbaum, and J. Lederberg, "Applications of Artificial Intelligence For Chemical Inference. IV. Saturated Amines Diagnosed by Their Low Resolution Mass Spectra and Nuclear Magnetic Resonance Spectra", Journal of the American Chemical Society, 92, 6831 (1970).
(21) Y.M. Sheikh, A. Buchs, A.B. Delfino, G. Schroll, A.M. Duffield, C. Djerassi, B.G. Buchanan, G.L. Sutherland, E.A. Feigenbaum and J. Lederberg, "Applications of Artificial Intelligence for Chemical Inference V. An Approach to the Computer Generation of Cyclic Structures. Differentiation Between All the Possible Isomeric Ketones of Composition C6H10O", Organic Mass Spectrometry, 4, 493 (1970).
(22) A. Buchs, A.B. Delfino, A.M. Duffield, C. Djerassi, B.G. Buchanan, E.A. Feigenbaum and J. Lederberg, "Applications of Artificial Intelligence for Chemical Inference VI. Approach to a General Method of Interpreting Low Resolution Mass Spectra with a Computer", Helvetica Chimica Acta, 53, 1394 (1970).
(23) E.A. Feigenbaum, B.G. Buchanan, and J. Lederberg, "On Generality and Problem Solving: A Case Study Using the DENDRAL Program". In Machine Intelligence 6 (B.
Meltzer and D. Michie, eds.) Edinburgh University Press (1971). (Also Stanford Artificial Intelligence Project Memo No. 137.)
(24) A. Buchs, A.B. Delfino, C. Djerassi, A.M. Duffield, B.G. Buchanan, E.A. Feigenbaum, J. Lederberg, G. Schroll, and G.L. Sutherland, "The Application of Artificial Intelligence in the Interpretation of Low-Resolution Mass Spectra", Advances in Mass Spectrometry, 5, 314.
(25) B.G. Buchanan and J. Lederberg, "The Heuristic DENDRAL Program for Explaining Empirical Data". In proceedings of the IFIP Congress 71, Ljubljana, Yugoslavia (1971). (Also Stanford Artificial Intelligence Project Memo No. 141.)
(26) B.G. Buchanan, E.A. Feigenbaum, and J. Lederberg, "A Heuristic Programming Study of Theory Formation in Science." In proceedings of the Second International Joint Conference on Artificial Intelligence, Imperial College, London (September, 1971). (Also Stanford Artificial Intelligence Project Memo No. 145.)
(27) B.G. Buchanan, A.M. Duffield, A.V. Robertson, "An Application of Artificial Intelligence to the Interpretation of Mass Spectra", Mass Spectrometry Techniques and Applications, Edited by G. W. A. Milne, John Wiley & Sons, Inc., 1971, p. 121-77.
(28) D.H. Smith, B.G. Buchanan, R.S. Engelmore, A.M. Duffield, A. Yeo, E.A. Feigenbaum, J. Lederberg, and C. Djerassi, "Applications of Artificial Intelligence for Chemical Inference VIII. An Approach to the Computer Interpretation of the High Resolution Mass Spectra of Complex Molecules. Structure Elucidation of Estrogenic Steroids", Journal of the American Chemical Society, 94, 5952-5973 (1972).
(29) B.G. Buchanan, E.A. Feigenbaum, and N.S. Sridharan, "Heuristic Theory Formation: Data Interpretation and Rule Formation". In Machine Intelligence 7, Edinburgh University Press (1972).
(30) J. Lederberg, "Rapid Calculation of Molecular Formulas from Mass Values". Jnl. of Chemical Education, 49, 613 (1972).
(31) H. Brown, L. Masinter, L. Hjelmeland, "Constructive Graph Labeling Using Double Cosets", Discrete Mathematics (in press). (Also Computer Science Memo 319, 1972.)
(32) B. G. Buchanan, Review of Hubert Dreyfus' "What Computers Can't Do: A Critique of Artificial Reason", Computing Reviews (January, 1973). (Also Stanford Artificial Intelligence Project Memo No. [...])
(33) D. H. Smith, B. G. Buchanan, R. S. Engelmore, H. Adlercreutz and C. Djerassi, "Applications of Artificial Intelligence for Chemical Inference IX. Analysis of Mixtures Without Prior Separation as Illustrated for Estrogens". Journal of the American Chemical Society, 95, 6075 (1973).
(34) D. H. Smith, B. G. Buchanan, W. C. White, E. A. Feigenbaum, C. Djerassi and J. Lederberg, "Applications of Artificial Intelligence for Chemical Inference X. INTSUM. A Data Interpretation Program as Applied to the Collected Mass Spectra of Estrogenic Steroids", Tetrahedron, 29, 3117 (1973).
(35) B. G. Buchanan and N. S. Sridharan, "Rule Formation on Non-Homogeneous Classes of Objects". In proceedings of the Third International Joint Conference on Artificial Intelligence (Stanford, California, August, 1973). (Also Stanford Artificial Intelligence Project Memo No. 215.)
(36) D. Michie and B.G. Buchanan, "Current Status of the Heuristic DENDRAL Program for Applying Artificial Intelligence to the Interpretation of Mass Spectra". August, 1973.
(37) H. Brown and L. Masinter, "An Algorithm for the Construction of the Graphs of Organic Molecules", Discrete Mathematics (in press). (Also Stanford Computer Science Department Memo STAN-CS-73-367, May, 1973.)
(38) D.H. Smith, L.M. Masinter and N.S. Sridharan, "Heuristic DENDRAL: Analysis of Molecular Structure," Proceedings of the NATO/CNA Advanced Study Institute on Computer Representation and Manipulation of Chemical Information, in press.
(39) R. Carhart and C.
Djerassi, "Applications of Artificial Intelligence for Chemical Inference XI: The Analysis of C13 NMR Data for Structure Elucidation of Acyclic Amines", J. Chem. Soc. (Perkin II), 1753 (1973).
(40) L. Masinter, N. Sridharan, and D.H. Smith, "Applications of Artificial Intelligence for Chemical Inference XII: Exhaustive Generation of Cyclic and Acyclic Isomers", submitted to Journal of the American Chemical Society.
(41) L. Masinter, N. Sridharan, R. Carhart and D.H. Smith, "Applications of Artificial Intelligence for Chemical Inference XIII: An Algorithm for Labelling Chemical Graphs", submitted to Journal of the American Chemical Society.
(42) The Determination of Phenylalanine in Serum by Mass Fragmentography. Clinical Biochem., 6 (1973). By W.E. Pereira, V.A. Bacon, Y. Hoyano, R. Summons and A.M. Duffield.
(43) The Simultaneous Quantitation of Ten Amino Acids in Soil Extracts by Mass Fragmentography. Anal. Biochem., 55, 236 (1973). By W.E. Pereira, Y. Hoyano, W.E. Reynolds, R.E. Summons and A.M. Duffield.
(44) An Analysis of Twelve Amino Acids in Biological Fluids by Mass Fragmentography. Anal. Chem., in press. By R.E. Summons, W.E. Pereira, W.E. Reynolds, T.C. Rindfleisch and A.M. Duffield.
(45) The Quantitation of beta-Aminoisobutyric Acid in Urine by Mass Fragmentography. Clin. Chim. Acta, in press. By W.E. Pereira, R.E. Summons, W.E. Reynolds, T.C. Rindfleisch and A.M. Duffield.
(46) The Determination of Ethanol in Blood and Urine by Mass Fragmentography. Clin. Chim. Acta, in press. By W.E. Pereira, R.E. Summons, T.C. Rindfleisch and A.M. Duffield.

Publications Describing DENDRAL-Related Research But Not Funded By This Grant

(47) An Automated Gas Chromatographic Analysis of Phenylalanine in Serum. Clinical Biochem., 5, 166 (1972). By E. Steed, W. Pereira, B. Halpern, M.D. Solomon and A.M. Duffield.
(48) Pyrrolizidine Alkaloids. XIX. Structure of the Alkaloid Erucifoline. Coll. Czech. Chem. Commun., 37, 4112 (1972).
By P. Sedmera, A. Klasek, A.M. Duffield and F. Santavy.
(49) Chlorination Studies I. The Reaction of Aqueous Hypochlorous Acid with Cytosine. Biochem. Biophys. Res. Commun., 48, 880 (1972). By W. Patton, V. Bacon, A.M. Duffield, B. Halpern, Y. Hoyano, W. Pereira and J. Lederberg.
(50) A Study of the Electron Impact Fragmentation of Promazine Sulphoxide and Promazine using Specifically Deuterated Analogues. Austral. J. Chem., 25, 325 (1973). By M.D. Solomon, R. Summons, W. Pereira and A.M. Duffield.
(51) Spectrometrie de Masse VIII. Elimination d'eau Induite par Impact Electronique dans le Tetrahydro-1,2,3,4-Naphtalenediol-1,2. Org. Mass Spectrom., 7, 357 (1973). By P. Perros, J.P. Morizur, J. Kossanyi and A.M. Duffield.
(52) Chlorination Studies II. The Reaction of Aqueous Hypochlorous Acid with alpha-Amino Acids and Dipeptides. Biochim. et Biophys. Acta, 313, 170 (1973). By W.E. Pereira, Y. Hoyano, R. Summons, V.A. Bacon and A.M. Duffield.
(53) Spectrometrie de Masse. IX. Fragmentations Induites par Impact Electronique de Glycols en Serie Tetraline. Bull. Chim. Soc. France, 2105 (1973). By P. Perros, J.P. Morizur, J. Kossanyi and A.M. Duffield.
(54) The Use of Mass Spectrometry for the Identification of Metabolites of Phenothiazines. Proceedings of the Third International Symposium on Phenothiazines, Raven Press, New York, 1973. By A.M. Duffield.
(55) Chlorination Studies IV. The Reaction of Aqueous Hypochlorous Acid with Pyrimidine and Purine Bases. Biochem. Biophys. Res. Commun., 53, 1195 (1973). By Y. Hoyano, V. Bacon, R.E. Summons, W.E. Pereira, B. Halpern and A.M. Duffield.
(56) Mass Spectrometry in Structural and Stereochemical Problems. CCXXXVII. Electron Impact Induced Hydrogen Losses and Migrations in Some Aromatic Amides. Org. Mass Spectrom., in press. By A.M. Duffield, G. deMartino and C. Djerassi.
(57) Stable Isotope Mass Fragmentography: Quantitation and Hydrogen-Deuterium Exchange Studies of Eight Murchison Meteorite Amino Acids. Geochim. et Cosmochim. Acta, submitted for publication. By W.E. Pereira, R.E. Summons, T.C. Rindfleisch, A.M. Duffield, B. Zeitman and J.G. Lawless.
(58) Mass Spectrometry in Structural and Stereochemical Problems CLXXXIII. A Study of the Electron Impact Induced Fragmentation of Aliphatic Aldehydes. J. Amer. Chem. Soc., 91, 6814 (1969). By R.J. Liedtke and C. Djerassi.
(59) Mass Spectrometry in Structural and Stereochemical Problems - CXCVII. Electron-Impact Induced Functional Group Interaction in 4-Benzyloxycyclohexyl Trimethylsilyl Ether. Org. Mass Spectrom., 3, 257 (1970). By Paul D. Woodgate, Robin T. Gray and Carl Djerassi.
(60) Mass Spectrometry in Structural and Stereochemical Problems - CXCVIII. A Study of the Fragmentation Processes of Some alpha,beta-Unsaturated Aliphatic Ketones. Org. Mass Spectrom., 4, 273 (1971). By Younus M. Sheikh, A.M. Duffield and Carl Djerassi.
(61) Mass Spectrometry in Structural and Stereochemical Problems [...]. Interaction of Remote Functional Groups in Acyclic Systems upon Electron Impact. J. Org. Chem., 36, 1796 (1971). By M. Sheehan, R.J. Spangler, M. Ikeda and C. Djerassi.
(62) Mass Spectrometry in Structural and Stereochemical Problems CCVII. Fragmentation of Unsaturated Ethers. Org. Mass Spectrom., 5, 895 (1971). By J.P. Morizur and C. Djerassi.
(63) Mass Spectrometry in Structural and Stereochemical Problems CCVIII. The Effect of Double Bonds Upon the McLafferty Rearrangement of Carbonyl Compounds. J. Amer. Chem. Soc., 94, 473 (1972). By J.R. Dias, Y.M. Sheikh and C. Djerassi.
(64) Mass Spectrometry in Structural and Stereochemical Problems CCXV. Behavior of Phenyl-Substituted alpha,beta-Unsaturated Ketones Upon Electron Impact. Promotion of Hydrogen Rearrangement Processes. J. Org. Chem., 37, 776 (1972). By R.J. Liedtke, A.F. Gerrard, J. Diekman and C. Djerassi.
(65) Mass Spectrometry in Structural and Stereochemical Problems CCXXII. Delineation of Competing Fragmentation Pathways of Complex Molecules from a Study of Metastable Ion Transitions of Deuterated Derivatives. Org. Mass Spectrom., 7, 367 (1973). By D.H. Smith, A.M. Duffield and C. Djerassi.

(66) The Carbon-13 Magnetic Resonance Spectra of Acyclic Aliphatic Amines. J. Amer. Chem. Soc., 95, 3710 (1973). By H. Eggert and C. Djerassi.

(67) The Carbon-13 Nuclear Magnetic Resonance Spectra of Keto Steroids. J. Org. Chem., 38, 3788 (1973). By H. Eggert and C. Djerassi.

(68) Mass Spectrometry in Structural and Stereochemical Problems CCXXXVIII. The Effect of Heteroatoms Upon the Mass Spectrometric Fragmentation of Cyclohexanones. J. Org. Chem., in press. By J.H. Block, D.H. Smith, and C. Djerassi.

(69) Mass Spectrometry in Structural and Stereochemical Problems CCXXXIV. Applications of DADI, a Technique for Study of Metastable Ions, to Mixture Analysis. J. Amer. Chem. Soc., submitted for publication. By D.H. Smith, C. Djerassi, K.H. Maurer, and U. Rapp.

(70) Mass Spectrometry in Structural and Stereochemical Problems. The Fragmentation of Progesterone and Alkyl-Substituted Progesterones, in preparation, by S. Hammerum and C. Djerassi.

(71) "Applications of Molecular Orbital Theory to the Interpretation of Mass Spectra: Prediction of Primary Fragmentation Sites in Organic Molecules," Org. Mass Spectrom., 7, 1241 (1973), by G. Loew, M. Chadwick, and D.H. Smith.

PROGRESS REPORT

TEXT OF 1973 ANNUAL REPORT FOR RESEARCH PROJECT: RESOURCE-RELATED RESEARCH -- COMPUTERS AND CHEMISTRY

Progress Report

Part A. APPLICATIONS OF ARTIFICIAL INTELLIGENCE TO MASS SPECTROMETRY

OBJECTIVES:

Research activities carried out under Part A of this project have been directed toward extending the reasoning power of Heuristic DENDRAL.
Heuristic DENDRAL represents a paradigm for attacking problems in one of the major areas of importance to any scientific discipline dealing with molecules, the area of structure elucidation. We have focused our attention on the use of heuristic programming techniques for analysis of mass spectra and ancillary analytical data which can be obtained utilizing a mass spectrometer.

It is convenient to discuss objectives, progress and plans by examining three broad areas of activity in research connected with Part A. We wish to note that these areas conform to our overall strategy of PLAN-GENERATE-TEST. We have shown, earlier, how powerful this strategy is when applied to the task of structure elucidation utilizing mass spectral data. The areas and their objectives are the following:

(I) PLANNER:
(a) Extend the programs used for structure elucidation to structural analysis of complex molecules.
(b) Assess the capabilities and limitations of the PLANNER.
(c) Generalize the programming techniques to reduce compound class dependence.
(d) Explore the utility of ancillary data available from the mass spectrometer.

(II) STRUCTURE GENERATOR:
(a) Complete the exhaustive, irredundant generator of molecular structures.
(b) Develop efficient constraints on the generator to exploit its potential utility.
(c) Exploit the concepts developed for the structure generator in solving various structure-problems (related to m.s. and others) and isomer-problems.

(III) PREDICTOR:
(a) Extend the Predictor to still more complex molecular structures.
(b) Explore the design of experimental strategies, utilizing Predictor functions, to differentiate among candidate solutions.

We point out that the PLAN-GENERATE-TEST strategy, although applied to structure elucidation, has potential utility as a strategy for solving other chemical problems.
Similarly, although we utilize mass spectral data almost exclusively, the same heuristic programming techniques allow facile extension to analysis of data from other types of analytical instrumentation. These were not objectives of the original research proposal but seem logical extensions for future work. We have illustrated the potential of these techniques for analysis of 13C NMR data (Carhart and Djerassi, 1973). This is discussed briefly under the PLANS section, below.

PROGRESS:

(I) PLANNER

The function of the Planner is to analyze mass spectral data acquired on a compound. The Planner attempts to derive structural information from these data using the rules of behavior of compounds in the mass spectrometer.

Objective (a): Extend Programs.

The Planner is presently embodied in a program which also contains a set of functions to assemble this structural information into complete molecules (a primitive Structure Generator) and to test these molecular structures with other, not necessarily mass spectral, rules (a primitive Predictor). This performance program was written in this way to provide a useful tool for chemical studies while more general versions of the Structure Generator and Predictor were being developed. This program and its performance have been described in some detail in a publication and in previous progress reports. A manuscript (Smith, et al., 1973) has now appeared describing the application of this program to the analysis of mixtures of compounds without prior separation.

Objective (b): Assess Capabilities.

We have extended the capabilities of the Planner so that we can analyze both low and high resolution mass spectral data. A low resolution mass spectrum is regarded by the program as a pseudo high resolution spectrum wherein possible elemental compositions of each peak are limited only by the inferred molecular formula of the compound.
This results in more ambiguity with a commensurate increase in number of candidate solutions, as would be expected considering the lower specificity of low resolution data as compared to high resolution data.

We have extended our capabilities for molecular ion determination utilizing a heuristic search technique through the space of plausible molecular ions. This technique has had significant success even when dealing with the low resolution mass spectra of compounds which display no molecular ion, for example the class of derivatized amino acids (trifluoroacetyl, n-butyl esters) important to studies carried out under Part B, below.

We have segmented the performance program to decrease the amount of memory required for its operation. This should increase the chances for other groups to make use of the program.

The limitations of the present performance program are primarily the requirement that some information about the class of compounds be known, and that, for each class, relatively detailed rules about the mass spectral fragmentation of this class be available. The former limitation results primarily from the nature of the program in that a complete structure generator is not incorporated. The primitive structure generator available to the program can only place substituents about an assumed skeleton. This limitation will be alleviated when a structure generator with GOODLIST and BADLIST constraints is available (see Structure Generator, below). The latter limitation is more fundamental, but is characteristic of every spectroscopic technique to one degree or another. It must be assumed that analysis of a mass spectrum, alone, may not lead to sufficiently unambiguous information about the structure of the compound yielding the spectrum. It is for this reason that extensions of the programming techniques to encompass data from other spectroscopic techniques are attractive.

Objective (c): Generalize Techniques.
We have carried out several successful experiments to ensure that the performance program, used originally for analysis of estrogenic steroids, retains only procedures which are compound-class independent. By supplying fragmentation rules for other classes of compounds, we have successfully carried out structure elucidation of molecules in several diverse classes including other steroidal hormones and related compounds (progesterones, testosterones, androsterones), steroidal sapogenins and derivatized amino acids.

Objective (d): Explore Utility

Previous progress reports have summarized in some detail the ways in which data from ancillary techniques in mass spectrometry (metastable ion and low ionizing voltage data, labile hydrogen exchange) can be used by the program. The utility of metastable ions for aid in structure elucidation continues as an active area of interest. Experience with the program has inspired studies on metastable ions, first, to help delineate the course of fragmentation of molecules with the purpose of extending and refining fragmentation rules used by the program (Smith, Duffield and Djerassi, 1973). Experience with the increased specificity of structural information with concomitant reduction in analysis time when metastable ion information is available (Smith, et al., 1973) has led to a study of a new technique for detection and analysis of metastable ions (Direct Analysis of Daughter Ions, or DADI) and has illustrated the utility of this technique in mixture analysis (Smith, Djerassi, Maurer and Rapp, 1973).

Experience with the PLANNER has led to several research activities related to, but not supported by, this grant. Our studies of estrogen mixtures isolated from pregnancy urine have suggested new compounds likely to be important in the human metabolism of estrogens. Some of these compounds are hitherto unreported structures and a synthesis program is underway in Professor Djerassi's laboratory to produce some of these compounds.
The Planner will be used as one method of comparison of the synthesized, authentic standards with those isolated from pregnancy urine.

Work is also being carried out to explore the fragmentation of model systems possessing two heteroatoms in close proximity. It is clear from the first of these studies (Block, Smith, and Djerassi, 1973) that the fragmentation of these difunctional systems does not reflect that of monofunctional analogs. More groundwork is required in this area to obtain better fragmentation rules for these systems.

(II) STRUCTURE GENERATOR

Objective (a): Complete the Generator

The last progress report discussed the completion of both the basic structure generator algorithm and program, which provide the capability for exhaustive generation of graph isomers of a given empirical formula, with prospective avoidance of duplicate structures. Since the time of the submission of that report, manuscripts describing the structure generator, directed specifically to an audience of chemists, have been submitted (Masinter, Sridharan, Lederberg, and Smith, 1973; Masinter and Sridharan, 1973).

Some effort over the past year has been devoted to verification of the completeness and irredundancy of the method. We have extended existing combinatorial counting algorithms to check that the numbers of isomers generated are correct. We have used an interactive version of the generator to verify that variations (allowed by the algorithm) of the mechanism of generation yield the same set of isomers. In this way we are now increasingly confident that the program's performance accurately reflects the mathematically proven algorithm on which it is based.

The Structure Generator has been briefly described, and placed in its context within Heuristic DENDRAL, in an invited paper presented at a NATO/CBS sponsored conference on Computer Representation and Manipulation of Chemical Information, held in Amsterdam in June, 1973 (Smith, Masinter and Sridharan, 1973).
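The kind of cross-check described above, in which the number of structures produced by exhaustive generation is compared with an independent combinatorial count, can be illustrated on a very small case. The sketch below is not the DENDRAL generator or its LABELLER; it is a minimal Python illustration, under the assumption that the problem is reduced to placing substituents on an n-membered ring whose symmetries are rotations and reflections. The function names (ring_symmetries, generate_isomers, burnside_count) are hypothetical.

```python
from itertools import product


def ring_symmetries(n):
    """Rotations and reflections of an n-membered ring, as index permutations."""
    perms = []
    for shift in range(n):
        perms.append(tuple((i + shift) % n for i in range(n)))  # rotation
        perms.append(tuple((shift - i) % n for i in range(n)))  # reflection
    return perms


def generate_isomers(n, labels):
    """Exhaustive generation: label every ring position, then keep one
    canonical representative per symmetry class (no duplicates)."""
    perms = ring_symmetries(n)
    seen = set()
    for labeling in product(labels, repeat=n):
        canonical = min(tuple(labeling[p[i]] for i in range(n)) for p in perms)
        seen.add(canonical)
    return sorted(seen)


def count_cycles(perm):
    """Number of cycles in a permutation given as a tuple of images."""
    visited = [False] * len(perm)
    cycles = 0
    for start in range(len(perm)):
        if not visited[start]:
            cycles += 1
            j = start
            while not visited[j]:
                visited[j] = True
                j = perm[j]
    return cycles


def burnside_count(n, k):
    """Independent count: average, over all symmetries, of the number of
    k-label assignments each symmetry leaves unchanged."""
    perms = ring_symmetries(n)
    return sum(k ** count_cycles(p) for p in perms) // len(perms)


if __name__ == "__main__":
    # Six-membered ring, each position either H or a substituent X.
    isomers = generate_isomers(6, "HX")
    print(len(isomers), burnside_count(6, 2))
    # Disubstituted case: the ortho, meta and para arrangements.
    print([s for s in isomers if s.count("X") == 2])
```

Agreement between the two numbers shows only that the generated list has the right size; it does not by itself prove that the list is correct, which is why the text above also relies on a proof of the algorithm.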
We have also begun to develop techniques to expand the scope of the generator. One example, which has been completed, is adding extensions to the CATALOG. The CATALOG contains the set of vertex-graphs from which structures are assembled. The original CATALOG was not sufficient to generate all isomers of some potentially interesting compositions, e.g., those involving graphs possessing nodes of degree >3. We now have a program which constructs complete sets of vertex-graphs containing nodes of degree >3 from the set of trivalent graphs in the original CATALOG. We have thus extended the capabilities of the generator. Other such extensions are discussed in the PLANS section, below.

Objective (b): Develop Constraints

It is absolutely essential that we provide the mechanism for constraining the Structure Generator: without constraints it is merely a legal move generator, as in a chess-playing program. For structure elucidation problems, the Planner can determine many features of the molecular structure from various types of experimental data such as presence of functional groups, and the numbers of double bonds and rings. Partial information of this sort can be used to constrain the Structure Generator to the space of plausible candidate structures. From a graph-theoretic point of view, however, constraining the graph generating algorithm is a difficult unsolved problem.

We are presently formulating several types of constraints to apply to the Structure Generator. Some types of constraint await the development of new mathematical tools (see PLANS), while others can be immediately implemented with relatively minor alterations to the algorithm. The class of constraints presently receiving attention deals with types of unsaturation (rings or double bonds) desired in the final structures. Related to this constraint is the constraint of number of quaternary carbons present.
The former information (number and nature of multiple bonds) is readily available from several spectroscopic techniques, while the latter may be obtained from 13C NMR. The implementation of this class of constraints will be used as the model for future implementation of a GOODLIST (structural features known to be present) and a BADLIST (structural features known to be absent).

It is possible that some types of constraints may not be easily implemented within the algorithm. Thus, retrospective tests of isomers may be required to search for desired or unwanted features. We have developed some new approaches to graph matching which seem to be significantly more efficient than previous methods. Should prospective implementation of a constraint prove difficult, we will have at our disposal some powerful graph matching tools to exercise the constraint.

Objective (c): Exploit the Generator for Structure Elucidation

We have demonstrated the utility of some subsystems of the structure generator, e.g., the LABELLER, by exploring some problems of isomerism noted in the chemical literature. We have corrected the number and presented the identities of isomers formed by different substitutions of alkyl chains about a porphyrin nucleus. We are presently exploring some problems of isomerism of carbocyclic ring systems, specifically C10H10 and (CH)10 and CnH2n-4 tricyclic ring systems, n = 8 - 12, related to the mechanistics of isomeric interconversion.

We have the complete list of all topologically possible 1176 6-membered Diels-Alder ring systems, using any combination of C, N, O and S. This list was generated using the PARTITIONER and an extended version of the LABELLER. These are all the 6-membered ring systems that can be embedded in structures resulting from the well-known Diels-Alder reaction. Of the 1176 possible ring systems, approximately 80% are unreported in the Ring Index.
Many of these are chemically unstable - underscoring the need for a BADLIST implementation for the Cyclic Structure Generator. However, many of these unreported ring systems are certainly chemically plausible. Awareness of such gaps in relatively simple synthetic categories might lead to discovery of new categories of compounds with important biological effects.

(III) PREDICTOR

The function of the Predictor in the PLAN-GENERATE-TEST strategy is to perform a detailed evaluation of candidate solutions (structures) to a structure elucidation problem. It may use a more detailed model of spectroscopic behavior than that embodied in a Planner to attempt to differentiate among possible solutions.

Objective (a): Extend the Program

We have extended and generalized the Predictor used previously for saturated, aliphatic, monofunctional compounds. Given a list of structures and rules of fragmentation processes, it will predict a mass spectrum for each structure. Prediction of relative ion abundances is crude, but previous work has shown that even crude measures of ion abundance are usually satisfactory. The predicted spectrum can then be matched with the observed and candidates ranked according to the quality of the match. The program works with structures and rules of any complexity.

An interesting philosophical question is how much knowledge should be brought to bear on interpretation of the data at the Planning vs. Predicting stages of analysis. It is our feeling that if more can be accomplished during Planning to constrain the Structure Generator, the analysis will be more efficient. On the other hand, some knowledge can be utilized only if a complete structure is specified, so that its use is restricted to a predictive role. Moreover, Predictor functions have a greater utility, as indicated in the subsequent section.
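The predict, match and rank cycle described under Objective (a) above can be sketched in a few lines. The following Python fragment is an illustration only and is not the Predictor program: the scoring function, its penalty weight, and the peak lists are all assumptions made for the example, and the "predicted" masses are entered by hand rather than derived from fragmentation rules. In keeping with the remark that crude abundance measures suffice, the score uses only the presence or absence of predicted peaks.

```python
def match_score(predicted_masses, observed, penalty=0.5):
    """Crude match between a predicted peak list and an observed spectrum.

    predicted_masses: set of nominal m/z values the rules predict.
    observed: dict mapping nominal m/z -> relative abundance.
    The score rewards observed intensity that the prediction explains and
    penalizes predicted peaks that are absent (hypothetical scoring scheme).
    """
    total = sum(observed.values())
    explained = sum(a for mz, a in observed.items() if mz in predicted_masses)
    missing = sum(1 for mz in predicted_masses if mz not in observed)
    return explained / total - penalty * missing / len(predicted_masses)


def rank_candidates(predictions, observed):
    """Return (candidate, score) pairs, best match first."""
    scored = [(name, match_score(masses, observed))
              for name, masses in predictions.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Illustrative values only, not measured data.
    observed = {43: 100, 58: 10, 71: 12, 86: 20}
    predictions = {
        "2-pentanone": {43, 58, 71, 86},
        "3-pentanone": {29, 57, 86},
    }
    for name, score in rank_candidates(predictions, observed):
        print(f"{name}: {score:.2f}")
```

A more faithful sketch would generate the predicted masses from a rule table applied to each candidate structure; that step is omitted here because the rule format used by the program is not given in this report.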
Objective (b): Differentiate Structures

The Predictor has a more obvious application in the design of experimental strategies to differentiate among candidate structures. Rules of spectroscopic behavior utilized during Planning demand the presence of some data to evaluate. The Predictor can then be used to request additional data from any source to aid in differentiation. We have explored this approach by analyzing the spectrum of a compound with the performance program. The Predictor was used to evaluate the set of candidate structures to define the minimum number of metastable defocussing experiments necessary to achieve a unique solution. Thus, no time need be spent acquiring unnecessary or redundant data. Clearly, this has important implications for future work in that many different types of data (e.g., NMR, IR) might be requested by the Predictor to facilitate identification.

PLANS

For the remaining period of this grant we propose to carry out the following extensions of the research outlined above.

(I) PLANNER

The major area of activity related to the present version of the Planner will be to focus our attention on using the program in support of chemical studies outlined under Part B (see below). The chemical extraction and derivatization procedures used in the analysis of body fluids restrict the types of compounds present in each separated fraction. Such simplifications make this a problem more amenable to attack. Only certain classes of compounds are present in each fraction, and we have some knowledge of the mass spectral fragmentation of these classes.

We wish to couple the program to the results of library matching procedures so that we direct our efforts to structure elucidation of those components which have not been previously identified.
This is particularly important in the context of analysis problems such as those discussed under Part B.

We propose increasing the utility of the program by removing two present constraints: (a) allow unspecified "dummy" atoms in the skeleton instead of requiring a rigidly fixed structural skeleton, and (b) allow fragmentation processes to be specified more flexibly - in particular, allow fragmentations in substituents on the skeleton instead of requiring all fragmentations to cut through the skeleton.

Although we are presently uncomfortable with immediate coupling of the Structure Generator to the Planner, we propose continued exploration of the problems of controlling the generator automatically. Actual implementation awaits a more comprehensive treatment of the problem of constraints.

(II) STRUCTURE GENERATOR

The inclusion of a reasonable set of constraints is obviously required and will be the subject of most of our future development work. We plan to develop an interface to the present interactive version of the Structure Generator that speaks a more chemical language. This interface will be designed to avoid the present requirement that the user know something about the program before he can use it. As the optimum method for implementation of a constraint is determined, the interface will be extended to translate the usual specification of the constraint in chemical terms into rules acting at the level of the program. As we stressed in development of the PLANNER, there are considerable advantages to building a powerful program in an incremental fashion. These steps are logically directed to our longer term goal of developing a useful structure elucidation tool for the chemist, based on the structure generator.

There are several other areas of interest which are peripherally related to the problem of constraints and which will occupy our attention. The Structure Generator knows no chemistry other than atom names and their associated valence.
There are several important areas where this is an immediate problem. For example, the program has no explicit awareness of the aromatic resonances, leading to a remediable redundancy in the list of isomers. An aromaticity-predictor is also indispensable for anticipating chemical behavior of a structure. We wish to deal with types of isomers besides simple connectivity isomers. We need to have the facility for assembly of molecular sub-structures (the usual type of information inferred from spectroscopic data) when such an assembly yields new rings or multiple bonds.

All the above questions need a reexamination of the fundamental mathematical considerations. The present algorithm has been proven to yield complete and irredundant solutions. In devising new algorithms or variants of the present one, the burden of proof can be reduced to (the usually easier) equivalence to the previous algorithm. Professor Harold Brown, who was the mathematician instrumental in initial development of the labelling algorithms for structure generation, will be with us again for several months to help attack the problems outlined above.

(III) PREDICTOR

Although the Predictor has been essentially finished for our own internal use, we propose to spend a modest amount of time in the coming months making it more usable by others. In particular, we wish to extend the initial work on predicting the new experiments necessary for distinguishing among candidate structures (e.g., predicting that a metastable peak at mass 70.1 would confirm one structure and disconfirm another). In addition, we plan to work on cataloging some existing sets of mass spectrometry rules in such a way that the program can be easily used for different classes of problems.

Part A references (Published or submitted during year)

R. Carhart and C. Djerassi, "Applications of Artificial Intelligence for Chemical Inference XI....

D.H. Smith, B.G. Buchanan, R.S. Engelmore, H. Adlercreutz, and C.
Djerassi, "Applications of Artificial Intelligence for Chemical Inference, IX. Analysis of Mixtures Without Prior Separation as Illustrated for Estrogens," J. Amer. Chem. Soc., September 5, 1973.

D.H. Smith, A.M. Duffield, and C. Djerassi, "Mass Spectrometry in Structural and Stereochemical Problems, CCXXII. Delineation of Competing Fragmentation Pathways of Complex Molecules from a Study of Metastable Ion Transitions of Deuterated Derivatives," Org. Mass Spectrom., 7, 367 (1973).

D.H. Smith, C. Djerassi, K.H. Maurer, and U. Rapp, "Mass Spectrometry in Structural and Stereochemical Problems, CCXXXIV. Applications of DADI, A Technique for Study of Metastable Ions, to Mixture Analysis," J. Amer. Chem. Soc., submitted (1973).

J.H. Block, D.H. Smith, and C. Djerassi, "Mass Spectrometry in Structural and Stereochemical Problems, CCXXXVIII. The Effect of Heteroatoms upon the Mass Spectrometric Fragmentation of Cyclohexanones," J. Org. Chem., submitted (1973).

L.M. Masinter, N.S. Sridharan, J. Lederberg, and D.H. Smith, "Applications of Artificial Intelligence for Chemical Inference XII. Exhaustive Generation of Cyclic and Acyclic Isomers," J. Amer. Chem. Soc., submitted (1973).

L.M. Masinter and N.S. Sridharan, "Applications of Artificial Intelligence for Chemical Inference, XIII. Labelling Objects Having Symmetry," J. Amer. Chem. Soc., submitted (1973).

D.H. Smith, L.M. Masinter, and N.S. Sridharan, "Heuristic DENDRAL: Analysis of Molecular Structures," to be published in the proceedings of the NATO/CRS Advanced Study Institute on Computer Representation and Manipulation of Chemical Information.

N.S. Sridharan, "Computer Generation of Vertex Graphs", Stanford Computer Science Memo CS-73-381, Stanford University, July, 1973.
Part B-i Gas Chromatograph - Mass Spectrometer Data System Development

OBJECTIVES AND RATIONALE

The objectives of this part of the research project are the improvement of GC/MS data system capabilities and the coupling of extracted data to the Heuristic DENDRAL programs for analysis. We ultimately seek a substantial degree of interaction between the instrumentation and the analysis programs including computer specification and control of the data to be collected. In addition to the development goals, this portion of the project provides for the day-to-day operation of the GC/MS systems in support of mass spectrum interpretation computer program development (Parts A and C) and applications of GC and MS to biomedical and natural product sample analysis with collaborators.

Our rationale for this approach is that the overall system should be designed for problem solving rather than just for data acquisition. This implies that analytical computer programs, after review of available experimental data, could be able to specify additional information needed to confirm a solution or distinguish between alternative solutions. Such requests could be passed back to an instrument management program to set up proper instrument parameters and collect the additional information.

Our initial objectives to implement an on-line, closed-loop system using the ACME computer facility have met with a number of difficulties. These grow principally out of ACME's limited computing capacity and commitments as a general time-sharing service. In addition, the scanning high resolution mass spectrometer has inherent sensitivity limitations, which do not preclude a demonstration but rather limit the practical sample volume which could be analyzed. Until such limitations can be overcome, particularly in terms of computing support, we have focussed our efforts on an open-loop demonstration of such an approach.
PROGRESS

Progress has been made in demonstrating a GC/High Resolution Mass Spectrometry capability, in further developing automated data analysis algorithms, and in planning for the implementation of a data system for the collection of metastable ion information. Progress in these and other areas directed toward the main research goals has been impacted by a transition in computing support which is still underway. This transition, discussed in more detail below, was occasioned by the phase-over of the ACME computing facility, which we had been using, from NIH grant subsidy to a fully fee-for-service operation under Stanford University auspices. Summaries of the results and problems encountered in each of the areas follow.

Gas Chromatography/High Resolution Mass Spectrometry (GC/HRMS)

We have verified the feasibility of combined gas chromatography/high resolution mass spectrometry (GC/HRMS). Using programs described in previous reports, we can acquire selected scans and reduce them automatically. The procedures are slow compared to "real-time" because of the limitations of the time-shared ACME facility. We have recorded sufficient spectra of standard compounds to show that the system is performing well.

A typical experiment which illustrates some of the parameters involved was the following. A mixture (approximately 1 microgram/component) of methyl palmitate and methyl stearate was analyzed by GC under conditions such that the GC peaks were well separated and of approximately 25 sec. duration. The mass spectrometer was scanned at a rate of 10.5 sec/decade, and a resolving power of 5000. The resulting mass spectra displayed peaks over a dynamic range of 1000 to 1 and were automatically reduced to masses and elemental compositions without difficulty. Mass measurement accuracy appears to be 10 ppm over this dynamic range.
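To make the reduction from measured masses to elemental compositions concrete, the following Python sketch enumerates C, H, N, O compositions whose calculated mass falls within a stated tolerance of a measured mass. It is an illustration only, not the data reduction program used with the instrument. The atomic masses are standard monoisotopic values; the element limits (max_c, max_n, max_o), the simple hydrogen-count check, and the neglect of the electron mass are assumptions made for the example.

```python
# Monoisotopic atomic masses in daltons (standard values).
MASS = {"C": 12.000000, "H": 1.007825, "N": 14.003074, "O": 15.994915}


def formula(c, h, n, o):
    """Format element counts as a string such as C17H34O2."""
    parts = []
    for symbol, count in (("C", c), ("H", h), ("N", n), ("O", o)):
        if count:
            parts.append(symbol if count == 1 else f"{symbol}{count}")
    return "".join(parts)


def compositions(measured, ppm=10.0, max_c=25, max_n=2, max_o=4):
    """List (formula, calculated mass, error in ppm) for every C/H/N/O
    composition within `ppm` of the measured mass, smallest error first."""
    tolerance = measured * ppm * 1e-6
    hits = []
    for c in range(1, max_c + 1):
        for n in range(max_n + 1):
            for o in range(max_o + 1):
                heavy = c * MASS["C"] + n * MASS["N"] + o * MASS["O"]
                # Hydrogen count that brings the total closest to the target.
                h = round((measured - heavy) / MASS["H"])
                # Assumed plausibility limit: no more H than a saturated,
                # acyclic skeleton of these atoms could carry.
                if h < 0 or h > 2 * c + n + 2:
                    continue
                calculated = heavy + h * MASS["H"]
                if abs(calculated - measured) <= tolerance:
                    error_ppm = (calculated - measured) / measured * 1e6
                    hits.append((formula(c, h, n, o), calculated, error_ppm))
    return sorted(hits, key=lambda hit: abs(hit[2]))


if __name__ == "__main__":
    # Methyl palmitate, C17H34O2, has a calculated mass of about 270.2559.
    for name, mass, err in compositions(270.2559, ppm=10.0):
        print(f"{name:12s} {mass:10.4f} {err:+6.1f} ppm")
```

At a tolerance of 10 ppm and a mass near 270, the acceptance window is under 3 millimass units wide, which is why the number of surviving compositions is small when only a few elements are allowed.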
We have begun to exercise the GC/HRMS system on urine fractions containing significant components whose structures have not been elucidated on the basis of low resolution spectra alone. Whereas more work is required to establish system performance capabilities, two things have become clear: 1) GC/HRMS can be a useful analytical adjunct to our low resolution GC/MS clinical studies (Part B-ii), and 2) the sensitivity of the present system limits analysis to relatively intense GC peaks. This sensitivity limitation is inherent in scanning instruments where one gives up a factor of 20-50 in sensitivity over photographic image plane systems in return for on-line data read-out. This limitation may be relieved by using television read-out systems in conjunction with extended channeltron detector arrays as has been proposed by researchers at the Jet Propulsion Laboratory. We can nevertheless make progress in applying GC/HRMS techniques to accessible effluent peaks and can adapt the improved sensor capability when available.

These experiments have also shown that the ACME computer facility cannot reliably provide the rapid service required to acquire and file repetitive spectrometer scans. This problem is to be expected in a heavily used time-shared facility without special configuration for high rate, real time support. Excepting possible requirements for real time data analysis (such as in a closed-loop system), this problem could be solved by implementing a large local buffer (e.g., disk) at the front-end data acquisition mini-computer. We are exploring this possibility in conjunction with the overall planning for computer support discussed below.

Data Analysis Algorithms

A. Peak Resolution

One of the significant trade-offs to be made in GC/HRMS is that of sensitivity versus resolution.
In maintaining high instrument resolution (in the range of 5,000-10,000) while scanning fast enough to analyze a GC effluent peak (approximately 10 sec/decade), system sensitivity is constrained as discussed above. We have worked on a method for reducing instrument resolution requirements through more sophisticated computer analysis of a lower resolution output. In effect this transfers the burden of overlapping peak detection and mass determination to the computer instead of requiring inherently well resolved data out of the instrument. The advantage comes in better system sensitivity.

Unresolved peaks are separated by an analytical algorithm, the operation of which is based on a model peak derived from known singlet peaks in the data. Actual tabulated peak models are used rather than the assumption of a particular parametric shape (e.g., triangular, Gaussian, etc.). This algorithm provides an effective increase in system resolution by approximately a factor of three, thereby effectively increasing system sensitivity.

By measuring and comparing successive moments of the sample and model peaks, a series of hypotheses is tested to establish the multiplicity of the peak, minimizing computing requirements for the usually encountered simple peaks. Analytic expressions for the amplitudes and positions of component peaks have been derived in the doublet case in terms of the first four moments of the peak complex. This eliminates time consuming iteration procedures for this important multiplet case. Iteration is still required for more complex multiplets.

B. GC Analysis

The application of GC/MS techniques to clinical problems as described in Part B(ii) of this proposal has indicated the desirability of automating the analysis of the results of a GC/MS experiment. Analysis of the GC/MS output involves extracting, from the approximately 700 spectra collected during a GC run, the 50 or so representing components of the body fluid sample.
The raw spectra are in part contaminated with background "column bleed" and in part composited with adjacent constituent spectra unresolved by the GC. We have begun to develop a solution to this problem with promising results. By using a disk-oriented matrix transposition algorithm, the array of 700 spectra by 500 mass samples per spectrum can be rotated to gain convenient access to the "mass fragmentogram" form of the data. The transposition algorithm avoids many successive passes over the input data file as would be required in a straightforward approach. By generating a reorganized intermediate file, time savings by factors of 5-10 are achieved.

The fragmentogram form of the data, displayed at a few selected mass values, has been used at Stanford, MIT, and elsewhere for some time to evaluate the GC effluent profile as seen from these masses. Mass fragmentograms have the important property of displaying higher resolution in localizing GC effluent constituents. Thus by transposing the raw data to the mass fragmentogram domain, we can systematically analyze these data for baselines, peak positions, and amplitudes, and thus derive better mass spectra for the relatively few constituent materials. These are free from background contamination and influences of adjacent GC peaks unresolved in the overall gas chromatogram. These spectra can then be analyzed by library search techniques or first principles as necessary.

We have applied a preliminary version of this algorithm to several urine samples. These contain several apparently simple peaks which in fact consist of multiple components. The algorithm performs well in separating out these constituents although further testing is required.

Closed-loop Instrument Control

In the long term, it could be possible for the data interpretation software to direct the acquisition of data in order to minimize ambiguities in problem solutions and to optimize system efficiency.
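The mass fragmentogram idea from the GC analysis discussion above can be sketched in a few lines. This is a simplified in-memory illustration, not the disk-oriented transposition described in the text; the function names, the flat minimum-value baseline, and the `min_height` threshold are assumptions made for the example.

```python
def to_fragmentograms(spectra):
    """Transpose a scans-by-masses array into masses-by-scans traces."""
    return [list(column) for column in zip(*spectra)]

def fragmentogram_peaks(trace, min_height):
    """Return (scan index, height above baseline) for each local maximum.

    The baseline is taken as the smallest value in the trace, which removes
    a constant background such as column bleed at that mass.
    """
    baseline = min(trace)
    peaks = []
    for i in range(1, len(trace) - 1):
        height = trace[i] - baseline
        if height >= min_height and trace[i] > trace[i - 1] and trace[i] >= trace[i + 1]:
            peaks.append((i, height))
    return peaks
```

For a 700 by 500 array, `to_fragmentograms` yields 500 traces of 700 points each. A mass that carries only constant bleed produces no peaks, while masses specific to neighboring constituents peak at different scan indices even when the total chromatogram shows a single unresolved peak.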
The task of deciding among and collecting various types of mass spectral information (e.g., high resolution spectra, low ionizing voltage spectra, or selected metastable ion information) under closed-loop control during a GC experiment is difficult. Problems arise because of the large requirements placed on computer resources and present limitations in instrument sensitivity or data read-out imposed by the time constraints of GC effluent peak widths. Solutions to these problems may not be economically feasible within currently existing technology but seem achievable in the future.

We are studying this problem in a manner which would entail a multi-pass (two or three pass) system. This permits the collection of one type of data (e.g., high resolution mass spectra) during the first GC/MS analysis. Processing of these data by DENDRAL will reveal what additional data are necessary on specific GC peaks during a subsequent GC/MS run. Such additional data could help to uniquely solve a structure or at least to reduce the number of candidate structures. This simulated closed-loop procedure could demonstrate the utility of DENDRAL type programs to examine data, determine solutions and propose additional strategies, but will not have the requirement of operating in real time. Some parameters in the acquisition of particular types of information, such as metastable data, will require computer control, even in the open-loop mode.

We have considered plans to implement two aspects of instrument control, in addition to the magnetic scan control implemented for GC operation and reported previously. These include system resolution control, such as would be required to change from normal spectrum scanning mode to metastable scanning mode, and high voltage control necessary to selectively measure metastable ion fragmentation data. In addition to these we have considered the discrete switching of various electronic mode controls, which is straightforward and not discussed in detail.
Implementation plans for computer control of these instrument functions have been delayed because of the ACME computing facility transition, which diverted the necessary hardware and software manpower.

Resolution control involves changing the widths of the slits at the exit of the ion source and the entrance to the ion multiplier detector. Additional source and electrostatic analyzer voltages must be controlled to optimize performance, as discussed later. Mechanical slit adjustment is accomplished on the MAT-711 instrument by heating wires which support the slit jaws. The resulting expansion or contraction of those wires moves the spring-loaded jaws. As implemented by the manufacturer, the time constants involved in heating the control wires are 5-10 seconds. It is possible to speed this up to approximately 0.5 seconds by application of a controlled over-voltage decreasing to the appropriate equilibrium value for the desired slit width. This was demonstrated by a series of experiments on an extra slit assembly mounted in a vacuum jar in our laboratory. Cooling of the wires is relatively fast in the way they are mounted, so no problem exists in that direction.

It is desirable to have feedback to indicate the actual slit width achieved rather than relying on a slit assembly calibration. Stretching of the support wires or changes in the spring tension under temperature cycling would change this calibration. An optical scheme to measure slit width in situ is possible. We do not contemplate implementing this feedback immediately because it requires major changes to the instrument flight tube.

Two types of metastable ion relationships are obtainable by suitable control of the double-focussing instrument. First, for a given daughter ion, one can trace the parent ions which give rise to it.
Second, for a given parent ion one can trace the various daughters to which it gives rise. The first measurement ("metastable defocussing") is the more straightforward for this instrument since parent ions can be enumerated by a simple scan of the accelerating voltage, holding the electrostatic analyzer (ESA) voltage and magnetic field constant. The second type of scan requires the coordinated scan of two of the three fields. We feel that joint computer control of the accelerating voltage and ESA voltage is the simpler approach since the magnetic field is more difficult to set and monitor because of hysteresis effects. For a resolution of 1000 in the metastable ion mass measurement, the voltages must be set to approximately 0.01-0.02% accuracy. This requires a 14-16 bit digital-to-analog (D/A) converter to control the input (10 volts) to the operational amplifier which generates the high voltage. Similar D/A controls of ion source voltages for ion current and focus optimization can be implemented using optical isolators to allow vernier control of the various high voltages around the nominal 8 kV values.

Computing Transition

As mentioned earlier, the transition of the ACME computing facility from NIH subsidy to Stanford-sponsored fee-for-service operation has impacted our development efforts this past year. Both the low resolution instrument used for routine body fluid analysis research and the high resolution instrument are affected. All computing support was previously obtained from the ACME facility, much of it as core research without explicit transfer of funds. The transition has required consideration of both technical and economic factors. The new facility represents a combination of the previous ACME interactive and real time computing load with various administrative and batch computing loads on a new IBM 370/158 computer. This new environment will have even more difficulty in supporting real time computing needs than ACME did.
No real time support has been available since the 360/50 service was discontinued on July 31, although terminal service was reestablished in mid-August. Data acquisition service via the IBM 1800 is expected to be operational by early November.

For the high resolution instrument, this transition, as a minimum, necessitates an interface modification (we previously sent data through the IBM 2701 interface, which is no longer to be supported). It also amplifies the problems we encountered in sending and filing high rate mass spectrometer data (particularly during GC/MS runs). These problems would be present to some extent in any general time-sharing service machine without specific hardware and software configuration provision for these needs (such provisions for real time support had been proposed in our SUMEX computer application).

After examining a variety of alternatives, we conclude that a dedicated mini-computer solution (built around a machine with the arithmetic capability of a PDP-11/45) would be highly attractive technically and relatively inexpensive. A stand-alone mini-computer system would cost in the range of $50,000-$60,000, augmenting existing equipment, plus approximately $9,000 per year maintenance and $2,000 per year for supplies. Estimates for 370/158 support, based on current charging algorithms and previous utilization experience, run from $35,000-$50,000 per year. This spread is caused by uncertainties in the effects of planned measures to increase operating efficiency and possible changes to the rate structure. In any case, the mini-computer approach pays for itself in 1 to 2 years of operation and provides the responsiveness of a dedicated machine for real time support. Unfortunately our existing budget does not provide for this solution. The budget is very marginal for purchase of computing support from the 370/158 as well.
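As a rough check on the cost comparison above, the payback period can be worked through with the quoted dollar figures. The sketch is illustrative only: the function name, and the choice to show the estimate both with and without the mini-computer's own running costs, are assumptions rather than part of the original analysis.

```python
def payback_years(capital, annual_service_fee, annual_running_cost=0.0):
    """Years until a one-time purchase is offset by avoided service fees."""
    annual_saving = annual_service_fee - annual_running_cost
    if annual_saving <= 0:
        raise ValueError("the purchase never pays for itself")
    return capital / annual_saving

# Figures quoted in the text, in dollars.
CAPITAL = (50_000, 60_000)       # stand-alone mini-computer system
SERVICE_FEE = (35_000, 50_000)   # 370/158 support per year
RUNNING = 9_000 + 2_000          # maintenance plus supplies per year

# Best case pairs the low purchase price with the high fee, worst case the reverse.
simple = (payback_years(CAPITAL[0], SERVICE_FEE[1]),
          payback_years(CAPITAL[1], SERVICE_FEE[0]))
with_running = (payback_years(CAPITAL[0], SERVICE_FEE[1], RUNNING),
                payback_years(CAPITAL[1], SERVICE_FEE[0], RUNNING))
```

Dividing the purchase price by the annual fee alone gives roughly 1.0 to 1.7 years. Charging the $11,000 per year of maintenance and supplies against the savings stretches the range to roughly 1.3 to 2.5 years.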
This latter approach is the only currently available one, however, since it can be implemented with relatively low start-up cost. The effect of budget limitations appears in terms of a reduced number of samples which can be run. We have attempted to minimize the other budget costs (manpower principally) to increase the computing funds available. This will necessarily impact our development goals. We hope, in the renewal application for DENDRAL support, to be able to implement the more effective mini-computer approach for the high resolution spectrometer as a longer term solution.

We have undertaken an interim mini-computer solution for the low resolution spectrometer (Finnigan 1015 quadrupole) which is primarily used for our body fluid analysis studies. For the same reasons outlined above, a mini-computer solution is attractive. In the case of the low resolution quadrupole instrument, a lesser capacity machine will suffice for immediate data acquisition and display functions. We have implemented such an interim system on a PDP-11/20 machine available from other funding sources. This system, which is now operational, allows the acquisition of GC/MS data, limited by the capacity of the DEC tape storage medium to approximately 600 spectra per experiment. For certain types of GC analyses, up to 1030 spectra per experiment are required, so this limits, to some extent, the utility of this interim system. A Calcomp plotter is supported for display purposes. A fixed head disk provides for library search procedures which are still being converted from the ACME system. We have applied to the NIH-GMS for funds to augment this system in order to relieve current limitations as part of a Genetics Center research proposal.

FUTURE PLANS

Our future plans are basically to continue development along the lines outlined above. We will complete the computing support transition steps described.
These include primarily establishing a connection to the new 370/158 facility to provide interim support for the high resolution system. We will pursue additional software and hardware development goals as far as possible within the limited budget available. These efforts will concentrate for the most part on bringing up a metastable ion analysis data system. It should be reemphasized that the manpower levels proposed in the follow-on budget have been minimized to allow for purchasing computing time on the 370/158. The allocated manpower is required primarily for instrument operation and maintenance with minimal provision for development efforts.

Part B(ii). Analysis of Body Fluids by Gas Chromatography/Mass Spectrometry.

The chemical separation of urine into the following fractions prior to GC/MS analysis has been described in previous DENDRAL Reports:

  free acids (analyzed by GC/MS as their methyl esters)
  amino acids (analyzed by GC/MS as their N-trifluoroacetate n-butyl esters)
  carbohydrates (analyzed by GC/MS as their trimethylsilyl ether derivatives)
  hydrolyzed acids (analyzed by GC/MS as their methyl esters)
  hydrolyzed amino acids (analyzed by GC/MS as their N-trifluoroacetate n-butyl esters)

During the past year we have extended these methods of fractionation to the following body fluids: blood (after an initial precipitation of proteins by the addition of ethanol) and amniotic and cerebrospinal fluids. The following summarizes the results obtained from an analysis of these fluids during the past year by gas chromatography-mass spectrometry.

URINE ANALYSIS:

A. The Development of a "Metabolic" Profile Characteristic of Neonatal Tyrosinemia Using Combined Gas Chromatography-Mass Spectrometry.

This work was carried out in collaboration with clinical colleagues from the Department of Pediatrics at Stanford University and a joint publication describing this research is in preparation.
The study was based on a total of one hundred and four 24-hour urine samples from sixteen premature or small birthweight infants receiving treatment in the Stanford nursery. After exclusion of infants who became ill, died, or left the nursery, we were able to follow nine infants closely for periods of between 4 and 6 weeks from day 3 of life. All nine infants had birthweights below 1500 g and three of these were below 1000 g. Of the nine infants studied, five showed transient tyrosinemia as shown by a marked elevation in the urinary excretion of the tyrosine metabolites p-hydroxyphenyllactic acid, p-hydroxyphenylpyruvic acid and p-hydroxyphenylacetic acid. There was also a less marked but distinct elevation in the urinary tyrosine output.

Figures 1 and 2 show the metabolic profiles of the same infant (J.L.) in the normal (a) and tyrosinemic (b) states. Figure 1 shows the free acid outputs, chromatographed as the methyl ester-methyl ether derivatives, and Figure 2 is an expression of the free amino acids of the same urines, chromatographed as the N-trifluoroacetyl n-butyl ester derivatives. In each case the concentration of each metabolite is a function of the peak height as compared to the height of the internal standard. Table 1 is a summary of the ranges of urinary output of tyrosine and metabolites observed for all the infants in the study.

TABLE 1
Daily Excretion in mg/kg

               Tyrosine    p-HPLactic   p-HPPyruvic   p-HPAcetic
  Normal       0.2 - 3     0 - 5        0 - 0.5       0.2 - 2
  Tyrosinemic  3 - 15      5 - 50       0.5 - 5       0.5 - 5

As shown by Table 1 and Figure 1, neonatal tyrosinemia is characterized by a very large increase in the output of p-hydroxyphenyllactic acid and by a 10-50 fold excess of the latter over p-hydroxyphenylpyruvic acid. Studies of the hereditary defects in tyrosine metabolism initially indicated that p-hydroxyphenylpyruvic acid was the major metabolite, although more recently cases have been reported where p-hydroxyphenyllactic is in a 2-5 fold excess over p-hydroxyphenylpyruvic.
These latter determinations were made using GC and GC/MS methods and therefore probably reflect the improved specificity of the analytical procedure (previously colorimetric methods were used) rather than a difference of actual metabolic profile. Apart from the very large excess of p-hydroxyphenyllactic acid over its keto analog we could detect no significant differences between the profiles shown in neonatal tyrosinemia and those published for hereditary disease. Other metabolites such as p-hydroxymandelic acid, DOPA and N-acetyltyrosine, which have previously been reported in tyrosinemic urine, were not seen to be elevated.

B. GC/MS Analysis of Urine from Children Suffering from Leukemia.

This research was carried out with twenty 24-hour urine samples supplied by Drs. Jordan Wilbur and Tom Long of the Stanford Children's Hospital. The acidic fraction of all urines studied in this project showed no abnormal metabolites nor were gross amounts of known acids detected. The amino acid fraction, however, of six of the urine samples showed the presence of a non-protein amino acid, beta-aminoisobutyric acid (BAIB). In several of these instances the patients were excreting in excess of 1 gram of BAIB per day. The literature contains many references to increased BAIB excretion (genetic excretors, lead poisoning, pulmonary tuberculosis, march hemoglobinuria, thalassaemia and Down's Syndrome). The reported excretion of BAIB by leukemic patients was not substantiated by another investigator. There are several criticisms in the literature of the methods used for the quantitation of BAIB in biological fluids, and in order to fill this void a sensitive, specific and rapid method for the quantitation of BAIB has been developed. (SEE: The Quantitation of BAIB in Urine by Mass Fragmentography; W.E. Pereira, R.E. Summons, W.E. Reynolds, T.C. Rindfleisch and A.M. Duffield, in press).

C. GC/MS Analysis of Urine from Patients Suffering from Hodgkin's Disease.
During this study 20 urine samples from patients with diagnosed Hodgkin's Disease (Department of Oncology, Stanford University Medical Center) were analyzed and, in general, no abnormal metabolic profile could be found in any urine. There was one exception in which an individual was noted to excrete massive quantities of adipic acid (of the order of 1 gram per day).

D. Detection of Metabolic Errors by GC/MS Analysis of Body Fluids.

This project results from a collaborative effort between the Departments of Genetics and Pediatrics of the Stanford University Medical Center. To date over 50 samples have been analyzed; the majority (35) being urine, while amniotic fluid (10), blood (6), and cerebrospinal fluid (6) were also analyzed. It has been and will continue to be our practice to analyze, in collaboration with clinical investigators, aliquots of fluid samples obtained for valid diagnostic purposes completely divorced from this research on GC/MS analysis techniques. This investigation is not intended to serve as a screening program for a large population but rather to focus on those individuals who exhibit suggestive clinical manifestations such as psychomotor retardation and progressive neurologic disease as well as suggestive pedigrees. In the case of amniotic fluid the hope is to be able to monitor the condition of the fetus in those pregnancies which might be considered at risk. To date we have investigated specimens from normal pregnancies in order to establish the catalog of compounds to be observed in amniotic fluid. From this base it could prove possible to identify materials which might indicate the health of the fetus.

We have been able to confirm the presence of orotic acid in a urine from a person found to have orotic aciduria, while another urine sample was used to demonstrate our ability to identify the characteristic metabolites present in isovaleric acidemia. The following description refers to a urine from a child with hypophosphatasia.
A child died 33 hours after birth in Fresno, California, with the classical signs of hypophosphatasia. This genetic defect is marked by high phosphoethanolamine (PEA) concentrations in urine of affected homozygotes and unaffected heterozygotes. After derivatization (in this instance the TMS ethers of the water soluble carbohydrate fraction were prepared) we were able to detect by GC/MS large concentrations of ethanolamine and phosphoric acid but not PEA itself. The derivatization procedure we used most likely hydrolyzed PEA. We were able to quantitate for this compound in the infant's urine using an amino acid analyzer, and PEA excretion was extremely high (over 200 times normal values for infants), confirming the diagnosis. Next we examined urine samples from the child's parents, presumed heterozygotes, by GC/MS and by the amino acid analyzer. Again, no PEA was detected by the former method although the presence of ethanolamine and phosphoric acid was demonstrated. We determined the following excretion levels of PEA by amino acid analyzer:

  Newborn infant: 94 micromoles per 100 ml (normal 0.21-0.33)
  Father: 269 micromoles per 24 hours (normal 17-99)
  Mother: 32 micromoles per 24 hours (normal 17-99)

It is of interest that in this family the affected infant and his unaffected father both show subnormal serum alkaline phosphatase activity. The mother, who did not excrete increased amounts of PEA, was found to have normal activity of this enzyme in her serum. The following table summarizes the serum phosphatase activity measurements:

  Newborn infant: 0.2 units* (normal 2.8-6.7)
  Father: 0.7 units (normal 0.8-2.3)
  Mother: 3.4 units (normal 0.8-2.3)

  (* - 1 unit is that phosphatase activity which will liberate 1 millimole of p-nitrophenol per hour per liter of serum)

E. Drug Analysis Service Using GC/MS

We were recently contacted by physicians to rapidly identify a drug self-administered by a patient in the Stanford University Hospital.
From the mass spectrum the drug was identified as pentazocine within the hour. Although not part of the formal DENDRAL proposal, we expect that similar cases may arise in the future and we intend to respond positively to such requests.

Development of Library Search Routines for Mass Spectrum Identification

The analysis of a single body fluid fraction produces between 600 and 750 mass spectra. In order to cope with the interpretation of the daily production of mass spectra (about 8 body fluid fractions for a total of between 4,800 and 6,000 mass spectra) we have begun the implementation of library search routines. Concurrent with the analysis of body fluids for metabolic content we have been recording the mass spectra of many reference compounds. This collection represents the beginning of the construction of a library of reference spectra. Late in 1973 we expect to receive from Dr. S. Markey, University of Colorado Medical Center, a more comprehensive library which he has collated from contributors (including our own laboratory) in the field of biological applications of gas chromatography/mass spectrometry.

Prior to the demise of the ACME computer facility at Stanford University, we ran library search routines on data collected from urine fractions. Because the ACME system was heavily loaded, our programs took about one minute per compound identification. However, the experience gained will be used to implement library search routines on our current PDP-11 GC/MS data system. In addition we have sent mass spectra from several urine analyses to Dr. S. Grotch, Jet Propulsion Laboratory, Pasadena, California, in order that he could use his library search routines on real data.
In this instance the limiting factor for efficient compound identification was the library content, which was limited to a few compounds of biological significance. In addition those compounds of interest that were present in the library were often in a derivatized form different from that used in our analytical methodology.

Application of GC/HRMS to Body Fluid Analysis

We reported in the last annual report of the DENDRAL project that the Varian MAT 711 mass spectrometer was interfaced with a gas chromatograph for the recording of low resolution mass spectra. We have now used this system for the recording of HRMS of gas chromatographic fractions from urine analyses. We were able to record HRMS scans over several gas chromatographic peaks of interest in a number of urine fractions. The high resolution results were found to be of a high quality in mass measurement accuracy. When using the MAT 711 instrument for GC/HRMS the sensitivity of the ion source was a limiting factor in that less intense gas chromatographic peaks often lacked sufficient material to generate acceptable high resolution mass spectra. Notwithstanding this limitation the HRMS data recorded on different urine fractions was used to confirm the identification of several metabolites. If by chance the metabolite of concern was available only in quantities insufficient for direct GC/HRMS, preparative GC would be used to concentrate the component of interest for subsequent HRMS.

RESOURCE OPERATION

Over the term of this grant our mass spectrometry laboratories have provided support to numerous research projects in addition to the DENDRAL computer program development project funded under this grant. These cover a variety of applications at Stanford, in the United States, and abroad. Included are problems in the study of human metabolites, biochemistry, and natural product chemistry.
Samples have been run in collaboration with outside people both on the MAT-711 GC/High Resolution Mass Spectrometer system and the Finnigan 1015 GC/Low Resolution Quadrupole Mass Spectrometer system. The low resolution system has also been supported by a NASA research grant. The following tables summarize the support rendered in terms of numbers of samples run through various types of analysis:

I. MAT-711 High Resolution System (period covered 11/71 - 6/73)

Sample counts fall under four headings: Batch High Resol. MS, Batch Low Resol. MS, GC/High Resol. MS, GC/Low Resol. MS.

  DENDRAL program devel.: 317  3
  Stanford Genetics (Body fluid analysis): 39  17
  Stanford Chemistry (non-DENDRAL - Dr. Djerassi's group): 1 91  112
  Stanford Chemistry (non-DENDRAL - Drs. van Tamelen, Johnson, Mosher, Collman, Altman, Goldstein): 29  23  4
  Stanford Surgery (Dr. Fair): 8
  Dr. Adlerkreutz (Finland): 10
  Dr. Venien (France): 26
  Dr. Gilbert, Mors, Baker (Brazil): 40
  Dr. Orazi (Argentina): 19
  Dr. Subramanian (India): 10
  Dr. Khastgir (India), Dr. O'Sullivan (Ireland), Dr. Badr (Libya), Dr. Hital (India): 44  1  5  5  5  30  5  13  50
  Totals: 624, 215, 13 and 54 samples

II. Finnigan 1015 Low Resolution System (period covered 8/72-8/73)

Note that the samples run are specified by fluid type. Each fluid is extracted and derivatized as described in Part B(ii) and therefore may represent several GC/LRMS analyses. The results of various of the analyses run are discussed earlier in Part B(ii).

  Stanford Pediatrics (Drs. Cann, Sunshine and Johnson)
  Stanford Oncology (Dr. Rosenberg)
  Stanford Psychiatry - Genetics (Drs. Brodie and Cavalli-Sforza)
  Stanford Respiratory Medicine (Dr. Robin)
  Stanford Pharmacology (Dr. Kalman)
  Stanford Biochemistry (Dr. Stark)
  Stanford Children's Hospital (Drs. Wilbur and Long)
  UC San Francisco Medical School - Dermatology (Dr. Banda)
  Menlo Park V.A. Hospital (Dr. Forrest)
  Palo Alto V.A. Hospital (Drs. Hollister and Green)
  University of Puerto Rico School of Medicine (Dr. Garcia-Castro)

GC/Low Resolution MS samples: 141 urines, 7 amniotic fluids, 6 bloods, 2 cerebrospinal fluids, 20 urines, 4 cerebrospinal fluids, 2 urines, 2 bloods, 2 extracts, 4 extracts, 24 urines, 2 urines, 13 extracts, 7 extracts, 7 urines; 243 samples in total.

PART B PUBLICATIONS

The following summarizes the publications resulting from research in the low resolution mass spectrometry laboratory over the past year, including body fluid analysis. This laboratory has been jointly supported by NIH (DENDRAL) and NASA. The listed publications include research relevant to both sponsors.

The Determination of Phenylalanine in Serum by Mass Fragmentography. Clinical Biochem., 6 (1973). By W.E. Pereira, V.A. Bacon, Y. Hoyano, R. Summons and A.M. Duffield.

The Simultaneous Quantitation of Ten Amino Acids in Soil Extracts by Mass Fragmentography. Anal. Biochem., 55, 236 (1973). By W.E. Pereira, Y. Hoyano, W.E. Reynolds, R.E. Summons and A.M. Duffield.

An Analysis of Twelve Amino Acids in Biological Fluids by Mass Fragmentography. Anal. Chem. By R.E. Summons, W.E. Pereira, W.E. Reynolds, T.C. Rindfleisch and A.M. Duffield.

The Quantitation of beta-Aminoisobutyric Acid in Urine by Mass Fragmentography. Clin. Chim. Acta, in press. By W.E. Pereira, R.E. Summons, W.E. Reynolds, T.C. Rindfleisch and A.M. Duffield.

The Determination of Ethanol in Blood and Urine by Mass Fragmentography. Clin. Chim. Acta. By W.E. Pereira, R.E. Summons, T.C. Rindfleisch and A.M. Duffield.

A Study of the Electron Impact Fragmentation of Promazine Sulphoxide and Promazine using Specifically Deuterated Analogues. Austral. J. Chem., 26, 325 (1973). By M.D. Solomon, R. Summons, W. Pereira and A.M. Duffield.

Mass Spectrometry in Structural and Stereochemical Problems. CCXXXVII. Electron Impact Induced Hydrogen Losses and Migrations in Some Aromatic Amides. Org. Mass Spectrom., in press. By A.M. Duffield, G. DeMartino and C. Djerassi.

Spectrometrie de Masse. IX.
Fragmentations Induites par Impact Electronique de Glycols- -en Serie Tetraline. Bull. Soc. Chim. France, 2105 (1973).

Spectrometrie de Masse. VIII. Elimination d'Eau Induite par Impact Electronique dans le Tetrahydro-1,2,3,4-Naphtalene-diol-1,2. Org. Mass Spectrom., 7, 357 (1973). By P. Perros, J.P. Morizur, J. Kossanyi and A.M. Duffield.

Chlorination Studies I. The Reaction of Aqueous Hypochlorous Acid with Cytosine. Biochem. Biophys. Res. Commun., 48, 880 (1972). By W. Patton, V. Bacon, A.M. Duffield, B. Halpern, Y. Hoyano, W. Pereira and J. Lederberg.

Chlorination Studies II. The Reaction of Aqueous Hypochlorous Acid with alpha-Amino Acids and Dipeptides. Biochim. et Biophys. Acta, 313, 170 (1973). By W.E. Pereira, Y. Hoyano, R. Summons, V.A. Bacon and A.M. Duffield.

Chlorination Studies IV. The Reaction of Aqueous Hypochlorous Acid with Pyrimidine and Purine Bases. Biochem. Biophys. Res. Commun., 53, 1195 (1973). By Y. Hoyano, V. Bacon, R.E. Summons, W.E. Pereira, B. Halpern and A.M. Duffield.

Part C. EXTENSION OF THE THEORY OF MASS SPECTROMETRY BY COMPUTER

OBJECTIVES:

Part C of the DENDRAL effort, termed Meta-DENDRAL, aims at providing theory formation help for chemists interested in the mass spectrometric behavior of new classes of compounds. Our goals are necessarily long-range because theory formation by computer is itself an exciting, unsolved problem in computer science. We have chosen to explore this problem in the context of mass spectrometry in order to make frontier computer research results available to working scientists.

The problem of finding judgmental rules for use in a computer program is common to many biomedical computing projects, such as medical diagnosis and therapy recommendation programs. In order to give these programs the knowledge that makes them perform at acceptable levels, a medical expert is often asked to summarize his own knowledge of the problem area in rules that the program can use.
The meta-DENDRAL theory formation program is a paradigm for the kind of assistance that computers can give to the medical experts in this role. Programs of this sort can, first of all, provide the expert with an interpreted summary of a large collection of "hard" empirical data. Second, the program can suggest to the expert plausible rules that appear to explain major features of the data. Thus, the expert is able to assimilate large collections of data in the rules given to the computer. We believe that the meta-DENDRAL work is a useful model on which fruitful work in other biomedical problems can be based.

The over-all strategy of this research is to model the theory formation activity of scientists. We start with a set of empirical data which are known molecular structures and their associated mass spectra. By exploring the possible mechanistic explanations of each mass spectrum, the program is able to find a set of mechanisms that appear to be characteristic for the class of molecules. These characteristic mechanisms constitute the general mass spectrometry rules for the class, or a first-level theory for the class. Further refinements of the rules give more sophisticated restatements of the theory.

We have designed the programs in such a way as to provide useful results from the intermediate steps. The progress section discusses several sets of results that have been obtained, even though the entire program has not yet been completed.

PROGRESS:

In the past ten months (since January, 1973) the theory formation programs have seen significant application and significant new extensions. In addition, the work has been described in publications for both chemists and computer scientists.

Applications of Existing Programs.

The INTSUM program, for interpreting and summarizing the mass spectra of many known compounds of one class, was described in the previous annual report as essentially finished.
In this last period we have used this program to help understand the mass spectrometry of several classes of compounds, including estrogens, equilenins and other estrogenic steroids, androstanes, alkyl pregnanes, vinyl quinazolones, amino acids and aromatic acids. An article written for mass spectroscopists and soon to appear in Tetrahedron (Smith, et al., enclosed) describes this program and its usefulness in understanding the previously unreported mass spectrometry of the equilenins. The amino acid and aromatic acid results are useful for interpreting the mass spectra taken from those fractions of urine (see Part B). The INTSUM program is available to anyone who requests it, as stated in the article soon to appear. Because of the complexity of the program, we recommend that mass spectroscopists use this program on a network computer after they have collected a number of mass spectra from a class of compounds whose fragmentation mechanisms they wish to investigate.

Recent Extensions to Meta-DENDRAL. In this last period significant progress has been made on the theory formation programs that use the interpreted summary of the data provided by the INTSUM program. A simple rule formation program, described previously (MI7), finds the characteristic mass spectrometry mechanisms for a class of compounds, assuming that the compounds exhibit regular behavior as a class. Recent work has removed the restriction that the compounds must behave as a class - important classes can be found by the program within the set of given compounds. The procedure was described in a paper for the Third International Joint Conference on Artificial Intelligence, which is enclosed. At the same time that the rule formation program looks for characteristic mechanisms, the class separation procedure refines the class of molecules that appear to behave uniformly (i.e., appear to exhibit most of the characteristic mechanisms).
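The interplay of rule formation and class separation just described can be pictured with a small sketch. The Python below is illustrative only and is not the Meta-DENDRAL code: the function names, the mechanism labels, the sample data, and both thresholds (`min_fraction` for what counts as "characteristic", `min_share` for what counts as "most of the characteristic mechanisms") are assumptions made for the example.

```python
from collections import Counter

def characteristic_mechanisms(evidence, members, min_fraction=0.75):
    """Return the mechanisms shown by at least min_fraction of the members."""
    counts = Counter(mech for mol in members for mech in evidence[mol])
    needed = min_fraction * len(members)
    return {mech for mech, n in counts.items() if n >= needed}

def separate_class(evidence, min_fraction=0.75, min_share=0.5):
    """Alternate rule formation and class refinement until the class is stable.

    evidence maps each molecule name to the set of mechanisms for which its
    spectrum shows evidence (hypothetical stand-in for an interpreted summary).
    """
    members = set(evidence)
    while members:
        rules = characteristic_mechanisms(evidence, members, min_fraction)
        if not rules:
            return members, rules
        # Keep only molecules that exhibit enough of the current rules.
        keep = {mol for mol in members
                if len(evidence[mol] & rules) >= min_share * len(rules)}
        if keep == members:
            return members, rules
        members = keep  # strictly smaller, so the loop terminates
    return members, set()

# Hypothetical data: four amines and one compound that behaves differently.
evidence = {
    "amine_1": {"alpha_cleavage", "loss_of_methyl"},
    "amine_2": {"alpha_cleavage", "loss_of_methyl"},
    "amine_3": {"alpha_cleavage", "loss_of_methyl", "loss_of_water"},
    "amine_4": {"alpha_cleavage"},
    "ketone_1": {"mclafferty"},
}

members, rules = separate_class(evidence)
print(sorted(members))  # ['amine_1', 'amine_2', 'amine_3', 'amine_4']
print(sorted(rules))    # ['alpha_cleavage', 'loss_of_methyl']
```

In this toy run the first pass finds only one characteristic mechanism, the outlier is dropped from the class, and a second pass over the refined class recovers a second mechanism that the outlier had been masking.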
Another important extension of the theory formation program makes the rule descriptions more general and less specific to the class of compounds studied. The mechanisms in the rules are now described generally in terms of the kinds of bonds that break, and not in terms of the precise relations of the bonds to the skeletal structure common to the class. For example, a rule is now stated as "Any bond that is the second bond from a nitrogen atom is likely to break", rather than "In the skeleton R1-C2-N3-C4-R5 the bond between atoms 1 and 2 and the bond between atoms 4 and 5 are both likely to break". These general descriptions will allow much more freedom in the kinds of interpretations that can be placed on the INTSUM results. It is possible, for example, to alter the set of predicates used to describe bonds without altering the program.

The program can be conceptualized as a search program through the space of possible combinations of predicates. Some predicates describe the type of bond (e.g., 'single'), others describe the atoms joined by the bond (e.g., 'nitrogen', 'secondary'), and others describe the bonds and atoms next away from the bond that breaks. Some a priori heuristics limit consideration of complex predicates to chemically meaningful combinations, for example, by forbidding consideration of a single atom as both carbon and nitrogen. Other heuristics guide the process of expansion by forbidding a new predicate to be added to a description if its addition reduces the explanatory power of the existing description. For example, if a high average intensity is associated with breaking the X-X bond in X-X-Y and further specification of either of the X's reduces the average intensity, then the description is not changed.

In addition to the work just mentioned, a generative model of rule formation has been pursued by Carl Farrell in his dissertation work directed by Professor Feigenbaum and Dr. Buchanan.
He has written a program which accepts, as input, descriptions of specific molecules and all the primitive actions that might explain the mass spectra of those molecules. The output of the program is a set of general situation-action rules that describe classes of molecules that seem characteristically to show evidence of significant actions.

PLANS:

In the following period we plan to increase the performance capabilities of the theory formation program in several ways.

1. Sample Selection. The program's current strategy is to find the rules exhibited by most or all of the molecules in the initial sample. If the molecules are diverse, the rules will be diverse. Thus, we plan to add a preprocessor that can select a "simple" set of molecules for the rule formation to work with. For example, unbranched (straight-chain) compounds should be expected to present fewer complications for initial theory formation than highly branched compounds. The effects of the complicating features can be studied after the simple rules have been found.

2. Rule Clarification. After simple rules have been found, we want the program to clarify the conditions under which the rules hold. By studying more complicated molecules, the program can find the simple rules that no longer hold for these cases. For example, we want the program to discover that terminal alpha carbons (as in CH3-X-N) are special. Or, the program should discover the effects of double bonds by examining new cases even though the molecules in the original set contained no double bonds.

3. Experimentation. Because the original set of molecules contains the simpler examples from which it is easier to find characteristic mechanisms, the program will need to clarify rules in the way suggested under (2).
For a human scientist, this means describing new experiments to perform that will help place limits on the range of applicability of the rules. Looking at additional arbitrary molecules may be helpful, but not as helpful as looking at the specific molecules that will resolve specific questions about the preliminary rule set.

4. Integration of Results. When the program has examined two or more classes of molecules, it should be able to integrate the results into a common set of mechanisms (if any are common). The set of predicates used by the integration program may not have to be wider than the set used by the rule formation program, but one would expect the rules themselves to be more general. For example, integrating aliphatic amine and ether results should combine the separate alpha-cleavage rules (one with nitrogen, one with oxygen) into a more general rule (specifying 'N or O', or 'heteroatom').

PART C REFERENCES (Published or submitted during this year)

D.H. Smith, B.G. Buchanan, W.C. White, E.A. Feigenbaum, C. Djerassi and J. Lederberg, "Applications of Artificial Intelligence for Chemical Inference X. INTSUM. A Data Interpretation Program as Applied to the Collected Mass Spectra of Estrogenic Steroids". Tetrahedron. In press.

B.G. Buchanan and N.S. Sridharan, "Analysis of Behavior of Chemical Molecules: Rule Formation on Non-Homogeneous Classes of Objects". In Proceedings of the Third International Joint Conference on Artificial Intelligence, Stanford University (August, 1973). (Also Stanford Artificial Intelligence Project Memo No. 215.)

Related Publications

D. Michie and B.G. Buchanan, "Current Status of the Heuristic DENDRAL Program for Applying Artificial Intelligence to the Interpretation of Mass Spectra". August, 1973.

E.H. Shortliffe, S.G. Axline, B.G. Buchanan, T.C. Merrigan and S.N. Cohen, "An Artificial Intelligence Program to Advise Physicians Regarding Antimicrobial Therapy". Computers & Biomedical Research. In press.
HUMAN SUBJECTS

As a part of this research project, GC/MS analysis techniques will be applied to human body fluids in collaboration with clinical investigators, and blood and urine specimens will be collected from human subjects. Collection of VOIDED URINE SPECIMENS presents no risk to the patient. Blood samples will not be taken solely for the purpose of this research but rather will be collected as part of a diagnostic procedure deemed necessary for clinical diagnosis.

The undersigned agrees to accept responsibility for the scientific and technical conduct of the project and for provision of required progress reports if a grant is awarded as the result of this application.

Principal Investigator

APPENDIX A

FIGURES 1-3

FIGURE 1. [Numerical data; values not recoverable from this copy.]
FIGURE 2. [Computer printout of high resolution mass spectral data. Source: urine fraction, amino acids (Gehrke technique); date run 730313; 170 peaks found; 84 masses found; matching tolerance 4.000 mmu from mass 40 to 300. The printout gives a table of projection errors for the calibration masses and, for each measured mass, the peak area and the candidate elemental compositions with their errors in mmu. The tabulated values are not recoverable from this copy.]

FIGURE 3. GC/LRMS (Finnigan 1015 quadrupole mass spectrometer). Sample: glutamic acid N-TFA O-n-butyl ester derivative. [Plot labels: GVP 17H, amniotic fluid, D400, file 344; mass axis 40 to 500.]

APPENDIX B

LETTERS OF INTEREST

STANFORD UNIVERSITY
STANFORD, CALIFORNIA 94305
DEPARTMENT OF CHEMISTRY

December 17, 1973

Professor Carl Djerassi
Department of Chemistry
Stanford University
Stanford, California 94305

Dear Carl:

I am writing to indicate the anticipated use of mass spectral facilities by my research group in the foreseeable future.
As has been true in the past, we plan to utilize both GC/HRMS and simple HRMS for various purposes, especially 1) the determination of structure of enzymic cyclization products, including members of the lanosterol class, derived from squalene oxide-like substrates, the purpose being the elucidation of the mechanism of enzymic steroid synthesis, and 2) the characterization and confirmation of structures of intermediates in the synthesis of natural products, including polycyclic terpenoids, alkaloids of physiological interest, and nucleosides, and 3) identification and/or structure determination of organic materials employed in our organic-inorganic program devoted to nitrogen fixation and related processes.

Very truly yours,

E. E. van Tamelen
Professor of Chemistry

EEvT/jlb

OFFICE MEMORANDUM • STANFORD UNIVERSITY

DATE: December 3,
To: Carl Djerassi
FROM: Keith Hodgson
SUBJECT: Response to inquiry about GC/HRMS facility

In response to your three questions concerning the potential use of upgraded GC/HRMS facilities:

1. Yes, especially in the study of certain biological ligands and lower molecular weight ligand-metal complexes.

2. Potential use of the facility might run in the range of 8-10 samples per year most of which probably would be handled most easily by simple HRMS.

for the next

OFFICE MEMORANDUM • STANFORD UNIVERSITY

DATE: December 3, 1973
To: Professor Carl Djerassi
FROM: James P. Collman, Professor of Chemistry
SUBJECT:

Please excuse our belated response to your inquiry of November 20 concerning a potential upgrading of mass spectrometry facilities. The service you mention in your memo of the 20th would be valuable to us. We would have significant use for the GC/HRMS for a project dealing with models for cytochrome P450 based monooxygenases currently supported by the NIH.

JPC:lb

OFFICE MEMORANDUM • STANFORD UNIVERSITY
DATE: December 13, 1973
To: Professor Djerassi
FROM: Professor Harry S. Mosher
SUBJECT: Your proposal to the NIH

On our NIH Grant on the investigation of animal toxins we have been studying natural products from the skin of Central American frogs (atelopidtoxin) and some products from marine animals (nudibranchs) as well as some new choline esters isolated from the hypobranchial gland of various sea snails. Some, if not all, of these are mixtures. Obviously the new capabilities of the mass spectrometry laboratory would be of value to me. I expect only occasional use of HRMS and GC/HRMS, but on these occasions these techniques would be very important.

OFFICE MEMORANDUM • STANFORD UNIVERSITY

DATE: 14 December 1973
To: C. Djerassi
FROM: W. S. Johnson
SUBJECT:

The contemplated new facility for high resolution mass spectrometry and combined gas chromatography/high resolution mass spectrometry would be of extreme value to our research program concerning the non-enzymic biogenetic-like cyclization of polyolefines, a project which is presently supported by NIH Grant AM 3787-14. If this facility were to become available, we would expect to use it extensively in the analysis of product mixtures of the aforementioned cyclizations. We estimate that our need for the gas chromatographic capability would be about 20% of the total need for the mass spectrometry service.

STANFORD UNIVERSITY MEDICAL CENTER
STANFORD, CALIFORNIA 94305 • (415) 321-1200 EXT. 5785
STANFORD UNIVERSITY SCHOOL OF MEDICINE
Department of Anesthesia

November 30, 1973

Professor Joshua Lederberg
Department of Genetics
School of Medicine
Stanford University
Stanford, California 94305

Dear Dr. Lederberg:

Thank you for including my laboratories in the group which could be served by a GC/HRMS facility. As you know, Dr. Cohen and I have our own GC/MS/Computer System.
Our use of the proposed facility would be limited to those times when it is necessary to use high resolution to identify a metabolite. I would estimate a need for three GC/HRMS and three HRMS spectra per year. My work is entirely supported by the National Institutes of Health.

James R. Trudell, Ph.D.

JRT:rw

VETERANS ADMINISTRATION HOSPITAL
3801 MIRANDA AVENUE
PALO ALTO, CALIFORNIA 94304

IN REPLY REFER TO:

November 26, 1973

Professor J. Lederberg
Department of Genetics
Stanford University School of Medicine
Stanford, California 94305

Dear Prof. Lederberg:

Dr. Allen Duffield of your department has informed me that you plan to obtain additional apparatus that would provide high resolution GC/MS as a service to the Stanford community.

We have in the past used the hospitality of your department in the identification of metabolites and derivatives of phenothiazine drugs and cannabinoids by GC/MS. Originally, we had the collaboration of Dr. B. Halpern and more recently Dr. A. Duffield, who was instrumental in helping us with some of our problems.

Our department would indeed be most interested in availing ourselves of GC/MS analyses in the course of our current NIH projects which again are concerned essentially with drug metabolism and the isolation and characterization of unknown drug derivatives. As a rough estimate, I would think that we may be interested in the analyses of about five samples per month, two of which will require high resolution MS.

I certainly hope that your project to acquire the sophisticated new instrumentation you are seeking will be successful.

Sincerely yours,

Irene S. Forrest, Ph.D.
Chief, Biochem. Research Lab. (15W

ISF:jr

Show veteran's full name, VA file number, and social security number on all correspondence.

OFFICE MEMORANDUM • STANFORD UNIVERSITY

DATE: November 30, 1973
To: Joshua Lederberg
FROM: I. Rabinowitz, Ph.D.
D.I. Wilkinson, Ph.D.
SUBJECT: RE: NIH GC/HRMS Proposal

Research carried out in this department has strongly implicated a role for the prostaglandins in the etiology of psoriasis (E. M. Farber, K. Aso, 32nd Annual Meeting, American Academy Dermatology, Chicago, Ill., Dec. 1973; E. M. Farber et al, J. Invest. Derm., in preparation; E. M. Farber et al, Nature New Biology, in preparation). The prostaglandins are a class of C20 fatty acids, having molecular weights near 350 and basal tissue concentrations in the nanogram and picogram per gram range. The prostaglandins are presently detected by radioimmunoassay, bioassay and mass spectrometric techniques, among others. There is considerable controversy concerning the method of choice for measurement of absolute amounts of prostaglandin in various tissues. In particular, it has been suggested that mass spectrometric techniques yield more accurate quantitative assays than radioimmunoassay techniques (Adv. Biosciences, 2, 71-123, 1973, Ed. G. Raspé, S. Bernhard, Pergamon Press, N.Y.).

Radioimmunoassay techniques are currently in use in our laboratories, and the addition of mass spectrometry capability would greatly increase the definitiveness of our studies, as well as make available to us a powerful tool for the study of prostaglandin precursors and metabolites. Work to date has been supported in part by NIH Grant No. AM 15107.

D. I. Wilkinson, Ph.D.
Department of Dermatology

IR:DIW:ss

OFFICE MEMORANDUM • STANFORD UNIVERSITY

DATE: November 30, 1973
To: Joshua Lederberg, Department of Genetics
Carl Djerassi, Department of Chemistry
FROM: Eugene D. Robin, M.D., Department of Respiratory Medicine
SUBJECT: Your memo of November 20, 1973 describing a proposed GC/HRMS facility.
I have applied to the NIH for a continuation of my research grant, Adaptations To O2 Depletion, in which I have proposed to measure the redox state of NAD+/NADH and NADP+/NADPH by measuring the ratio of oxidized to reduced redox pairs using gas chromatography/mass spectrometry. These analyses will be conducted with the assistance of Drs. Alan Duffield and Wilfred Pereira of the Department of Genetics. I welcome the opportunity to have a GC/HRMS facility available on campus to support the GC/LRMS available in the Department of Genetics. The facility you propose to establish will be of importance to us in those instances where assignment of molecular composition to ionized fragments is crucial for mass spectral interpretation. I would anticipate using this service between one and two times a month.

Sincerely yours,

Professor of Medicine and Physiology

EDR:ods

VETERANS ADMINISTRATION HOSPITAL
3801 MIRANDA AVENUE
PALO ALTO, CALIFORNIA 94304

November 28, 1973

IN REPLY REFER TO:

Dr. Joshua Lederberg
Department of Genetics
Stanford University School of Medicine
Palo Alto, California 94305

Dear Dr. Lederberg:

I should be very pleased if you were able to obtain through the National Institutes of Health a GC/MS facility which could be shared jointly by members of the Stanford University faculty.

At present, I am being funded under grant DA-00424-01 for a study of the metabolism of marihuana. We have made significant progress in our methods of extracting metabolites, in isolating new ones by thin-layer chromatographic techniques, and by purifying them to some degree as determined by GLC. The big bottleneck has been the lack of ready access to a GC/MS set-up which would permit further characterization of the metabolites.

Our needs would be primarily for GC/low resolution MS, for which we have extensive need, perhaps the analysis of 15-25 samples per month. Depending on the outcome of these analyses, we might have 1 to 2 samples per month requiring GC/high resolution MS.
We anticipate having little need for high resolution MS without GC because of the fact that our samples are isolated from complex mixtures and are nearly impossible to purify.

If there is any way in which I could assist in helping obtain such a facility for the University, please let me know.

LEH:bh

STANFORD UNIVERSITY HOSPITAL
Pharmacy Department

Date: September 5, 1973
To: Dr. J. Lederberg, Director, Department of Genetics
From: Hiram H. Sera, Director
Subject: Drug Analysis Service with Gas Chromatograph and Mass Spectrometer.

I wish to express our appreciation to your department for assisting us in identifying a drug sample submitted to us from the E1 patient care area.

BACKGROUND:

The patient on E1A with G.I. disturbance, joint pains and occasional spike temperature was found to possess an unidentified medication in a plastic vial and was found to have self-administered the drug intramuscularly while in the hospital. The house staff was notified and the drug sample was submitted to us for immediate identification.

Through my previous association and knowledge of Drs. Summons' and W. Pereira's (in Dr. Duffield's instrumentation research laboratory) work with gas chromatograph and mass spectrometer, I had taken the liberty to request their assistance in the identification. In an hour, the determination was made and the drug was found to be Pentazocine or Talwin, which is a synthetic analgesic used commonly in this hospital in tablet and injection forms.

Since we do occasionally receive similar requests from physicians, I wish to call on your staff again in the future. Thank you.

HHS:lh

cc: Mr. John Williams
Dr. Roger Summons
Dr. W. Pereira
Dr. A.
Duffield

OFFICE MEMORANDUM • STANFORD UNIVERSITY

DATE: November 26, 1973
To: Joshua Lederberg, Department of Chemistry
Carl Djerassi, Department of Chemistry
FROM: Sumner M. Kalman
SUBJECT: Mass Spectrometry, Your Memo of November 20, 1973.

A central facility for mass spectrometry and GC/MS would be highly desirable from my point of view. We often need to identify metabolites of drugs that interfere with our assays, and that represent research problems as well. Frequently we need to check the purity of a reference material which is in short supply. I have received much help from both your laboratories in the past and would welcome the opportunity to use an expanded facility. For many of our problems low resolution MS is satisfactory and I hope you mean to provide this service too.

With respect to your questions I anticipate that (1) Yes. (2) We would probably use GC/MS once a month or more. We would use MS at about the same rate. (3) Yes.

Sincerely yours,

Sumner M. Kalman, M.D.
Professor of Pharmacology
Director, Drug Assay Lab

OFFICE MEMORANDUM • STANFORD UNIVERSITY

DATE: 3 December 1973
To: Joshua Lederberg
FROM: Jack Barchas
SUBJECT: GC/HRMS

Our thanks to you and Alan Duffield for inquiring of our interest in the proposed GC/HRMS. We would find it quite useful, as we are currently applying for funding for a quadrupole mass spectrometer for mass fragmentography studies. With such a unit, there would be many times when the capability of the HRMS instrumentation would be valuable in structural elucidation. We would expect very heavy utilization of our instrument if we were to obtain the funding, and, therefore, would expect to make considerable use of the proposed GC/HRMS, which is an essential ancillary tool.
The GC aspects of the instrument would be valuable, since we would expect to be studying a number of unknowns and the GC separation would be an integral part of that process.

Our work is supported by NIMH, ONR, NASA, and the Alcohol Abuse division of HEW.

JDB/rs

NATIONAL AERONAUTICS AND SPACE ADMINISTRATION
AMES RESEARCH CENTER
MOFFETT FIELD, CALIFORNIA 94035

REPLY TO ATTN OF: LPE: 239-q

November 28, 1973

Professor Joshua Lederberg
Department of Genetics
School of Medicine
Stanford University
Stanford, CA 94305

Dear Professor Lederberg:

I was delighted to learn of your proposed plans to upgrade your mass spectrometry capabilities by providing routine high resolution mass spectrometry (HRMS) and combined gas chromatography/high resolution mass spectrometry (GC/HRMS). Such a service could be of inestimable value to our program.

As you know, we are developing gas chromatography/high resolution mass spectrometry facilities for NASA's interests. In particular, we are modifying our equipment in order to determine carbon and nitrogen isotopic compositions of organic molecules. If available, we would use your proposed facilities for our routine GC/HRMS analysis of biologically significant molecules which are sought in our program. Most of our work requires GC/HRMS as opposed to HRMS.

In addition, we are also most interested in computer programs which aid in mass spectral interpretations. Although we have a few of our own programs, we would be most eager to upgrade our own interpretation capabilities through use of programs from your facility.

Our work thus far has been supported solely by NASA; we are not supported at present by NIH.

I hope that our expression of interest will be of use to you in obtaining funding for a potentially most useful analytical facility.

Sincerely yours,
Keith A. Kvenvolden
Chief, Chemical Evolution Branch
OFFICE MEMORANDUM, STANFORD UNIVERSITY

Date: November 28, 1973
To: Joshua Lederberg, Ph.D. S331
From: William R. Fair, M.D. S287
Subject: Use of facilities for high resolution mass spectral analysis with gas chromatography

As your memo of November 1973 requested, we have answered the questions concerning our interest in GC/HRMS.

1. This service would be of definite value to us in two projects currently being investigated in our laboratories.
a) The identification, distribution, and biological significance of the prostatic antibacterial factor (PAF). Our preliminary experiments indicate that this is a basic polypeptide, perhaps attached to a divalent metal such as zinc.
b) This service would also be of value in the determination of the urinary polyamine levels in patients with various genitourinary tract malignancies. Our initial experiments along this line indicate that there is significant elevation of polyamines in patients with prostatic carcinoma. The use of GC/HRMS would enable a more precise quantitation of these differences and enable us to expand our research into other areas concerning the biochemical significance of the polyamines.

2. I would estimate that on the PAF project we would use approximately 2-4 samples per month and perhaps 10-12 samples per month on the polyamine projects. Both of these projects would require the use of GC/HRMS.

3. A portion of our research on the PAF is currently supported by a grant from the NIH. The amount of this grant is $36,698, and this grant will terminate on December 31, 1974.