The Molecular Basis of Biological Specificity

by Linus Pauling

Institute of Orthomolecular Medicine
2700 Sand Hill Road, Menlo Park, California 94025

During the decade 1930-1940 I formulated a general theory of the molecular basis of biological specificity, involving the idea that biological specificity results from the interaction of complementary molecular structures, with hydrogen bonds among the most important of the weak intermolecular forces between the interacting molecules. The most striking example of specific biological interactions of this sort is the interaction between the two complementary strands of the DNA molecule in the double helix discovered by Watson and Crick twenty-one years ago.

My early work was on the determination of the structure of crystals by the x-ray diffraction technique, the determination of the structure of gas molecules by electron diffraction, and the application of quantum mechanics to physical and chemical problems, especially the structure of molecules and the nature of the chemical bond. In 1929, when Thomas Hunt Morgan came to the California Institute of Technology, bringing with him a number of very able younger biologists, I began to become familiar with biological problems, and to think about possible ways in which biological specificity could be explained in terms of interactions between molecules. I worked on several problems of biological specificity, from the molecular point of view, without success; one of them was the problem of explaining the self-sterility of the marine organism Ciona (the sea squirt), which was being studied by Morgan.

In 1934 the problem of the shape of the oxygen equilibrium curve of hemoglobin attracted my attention. Consideration of the structure of hemoglobin led to the idea that investigation of the magnetic properties of this substance and its derivatives would provide valuable information, and work along these lines, in collaboration with C.
D. Coryell and a number of students, was initiated. Alfred E. Mirsky of the Rockefeller Institute for Medical Research, who had been studying hemoglobin for several years, came to Pasadena for a year, and he and I formulated a theory of the structure of native, denatured, and coagulated proteins, based upon the concept that a native protein molecule consists of one polypeptide chain (or of two or more such chains) folded into a uniquely defined configuration, in which it is held by hydrogen bonds between the peptide nitrogen and oxygen atoms, as well as by other weak forces, with denaturation involving a loss of this well-defined structure. 1

In 1936, while I was on a short visit to the Rockefeller Institute for Medical Research, Carl Landsteiner asked me how I would explain the observed properties of antibodies and antigens by means of their molecular structure. I thought about this problem during the following years, and consulted with Landsteiner about the interpretation of sometimes conflicting experimental results. By 1940 I had formulated a theory of the structure and process of formation of antibodies. 2 This theory was based upon the concept that the specific combining region of an antibody molecule is complementary in structure to a portion of the surface of the antigen, with the antigen-antibody bond resulting from the cooperation of weak forces (electronic Van der Waals forces, electrostatic interaction of charged groups, and hydrogen bonding) between the complementary structures, over an area sufficiently large that the total binding energy could resist the disrupting influence of thermal agitation. Precipitating and agglutinating antibodies were assumed to be bivalent, consisting of a central part, with structure common to all or almost all antibodies produced by the animal, and two end parts, the combining regions, with structure complementary to that of the antigen.
(The idea of complementary structures for antibody and antigen was suggested by Breinl and Haurowitz, 3 Alexander, 4 and Mudd. 5 There is some intimation of it in the early work of Ehrlich and Bordet.) The complementary combining regions were assumed to be formed by the folding of polypeptide chains in the presence of the antigen, in such a way that the forces of attraction would mold the folding chain into a structure complementary to that of a portion of the antigen, with the folded chain then being held in this configuration by hydrogen bonds and other interactions, even after the antibody has dissociated from the antigen on which the combining group was molded.

Dan Campbell, David Pressman, and a number of other workers in our laboratory carried out experimental studies that verified the valence 2 for precipitating and agglutinating antibodies 6, 7 and that left no doubt that the combining regions of antibodies are complementary in structure to the homologous haptenic groups of the antigens. 8 The fit of the combining region of the antibody to the hapten was shown to be close, better than 20 pm in some cases, 8 and the effects of Van der Waals attraction, electrostatic forces, and hydrogen-bond formation were separately verified in quantitative hapten-inhibition studies. A satisfactory theoretical explanation of quantitative values of free energy of combination of haptens with antibodies homologous to the o-, m-, and p-azobenzenearsonic acid groups on the basis of known intermolecular interactions was reported in 1945. 9 For several haptens with various groups substituted in the positions of the azo group in the hapten of the immunizing antigen the standard free energy of combination, as given by hapten inhibition constants, was found to be proportional to the calculated Van der Waals interaction with the surrounding antibody, which includes proportionality to the electric polarizability of the group.
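The step from a hapten inhibition constant to a standard free energy of combination rests on the usual thermodynamic relation, free energy = -RT ln K. The following is a minimal sketch of that conversion, not the procedure of the cited studies: it assumes the constant is expressed relative to a reference hapten (so that the reference has K = 1 and a relative free energy of zero) and assumes a temperature of 298.15 K. The function names are hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618  # molar gas constant, J mol^-1 K^-1


def free_energy_kj_per_mol(k_rel, temperature_k=298.15):
    """Free energy of combination relative to a reference hapten, in kJ/mol.

    k_rel is the inhibition constant relative to the reference hapten
    (an assumed convention); temperature_k is an assumed temperature.
    """
    if k_rel <= 0:
        raise ValueError("the relative constant must be positive")
    return -GAS_CONSTANT * temperature_k * math.log(k_rel) / 1000.0


def constant_from_free_energy(delta_g_kj, temperature_k=298.15):
    """Inverse relation: relative constant from a free energy in kJ/mol."""
    return math.exp(-delta_g_kj * 1000.0 / (GAS_CONSTANT * temperature_k))


# Illustrative values only, not data from the studies cited in the text.
for k in (0.5, 1.0, 10.0):
    print(f"K_rel = {k:5.1f}  relative free energy = "
          f"{free_energy_kj_per_mol(k):6.2f} kJ/mol")
```

Because the relation is logarithmic, free-energy terms from separate interactions add, while the corresponding constants multiply.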
For groups forming hydrogen bonds the energy of the hydrogen bond (1.5 to 3 kJ mole⁻¹, representing the difference in energy of the hydrogen bond formed by the hapten with antibody and with water) was needed, in addition to the term corresponding to electronic Van der Waals interaction. The effect of electric charge was determined by comparison of haptens closely similar in shape, but with a difference in electric charge: in one case 10 comparison of haptens with either trimethylammonium ion or tertiary butyl group, and in the other case with either carboxylate ion or nitro group. In each comparison there was indication of a complementary electric charge in the antibody, close to the charge in the immunizing antigen. The magnitude of the effect showed the charge in the antibody to be within 320 pm (first case) or 260 pm (second case) of the minimum distance permitted by the Van der Waals radii of the groups. I think that this work, which was based on earlier work by Landsteiner and his collaborators, 12 leaves no doubt that the specificity of antibodies is the result of the complementariness in structure of the combining group and a portion of the surface of the homologous antigen.

It became evident that non-biological specificity could also be explained in terms of complementariness. I gave an example in a lecture on analogies between antibodies and simpler chemical substances: 13 "The reaction shown by simple chemical substances that is analogous to that of specific combination of antigen and antibody is the formation of a crystal of a substance from solution. A crystal of a molecular substance is stable because all of the molecules pile themselves into such a configuration that each molecule is surrounded as closely as possible by other molecules - that is, if a molecule were to be removed from the interior of a crystal, the cavity that it would leave would have very nearly the shape of the molecule itself.
We can say that the part of a crystal other than a given molecule is very closely complementary to that molecule. Other molecules, with different shape and structure, would not fit into this cavity nearly so well, and in consequence other molecules in general would not be incorporated in a growing crystal. This is the explanation of the astounding chemical process of purification by crystallization - from a very complicated system, such as, for example, grape jelly, containing hundreds of different kinds of molecules, crystals which are nearly chemically pure may be formed, such as crystals of cream of tartar, potassium hydrogen tartrate."

In the same paper it is stated that "although crystallization is the only simple chemical reaction that shows striking similarity to serological reactions with respect to specificity, there are many physiological phenomena that are similarly specific, and for which the specificity can be given a similar explanation. The specificity of the catalytic activity of enzymes is due to a surface configuration of the enzyme such as to make the enzyme complementary to the substrate molecule, or, rather, to the substrate molecule in the strained state that occurs during the catalyzed reaction. The specific action of drugs and bactericidal substances have a similar explanation. Even the senses of taste and odor are based upon molecular configuration rather than upon ordinary chemical properties - a molecule which has the same shape as a camphor molecule will smell like camphor even though it may be quite unrelated to camphor chemically. I am convinced that it will be found in the future, as our understanding of physiological phenomena becomes deeper, that the shapes and sizes of molecules are of just as great significance in determining their physiological behavior as are their internal structure and ordinary chemical properties."

In 1940 Max Delbrück and I 14 published a discussion of the intermolecular forces operative in biological processes. P. Jordan had advanced the idea that there exists a quantum-mechanical stabilizing interaction that operates preferentially between identical or nearly identical molecules or parts of molecules, and is of great importance for biological processes, including the production of new genes identical with the old ones. Delbrück and I pointed out that the specific quantum-mechanical forces between identical molecules could not be large enough to cause a specific attraction between like molecules under the conditions of excitation and perturbation prevailing in living organisms, and therefore could not be effective in bringing about autocatalytic reactions. We wrote that "It is our opinion that the processes of synthesis and folding of highly complex molecules in the living cell involve, in addition to covalent-bond formation, only the intermolecular interactions of Van der Waals attraction and repulsion, electrostatic interactions, hydrogen-bond formation, etc., which are now rather well understood. These interactions are such as to give stability to a system of two molecules with complementary structures in juxtaposition, rather than of two molecules with necessarily identical structures; we accordingly feel that complementariness should be given primary consideration in the discussion of specific attraction between molecules and the enzymatic synthesis of molecules." We mentioned that "The case might occur in which the two complementary structures happened to be identical; however, in this case also the stability of the complex of two molecules would be due to their complementariness rather than their identity."

Some time later 15 I discussed the matter of gene replication in more detail: "I believe that the genes serve as the templates on which are molded the enzymes that are responsible for the chemical characters of the organisms, and that they also serve as templates for the production of replicas of themselves. The detailed mechanism by means of which a gene or a virus molecule produces replicas of itself is not yet known. In general the use of a gene or virus as a template would lead to the formation of a molecule not with identical structure but with complementary structure. It might happen, of course, that a molecule could be at the same time identical with and complementary to the template on which it is molded. However, this case seems to me to be too unlikely to be valid in general, except in the following way. If the structure that serves as a template (the gene or virus molecule) consists of, say, two parts, which are themselves complementary in structure, then each of these parts can serve as the mold for the production of a replica of the other part, and the complex of two complementary parts thus can serve as the mold for the production of duplicates of itself." The same statements were made in the spring of 1948 in lectures in Oxford, Cambridge, London, and elsewhere.

The hydrogen bond was recognized by Latimer and Rodebush as an important structural feature over fifty years ago. 16 In their 1920 paper they mentioned that "Mr. Huggins of this laboratory in some work as yet unpublished has used the idea of a hydrogen kernel held between two atoms as a theory in regard to certain organic compounds." Mirsky and Pauling in 1936 pointed out the importance of the hydrogen bond in the determining of the structure of proteins. 1 In the same year Huggins also discussed protein structures in a more detailed way, with hydrogen bonds between the NH and CO groups of the main chains.
17 A few years later Huggins described several helical structures for polypeptide chains, with intrachain hydrogen bonds. 18 These structures were needlessly restricted to having an integral number of amino-acid residues per turn of the chain, and, moreover, Huggins did not require the amide groups to be planar, although the planarity of these groups had been recognized since 1932, 19 and had already been verified by several determinations of the structure of simple peptide crystals in our laboratory. It is unfortunate that Huggins was handicapped by these two erroneous assumptions in his imaginative and otherwise sound attack on the problem of the secondary structure of proteins. The same two erroneous assumptions provided a similar insuperable barrier to the vigorous attack made by Bragg, Kendrew, and Perutz on the same problem. 20 In the meantime, Corey and other investigators in Pasadena had determined the crystal structures of a number of amino acids and simple peptides, and Pauling and Corey had discovered the alpha helix and the parallel-chain and antiparallel-chain pleated sheets. 21 The discovery of the alpha helix left no doubt about the importance of helical structures and of hydrogen bonds in determining the secondary structures of proteins.

I had been interested in the nucleic acids since 1933, when Sherman and I calculated the resonance energy of guanine and other purines. 22 My colleague Robert B. Corey had made some x-ray diffraction photographs of fibers of nucleic acid, which were, however, of somewhat poorer quality than those published by Astbury and Bell. 23 I began work on the problem of interpreting the x-ray photographs on 26 November 1952; on the preceding day I had attended a seminar in biology in the California Institute of Technology, at which Professor Robley Williams of the University of California, Berkeley, showed a slide of an electron microscope photograph of molecules of sodium ribonucleate.
He said that the small fibrils had a diameter of about 1.5 nm, and that they were apparently cylindrical, in that only one diameter was shown. The x-ray photographs indicated an identity distance along the axis of the molecule of 340 pm, and, with the measured density of ribonucleic acid, about 1.62 g cm⁻³, it was indicated that the fibers contain two or three molecules, probably helices twisted about one another. The value of the spacing of the principal equatorial x-ray reflection had been shown to decrease with decreasing amount of hydration of the fibers, with minimum value 1.62 nm. I assumed this value to correspond to essentially anhydrous nucleic acid, and, with use of the density, I calculated the number of polynucleotide chains per unit to be exactly three. This result surprised me, because I had expected the value 2 if the nucleic acid fibers really represented genes. I decided, however, that probably the fibers were artifacts, produced by the process of extraction from cells and the subsequent stretching. During the next month I strove to find a way of arranging the polynucleotide chains in a triple helix, and was successful, although the structure was described as "an extraordinarily tight one, with little opportunity for change in positions of the atoms". The paper in which this structure was described was communicated to the Proceedings of the National Academy of Sciences on 31 December 1952, and a copy of the manuscript was sent to Watson and Crick. 24

In hindsight, it is evident that I made a mistake on 26 November 1952 in having decided to study the triple helix rather than the double helix. It is likely that the fibers giving the equatorial spacing 1.62 nm contained some water, and also had density smaller than 1.62 g cm⁻³. The diameter 1.5 nm observed by Williams for nucleic acid molecules corresponds, with density assumed 1.6 g cm⁻³ and unit translation 340 pm along the molecular axis, to two molecules in a helical structure (calculated diameter 1.6 nm) rather than to three (1.9 nm). I am now astonished that I began work on the triple-helix structure, rather than on the double helix. I had not forgotten that Delbrück and I had suggested that the gene might consist of two complementary molecules, but for some reason, not clear to me now, the triple-chain structure apparently appealed to me, possibly because the assumption of a three-fold axis simplified the search for an acceptable structure.

I cannot say what would have happened if I had made the other assumption, that of a double helix, on 26 November 1952, or if I had succeeded in getting access to the diffraction photographs of DNA that had been made by Wilkins. There is a chance that I would have thought of the Watson-Crick structure during the next few weeks. I knew that the purines and pyrimidines were present in nucleic acid in equal amounts, but I had not drawn the reasonable conclusion about purine-pyrimidine pairs. I knew about hydrogen bonding by purines and pyrimidines. Nevertheless, I myself think that the chance is rather small that I would have thought of the double helix in 1952, before Watson and Crick made their great discovery. After all, I had spent part of the summer of 1937 in a search for ways of folding polypeptide chains, with planar amide groups of the correct dimensions and with hydrogen bonds between the CO and NH groups of residues separated by some distance along the chain, in such a way as to account for the x-ray diffraction photographs of alpha keratin, but without success. There was no reason why the alpha helix should not have been discovered then, rather than eleven years later, when it was discovered after a few hours of work. There is no doubt that even rather simple ideas sometimes are very elusive.
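The density arguments above (the count of chains per unit, and the diameters expected for two and for three chains) can be retraced with a short calculation. This is a sketch under stated assumptions, not necessarily the calculation made in 1952: it assumes a mean nucleotide residue mass of about 330 g/mol, one residue per chain for each 340 pm along the axis, hexagonal packing of the molecules with the 1.62 nm equatorial spacing taken as the distance between close-packed rows, and a solid cylinder for the diameter estimate. None of these assumptions is stated in the text, and the function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole

# Assumption (not from the text): mean molar mass of one nucleotide residue.
RESIDUE_MASS = 330.0  # g/mol


def residue_volume_nm3(density_g_cm3, residue_mass=RESIDUE_MASS):
    """Volume occupied by one nucleotide residue, in nm^3."""
    volume_cm3 = residue_mass / (density_g_cm3 * AVOGADRO)
    return volume_cm3 * 1e21  # 1 cm^3 = 1e21 nm^3


def chains_per_unit(spacing_nm, rise_nm, density_g_cm3):
    """Residues per axial repeat of one molecule, i.e. chains per molecule.

    Assumes hexagonal packing, with the equatorial spacing equal to the
    distance between close-packed rows of molecules (an assumption).
    """
    axis_distance = 2.0 * spacing_nm / math.sqrt(3.0)
    area_per_molecule = (math.sqrt(3.0) / 2.0) * axis_distance ** 2
    return area_per_molecule * rise_nm / residue_volume_nm3(density_g_cm3)


def cylinder_diameter_nm(n_chains, rise_nm, density_g_cm3):
    """Diameter of a solid cylinder holding n_chains residues per axial repeat."""
    area = n_chains * residue_volume_nm3(density_g_cm3) / rise_nm
    return 2.0 * math.sqrt(area / math.pi)


chains = chains_per_unit(spacing_nm=1.62, rise_nm=0.340, density_g_cm3=1.62)
d_two = cylinder_diameter_nm(2, rise_nm=0.340, density_g_cm3=1.6)
d_three = cylinder_diameter_nm(3, rise_nm=0.340, density_g_cm3=1.6)
print(f"chains per unit: {chains:.2f}")
print(f"diameter, two chains: {d_two:.2f} nm; three chains: {d_three:.2f} nm")
```

Under these assumptions the chain count comes out close to three, and the two-chain cylinder lies nearer to the 1.5 nm reported by Williams than the three-chain cylinder does. The exact figures shift with the assumed residue mass.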
It is my opinion that if Watson and Crick had not carried on their persistent effort, and had not had the benefit of advice about the structures of the nitrogen bases and hydrogen bonds from Jerry Donohue and information from the excellent x-ray diffraction photographs of Wilkins, the discovery of the double helix, which has led to such great developments in molecular biology, might well have been delayed for several years.

References

1. Mirsky, A. E., and Pauling, L., Proc. U. S. Nat. Acad. Sci. 22, 439 (1936).
2. Pauling, L., J. Amer. Chem. Soc. 62, 2643 (1940).
3. Breinl, F., and Haurowitz, F., Z. physiol. Chem. 192, 45 (1930).
4. Alexander, J., Protoplasma 14, 296 (1931).
5. Mudd, S., J. Immunol. 23, 423 (1932).
6. Pauling, L., Pressman, D., Campbell, D. H., Ikeda, C., and Ikawa, M., J. Amer. Chem. Soc. 64, 2994 (1942).
7. Pauling, L., Pressman, D., and Campbell, D. H., J. Amer. Chem. Soc. 66, 330 (1944).
8. Pauling, L., Pressman, D., and Grossberg, A. L., J. Amer. Chem. Soc. 66, 784 (1944).
9. Pauling, L., and Pressman, D., J. Amer. Chem. Soc. 67, 1003 (1945).
10. Pressman, D., Grossberg, A. L., Pence, L. H., and Pauling, L., J. Amer. Chem. Soc. 68, 250 (1946).
11. Pressman, D., Swingle, S. S., Grossberg, A. L., and Pauling, L., J. Amer. Chem. Soc. 66, 1731 (1944).
12. Landsteiner, K., The Specificity of Serological Reactions, Harvard University Press, 1945.
13. Pauling, L., Chem. Eng. News 24, 1064 (1946).
14. Pauling, L., and Delbrück, M., Science 92, 77 (1940).
15. Pauling, L., Molecular Architecture and the Processes of Life (21st Sir Jesse Boot Foundation lecture, 28 May 1948), published by the Sir Jesse Boot Foundation, Nottingham, 1948.
16. Latimer, W. M., and Rodebush, W. H., J. Amer. Chem. Soc. 42, 1419 (1920).
17. Huggins, M. L., J. Org. Chem. 1, 407 (1936).
18. Huggins, M. L., Chem. Revs. 32, 195 (1943).
19. Pauling, L., Proc. U. S. Nat. Acad. Sci. 18, 293 (1932).
20. Bragg, W. L., Kendrew, J., and Perutz, M., Proc. Roy. Soc. London 3, A-1074 (1950).
21. Pauling, L., and Corey, R. B., J. Amer. Chem. Soc. 72, 5349 (1950).
22. Pauling, L., and Sherman, J., J. Chem. Phys. 1, 606 (1933).
23. Astbury, W. T., and Bell, F. O., Nature 141, 747 (1938).
24. Pauling, L., and Corey, R. B., Proc. U. S. Nat. Acad. Sci. 39, 84 (1953).