PART 1. NUCLEIC ACIDS

THE STRUCTURE OF THE NUCLEIC ACIDS AND RELATED SUBSTANCES

BY F. H. C. CRICK
The Medical Research Council Unit for the Study of the Molecular Structure of Biological Systems, Cavendish Laboratory, Cambridge, England

There have been a number of recent reviews covering the structure of deoxyribonucleic acid (DNA),1,2,3 ribonucleic acid (RNA),4 and the synthetic RNA-like polyribonucleotides.5 It therefore seems unnecessary to go over this subject in detail again. In this short paper I shall try to review the reported results in a broader way, in an attempt to emphasize those features that the proposed structures have in common. Such an attempt is perhaps a little premature, but sufficient data are now available to make possible a beginning of this sort.

I shall not discuss in detail the method of obtaining structures from the rather meager information contained in X-ray diagrams, but a few points need emphasis.

First, it is essential to know the chemical formula of the material being studied. In most cases what is available is not the exact chemical formula, but the general one. More precisely, the formula of the backbone is known, and also the manner in which the purine and pyrimidine bases are attached; their precise sequence remains unknown. Fortunately this is not always as serious a handicap as might be expected since, under X rays, at low resolution, one base looks somewhat like another provided the bases occupy roughly the same position. It would be much more serious, for example, if it were not known whether an α or a β glycosidic link were present. In practice, these difficulties have proved mainly of importance in studying RNA, especially at the period when it was thought that this acid might be rather extensively branched.

Second, the X rays show clearly only the repeating part of the structure, and they can be used effectively only to study structures that are spatially regular.
Moreover, examination by X ray is a poor way to obtain an answer to the question "how much of the material is in the regular structure?"

Third, information from other sources is useful. Thus, the titration data on DNA have proved most informative, as has the infrared dichroism.6,7 It would be a great advantage to have more studies on optical rotation to confirm or establish the "hand" of the various helices proposed. At the moment we should like very much to know just how much pairing of bases there is in RNA, and between which bases it occurs.

Fourth, it is important to realize that most (although not all) studies must be made on synthetic materials or on materials extracted from their natural context in the cell. Wilkins' work on various intact cells is a brilliant exception to this realization;8,9 it is at present a grave anxiety to us that similar studies on RNA in intact cells have not been feasible.

The Chemical Formula

The information we have is well known,10 and it can be summarized briefly. The phosphate-sugar backbones of DNA, RNA, and the RNA-like polymers are all much alike, the only difference being the substitution of an H for an OH in the 2' position of the sugar in the case of DNA. The glycosidic linkage is believed to be always a β one. The bases in the natural materials are similar; the same four bases commonly occur in both, except that, in DNA, uracil is replaced by thymine (5-methyl uracil).

I think it is more interesting to consider for a moment not what we do find in nature, but what we do not find. It might perhaps have been expected, by analogy with polysaccharides, that we would find mixed nucleic acids containing in the same chain both ribose and deoxyribose residues, either arranged in some repeating alternation, or "at random." The absence of these I believe to be an important fact.
To a crystallographer the regularity of the backbone suggests that the structure is likely to be stereochemically regular, and leads him to surmise that perhaps this regularity may be of biological importance. From the genetic angle one notes that the information cannot easily be stored in the backbone alone, as it might be if the ribose-deoxyribose sequence were used as a code; any such information as to the backbone must therefore come from configurational variations, and it seems improbable that these could be sufficiently stable to be acceptable to a geneticist. The regularity of the backbones, then, underlines the irregularity of the base sequence, and it strongly suggests that any genetic information in the nucleic acid is carried by the base sequence, and only by this sequence.

We must briefly note the two rules proposed by Chargaff11,12 for the over-all base composition of natural nucleic acids. They are

\[ \frac{A}{T} = \frac{C}{G} = 1 \]

\[ \frac{A + C}{G + U} = 1 \]

the letters referring to adenine (A), cytosine (C), thymine (T), guanine (G), and uracil (U), on a molar basis. I should not wish to be drawn into controversy as to how far these rules are valid, except to say that the first is obeyed for most specimens of DNA, while the second one, which seems to apply in many cases to material from whole cells, appears to be less well founded; it is probably not correct for certain plant viruses. Nor shall I do more than mention the occurrence of bases other than the standard four, since I have recently discussed elsewhere some aspects of the problems they raise.

SPECIAL PUBLICATIONS NEW YORK ACADEMY OF SCIENCES

The Proposed Structures

We can now proceed to an examination of the proposed structures, the most important features of which are set out in TABLE 1.

TABLE 1

Material                    Number and disposition of chains   Screw axis                           Distance between successive phosphates
DNA "A"                     2, antiparallel                    2.55 A., 33°                         5.7 A.*
DNA "B"                     2, antiparallel                    3.3[?] A., 36°                       6.8 A.*
RNA, poly AU, poly AUGC     unknown; probably 2                (probably similar to poly A)
Poly A + poly U             2 (parallel?)                      3.2 to 3.6 A., probably about 36°    greater than 7 A.
Poly A                      2, parallel                        3.85 A., about 43°                   7.1 A.
Poly I                      probably more than one             ?                                    ?
Poly C                      (some structure, but not known)    ?
Poly U                      amorphous                          -                                    -

The abbreviations used in this table are explained in the text.
*I am indebted to Wilkins1 for this information.

A few comments are necessary on the nature of the materials. Thus DNA "A" and DNA "B" are the two forms of the DNA structure; the latter occurs at higher humidities and in solution, while the former appears at lower humidities. In most specimens the two are interconvertible. No significant differences have been found in the X-ray patterns of DNA from different biological sources, so it is unnecessary to indicate the type of DNA; in fact, most of the X-ray work has been done with calf thymus DNA.

The term "poly" in the table refers to the synthetic polymers first produced by Grunberg-Manago et al.,13 who used an enzyme system from bacteria. These polymers have the same backbone as natural RNA, but differ in the sequence of bases. In poly A, for example, every base is adenine. Nobody has yet succeeded in making a satisfactory poly G, so I have omitted it from the table. Poly AU, when synthesized from an equimolar mixture of adenylic acid and uridylic acid, has approximately equal amounts of adenine and uracil, and Heppel et al.14 have shown that their sequence is random or very nearly so. Poly A + poly U describes the structure formed by mixing preformed poly A with preformed poly U.

X-ray diagrams of RNA from a variety of sources are all very poor, and they are indistinguishable from those given by poly AU and poly AUGC, so these three materials have been listed together.4,5 Naturally it is not impossible that improved X-ray pictures might show up significant differences.

X-ray work on all these compounds is being actively pursued, but much remains to be done, both on compounds that exist, such as poly I, and on some not yet made, such as poly CU and poly ACU. Alexander Rich discusses elsewhere in these pages later developments in this field, but I shall mention very briefly poly A (and poly A + poly U) to make my subsequent points intelligible.

The structure of poly A has been studied by Rich, J. D. Watson, D. R. Davies, and myself, and also by G. Morgan and R. S. Bear at the Massachusetts Institute of Technology, Cambridge, Mass. After an extensive trial of one-chain structures both groups have decided in favor of a two-chain structure of the type briefly reported by Watson.4 In this structure the two helical backbones are parallel rather than antiparallel. The adenines are paired - that is, an adenine on one chain is joined by a pair of hydrogen bonds to its neighbor on the other chain; in addition, the NH2 of the adenine makes a further hydrogen bond to the phosphate of the opposite chain. To do this the bases must be tilted somewhat, and this increases the distance between the bases in the fiber direction from 3.4 to 3.8 A.

This structure, or simple variants of it, seems very reasonable for the fiber, but it is quite unclear whether it persists into solution. All that is known is that fibers that give good X-ray pictures are not regularly obtained (suggesting that the structure may be a precarious one and that the strong 3.8 A. spacing changes to 3.4 A. as the fiber goes into solution). Special physicochemical studies will therefore be necessary to decide whether poly A is single-stranded or double-stranded in dilute solution.
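The screw-axis entries of TABLE 1 (a translation and a rotation per residue) fix the number of residues per turn and the pitch of each helix. The following is a minimal sketch of that arithmetic, not part of the original analysis. The class and function names are hypothetical, and the 3.4 A. translation used for DNA "B" is an assumption taken from the base spacing quoted in the text, since the tabulated figure is not fully legible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScrewAxis:
    """Screw-axis parameters of a helix (hypothetical helper for illustration)."""
    name: str
    rise_angstrom: float  # translation per residue along the fiber axis
    rotation_deg: float   # rotation per residue about the fiber axis

    def residues_per_turn(self) -> float:
        # One full turn is 360 degrees of accumulated rotation.
        return 360.0 / self.rotation_deg

    def pitch_angstrom(self) -> float:
        # Axial distance covered in one full turn.
        return self.rise_angstrom * self.residues_per_turn()


HELICES = [
    ScrewAxis('DNA "A"', 2.55, 33.0),  # values as listed in TABLE 1
    ScrewAxis('DNA "B"', 3.4, 36.0),   # rise of 3.4 A. is an assumption (see lead-in)
    ScrewAxis("poly A", 3.85, 43.0),   # rotation listed as "about 43 degrees"
]

for helix in HELICES:
    print(
        f"{helix.name:8s} "
        f"{helix.residues_per_turn():5.1f} residues per turn, "
        f"pitch {helix.pitch_angstrom():5.1f} A."
    )
```

A rotation of 36° per residue gives exactly ten residues per turn, which is the tenfold screw axis of the B form; the other entries give non-integral values close to it.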
The fact that it has a large hypochromicity, even in very dilute solution, suggests that a regular structure of some sort exists under these conditions.

The most interesting recent work on the subject is undoubtedly that of Rich and Davies15 on poly A + poly U. Imagine that one has a quantity of poly A in one test tube and an equimolar quantity of poly U in another. What happens when one mixes them together? One immediately notices a considerable increase in viscosity. Warner16 had previously shown that there was a striking drop in the total U.V. absorption and that, under electrophoresis, the mixture moved as a single component. All this suggests that a new structure has been formed. Good fibers, which are easily drawn, give good X-ray fiber diagrams. One glance - at least to an experienced eye - suggests that the structure present is similar to that of DNA. Rich and Davies have proposed a DNA-like structure, with base-pairing (of the DNA type) between the adenine and uracil of different chains. The structure is a little fatter than DNA. It is not yet clear whether the two chains are parallel or antiparallel.

One will perceive immediately that two important deductions can be drawn from this structure:

(1) Two chains can wind around each other in solution to form a double helix. It is hoped that the physicochemistry of this process will be studied actively in the near future.
(2) An RNA backbone can take up a DNA-like configuration.

I must briefly mention our ideas about RNA and the related poly AU and poly AUGC. It is clear that, for the material as we have it in the test tube, the structure is a rather disordered one. It seems to resemble poly A, except that the strong inner ring of the X-ray pattern may imply that some of the phosphates are at a larger radius. There is also a hint, from the absence of reflections around 3[?] A. spacing near the meridian, that the structure has a diad (or pseudodiad) parallel to the fiber axis, as poly A has.
Thus, an entirely reasonable structure would be one with two parallel chains, with occasional base-pairing, but otherwise rather irregular. Unfortunately the data are much too poor to make this more than an informed guess, and further studies on a variety of polynucleotides would be needed before one could repose much confidence in it. Such a structure would not appear to make sense biologically in any obvious way, and it may well be that the configuration of RNA, as we have it, bears little relation to its configuration inside the cell.

I have not included the DNA nucleoproteins in TABLE 1 because they are somewhat irrelevant, and also because Wilkins, elsewhere in this publication, describes recent developments regarding them.

Possible Generalizations

TABLE 1 shows, first, that thus far no regular structure has been discovered consisting of a single chain. Poly U appears amorphous; the hypochromicity is very small and it is probably a rather random coil. Poly I has a structure, but it appears to be about 30 A. across, and thus probably involves more than one chain in the structural unit. About poly C we can say only that some structure is present. In all the other cases it seems probable, with varying degrees of certainty, that the structural unit consists of two chains, although the case of poly A in solution is problematical.

Watson, Rich, and I spent much time and effort in the attempt to fit the poly A data to a single-chain model, and this attempt has also been made by Morgan and Bear (personal communication). We now believe that the existence of such a model is unlikely, both on stereochemical grounds and also because its existence does not conform with the available X-ray data.

It seems not unreasonable to reach the tentative conclusion that there is no natural repeating configuration for a single nucleic acid chain.
Nucleic acid, in fact, seems capable of behaving in a regular manner only in the married state. Whether a single chain of nucleic acid can take up a regular configuration if combined with protein is another matter. This may well occur inside tobacco mosaic virus, for example.

Whereas the existence of a regular structure with one chain seems unlikely, it would be rash to conclude that there will always be two chains in the structural unit. There is probably room for a third, or even, under some circumstances, for a fourth. It will be interesting to see whether such structures can be found in nature, or produced from the RNA-like polynucleotides.

Another type of structure that has not been found thus far in these materials is a coiled-coil structure of the general type proposed for collagen and α-keratin. There is some early evidence,17 however, for a micelle structure for DNA gels. It is tempting to speculate that this is based on the fivefold screw axis, which is a subsymmetry element of the tenfold screw axis of the DNA double-helix in the B form.

One might think that my second point would be that all the structures are helical, but this is not really surprising since there is practically no other way to make a regular arrangement of an asymmetric, repeating chain. My point is, rather, that all the helices are somewhat similar. That is, the parameters of the screw axes are astonishingly alike (see TABLE 1) and, moreover, that they are all right-handed (I note, in passing, that the stable configuration for the α-helix has also been shown recently to be right-handed).18,19,20 It is not unreasonable to ascribe this mainly to two causes: the tendency of the bases to stack above one another, and the tendency of the backbone to be rather fully extended, probably because of the mutual repulsion of the charged phosphate groups.
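The statement that a fivefold screw axis is a subsymmetry element of the tenfold screw axis of the B form can be made explicit with a short worked step. The notation $S(\varphi, h)$ is introduced here purely for illustration: it denotes a rotation through $\varphi$ about the fiber axis combined with a translation $h$ along it. Applying such an operation $k$ times gives

\[ S(\varphi, h)^{k} = S(k\varphi,\; kh). \]

For a tenfold screw axis $\varphi = 360^\circ / 10 = 36^\circ$, so that

\[ S(36^\circ, h)^{2} = S(72^\circ,\; 2h), \qquad 72^\circ = \frac{360^\circ}{5}, \]

which is a fivefold screw operation with twice the translation. Every structure with the tenfold screw axis therefore also possesses this fivefold one, while the converse need not hold.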
It is too early to do more than mention other possible reasons: the glycosidic bond may have to be approximately radial, for example; or there may be a preferred staggering of the atoms of the phosphate-sugar backbone. In any case it certainly would help us considerably in our attack on such difficult problems as the RNA configuration in small viruses and microsomal particles if we were able to establish the fact that the nucleic acid backbone is likely to vary its configuration only within certain limits.

My next point is that thus far we know of no case in which two backbones lie very close together. Here again it seems unlikely that this would happen unless special arrangements were made to reduce the repulsion between the phosphates in the backbone. It is one of the oldest (unpublished) speculations in this field that this might be done with divalent cations; some studies on materials in which these have replaced sodium might be very valuable. I should not be surprised if a DNA configuration were found in which the two backbones were close together because of the divalent ion, and in which the bases, as a consequence, became unpaired. I shall be interested to learn if Wilkins has any information as to whether the bases are actually paired in nucleohistone. The material recently described by Doty21 might be an interesting one on which this could be tested.

Finally, a word must be said about base-pairing. Donohue22 has recently enumerated all the possible ways of pairing the standard bases, assuming that the most likely tautomeric forms exist and that at least two hydrogen bonds are made. It is implied, but not clearly stated, in his paper that there is only one satisfactory way of base-pairing for DNA that will fit the X-ray data if all four bases must occur on both chains.
This is the way proposed23 by Watson and myself and recently refined by Pauling and Corey.24 However, as the latter have pointed out, different arrangements are possible in other materials. In particular the pairing proposed for poly A is that actually found by Broomhead in crystals of adenine hydrochloride. Recently it has been suggested (Bernhard, unpublished, and Geiduschek, unpublished) that mutually induced ionization leading to zwitterion formation may play an important part in the pairing of bases, but this idea must at the moment be regarded as speculative. Clearly, it will be necessary to carry out some fundamental work on the pairing of bases, either by studying it in mixed crystals of the monomers (if such crystals can be found) or between nucleotides or polynucleotides in solution. In passing it should be noted that the pairing of bases proposed by Donohue and Stent26 in their "mating helix" for RNA involves a pairing not included in Donohue's paper, since it requires a tautomeric shift of one of the hydrogen atoms of the guanines. It is also very unclear whether the bases can be turned over, through 180° about the glycosidic bond, from their most likely positions (those found in DNA).

The structure of the nucleic acids is of such fundamental importance that more work on the basic physical chemistry of its components is certain to return valuable dividends.

REFERENCES

1. WILKINS, M. H. F. 1956. Biochem. Soc. Symposia. Cambridge, Eng. In press.
2. LANGRIDGE, R., W. E. SEEDS, H. R. WILSON, C. W. HOOPER, M. H. F. WILKINS & L. D. HAMILTON. 1957. In press.
3. CRICK, F. H. C. 1957. In The Chemical Basis of Heredity. W. D. McElroy and B. Glass, Eds. Johns Hopkins Press. Baltimore, Md.
4. WATSON, J. D. 1957. In The Chemical Basis of Heredity. W. D. McElroy and B. Glass, Eds. Johns Hopkins Press. Baltimore, Md.
5. RICH, A. 1957. In The Chemical Basis of Heredity. W. D. McElroy and B. Glass, Eds. Johns Hopkins Press. Baltimore, Md.
6. SUTHERLAND, G. B. B. M. & M. TSUBOI. 1957. Proc. Roy. Soc. London. In press.
7. JORDAN, D. O. 1955. In The Nucleic Acids. 1:447. E. Chargaff and J. N. Davidson, Eds. Academic Press. New York, N. Y.
8. WILKINS, M. H. F. & J. T. RANDALL. 1953. Biochim. et Biophys. Acta. 10:192.
9. WILKINS, M. H. F., A. R. STOKES & H. R. WILSON. 1953. Nature. 171:738.
10. THE NUCLEIC ACIDS. 1955. 1. E. Chargaff and J. N. Davidson, Eds. Academic Press. New York, N. Y.
11. CHARGAFF, E. 1955. In The Nucleic Acids. 1:307. E. Chargaff and J. N. Davidson, Eds. Academic Press. New York, N. Y.
12. ELSON, D. & E. CHARGAFF. 1955. Biochim. et Biophys. Acta. 17:367.
13. GRUNBERG-MANAGO, M., P. J. ORTIZ & S. OCHOA. 1956. Biochim. et Biophys. Acta. 20:269.
14. HEPPEL, L. A., P. J. ORTIZ & S. OCHOA. In preparation.
15. RICH, A. & D. R. DAVIES. 1956. J. Am. Chem. Soc. 78:3548.
16. WARNER, R. C. 1956. Federation Proc. 15:379.
17. RILEY, D. P. & G. OSTER. 1951. Biochim. et Biophys. Acta. 7:526.
18. YANG, J. T. & P. DOTY. 1957. J. Am. Chem. Soc. In press.
19. ELLIOTT, A. & B. R. MALCOLM. 1956. Nature. 178:912.
20. ELLIOTT, A., W. E. HANBY & B. R. MALCOLM. 1956. Nature. 178:1170.
21. DOTY, P. & G. ZUBAY. 1957. In press.
22. DONOHUE, J. 1956. Proc. Natl. Acad. Sci. 42:60.
23. CRICK, F. H. C. & J. D. WATSON. 1954. Proc. Roy. Soc. London. A223:80.
24. PAULING, L. & R. B. COREY. 1956. Arch. Biochem. Biophys. 65:164.
25. BROOMHEAD, J. M. 1948. Acta Cryst. 1:324.
26. DONOHUE, J. & G. S. STENT. 1956. Proc. Natl. Acad. Sci. 42:734.