Chapter 7

SPECIES VARIATION IN PROTEIN STRUCTURE

One of the major questions to be answered in arriving at a clear understanding of the phylogenetic relationships between different forms of life is whether there exist identical, or closely homologous, genes in widely separated species, or whether similarities in phenotype are due to analogous genes which determine equivalent appearance or function by different pathways. The techniques of experimental genetics permit us to compare the genetic makeup of only those organisms that can be successfully crossed. We know, for example, that the eye pigments of a wide variety of species contain the same light-sensitive compound. However, we have no genetic way of testing whether the synthesis of this compound in different organisms is under the control of the same set of genes, structurally modified perhaps in some slight manner but still essentially identical, or whether completely different genes are involved which act in concert to achieve the same end result.

All genetic analyses depend on the availability of some recognizable phenotypic character, be it morphological, functional, or metabolic. When the character being used as the criterion for the presence or absence of a functional gene or set of genes is a gross morphological one, we cannot attempt to distinguish homology of genes from analogy of genes. This is true even in those instances in which we can demonstrate the presence, in widely separated species, of identical chemical structures such as the creatine phosphate in the tissues of all vertebrates. The production of such a substance could be carried out by analogous, rather than homologous, enzyme systems in the different species, and the genes that exert the basic control on the synthetic process might conceivably be quite different in terms of chemical organization.
There does appear to be one approach, however, which might give definitive information about the persistence of particular genes throughout the phyla. The techniques of isolation and structural analysis now available enable the protein chemist to make exact comparisons of proteins isolated from a wide variety of biological sources. If we accept the hypothesis that the proteins represent a primary, if perhaps a fuzzy, "print" of genetic information, we may then conclude that two organisms have the same gene, or gene set, when they both contain the same protein molecule. (Many readers will not be willing to swallow, whole, the thesis that proteins represent the direct translation of genetic information. We shall consider some of the arguments, pro and con, in Chapter 8.)

It is clear that we must expect to find differences between the "same" protein in various species. This is particularly true since, as we have discussed in the preceding chapter, certain parts of biologically active protein molecules are relatively more "dispensable" than others from the standpoint of function. Mutations that lead to changes in the sequence of amino acids in the last three C-terminal amino acids of ribonuclease, for example, might cause little change in the life expectancy or fertility of the affected animal. On the other hand, a mutation that led to a critical modification in the sequence of the "active center" of the enzyme might well be lethal, and the gene, so mutated, would not be perpetuated.

A comparison of the structures of homologous proteins (i.e., proteins with the same kinds of biological activity or function) from different species is important, therefore, for two reasons. First, the similarities found give a measure of the minimum structure which is essential for biological function.
Second, the differences found may give us important clues to the rate at which successful mutations have occurred throughout evolutionary time and may also serve as an additional basis for establishing phylogenetic relationships.

The proteins and polypeptides for which complete comparisons of covalent structure can be made at the present time are relatively few in number. We can list, in this category, insulin, adrenocorticotropin, melanotropin, vasopressin, oxytocin, and hypertensin. The complete structure of glucagon (hypoglycemic factor) has been elucidated, but no comparisons of this hormone from different species have as yet appeared. Among the enzymes, only ribonuclease has been sufficiently studied to permit essentially complete comparison of species differences. The structure of a tetradecapeptide portion of cytochrome c, in the vicinity of the heme prosthetic group, has been examined for a fairly large number of organisms. Finally, there exists a large literature on the composition, end group, and end sequence analysis of various sets of homologous proteins, and on their physical, enzymatic, and immunologic properties, which allows us to make at least some educated guesses about similarities and differences.

Ribonuclease and the "Fingerprinting" Technique

The detection and study of chemical differences between homologous proteins is generally carried out by one or another variety of "fingerprinting" technique. In essence, this involves the use of reproducible physical methods for the separation of peptide fragments produced by digestion of the proteins with proteolytic enzymes.
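The logic of such a comparison can be sketched in a few lines of Python. The fragment below is only an illustration: the function name `trypsin_digest`, the simplified cleavage rule (a cut after every lysine or arginine), and the residue order of the six-residue segment are assumptions made for the example, not experimental data. The two segments differ by a single replacement of lysine by glutamic acid, the kind of change discussed later in this chapter for residue 37 of ribonuclease.

```python
def trypsin_digest(residues):
    """Split a residue list after every Lys or Arg (simplified trypsin rule)."""
    fragments, current = [], []
    for residue in residues:
        current.append(residue)
        if residue in ("Lys", "Arg"):
            fragments.append(tuple(current))
            current = []
    if current:  # keep any C-terminal remainder that ends without Lys/Arg
        fragments.append(tuple(current))
    return fragments


# Hypothetical segment; the residue order is assumed for illustration only.
reference = ["Asp", "Leu", "Thr", "Lys", "Asp", "Arg"]
variant = ["Asp", "Leu", "Thr", "Glu", "Asp", "Arg"]  # Lys replaced by Glu

reference_fragments = set(trypsin_digest(reference))
variant_fragments = set(trypsin_digest(variant))

# Fragments present in one digest but not the other are the "fingerprint" differences.
only_in_reference = reference_fragments - variant_fragments
only_in_variant = variant_fragments - reference_fragments

print("Only in reference digest:", sorted(only_in_reference))
print("Only in variant digest:  ", sorted(only_in_variant))
```

In this sketch the loss of the lysine removes one cleavage point, so two fragments of the reference digest are replaced by a single longer fragment in the variant digest.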
After establishing the distribution pattern of these fragments for the original "reference standard" protein, differences obtained on digests of the protein from other biological sources may then be easily detected, and the nature of the chemical modification may be determined by classical methods of amino acid and sequence analysis.

An example of a "fingerprint" comparison is presented in Figures 71 and 72. These figures show the patterns for beef and sheep ribonucleases. The differences are quite clear and are completely reproducible from digest to digest. In this instance, digestion was carried out first with trypsin and subsequently with chymotrypsin. The protein was oxidized with performic acid prior to digestion to avoid steric complications which might be introduced by the disulfide bridges.

THE MOLECULAR BASIS OF EVOLUTION

Figure 71. A "fingerprint" of a proteolytic enzyme digest of oxidized bovine pancreatic ribonuclease. An aliquot of the digest was applied in a small spot at the left of the figure. This material was then subjected to descending paper chromatography (using n-butanol-acetic acid-water; 4:1:5), as indicated by the arrow, and, after allowing the solvent to evaporate, the paper was moistened with buffer solution and subjected to high-voltage electrophoresis. The sheet of paper was then sprayed with ninhydrin solution to stain those areas containing peptides. These areas were cut out (from a lightly stained paper), and the peptides were eluted. The amino acid composition of each was determined, after hydrolysis with acid, by paper chromatography. The composition of the various peptide components is given in Table 8. For further details consult an article by C. B. Anfinsen, S. E. G. Åqvist, Juanita P. Cooke, and Börje Jönsson, J. Biol. Chem., 234, No. 5 (1959).

Figure 72. A "fingerprint" of performic acid-oxidized ovine pancreatic ribonuclease. Chromatography was carried out using n-butanol-acetic acid-water; 4:1:5. The techniques employed were the same as those described in Figure 71, and the composition of the various peptide components is given in Table 8.

The sort of fingerprinting used here is rapid and technically simple and serves, adequately, for the preliminary detection of differences. Indeed, if we are happy with micro techniques of the sort that served Sanger and his colleagues so well in their studies of insulin structure, a complete comparison of the corresponding peptides shown in such a fingerprint can be made with relative assurance. The use of such methods always involves an uncertainty in regard to the minor ninhydrin-positive components that routinely plague the paper chromatographer, and a certain amount of personal judgment is frequently involved in deciding whether a trace peptide component is due to a fleck of dirt on the paper or to a bona fide structural fragment. For this reason the purist will often prefer the use of ion exchange columns over paper chromatography and electrophoresis, since he can then make more quantitative estimates of the recovery of fragments in relation to theoretical expectation. The latter course is obviously to be recommended in principle. However, when a large series of proteins are to be compared, and when experience has indicated the limits of error involved, the more rapid and flexible paper method will probably be employed to establish the major generalities of structure.

The study of species differences in the structure of enzymes is of special interest because with these proteins we can, in many instances, consider such variations in terms of the ability of the enzyme to catalyze a specific chemical reaction. When, for example, a successful mutation has occurred which leads to the substitution of a charged amino acid residue for an uncharged one in a sequence, this information permits us to make some helpful conclusions regarding the nature and location of the binding site for substrate molecules on the enzyme surface.

The amino acid analyses for bovine and sheep pancreatic ribonuclease are shown in Table 7. These data show that sheep ribonuclease contains less lysine, threonine (and perhaps less aspartic acid), and more serine and glutamic acid than does the beef enzyme. End group analysis indicates that both proteins contain N-terminal lysine, and sedimentation constants determined in the ultracentrifuge are essentially identical. When the peptides separated by the fingerprinting procedure illustrated in Figures 71 and 72 were eluted and analyzed, qualitatively, for amino acid composition, the results summarized in Table 8 were obtained. Examples of the chromatographic comparison of hydrolysates of a few sheep and beef peptides are shown in Figure 73. The peptides obtained from hydrolysates of the beef protein are those to be expected from the combined action of trypsin and chymotrypsin on oxidized ribonuclease, as may be deduced from the partial formula for this polypeptide shown in Figure 60. Corresponding peptides from the sheep material are also easily assignable to particular areas of the formula. Those sequences in sheep ribonuclease which differ from the beef structure may be allocated to specific areas of the chain on the basis of their compositions. At one point in the chain of the sheep enzyme, the absence of a trypsin-sensitive sequence involving lysine has precluded cleavage where the beef enzyme was split, and a single, longer peptide (sheep peptide 10, Figure 73e), embodying two of the beef fragments, resulted.

The studies relating various aspects of structure to function which were reviewed in the last chapter have suggested that the disulfide bridge joining half-cystines 1 and 6 may be reduced without complete destruction of catalytic activity. The species variation occurring at residue 37, where a positively charged amino acid, lysine, has been replaced by a negatively charged amino acid, glutamic acid, also suggests the unessentiality for substrate adsorption and hydrolysis of this part of the polypeptide chain.

TABLE 7
Amino Acid Analyses of Beef and Sheep Pancreatic Ribonuclease (a)

(Observed values are the average observed number of residues per mole; mol. wt. 13,683.)

Amino acid   "Theory"   Beef enzyme   Sheep enzyme   Estimated changes in sheep enzyme
Asp          15         15.65         15.20          -1 (?)
Thr          10          9.77          9.06          -1
Ser          15         13.95         16.15          +2
Glu          12         11.75         13.3           +1 to +2
Pro           4          3.95          3.96
Gly           3          3.03          3.14
Ala          12         11.87         11.1           -1
Cys           4          3.36          3.60
Val           9          8.48          8.87
Met           4          3.62          3.73
Ileu          3          1.83          1.76
Leu           2          2.05          1.99
Tyr           6          5.7           5.71
Phe           3          3.27          3.41
Lys          10         11.25          9.88          -1
His           4          3.51          3.83
Arg           4          3.76          3.78

(a) From unpublished experiments of S. Åqvist, C. B. Anfinsen, J. Cooke, and B. Jönsson.

TABLE 8
Analyses of Peptides from Fingerprints of Trypsin and Chymotrypsin Digestions of Oxidized Pancreatic Beef and Sheep Ribonuclease (a)

Beef No.   Composition                                    Sheep No.
3b         Lys2,Glu,Thr,Ala3 (beef)
           Lys,Glu,Ser,Ala (sheep)                        3a and 3b
6 (b)      Phe,Glu,Arg                                    6 (b)
- (c)      Asp,Glu,Thr,Ser3,Met,His                       - (c)
- (c)      Asp,Ala2,Ser3,Tyr                              - (c)
16         Cys,Asp,Glu,Met2,Lys                           16
1          Ser,Arg                                        1
2          Asp,Leu,Thr,Lys                                -
13         Asp,Arg                                        -
-          Asp,Leu,Thr,Glu,Arg (sheep)                    10
9          Cys,Asp,Val,Pro,Thr,Lys,Phe                    9
18         Glu3,Val3,Leu,Ser2,His,Asp,Ala2,Cys            18
14         Asp,Cys,Ala,Val,Lys                            14
20         Cys,Asp2,Glu,Gly,Thr,Tyr                       20
7          Glu,Ser,Tyr                                    7
12         Thr,Ser,Met                                    12
21         Cys,Asp,Ileu,Thr,Ser,Arg                       21
15         Glu,Gly,Ser2,Thr,Lys                           15
8 (b)      Cys,Asp,Ala,Tyr2,Pro,Lys                       8 (b)
(?)        Asp,Glu,Ala,Thr2,Lys                           -
-          Lys,Glu,Ala,Thr                                23
(?)        His,Val,Ileu2,Cys,Asp,Glu,Gly,Ala,Pro,Tyr      (?)
4          Val2,Pro,His,Phe                               4
22         Asp,Ala,Ser,Val                                22

From carbobenzoxy-oxidized sheep ribonuclease:
           Lys,Ser,Glu,Ala,Phe,Arg                        S3
           Cys,Lys,His,Asp,Glu,Ser,Met,Thr,Ala,Tyr,Arg    S1

(a) Amide nitrogens cannot be assigned to specific glutamic or aspartic acid residues since they are split off by the acid hydrolysis prior to chromatography.

(b) According to earlier observations (see Figure 60, Chapter 5), cleavages should have occurred between phenylalanine and glutamic acid in peptide 6 and in such a way as to remove the lysine residue from peptide 8. However, only traces of free phenylalanine and lysine were detected on the fingerprint patterns.

(c) These two peptides were not detected by the ninhydrin-staining reaction but have been accounted for in peptide S1, which was prepared by trypsin digestion of the carbobenzoxylated polypeptide chain (see Chapter 5), rather than by combined trypsin-chymotrypsin digestion of oxidized ribonuclease.

The spots labeled 5 and 17 in the fingerprint of digests of beef ribonuclease were present in such small quantities that their amino acid compositions could not be determined with any assurance. The same was true for the component labeled "X" in the beef ribonuclease fingerprint. The amino acids on the paper chromatograms (Figure 73) which gave an unusually strong ninhydrin reaction are italicized. The subscripts, when present, indicate the number of moles of each amino acid in the peptide under consideration, as determined for beef ribonuclease (Figure 60, Chapter 5). Thus for beef peptide 3b, the earlier quantitative analyses, which demonstrated the presence of two residues of lysine and three of alanine for each residue of glutamic acid and threonine, were confirmed by the staining reaction which was correspondingly stronger for the two former amino acids.

Figure 73. Two-dimensional chromatography of the amino acids produced by acid hydrolysis of some of the peptide components shown in Figures 71 and 72. (a) Peptide 22 from beef ribonuclease.
This is the C-terminal tetrapeptide sequence of the enzyme and has the structure Asp.Ala.Ser.Val. (b) The C-terminal tetrapeptide of sheep ribonuclease, having the same composition as that obtained from the beef enzyme. (c) Peptide component 18 from bovine pancreatic ribonuclease. This peptide is derived from residues 47 through 61 in the polypeptide chain. (d) The amino acid composition of peptide 18 from the ovine enzyme. Peptide 18 appears to be identical from both species. (e) Peptide component 10 from sheep ribonuclease. This peptide is derived from the amino acids in positions 34 through 39. The lysine residue, present in the structure of bovine ribonuclease within this portion of the sequence, has been replaced by glutamic acid in the sheep enzyme. The cleavage with trypsin which occurs at residue 37 of the beef enzyme can thus not occur for the sheep protein, and a single hexapeptide sequence is obtained instead of a tetrapeptide and dipeptide. (f) Peptide component 3b from sheep ribonuclease. The peptide represents the N-terminal heptapeptide sequence of the enzyme. (g) Component 3b from the beef fingerprint, the N-terminal heptapeptide sequence. The beef enzyme differs, in this region, from the sheep enzyme by the replacement of serine by threonine. See Figure 62 for details of structure.

Adrenocorticotropin (ACTH)

The complete amino acid sequences are known for corticotropins isolated from the anterior pituitary glands of three different species, pig, beef, and sheep. The structure of sheep ACTH was discussed in the last chapter, and the sequences shown in Table 9 include only those areas of the three molecules where differences are to be found.
Although some differences in the content of amide nitrogen groups have been reported for the three species, these are not included in the table since it has not been possible to rule out, with certainty, the possibility that these variations are due, in part, to the rigors of the isolation and purification techniques employed.

TABLE 9
Variations in Amino Acid Sequences Among Different Preparations of ACTH

Preparation       Species        Residues No. 25 through 33
β-Corticotropin   sheep, beef*   Ala.Gly.Glu.Asp.Asp.Glu.Ala.Ser.Glu(NH2)
Corticotropin A   pig            Asp.Gly.Ala.Glu.Asp.Glu.Leu.Ala.Glu

* Identity with sheep hormone not absolutely certain but very probable as judged from the nearly complete sequence analysis by J. S. Dixon and C. H. Li (personal communication to the author).

Two points are of particular interest in regard to the sequences shown. First, the corticotropins of sheep and beef are identical and differ from that of the pig. This finding is consonant with the closer phylogenetic relationship of sheep and cows to each other than of either to pigs. Second, chemical differences are found only in that portion of the ACTH molecule which has been shown to be unessential for hormonal activity. Genetic mutations leading to such differences might, therefore, not be expected to impose significant disadvantages in terms of survival, and these genes could become established in the gene pools of the species.

Melanotropin (MSH)

Melanotropin, like the other hormones considered in this chapter, is a typically chordate polypeptide. Indeed, the demonstration of melanocyte-stimulating activity in extracts of tunicates constitutes an important bit of evidence supporting the assignment of these organisms to the main thoroughfare of evolution between invertebrates and chordates.
Melanotropins have been isolated in pure form from both pig and beef pituitaries (posterior-intermediate lobes). Only a single polypeptide having MSH activity has been isolated from beef tissues whereas two different chemical entities termed α- and β-MSH have been isolated and characterized from hog pituitaries. The structures of these substances are given in Figure 74 together with that for porcine ACTH. We shall consider further the provocative similarity between the structures of MSH and ACTH in Chapter 10 in relation to protein biosynthesis. This similarity is undoubtedly responsible for the fact that adrenocorticotropic hormone exhibits marked melanocyte-stimulating activity.

Beef β-MSH differs from porcine β-MSH only in the replacement of the glutamic acid residue in position 2 by serine. Porcine α-MSH, however, is considerably different from both the other hormones and is actually identical with the sequence of the first thirteen amino acids in pig ACTH, except for the presence of a masking acyl group on the N-terminal amino group and an amide nitrogen group at the C-terminus. Lee and Lerner,¹ who isolated α-MSH, have suggested that this form of the hormone is the major one in pituitary extracts since it accounts, in their experiments, for the largest share of activity, although other investigators have not so far confirmed the presence of α-MSH.* Bovine β-MSH possesses considerably less biological activity than does porcine β-MSH, and, until comparisons can be made of synthetic samples of these two materials, this difference in potency must be ascribed to the amino acid substitution at position 2.

* C. H. Li and his colleagues have also recently isolated α-MSH from both porcine and bovine pituitary glands. The structures were found to be the same in both species. (Personal communication.) The sequence of α-MSH has been confirmed by total synthesis in the laboratories of Klaus Hofmann and of R. Boissonnas. The activity of the synthetic material is critically dependent on the presence of the acetyl group on the N-terminal serine residue. Thus, in the experiments of Boissonnas and his co-workers, the activity of the acetylated polypeptide was approximately 70 times greater than before acetylation. (Personal communications from Hofmann and Boissonnas.)

Insulin

Sanger and his colleagues have determined the amino acid sequences for insulins derived from five different species. Differences were limited to the amino acids within the disulfide "loop" of the A chain (Figure 75) and the B chain was identical in all instances (see Figure 65).

. . . CySO3H.Ala.Ser.Val . . .    (beef)
. . . CySO3H.Thr.Ser.Ileu . . .   (pig)
. . . CySO3H.Ala.Gly.Val . . .    (sheep)
. . . CySO3H.Thr.Gly.Ileu . . .   (horse)
. . . CySO3H.Thr.Ser.Ileu . . .   (sperm whale)
. . . CySO3H.Ala.Ser.Thr . . .    (sei whale)

Figure 75. Species differences in the amino acid sequences of insulins from various biological sources. These differences all occur within the disulfide "loop" of the A chain.

Of the five insulins examined, only those from the pig and the sperm whale exhibited the same structure. The fact that all the observed differences were restricted to the sequence within the "loop" suggests that the amino acids in this region of the insulin molecule are not particularly critical ones from the standpoint of hormonal activity. On the other hand, several investigators have obtained evidence indicating that insulin loses its biological activity when disulfide bridges are reductively cleaved. The species difference results with insulin suggest that only the steric configuration of the loop is essential and that the "spacers" between the half-cystine residues may be varied through mutation of the corresponding gene or genes.
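The pattern in Figure 75 can be tabulated mechanically. The following Python sketch, offered only as an illustration, counts the positions at which the three residues following the half-cystine differ between each pair of the sequences listed in the figure. The dictionary name `loop` and the function name `differences` are invented for the example.

```python
from itertools import combinations

# The three residues that follow CySO3H in each sequence of Figure 75.
loop = {
    "beef": ("Ala", "Ser", "Val"),
    "pig": ("Thr", "Ser", "Ileu"),
    "sheep": ("Ala", "Gly", "Val"),
    "horse": ("Thr", "Gly", "Ileu"),
    "sperm whale": ("Thr", "Ser", "Ileu"),
    "sei whale": ("Ala", "Ser", "Thr"),
}


def differences(first, second):
    """Number of positions at which two equal-length sequences differ."""
    return sum(a != b for a, b in zip(first, second))


# Pairwise difference counts for every pair of species.
pairs = {
    (s1, s2): differences(loop[s1], loop[s2])
    for s1, s2 in combinations(loop, 2)
}

identical = [pair for pair, count in pairs.items() if count == 0]

for (s1, s2), count in sorted(pairs.items(), key=lambda item: item[1]):
    print(f"{s1:12s} {s2:12s} {count}")
print("Identical pairs:", identical)
```

Under these inputs the only pair with no differences is pig and sperm whale, in agreement with the statement in the text.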
It is of interest that sequence variations have not been observed in the C-terminal region of the B chain, an area which does appear to be essential for activity as shown by the inactivation of insulin following the removal of the last seven residues in the chain.

Hypertensins

The hypertensins are peptides present in serum which possess pressor activity. Two forms have been isolated from horse serum,² the first being convertible to the second by the action of an enzyme in plasma according to the equation:

Asp.Arg.Val.Tyr.Ileu.His.Pro.Phe.His.Leu → Asp.Arg.Val.Tyr.Ileu.His.Pro.Phe + His.Leu

The precursor compound is not active in an in vitro test system, but after cleavage of the critical peptide bond it becomes active both in vivo and in vitro. The precursor form has also been isolated from bovine serum³ and is identical with that from horse serum except for the substitution of isoleucine by valine. These two amino acids are extremely similar in structure and the substitution in this case represents one of the more minimal changes possible given the available selection of naturally occurring amino acids.

Cytochrome c

The electron-transporting enzyme, cytochrome c, furnishes one of the most interesting examples of species variations in protein structure, since it has been isolated in pure form from a particularly wide assortment of species. Unfortunately, studies on variations in sequence have been carried out for only a relatively small portion of the total chain, and preliminary amino acid analyses indicate that there may be modifications elsewhere in the molecule as well. Nevertheless, the investigations of H. Tuppy, S. Paléus, and G. Bodo,⁴ on this ubiquitously distributed enzyme, lend the most convincing support to the argument that certain units of the universal gene pool may be extremely ancient.

[Structural formula: the heme group attached to the peptide . . . Ala.Glu.Cys.His.Thr.Val.Glu.Lys . . ., shown in two alternative forms.]

Figure 76. Structure of the peptide-porphyrin compound isolated from trypsin digests of cytochrome c. After H. Tuppy and G. Bodo, Monatsh. Chem., 85, 1024 (1954).

In the degradative studies of the enzyme, advantage was taken of the finding of H. Theorell⁵ that the heme prosthetic group of cytochrome c is attached through stable thioether linkages to the protein moiety. After proteolytic degradation with trypsin (and in later studies with pepsin), that portion of the polypeptide chain which is attached to the heme nucleus was isolated. The structure of the heme-peptide compound as determined for cytochrome c from horse heart tissue is shown (in two alternatively possible forms) in Figure 76. In subsequent investigations corresponding sequences from cytochrome c obtained from a variety of other species have been elucidated as shown in Figure 77.

Beef, horse, pig       . . . Val.Glu(NH2).Lys.Cys.Ala.Glu(NH2).Cys.His.Thr.Val.Glu.Lys . . .
Salmon                 . . . Val.Glu(NH2).Lys.Cys.Ala.Glu(NH2).Cys.His.Thr.Val.Glu . . .
Chicken                . . . Val.Glu(NH2).Lys.Cys.Ser.Glu(NH2).Cys.His.Thr.Val.Glu . . .
Silkworm               . . . Val.Glu(NH2).Arg.Cys.Ala.Glu(NH2).Cys.His.Thr.Val.Glu . . .
Yeast                  . . . Phe.Lys.Thr.Arg.Cys.Glu.Leu.Cys.His.Thr.Val.Glu . . .
Rhodospirillum rubrum  . . . (Lys or Arg).Cys.Leu.Ala.Cys.His.Thr.Phe.Asp.Glu(NH2).Gly.Ala.Asp.Lys . . .

Common sequence:       (Lys or Arg).Cys.X.Y.Cys.His.Thr.

Figure 77. Variations in the sequence of the polypeptide chain of cytochrome c from species to species. From H. Tuppy, Symposium on Protein Structure (A. Neuberger, editor), John Wiley & Sons, 1958.

Somatotropins (Growth Hormones) and Prolactin

Pure growth hormone has been isolated from the pituitaries of the species listed in Table 10 by C. H. Li and his colleagues.
These proteins have been subjected to both physical and chemical study, and, although even partially complete sequences are not yet available, a great deal can already be said about species variability. Molecular weights vary over nearly a twofold range, and the differences in the number of chains and the cystine content are striking. The beef and sheep hormones are, as in the case of several other proteins we have discussed earlier, quite similar, reflecting once again the close phylogenetic relationship between these two species.

The prolactins of sheep and beef are also extremely similar (Table 11), the only difference between them so far observed being a slightly greater tyrosine content in the beef hormone. The absence of a chemically detectable C-terminal amino acid residue is another example of "masked" end groups. The nature of the masking is unknown.

Hemoglobin

The hemoglobins have been studied, from the phylogenetic point of view, perhaps more than any other class of proteins. Most of the available information on the hemoglobins has to do with the chemical and spectrophotometric characteristics of the various prosthetic groups and with oxygen and carbon dioxide-combining properties. Consequently, this information is not directly pertinent to our present discussion of species differences in protein structure. Recently, however, following the important initial studies of R. Porter and F. Sanger,⁶ a number of investigators have begun to examine amino acid sequences in the hemoglobins.

TABLE 10
N- and C-Terminal Sequences of Somatotropins from Various Species (a)

Somatotropin   Amino end                            Carboxyl end
Bovine         Phe.Ala . . . ; Ala.Phe.Ala . . .    . . . Ala.Phe.Phe
Ovine          Phe . . . ; Ala . . .                . . . Try.Ala.Phe
Whale          Phe.AspNH2.Lys . . .                 . . . Leu.Ala.Phe
Monkey         Phe.Ala.Thr . . .                    . . . Ala.Gly.Phe
Human          Phe.Ser.Thr . . .                    . . . Tyr.Leu.Phe

Some Physicochemical Properties of Various Somatotropins

Physicochemical characteristics (b)   Bovine    Ovine     Whale    Monkey   Human
s20,w                                 3.19      2.76      2.84     1.88     2.47
D20,w x 10^7                          7.23      5.25      6.56     7.20     8.88
v20                                   0.76      0.733     0.737    0.796    0.739
Molecular weight                      45,000    47,800    39,900   45,400   27,100
f/f0                                  1.31      1.68      1.45     1.57     1.23
pI                                    6.85      6.8       5.9 (?)  5.5      4.9
Cystine                               4         5         3        4        4
N-Terminal residue(s)                 Phe,Ala   Phe,Ala   Phe      Phe      Phe
C-Terminal residue                    Phe       Phe       Phe      Phe      Phe

(a) From C. H. Li, Symposium on Protein Structure (A. Neuberger, editor), John Wiley & Sons, 1958.

(b) s20,w in Svedbergs; D20,w in cm.²/sec.; v in cc./gram; f/f0, dissymmetry constant; pI, isoelectric point; cystine in residues per mole. For details on the determination and significance of the physical constants consult the volumes entitled The Proteins (H. Neurath and K. Bailey, editors), Academic Press, 1953, 1954.

TABLE 11
Some Physical and Chemical Properties of Prolactin from Ovine and Bovine Pituitary Glands (a)

Physical and chemical properties                               Ovine          Bovine
Molecular weight
  Sedimentation-diffusion                                      24,200
  Osmotic pressure                                             26,500         46,000
  Analytical data                                              24,100
Diffusion coefficient (D20)                                    8.44 x 10^-7
Sedimentation constant (s20,w)                                 2.19
Partial specific volume (V20)                                  0.739
Isoelectric point, pH                                          5.73           5.73
Specific rotation                                              -40.5°         -40.(?)°
Partition coefficient (2-butanol/aqueous trichloroacetic acid) 1.58           2.07
Tyrosine, %                                                    5.26           6.62
Tryptophan, %                                                  1.69           1.75
Cystine, residue/mole                                          3              3
N-terminal amino acid                                          Threonine      Threonine
C-terminal amino acid                                          none           none

(a) From C. H. Li, Symposium on Protein Structure (A. Neuberger, editor), John Wiley & Sons, 1958.
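The molecular weights in Table 10 can be checked against the sedimentation and diffusion constants listed with them by means of the Svedberg relation, M = RTs / [D(1 - vρ)], where v is the partial specific volume and ρ the density of the solvent. The short Python sketch below is an illustrative calculation only; the temperature (20°C), the solvent density (0.9982 g/cc, that of water), and the function name are assumptions made for the example, not details taken from the original measurements.

```python
GAS_CONSTANT = 8.314e7  # erg per mole per degree (c.g.s. units)


def svedberg_molecular_weight(s_svedbergs, d_cm2_per_sec, v_bar,
                              temp_kelvin=293.15, solvent_density=0.9982):
    """Molecular weight from the Svedberg relation M = RTs / [D(1 - v*rho)].

    The default temperature and solvent density are assumed values
    (water at 20 degrees C), not figures taken from the tables.
    """
    s = s_svedbergs * 1e-13  # one Svedberg unit is 1e-13 sec
    buoyancy = 1.0 - v_bar * solvent_density
    return GAS_CONSTANT * temp_kelvin * s / (d_cm2_per_sec * buoyancy)


# s (Svedbergs), D (cm^2/sec), and partial specific volume from Table 10.
somatotropins = {
    "bovine": (3.19, 7.23e-7, 0.76),
    "ovine": (2.76, 5.25e-7, 0.733),
    "whale": (2.84, 6.56e-7, 0.737),
}

computed = {
    species: svedberg_molecular_weight(s, d, v)
    for species, (s, d, v) in somatotropins.items()
}

for species, weight in computed.items():
    print(f"{species:8s} {weight:10.0f}")
```

With these assumptions the three computed values fall within about one percent of the molecular weights tabulated for the bovine, ovine, and whale hormones, which suggests that the tabulated figures were obtained from the sedimentation-diffusion data in this way.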
Such studies are becoming increasingly meaningful as the result of physicochemical investigations on the number of peptide chains per molecule and on the size of the monomer subunit. It now appears that, with the possible exception of foetal hemoglobin and the hemoglobin of the chicken (perhaps birds in general), the vertebrate hemoglobins contain two types of chains. These are present, under physiological conditions, in the form of a molecule with a molecular weight of about 65,000, composed of four chains, two of each type, held together through noncovalent linkages. (The earlier results, which indicated the presence of six chains in the hemoglobin of the horse, are probably incorrect on the basis of recent electrophoretic and ultracentrifugal investigations.) Valine is the N-terminal amino acid residue on both polypeptide chains of the vertebrate hemoglobins so far examined, except for the goat, sheep, and cow, in which one of the chains begins with methionine. A summary of the available end group data is given in Table 12.

TABLE 12
End Group Data on Vertebrate Hemoglobins

Species (references)                      N-Terminal amino acids or sequences (a)
Human adult (1, 3)                        Val.Leu     Val.-
Human foetal (1)                          Val.-       Val.-
Dog (2)                                   Val.Leu     Val.Gly
Horse (1, 2, 4), pig (2)                  Val.Leu     Val.Glu.Leu
Cow (1, 2), goat (1, 2), sheep (1, 2)     Val.Leu     Met.Gly
Guinea pig (2)                            Val.Leu     Val.Ser
Rabbit (2), snake (2)                     Val.Leu     Val.Gly
Chicken (2)                               Val.Leu

(a) The presence of a third N-terminal sequence has been reported only by Ozawa and Satake (2). The third sequences listed in the table are (Val.Asp)2, (Val.Gly)2, and (Val.Asp)2.

1. R. Porter and F. Sanger, Biochem. J., 42, 287 (1948).
2. H. Ozawa and K. Satake, J. Biochem. (Japan), 42, 641 (1955).
3. M. S. Masri and K. Singer, Arch. Biochem. Biophys., 58, 414 (1955).
4. D. B. Smith, A. Haug, and S. Wilson, Federation Proceedings, 16, 766 (1957).

It is far too early to attempt to make any evolutionary sense out of this information.
However, it might be pointed out that the N-terminal sequence, Val.Leu, seems to be present throughout the species examined, including the representatives of the reptiles and the birds. We may speculate on the possibility that the "valyl-leucyl" chain represents a relatively early invention of evolution, and that the addition (and modification) of a second type of chain accompanied later differentiations in the vertebrate phylum. The elegant studies of Pauling, Itano, and their colleagues, and of Ingram on the normal and abnormal human hemoglobins, are discussed in the next chapter. The information gained from these studies should be of great value as a baseline for the investigation of hemoglobin structure in other species.

Species Comparisons of Serum Proteins

The chemist, interested in comparative protein chemistry, generally studies proteins that have unique and interesting biological activities. This is the reason why most of our knowledge of protein structure concerns enzymes, hormones, and pigment-associated proteins. The odds in favor of being chosen for study are, in a sense, fixed in favor of exactly those proteins that Nature might find it necessary to preserve in reasonably unmodified form during the evolutionary process. On the other hand, the "permissible" degree of change in proteins that require less rigid engineering, such as certain of the serum proteins, the less dynamic elements of tissues such as the collagens and elastins, and various proteins of the hair and skin, is likely to be quite large.

TABLE 13
Precipitin Tests with Antihuman Serum (a)
(Antihuman serum was prepared in rabbits by periodic injection with human serum)

Origin of Serum                  Amount of Precipitate Relative to Human
Primates
  Man                            100
  Chimpanzee                     130 (loose precipitum)
  Gorilla                        64
  Orang                          49
  Mandrill                       42
  Guinea baboon                  29
  Spider monkey                  29
Carnivores
  Dog                            3
  Jackal                         10 (loose precipitum)
  Himalayan bear                 8
  Genet                          3
  Cat                            3
  Persian lynx                   3
  Tiger                          2
Ungulates
  Ox                             10
  Sheep                          10
  Water buck                     7
  Hog deer                       7
  Reindeer                       7
  Goat                           0
  Horse                          2
  Swine                          0
Rodents
  Guinea pig                     0
  Rabbit                         0
Insectivores
  Tenrec                         0
Marsupials
  Six species: rock and nail-tailed
  wallabies, kangaroo, Tasmanian wolf   0

a After G. H. F. Nuttall, from Biochemical Evolution; G. Wald in Trends in Physiology and Biochemistry (E. S. G. Barron, editor), Academic Press, 1952.

One of the earliest studies of the comparative biochemistry of proteins was carried out by Nuttall and his collaborators. These investigators used immunological techniques to study the phylogenetic relationships between the serum proteins of a wide variety of species. They employed the extent of the precipitin reaction between antihuman serum and the serums of other species as a measure of similarity (Table 13). We know that the precipitin reaction is not an absolutely specific one and, therefore, that cross reactions which do occur do not require the presence of molecules identical to human serum protein molecules. The results suggest that the serum proteins of the species examined form a graded series of macromolecules in which only serums from phylogenetic "neighbors" can cross react significantly. Nuttall strengthened this conclusion by cross-reacting serums from more closely related animals. A dramatic example is his observation that antifrog serum reacts strongly with serums of other tail-less amphibia, but not at all with those of tailed amphibia.

In spite of the complete lack of immunochemical similarity between the serum albumins of distant species, the more obvious functional aspects of the protein may nevertheless be retained. For example, the serum albumins of rat and man carry out such physiological functions as fatty acid binding and transport and osmotic pressure regulation, in essentially the same manner and with equal facility.
We may guess, until experimentation has a chance to prove us wrong, that the modifications which led to immunological differences spared, or at least only slightly remodeled, the functionally critical parts of serum albumin structure.

REFERENCES

1. T. G. Lee and A. B. Lerner, J. Biol. Chem., 221, 943 (1956).
2. L. T. Skeggs, Jr., K. E. Lentz, J. R. Kahn, N. P. Shumway, and K. R. Woods, J. Exptl. Med., 104, 193 (1956).
3. D. F. Elliott and W. S. Peart, Nature, 177, 527 (1956).
4. These studies are summarized by H. Tuppy in Symposium on Protein Structure (A. Neuberger, editor), Methuen, London, 1958.
5. H. Theorell, Biochem. Z., 298, 242 (1938); Enzymologia, 6, 88 (1939).
6. R. R. Porter and F. Sanger, Biochem. J., 42, 287 (1948).