The robustness of biological load-bearing scaffolds is defined partly by the mechanical properties of the various components present in the scaffold and partly by how these components are joined to one another. The musculoskeletal system, for example, consists largely of striated muscle, tendon, bone, and cartilage that are interconnected through junctions such as the enthesis (1). Quite a lot is known about structure–property relationships within bone (2), muscle (3), tendon (4), and cartilage (5), but how these merge with one another especially at a molecular level is still largely unexplored. The characterization of entheses is crucial for understanding how loads are distributed in natural scaffolds and, in particular, for engineering superior multifunctional biomedical materials.
The byssus or holdfast of marine mussels offers a convenient model for studying some aspects of molecular joinery particularly with regard to the load bearing function. Each byssal thread contains no living cells, readily lends itself to mechanical characterization, and consists of relatively few matrix proteins that are quickly and reproducibly put together. The focus of this study is the critical junction between the distal portion of the thread and the adhesive plaque (Figure 1). From an architectural perspective, this is an intriguing junction. The thread consists of densely and anisotropically packed collagenous microfibrils, whereas the plaque is a solid foam with a porosity ranging from 1 to 5 μm in pore diameter (6-8). Microscopic inspection of sections of the thread-plaque junction has revealed that the collagenous microfibrils undergo extensive splaying where they meet the foam, not unlike a tree deeply rooted in the earth. Apart from the large amount of surface area created by the splaying, the molecular aspects of interaction have not been explored.
The plaques and threads produced by the California mussel (Mytilus californianus Conrad) are sufficiently large to enable a dissection and molecular characterization of junction-specific proteins. Previous research has determined that the distal thread of Mytilus species consists primarily of two collagens and small amounts of a thread matrix protein tmp-1 (9, 10). The collagens, known as preCOL-D and preCOL-NG, are peculiar in being kinked and having a linear block domain structure that includes silklike, Gly-rich, collagen, and His-rich domains (11, 12). The plaque is composed of three parts: the adhesive footprint, which contains mcfp-3, -5, and -6 (13, 14); the structural foam, which is dominated by mfp-2, a 45 kDa protein with 11 tandem repeats of an epidermal growth factor-like domain (15, 16); and a large transitional region in which the foam and collagen fibrils interpenetrate one another. The aim of this study was to explore the plaque-thread junction for proteins capable of mediating the contact between the fibrous preCOLs and foam proteins. mcfp-4 is a junction-specific protein with two different but well-defined metal-binding domains.
Once ~500 plaques or 2 g of wet weight was accumulated, the plaques were homogenized on ice with 20 mL of 5% acetic acid and 8 M urea using a small hand-held tissue grinder (Kontes, Vineland, NJ). The supernatant, which contained crude mcfp-4, was harvested by centrifugation for 40 min at 20000g and 4 °C using an SS-34 rotor. The crude extract of mcfp-4 from adhesive plaques was adjusted to pH 7.5 with 6 M NaOH. During the adjustment of the supernatant pH, 1 mg/mL Na2BO4 powder was included for protection of Dopa residues. Partial purification of mcfp-4 from plaques was accomplished by immobilized metal affinity chromatography (IMAC) using a 1 mL chelating column from Amersham-Pharmacia charged with Zn2+. Equilibration buffer contained 20 mM Tris, 10 mM Na2BO4, 0.5 M NaCl, and 8 M urea (pH 7.5), whereas for elution, the same buffer acidified to pH 3.0 was used. After the crude extract of mcfp-4 had been slowly loaded onto a pre-equilibrated IMAC chelating column, the column was flushed extensively with equilibration buffer and then eluted with 8 column volumes of the acidic elution buffer. Elution was monitored at 280 nm with a UV monitor, and 1 mL fractions were collected. Peak absorbing fractions were pooled, dialyzed against 5% acetic acid overnight, and lyophilized. The lyophilized, partially purified mcfp-4 was reconstituted in a small volume of 5% acetic acid. Purification of mcfp-4 was achieved by gel filtration chromatography using a Shodex-803 column (5 μm, 8 mm × 300 mm) which was equilibrated and eluted with 5% acetic acid in 0.2% TFA. The sample load was limited to 200 μL/run, and the eluted volume was monitored at 280 nm.
Dialysis resulted in visible turbidity, which was clarified by centrifugation at 20000g for 30 min at 4 °C. Redialysis of the supernatant against 0.1 M sodium borate at pH 8 precipitated most of the mcfp-4 and small amounts of fp-3 and fp-5, which were harvested by centrifugation for 30 min at 20000g and resuspended in a small volume of 5% acetic acid with 8 M urea. Residual particulate matter was removed by spinning the samples at 14,000 rpm for 20 min in an Eppendorf microfuge. Pure mcfp-4 was obtained by reverse phase HPLC (RP-300 Aquapore device with dimensions of 260 mm × 7 mm from Applied Biosciences Inc., Foster City, CA) using a linear gradient of aqueous acetonitrile for elution. Eluant was monitored continuously at 220 nm, and collected 1 mL fractions were assayed by amino acid analysis and electrophoresis following freeze-drying.
With purified mRNAs, a cDNA library was constructed by using the CloneMiner cDNA library construction kit from Invitrogen. This cDNA library served as a readily available source of cDNA. Alternatively, first-strand cDNA was synthesized from total RNA using Superscript II reverse transcriptase with an adapter primer, 5′-GGC CAC GCG TCG ACT AGT ACT (T)16-3′ (Invitrogen). The product of the reverse transcription (RT) reaction was used as an alternative template for PCR.
On the basis of the known tryptic peptide sequence of mcfp-4, HQVLHGHVHT, a degenerate forward primer, 5′-CAY CAR GTN TTR CAY GGN CAY GTN CAY AC-3′, was designed and coupled with an abridged universal amplification primer (antisense, 5′-GGC CAC GCG TCG ACT AGT AC-3′) from Invitrogen to amplify the carboxyl terminus of mcfp-4 from RT template using DyNAzyme EXT DNA polymerase from New England Biolabs for the long PCR product (catalog no. F-505S).
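The correspondence between the tryptic peptide and the degenerate forward primer can be checked residue by residue. The following is a minimal sketch, assuming the standard genetic code and IUPAC ambiguity codes; the function names (`codon_matches`, `check_primer`, `degeneracy`) are illustrative and are not part of the original protocol.

```python
# Standard genetic code for the residues that occur in HQVLHGHVHT.
CODONS = {
    "H": ["CAT", "CAC"],
    "Q": ["CAA", "CAG"],
    "V": ["GTT", "GTC", "GTA", "GTG"],
    "L": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "T": ["ACT", "ACC", "ACA", "ACG"],
}

# IUPAC ambiguity codes used in the primer, mapped to the bases they allow.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "N": "ACGT"}

PEPTIDE = "HQVLHGHVHT"
PRIMER = "CAY CAR GTN TTR CAY GGN CAY GTN CAY AC"


def codon_matches(degenerate, codon):
    """True if every primer symbol allows the corresponding codon base.

    zip() stops at the shorter string, so the truncated final triplet
    ("AC") is compared on its first two positions only.
    """
    return all(base in IUPAC[sym] for sym, base in zip(degenerate, codon))


def check_primer(peptide, primer):
    """Return (residue, primer triplet, codons covered, codons possible)."""
    primer = primer.replace(" ", "")
    report = []
    for i, aa in enumerate(peptide):
        triplet = primer[3 * i:3 * i + 3]
        covered = [c for c in CODONS[aa] if codon_matches(triplet, c)]
        report.append((aa, triplet, len(covered), len(CODONS[aa])))
    return report


def degeneracy(primer):
    """Number of distinct sequences represented by the degenerate primer."""
    total = 1
    for sym in primer.replace(" ", ""):
        total *= len(IUPAC[sym])
    return total


report = check_primer(PEPTIDE, PRIMER)
fold = degeneracy(PRIMER)
for row in report:
    print(row)
print("degeneracy:", fold)
```

Under these assumptions every residue is covered by its triplet; only leucine is partially covered, because TTR encodes two of the six leucine codons.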
After the carboxyl-terminal sequence of mcfp-4 had been cloned (~2 kb with a 3′-untranslated region and poly-A tail), a reverse primer (5′-G TTT ATG TAG AAC ACG ATG CAT G-3′), which was designed according to the known carboxyl-terminal sequence, and a degenerate forward primer (5′-CCN AGY GGN TAY GCN AAY ATH GGN CA-3′ based on the N-terminal sequence of mature mcfp-4 by Edman degradation) were used to amplify the amino terminus of mcfp-4 from the cDNA library. The complete sequence of mature mcfp-4 was then determined by overlapping the partial cDNA clones of amino- and carboxyl-terminal cDNA sequences.
PCRs were carried out in 25 μL of 1× buffer B (Fisher) containing 5 pmol of each primer, 5 μmol of each dNTP, 1 μL of the first-strand reaction mixture, 2 mM MgCl2, and 2.5 units of Taq DNA polymerase (Fisher) for 35 cycles on a Robocycler (Stratagene). Each cycle consisted of 30 s at 94 °C, 30 s at 52 °C, and 1.5 min at 72 °C, with a final extension of 15 min. The PCR products were subjected to 1% agarose gel electrophoresis, purified, cloned into a pCR-XL-TOPO vector using the TOPO XL PCR cloning kit from Invitrogen, and transformed into competent Top10 cells for amplification, purification, and sequencing.
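As a rough guide to the cycling conditions, the programmed hold time and the final primer concentration follow from the stated values. This is a minimal sketch; it ignores temperature ramp times and any initial denaturation step, neither of which is specified above, and the helper names are illustrative.

```python
CYCLES = 35
STEP_SECONDS = {"denature_94C": 30, "anneal_52C": 30, "extend_72C": 90}
FINAL_EXTENSION_SECONDS = 15 * 60


def program_minutes(cycles, step_seconds, final_extension_seconds):
    """Total programmed hold time in minutes (ramping not included)."""
    return (cycles * sum(step_seconds.values()) + final_extension_seconds) / 60


def micromolar(picomoles, volume_microliters):
    """Concentration in uM; 1 pmol per uL equals 1 umol per L."""
    return picomoles / volume_microliters


total_min = program_minutes(CYCLES, STEP_SECONDS, FINAL_EXTENSION_SECONDS)
primer_um = micromolar(5, 25)
print(f"programmed time: {total_min} min")
print(f"each primer: {primer_um} uM")
```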
To obtain the 5′ end information, the GeneRacer kit from Invitrogen was used to obtain sequence information from full-length transcripts by a 5′ RACE strategy. PCR was performed with a gene-specific primer (antisense, 5′-GAA CGC GGT GAT TGT GAA CAT GTG AGT GTA AAA C-3′), which reversely primes the N-terminus of mcfp-4, and a GeneRacer 5′ primer from Invitrogen (sense, 5′-CGA CTG GAG CAC GAG GAC ACT GA-3′).
Modification of the synthetic peptide by diethyl pyrocarbonate (DEPC) was performed in 50 mM Na/K phosphate buffer at pH 7.0. An ~3-fold molar excess of DEPC was incubated with peptide for 30 min at room temperature (24). The modified peptide was purified by C18 reversed phase HPLC with a shallow acetonitrile gradient. After lyophilization, the DEPC-modified peptide was incubated with a 10-fold molar excess of CuCl2 in 25 mM N-ethylmorpholine and 30 mM KCl buffer (pH 7.4) at room temperature for 10 min (23). In some cases, prior to the DEPC treatment, the peptide was incubated with a 10-fold molar excess of CuCl2 in Tris-HCl buffer at pH 7.0 for 30 min, and the samples were subjected to MALDI-TOF MS analysis without further purification.
To better characterize this protein, a large number of plaques were collected and extracted. On the basis of superficial similarities between mcfp-4 and a partially characterized histidine-rich protein from Mytilus edulis (26), we attempted to further separate the soluble proteins on a Zn2+ IMAC column. Several proteins were bound by IMAC and eluted with a step to pH 3 (Figure 2A,C, lane 2). Two were strongly NBT-positive following acid urea-PAGE (suggesting Dopa) and migrated between the preCOLs and mcfp-3s. Although both may be mcfp-4 variants, the more mobile band also stained strongly for histidine (Figure 2C, lanes 3 and 4). Gel filtration on a Shodex-803 column was able to resolve the remaining proteins with the histidine-rich component being the first to elute (Figure 2B,C, lane 5). The mass of this fraction was 92 kDa as determined by MALDI-TOF mass spectrometry.
In contrast to the plaque, extraction of dissected foot tissue (i.e., phenol glands) with 8 M urea and 5% acetic acid was inadequate in attempting to solubilize mcfp-4. Only the use of 4 M guanidine HCl and 5% acetic acid liberated significant amounts of the protein as it did in the case of mefp-5 (19) (Figure 3A). mcfp-4’s finicky solubility, however, was used to advantage in a serially extractive strategy; that is, via successive extraction of the insoluble residue of foot tissue with 5% acetic acid, 5% acetic acid and 8 M urea, and 5% acetic acid and 4 M guanidine HCl, little but mcfp-4 and -5 remained in the last supernatant, especially after dialysis against 1% (v/v) perchloric acid (PCA). mcfp-4 was finally harvested from PCA by a final dialysis against 0.1 M borate, which caused extensive but reversible precipitation as followed by acid urea-PAGE (Figure 3A, lanes 1-3). A single pass through a reverse phase C8 HPLC column (Figure 3B) was adequate to purify mcfp-4 to homogeneity (Figure 3A, lanes 4-8). Peak HPLC fractions containing pure mcfp-4 were subjected to MALDI-TOF mass spectrometry, which revealed strong singly and doubly charged ions at 92 433 and 46 217 Da, respectively (Figure 4).
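The two ion signals are mutually consistent if both arise from the same protein by protonation. The sketch below assumes [M + zH]z+ ions and a proton mass of 1.00728 Da; the assignment of protonated species is an assumption for illustration and is not stated in the text.

```python
PROTON_DA = 1.00728  # mass of a proton in Da


def neutral_mass(mz, z):
    """Neutral mass M from the m/z of an [M + zH]z+ ion."""
    return z * (mz - PROTON_DA)


def mz_for_charge(mass, z):
    """Expected m/z of the [M + zH]z+ ion for neutral mass M."""
    return (mass + z * PROTON_DA) / z


observed_singly = 92433.0
observed_doubly = 46217.0

mass = neutral_mass(observed_singly, 1)
predicted_doubly = mz_for_charge(mass, 2)
print(f"neutral mass from singly charged ion: {mass:.1f} Da")
print(f"predicted doubly charged m/z: {predicted_doubly:.1f}")
print(f"difference from observed: {predicted_doubly - observed_doubly:+.2f}")
```

The doubly charged ion predicted from the singly charged signal falls within 1 m/z unit of the observed value, as expected for a single species.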
Foot-derived mcfp-4 stained as a sharp band with Coomassie blue R-250, and also by redox cycling with NBT, which indicated the presence of some Dopa. The histidine-rich content of mcfp-4 was supported by intense orange staining with Pauly’s reagent (Figure 3A,C).
The cDNA-deduced sequence revealed a very histidine-rich N-terminal moiety in which the histidine content approaches 41 mol %. The domain contains a tandemly repeated decapeptide motif, HVHTHQVLHG, which is highly conserved and occurs ~36 times in variant 1. The His-rich domain is punctuated by a linker sequence (linker A) containing the only cysteines in the protein before entering the second largest repeat sequence, the DN-rich domain, in which the levels of Asp and Asn approach 30 mol %, and with 16 repeats of undecapeptide DDHVNDIAQTA, in which the nonitalicized residues are substituted in less than half of the repeats (Figure 5). Only His-3 is never substituted. The DN domain is followed by another short linker sequence (linker B, Figure 5) before it ends with four repeats of a 13-amino acid sequence. The predicted amino acid composition based on the deduced mature sequence exhibited the expected overall prevalence of His and Asx at 23.6 and 11 mol %, respectively, and was consistent with the experimentally determined compositions of the purified mcfp-4s (Table 1).
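The contribution of each consensus repeat to the domain composition can be estimated directly from the motif sequences. This is a minimal sketch based only on the two consensus repeats quoted above; because individual repeats carry substitutions and the domains include flanking residues, the values for the actual domains differ from these idealized figures.

```python
from collections import Counter


def mol_percent(sequence, residues):
    """Mole percent of the given one-letter residues in a sequence."""
    counts = Counter(sequence)
    return 100.0 * sum(counts[r] for r in residues) / len(sequence)


HIS_REPEAT = "HVHTHQVLHG"   # decapeptide consensus from the text
DN_REPEAT = "DDHVNDIAQTA"   # undecapeptide consensus from the text

his_pct = mol_percent(HIS_REPEAT, "H")
dn_pct = mol_percent(DN_REPEAT, "DN")
his_in_repeats = 36 * HIS_REPEAT.count("H")  # ~36 repeats in variant 1

print(f"His in decapeptide consensus: {his_pct:.1f} mol %")
print(f"Asp + Asn in undecapeptide consensus: {dn_pct:.1f} mol %")
print(f"His residues in 36 ideal decapeptide repeats: {his_in_repeats}")
```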
Two mcfp-4 cDNA variants (variants 1 and 2) were detected and appear to represent alternatively spliced products. The variants differ only in the number of decapeptide repeats present in the His-rich domain, which is reflected in the calculated masses of 87.99 and 93.4 kDa, respectively. Both are close to the observed mass of 92 kDa, but the former would require ~4 kDa of post-translational modification, whereas the latter would require 1-2 kDa of sequence to be trimmed from the C-terminus.
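The size of each discrepancy can be made explicit by subtracting the calculated masses from the measured one. The sketch below uses the singly charged MALDI-TOF signal at 92 433 Da as the observed mass, an illustrative choice that neglects the mass of the charge carrier.

```python
OBSERVED_KDA = 92.433
CALCULATED_KDA = {"variant 1": 87.99, "variant 2": 93.4}

# Positive gap: mass that must be added (e.g., by modification).
# Negative gap: mass that must be removed (e.g., by trimming).
gaps = {name: round(OBSERVED_KDA - calc, 2)
        for name, calc in CALCULATED_KDA.items()}

for name, gap in gaps.items():
    print(f"{name}: observed - calculated = {gap:+.2f} kDa")
```

With these inputs, variant 1 falls about 4.4 kDa short of the observed mass and variant 2 exceeds it by about 1 kDa, at the low end of the 1-2 kDa estimate.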
The involvement of peptidyl-histidines in copper binding was explored by the use of diethyl pyrocarbonate (DEPC), a reagent known to preferentially modify histidine by carbethoxylation (CE) of the imidazole nitrogens (27). Modification of peptidyl-histidine with DEPC prior to exposure to the metal abolished copper binding (Figure 7A, top and bottom). The spectra are somewhat complicated by the presence of salt (Na + K; Δmass = 23 Da + 39 Da = 62 Da) adducts. Thus, each of the monoprotonated peptides with seven, eight, and nine carbethoxylations is accompanied by a sodiated and potassiated counterpart, e.g., [M + 8CE + Na + K]+ is 62 Da above the [M + 8CE + H]+ peak. This is a crucial control given how close the masses for Na + K and Cu are. At the concentrations that were used, DEPC was expected to monocarbethoxylate every histidine (23, 27). Since the peptide mass increased by 72 Da with every monocarbethoxylation, eight histidines (m/z 3042.9) appeared to be modified along with one or two other sites (m/z 3110 and 3187.1), perhaps the N-terminal amine and an ε-amine of lysine (Figure 7A). Copper binding by the His-rich peptide largely prevented its modification by DEPC (Figure 7B). Apparently, not all of the sites were shielded from modification when copper was first bound by the peptide (Figure 7B), because at least two DEPC adducts (Δ = 72 Da) are clearly detectable. Presumably, these are the lower-affinity sites, but whether they are His or other residues remains to be determined.
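The assignment of peaks to numbers of carbethoxylations amounts to counting 72 Da steps from a reference peak. The sketch below takes the m/z 3042.9 peak as the eight-fold modified species, as assigned in the text, and uses the nominal shifts quoted above; the m/z of the unmodified peptide is inferred from those values and is not a reported measurement.

```python
CE_SHIFT = 72.0        # Da per monocarbethoxylation (from the text)
SALT_SHIFT = 23 + 39   # Na + K adduct shift, as quoted in the text
BASE_MZ = 3042.9       # peak assigned to eight carbethoxylations
BASE_CE = 8
OBSERVED_MZ = [3042.9, 3110.0, 3187.1]


def ce_count(mz):
    """Nearest whole number of carbethoxylations for an observed peak."""
    return BASE_CE + round((mz - BASE_MZ) / CE_SHIFT)


counts = [ce_count(mz) for mz in OBSERVED_MZ]

# Inferred values, not measurements reported in the text.
unmodified_mz = round(BASE_MZ - BASE_CE * CE_SHIFT, 1)
salt_adduct_mz = round(BASE_MZ + SALT_SHIFT, 1)

print("carbethoxylations per peak:", counts)
print("inferred unmodified [M + H]+:", unmodified_mz)
print("expected [M + 8CE + Na + K]+:", salt_adduct_mz)
```

The three peaks map onto eight, nine, and ten modifications, consistent with eight histidines plus one or two additional sites.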
The metal binding capacity of the C-terminal undecapeptide repeats was also assessed by a MALDI-TOF approach using a synthetic peptide based on the mcfp-4 sequence (residues 562-580). Whereas none of the transition metals (Cu, Ni, Zn, Co, and Fe) were detectably bound by the peptide, 1-3 equiv of bound Ca/peptide survived desorption and ionization (Figure 8 and the Supporting Information). Further work will be necessary to pinpoint the binding ligands.
mcfp-4 is localized to the apical portion of the byssal adhesive plaque where the splayed microfibrils of the thread core merge with the foam of the plaque (Figure 1). mcfp-4 is a highly repetitive and asymmetric protein with respect to composition and sequence. Of the total histidine, 84% is concentrated in the N-terminal half of the protein where it occurs four times on average in each of 36 decapeptide repeats in variant 1. In contrast, all of the Asp and 84% of the Asn are in the C-terminal half where they are prominent in a tandemly repeated peptide sequence 11 residues long with a DDHVN(D/N)IAQIA consensus sequence. Dopa occurs in the protein at less than 2 mol % and, on the basis of partial direct sequences, may be limited to the N- and C-termini as was the case for Dopa distribution in mefp-2 (15).
mcfp-4 joins a dozen or so known proteins with His-rich peptide repeats. Nearly all have demonstrated or proposed metal binding functions. In Table 2, the mcfp-4 consensus sequence is compared with other His-rich sequences known or proposed to bind metals. Most are involved in Zn/Cu binding, although plasmodial HRP-2 has been reported to bind iron heme (28). The repeat sequence of mcfp-4 is unique and unmatched in the sheer number of tandem repeats present. The preferential high-capacity Cu2+ binding ability of a synthetic peptide based on actual mcfp-4 sequence has been qualitatively established using MALDI-TOF mass spectrometry. Abolition of binding by prior modification of the peptides by DEPC emphasizes the role played by histidine.
The acidic consensus repeats at the C-terminus of mcfp-4 resemble repeat sequences of some known Ca binding proteins such as sarcoplasmic HRP given their high proportion of Asp/Asn and Glu/Gln (Table 2). However, in the best-characterized calcium binding proteins (EF hand and EGF domains), the binding is generally achieved by noncontiguous ligands in three-dimensional space (29). EGF domains are particularly widespread in proteins such as factor VI and fibrillin but also in mefp-2, which is an abundant plaque protein with 11 EGF repeats (16). EGF domains are endowed with Ca2+ binding ability (29) that is defined by a pentagonal bipyramidal geometry (seven ligands) involving Asp, Asn, Glu, and Gln side chains. A synthetic peptide 18 residues long based on the C-terminal repeat domain of mcfp-4 qualitatively exhibited a preference for binding one to three Ca2+ ions as detected by MALDI-TOF mass spectrometry.
mcfp-4 probably functions as a macromolecular bifunctional linker in the plaque-thread junction (Figure 9). On the basis of its location in the plaque, asymmetric organization, and metal ion binding behavior, mcfp-4 seems well suited to the role of coupling agent between the preCOLs in the frayed ends of the thread, on the one hand, and mcfp-2 (16) or the phosphorylated variants of mcfp-5 and -6 (14), on the other. The histidine-rich sequence in the N-terminal moiety of mcfp-4 appears to have a particularly high binding capacity for copper under MALDI conditions, e.g., up to 11 Cu ions per 2-mer. Perhaps during injection molding of a new thread, the histidine-rich repeats of mcfp-4 are presented to the preCOLs, which are also endowed with histidine-rich metal-binding domains at their termini. PreCOL-NG, for example, has a GGGHGGGHGGGRGGGH sequence at one end and an AHAHAHARAHAHA sequence at the other (12). Synthetic peptides of similar sequences exhibit copper binding in vitro (30). Transition metals such as Cu2+ coupled to the histidine ligands would serve as intermolecular bridges between the proteins (Figure 9). A similar relationship may apply to Ca2+ in coupling the C-terminal end of mcfp-4 to other calcium binding proteins in the plaque. The fate of Dopa in mcfp-4 is probably closely linked to that of the 5,5′-diDopa detected in the plaque (31) and thus represents an even more permanent link between plaque proteins.
The concept of connecting load-bearing proteins at junctions through coordinate or chelate interactions rather than covalent or noncovalent bonds is not a new one. A high-resolution structure of a magnesium-mediated connection between integrin and collagen was recently reported (32, 33). The high number of available binding sites in mcfp-4, however, is remarkable. The nearly 36 His-rich decapeptides and 16 Asp/Asn-rich undecapeptides should enable an excellent load distribution in mcfp-4 if all repeats participate equally in binding. mcfp-4 is an intriguing splicing element because it is so clearly multi-heterobifunctional; that is, it appears to be designed to connect to multiple domains on two different proteins. Although it remains to be shown that Cu2+ is the actual bridging metal in situ, it is certainly the best metal for mcfp-4 under the conditions prescribed for the MALDI assay.
Synthetic analogues of the consensus sequences of several His-rich proteins have unexpectedly provided effective metal ion binding functionalities for scaffolds used in the fabrication of nanowires and nanolithography (34, 35). Given its involvement in an evolved load-bearing structure and remarkable capacity for copper binding, mcfp-4 may help to inspire a new generation of metallopolymers.
We thank Rachel Ngo for her assistance with byssal thread induction and analysis and Dr. Dong-Soo Hwang for helping construct the expression library of M. californianus.
DEPC, diethyl pyrocarbonate
Dopa, 3,4-dihydroxyphenyl-L-alanine
MALDI-TOF, matrix-assisted laser desorption and ionization with time of flight
mcfp-4, Mytilus californianus foot protein 4
NBT, nitroblue tetrazolium
PAGE, polyacrylamide gel electrophoresis
RACE, rapid amplification of cDNA ends
RT-PCR, reverse transcriptase polymerase chain reaction
Isolation of mcfp-4 from foot tissue, isolation of tryptic peptides derived from mcfp-4 for sequencing, and MALDI mass spectrometric analysis of the behavior of metal ion binding by an acidic peptide. This material is available free of charge via the Internet at http://pubs.acs.org.