J Bacteriol. 2005 August; 187(15): 5376–5386.
doi: 10.1128/JB.187.15.5376-5386.2005.
PMCID: PMC1196044
NanA, a Neuraminidase from Streptococcus pneumoniae, Shows High Levels of Sequence Diversity, at Least in Part through Recombination with Streptococcus oralis
Samantha J. King,* Adrian M. Whatmore, and Christopher G. Dowson
Infectious Disease Research Group, Department of Biological Sciences, University of Warwick, Coventry CV4 7AL, United Kingdom
*Corresponding author. Present address: 401A Johnson Pavilion, Department of Microbiology, University of Pennsylvania, Philadelphia, PA 19104-6076. Phone: (215) 573-3510. Fax: (215) 573-4856. E-mail: sking2/at/mail.med.upenn.edu.
Present address: Department of Statutory and Exotic Bacterial Diseases, Veterinary Laboratories Agency, Addlestone, Surrey, United Kingdom KT15 3NB.
Received September 1, 2004; Accepted April 28, 2005.
Abstract
Streptococcus pneumoniae, an important human pathogen, contains at least two genes, nanA and nanB, that express sialidase activity. NanA is a virulence determinant of pneumococci which is important in animal models of colonization and middle ear infections. The gene encoding NanA was detected in all 106 pneumococcal strains screened that represented 59 restriction profiles. Sequencing confirmed a high level of diversity, up to 17.2% at the nucleotide level and 14.8% at the amino acid level. NanA diversity is due to a number of mechanisms including insertions, point mutations, and recombination generating mosaic genes. The level of nucleotide divergence for each recombinant block is greater than 30% and much higher than the 20% identified within mosaic pbp genes, suggesting that a high selective pressure exists for these alterations. These data indicate that at least one of the four recombinant blocks identified originated from a Streptococcus oralis isolate, demonstrating for the first time that protein virulence determinants of pneumococci have, as identified previously for genes encoding penicillin binding proteins, evolved by recombination with oral streptococci. No amino acid alterations were identified within the aspartic boxes or predicted active site, suggesting that sequence variation may be important in evading the adaptive immune response. Furthermore, this suggests that nanA is an important target of the immune system in the interaction between the pneumococcus and host.
 
Streptococcus pneumoniae is an important human pathogen that causes diseases ranging in severity from sinusitis and otitis media to pneumonia, bacteremia, and meningitis (13). However, the bacterium more frequently colonizes the human upper respiratory tract asymptomatically. Several virulence determinants have been identified in S. pneumoniae including the neuraminidase NanA, which can cleave terminal sialic acid residues that are α(2-3) and α(2-6) linked to galactose or α(2-6) linked to N-acetylgalactosamine (6). The pneumococcus contains several loci with sequence similarity to neuraminidases; nanA and nanB are expressed and possess activity (3, 6). A third putative neuraminidase, nanC, remains to be characterized (35, 37).

The role of S. pneumoniae neuraminidase activity is not fully elucidated; however, evidence indicates that NanA is involved in colonization and virulence. One suggested contribution of NanA to colonization is to increase the availability of ganglioside receptors that are important for pneumococcal adherence (40). A nanA mutant has a significantly reduced ability to colonize tracheal epithelium (39) and persist in a chinchilla model of nasopharyngeal colonization (38). Furthermore, this mutant was cleared more efficiently in a chinchilla middle ear infection model (38). In contrast, little difference was identified in an intraperitoneal infection model (4).

Evidence suggests that NanA may contribute to long-term colonization by modifying both other nasopharyngeal organisms and host proteins. NanA has been demonstrated to desialylate the cell surfaces of Neisseria meningitidis and Haemophilus influenzae (35). Both these species reside in and possibly compete for the same host niche as pneumococci, and desialylation of lipopolysaccharides of competitors may well provide an advantage. In addition, NanA may contribute to a protease-independent mechanism to modify the function of host glycoproteins that bind to the pneumococcus and protect the airway (24). All isolates screened have been found to produce sialidase activity (21, 31). In patients with pneumococcal infection, a direct correlation exists between levels of N-acetylneuraminic acid in cerebrospinal fluid and development of coma and adverse outcome (31). Recently, immunization with NanA has also been demonstrated to afford protection against nasopharyngeal colonization and experimental otitis media (27).

The housekeeping genes of S. pneumoniae show little variability, with sequence divergence ranging from 1 to 2% (7, 11). However, due to the natural transformability of this species, it evolves largely through homologous recombination and not point mutations (12). Intra- or interspecies (9) recombination can occur, resulting in the transfer of an entire gene(s) or gene fragments with >20% nucleotide sequence diversity (9). This homologous recombination results in the formation of mosaic genes. Mosaic structure was first identified in the formation of low-affinity penicillin binding proteins found in highly penicillin resistant pneumococci resulting from the selective pressure of penicillin therapy (7). It is therefore not surprising that host immune selection has also led to horizontal transfer involving genes encoding some exposed pneumococcal surface proteins. Loci displaying evidence of this gene transfer include the capsule, pspA, pspC, and igaP (5, 16, 18, 33).

In addition, published data demonstrate some sequence divergence within the 3′ region of nanA (Fig. 1). The function of this 3′ region of nanA is unclear, as a clone lacking the 232 N-terminal amino acids still possessed neuraminidase activity (6). Dowson et al. (8) previously identified a serotype 14 clinical isolate (Pn13) that lacked three tandem 60-bp repeats present within R36A nanA (encoding amino acids 898 to 957). Furthermore, proximal to the 60-bp repeats, R36A possessed a 183-bp region (bp 2508 to 2691) that diverged 36% from the nucleotide and 34% from the amino acid sequence of Pn13. In addition, Pn13 contained a 15-bp duplication (nucleotide 2431) upstream of these alterations. Sequencing of a serotype 4 (TIGR4) genome also revealed sequence diversity in the 3′ region of nanA, which was identical to Pn13 with the exception of an 11-bp deletion within TIGR4. This deletion results in a change in reading frame and termination of the gene at amino acid 804 (32, 37).

FIG. 1.
Nucleotide (A) and amino acid (B) alignment to demonstrate diversity within the 3′ region of nanA from R36A, Pn13, and TIGR4. Numbering of the sequence starts from the first putative ATG within the R36A published sequence (X72967 [6]).

NanA is a pneumococcal virulence determinant and a protection-eliciting antigen, which make it a potential vaccine target. Diversity within some surface-attached pneumococcal protein virulence determinants has previously been described (5, 16, 33). Furthermore, limited evidence suggests that NanA may also possess sequence diversity. Therefore, this study aims to fully investigate the distribution and diversity of NanA.

MATERIALS AND METHODS

Bacterial isolates, culture media, and chemicals. One hundred six S. pneumoniae isolates and 66 representatives of 20 related species were used in this study. These included the S. pneumoniae laboratory strain R36A (NCTC10319), an acapsular derivative of a serotype 2 strain (36), and Pn13 (a serotype 14 strain from Papua New Guinea), previously demonstrated to contain nanA diversity. A further 104 S. pneumoniae isolates from both clinical disease (53 isolates) and asymptomatic carriage (51 isolates) were selected to represent a range of countries of isolation (six countries) and serotypes (42 serotypes). The remaining 66 isolates were from 18 different streptococcal species and two Abiotrophia species formerly classified as S. defectivus and S. adjacens (19). The 66 isolates included 23 S. oralis and 10 S. mitis isolates, which are species closely related to S. pneumoniae. Furthermore, S. oralis and some strains of S. mitis produce sialidase activity, making them candidate donors for recombination events found within NanA (2). A full list of isolates, their clinical background, and characteristics described in this paper is available upon request from the authors. Streptococci were grown on brain heart infusion (Becton Dickinson) plates containing 1.5% agar, supplemented with 5% sheep's blood, and incubated at 37°C in a 5% CO2 incubator. All chemicals, unless otherwise specified, were purchased from Sigma Chemical Co.

Preparation of chromosomal DNA. Chromosomal DNA was prepared from all the isolates used, following the protocol described previously (45).

PCR analysis. PCRs were performed under standard conditions with 30 cycles of 95°C for 1 min, X°C for 1 min, and 72°C for 1 min per kilobase of predicted product size where X°C represents an appropriate annealing temperature for the primer pair used. Products were visualized by agarose gel electrophoresis on 1.0 to 1.5% agarose, depending on the size of the PCR product, in the presence of 1 μg ml−1 ethidium bromide. Details of oligonucleotides used in this study are provided in Table 1. For all PCRs appropriate positive controls were performed resulting in products of approximately the predicted sizes.
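
The cycling rule above (30 cycles, with extension at 1 min per kilobase of predicted product) can be written out as a small calculation. This is only an illustrative sketch: the function name is hypothetical, and the annealing temperature of 55°C in the usage line is a placeholder, since the text specifies only that X°C is chosen per primer pair. The product size of 2,070 bp corresponds to the nucleotide 198 to 2267 amplicon described for primers 1 and 3.

```python
def cycling_program(product_bp, annealing_c, cycles=30):
    """Return the cycle count and per-cycle steps as (label, deg C, seconds)."""
    # Extension time scales with product size: 1 min (60 s) per 1,000 bp.
    extension_s = 60.0 * product_bp / 1000.0
    return {
        "cycles": cycles,
        "steps": [
            ("denature", 95.0, 60.0),
            ("anneal", float(annealing_c), 60.0),  # X deg C, primer-pair specific
            ("extend", 72.0, extension_s),
        ],
    }


# Hypothetical usage: 55 deg C is an assumed annealing temperature.
program = cycling_program(product_bp=2070, annealing_c=55)
for label, temp_c, seconds in program["steps"]:
    print(f"{label:9s} {temp_c:5.1f} C  {seconds:6.1f} s")
```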

TABLE 1.
PCR primers utilized in this study

Analysis of genetic diversity by RFLP. Restriction fragment length polymorphism (RFLP) analysis was performed as previously described (23). In brief, 5 μl of PCR product was digested by each restriction enzyme, in a total volume of 25 μl, according to the manufacturer's instructions. Seven restriction digests were performed using enzymes RsaI, Tsp509I, Hsp92II, MboI, HinfI, MnlI, and HaeIII/DdeI to investigate diversity within the 5′ region of nanA. Restriction products were separated on 4% and 8% polyacrylamide gels and visualized under UV illumination following staining for 15 min in 0.3 μg ml−1 ethidium bromide. Restriction profiles were designated by visual comparison of RFLP patterns. The number of 15-bp repeat sequences in the nanA 3′ region was determined by restriction of PCR products with DdeI or TaqI and subsequent separation on 12% polyacrylamide gels.
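
As an illustration of how a restriction profile arises from sequence, the following sketch performs an in silico digest and returns fragment lengths, which is the information compared visually on the gels. It is not the procedure used in this study. Only a subset of the enzymes named above is included, the recognition sites and cut positions are the standard ones for those enzymes, and the function name and toy sequence are hypothetical.

```python
import re

# Recognition site and cut offset on the top strand; N matches any base.
# All sites listed are palindromic, so scanning one strand is sufficient.
ENZYMES = {
    "RsaI": ("GTAC", 2),    # GT^AC
    "MboI": ("GATC", 0),    # ^GATC
    "HaeIII": ("GGCC", 2),  # GG^CC
    "HinfI": ("GANTC", 1),  # G^ANTC
    "DdeI": ("CTNAG", 1),   # C^TNAG
    "TaqI": ("TCGA", 1),    # T^CGA
}


def fragment_lengths(seq, enzyme):
    """Return fragment lengths (largest first) for a linear DNA sequence."""
    site, offset = ENZYMES[enzyme]
    # A lookahead pattern lets overlapping sites be found.
    pattern = "(?=" + site.replace("N", "[ACGT]") + ")"
    cuts = [m.start() + offset for m in re.finditer(pattern, seq.upper())]
    bounds = [0] + cuts + [len(seq)]
    lengths = [b - a for a, b in zip(bounds, bounds[1:]) if b > a]
    return sorted(lengths, reverse=True)


toy = "AAAGTACCCGATCTTT"  # hypothetical 16-bp sequence, not from nanA
print(fragment_lengths(toy, "RsaI"))
print(fragment_lengths(toy, "MboI"))
```

Two products would be assigned the same profile under this sketch if their fragment lists matched for every enzyme; on a real gel, small fragments and near-identical sizes may not be resolved.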

Direct sequencing. A subset of the 5′ and 3′ PCR products was purified using the QIAquick PCR purification kit (QIAGEN) and sequenced using the amplification primers and internal primers on an ABI 373A system.

Bioinformatics. Nucleotide divergence from RFLP data was estimated using Nei3.bas (courtesy of B. Spratt, Imperial College, London, United Kingdom) based on the algorithm of reference 30. Primary sequence analysis was performed using the package DNASTAR. The bioinformatics tools located on http://ncbi.nlm.nih.gov/BLAST were utilized to investigate the similarity of sequences identified to those already present in the databases. Alignments were performed using MEGA—Molecular Evolutionary Genetics Analysis, version 1.01 (25). Mosaic structures resulting from recombination events were detected by the maximum chi-squared test (P < 0.001) (28) and Sawyer's runs test (34). Programs for both tests were written in C++ for the Macintosh by Nick Ross (University of Sussex).
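
For readers unfamiliar with the maximum chi-squared test, its core idea can be sketched as follows: for a pair of aligned sequences, every candidate breakpoint splits the alignment in two, a 2 × 2 chi-squared statistic compares the proportion of differing sites on each side, and the largest statistic is assessed by permuting site order. This is a simplified sketch and not the program used in this study. It uses all aligned sites rather than only polymorphic ones, considers a single breakpoint, and all function names are hypothetical.

```python
import random


def difference_flags(seq_a, seq_b):
    """1 where two aligned sequences differ, 0 where they agree."""
    return [int(x != y) for x, y in zip(seq_a, seq_b)]


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared for the table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def max_chi2(diffs):
    """Return (largest chi-squared, breakpoint k) over all splits [0:k] | [k:]."""
    n, total = len(diffs), sum(diffs)
    best, best_k, left = 0.0, None, 0
    for k in range(1, n):
        left += diffs[k - 1]          # differences to the left of the breakpoint
        right = total - left          # differences to the right
        stat = chi2_2x2(left, k - left, right, (n - k) - right)
        if stat > best:
            best, best_k = stat, k
    return best, best_k


def permutation_p(diffs, trials=1000, seed=0):
    """Fraction of site-order shuffles whose maximum is at least the observed one."""
    rng = random.Random(seed)
    observed, _ = max_chi2(diffs)
    shuffled = list(diffs)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if max_chi2(shuffled)[0] >= observed:
            hits += 1
    return (hits + 1) / (trials + 1)


# Toy example: 30 identical sites followed by a fully divergent 10-site block.
flags = [0] * 30 + [1] * 10
print(max_chi2(flags), permutation_p(flags, trials=200))
```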

Neuraminidase activity. Neuraminidase activity assays were performed on aliquots of sonicated stationary-phase cultures using the fluorimetric substrate 2′-(4-methylumbelliferyl)-α-d-N-acetylneuraminic acid (MUAN) as previously described (24).

Southern blot analysis. Southern blot analysis was performed using the Boehringer digoxigenin (DIG) system according to the manufacturer's instructions. Five micrograms of chromosomal DNA from a nanAR (nanA allele containing 60-bp repeats) and a nanAΔR (nanA allele lacking 60-bp repeats) pneumococcal strain and 10 other streptococcal isolates (three S. oralis isolates, one S. mitis isolate, three S. sanguinis isolates, one S. cristatus isolate, one S. constellatus subsp. constellatus isolate, and one S. ratti isolate) was restricted with EcoRV. The repeat and block D probe was generated by amplifying R36A DNA with primers 9 and 2 and restricting the resulting PCR product with PvuII, which cuts the product 32 bp before the end of the repeat sequence. The resulting 624-bp fragment was purified from an agarose gel using a QIAquick gel extraction kit (QIAGEN) and labeled with DIG-High Prime (Boehringer). Hybridization was allowed to proceed overnight at 42°C in EasyHyb solution (Boehringer). Posthybridization washes were performed at room temperature using 2× SSC (1× SSC is 0.15 M NaCl plus 0.015 M sodium citrate)-0.1% (wt/vol) sodium dodecyl sulfate followed by 0.1× SSC-0.1% (wt/vol) sodium dodecyl sulfate.

Nucleotide sequence accession numbers. Sequences of the 12 novel nanA 5′ and 10 novel nanA 3′ alleles have been submitted to GenBank and assigned the accession numbers AJ968992 to AJ969014. In addition, the S. oralis gene fragments were submitted to GenBank and assigned accession numbers AJ968986 to AJ968991.

RESULTS

Investigation of the distribution of nanA by PCR. Chromosomal DNA was prepared from the 106 pneumococcal isolates. To investigate the distribution and diversity of nanA, primers were designed to the 5′ region (primers 1 and 3; nucleotides 198 to 2267) and the 3′ region (primers 4 and 2; bp 2383 to 3126) of the gene. Amplification with primers for both the 5′ and 3′ regions of the nanA gene resulted in a product from all 106 isolates screened.

Diversity and the mechanisms of evolution within 5′ nanA. All 106 5′ nanA PCR products, obtained by amplification with primers 1 and 3, were restricted independently with seven frequently cutting restriction enzymes to determine their sequence diversity. Comparison of restriction patterns resulted in the identification of 53 distinct RFLP profiles (5′ RFLP). The nucleotide divergence between any two RFLP alleles as estimated using Nei3.bas was found to vary from 0.11% to greater than 10% (data not shown) (30). For this method of analysis, values above 10% are not accurate. The most common restriction profile identified, 5′ RFLP9, was identified among 12.5% of isolates screened including strain R36A.
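
To give a sense of how divergence can be estimated from restriction fragments, the sketch below uses a widely used shared-fragment estimator (the Nei and Li fragment method): F is the proportion of shared fragments, G is obtained by iterating G = [F(3 − 2G)]^(1/4), and d = −(2/r) ln G for a recognition site of r bases. Whether Nei3.bas implements exactly this calculation is an assumption; the function names and fragment lists are hypothetical, and treating fragments of equal size as shared is a simplification.

```python
import math
from collections import Counter


def shared_fragment_fraction(frags_x, frags_y):
    """F = 2 * n_xy / (n_x + n_y); equal-sized fragments are counted as shared."""
    shared = sum((Counter(frags_x) & Counter(frags_y)).values())
    return 2.0 * shared / (len(frags_x) + len(frags_y))


def fragment_divergence(f, site_len, iterations=50):
    """Estimated substitutions per site from the shared-fragment fraction f."""
    if f <= 0.0:
        raise ValueError("no shared fragments: divergence cannot be estimated")
    g = f ** 0.25                      # starting value for the iteration
    for _ in range(iterations):
        g = (f * (3.0 - 2.0 * g)) ** 0.25
    return -(2.0 / site_len) * math.log(g)


# Hypothetical fragment sizes (bp) for two alleles cut with a 4-bp cutter.
f = shared_fragment_fraction([500, 300, 200], [500, 300, 150, 50])
print(f, fragment_divergence(f, site_len=4))
```

Estimators of this kind lose accuracy as few fragments remain shared, which is consistent with the caution above about values greater than 10%.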

PCR products from 12 isolates with different 5′ RFLP patterns were sequenced in order to further examine the extent and nature of genetic diversity within the 5′ region of nanA (Table 2). Diversity between these sequenced PCR products ranged from 0.4% to 17.2% at the nucleotide level and from 0.4% to 14.8% at the amino acid level (accession numbers AJ968992 to AJ969003), confirming the high level of divergence initially identified by RFLP (Fig. 2A).
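
Percent divergence between two aligned sequences, as quoted above, is in its simplest form the proportion of aligned positions that differ. The sketch below shows that calculation. How gapped positions were handled in this study is not stated, so skipping any column containing a gap is an assumption, and the function name and example sequences are hypothetical.

```python
def percent_divergence(seq_a, seq_b, gap="-"):
    """Percentage of compared positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == gap or y == gap:
            continue                   # assumption: gapped columns are ignored
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * differences / compared


# Hypothetical 10-base alignment with one mismatch.
print(percent_divergence("ACGTACGTAC", "ACGTTCGTAC"))
```

The same function applies to aligned amino acid sequences, since it compares characters position by position.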

TABLE 2.
Details of the nanA 5′ PCR products sequenced and their nucleotide divergence from R36A
FIG. 2.
Diversity within the 5′ region of nanA. Nucleotide (A) and amino acid alignment (B) and schematic (C) of the 5′ published sequences from R36A, TIGR4, and 12 other pneumococci selected to represent a range of 5′ RFLP profiles.

Several causes of sequence divergence were identified: mosaic blocks presumably introduced by horizontal gene transfer, point mutations, and a small insertion, as illustrated schematically in Fig. 2C. The 48-bp illegitimate insertion at nucleotide 252 found in two isolates (Cr51 and Cr33) is not present elsewhere within nanA or in available database sequences. Three DNA regions introduced by homologous recombination from other species were identified (blocks A [bp 486 to 1008], B [bp 700 to 738], and C [bp 1161 to 1703]). A Sawyer's runs test indicated that these divergent regions were the result of recombination, and the boundaries of all three were confirmed by maximum chi-squared analysis. In addition, the recombinant blocks were further tested by maximum chi-squared analysis and showed no significant internal structure. Searches of GenBank and available genome sequences revealed no sequence similarities to any of the three blocks. Cr51 contains block A (bp 486 to 1008), which is 28.8% divergent at the nucleotide level and 22.9% at the amino acid level from the corresponding region of R36A. Three partial block A sequences were identified within Cr13 (bp 486 to 535), Cl25 (bp 758 to 1008), and Cr8 (bp 885 to 1008). Block B is a 39-bp region (bp 700 to 738) located within the same region of the gene as block A. Block B sequences were identified within nanA from three isolates (Cl57, Cr50, and Cr8), and partial sequences are located within a further two isolates (Cl25 and Pn13 [bp 700 to 717]). Block B diverges 30.8% at the nucleotide level and 7.7% at the amino acid level from the equivalent region of R36A. Block C is a 543-bp region (bp 1161 to 1703) present in nanA from three isolates (Cl54, Cr33, and Cr50) and diverges 34.7% at the nucleotide level and 32% at the amino acid level from the corresponding region of R36A. Where present, the sequences of blocks A, B, and C were highly conserved between isolates.
Despite the high levels of sequence diversity between the identified blocks and R36A, no amino acid alterations were observed in any of the four aspartic box motifs or residues proposed to be involved in the 5′ region active site.

In addition to interspecies homologous recombination, alterations occurring by point mutations (or intraspecies recombination with an allelic variant) were identified. The frequency of point mutations identified was relatively low and the changes largely synonymous, with sequence divergence excluding the recombinant blocks ranging from 0.15 to 1.02% at the nucleotide level and 0.29 to 1.48% at the amino acid level (data not shown).

Despite the diversity identified within nanA all 11 strains selected for sequencing were shown to possess neuraminidase activity. Although other pneumococcal neuraminidases could contribute to the detected activity, the assay employed was optimized to detect NanA activity.

The origin of mosaic blocks identified within the 5′ region of nanA. In order to investigate if any related streptococcal species were the donors of recombinant block A or C, primer pair 5 and 6 and pair 7 and 8 were designed to amplify fragments of 523 bp and 512 bp from recombinant blocks A and C, respectively. Sixty-six isolates representing 18 different streptococcal and two Abiotrophia species (19) were screened using both primer pairs. Block A primers (5 and 6) amplified a band of the predicted size for eight S. oralis isolates although a further 15 S. oralis isolates including the type strain (NCTC11427) were found to be negative by PCR. Sequencing six of these PCR products from S. oralis revealed that they were closely related to block A (5.9 to 6.1%) (accession numbers AJ968986 to AJ968991). The sequence most closely related to recombinant block A (Cr51) diverged only 5.9% at the nucleotide level and 3.2% at the amino acid level. Diversity between the six S. oralis sequences ranged from 2.3 to 5.9% at the nucleotide level. No amplification products were visualized for any of the 66 streptococcal isolates following amplification with block C primers (7 and 8) or for the whole region encompassing block A to block C (primers 5 and 8).

Diversity within the 3′ region of nanA. In order to investigate the distribution of the previously identified 3′ sequence divergence (Fig. 1) and to investigate any additional sequence diversity within this region of nanA, the 3′ region of the gene (bp 2383 to 3126) was amplified, successfully, from all 106 isolates using primers 4 and 2. To investigate the distribution of the previously identified 60-bp repeat sequences, the PCR products were visualized alongside R36A and Pn13 controls. The amplification products could be divided into five different groups based on size. Amplification from 94 isolates resulted in a product of approximately the same size as that from Pn13 (579 bp), known to contain no repeats and designated nanAΔR. The remaining 10 isolates (9.6%) gave larger PCR products; details of these isolates containing putative nanARN (N indicating the number of repeats) alleles are presented in Table 3. Sequencing confirmed that the four different product sizes observed correlated with sequences containing one, two, three, or four 60-bp tandem repeats (Fig. 3). Sequencing also revealed limited diversity within the repeat sequences. Ten nucleotide variants were identified that code for five closely related 20-amino-acid sequences, including the three distinct R36A repeat sequences.
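
The size classes described above can be checked with simple arithmetic from figures given in the text: the primer 4 and primer 2 amplicon spans bp 2383 to 3126 of R36A, R36A carries three 60-bp repeats, and Pn13 carries an additional 15-bp duplication. The sketch below reproduces the 579-bp Pn13 product and lists the sizes expected for one to four repeats. It assumes that no other length differences occur within the amplicon, which will not hold for isolates that differ in the number of 15-bp repeats; the function name is hypothetical.

```python
# Coordinates of the primer 4 / primer 2 amplicon in R36A (from the text).
R36A_START, R36A_END = 2383, 3126
REPEAT_BP = 60            # length of one tandem repeat
R36A_REPEATS = 3          # repeats present in R36A
PN13_DUPLICATION_BP = 15  # extra 15-bp copy reported in Pn13

r36a_product = R36A_END - R36A_START + 1
# Remove the R36A repeats and add the Pn13 duplication to get the Pn13 product.
pn13_product = r36a_product - R36A_REPEATS * REPEAT_BP + PN13_DUPLICATION_BP


def expected_product_bp(n_repeats):
    """Expected amplicon size on a Pn13-like backbone carrying n 60-bp repeats."""
    return pn13_product + n_repeats * REPEAT_BP


for n in range(5):
    print(n, expected_product_bp(n))
```

Adjacent size classes differ by 60 bp, which agrees with the products having been distinguishable by size alongside the R36A and Pn13 controls.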

TABLE 3.
Details of the 10 nanAR-containing isolates
FIG. 3.
Schematic of diversity identified within the 3′ region of nanA. The numbering starts at the corresponding location within the coding sequence of the published sequence starting at the first putative ATG within the R36A published sequence (X72967).

Sequencing of the 3′ region of nanA also demonstrated that the 183-bp region proximal to the repeats (block D) identified in R36A was present in seven of the 10 nanARN variants (Fig. 4). However, one of these seven isolates (Cl37), containing three repeats, possesses a partial block D sequence. The recombination event involving this allele appears to have occurred within a region conserved between Pn13 and block D and 10 to 22 nucleotides (bp 2518 to 2530) downstream of that of the other six isolates. The remaining three isolates (Cr26, Cr19, and Cl25) containing one 60-bp repeat lack any block D sequence and are identical in this region to Pn13.

FIG. 4.
Southern blot of block D and the 60-bp repeats to isolates of other streptococcal species. The repeat sequences and block D from R36A were DIG labeled and used as a probe. Five micrograms of chromosomal DNA from each strain was restricted with EcoRV.

As discussed above, sequencing demonstrated that the 60-bp tandem repeat sequence could be present without block D, but it remained unknown if block D was present without the repeats. To investigate the distribution of block D within the nanAΔR isolates, primers were designed to exploit sequence alterations at the 5′ end of this region between R36A (primer 9) and Pn13 (primer 10). Two reactions were performed for all 94 nanAΔR isolates and relevant controls utilizing primer pair 9 and 2 and pair 10 and 2. All nanAΔR isolates lacked full-length block D, giving a product of approximately the predicted size (444 bp) with primers 10 and 2. No product was visualized when using primer 9.

Sequencing of the 3′ PCR products also demonstrated diversity in the numbers of 15-bp direct repeat sequences (nucleotide 2416) (Fig. 3). To further investigate the diversity in the number of 15-bp repeats, nanAΔR 3′ PCR products were digested with restriction enzymes that cut just distal to the 15-bp sequence, resulting in a small fragment that increased by 15 bp for each repeat. The restriction digests demonstrated that 16.5% of the isolates contain one 15-bp repeat, 79% contain two repeats, and 4.5% contain three repeats (data not shown). Neither sequencing nor restriction digests showed evidence of the deletion event that results in an early termination of the TIGR4 NanA, suggesting that this sequence alteration is not common within pneumococci. With the exception of these repeat sequences, and block D, the 3′ regions of the sequenced nanAR are highly similar.

Origin of the 3′ 60-bp repeat sequences and block D. The predicted 20-amino-acid repeat sequence and block D identified within the 3′ region of R36A are not present elsewhere within the NanA sequence, the available pneumococcal genomes, or other microbial genomes. Nevertheless, the 20-amino-acid repeats share up to 75% amino acid identity with repeat sequences within the 3′ region of a proposed β-galactosidase from S. pneumoniae (accession no. BAB91370.1). However, these repeats are not present within the predicted amino acid sequence of the other sequenced pneumococcal β-galactosidases. The repeats also share 41% and 38% sequence identity to repeats within the immunoglobulin A1 proteases of S. oralis and S. sanguinis, respectively. A Southern blot assay was performed to investigate the donor of the repeat sequences and proximal divergent region using R36A as a probe (Fig. 4). This probe hybridized to a chromosomal digest of a nanAR4-containing isolate (Cr48) used as a positive control and very weakly to one fragment from the negative control nanAΔR isolate (Cr7). Two fragments of DNA from each of two S. oralis isolates including the type strain (NCTC11427) reacted with the probe. A third S. oralis isolate did not hybridize with the probe under the conditions used. A weak interaction with the probe was also detected for the S. sanguinis type strain (NCTC7863).

DISCUSSION

The results presented in this study demonstrate the presence of the gene in all strains screened. The study of nanA diversity was divided into the 5′ region containing the active site (6) and the 3′ anchor region that is not required for neuraminidase activity. Data from both the 5′ and 3′ regions combined revealed 59 distinct RFLP profiles. As this method analyzed a relatively small percentage of the nanA sequence, the total number of alleles will have been substantially underestimated. Furthermore, the diversity identified within nanA was due to deletion, tandem duplications, point mutations, illegitimate recombination, and interspecies homologous recombination.

Interspecies homologous recombination transfer within nanA has resulted in mosaic genes involving four different recombinant blocks (blocks A, B, C, and D). Block B has a high level of nucleotide diversity from the R36A sequence; however, there is only a single amino acid alteration, implying that this region is conserved between neuraminidases of different species. The three remaining recombinant blocks possessed high levels of both amino acid and nucleotide alterations. The level of nucleotide divergence for each recombinant block is greater than 30% and much higher than the 20% identified within mosaic pbp genes (10, 26), suggesting that a high selective pressure exists for these alterations. Multiple recombination events have occurred within a single allele, as indicated by the presence of more than one recombinant block within the nanA sequence of several isolates (Fig. 2). It is highly likely that multiple genes acted as donors, as two blocks (A and B) are within the same region of nanA. Despite the different sizes of individual blocks identified, the high sequence conservation of each block suggests that a single donor has been responsible for the introduction of each region into pneumococci. The fact that all partial blocks identified share one boundary with the full-length block suggests that a single interspecies recombination event with the donor has occurred for each block, followed by transfer of nanA regions between different S. pneumoniae strains.

The 20-amino-acid repeat sequences were identified within the nanA 3′ region of 9.6% (10/104) of isolates screened. The number of 60-bp repeats varied (nanAR1 to -4), as has been seen within other streptococcal surface proteins, presumably due to slipped strand mispairing or recombination (44). A full-length block D was located proximal to the 20-amino-acid repeats in six of the nanAR sequences but in none of the nanAΔR isolates. Of the remaining four nanAR isolates, one contained a partial block D sequence, while the other three nanAR alleles were identical in sequence to Pn13 apart from the addition of one repeat. As repeat sequences have been identified without block D, it is possible that they were introduced by two independent recombination events. Moreover, the introduction of the repeats may have facilitated introduction of block D. However, it seems more likely that they were transferred during a single recombination event and that the nanAR1 alleles were formed by recombination between a nanAR and a nanAΔR isolate. This recombination could occur within a 7-bp sequence present at the start of each repeat and immediately distal to this location in all isolates (Fig. 1) and at any point distal to the repeat. Besides attachment to the cell surface, the function of the 3′ region is unknown (6). However, the nature of these recombination events suggests that there is a selective pressure upon this region.

Many surface-exposed and secreted proteins of streptococci contain repeats, for example immunoglobulin A1 protease of S. pneumoniae (43) and M proteins of S. pyogenes (17). The wide distribution of these repeat structures implies that these regions must be functional. It has been suggested that the presence of repeats within a protein may provide means for immune evasion (20) or posttranslational or postsecretional processing or aid retention of the protein in the membrane (20). Therefore, NanA containing repeats may have a selective advantage under certain environmental conditions. A function, preventing capsule expression, has been assigned to apparently random spontaneous repeats arising within the capsule locus of S. pneumoniae when grown in Sorbarod biofilms (41, 42). However, the physiological conditions apparently driving these events are as yet unclear.

The donors of the blocks identified within nanA are most likely related streptococci. Many streptococci, including S. oralis, S. pyogenes, S. agalactiae, S. intermedius, and some S. mitis isolates, are known to produce sialidase activity (15, 22, 47), although little is known about the sequences, cellular locations, or diversity of these sialidases. In addition, other streptococcal species may also contain previously undetected neuraminidases. While the donor(s) of blocks B and C as well as the 48-bp insertion (at residue 253) is unknown, the donor(s) of block A, the nanA 3′ repeats, and/or block D is clearly related to sequences within some S. oralis isolates. The evidence provided for positive identification of a donor species in pbp genes was that the nucleotide variation between the S. mitis genes and recombinant blocks in resistant pneumococci was within the range of variation naturally found among pbp genes from different isolates of S. mitis (9). Applying these same principles to the data presented in this paper indicates that an S. oralis strain is the donor of block A, as the nucleotide divergence between block A from nanA and that from S. oralis (5.9 to 6.1%) is within the range of that between the six putative donor isolates of S. oralis (2.3 to 5.9%). The presence of sequences similar to the 3′ repeats and/or block D in two of three S. oralis isolates suggests that a member of this species may also have been the donor for these sequences. Further Southern blot assays would be necessary to determine if both the repeats and proximal region are present in these isolates. In addition, it should be noted that another species could have been responsible for introduction of these sequences to both S. pneumoniae and S. oralis. Both the sequence and Southern blot assay support published reports of extensive genetic variation within S. oralis (1, 46, 48).

Despite the high level of diversity within nanA, it is still lower than that identified within the pneumococcal genes pspA and pspC, where multiple recombination events have led to complex mosaic alleles (5, 16). In the case of pspA, the blocks are so diverse that the boundaries of the recombinant blocks are impossible to determine. Hollingshead et al. (16) suggest that this extraordinary degree of mosaicism indicates the importance of this surface protein as a natural target for host defense against the pneumococcus. This indicates that pspA is often the target of positive or negative selection in the interaction between the pneumococcus and its host. It is possible that the high level of diversity identified within nanA, another surface-associated pneumococcal protein, indicates that this protein is under selective pressure due to antibodies elicited by the host, although no data exist to demonstrate that antibodies to NanA are elicited during human colonization. Sequence alteration may also affect the ability of the organism to utilize different naturally occurring sialic acids. Further studies would be required to determine whether any of the sequence alterations observed do affect evasion of the host immune system or enzyme kinetics. However, as there are no amino acid alterations identified within the aspartic boxes or predicted active site, alterations may be involved in immune system evasion.

The high level of sequence variation observed in this study contrasts with the findings of Hakenbeck et al. (14), who used DNA chip-based hybridization techniques and identified no variation within nanA, although mosaic structure was identified within pbp2x. Hakenbeck et al. examined only 20 pneumococcal isolates, and it is possible that the oligonucleotides utilized did not include sequences representing the mosaic blocks (14).

We can place the variation identified within the nanA gene in the wider context of pneumococcal diversity, as these 104 clinical and carriage isolates were included in a previously published multilocus restriction typing (MLRT) study (29). Eighty-three MLRTs were identified within the 104 strains. There are, of course, some examples of strains that share both similar nanA alleles and a similar genetic background. However, the majority of strains show no association between nanA RFLP profiles and specific lineages or anatomical sites of isolation. The 11 strains selected for sequencing represented 11 different MLRTs. Furthermore, the MLRT data support multiple recombination events leading to the distribution of blocks within the S. pneumoniae population. For example, Cr51 and Cr33 (both containing the 48-bp insertion sequence) had divergent restriction profiles at six of the nine loci utilized in the MLRT study.
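The kind of MLRT comparison described above can be sketched in a few lines of Python. This is an illustrative example only, under the assumption that each strain's MLRT is recorded as a tuple of nine restriction-pattern codes, one per locus. The strain labels and pattern codes below are invented and are not the profiles reported in the MLRT study (29).

```python
def differing_loci(profile_a, profile_b) -> int:
    """Number of loci at which two MLRT profiles have different patterns."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci")
    return sum(1 for a, b in zip(profile_a, profile_b) if a != b)


def count_distinct_types(profiles) -> int:
    """Number of distinct MLRTs in a mapping of strain label -> profile."""
    return len(set(profiles.values()))


# Hypothetical profiles: one restriction-pattern code per locus (nine loci).
profiles = {
    "strain_x": (1, 1, 2, 3, 1, 4, 2, 1, 1),
    "strain_y": (1, 2, 2, 1, 3, 4, 5, 2, 3),
    "strain_z": (1, 1, 2, 3, 1, 4, 2, 1, 1),  # same MLRT as strain_x
}

print(differing_loci(profiles["strain_x"], profiles["strain_y"]))
print(count_distinct_types(profiles))
```

Two strains that carry the same nanA feature but differ at most loci, as in the first comparison, are the pattern expected when a block has spread horizontally rather than by clonal descent.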

This paper demonstrates that nanA genes from different S. pneumoniae isolates are divergent due to the presence of recombinant blocks, repeats, and an insertion. Recombination with multiple donors has occurred, and at least one of these donors appears to be an S. oralis isolate. The products of these recombination events have spread horizontally between pneumococci. The recombination event within the 3′ region of nanA resulting in the introduction of the repeats and block D suggests a function for this region that is not essential for neuraminidase activity. The conservation of residues postulated to be involved in the active site or aspartic boxes implies that the identified sequence divergence may provide a selective advantage in avoiding the host immune system. Furthermore, this suggests that nanA is an important target of the immune system in the interaction between the pneumococcus and host.

REFERENCES
1. Beighton, D., and S. Alum. 1997. Use of repetitive extragenic palindromic PCR (REP-PCR) to study Streptococcus oralis. J. Dent. Res. 76:1026.
2. Beighton, D., and R. Whiley. 1990. Sialidase activity of the “Streptococcus milleri group” and other viridans group streptococci. J. Clin. Microbiol. 28:1431-1433.
3. Berry, A. M., R. A. Lock, and J. C. Paton. 1996. Cloning and characterization of nanB, a second Streptococcus pneumoniae neuraminidase gene, and purification of the NanB enzyme from recombinant Escherichia coli. J. Bacteriol. 178:4854-4860.
4. Berry, A. M., and J. C. Paton. 2000. Additive attenuation of virulence of Streptococcus pneumoniae by mutation of the genes encoding pneumolysin and other putative pneumococcal virulence proteins. Infect. Immun. 68:133-140.
5. Brooks-Walter, A., D. E. Briles, and S. K. Hollingshead. 1999. The pspC gene of Streptococcus pneumoniae encodes a polymorphic protein, PspC, which elicits cross-reactive antibodies to PspA and provides immunity to pneumococcal bacteremia. Infect. Immun. 67:6533-6542.
6. Camara, M., G. J. Boulnois, P. W. Andrew, and T. J. Mitchell. 1994. A neuraminidase from Streptococcus pneumoniae has the features of a surface protein. Infect. Immun. 62:3688-3695.
7. Dowson, C., A. Hutchison, J. Brannigan, R. George, D. Hansman, J. Linares, A. Tomasz, J. Smith, and B. Spratt. 1989. Horizontal transfer of penicillin-binding protein genes in penicillin-resistant clinical isolates of Streptococcus pneumoniae. Proc. Natl. Acad. Sci. USA 86:8842-8846.
8. Dowson, C. G., V. Barcus, S. King, P. Pickerill, A. Whatmore, and M. Yeo. 1997. Horizontal gene transfer and the evolution of resistance and virulence determinants in Streptococcus. Soc. Appl. Bacteriol. Symp. Ser. 26:42S-51S.
9. Dowson, C. G., T. J. Coffey, C. Kell, and R. A. Whiley. 1993. Evolution of penicillin resistance in Streptococcus pneumoniae; the role of Streptococcus mitis in the formation of a low affinity PBP2B in S. pneumoniae. Mol. Microbiol. 9:635-643.
10. Dowson, C. G., A. Hutchinson, and B. G. Spratt. 1989. Extensive remodelling of the transpeptidase domain of penicillin-binding protein 2B of a penicillin-resistant South African isolate of Streptococcus pneumoniae. Mol. Microbiol. 3:95-102.
11. Enright, M. C., and B. G. Spratt. 1998. A multilocus sequence typing scheme for Streptococcus pneumoniae: identification of clones associated with serious invasive disease. Microbiology 144:3049-3060.
12. Feil, E. J., J. M. Smith, M. C. Enright, and B. G. Spratt. 2000. Estimating recombinational parameters in Streptococcus pneumoniae from multilocus sequence typing data. Genetics 154:1439-1450.
13. Feldman, C., and K. Klugman. 1997. Pneumococcal infections. Curr. Opin. Infect. Dis. 10:109-115.
14. Hakenbeck, R., N. Balmelle, B. Weber, C. Gardes, W. Keck, and A. de Saizieu. 2001. Mosaic genes and mosaic chromosomes: intra- and interspecies genomic variation of Streptococcus pneumoniae. Infect. Immun. 69:2477-2486.
15. Hayano, S., A. Tanaka, and Y. Okuyama. 1969. Distribution and serological specificity of sialidase produced by various groups of streptococci. J. Bacteriol. 100:354-357.
16. Hollingshead, S., R. Becker, and D. Briles. 2000. Diversity of PspA: mosaic genes and evidence for past recombination in Streptococcus pneumoniae. Infect. Immun. 68:5889-5900.
17. Hollingshead, S., V. Fischetti, and J. Scott. 1987. Size variation in group A streptococcal M protein is generated by homologous recombination between intragenic repeats. Mol. Gen. Genet. 207:196-203.
18. Iannelli, F., M. Oggioni, and G. Pozzi. 2002. Allelic variation in the highly polymorphic locus of pspC of Streptococcus pneumoniae. Gene 284:63-71.
19. Kawamura, Y., X. G. Hou, F. Sultana, S. J. Lui, H. Yamamoto, and T. Ezaki. 1995. Transfer of Streptococcus adjacens and Streptococcus defectivus to Abiotrophia gen. nov. as Abiotrophia adjacens comb. nov. and Abiotrophia defectiva comb. nov., respectively. Int. J. Syst. Bacteriol. 45:406-408.
20. Kehoe, M. A. 1994. Cell wall associated proteins in Gram positive bacteria, p. 217-261. In J.-M. Ghuysen and R. Hakenbeck (ed.), Bacterial cell wall. Elsevier Science, New York, N.Y.
21. Kelly, R. T., S. Farmer, and D. Greiff. 1967. Neuraminidase activities of clinical isolates of Diplococcus pneumoniae. J. Bacteriol. 94:272-273.
22. Kilian, M., L. Mikkelsen, and J. Henrichsen. 1989. Taxonomic study of viridans streptococci: description of Streptococcus gordonii sp. nov. and amended descriptions of Streptococcus sanguis (White and Niven 1946), Streptococcus oralis (Bridge and Sneath 1982), and Streptococcus mitis (Andrewes and Horder 1906). Int. J. Syst. Bacteriol. 39:471-484.
23. King, S. J., P. J. Heath, I. Luque, C. Tarradas, C. G. Dowson, and A. M. Whatmore. 2001. Distribution and genetic diversity of suilysin in Streptococcus suis isolated from different diseases of pigs and characterization of the genetic basis of suilysin absence. Infect. Immun. 69:7571-7582.
24. King, S. J., K. R. Hippe, J. M. Gould, D. Bae, S. Peterson, R. T. Cline, C. Fasching, E. N. Janoff, and J. N. Weiser. 2004. Phase variable desialylation of host proteins that bind to Streptococcus pneumoniae in vivo and protect the airway. Mol. Microbiol. 54:159-171.
25. Kumar, S., K. Tamura, and M. Nei. 1993. MEGA: Molecular Evolutionary Genetics Analysis, 1.01 edition. The Pennsylvania State University, University Park.
26. Laible, G., B. G. Spratt, and R. Hakenbeck. 1991. Interspecies recombinational events during the evolution of altered pbp2 genes in penicillin resistant clinical isolates of Streptococcus pneumoniae. Mol. Microbiol. 5:1993-2002.
27. Long, J. P., H. H. Tong, and T. F. DeMaria. 2004. Immunization with native or recombinant Streptococcus pneumoniae neuraminidase affords protection in the chinchilla otitis media model. Infect. Immun. 72:4309-4313.
28. Maynard-Smith, J. 1992. Analyzing the mosaic structure of genes. J. Mol. Evol. 34:126-129.
29. Muller-Graf, C., A. Whatmore, S. King, K. Trzcinski, A. Pickerill, N. Doherty, J. Paul, D. Griffiths, D. Crook, and C. Dowson. 1999. Population biology of Streptococcus pneumoniae isolated from oropharyngeal carriage and invasive disease. Microbiology 145:3283-3293.
30. Nei, M., and W.-H. Li. 1979. Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc. Natl. Acad. Sci. USA 76:5269-5273.
31. O'Toole, R. D., L. Goode, and C. Howe. 1971. Neuraminidase activity in bacterial meningitis. J. Clin. Investig. 50:979-985.
32. Pericone, C., D. Bae, M. Shchepetov, T. McCool, and J. Weiser. 2002. Short-sequence tandem and nontandem DNA repeats and endogenous hydrogen peroxide production contribute to genetic instability of Streptococcus pneumoniae. J. Bacteriol. 184:4392-4399.
33. Poulsen, K., J. Reinholdt, and M. Kilian. 1996. Characterization of the Streptococcus pneumoniae immunoglobulin A1 protease gene (iga) and its translation product. Infect. Immun. 64:3957-3966.
34. Sawyer, S. 1989. Statistical tests for detecting gene conversion. Mol. Biol. Evol. 6:526-538.
35. Shakhnovich, E., S. King, and J. Weiser. 2002. Neuraminidase expressed by Streptococcus pneumoniae desialylates the lipopolysaccharide of Neisseria meningitidis and Haemophilus influenzae: a paradigm for interbacterial competition among pathogens of the human respiratory tract. Infect. Immun. 70:7161-7164.
36. Smith, M. D., and W. R. Guild. 1979. A plasmid in Streptococcus pneumoniae. J. Bacteriol. 137:735-739.
37. Tettelin, H., K. E. Nelson, I. T. Paulsen, J. A. Eisen, T. D. Read, S. Peterson, J. Heidelberg, R. T. DeBoy, D. H. Haft, R. J. Dodson, A. S. Durkin, M. Gwinn, J. F. Kolonay, W. C. Nelson, J. D. Peterson, L. A. Umayam, O. White, S. L. Salzberg, M. R. Lewis, D. Radune, E. Holtzapple, H. Khouri, A. M. Wolf, T. R. Utterback, C. L. Hansen, L. A. McDonald, T. V. Feldblyum, S. Angiuoli, T. Dickinson, E. K. Hickey, I. E. Holt, B. J. Loftus, F. Yang, H. O. Smith, J. C. Venter, B. A. Dougherty, D. A. Morrison, S. K. Hollingshead, and C. M. Fraser. 2001. Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293:498-506.
38. Tong, H., L. Blue, M. James, Y. Chen, and T. DeMaria. 2000. Evaluation of phase variation of nontypeable Haemophilus influenzae lipooligosaccharide during nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect. Immun. 68:4593-4597.
39. Tong, H. H., I. Grants, X. Liu, and T. F. DeMaria. 2002. Comparison of alteration of cell surface carbohydrates of the chinchilla tubotympanum and colonial opacity phenotype of Streptococcus pneumoniae during experimental pneumococcal otitis media with or without an antecedent influenza A virus infection. Infect. Immun. 70:4292-4301.
40. Tong, H. H., M. A. McIver, L. M. Fisher, and T. F. DeMaria. 1999. Effect of lacto-N-neotetraose, asialoganglioside-GM1 and neuraminidase on adherence of otitis media-associated serotypes of Streptococcus pneumoniae to chinchilla tracheal epithelium. Microb. Pathog. 26:111-119.
41. Waite, R., D. Penfold, J. Struthers, and C. Dowson. 2003. Spontaneous sequence duplications within capsule genes cap8E and tts control variation in Streptococcus pneumoniae serotypes 8 and 37. Microbiology 149:497-504.
42. Waite, R., J. Struthers, and C. Dowson. 2001. Spontaneous sequence duplication within an open reading frame of the pneumococcal type 3 capsule locus causes high frequency phase variation. Mol. Microbiol. 42:1223-1232.
43. Wani, J., J. Gilbert, A. Plaut, and J. Weiser. 1996. Identification, cloning and sequencing of the immunoglobulin A1 protease gene of Streptococcus pneumoniae. Infect. Immun. 64:3967-3974.
44. Wastfelt, M., M. Stalhammar-Carlemalm, A. M. Delisse, T. Cabezon, and G. Lindahl. 1996. Identification of a family of streptococcal surface proteins with extremely repetitive structure. J. Biol. Chem. 271:18892-18897.
45. Whatmore, A. M., and C. G. Dowson. 1999. The autolysin-encoding gene (lytA) of Streptococcus pneumoniae displays restricted allelic variation despite localized recombination events with genes of pneumococcal bacteriophage encoding cell wall lytic enzymes. Infect. Immun. 67:4551-4556.
46. Whatmore, A. M., A. Efstratiou, A. P. Pickerill, K. Broughton, G. Woodard, D. Sturgeon, R. George, and C. G. Dowson. 2000. Genetic relationships between clinical isolates of Streptococcus pneumoniae, Streptococcus oralis, and Streptococcus mitis: characterization of “atypical” pneumococci and organisms allied to S. mitis harboring S. pneumoniae virulence factor-encoding genes. Infect. Immun. 68:1374-1382.
47. Whiley, R. A., H. Fraser, J. M. Hardie, and D. Beighton. 1990. Phenotypic differentiation of Streptococcus intermedius, Streptococcus constellatus, and Streptococcus anginosus strains within the “Streptococcus milleri group.” J. Clin. Microbiol. 28:1497-1501.
48. Wisplinghoff, H., R. R. Reinert, O. Cornely, and H. Seifert. 1999. Molecular relationships and antimicrobial susceptibilities of viridans group streptococci isolated from blood of neutropenic cancer patients. J. Clin. Microbiol. 37:1876-1880.