Genome Res. 1998 May; 8(5): 557–561.
PMCID: PMC310720
Sequencing Multimegabase-Template DNA with BigDye Terminator Chemistry
Cheryl R. Heiner,1 Kathryn L. Hunkapiller,1 Shiaw-Min Chen,2 John I. Glass,3 and Ellson Y. Chen1,4
1Advanced Center for Genetic Technology and 2Genetic Analysis Department, PE-Applied Biosystems, Foster City, California 94404 USA; 3Department of Microbiology, University of Alabama at Birmingham, Birmingham, Alabama 35294 USA
4Corresponding author.
Received November 26, 1997; Accepted March 18, 1998.
Abstract
Using the recently introduced BigDye™ terminators, large-template DNA can be directly sequenced with custom primers on automated instruments. Cycle sequencing conditions are presented to sequence DNA samples isolated from a number of microbial genomes including 750-kb Ureaplasma urealyticum, 1.2-Mb Mycoplasma fermentans, 2.3-Mb Streptococcus pneumoniae, and 4.6-Mb Escherichia coli. Average read lengths of >700 bp from unique primer annealing sites are often sufficient to fill final gaps in microbial genome sequencing projects without additional manipulations of template DNA. The technique can also be applied to sequence-targeted regions, thereby bypassing tedious subcloning steps.
 
In microbial genome or large-insert clone sequencing projects that use the predominant random subclone sequencing strategy, progress tends to decrease dramatically at late stages as one confronts gaps. At these points, DNA is under-represented or unstable in subclones (E.Y. Chen et al. 1996; Chissoe et al. 1997). Further sequencing with additional random subclones is then inefficient at best, and one must frequently employ alternative cloning systems or additional methods like long-range PCR to recover missing DNA (C.N. Chen et al. 1996). The variability of performance of these methods and the necessity for custom-tailored work tend to hamper the late stages of sequencing efforts. In contrast, if one can sequence directly from genomic DNA (or large-insert clones such as BACs or PACs) with walking primers, cumbersome work to fill gaps could be completed in a much shorter time.
As an example, in a recent project to sequence the 750-kb genome of Ureaplasma urealyticum (J. Glass, in prep.), assembly of ~13,000 sequence reads and combinatorial PCR reactions to join contigs left two gaps. No λ, pUC, or M13 subclones were recovered that spanned the gaps, nor were PCR products obtained with any of several sets of flanking primers. The difficulty of cloning these segments is probably attributable to repeated sequences in and near the two gaps, but the high sensitivity of the recently introduced BigDye terminators (Rosenblum et al. 1997) permitted direct sequencing of the gap regions on genomic U. urealyticum DNA templates. Using the conditions described in this report, the two gaps of 259 and 121 bp were sequenced from both strands with walking primers to complete the project of 751,723 bp.
Direct sequencing was further tested for larger templates, and good results were reproducibly obtained with 1.2-Mb Mycoplasma fermentans, 2.3-Mb Streptococcus pneumoniae, and 4.6-Mb Escherichia coli genomic DNA (see example in Fig. 1). In addition, several difficult gaps in sequencing projects with BAC clones, ranging in size from 140 to 250 kb, have also been filled in this manner. Essentially the method is applicable whenever 2–3 μg of high-quality large-template DNA is available.
Figure 1
Sequencing of E. coli K12 strain genomic DNA with BigDye terminators. Approximately 3 μg of E. coli DNA was sequenced with an apaG gene primer (5′-GTTCCCACACTCATTCATTA) using the conditions described in the text.
RESULTS AND DISCUSSION
Figure 1 shows an example of the results from these experiments. Although the signal intensity tends to be low (only ~10%–20% of that obtained from regular M13 or pUC templates), base-calling quality remains high, because the baseline noise is sharply reduced by the increased brightness and improved spectral resolution of the BigDye terminators (Rosenblum et al. 1997). Lower signal strength is expected considering the molarity of microbial template DNA, which is several hundred to a thousand times less than that of regular plasmid templates. Higher primer levels (2×–5×) and a greater number of cycles (from 45 to 60, with more cycles for larger templates), as described in Methods, helped to boost the signal intensities. Adding further cycles (up to 99) was found to increase the signal strength but decrease the readable range (see Table 1). Accurate quantitation of template DNA to within 2–4 μg is essential. Too much template (>5 μg) produced much lower quality results (see Fig. 2 for an example), whereas too little DNA gave rise to weak signal and low-quality results (data not shown).
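The molarity argument above can be checked with a short calculation. The sketch below is illustrative only: it assumes an average mass of about 650 g/mol per base pair of double-stranded DNA (a common approximation, not a value from this report), and the function and variable names are hypothetical. At equal mass, the molar amount of template falls in inverse proportion to template length.

```python
# Assumption: average molar mass of one base pair of double-stranded DNA.
# ~650 g/mol is a widely used approximation; it is not taken from this report.
AVG_BP_MASS_G_PER_MOL = 650.0


def template_fmol(mass_ug: float, length_bp: float) -> float:
    """Return femtomoles of double-stranded template in mass_ug micrograms."""
    grams = mass_ug * 1e-6
    moles = grams / (length_bp * AVG_BP_MASS_G_PER_MOL)
    return moles * 1e15


# Genome sizes quoted in the text (in base pairs).
GENOMES_BP = {
    "U. urealyticum": 0.75e6,
    "M. fermentans": 1.2e6,
    "S. pneumoniae": 2.3e6,
    "E. coli": 4.6e6,
}

if __name__ == "__main__":
    for name, length_bp in GENOMES_BP.items():
        # 3 ug is the upper end of the 2-3 ug range used per reaction.
        print(f"{name}: {template_fmol(3.0, length_bp):.2f} fmol in 3 ug")
```

Under this assumption, 3 μg of a 4.6-Mb genome is roughly 1 fmol of template, which is set against 13 pmoles of primer in each reaction.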
Table 1
Sequence Quality and Signal Strength as a Function of the Number of Cycles
Figure 2
Sequencing of S. pneumoniae genomic DNA with BigDye terminators using either 2.5 μg (A) or 5.0 μg (B) of template and a SKH1 primer (23-mer 5′-AACAATAATGTAGAAGACTACTT). All other conditions are as described in the text.
Several other factors are also important for optimal results. First, template DNA should be free of salts, detergents, proteins, and cell debris that might interfere with primer annealing. Second, as may be expected, having a good primer is critical. Successful primers have typically been 21–25-mers with appropriate GC content from a unique site. Only 1 of the 10 primers tested failed to generate useful sequence; that primer annealed at two different locations on the S. pneumoniae genome. Third, special care should be taken to eliminate excess dye terminators before loading samples on gels (note that carryover of dyes can be seen in Fig. 1 in the region of bases 45–55; apparently the system is very sensitive to residual dye when signals are so low). Fourth, to get high-quality data from low signals, it is important to have a well-tuned sequencing instrument equipped with a good multicomponent matrix and base-calling software capable of analyzing data with weak signals. Here we used version 3.0.1b3 software, which imposes no minimum signal requirement.
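The requirement for a unique priming site can be screened computationally when a draft or reference sequence is available. The following is a minimal sketch, not the procedure used in this work: it counts only exact matches of the primer on either strand, whereas real mispriming can also arise from partial matches near the 3′ end. All function names are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def count_overlapping(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text, including overlapping ones."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def exact_sites(primer: str, genome: str) -> int:
    """Count exact priming sites for primer on both strands of genome.

    The genome is given as its top strand. A match to the primer marks a
    site on the bottom strand; a match to the primer's reverse complement
    marks a site on the top strand. A self-complementary primer is
    therefore counted once per strand.
    """
    primer = primer.upper()
    genome = genome.upper()
    forward = count_overlapping(genome, primer)
    reverse = count_overlapping(genome, reverse_complement(primer))
    return forward + reverse


def gc_percent(primer: str) -> float:
    """Return the GC content of primer as a percentage."""
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


if __name__ == "__main__":
    # SKH1 primer sequence from the Figure 2 legend; the genome is a toy string.
    skh1 = "AACAATAATGTAGAAGACTACTT"
    toy_genome = "TTT" + skh1 + "GGG" + reverse_complement(skh1) + "AAA"
    print(exact_sites(skh1, toy_genome), round(gc_percent(skh1), 1))
```

A primer that returns more than one site in such a screen would be a candidate for redesign before any template DNA is committed to a reaction.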
There appears to be no correlation between the quality of the sequence data and the size of the template, as shown in Table 2, in which nine different sequences obtained from bacterial genomic templates are compared with corresponding GenBank sequences that were presumably derived from subcloned templates. In the unedited sequence files, taken up to 600 bases beyond the first legible base, the genome-derived sequences have an average fidelity of 98.2% (1.1% N and 0.7% discordant base-calls). After minimal manual editing of the sequences, the average usable read length was 712 ± 18 bases. It is interesting that the signal strength remains relatively constant among these nine genomic templates, which range from 0.75 to 4.6 Mb. This suggests that the present protocol, after some modifications, may be applicable to even larger templates. The method, nevertheless, does require 2–3 μg of high-quality template DNA for each sequencing reaction, an amount that could be difficult to obtain from certain microorganisms.
Table 2
Sequence Quality as a Function of Read Lengths
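The fidelity figure quoted above is 100% minus the percentages of N calls and discordant calls (100 − 1.1 − 0.7 = 98.2). The sketch below shows one way such a tally could be made. It assumes the read and the reference have already been aligned position by position with no gaps, which is a simplification: insertions and deletions would need a proper alignment step that is not shown. The function name is hypothetical.

```python
def fidelity_summary(read: str, reference: str) -> dict:
    """Tally N calls and discordant calls in a gap-free, pre-aligned read.

    Assumes read[i] corresponds to reference[i] for every position i.
    """
    if len(read) != len(reference):
        raise ValueError("read and reference must have the same length")
    if not read:
        raise ValueError("sequences must not be empty")

    read = read.upper()
    reference = reference.upper()
    total = len(read)

    n_calls = sum(1 for base in read if base == "N")
    # A discordant call is a definite base (not N) that differs from the reference.
    discordant = sum(
        1 for r, g in zip(read, reference) if r != "N" and r != g
    )

    return {
        "pct_N": 100.0 * n_calls / total,
        "pct_discordant": 100.0 * discordant / total,
        "pct_fidelity": 100.0 * (total - n_calls - discordant) / total,
    }


if __name__ == "__main__":
    reference = "ACGTACGTAC"
    read = "ACGNACGTTC"  # one N call and one discordant call in 10 bases
    print(fidelity_summary(read, reference))
```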
In summary, we have demonstrated that template DNAs up to 4.6 Mb in size can be directly sequenced with automated sequencers using custom primers and BigDye terminators. This protocol could expedite large-scale sequencing projects by facilitating gap closure, especially for difficult areas that are refractory to cloning and PCR methods. The technique can also be applied to bypass tedious subcloning steps in bacterial sequencing projects that focus on individual genes.
METHODS
Preparation of DNA Samples
Multiple methods, including the Easy-DNA kit (Invitrogen, USA), an SDS–proteinase K lysis procedure, and isopycnic banding in CsCl gradients (Wilson 1994), all worked well to prepare microbial genomic DNA samples for sequencing. BAC DNA was purified as described by Rosenblum et al. (1997). Sequencing primers were designed using Oligo 5.0 (National Biosciences, USA). Sizes varied from 21 to 25 bases and Tm from 60°C to 74°C (GC% method).
Sequencing Conditions
Each cycle sequencing reaction contained 16 μl of BigDye Terminator mix (PE-Applied Biosystems, USA), 13 pmoles of primer, and 2–3 μg of microbial DNA (or 6 pmoles of primer and 0.4 μg of BAC DNA) in a total volume of 40 μl. In some experiments 1 μl of ThermoFidelase I (Fidelity Systems, USA) was included to improve data quality, but the results were inconsistent, so it is not a standard component of current reaction mixtures. The cycling conditions were an initial denaturation at 95°C for 5 min followed by 30–60 cycles (30 for BACs and 45 for all microbial DNAs except E. coli samples, which received 60 cycles; see Results and Discussion) of 95°C for 30 sec, 55°C for 20 sec, and 60°C for 4 min. Excess dye terminators were removed with a spin column (PE-Applied Biosystems User Manual) and reaction mixtures were dried in a SpeedVac system. Each sample was resuspended in 2 μl of formamide solution and denatured by heat, and the entire volume was loaded on an ABI 377 automated DNA sequencing instrument with a 36-well comb and a 48-cm WTR (well-to-read) gel. Electrophoresis was at 2400 V for 10–11 hours. Sequence data were analyzed by ABI version 3.0.1b3 software modified for weak signal base-calling (available through PE-ABI ftp site).
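For planning instrument time, the cycling program above can be written down as data and its total hold time computed. This is an illustrative sketch with hypothetical names; it sums only the programmed hold times and ignores ramping between temperatures, which depends on the thermal cycler and adds to the real run time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One programmed hold: a temperature and its duration."""
    temp_c: int
    seconds: int


# Values taken from the cycling conditions described in the text.
INITIAL_DENATURATION = Step(temp_c=95, seconds=5 * 60)
CYCLE = (
    Step(temp_c=95, seconds=30),      # denaturation
    Step(temp_c=55, seconds=20),      # annealing
    Step(temp_c=60, seconds=4 * 60),  # extension
)
CYCLES_BY_TEMPLATE = {"BAC": 30, "microbial": 45, "E. coli": 60}


def hold_time_hours(n_cycles: int) -> float:
    """Return total programmed hold time in hours, excluding ramp time."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return (INITIAL_DENATURATION.seconds + n_cycles * per_cycle) / 3600


if __name__ == "__main__":
    for template, n_cycles in CYCLES_BY_TEMPLATE.items():
        print(f"{template}: {n_cycles} cycles, {hold_time_hours(n_cycles):.2f} h")
```

Each cycle holds for 290 sec, so the programmed time comes to about 2.5 h for 30 cycles, 3.7 h for 45 cycles, and 4.9 h for 60 cycles before ramping is counted.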
Acknowledgments
We thank Dan Allison for modifying the base-calling software, Jennifer Glass for help in preparing the microbial DNAs, and Chun-Nan Chen for valuable discussions. This work is supported, in part, by the National Institutes of Health (grants HG00201 and RO1 AI28279).
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 USC section 1734 solely to indicate this fact.
Footnotes
E-MAIL cheney@perkin-elmer.com; FAX (650) 638-6177.
REFERENCES
  • Chen, CN; Su, Y; Baybayan, P; Siruno, A; Nagaraja, R; Mazzarella, R; Schlessinger, D; Chen, E. Ordered shotgun sequencing of a 135 kb Xq25 YAC containing ANT2 and 4 possible genes, including three confirmed by EST matches. Nucleic Acids Res. 1996;24:4034–4041.
  • Chen, EY; Zollo, M; Mazzarella, R; Ciccodicola, A; Chen, C; Zuo, L; Heiner, C; Burough, F; Ripetto, M; Schlessinger, D; D’Urso, M. Long-range sequence analysis in Xq28: Thirteen known and six candidate genes in 219.4 kb of high GC DNA between the RCP/GCP and G6PD loci. Hum Mol Genet. 1996;5:659–668.
  • Chissoe, SL; Marra, MA; Hillier, L; Brinkman, R; Wilson, RK; Waterston, RH. Representation of cloned genomic sequences in two sequence vectors: Correlation of DNA sequence and subclone distribution. Nucleic Acids Res. 1997;25:2960–2966.
  • Rosenblum, BB; Lee, LG; Spurgeon, SL; Khan, SH; Menchen, SM; Heiner, CR; Chen, S-M. New dye-labeled terminators for improved DNA sequencing patterns. Nucleic Acids Res. 1997;25:4500–4504.
  • Wilson, K. Preparation of genomic DNA from bacteria. In: Ausubel FM, Brent R, Kingston RE, Moore DD, Seidman JG, Smith JA, Struhl K, editors. Current protocols in molecular biology. Vol. 1. New York, NY: John Wiley & Sons; 1994. pp. 2.4.1–2.4.5.