Scientific Publications

May 4, 2008
Genome sequencing and analysis of the biomass-degrading fungus Trichoderma reesei (syn. Hypocrea jecorina). (Nature Biotechnology) Trichoderma reesei is the main industrial source of cellulases and hemicellulases used to depolymerize biomass to simple sugars that are converted to chemical intermediates and biofuels, such as ethanol.

March 6, 2008
The genome of Laccaria bicolor provides insights into mycorrhizal symbiosis. (Nature) Mycorrhizal symbioses—the union of roots and soil fungi—are universal in terrestrial ecosystems and may have been fundamental to land colonization by plants.

February 14, 2008
The genome of the choanoflagellate Monosiga brevicollis and the origin of metazoans. (Nature) Choanoflagellates have long fascinated evolutionary biologists for their marked similarity to the 'feeding cells' (choanocytes) of sponges and the possibility that they might represent the closest living relatives of metazoans.

January 6, 2008
Ultraconservation identifies a small subset of extremely constrained developmental enhancers. (Nature Genetics) Extended perfect human-rodent sequence identity of at least 200 base pairs (ultraconservation) is potentially indicative of evolutionary or functional uniqueness.

December 13, 2007
The Physcomitrella Genome Reveals Evolutionary Insights into the Conquest of Land by Plants. (Science) We report the draft genome sequence of the model moss Physcomitrella patens and compare its features to those of flowering plants, from which it is separated by more than 400 million years, and unicellular aquatic algae.

November 22, 2007
Metagenomic and functional analysis of hindgut microbiota of a wood-feeding higher termite. (Nature) The first system-wide gene analysis of a microbial community specialized towards plant lignocellulose degradation.

October 18, 2007
Genome-Wide Experimental Determination of Barriers to Horizontal Gene Transfer. (Science Express) Our data suggest that toxicity to the host inhibited transfer regardless of the species of origin and that increased gene dosage and associated increased expression may be a predominant cause for transfer failure.

October 12, 2007
The Chlamydomonas Genome Reveals the Evolution of Key Animal and Plant Functions. (Science) Analyses of the Chlamydomonas genome advance our understanding of the ancestral eukaryotic cell, reveal previously unknown genes associated with photosynthetic and flagellar functions, and establish links between ciliopathy and the composition and function of flagella.

September 26, 2007
Deinococcus geothermalis: The Pool of Extreme Radiation Resistance Genes Shrinks. (PLoS One) . . . we report the whole-genome sequence of a second Deinococcus species, the thermophile Deinococcus geothermalis, which at its optimal growth temperature is as resistant to IR, UV and desiccation as D. radiodurans, and a comparative analysis of the two Deinococcus genomes.

September 4, 2007
Deletion of Ultraconserved Elements Yields Viable Mice. (PLoS Biology) Lines of mice lacking ultraconserved elements were viable and fertile, and failed to reveal any critical abnormalities when assayed for a variety of phenotypes including growth, longevity, pathology, and metabolism.

July 6, 2007
Sea Anemone Genome Reveals Ancestral Eumetazoan Gene Repertoire and Genomic Organization. (Science) Here, we report a comparative analysis of the draft genome of an emerging cnidarian model, the starlet sea anemone Nematostella vectensis.

June 11, 2007
Genome sequencing reveals complex secondary metabolome in the marine actinomycete Salinispora tropica. (PNAS) Our analysis shows that S. tropica dedicates a large percentage of its genome (~9.9%) to natural product assembly, which is greater than previous Streptomyces genome sequences as well as other natural product-producing actinomycetes.

May 1, 2007
The tiny eukaryote Ostreococcus provides genomic insights into the paradox of plankton speciation. (PNAS) The genome of Ostreococcus lucimarinus has been completed and compared with that of O. tauri. This comparison reveals surprising differences across orthologous chromosomes in the two species . . .

April 29, 2007
Use of simulated data sets to evaluate the fidelity of metagenomic processing methods. (Nature Methods) To evaluate methods presently used to process metagenomic sequences, we constructed three simulated data sets of varying complexity by combining sequencing reads randomly selected from 113 isolate genomes.

March 7, 2007
Strain-resolved community proteomics reveals recombining genomes of acidophilic bacteria. (Nature) Here we use community genomic data sets to identify, with strain specificity, expressed proteins from the dominant member of a genomically uncharacterized, natural, acidophilic biofilm.

March 4, 2007
Genome sequence of the lignocellulose-bioconverting and xylose-fermenting yeast Pichia stipitis. (Nature Biotechnology) Xylose is a major constituent of plant lignocellulose, and its fermentation is important for the bioconversion of plant biomass to fuels and chemicals. Pichia stipitis is a well-studied, native xylose-fermenting yeast.

February 25, 2007
Population-based resequencing of ANGPTL4 uncovers variations that reduce triglycerides and increase HDL. (Nature Genetics) Resequencing of ANGPTL4 in a multiethnic population allowed analysis of the phenotypic effects of both rare and common variants while taking advantage of genetic variation arising from ethnic differences in population history.

February 23, 2007
Quantitative Phylogenetic Assessment of Microbial Communities in Diverse Environments. (Science) We used a set of protein-coding marker genes, extracted from large-scale environmental shotgun sequencing data, to provide a more direct, quantitative, and accurate picture of community composition than that provided by traditional ribosomal RNA–based approaches depending on the polymerase chain reaction.

February 16, 2007
The Calyptogena magnifica Chemoautotrophic Symbiont Genome. (Science) The Calyptogena magnifica (Bivalvia: Vesicomyidae) symbiont, Candidatus Ruthia magnifica, is the first intracellular sulfur-oxidizing endosymbiont to have its genome sequenced, revealing a suite of metabolic capabilities.

February 1, 2007
Genome sequencing and analysis of the versatile cell factory Aspergillus niger CBS 513.88. (Nature Biotechnology) The filamentous fungus Aspergillus niger is widely exploited by the fermentation industry for the production of enzymes and organic acids, particularly citric acid. We sequenced the 33.9-megabase genome of A. niger CBS 513.88, the ancestor of currently used enzyme production strains.

November 3, 2006
Accelerated Evolution of Conserved Noncoding Sequences in Humans. (Science) We identified 992 conserved noncoding sequences (CNSs) with a significant excess of human-specific substitutions. These accelerated elements were disproportionately found near genes involved in neuronal cell adhesion.

October 9, 2006
Comparative Genomics of Lactic Acid Bacteria. (PNAS) Lactic acid-producing bacteria are associated with various plant and animal niches and play a key role in the production of fermented foods and beverages. We report nine genome sequences representing the phylogenetic and functional diversity of these bacteria.

September 25, 2006
Metagenomic analysis of two enhanced biological phosphorus removal (EBPR) sludge communities. (Nature Biotechnology) Enhanced biological phosphorus removal (EBPR) is one of the best-studied microbially mediated industrial processes because of its ecological and economic relevance. Despite this, it is not well understood at the metabolic level. Here we present a metagenomic analysis of two lab-scale EBPR sludges dominated by the uncultured bacterium, "Candidatus Accumulibacter phosphatis."

September 18, 2006
Symbiosis insights through metagenomic analysis of a microbial consortium. (Nature) Symbioses between bacteria and eukaryotes are ubiquitous, yet our understanding of the interactions driving these associations is hampered by our inability to cultivate most host-associated microbes. Here we use a metagenomic approach to describe four co-occurring symbionts from the marine oligochaete Olavius algarvensis.

September 15, 2006
The Genome of Black Cottonwood, Populus trichocarpa (Torr. & Gray) (Science) We report the draft genome of the black cottonwood tree, Populus trichocarpa. Integration of shotgun sequence assembly with genetic mapping enabled chromosome-scale reconstruction of the genome. More than 45,000 putative protein-coding genes were identified.

September 1, 2006
Phytophthora Genome Sequences Uncover Evolutionary Origins and Mechanisms of Pathogenesis (Science) Draft genome sequences have been determined for the soybean pathogen Phytophthora sojae and the sudden oak death pathogen Phytophthora ramorum . . . Comparison of the two species' genomes reveals a rapid expansion and diversification of many protein families associated with plant infection . . .

May, 2006
Computational analysis of the Phanerochaete chrysosporium v2.0 genome database and mass spectrometry identification of peptides in ligninolytic cultures reveal complex mixtures of secreted proteins (Fungal Genetics and Biology) The white-rot basidiomycete Phanerochaete chrysosporium employs extracellular enzymes to completely degrade the major polymers of wood: cellulose, hemicellulose, and lignin. Analysis of a total of 10,048 v2.1 gene models predicts 769 secreted proteins, a substantial increase over the 268 models identified in the earlier database . . .

September 12, 2005
Two Rounds of Whole Genome Duplication in the Ancestral Vertebrate (PLoS Biology) We reconstructed the evolutionary relationships of all gene families from the complete gene sets of a tunicate, fish, mouse, and human, and then determined when each gene duplicated relative to the evolutionary tree of the organisms. . . [T]heir global physical organization provides unmistakable evidence of two distinct genome duplication events early in vertebrate evolution.

July 2, 2005
The Phanerochaete chrysosporium secretome: Database predictions and initial mass spectrometry peptide identifications in cellulose-grown medium (Journal of Biotechnology) The white rot basidiomycete, Phanerochaete chrysosporium, employs an array of extracellular enzymes to completely degrade the major polymers of wood: cellulose, hemicellulose and lignin. Towards the identification of participating enzymes, 268 likely secreted proteins were predicted using SignalP and TargetP algorithms.

June 2, 2005
Genomic Sequencing of Pleistocene Cave Bears (Science Express) Despite the greater information content of genomic DNA, ancient DNA studies have largely been limited to amplification of mitochondrial sequences. We describe metagenomic libraries constructed using unamplified DNA extracted from skeletal remains of two 40,000-year-old extinct cave bears.

April 22, 2005
Comparative Metagenomics of Microbial Communities (Science) The identification of environment-specific genes through a gene-centric comparative analysis presents new opportunities for interpreting and diagnosing environments.

December 23, 2004
The sequence and analysis of duplication-rich human chromosome 16 (Nature) Human chromosome 16 features one of the highest levels of segmentally duplicated sequence among the human autosomes. We report here the 78,884,754 base pairs of finished chromosome 16 sequence, representing over 99.9% of its euchromatin.

October 21, 2004
Megabase deletions of gene deserts result in viable mice (Nature) The functional importance of the roughly 98% of mammalian genomes not corresponding to protein coding sequences remains largely undetermined. Here we show that some large-scale deletions of the non-coding DNA referred to as gene deserts can be well tolerated by an organism.

Finishing the euchromatic sequence of the human genome (Nature) In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage.

October 1, 2004
The Genome of the Diatom Thalassiosira Pseudonana: Ecology, Evolution, and Metabolism (Science) Diatoms are unicellular algae with plastids acquired by secondary endosymbiosis. They are responsible for ~20% of global carbon fixation. We report the 34 million–base pair draft nuclear genome of the marine diatom Thalassiosira pseudonana and its 129 thousand–base pair plastid and 44 thousand–base pair mitochondrial genomes.

September 15, 2004
The DNA sequence and comparative analysis of human chromosome 5 (Nature) Chromosome 5 is one of the largest human chromosomes and contains numerous intrachromosomal duplications, yet it has one of the lowest gene densities.

September 3, 2004
Reverse Methanogenesis: Testing the Hypothesis with Environmental Genomics (Science) Nearly all genes typically associated with methane production are present in one specific group of archaeal methanotrophs. These genome-based observations support previous hypotheses and provide an informed foundation for metabolic modeling of anaerobic methane oxidation.

May 2, 2004
Genome sequence of the lignocellulose degrading fungus Phanerochaete chrysosporium strain RP78 (Nature Biotechnology) The P. chrysosporium genome reveals an impressive array of genes encoding secreted oxidases, peroxidases and hydrolytic enzymes that cooperate in wood decay. Analysis of the genome data will enhance our understanding of lignocellulose degradation, a pivotal process in the global carbon cycle, and provide a framework for further development of bioprocesses for biomass utilization, organopollutant degradation and fiber bleaching. This genome provides a high quality draft sequence of a basidiomycete, a major fungal phylum that includes important plant and animal pathogens.

April 1, 2004
The DNA Sequence and Biology of Human Chromosome 19 (Nature) The finished human chromosome 19 sequence, comprising a gene density more than double the genome-wide average, marks the culmination of 18 years of research spanning the history of modern genomics.

February 1, 2004
Community Structure and Metabolism through Reconstruction of Microbial Genomes from the Environment (Nature AOP) Microbial communities are vital in the functioning of all ecosystems; however, most microorganisms are uncultivated, and their roles in natural systems are unclear. Here, using random shotgun sequencing of DNA from a natural acidophilic biofilm, we report reconstruction of near-complete genomes of Leptospirillum group II and Ferroplasma type II, and partial recovery of three other genomes.

October 17, 2003
Scanning Human Gene Deserts for Long-Range Enhancers (Science 302: 413) Approximately 25% of the genome consists of gene-poor regions greater than 500 kb, termed gene deserts. These segments have been minimally explored, and their functional significance remains elusive. One category of functional sequences postulated to lie in gene deserts is gene regulatory elements that have the ability to modulate gene expression over very long distances.

March 21, 2003
Hexapod Origins: Monophyletic or Paraphyletic? (Science 299: 1887-1889) Recent morphological and molecular evidence has changed interpretations of arthropod phylogeny and evolution. Here we compare complete mitochondrial genomes to show that Collembola, a wingless group traditionally considered as basal to all insects, appears instead to constitute a separate evolutionary lineage that branched much earlier than the separation of many crustaceans and insects and independently adapted to life on land. Therefore, the taxon Hexapoda, as commonly defined to include all six-legged arthropods, is not monophyletic.

February 28, 2003
Phylogenetic Shadowing of Primate Sequences to Find Functional Regions of the Human Genome (Science 299: 1391-1394) Nonhuman primates represent the most relevant model organisms to understand the biology of Homo sapiens. The recent divergence and associated overall sequence conservation between individual members of this taxon have nonetheless largely precluded the use of primates in comparative sequence studies.

December 12, 2002
The Draft Genome of Ciona intestinalis: Insights into Chordate and Vertebrate Origins (Science 298: 2157-2167) The first chordates appear in the fossil record at the time of the Cambrian explosion, nearly 550 million years ago. The modern ascidian tadpole represents a plausible approximation to these ancestral chordates.

August 23, 2002
Whole-Genome Shotgun Assembly and Analysis of the Genome of Fugu rubripes (Science 297: 1301-1310) The compact genome of Fugu rubripes has been sequenced to over 95% coverage, and more than 80% of the assembly is in multigene-sized scaffolds.

July 6, 2001
Human Chromosome 19 and Related Regions in Mouse: Conservative and Lineage-Specific Evolution (Science 293: 104-111) The JGI team's analysis permitted more than 1200 HSA19 genes to be verified or defined and revealed clues to the evolutionary history of this gene-rich human chromosome. This first chromosome-wide comparative sequencing study also provides a preview of how the whole mouse genome sequence will aid in discovery of genes and other functional DNA sequences throughout all 23 pairs of human chromosomes.

February 15, 2001
Initial sequencing and analysis of the human genome. (Nature 409: 860-921) The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.