NHGRI Announces New Sequencing Targets
Comprehensive Strategy Seeks to Identify Structural Variation in Human Genome
Bethesda, Maryland — The National Human Genome Research Institute (NHGRI),
one of the National Institutes of Health (NIH), today announced its latest round
of sequencing targets, with an emphasis on enhancing the understanding of how
human genes function and how genomic differences between individuals influence
health and the risk of disease.
The National Advisory Council for Human Genome Research, which is a federally
chartered committee that advises NHGRI on program priorities and goals, recently
approved three plans to specify the targets as part of its comprehensive strategy
for NHGRI’s Large-Scale Sequencing Research Network.
“The goal of our sequencing program is to build the most powerful toolbox possible
for advancing human health. By identifying and seeking to fill crucial gaps in
our knowledge, these new sequencing plans represent yet another important step
in that direction,” said NHGRI Director Francis S. Collins, M.D., Ph.D.
The plan given the highest priority is a project to identify structural variations
in the human genome, which will characterize the most common types of structural
variation in human DNA. The effort will use 48 human DNA samples donated for
the recently completed International HapMap Project, which produced a comprehensive
catalog of human genetic variation, or haplotypes, designed to speed the search
for genes involved in common diseases. The HapMap identified neighborhoods of
tiny changes in DNA — known as single nucleotide polymorphisms (SNPs) — that
can be involved in human disease. The structural variation effort will seek to
identify instances where larger segments of DNA have been deleted, duplicated
or rearranged — all of which can cause disease by disrupting the structure and
function of genes.
A recent analysis has shown that these large-scale structural variations are
much more common than previously appreciated. In fact, the genomes of any two
humans are thought to differ by several hundred insertions, deletions and inversions.
The second plan will add DNA sequence to existing draft sequences of a number
of primate species and provide additional sequence information in regions of high
biological interest within those genomes. The increased coverage — a high-density
genome sequence — will allow for an even better understanding of the factors
contributing to the evolution of the human genome. The primates chosen for this “index
species” effort are rhesus macaque (Macaca mulatta), marmoset (Callithrix jacchus)
and orangutan (Pongo pygmaeus). In the future, NHGRI intends to add other organisms
to the list of index species for which high-density genome sequences are desirable.
The third plan calls for sequencing the genomes of eight new mammals at low-density
draft coverage, meaning that each genome will be sequenced at two-fold
coverage. That will bring to 24 the number of mammalian genomes sequenced at
two-fold coverage, in addition to human and another seven mammalian genomes in
draft or finished form sequenced by NHGRI-supported centers and made freely available
in public databases. Scientists will use the combined data to look for features
that are similar, or conserved, among the genomes of the human and other mammals.
The eight new mammals to be sequenced will be chosen from the following 10
species: dolphin (Tursiops truncatus); elephant shrew (Elephantulus species);
flying lemur (Dermoptera species); mouse lemur (Microcebus murinus); horse (Equus
caballus); llama (Lama species); mole (Cryptomys species); pika (Ochotona species),
a cousin of the rabbit; kangaroo rat (Dipodomys species); and tarsier (Tarsius
species), an early primate and evolutionary cousin to monkeys, apes, and humans.
NHGRI will base the choice of the eight mammals to be sequenced on the availability
of high-quality DNA samples, the organisms’ promise as biomedical models, and
the presence of unique, innovative biological processes that may have contributed
to the human genome over the course of evolution.
Such comparisons between mammalian genomes represent one of the most effective
ways to pinpoint the roughly 5 percent of the 3-billion-base-pair human genome
that is most obviously functional. Computer modeling suggests that
comparisons among the 24 genome sequences will allow conserved
sequences as small as six base pairs to be identified reliably. Six base pairs
is roughly the size of a transcription factor binding site: a small DNA sequence
occurring near a gene that is involved in switching the gene on or off.
Sequencing efforts will be carried out by the NHGRI-supported Large-Scale
Sequencing Research Network, which consists of five centers: Agencourt Bioscience
Corp., Beverly, Mass.; Baylor College of Medicine, Houston; the Broad Institute
of MIT and Harvard, Cambridge, Mass.; the J. Craig Venter Institute, Rockville,
Md.; and Washington University School of Medicine, St. Louis. Assignment of each
organism to a specific center or centers will be determined at a later date.
NHGRI's process for selecting sequencing targets begins with three working
groups composed of experts from across the research community. Each of the working
groups is responsible for developing a proposal for a set of genomes to sequence
that would advance knowledge in one of three important scientific areas: identifying
areas in genetic research where the application of high-throughput sequencing
resources would rapidly lead to significant medical advances; understanding
the human genome; and understanding the evolutionary biology of genomes. A coordinating
committee then reviews the working groups' proposals, helping to fine-tune the
suggestions and integrate them into an overarching set of scientific priorities.
The recommendations of the coordinating committee are reviewed and approved by
one of NHGRI's advisory groups, the National Advisory Council for Human Genome
Research, which in turn forwards its recommendations to NHGRI leadership. For
more on the selection process, go to: www.genome.gov/Sequencing/OrganismSelection.
A complete list of organisms and their sequencing status can be viewed at www.genome.gov/10002154.
High-resolution photos of many of the organisms being sequenced in NHGRI’s Large-Scale
Sequencing Program are available at: www.genome.gov/10005141.
NHGRI is one of the 27 institutes and centers at NIH, an agency of the Department
of Health and Human Services. The NHGRI Division of Extramural Research supports
grants for research and for training and career development at sites nationwide.
Additional information about NHGRI can be found at its Web site, www.genome.gov.
The National Institutes of Health (NIH) — The Nation's Medical Research
Agency — includes 27 Institutes and Centers and is a component of
the U.S. Department of Health and Human Services. It is the primary federal
agency for conducting and supporting basic, clinical and translational medical
research, and it investigates the causes, treatments, and cures for both common
and rare diseases. For more information about NIH and its programs, visit http://www.nih.gov.