NHGRI Announces Latest Sequencing Targets
Bethesda, Maryland — The National Human Genome Research
Institute (NHGRI), one of the National Institutes of Health (NIH), today announced
several new sequencing targets including the Northern white-cheeked gibbon (Nomascus
leucogenys), setting the stage for completing a quest to sequence the genome
of at least one non-human primate from each of the major positions along
the evolutionary primate tree and making available an essential resource for
researchers unraveling the genetic factors involved in human health and disease.
Comparing the genomes of other species to that of humans is an exceptionally powerful
tool to help researchers understand the working parts of the human genome in
both health and illness.
NHGRI’s Large-Scale Sequencing Research Network and its international partners
have already sequenced or been approved to sequence at high-density coverage
the genomes of several non-human primates including the chimpanzee (Pan troglodytes),
rhesus macaque (Macaca mulatta), orangutan (Pongo pygmaeus),
marmoset (Callithrix jacchus) and gorilla (Gorilla gorilla).
“The gibbon genome sequence will provide researchers with crucial information
when comparing it to the human genome sequence and other primate genomes, shedding
light on molecular mechanisms implicated in human health and disease — from
infectious diseases and neurological disorders to mental illness and cancer,” said
NHGRI Director Francis S. Collins, M.D., Ph.D.
The gibbon genome is unique because it carries an extraordinarily high number
of chromosome rearrangements, even when compared to other primates. These rearrangements
occur when small or large segments of a chromosome become detached and reattach
to the same chromosome or another chromosome. Such chromosomal rearrangements
can wreak havoc on a cell, and can contribute to birth defects or cancer in humans.
The gibbon genome will also help scientists better understand rearrangements
called segmental duplications, which are large, almost identical copies of DNA
present in at least two locations in the human genome. A number of diseases are
known to be associated with mutations in segmentally duplicated regions, including
a form of mental retardation and other neurological and birth defects.
Segmental duplications cover 5.3 percent of the human genome, significantly
more than in the rat genome, which has about 3 percent, or the mouse genome,
which has between 1 and 2 percent. Segmental duplications provide a window into
understanding how the human genome evolved and how it may still be changing.
The high proportion of segmental duplications in the human genome shows how human
genes have undergone rapid functional innovation and structural change during
the last 40 million years, presumably contributing to unique characteristics
that separate humans from non-human primate ancestors.
With the sequencing of major primate genomes, researchers are able to more
precisely study the differences between primates and humans. For instance, an
analysis of the chimpanzee genome sequence has revealed that three key genes involved
in inflammation have been deleted in the chimpanzee genome, possibly explaining
some of the known differences between immune and inflammatory responses of chimps
and humans. Identifying these genes gives researchers a more precise starting
point for understanding molecular pathways and developing better diagnostics
and therapies for immune and inflammatory diseases.
In addition, some primates are important biomedical models because of their
genetic, physiologic and metabolic similarities with humans. For example, the
rhesus macaque is an essential research model for drug development, neuroscience,
behavioral biology, reproductive physiology, endocrinology, and cardiovascular
studies. In addition, because it can be infected with simian immunodeficiency
virus, a close cousin to the human immunodeficiency virus (HIV), the rhesus is
widely recognized as the best animal model for research on Acquired Immune Deficiency
Syndrome, or AIDS. It also serves as a valuable model for studying other human
infectious diseases and for vaccine research, most recently for the virus causing
Severe Acute Respiratory Syndrome, or SARS.
Comparing the human genome with the genomes of non-human primates and
other organisms has been shown to be an effective tool for identifying the function
and structure of genes. Most sections of the human genome originated long before
humans themselves. Consequently, scientists can use genome sequences of strategically
selected organisms to learn more about how, when and why the genomes of humans
and other mammals came to be composed of certain DNA sequences.
The latest sequencing plan, which includes the gibbon, was recently approved
by the National Advisory Council for Human Genome Research, a federally chartered
committee that advises NHGRI on program priorities and goals. The plan also includes
a set of organisms whose genome sequences will add to the comprehensive strategic
list of priority targets for genomic sequencing by NHGRI’s Large-Scale Sequencing
program.
Seven mammals that were previously approved for sequencing at low-density
genome coverage will now be sequenced at high-density genome
coverage. The refined genome sequences will improve the accuracy of comparisons
between mammalian genomes, one of the most effective ways to pinpoint the roughly
5 percent of the 3-billion base pair human genome that is most obviously functional.
The seven mammals to be sequenced are: the nine-banded armadillo (Dasypus
novemcinctus); domestic cat (Felis catus); guinea pig (Cavia
porcellus); African savannah elephant (Loxodonta africana); tree
shrew (Tupaia species); rabbit (Oryctolagus cuniculus); and
a bat species that will be determined based on the availability of a high-quality
DNA sample and the selected bat’s promise as a biomedical model. NHGRI has
recently approved the sequencing of the horse (Equus caballus) to
high-density genome coverage.
A set of five fungi known as dermatophytes, which are the most common sources
of human fungal disease, will also have their genomes sequenced. Dermatophyte
fungi are highly communicable and infect millions of people worldwide, leading
to costs of approximately $400 million a year for treatment alone. The dermatophytes
to be sequenced are Trichophyton rubrum, Microsporum canis and Microsporum
gypseum, all of which will be sequenced to high-density genome coverage;
and Trichophyton tonsurans and Trichophyton equinum, both of
which will be sequenced to medium-density genome coverage. Scientists then
will be able to compare the genome sequence information from these organisms
to determine which genes are responsible for the differences in infectivity.
Those genes will be logical starting points for developing more effective diagnostic,
prevention and treatment approaches to fungal infections in both humans and animals.
Also selected in the latest round is a project to sequence up to 50 strains
of the yeast Saccharomyces cerevisiae. The genome of Saccharomyces
cerevisiae was first completed in 1996 and is a primary model for studying
variations in genomes that can contribute to health and disease. The genomic
data provided by this effort will allow researchers to develop basic tools to
better understand human variation, such as distinguishing functional from non-functional
variations within genes.
A final set of sequencing targets was chosen to address the question: What
genes and other genomic features were responsible for the origin of multi-celled
organisms? More than 1 billion years ago, two of the major multi-cellular groups
of organisms (fungi and animals) shared a single-celled ancestor. This project
targets ten of the earliest branches of animals and fungi along with some of
their single-celled relatives, providing, for the first time, comprehensive data
to fill gaps in our understanding of animal and fungal evolution. Recent research
has shown that some genes in the human genome that are responsible for early
animal development arose much earlier than thought, in some cases in single-celled
organisms. Therefore, this set of ten targets is likely to reveal the origins
of other genes important for multi-cellularity in all such animals, including
humans. The ten targets, all of which involve relatively small genomes, include
six to be sequenced at high-density genome coverage: Capsaspora owczarzaki; Sphaeroforma
arctica; an Amastigomonas species; a Salpingoeca or Codosiga
species; Allomyces macrogynus; and Nuclearia simplex; and
four to be sequenced at low-density genome coverage: Amoebidium parasiticum; Mortierella
verticillata; Spizellomyces punctatus; and a Stephanoeca or Acanthocoepsis
species.
NHGRI’s Large-Scale Sequencing Research Network also includes a portfolio of
medical sequencing projects. These projects are designed to use high-throughput
sequencing resources to lead to significant medical advances. As more is learned
from sequencing and other studies about the genomic contribution to disease,
and as the cost of obtaining sequence information decreases, genomic sequence
information will become ever more important both for medical research and for
providing medically relevant information to individuals. When it becomes affordable
for an individual's genome to be fully sequenced, genomic information will allow
estimates of future disease risk for individuals, as well as improve prevention,
diagnosis, and treatment.
Projects given the highest priority will use large-scale sequencing over the
next few years to identify the genes responsible for dozens of relatively rare,
single-gene (autosomal Mendelian) diseases; sequence all of the genes on the
X chromosome from affected individuals to identify those involved in sex-linked
diseases; and survey the range of variants in genes known to contribute to
some common diseases.
An example of a medical sequencing project launched last year is The Cancer
Genome Atlas (TCGA) pilot project, a groundbreaking effort between NHGRI and
the National Cancer Institute that seeks to systematically characterize the genetic
changes that occur in cancer. Information on TCGA is available at http://cancergenome.nih.gov.
Sequencing work on approved targets is carried out by the NHGRI-supported
Large-Scale Sequencing Research Network, which consists of five centers: Agencourt
Bioscience Corp., Beverly, Mass.; Baylor College of Medicine, Houston; the Broad
Institute of MIT and Harvard, Cambridge, Mass.; the J. Craig Venter Institute,
Rockville, Md.; and Washington University School of Medicine, St. Louis. Assignment
of new organisms to a specific center or centers will be determined at a later
date.
NHGRI's process for selecting sequencing targets begins with three working
groups composed of experts from across the research community. Each of the working
groups is responsible for developing a proposal for a set of genomes to sequence
that would advance knowledge in one of three important scientific areas: identifying
areas in genetic research where the application of high-throughput sequencing
resources would rapidly lead to significant medical advances; understanding
the human genome; and understanding the evolutionary biology of genomes. A coordinating
committee then reviews the working groups' proposals, helping to fine-tune the
suggestions and integrate them into an overarching set of scientific priorities.
The recommendations of the coordinating committee are reviewed and approved by
one of NHGRI's advisory groups, the National Advisory Council for Human Genome
Research, which in turn forwards its recommendations to NHGRI leadership. For
more on the selection process, go to: www.genome.gov/Sequencing/OrganismSelection.
A complete list of organisms and their sequencing status can be viewed at www.genome.gov/10002154.
High-resolution photos of many of the organisms being sequenced in NHGRI’s Large-Scale
Sequencing Program are available at: www.genome.gov/10005141.
The NHGRI’s Division of Extramural Research supports grants for research
and for training and career development at sites nationwide. Additional information
about NHGRI can be found at its Web site, www.genome.gov.
The National Institutes of Health (NIH) — The Nation's Medical Research
Agency — includes 27 Institutes and Centers and is a component of
the U.S. Department of Health and Human Services. It is the primary federal
agency for conducting and supporting basic, clinical and translational medical
research, and it investigates the causes, treatments, and cures for both common
and rare diseases. For more information about NIH and its programs, visit www.nih.gov.