BMAP: Reports and Publications

Setting Priorities for Molecular Neuroanatomy
in the Postgenomic Era

Summary and recommendations from a workshop organized under the auspices of the NIMH, NIDA, and NINDS at Laguna Beach, CA in January 2002

Marc Tessier-Lavigne and Lubert Stryer, organizers

Part I: Overview and Summary

Opportunities and Challenges
The brain is the most complex organ in our body, responsible for perception, behavior, cognition, memory, and consciousness. It is composed of about a trillion nerve cells, or neurons, whose intricate and precisely wired connections underlie all of these functions. But the neurons are not all the same: many thousands of different classes of neurons, defined by a variety of criteria such as morphology, patterns of connectivity, and expression of particular neurotransmitters and receptors, serve as the cellular building blocks of the brain. Each of these neuronal types has a specific physiological role in brain function. Neurological and psychiatric diseases are diseases of particular neurons or neural circuits.

The complexity of brain cell types and circuits is reflected in the complexity of gene expression patterns in the brain. It is believed that perhaps a third to half of all genes are largely or exclusively dedicated to directing the development, maintenance and functioning of the brain. With more than 30,000 genes in the human genome, the task of mapping all genes to the many thousands of neuronal classes and neuronal circuits - what is being called molecular neuroanatomy - might seem beyond reach. In fact, this mapping is taking place. It is proving to be remarkably informative and illuminating despite being performed on a relatively small scale thus far, with about a thousand genes mapped, and some mapped only in particular subregions of the brain. Such studies have shown that analysis of gene expression in neurons can yield essential information on neural development and function, such as the identity of neurons involved in responses to particular drugs, or the genes that control the development of particular classes of neurons. Such analysis has also defined molecular markers of particular neuronal cell types, helping with the taxonomy of brain cell types - the division of known neuronal classes into further subclasses. Especially important, the availability of cell type-specific molecular markers for particular neuronal classes has provided tools to deliver genes and gene products to those neurons (in ways discussed below), dramatically facilitating the analysis of their development, connectivity, function, and dysfunction.

The potential medical benefits that will derive from this knowledge are immense. The identification of genes expressed in particular classes of neurons linked to specific diseases provides new drug targets for the treatment of a wide range of ailments including stroke, spinal cord injury, neurodegenerative diseases like Parkinson's disease, brain tumors, schizophrenia, depression, anxiety disorders, and addiction. Neuronal cell type-specific markers provide a means for developing gene therapies that involve changing gene expression in particular neurons. They also make it possible to visualize neural circuits in their normal and abnormal states, which is likely to have a large impact on the diagnosis of disease and the evaluation of the effectiveness of therapy. The identification of transcription factors that control cell fate and connectivity in the brain will accelerate the development of therapies to regenerate nervous tissue.

In the 1990s, several Institutes of the National Institutes of Health, recognizing the importance and promise of molecular neuroanatomy for public health, launched the Brain Molecular Anatomy Project (BMAP), a series of funding initiatives to map gene and gene product expression to neuronal cell types to create a Molecular Brain Map. In January 2002, a workshop was convened by NIMH, NINDS, and NIDA to bring together experts in the field (Table 1) to take stock of existing efforts, formulate recommendations for upcoming work in this field, and help establish scientific priorities, especially in light of the exceptional opportunities created by the completion of the Human Genome Project and the identification of nearly all genes in the genome.
A need to generate rapidly a Molecular Brain Map
A consensus view of the working group, reiterating the premise of BMAP, is that enormous benefit will derive from a systematic, large-scale, and organized effort to generate a Molecular Brain Map for humans and the mouse. The utility of a systematic effort is well illustrated by the Human Genome Project. Prior to the Project, the genome was already being sequenced in a piecemeal fashion by thousands of investigators world-wide, but these laboratory-based initiatives involved considerable redundancy as well as inefficiencies because of their small scale. An organized effort to sequence the human genome - and that of other species - made it possible to achieve an enormous economy of scale and to complete the sequencing of genomes much more rapidly, thereby empowering the entire biomedical community and greatly accelerating the pace of discovery of new knowledge and novel therapeutics. In the same way, the creation of a Molecular Brain Map would eventually result from the independent activities of individual investigators, but an economy of scale can be achieved through more systematic efforts, like those already supported by the NIH (see below). The acceleration of this process will in turn accelerate the pace of discovery in the neurosciences, neurology, and psychiatry.
Summary of recommendations
The working group achieved consensus on the following principles, many of which have already been incorporated into existing efforts at the National Institutes of Health.
1. It affirmed the importance of driving to completion the generation of a Molecular Brain Map, a tool that will revolutionize the study of both normal brain function and development, as well as neurological and psychiatric disease.
2. An impressive start has been made in initiating several approaches to accomplish this goal, but the pace needs to be markedly accelerated. At the next stage, the scope should also be broadened: It is essential that functional circuits as well as individual neurons be mapped, i.e. the Gene Expression Map needs to be supplemented with a Connectivity Map. Both efforts would benefit at this stage from the simultaneous pursuit of a broad survey of expression of all genes, and an in-depth focus on particular model neural circuits.
3. It affirmed the need to set standards for the generation, acquisition, mining and sharing of data in this Map, to permit its efficient construction and utilization.
4. It catalogued a set of enabling reagents, datasets, and complementary technologies that need to be developed to construct this Map efficiently and to exploit it to the fullest for the study of brain function and dysfunction (Tables 2 and 3).
5. It affirmed the importance of full public access to the information, reagents, and technologies that will be generated in this initiative, to leverage these resources for the advancement of knowledge and for biomedical progress.
In Parts II-IV of this report, we provide background information on molecular neuroanatomy, including a description of existing large-scale initiatives, and discuss emerging opportunities. In Part V, we discuss in detail the recommendations and priorities formulated by the participants that were summarized above.

Table 1: Workshop Participants

David J. Anderson, Ph.D.
California Institute of Technology

Carolee Barlow, M.D., Ph.D.
The Salk Institute for Biological Studies

Sydney Brenner, Ph.D.
The Salk Institute for Biological Studies

Catherine Dulac, Ph.D.
Harvard University

Gregor Eichele, Ph.D.
Baylor College of Medicine

Scott Fraser, Ph.D.
California Institute of Technology

Jeffrey M. Friedman, M.D., Ph.D.
The Rockefeller University

Fred H. Gage, Ph.D.
The Salk Institute for Biological Studies

Paul Greengard, Ph.D.
The Rockefeller University

Bruce Hamilton, Ph.D.
University of California, San Diego, School of Medicine

Mary-Beth Hatten, Ph.D.
The Rockefeller University

Nathaniel Heintz, Ph.D.
The Rockefeller University

Ali Hemmati-Brivanlou, Ph.D.
The Rockefeller University

René Hen, Ph.D.
Columbia University College of Physicians and Surgeons

Tom Jessell, Ph.D.
Columbia University College of Physicians and Surgeons

Alexandra L. Joyner, Ph.D.
New York University Medical Center

Eric Kandel, M.D.
Columbia University College of Physicians and Surgeons

Larry Katz, Ph.D.
Duke University Medical Center

Stuart Kim, Ph.D.
Stanford University School of Medicine

Alex Kolodkin, Ph.D.
Johns Hopkins University
School of Medicine

George Lake, Ph.D.
Institute for Systems Biology, Seattle

Pat R. Levitt, Ph.D.
University of Pittsburgh School of Medicine

David Lockhart, Ph.D.
The Salk Institute for Biological Studies

Robert C. Malenka, M.D., Ph.D.
Stanford University School of Medicine

Susan K. McConnell, Ph.D.
Stanford University

Dennis O'Leary, Ph.D.
The Salk Institute for Biological Studies

John Rubenstein, M.D., Ph.D.
University of California, San Francisco

Edward Scolnick, M.D.
Merck Research Laboratories

Lubert Stryer, M.D.
Stanford University

Michael P. Stryker, Ph.D.
University of California, San Francisco

Joseph S. Takahashi, Ph.D.
Northwestern University

Marc Tessier-Lavigne, Ph.D.
Stanford University

Roger Y. Tsien, Ph.D.
University of California, San Diego

Arthur Toga, Ph.D.
University of California, Los Angeles

Kamil Ugurbil, Ph.D.
University of Minnesota

Chris A. Walsh, M.D., Ph.D.
Harvard Medical School
BIDMC Room 816
Harvard Institutes of Medicine

Richard Woychik, Ph.D.
Lynx Therapeutics, Inc.

Charles Zuker, Ph.D.
University of California, San Diego

NIH Program Staff

James Battey, M.D., Ph.D.
National Institute on Deafness and Other Communication Disorders

Robert Baughman, Ph.D.
National Institute of Neurological Disorders and Stroke

Hemin Chin, Ph.D.
National Institute of Mental Health

Stephen Foote, Ph.D.
National Institute of Mental Health

Glen Hanson, Ph.D., D.D.S.
National Institute on Drug Abuse

Michael Huerta, Ph.D.
National Institute of Mental Health

Michael Iadarola, Ph.D.
National Institute of Dental and Craniofacial Research

Gabrielle Leblanc, Ph.D.
National Institute of Neurological Disorders and Stroke

Steven O. Moldin, Ph.D.
National Institute of Mental Health

Bret Peterson, Ph.D.
National Center for Research Resources

Jonathan Pollock, Ph.D.
National Institute on Drug Abuse

Brad Wise, Ph.D.
National Institute on Aging

Graeme Wistow, Ph.D.
National Eye Institute

Observer

John H. Williams, Ph.D.
The Wellcome Trust

Table 2: Enabling Reagents and Datasets

A dataset of full-length transcripts/cDNAs expressed in the nervous system; this dataset is well on its way to completion.
A bank of Bacterial Artificial Chromosomes that permit transgenic labeling and manipulation of major neuronal populations in the mouse brain; only a few hundred of these exist today.
A bank of short promoter elements for extending such manipulations to primates and other non-genetic systems; few of these exist at present.
A set of antibody probes (for immunohistochemistry) that help extend and leverage the Molecular Brain Map. A priority is the generation of antibodies to the ~1,500 transcription factors encoded in the genome, to help identify neuronal cell types and stem cells; only a small fraction (~10%) of these exist at present.

Table 3: Complementary Technologies

Short term:

Improved methods for isolation of high-quality mRNA from small numbers of neurons, including laser-capturing of mRNA from neurons identified through expression of a transgenic reporter (such as GFP), and for faithful linear amplification of cDNA from the mRNA for gene expression analysis

Medium term and longer term:
Improved methods for mapping neuronal circuits (identifying all inputs to each neuron and all outputs from each neuron)
Methods for detecting electrical activity in mammalian neurons by optical recording using genetically-encoded reporters
Methods for controlling activity in defined neurons, in particular genetically-encoded modulators of electrical activity that can be activated with specific signals such as pharmacological agents or light
Methods for detecting neuronal activity in deep brain structures using such genetically-encoded reporters
Methods for mapping patterns of neuronal activity onto patterns of gene expression and neuronal interconnectivity
Methods for persistent labeling of neurons over time, to follow plastic changes in morphology, connections, or function
Methods for visualizing changes in neuronal connections, such as pulse-chase labeling of synapses

Part II: Background: Gene-Based and Cell-Based Approaches to Creating a Map

Historical background
Using histological tools like the Golgi staining method, neuroanatomists in the late 19th and early 20th centuries defined hundreds of different neuronal cell types based on their location in the brain, morphology, and connectivity. Throughout the 20th century, this number grew, and neuroscientists were also able to assign particular physiological functions to a large number of these neuronal types and many of the circuits linking these cells. At present, the number of distinct neuronal cell types is not precisely known, but it is estimated that there must be at least several thousand. This estimate is supported by the finding of novel subclasses whenever a particular population of neurons is studied in detail. For example, a recent assessment of amacrine cell subtypes in the retina, previously thought to number about a dozen, revealed the existence of more than thirty easily definable subtypes. These thousands of cell types, through their specific patterns of interconnections, direct the functioning of the brain.

Subpopulations of neurons are increasingly being distinguished based on their pattern of gene expression. Historically, molecular neuroanatomy started with cellular pharmacology - the definition of the neurotransmitters and neurotransmitter receptors made by particular neurons, using a variety of physiological, in situ radioligand binding, and histochemical methods. The development of the technique of immunohistochemistry accelerated the mapping of proteins (such as neuropeptides) and other epitopes (such as specific carbohydrate moieties) to particular neurons in cases where a suitable antibody was available. However, it is the development of sensitive and reproducible mRNA in situ hybridization techniques that unleashed the systematic analysis of gene expression in neurons, because this approach can be readily applied to all genes without requiring the labor-intensive development of specific detection reagents (such as antibodies); in all cases, a small gene fragment is sufficient to develop an appropriate in situ hybridization probe. In situ hybridization methods have been supplemented by transgenic (promoter-based and BAC-based) and knock-in approaches, which make it possible to visualize the pattern of expression of particular genes using genetically encoded reporters driven from the gene locus in transgenic mouse lines. (These approaches are explained further below.)

Gene- (and Gene Product-) Based vs. Cell- (and Region-) Based Approaches to Gene Expression Profiling

Existing powerful methods for mapping gene and gene product expression are summarized in Table 4, and are divided into gene- (and gene product) based and cell- (and region) based approaches. In situ hybridization is an example of a gene-based method for gene expression analysis. Such methods make it possible to detect a single gene product (e.g. a particular mRNA) in a very large number of cells in brain slices.

A complementary approach is, however, provided by cell- (and region-) based methods of analysis. Such methods involve isolating small brain regions or, in the limit, specific neuronal populations or even single cells, extracting mRNA from these cells, and subjecting them to gene expression analysis using DNA arrays or other methods (such as direct cDNA library sequencing). These methods make it possible to detect a very large number (thousands) of gene products in a small number of cells.

Table 4: Methods to Map Genes and Gene Products to Neurons

A. Gene- (and Gene-Product) Based:

Method	Limitations/Limiting Steps	Current Applicability
Histochemical	Requires optimized stain	
Radioligand binding	Requires optimized ligand	
Immunohistochemistry	Requires antibodies	
Transgenic: promoter-based	Requires promoters	
Transgenic: BAC-based or knock-in	Limited by generation of constructs, ES cells, mice	Hundreds to thousands of genes
In situ hybridization	Limited by tissue sectioning	Thousands to tens of thousands of genes

B. Cell- (and Region) Based:

Methods for isolating cells and mRNA:
Dissection (for small or large region)
Cell marking and purification (e.g. by Fluorescence Activated Cell Sorting)
Single cell picking
Laser-capture microdissection

Methods for analyzing gene expression:
DNA arrays
cDNA library sequencing
Other (SAGE, MPSS, etc.)

Gene-based and cell-based approaches are complementary, with different advantages and disadvantages.

Gene- (and gene-product) based approaches have the advantage of providing a broad overview of any particular gene throughout the brain. Of these methods, in situ hybridization has the highest throughput. Its disadvantage can be limited cellular resolution, particularly since only the cell body (or part of the cell body) is labeled. In some instances, it is possible to tell within different brain regions exactly which cells are expressing the gene, for example when the gene is found in cells with a particular morphological feature (such as a large cell body) that makes it possible to distinguish them from other cells in the vicinity. In many cases, however, it is not possible to assign the expressed gene to a specific cell. As a consequence, when comparing two different genes expressed in the same region, it is often not possible to know whether the two are expressed by the same or different cells, unless the expression analysis is performed simultaneously with two probes that can be visualized independently (e.g. using two fluorescent tags). Such double-labeling technologies are being improved but are still limiting at present. Better cellular resolution can be obtained by other gene- or gene-product based methods that result in labeling of the neuron's processes, since shape and connectivity often identify particular neurons. This can be achieved, for instance, with transgenic approaches discussed in the next section. Thus, although such approaches have a lower throughput than in situ hybridization, they help provide greater resolution.
Cell- (and region-) based approaches have a complementary set of advantages and disadvantages. In these approaches, the entire pattern of genes expressed by a particular cell (or population of like cells) can be identified by extracting mRNA from the cell(s) and subjecting it to analysis using DNA arrays (or through some other method such as direct cDNA library sequencing). These technologies make it possible to determine the expression profile of a small number of cells, and in some cases, even a single cell. What limits these approaches is the difficulty of obtaining mRNA from just the cells of interest. In experimental animals such as mice, specific neuronal populations can often be isolated and purified (see below) and used as a source of mRNA, but this approach is of limited utility in the human brain. An alternative method is laser capture microdissection, in which mRNA is directly isolated from specific cells identified in fixed brain sections. Laser-capture is thus more applicable to human tissue, but the method is still being perfected (for instance, the captured mRNA can be degraded, a technical problem that is likely to be resolved before long). It is also limited by the ability to identify the neurons of interest in tissue sections (discussed in more detail below).

Thus, gene-based approaches give broad coverage of all brain regions but often with more limited cellular resolution. Cell-based approaches can provide high cellular resolution and give information on the complete set of transcripts expressed by a cell, but their throughput is lower and they are more difficult to apply to the human brain. Both types of approaches are needed.
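
One way to picture this complementarity is as two views of the same table of genes by cell types: a gene-based method fills in one row (one gene across many cells), while a cell-based method fills in one column (many genes in one cell population). The sketch below is purely illustrative; the gene names, cell-type labels, and values are hypothetical placeholders, not data from the workshop.

```python
# Hypothetical toy table: expression[gene][cell_type] -> detected (True/False).
# All gene and cell-type names are made-up placeholders.
expression = {
    "geneA": {"type_1": True, "type_2": False, "type_3": False},
    "geneB": {"type_1": True, "type_2": True, "type_3": False},
    "geneC": {"type_1": False, "type_2": False, "type_3": True},
}

def gene_based_view(table, gene):
    """One gene across all cell types (the kind of result an in situ survey gives)."""
    return sorted(cell for cell, on in table[gene].items() if on)

def cell_based_view(table, cell_type):
    """All genes detected in one cell type (the kind of result array profiling gives)."""
    return sorted(gene for gene, row in table.items() if row.get(cell_type, False))

print(gene_based_view(expression, "geneB"))   # ['type_1', 'type_2']
print(cell_based_view(expression, "type_1"))  # ['geneA', 'geneB']
```

Cross-referencing the two views is what allows a gene seen in a region by in situ hybridization to be assigned to a specific cell population profiled by a cell-based method.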

Tools to Deliver Genes to Particular Neurons Facilitate Cell-Based Approaches
The utility of cell-based approaches is limited by the ability to isolate and to purify particular neuronal populations (for approaches based on cell isolation), or to recognize particular populations in tissue sections (in the case of laser capture microdissection). Some select populations of neurons can be purified based on physical characteristics, such as size, through the use of probes (such as antibodies) directed against particular cell surface epitopes, or by using fluorescent markers injected into the termination sites of the neurons' axons, which are taken up by the axons and retrogradely transported. However, these approaches are currently applicable only to small numbers of neurons.
More general approaches to isolating particular populations of neurons involve transgenic methods in which expression of a genetically-encoded reporter, such as the Green Fluorescent Protein (GFP) or some other molecular tag, is driven in the neurons of interest in transgenic mice. The neurons can then be recognized and purified by a method that makes use of the reporter tag, e.g. by Fluorescence Activated Cell Sorting using GFP fluorescence. (In principle, the molecular tagging of neurons should also permit their identification in tissue sections for laser-capture microdissection; current laser-capture methods are not readily compatible with such molecular tagging, although it can be expected that this technical problem will be solved before long.)
The transgenic labeling approaches require the ability to drive reporter expression in the cells. This is usually done by first identifying a particular marker gene expressed in the neurons of interest; the marker must be specific for those cells - at least in a particular region of brain that can be isolated through dissection from any other regions where the marker is expressed. Once an adequate marker gene is identified, the generation of a transgenic mouse in which the reporter is expressed from the marker gene locus is currently achieved primarily by three methods.
1. Short promoters. In the case of some marker genes, it is possible to isolate a promoter element (stretch of DNA) that directs expression of the reporter. One disadvantage of this approach is that it is often labor-intensive and difficult to identify the promoter. Another is that it can often be difficult to isolate a transgenic mouse line in which the reporter, driven by the promoter, gives faithful expression, because of position effects on integration of the reporter construct in the host genome. One advantage, however, is that small promoters can be used in viral delivery systems, which can be used more readily in organisms other than mice (including primates).
2. Bacterial Artificial Chromosomes (BACs). As an alternative, large (>50 kb) bacterial artificial chromosomes (BACs) containing the gene of interest as well as its control regions can be modified to drive expression of a reporter construct in a pattern that mirrors that of the starting gene when reintroduced into transgenic animals. BACs have the disadvantage that they are too large for use in viral vectors. However, in transgenic mice, BACs have some advantages over small promoter elements: they are readily isolated, can contain all the gene regulatory elements, and because of their size are usually less subject to position effects on integration. In principle, BACs can also be used in other species that are amenable to the generation of transgenic animals (such as rats).
3. Knock-ins. Another means of delivering reporters to particular cell types is through knock-in technology in mice. In this approach, a reporter is introduced into a particular marker gene locus by homologous recombination in embryonic stem (ES) cells, and transgenic mice are generated from the ES cells. This approach is applicable only in species where knock-ins are possible, primarily mice. One advantage of this over other transgenic approaches is that the marker gene can be inactivated during the procedure, so that gene function can be assessed. Traditionally, a disadvantage has been that making knock-ins has been labor intensive, but this has been improving.
The ability to deliver constructs selectively to particular neuronal populations is so important to generating and exploiting a Molecular Brain Map that a high priority should be assigned to generating the tools (such as BACs and promoter elements) that will make this possible (Table 2).
Tools to deliver genes to particular neuronal populations facilitate tracing neuronal connections, and monitoring and manipulating neuronal function
In addition to allowing the marking and isolation of particular populations of neurons, these tools make it possible to deliver other transgenic constructs to the neurons, which facilitates identifying patterns of neuronal connections, monitoring and manipulating electrical activity, and manipulating neuronal function.
1. Mapping connectivity. Traditionally, the connections a neuron makes have been identified using electrophysiological recordings, histochemical techniques (such as the Golgi technique), or anterograde or retrograde tracers (such as Horseradish Peroxidase (HRP)) that make it possible to visualize the patterns of projections of individual neurons. These methods are laborious and in many cases applicable only to small populations of cells. The ability to drive expression of reporter genes in particular neuronal populations makes it possible to readily trace their connections by delivering genetically encoded reporters of neuronal projections, such as beta-galactosidase, alkaline phosphatase or GFP (or modified versions of such proteins that are readily transported down axons), which allow simple histochemical or fluorescent labeling of axons of neurons making the reporter. In addition, much effort is currently being devoted to devising genetically-encoded tracers that are transported across synapses and hence permit transsynaptic tracing of connections, both anterogradely and retrogradely, allowing patterns of connectivity to be defined.
2. Monitoring and manipulating neuronal activity. The labeling of specific neuronal populations with genetically encoded fluorescent markers like GFP makes it possible to identify these neurons in slices or cultures of brain tissue, allowing for electrical recordings with microelectrodes on these neurons. Potentially even more powerful are genetically encoded reporters of electrical activity. At present, genetically encoded reporters of cellular properties like intracellular calcium concentration can serve as indirect measures of electrical activity. Proteins whose optical properties change with membrane potential have been devised that can be used to monitor electrical activity of model cells such as frog oocytes; the extension of this technology to create similar proteins that can function in mammalian neurons would make it possible to monitor electrical activity of neurons through optical recording after delivery to these neurons using the approaches described above. Similarly, genetically encoded modulators of neuronal activity, such as particular ion channels that can be opened or closed through use of pharmacological agents or other means (e.g. by light stimulation), when delivered to particular neuronal populations, provide tools to control neuronal activity and thereby to assess neuronal function.
There are significant limitations on existing genetically encoded constructs for tracing connections and monitoring and manipulating neuronal activity, so that exploiting the Molecular Brain Map fully will require improving both sets of technologies (Table 3).

Part III: Requirements of a Molecular Brain Map

With this background, we now discuss what needs to be discovered.

A Molecular Map of the Adult Mouse Brain
We first discuss the challenges involved in establishing a Molecular Brain Map for the normal adult mouse brain. We focus on the mouse because its brain is highly analogous in structure and organization to the human brain, but at the same time it is readily amenable to manipulation of gene activity and function in a manner not yet possible in species more closely related to humans. These properties have made the mouse the key model organism for molecular studies of brain function and dysfunction, and there was consensus at the workshop on the need to give highest priority to generating a mouse Map.

A comprehensive Molecular Brain Map should provide the pattern of expression of all genes in all neurons throughout the brain, and involves the following component parts.
1. Mapping all genes and splice variants. The existing Genome Projects are in the process of completing the cataloguing of the more than 30,000 genes in the genomes of the human and the mouse. In the first instance, it will be important to describe the expression of all these genes. A further complexity arises from the fact that many genes are subject to alternative splicing, which can often alter the function of gene products. This splicing is often poorly understood. Ultimately, a comprehensive Map should incorporate information on the specific splice variants for each gene that are expressed by each neuron.
2. Devising an Atlas of the brain onto which information can be mapped. The enormous amount of information generated through expression analysis will be of general utility only if it can be organized in a way that makes it easily accessible by the community of researchers. In practice, this means that the information must be mapped onto a standard brain atlas, in which neuronal populations are assigned particular coordinates. There are of course variations in brain size and structure among individuals in any given species, so that any atlas will be at best an idealization. It will be the most accurate in the case of inbred strains of animals, such as mice, that show the least variation between individuals. Efforts have been made in the past decade to develop such atlases for mice and other species, including humans.
3. Integrating information obtained from gene-based and cell- (or region-) based approaches. As discussed above, gene-based approaches such as in situ hybridization permit the mapping of all genes, but the degree of cellular resolution is limited. Cell-based approaches make it possible to define for particular cells the full complement of genes that are expressed, but this must be done on a laborious cell-by-cell basis. The two approaches are complementary, making it desirable to use both, and to cross-reference the information obtained by the two approaches.
4. Mapping circuits as well as cells. Understanding the pattern of interconnections of specific neural cells is essential to understanding their roles in nervous system function and dysfunction. Therefore, a comprehensive Brain Map should include not just information about the patterns of gene expression of cells, but also about the connections those cells make with other neural cells. The identification of genes that can serve as markers for particular cell types, and whose promoters can be used to drive cell-type specific expression of transgenes, can help trace the connections of these cells, through the transgenic expression in the cells of genetically encoded markers of neural connectivity. A powerful variation on this theme, recently developed, involves driving in the cells not the genetically encoded marker itself, but rather a recombinase like Cre that makes it possible to activate a genetically encoded reporter delivered to the cells by other means (e.g. using a virus). Technologies like these have the potential to dramatically facilitate the tracing of connections in the complex environment of the brain. While this type of tracing has begun on an investigator-initiated basis in several laboratories, there is at present no systematic effort to accelerate the development of a Connectivity Map; this should be given priority.
5. Mapping proteins not just mRNAs. For technical reasons mentioned above, the analysis of mRNA expression patterns is considerably more straightforward than the analysis of the expression patterns of their protein products, so that highest priority is being given to high-throughput mRNA expression analysis. Ultimately, however, it will be necessary to know cell type-specific expression patterns of protein products and their subcellular localization on particular portions of neurons or in specific intracellular compartments. The improvement of proteomic methods is the focus of intense activity in all areas of biomedical research, and these methods should be incorporated into the generation of Molecular Brain Maps in an ongoing way as they are developed.
One specific initiative in proteomics deserves mention and high priority at this time: the generation of antibodies to transcription factors. It is estimated that there are approximately 1,500 transcription factors encoded in the genome. The identity of particular neural cell types is controlled by combinatorial expression of specific transcription factors, and the availability of antibody probes to detect transcription factors in histological sections of the brain will accelerate attempts to identify neural cells, including neural stem cells, and to devise means to alter the development and fate of these cells for neural repair. Furthermore, transcription factors as a class have proven in general to be readily amenable to the generation of antibodies that work in immunohistochemistry, whereas many other important classes of proteins (e.g. G protein-coupled receptors) are often more refractory. The combination of importance and ease justifies giving priority to the generation of antibodies to transcription factors (Table 2).

From this description, it is evident that the development of a comprehensive Molecular Brain Map will be an iterative process, as information from gene-based and cell-based approaches accumulates and is incorporated into an ever more refined Atlas incorporating information not just about the locations of cells but also about their interconnectivity and function. In particular, it can be expected that in many cases what is thought of as a single homogeneous class of neurons defined by expression of a particular marker gene will, on further analysis, be discovered to comprise two or more subpopulations. Such refinements, subdivisions and reinterpretations are expected, and will be facilitated by the integration of information obtained from gene-based and cell-based approaches, and the integration of that information with other anatomical and functional data. As part of this refinement process, more accurate cell-type specific markers are likely to be defined that will permit gene manipulation in particular populations of neurons with high selectivity.
Three axes: species, developmental stage, and disease condition
A single Molecular Brain Map of the adult mouse brain would already provide an invaluable tool, dramatically accelerating the pace of discovery by freeing individual investigators of the need to derive the information in a piecemeal and inefficient way. However, the full utility of the Map will be evident only as a variety of Maps are generated to document gene expression in different species, developmental stages, and disease conditions.
1. Maps in different species. Just as important as the creation of a Molecular Brain Map for the mouse is the creation of such a map for the adult human brain. A number of approaches feasible in the mouse are not possible in the human (e.g. those of a transgenic nature), but in situ hybridization (gene-based) and laser-capture microdissection (cell-based) are both possible. Thus, it should be possible to generate such a Map for humans, though these limitations in mapping technologies appropriate for the human brain, as well as the variation in brain size and structure alluded to above, will result in the generation of a lower-resolution Map, at least in the first instance. After humans, a Molecular Map for the brain of one or more non-human primates (whose genomes have been sequenced) will be desirable, as primates can be excellent models for a variety of higher brain functions that are either not present or not easily studied in the mouse.
2. Maps at different developmental stages. Understanding how the adult brain is generated during development will require creating Maps at different stages in embryonic, fetal, and early postnatal life. One or more Embryonic and Fetal Maps will help identify the genes responsible for the generation and differentiation of neuronal cells, as well as the genes important in establishing the initial connectivity of the brain. One or more Postnatal Maps will help elucidate mechanisms through which early sensory experience helps shape the ultimate pattern of neuronal connections. High resolution Maps at these stages will be most useful.
3. Maps in different states (disease, drug-treatment, injury, or diverse normal physiological states) and strains. Important clues to the causes of disease will come from generating Maps in different disease states, both in animal models such as the mouse and in humans. Comparing Maps in different strains of the same animal (e.g. different mouse or rat strains) that show significant behavioral or physiological differences can similarly be expected to shed light on the causes of these differences. Likewise, insight into both drug mechanisms and some disease states will be obtained by generating Maps following drug treatment, especially for those drugs, such as anti-depressants, that require considerable time to achieve their effects (and which therefore likely involve changes in gene expression). Maps generated following injury or trauma to the nervous system will provide information on the brain's response to these insults, including the behavior of stem cells and changes in gene expression that may either facilitate or impede regenerative responses. Finally, even in the normal, uninjured brain, the generation of different Maps is likely to be useful, for example in discerning changes in gene expression that occur in particular learning paradigms. In general, for each of these states it is anticipated that low resolution Maps might be generated in the first instance, with subsequent high resolution mapping of particular brain regions highlighted by the first-pass analysis.

Part IV: Existing Large Scale Efforts Funded by the NIH

The NIH recognized the need for a Molecular Brain Map in the 1990s. Many pilot efforts have been funded to further the development of this Map, and are not described here because of space constraints. Three large-scale efforts funded at high levels by various Institutes of the NIH are, however, discussed in detail in Appendix 1 and introduced briefly here.

Creating a dataset of all transcripts expressed in the adult and developing mouse brain, and a physical collection of cDNA probes for each of these transcripts.
This resource and dataset, created by Dr. Bento Soares (University of Iowa), complements the Genome Projects by helping identify transcripts and splice variants of genes expressed in the brain, and by generating cDNA libraries and full-length cDNA/EST probes that can facilitate analysis of these genes.
Creating a database of gene expression patterns in the nervous system
The GENSAT project, initiated by Drs. Gabrielle Leblanc, Bob Baughman, and colleagues at NINDS, aims to systematically map the expression patterns of thousands of genes in histological sections of the mouse brain and spinal cord in the adult and at three stages of development (E10.5, E15.5, and P7). The gene expression data are being collected by two groups of investigators, one led by Dr. Gregor Eichele (Baylor College of Medicine and Max Planck Institute, Hannover), and another led by Drs. Nathaniel Heintz, Mary-Beth Hatten (Rockefeller University) and Alexandra Joyner (New York University). Dr. Eichele's group is collecting data using high throughput in situ hybridization, whereas Dr. Heintz's group is using BAC (bacterial artificial chromosome) transgenic technology. In the latter approach, transgenic mouse lines are generated in which expression of a reporter, Green Fluorescent Protein (GFP), is driven in the same patterns as selected genes. The gene expression data from both efforts are to be placed in a public database at the National Center for Biotechnology Information (NCBI). The project is collecting data for 300 genes in its first year, and aims to ramp up to at least a thousand genes per year in future years.
Creating Atlases of the mouse and human brains
Dr. Arthur Toga of the University of California at Los Angeles and his colleagues have been developing computerized brain Atlases, and tools to map data obtained from gene expression analysis or other approaches (e.g. functional MRI) onto those Atlases. In the case of the mouse, the result is a formal computerized representation of the mouse brain, providing coordinates that define the diverse anatomical structures and landmarks.

These projects are helping set the groundwork for accelerating the generation of comprehensive Molecular Brain Maps for the human and the mouse.

Part V: Goals and Priorities

The working group applauded existing large-scale efforts for the establishment of a Molecular Brain Map, and achieved broad consensus on the following priorities going forward, building on those initial efforts.

A two-pronged approach: breadth and depth
1. Breadth: A first priority is to accelerate efforts like those just mentioned to provide a broad but comprehensive survey of expression of all genes in the adult mouse and human brains. Achieving this goal involves several challenges.
  - Increased throughput. The current throughput of existing efforts in the mouse is on the order of many hundreds of genes a year. To deal with the more than 30,000 genes in the mammalian genome in a matter of years rather than decades, it will therefore be necessary to increase this throughput by an order of magnitude at least.
  - Integrating and coordinating gene-based and cell-based approaches. At present the highest throughput is provided by in situ hybridization (a gene-based approach). Because of its limited cellular resolution, however, it is necessary to pursue complementary approaches with equal vigor. This includes transgenic approaches, such as BAC-mediated or knock-in approaches, which are of lower throughput but provide more detailed cellular resolution data on expression of particular genes, and also provide tools to label particular neuronal populations for cell-based approaches. Cell-based approaches, in which mRNA is extracted from particular cells and probed (e.g. using DNA arrays) to identify the transcript profile of the cells, must also be pursued vigorously. In particular, laser-capture microdissection holds high promise, particularly for use on human tissue, but the technique needs to be improved significantly (e.g. to improve the quality of recovered mRNA, and to permit capture from marked cells). These efforts will all benefit from coordination: for instance, in selecting genes for in situ hybridization and BAC transgenic analysis a priority should be given to identifying regional- and cell-type specific markers that can then be used for cell-based approaches.
  - Broaden to include circuitry. An essential part of a Molecular Brain Map will be information on the pattern of connections made by individual neuronal populations. Mapping connectivity has not been a focus of large-scale efforts to date, and should become an integral part of such efforts.
  - Competition between centers and technologies. Since the methods for most efficiently collecting information about gene expression are still being worked out, it was deemed essential to encourage competition between multiple centers and multiple approaches.
2. Depth: A second priority is to focus in detail on a few selected brain regions and/or functional circuits (such as the retina, spinal cord, or cerebellum). The aim here is to characterize in detail all the neuronal subtypes in the selected circuits, defining not just their full complement of gene expression but also their interconnectivity. These data would then be related to the functional properties of the system. The reason for focusing on a few systems in depth is that this would make it possible to discern whether any organizational or analytic principles emerge relating gene expression patterns to the structure and function of the circuit. Such principles could then help guide and organize similar studies in other brain regions.
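The throughput argument made under "Increased throughput" above can be checked with simple arithmetic. The sketch below is illustrative only: the figure of 30,000 genes is the lower bound quoted in the text, while the rate of 500 genes per year is an assumed stand-in for "many hundreds of genes a year", not a number reported by any project.

```python
import math


def years_to_map(total_genes: int, genes_per_year: int) -> int:
    """Whole years needed to map every gene at a constant annual throughput."""
    if genes_per_year <= 0:
        raise ValueError("genes_per_year must be positive")
    return math.ceil(total_genes / genes_per_year)


TOTAL_GENES = 30_000   # lower bound quoted in the text
ASSUMED_RATE = 500     # assumption: a stand-in for "many hundreds of genes a year"

# Compare the assumed current rate with a tenfold (order of magnitude) increase.
for rate in (ASSUMED_RATE, ASSUMED_RATE * 10):
    print(f"{rate} genes/year -> about {years_to_map(TOTAL_GENES, rate)} years")
```

Under these assumptions the survey would take about 60 years at the current rate and about 6 years after a tenfold increase, which is the sense in which an order-of-magnitude gain turns decades into years.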
How many Maps?
- The adult mouse brain will continue to be the first focus of efforts to create a Molecular Brain Map, because of its relevance to the human brain, and the powerful transgenic tools that facilitate Map generation. It was deemed important, however, to initiate the creation of additional Maps.
- Priority should also be given to generating a Molecular Brain Map for the adult human brain, despite the difficulty of obtaining high quality post-mortem human brain tissue for mapping studies, and the fact that transgenic approaches are not possible. These limitations mean that the generation of this Map will lag behind that of the mouse and have lower resolution, at least initially. Nonetheless, given its central position in the analysis of behavior and in biomedical research, even a limited Map for the human brain will have broad utility.
- It is likely to be desirable to generate several other Maps. Strong arguments in the first instance can be made for the broad importance of several Maps of the developing mouse brain (to illuminate brain development and plasticity), and a Map of another adult primate species (which will be more experimentally tractable than the human brain, and closer in structure and function than the mouse brain).
- Yet other Maps are likely to be important, but different constituencies of basic and clinical researchers are likely to have different priorities. In many cases, partial Maps may be sufficient. For example, illuminating some neurological diseases may be achieved by focusing mapping efforts on subregions of the brain or even particular neuronal populations, either in a mouse model of the disease or in post-mortem human brain tissue.
Need for a versatile repository for data
To be truly useful, the data generated by these approaches must be organized into a central data repository that is easily accessible to the community: the "GenBank" equivalent for brain gene expression. All data must be mapped onto a digital brain Atlas, accessible through graphical interfaces, that can integrate data from quantitative gene expression analysis (such as that provided by DNA arrays) with histochemical data (immunostaining, in situ hybridization), and that supports "where is" and "what is in" queries. The format must be user-friendly and make it straightforward for researchers who are focusing on a particular brain region (for instance, researchers interested in a particular region because of fMRI data implicating it in some cognitive task) to obtain clues to the function of the region from the data contained within the Map.
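The two query types named above can be illustrated with a minimal sketch of an index keyed both by gene and by atlas region. Everything in it is hypothetical: the class and field names, the method labels, and the placeholder records are assumptions made for illustration, and do not describe the design of any existing database.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpressionRecord:
    gene: str     # gene symbol (placeholder names are used below)
    region: str   # atlas structure name (hypothetical nomenclature)
    method: str   # e.g. "in_situ", "array", "immunostaining"


class ExpressionIndex:
    """Toy index answering "where is" and "what is in" queries."""

    def __init__(self) -> None:
        self._by_gene = defaultdict(list)
        self._by_region = defaultdict(list)

    def add(self, record: ExpressionRecord) -> None:
        # Store each record under both keys so either query is a direct lookup.
        self._by_gene[record.gene].append(record)
        self._by_region[record.region].append(record)

    def where_is(self, gene: str) -> list:
        """Regions in which the gene has been recorded, sorted by name."""
        return sorted({r.region for r in self._by_gene.get(gene, [])})

    def what_is_in(self, region: str) -> list:
        """Genes recorded in the region, sorted by name."""
        return sorted({r.gene for r in self._by_region.get(region, [])})


# Placeholder records, not real expression data.
index = ExpressionIndex()
index.add(ExpressionRecord("GeneA", "cerebellum", "in_situ"))
index.add(ExpressionRecord("GeneA", "retina", "array"))
index.add(ExpressionRecord("GeneB", "retina", "immunostaining"))

print(index.where_is("GeneA"))     # regions recorded for GeneA
print(index.what_is_in("retina"))  # genes recorded in the retina
```

Records from different methods share one schema here, which is the simplest way to let array-based and histochemical data answer the same query; a real repository would also need atlas coordinates and agreed data standards, as discussed in the next section.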
Need for standards
The development of this database will require the establishment of standards for the organization and display of data, as well as standards for data collection that are compatible with the digital atlas framework of the database. The absence of such standards is severely limiting the utility of data that are already being generated by existing large-scale efforts. Various Institutes at the NIH have initiated discussions on setting these standards, and they should be encouraged to drive this process to completion as rapidly as possible, broadly involving the community in the process.
Public access is essential
The full impact of a Molecular Brain Map will be felt only if all data in the database are accessible to the scientific community at large. Standards for the timing of public release of data therefore need to be defined and implemented. Early release of data, for example on a quarterly basis or more frequently, is essential both to ensure access and to ensure quality control by the community. For example, data generated in large-scale efforts funded by NIH and described in Part IV should be released to the scientific community according to these standards.
Need for tools to leverage and extend the Molecular Brain Map
There was consensus on the need to develop novel technologies to allow the Molecular Brain Map to be leveraged to its fullest.

The first priority is to develop a bank of specific promoters and modified BACs, to permit delivery of transgenes to specific neuronal populations, and the simultaneous improvement of efficient transgenic and/or viral delivery methods for gene delivery in species other than mouse (including rats and primates).

The impact of having this bank will be greatly increased by the further development of genetically encoded reporter and modulator constructs to allow: (i) the marking and isolation of neurons, (ii) the tracing of all connections made by a neuron, (iii) the persistent or time-lapse labeling of connections, to help identify plastic changes in connections (iv) the detection of electrical activity in neurons by optical and other means, and (v) the controlled modulation of electrical activity in specific neuronal populations (e.g. by expression of an ion channel that can be gated by a specific pharmacological agent or by light). The latter two applications in particular will help define the function of particular neurons and neuronal circuits.

Finally, whereas the analysis of gene transcripts (mRNAs) in neurons is amenable to high-throughput analysis today and hence should be the initial focus of efforts to construct Molecular Brain Maps, the characterization of protein products and their subcellular localization, i.e. the proteomic characterization of neurons, is an essential longer term goal that will rely on improvements in proteomic analysis methods throughout the biomedical community. As discussed, one proteomic initiative that deserves priority at this time is the generation of antibodies to transcription factors.

These tools, reagents and methods are summarized in Tables 2 and 3.
Need for organization and coordination
The successful development of a Molecular Brain Map will require a concerted and large-scale effort. It is therefore imperative that an ongoing Working Group be established, comprising members of the scientific community as well as granting agencies, to monitor developments and continually define and refine priorities for molecular neuroanatomy.
Need for both coordinated large-scale projects and investigator-initiated projects
The collection of initiatives required to generate and leverage a series of Molecular Brain Maps requires a mix of funding initiatives. Some aspects, such as the high throughput generation, collection, and collation of gene expression data, will benefit greatly from large-scale integrated funding initiatives. Others, such as the development of tools to leverage the Molecular Brain Maps, will continue to benefit from smaller-scale individual investigator-initiated projects. Both types of funding initiatives should be supported by the NIH.

Appendix 1: Existing Large-Scale Efforts

Various institutes at the NIH recognized the need for a Molecular Brain Map in the 1990s. Many small-scale efforts have been funded to further the development of this Map, and are not described here in the interests of space. Several large-scale efforts funded at high levels by various Institutes of the NIH are, however, discussed.

Creating a dataset of all transcripts expressed in the adult and developing mouse brain, and a physical collection of cDNA probes for each of these transcripts.
Under contract to the NIMH, Dr. Bento Soares (University of Iowa) has undertaken the generation of cDNA libraries from mouse brain regions, and the direct sequencing of cDNAs from these libraries, in an effort to identify all transcripts expressed in the brain. In phase I, completed in the year 2000, a non-redundant collection comprising approximately 30,000 brain and 9,000 retina cDNAs/ESTs was identified, re-arrayed, sequence verified and made publicly available. In phase II, initiated in September 2001, cDNA libraries enriched in full-length transcripts are being generated from whole brain and eyes at various developmental stages, and sequenced.

The identification of all transcripts has, of course, been greatly accelerated by the sequencing of the human and mouse genomes. Dr. Soares' project was initiated before it was clear how long the sequencing of the human and mouse genomes would take. It still remains complementary to the genome sequencing projects in important ways. First, programs for identifying individual genes from genomic sequence remain imperfect, so that cDNA sequencing efforts continue to provide valuable information on the identity of individual genes. Second, direct sequencing of cDNAs also provides important information on the usage of alternative exons of particular genes (alternative splicing and promoter usage) that is not always easily inferred from genomic sequence. Finally, the project also provides a physical series of cDNA probes for each of the genes that is identified.
Mapping gene expression
1. Mapping gene expression through high-throughput in situ hybridization. As part of NINDS's GENSAT project, Dr. Gregor Eichele (Baylor College of Medicine and Max Planck Institute, Hannover) and his colleagues have undertaken an effort to map gene expression through high-throughput in situ hybridization. This has involved the development of a robot for performing reproducible in situ hybridization analysis on sections of adult or embryonic brain, which will in the first instance provide a collection of brain sections on microscope slides on which the pattern of expression of individual genes is visualized with a histochemical reaction product. The aim of the project, in the first instance, is to map expression of over one thousand genes per year to the adult mouse brain and the developing brain (at E10.5, E15.5, and P7).
2. Mapping gene expression through generation of BAC transgenic mice. Also under the auspices of NINDS's GENSAT project, a consortium of Dr. Nathaniel Heintz, Dr. Mary-Beth Hatten and Dr. Alex Joyner (Rockefeller University and New York University) and their colleagues has undertaken to use BAC (bacterial artificial chromosome) transgenic technology to map gene expression in the brain. In this approach, for each gene a transgenic mouse is generated in which the reporter GFP (green fluorescent protein) is expressed in a pattern mimicking that of the starting gene. This is achieved by isolating a bacterial artificial chromosome (BAC) containing the gene locus of interest (including its regulatory regions), recombining a GFP cDNA into the locus, and creating a transgenic mouse containing the modified BAC. At high frequency, expression of GFP in such mice is representative of expression of the endogenous gene. The aim of the project is to ramp up to the production of about a thousand modified BACs per year, which are then used to generate transgenic mouse lines, followed by visualization of GFP in a select series of sections from adult brains, and collection of images with this information. This approach is complementary to the in situ hybridization approach described in item 1. It is slower to make the modified BACs and transgenic mouse lines than simply to perform in situ hybridization. However, the GFP signal, unlike the in situ hybridization signal, can often give more information on the particular cell type expressing the gene, because the morphology of the cell and its pattern of projections often provide a unique identifier of the cell. In addition, the modified BAC construct in principle provides a valuable tool that can be used by other investigators in transgenic mouse lines to mark cells expressing the gene or to deliver other gene-modifying constructs to those cells for functional studies in transgenic mice.
3. Creation of Atlases of the mouse and human brains. Under a grant supported by NIMH, NINDS, NIDA, NIA, NIAAA, and NIDCD, Dr. Arthur Toga of the University of California at Los Angeles, Dr. Russell Jacobs at the California Institute of Technology, Dr. Larry Swanson of the University of Southern California, and their colleagues have been developing computerized brain Atlases and tools to map data obtained from gene expression analysis or other approaches (e.g. connectional data, structural or functional MRI data, immunocytochemical data, chemo- or cytoarchitectural data, etc.) onto those Atlases. In the case of the mouse, the result is a formal computerized representation of the mouse brain, providing a systematic and comprehensive digital space with links between a coordinate system and systems of nomenclature for the structural subdivisions of the nervous system. In addition, tools have been developed to map information derived from brain sections onto this three-dimensional Atlas, making it possible to correct for distortions of brain tissue that occur during various histological procedures to visualize gene expression. Such Atlases are being developed for adult mice and mice of specific ages through development. Similarly, an Atlas of the human brain has been devised, and software tools generated to import information derived from multiple modalities (e.g. fMRI) and map it onto the Atlas.