Wheat ESTs Proposal


The Structure and Function of the Expressed Portion of the Wheat Genomes

C.O. Qualset, O.D. Anderson
J. Dvorak, B.S. Gill, M.E. Sorrells

James A. Anderson, Timothy J. Close, J. Dubcovsky, Kulvinder S. Gill, J. Perry Gustafson, Shahryar F. Kianian, Nora Lapitan, Henry T. Nguyen, Mary-Kay Walker-Simmons

SECTION C: PROJECT DESCRIPTION

1) LONG-TERM GOAL AND IMMEDIATE OBJECTIVES OF THE PROJECT

The long-term goal of this project is to decipher the chromosomal location and biological function of all genes in the wheat (Triticum spp.) genomes. This knowledge will greatly enhance our understanding of the biology of the wheat plant and create a new paradigm for the improvement of this exceedingly important crop. Moreover, because of the extensive genetic and metabolic conservation among species in the grass family, efforts to decipher gene function in wheat and its close allies will work synergistically with similar efforts in maize, rice, sorghum, and other crops in the grass family to arrive at a global understanding of the function, structure, and evolution of genomes.

Complementary DNA libraries from wheat or closely related species will be produced and normalized to maximize the access to all genes in the wheat genomes. Approximately 80,000 of these cDNA clones will be single-pass 5' sequenced thereby producing a large database of expressed sequence tags (ESTs). Identification of singleton gene motifs from these ESTs, and those generated by international collaborators, will be carried out by 20,000 3' single-pass sequencing. Nucleotide sequence comparison of these ESTs will be used to arrive at a minimum of 10,000 EST singletons, representing approximately half of all the gene motifs in the wheat genomes. All chromosomal loci complementary to these singletons in the wheat genomes will be identified by their Southern hybridization with a unique mapping population of chromosome deletion stocks. Rice, maize and other grass genome ESTs homologous to the wheat singletons will be identified by nucleotide sequence match searches. Additional rice and maize ESTs of known chromosomal position in those genomes and not matching wheat ESTs will be mapped in wheat. These two approaches will produce detailed comparative maps between wheat and other grass genomes. The pool of wheat EST singletons will be used in gene-function studies focusing on the reproductive phase of the wheat plant. The database generated by these objectives will provide means to study specific questions about the structure, putative function, chromosomal location, and evolution of the expressed component of the wheat genomes.

2) INTRODUCTION

Wheat (Triticum aestivum L.) is the most widely grown crop in the world, and its economic significance for humankind is matched only by rice. Furthermore, wheat belongs to the tribe Triticeae comprising some 300 species classified into 22 genera (Löve 1984) including several other important crops (barley, rye, and triticale) and a number of important forage species.

Wheat genomes, like all genomes in the tribe Triticeae, are large in comparison to the current plant models, Arabidopsis and rice. A. thaliana was estimated to have 21,000 genes (Bevan et al. 1998). While wheat genomes probably have numbers of gene motifs similar to those of the small-genome plants, the large genome size makes it unrealistic to anticipate complete sequencing of any of the Triticeae genomes in the near future. Therefore, large-scale discovery and isolation of genes and deciphering of gene function in wheat and its relatives must rely on other, less direct methods. Current approaches to deciphering gene function, such as map-based cloning, transposon mutagenesis, and differential display, are either not suitable for large genomes or are tedious and expensive because they deal with a single gene at a time. Therefore, it is necessary to devise alternative strategies to access genomes of wheat and its relatives on a large scale and thus ensure continued advances in the biology of this immensely important plant. Such a strategy is described in this proposal. It focuses solely on the expressed component of the wheat genomes and takes maximum advantage of the unique strengths of the wheat genetic system. Moreover, it constructs a bridge between the wheat genome and other important grass genomes for fluid movement of information on gene location and function from other grass genomes to Triticeae and vice versa, and it facilitates exploitation of the high conservation of gene order among grass genomes (Moore et al. 1995; Devos and Gale 1997) for comparative functional genomics.

3) RATIONALE AND SIGNIFICANCE

This project is designed (1) to develop and deploy the necessary tools for gene discovery and deciphering gene function in wheat and related species, (2) to maximize two-directional flow of information on gene function between wheat genomes and those of other grasses, and (3) to advance the understanding of the structure and evolution of large genomes. The centerpiece of this project is the development of a large database of expressed sequence tags (ESTs), arriving at a minimum of 10,000 EST singletons, and mapping of these singletons in the wheat genomes. Assuming that the number of gene motifs is similar among the angiosperm genomes, this number potentially represents about half of all the gene motifs present in the wheat genomes. We will exploit the unprecedented potential of bread wheat (T. aestivum) for efficient and rapid gene mapping, utilizing a set of homozygous, true-breeding deletion stocks. The deletion mapping approach is polymorphism-independent, requires only a single hybridization, results in the detection of all loci complementary to a probe present in a genome, and facilitates large-scale collaboration. The EST singletons will be mapped into a set of “bins” with an average size of 10 cM. A disadvantage of the use of deletions compared with segregating populations is that they offer comparatively less resolution of genetic order of loci in regions of the genome with high rates of recombination. However, this is compensated for by the greater resolution of genetic order in areas of the chromosomes with low rates of recombination and the fact that intragenomic polymorphism is not required. The wheat genomes show dramatic compressions of linkage maps in the proximal regions (Lukaszewski and Curtis 1993). High density mapping in deletion stocks also has an advantage in that it facilitates mapping of proximal regions of chromosomes that cannot be resolved by recombination (Werner et al. 1992; Endo and Gill 1996).
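
The bin-assignment logic behind deletion mapping can be sketched in a few lines. This is an illustration only, not the project's software: the deletion-line names, the breakpoint values, and the function name assign_bin are hypothetical. The sketch assumes terminal deletions on a single chromosome arm, each described by the fraction of the arm retained (measured from the centromere), and a present/absent hybridization score for one locus in each line.

```python
def assign_bin(retained_fraction, present):
    """Place one locus into a deletion bin on a single chromosome arm.

    retained_fraction: dict mapping a deletion line to the fraction of the
        arm it retains, measured from the centromere (hypothetical values).
    present: dict mapping the same lines to True if the probe's band is
        present in that line and False if it is missing.
    Returns (proximal_limit, distal_limit) of the bin as arm fractions.
    """
    proximal_limit, distal_limit = 0.0, 1.0
    for line, fraction in retained_fraction.items():
        if present[line]:
            # Band retained: the locus lies proximal to this breakpoint.
            distal_limit = min(distal_limit, fraction)
        else:
            # Band missing: the locus lies distal to this breakpoint.
            proximal_limit = max(proximal_limit, fraction)
    if proximal_limit >= distal_limit:
        raise ValueError("presence/absence pattern is inconsistent with terminal deletions")
    return proximal_limit, distal_limit


# Hypothetical arm with three deletion lines.
lines = {"del-1": 0.25, "del-2": 0.50, "del-3": 0.75}
scores = {"del-1": False, "del-2": False, "del-3": True}
locus_bin = assign_bin(lines, scores)
```

Because only presence or absence of a band is scored, no polymorphism between parents is needed. In hexaploid wheat a probe typically detects loci on more than one chromosome, so in practice each band would be scored and binned separately.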

Identification of the chromosomal location of a large number of EST singletons will simultaneously construct a comparative map with other grass genomes. This is particularly important for linking wheat with rice genomes, since rice will likely be the first grass genome sequenced. Sequence matching of mapped wheat ESTs with a rice sequence database will automatically identify putative orthologous loci between the two genomes and facilitate transfer of information on gene function between wheat and rice. The project described here will also lead to the development of technology and research tools that will benefit the entire Triticeae community in this country. Because of large-scale colinearity among Triticeae genomes, wheat and barley in particular, results of work on wheat can be extended to other Triticeae genomes with minimal or no extra work.

The population of EST singletons will be the foundation for studies of gene function, structure, and evolution of the expressed component of the wheat genomes, as illustrated in Figure 1. To study gene function, ESTs will be microarrayed and used in gene expression studies. Although the general, long-term utility of the ESTs is the primary motivation for EST development in this project, we will focus the gene-function studies on the reproductive phase of the wheat plant. The reproductive phase, commencing with the flowering signals and terminating with the imposition of seed dormancy, involves processes of great biological interest and is exceedingly important economically. Since seeds are the economically important part of the wheat plant and their characteristics are critical for the end-uses of wheat grain, an understanding of their development and of the environmental factors affecting it is of paramount practical significance. We will accordingly examine the seed development time-course and its modifications by environmental stresses and seed dormancy.

Transmission of genetic information in polyploid wheat and gametogenesis with complex nucleo-cytoplasmic interactions are other exceedingly important processes of the reproductive phase. Meiotic chromosome pairing in polyploid wheats is tightly regulated by a complex gene system resulting in recognition of differentiation between homoeologous chromosomes and elimination of heterogenetic chromosome pairing that could potentially occur between wheat homoeologous chromosomes. The principal, and the best known, component of this genetic mechanism is the suppressor of homoeologous chromosome pairing, Ph1 (Riley and Chapman 1958; Sears and Okamoto 1958). We will exploit wheat deletion stocks and other cytogenetic tools in studies of gene expression with EST microarrays to identify genes participating in this fascinating modification of meiosis. Since wheat meiosis is likely to resemble the processes of diploid organisms, the EST microarray approach will facilitate relating the findings to diploid models, such as yeast, Drosophila, and Arabidopsis.

A general tendency in plant genomics has been to emphasize small-genome models. Nevertheless, the existence of large genomes is an undeniable reality. It is, therefore, essential to keep advancing the knowledge of the organization and evolution of large plant genomes in balance with efforts in the small-genome models. For many reasons, wheat and its close relatives offer unique advantages as models of large-genome plants. The mapping of a large number of ESTs in wheat deletions will provide extensive data on gene densities across chromosome arms. This knowledge is important for the basic understanding of the organization of large genomes and is exceedingly important in the practical design of gene isolation strategies by chromosome landing (Tanksley et al. 1995). Additionally, mapping of a large number of ESTs of known pattern of expression will provide data on the distribution of genes with similar developmental expression within the genome and may, in turn, contribute to understanding of factors responsible for conservation of gene order during the evolution of grass genomes.

Figure 1. Schematic of Proposal Organization. Budgeted activities are EST production, deletion mapping, and functional genomics.

4) RELATIONSHIP TO STATE OF KNOWLEDGE

Wheat genetic stocks and molecular marker maps. Wheat polyploidy dictated special mapping approaches in the past. Sears (Sears 1954) developed stocks individually deficient for each of the 21 wheat chromosomes (nullisomics) and individual chromosome arms (ditelosomics). Later, he developed a set of compensating nullisomic-tetrasomics (Sears 1966). These stocks greatly facilitated gene synteny mapping. Technical advances in chromosome identification (Gill et al. 1991a) and the discovery of a genetic system for chromosome breakage to produce unlimited numbers of deletion stocks (reviewed in Endo and Gill 1996) opened the era of sub-arm localization of genes to specific chromosome segments (Endo and Gill 1996). Extensive RFLP maps have been developed for diploid wheat (Dubcovsky et al. 1996), tetraploid and hexaploid wheats (Chao et al. 1989; Liu and Tsunewaki 1991; Devos et al. 1992, 1993b, 1994, 1995a; Devos and Gale 1993; Xie et al. 1993; Dubcovsky et al. 1995b; 1997; Nelson et al. 1995a,b; Van Deynze et al. 1995; Marino et al. 1996; Jia et al. 1996) and Aegilops tauschii, the donor of the hexaploid wheat D genome (Gill et al. 1991b; 1992; Lagudah et al. 1991; and GrainGenes database). Extensive deletion maps for each of the 21 wheat chromosomes have been developed and anchored to the maps of hexaploid wheat and Ae. tauschii (Werner et al. 1992; Gill et al. 1993; 1996a; Kota et al. 1993; Hohmann et al. 1994; Delaney et al. 1995a,b; Mickelson-Young et al. 1995). In addition to wheat, detailed maps have been constructed for barley and rye and other relatives of wheat so that jointly, over 5,000 RFLP markers, in addition to AFLP, SSR, morphological, and isozyme markers have been mapped (see GrainGenes database). Close to one-hundred QTLs have been identified in various maps. Clearly, there is the basic framework for the development of the high density EST maps and their utilization in wheat and the tribe Triticeae.
Comparative mapping. Common wheat (T. aestivum) is an allohexaploid (2n=6x=42). The three genomes of hexaploid wheat were contributed by T. urartu (A genome), a primitive Ae. speltoides or a now extinct close relative of Ae. speltoides (B genome), and Ae. tauschii (D genome) (Kihara 1944; McFadden and Sears 1946; Nishikawa 1983; Dvorak and Zhang 1990; Dvorak et al. 1993). The order of loci in the three wheat genomes appears to be colinear except for a 4A-5A-7B translocation, a putative 2B-6B translocation, and two inversions in chromosome 4A (Devos et al. 1995a; Mickelson-Young et al. 1995). Diploid wheat, T. monococcum, shares with the A genome of wheat the 4A-5A translocation but is devoid of the two inversions which occurred at the polyploid level (Devos et al. 1995a). No fixed structural difference has been reported between the D genome of wheat and the Ae. tauschii genome. Remarkably, comparison of the T. monococcum and barley genomes uncovered only two inversion differences, in addition to the 4-5 translocation difference (Dubcovsky et al. 1996). Rye differs from wheat and barley by a number of structural changes (Devos et al. 1993a).
Within the grass family, comparative genetics has revealed a remarkable level of macro-colinearity. Large segments of the maize, sorghum, rice, wheat, and barley genomes conserve gene content and order (Hulbert et al. 1990; Ahn and Tanksley 1993; Ahn et al. 1993; Kurata et al. 1994a; Van Deynze et al. 1995; Moore et al. 1995; Saghai Maroof et al. 1996; Devos and Gale 1997; Devos et al. 1998), although the correspondence among some of the genomes has been further modified by segmental chromosome duplications, inversions, translocations and paleopolyploidy. To date, most comparative mapping among the grasses has relied on RFLP probes to establish gross gene orders in specific chromosome segments.
Because rice is anticipated to become a model species for the entire grass family, rice-Triticeae comparative mapping is of great importance. Comparisons of the rice genome with those of hexaploid wheat and barley have led to the identification of a number of homoeologous regions and established the genetic correspondence of the seven homoeologous groups of the Triticeae genomes with the 12 rice chromosomes (Ahn et al. 1993; Kurata et al. 1994a; Sherman et al. 1995; Van Deynze et al. 1995; Saghai Maroof et al. 1996; Devos et al. 1995b; Devos and Gale 1997). Nevertheless, homoeologous relationships are obscured for many chromosome segments. The utility of rice as a model for the Triticeae genomes will not reach its full potential unless the colinearity between the rice genome and a consensus genome for wheat and other Triticeae genomes is established in greater detail.
Structure and evolution of the expressed portion of the wheat genomes. The large-scale conservation of gene order in the grass family is particularly surprising for the large genomes, such as those in the tribes Triticeae and Aveneae, which are exceedingly rich in repeated nucleotide sequences. Since repeated nucleotide sequences are believed to be one of the principal vehicles of evolutionary change in both the nucleus and organelles (for reviews see Wallace 1982; Palmer 1990; Clegg et al. 1994; Kidwell and Lisch 1997), the structural stability of genomes in the tribe Triticeae is an apparent contradiction. A conspicuous characteristic of large Triticeae genomes is dramatic recombination rate gradients across chromosome arms resulting in low or virtual absence of recombination in the proximal regions of the chromosomes (Dvorak and Chen 1984; Dvorak and Appels 1986; Jampates and Dvorak 1986; Curtis and Lukaszewski 1991; Werner et al. 1992; Lukaszewski and Curtis 1993; Hohmann et al. 1994; Delaney et al. 1995a,b; Mickelson-Young et al. 1995; Gill et al. 1993,1996a,b; Dubcovsky et al. 1996). The low recombination regions correlate with low levels of polymorphism in genomes of both diploid and polyploid species (Dvorak et al. 1998); loci in the centromeric regions show little or no polymorphism. The recombination maxima and minima were observed to coincide with high- and low-gene density regions in wheat chromosomes (Gill et al. 1993; Gill et al. 1996a,b). These recombination “hot” and “cold” spots have been observed many times in both plants and animals. Data on the location of large numbers of random ESTs will provide further evidence of gene densities across chromosomes of the large wheat genomes.
It has been speculated that loci expressed in specific developmental stages or responding to specific environmental stimuli are nonrandomly distributed in the wheat genomes. For instance, loci expressed in seed development tend to be on wheat chromosome 1, those responding to osmotic stresses tend to be on chromosome 5 (Dubcovsky et al. 1995a) and meiotic loci tend to be on chromosomes 3 and 5. However, since only small numbers of loci of known function have been mapped, these trends may just be a coincidence. Nevertheless, studies of expression of yeast ORFs during the cell cycle with microarrays revealed that loci expressed during the same stages of the cell cycle tend to be co-located (Cho et al. 1998). In tomato, a gene complex controlling genetic and morphological mechanisms of reproduction is co-located on the long arm of chromosome 1 (Bernacchi and Tanksley 1997). Similar associations in other flowering plants suggest that such complexes may have been conserved since early periods of plant evolution or may reflect a convergent evolutionary process (Bernacchi and Tanksley 1997). The determination of function and location of thousands of ESTs that will be conducted in this project will provide the necessary framework to shed light on the global organization of the expressed portion of the wheat genomes and, in turn, may suggest factors constraining the global gene order on the evolutionary time-scale.
Functional genomics of the wheat reproductive phase. The various aspects of wheat reproduction include areas of both economic and scientific interest. The PIs of this proposal have productive research programs into these areas and it is proposed to use the generated EST and map database to concentrate the functional genomics objectives on the wheat reproductive tissues.
1. Flowering signals: Two interacting signals are of paramount importance for the initiation of the reproductive phase: vernalization and day-length. Vernalization is "the acquisition or acceleration of the ability to flower by a chilling treatment" (Chouard 1960). Thus, a period of exposure to cold given to imbibed seeds, seedlings, or vegetative plants has an inductive effect on flowering once the plants are returned to warmer growth temperatures. Vernalization response genes (Vrn) have a strong effect on the regulation of the development of the wheat shoot apex into reproductive structures. The shoot apical meristem, which will give rise to new vegetative and floral organs, is the most receptive region of the plant to the long-term exposure to low temperatures that occurs during the vernalization treatment (Metzger 1988; Thomas 1994). The apex also receives signals from the leaves, which are the main organs involved in the detection of the day-length signal (Lumsden 1994). Understanding the function of genes involved in determining a vernalization requirement would facilitate breeding between summer and winter cultivars and allow exploitation of possible heterotic effects as well as broadening the gene pool of each group. Research on genes involved in vernalization will also hasten the development of crop varieties with improved adaptability to different climates and local environmental conditions. The transition between vegetative and flowering-induced apices is easy to identify through clear morphological changes that allow isolation of RNA from different stages of apex development. Transcriptional profiling of a large subset of genes from the apices to detect global changes in gene expression will provide novel insight into dynamic aspects of floral induction from a holistic perspective not possible before with more traditional approaches. Especially important is the availability of isogenic genotypes of diploid wheat carrying different allelic combinations of the vernalization genes Vrn-1 and Vrn-2 (Dubcovsky et al. 1998; Tranquilli 1999).
2. Meiosis: The unique feature of meiosis in polyploid plants is the necessity for regulation of heterogenetic chromosome pairing. Past cytogenetic studies on wheat unraveled a complex system of suppressors and promoters of heterogenetic chromosome pairing (Riley and Law 1965; Mello-Sampayo 1973; Sears 1976; Feldman and Avivi 1988). Of these, the best known are the Ph1 and Ph2 suppressors on chromosomes 5B (Riley and Chapman 1958) and 3D (Upadhya and Swaminathan 1967). Additionally, genetic variation was discovered in wheat relatives that profoundly modifies the activity of the heterogenetic pairing suppressors (Dover and Riley 1972; Dvorak 1972, 1987; Chen and Dvorak 1984). Cytogenetic mechanisms of the regulation of heterogenetic pairing have been the subject of intense research and a great deal of controversy, leading to incongruent results (Feldman 1993; Dubcovsky et al. 1995b; Luo et al. 1996; Aragón-Alcaide et al. 1997; Moore 1998). Clearly, study of gene expression during wheat meiosis with EST microarrays employing wheat cytogenetic stocks with deleted heterogenetic pairing suppressors or promoters is poised to make important contributions to the understanding of genetic regulation of pairing between homoeologous chromosomes in wheat and to relate these processes to meiotic mechanisms being unraveled in diploid organisms, such as yeast and fruit fly.
3. Pollen development and nucleo-cytoplasmic interactions: In general, wheat lines with alien cytoplasm (alloplasmic lines) produce less seed and less biomass than those with wheat cytoplasm (Tsuji and Maan 1981; Busch and Maan 1978). Certain wheat nuclear genes are able to ameliorate the effects of alien cytoplasm; wheat Rf (restoration of fertility) genes are the best known example. Highly productive commercial lines have been produced with Rf genes. Another class of nuclear loci involved in nucleo-cytoplasmic interactions are Vi (vigor and vitality) genes which, if absent, cause sterility (Maan 1992a). Incorporation of specific Vi and scs (species-specific-cytoplasm) genes into an alloplasmic stock can entirely reverse the detrimental effects of alien cytoplasm (Maan 1992b). The function of wheat nuclear genes interacting with alien or wheat-native cytoplasm, and often causing dramatic segregation distortions, is unknown, and any advances in understanding will have significant impacts.
4. Seed development: Since the characteristics of wheat seed are crucial for the wheat end-uses, their genetics has received a great deal of attention. The literature is voluminous, but it suffices to say that a great deal has been learned about genetic control of the various classes of seed storage proteins (Wrigley and Shepherd 1973; Holt et al. 1981; Jackson et al. 1983; Galili and Feldman 1984; Payne 1987; Dubcovsky et al. 1997) and their role for the various end-uses of wheat (Payne et al. 1981,1987; Gupta et al. 1989,1994, just to list a few). A number of these genes have been cloned and used to infer the molecular basis of their effects on wheat end-use quality (Anderson et al. 1989; Cassidy and Dvorak 1991; Halford et al. 1992; Tao et al. 1992; Gao and Bushuk 1993; Cassidy et al. 1998). Among the important tasks ahead are identification of cis- and trans-regulatory sequences having quantitative and temporal effects on the expression of genes controlling the development of the wheat kernel, and a more global analysis of genes contributing to seed quality characteristics. Another long-term objective of seed research is nutritional enhancement, such as increasing seed lysine, an area likely to be better understood with a clearer picture of seed metabolism. Anticipating deployment of the microarray technology in this area, coPI O. Anderson (unpublished) has initiated isolation of ESTs expressed in the developing wheat endosperm.
5. Seed dormancy/Germination: Seed dormancy is a fundamental biological mechanism that affects the basic survival, cultivation, and processing of all grains. Seed dormancy in wheat, barley, maize, and rice has been extensively studied by several scientific disciplines to approach an understanding of, and an ability to manipulate, this trait (recent advances for wheat and barley are compiled in Walker-Simmons and Ried 1993; Noda and Mares 1995). Several biological compounds (ABA, GA) and genes (kinases, regulators) have been associated with dormancy through their induction, abundance, absence, or depletion at various stages prior to quiescence and/or germination. Both carotenoid and flavonoid biosynthetic pathways are known to be involved in control of grain dormancy as well as other important traits (Coe et al. 1988; Jende-Strid 1991). Red-kernel wheat varieties nearly always exhibit pronounced grain dormancy at physiological maturity, whereas white varieties readily germinate within 24 hrs of imbibition. Attempts to separate grain dormancy from red pericarp color have been partially successful (DePauw and McCraig 1983; Soper et al. 1989), and it is apparent that a single gene or group of genes tightly linked to red kernel color may underlie this correlation. In addition to the effect of kernel color on grain dormancy, it is apparent that other loci influence dormancy in white wheat genotypes. Ten QTL on 8 different chromosomes that affect grain dormancy (as measured by preharvest sprouting tests) have been mapped in two white wheat populations (Anderson et al. 1993; Sorrells and Anderson 1996). Recent studies use comparative genetic analysis, based on grass comparative maps and DNA sequence, to facilitate the identification, isolation, and characterization of candidate loci underlying the white wheat QTL controlling grain dormancy (Sorrells and Anderson 1996). These studies suggest that the carotenoid biosynthetic pathway and its regulation play a role in grain dormancy through synthesis and/or recognition of abscisic acid. Confirming evidence awaits studies of temporal gene expression patterns in dormant and non-dormant genotypes.
ESTs and functional genomics:
1. What are ESTs and how are they used? In spite of recent increases in DNA sequencing capacity, it is improbable that the genomes of more than a few flowering plants with the smallest genomes will be completely sequenced in the near future. However, by partially sequencing large numbers of expressed genes, it is possible to obtain enough information to search databases for similar genes in the same or other organisms.
  Messenger RNAs (mRNAs) provide the opportunity to obtain significant information in a more rapid and usable form than studying the entire genome by converting the labile mRNAs into stable double-stranded (ds) DNA for cloning as complementary DNAs (cDNAs). Recombinant libraries of cDNAs then represent an approximate snapshot of the population of mRNAs in the tissue or organism under study and indirectly a picture of the total pattern of gene expression.
  ESTs are derived from mRNAs, the products of genes that serve as the templates for the synthesis of proteins and ultimately determine the shape, size, and characteristics of an organism. The basic strategy for EST production and use was formulated by Craig Venter’s group (TIGR) in 1991 and is a rapid, efficient method for sampling a genome for active gene sequences. Typically, anonymous cDNAs are used to determine short DNA sequences (300-400 bp) in a single sequencing reaction. These sequences are then used to search existing databases (Adams et al. 1991a) to determine if a specific gene (or gene motif) has been found in the same or other organisms and if its function has been determined.
  There is, as yet, no formula or protocol to ensure identification of all wheat genes. An extensive EST program would likely require 300,000-500,000 sequences to hit most of the genes. As an example from another crop EST program, a goal of 300,000 reactions has been set by the recently organized and funded public soybean EST project (Vodkin et al. 1998; Shoemaker 1998) with a target of 30,000 singletons. However, even fewer wheat ESTs would provide a tremendous resource for U.S. wheat research and development, and for comparative genomics.
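
  Why so many sequencing reactions are needed can be illustrated with a simple sampling calculation. The sketch below is not part of the proposed work; the function name and the abundance classes are hypothetical, and the figure of 20,000 gene motifs is taken only from the statement above that 10,000 singletons represent approximately half of the gene motifs. If each clone is drawn independently and gene i contributes a fraction p_i of the clones, the chance that gene i appears at least once among N clones is 1 - (1 - p_i)^N, and summing over genes gives the expected number of distinct genes sampled.

```python
def expected_genes_sampled(abundances, n_clones):
    """Expected number of distinct genes seen among n_clones random clones.

    abundances: relative mRNA abundance of each gene (any positive scale).
    Each clone is assumed to be drawn independently in proportion to abundance.
    """
    total = sum(abundances)
    return sum(1.0 - (1.0 - a / total) ** n_clones for a in abundances)


N_CLONES = 80_000  # 5' single-pass reads proposed in Section 1

# Idealized, perfectly normalized library: 20,000 genes at equal abundance.
normalized = [1.0] * 20_000

# Hypothetical unnormalized library: 2,000 abundant genes, each 100 times
# more abundant than each of 18,000 rare genes (illustrative values only).
skewed = [100.0] * 2_000 + [1.0] * 18_000

genes_normalized = expected_genes_sampled(normalized, N_CLONES)
genes_skewed = expected_genes_sampled(skewed, N_CLONES)
```

  Under these assumed abundance classes, the same number of reads recovers far fewer distinct genes from the skewed library than from the idealized one, which is the motivation for normalizing the cDNA libraries before sequencing.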
  The availability of EST databases is also essential for adopting a number of new techniques, including cDNA and oligonucleotide microarrays and SAGE (Serial Analysis of Gene Expression). The developing technology of DNA chips is poised to revolutionize many areas of plant biology. In this technology, DNA is fixed to a solid surface (glass or a membrane) and hybridized with a fluorochrome-labeled nucleic acid probe. The degree of hybridization to each DNA gives a measure of the amount of probe complementary to the immobilized DNA. In one report, microarrays of 1,046 anonymous human cDNAs were produced and used to monitor differential expression in a two-color hybridization assay (Schena et al. 1996). Fluorescence intensities spanned more than three orders of magnitude. Expression was compared between heat-shocked and control tissue, and 17 individual array elements displayed altered fluorescence of 2-fold or greater. These 17 were sequenced and 14 matched known genes, mainly related to heat shock. Three did not match any known gene. Control experiments indicated that the assay could detect 1 mRNA in 500,000. Another version of DNA microarray usage is to immobilize gene sequences rather than cDNAs (DeRisi et al. 1997). Changes in expression of less than 2-fold were easily detectable. As glucose was depleted in yeast culture, 710 genes were induced by at least 2-fold, and mRNA levels for 1030 genes decreased by at least 2-fold. About half of the differentially expressed genes had no apparent homology to genes of known function, so their expression patterns provided the first clue to their metabolic roles. The data were used to compare entire pathways, assign genes into groups by common responses, and reveal previously unknown metabolic links. Chu et al. (1998) used microarrays representing all yeast genes to study the developmental system of sporulation in budding yeast. Seven distinct patterns of gene induction were observed, including coordination of gene sets and clues to potential functions for hundreds of previously uncharacterized genes. Ruan et al. (1998) and Desprez et al. (1998) monitored expression profiles of Arabidopsis genes with such microarrays. Novel expression patterns were identified for genes with putative identifications, and the patterns suggested possible functions for sequences of previously unknown function.
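
  The 2-fold criterion used in the two-color experiments cited above can be sketched as follows. This is a minimal illustration under simplifying assumptions, not the analysis used in those studies: the spot identifiers, intensity values, and function name are hypothetical, and real data would first need background subtraction, normalization between the two channels, and replication.

```python
import math


def call_changed(spots, fold_threshold=2.0):
    """Return array elements whose two-color ratio changes by >= fold_threshold.

    spots: dict mapping a spot identifier to a pair
        (control_intensity, treated_intensity) of fluorescence values.
    Returns a dict of spot identifier -> log2(treated / control).
    """
    cutoff = math.log2(fold_threshold)
    changed = {}
    for spot_id, (control, treated) in spots.items():
        if control <= 0 or treated <= 0:
            continue  # no usable signal in one channel; skip in this sketch
        log_ratio = math.log2(treated / control)
        # Symmetric test: induced and repressed elements are both reported.
        if abs(log_ratio) >= cutoff:
            changed[spot_id] = log_ratio
    return changed


# Hypothetical intensities for three array elements.
example_spots = {
    "cDNA-001": (100.0, 450.0),  # induced
    "cDNA-002": (200.0, 210.0),  # essentially unchanged
    "cDNA-003": (800.0, 200.0),  # repressed 4-fold
}
changed_spots = call_changed(example_spots)
```

  Working with log ratios treats a 2-fold increase and a 2-fold decrease symmetrically, which is why the threshold is applied to the absolute value.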
  Oligonucleotide arrays are a version of DNA microarrays in which short oligonucleotides complementary to known genes or cDNA sequences are directly synthesized and gridded on glass slides (Ramsay 1998; Marshall and Hodgson 1998). One example of the use of oligonucleotide arrays is that of Chee et al. (1996), who arrayed 135,000 oligonucleotides complementary to the entire 16.6 kb human mitochondrial genome and were able to detect single base polymorphisms throughout the entire mitochondrial genome in single hybridizations.
  The SAGE technique allows for both qualitative and quantitative profiles of gene expression by relying on short DNA tags to identify individual transcripts (Velculescu et al. 1995). It is a rapid method of analyzing and cataloging tens of thousands of transcripts through the concatamerization and sequencing of gene tags. It offers the advantage of quantifying gene expression of thousands of genes without hybridization probes and is an alternative to DNA microarrays although the exact advantages and limitations of the two techniques for complex genomes is still to be determined. (Velculescu et al. 1997) analyzed 60,633 tags representing 4,665 yeast genes with expression levels varying from an average of 0.3 to more than 200 copies per yeast cell. One impressive feature of their SAGE results is the finding of genes that had not been predicted from previous analyses of the complete yeast genome sequence.
  Although SAGE can quantitate the abundance of individual transcripts without prior information on the genome under study, its full utilization depends on knowledge of mRNA sequences (the 3' end in this case) and the association of each tag with a known gene (the eventual goal). Similarly, DNA microarrays depend on knowledge of at least partial gene sequences such can be obtained through EST programs.
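The way SAGE tag counts translate into transcript abundance can be sketched as follows. This is a minimal illustration rather than the published procedure: the tag strings are made up, and the total number of mRNA molecules per cell (`mrna_per_cell`) is an assumed input that must be estimated independently for the tissue under study.

```python
from collections import Counter

def sage_abundance(tags, mrna_per_cell):
    """Convert a list of observed SAGE tags into per-tag abundance estimates.

    tags: list of tag sequences, one entry per tag observed.
    mrna_per_cell: assumed total number of mRNA molecules per cell.
    Returns a dict mapping tag -> (count, estimated copies per cell).
    """
    counts = Counter(tags)
    total = sum(counts.values())
    return {
        tag: (n, n / total * mrna_per_cell)
        for tag, n in counts.items()
    }

# Hypothetical example: four tags observed, 1000 mRNAs per cell assumed.
observed = ["AAA", "AAA", "AAA", "CCC"]
abundance = sage_abundance(observed, mrna_per_cell=1000)
```

Because each tag is counted directly, the estimate for a transcript depends only on its share of all tags sequenced, which is why no hybridization probe is required.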
2. EST projects: Major portions of several eukaryotic genome projects have focused on the production of ESTs, e.g., human (Adams et al. 1991b, 1992; Hillier et al. 1996), Caenorhabditis (McCombie et al. 1992; Waterston et al. 1992), and mouse (Höög 1991; Takahashi and Ko 1994), as well as microorganisms such as yeast (Vassarotti and Goffeau 1992) and a protozoan (Ajioka et al. 1998). For plants, Arabidopsis genome projects (Höfte et al. 1993; Newman et al. 1994; Cooke et al. 1996; Rounsley et al. 1996) estimate that more than 70% of the estimated 21,000-24,000 expressed genes have been identified via ESTs. EST production in rice has been reported by Sasaki et al. (1994), Uchimiya et al. (1992), Nahm (1996, 1998), and Yamamoto and Sasaki (1997). The latter have reviewed the EST sequencing portion of the Japanese Rice Genome Research Program (RGP) as of August 1996. At that time more than 29,000 rice cDNA clones from various tissue sources had been sequenced at their 3' ends. They found that about 25% of the sequences were similar to proteins registered in PIR, and the sequences apparently represented approximately 10,000 independent rice genes. In addition, the positions of many rice ESTs in the linkage map of the rice genome have been determined (Kurata et al. 1994b; Harushima et al. 1998). A separate rice EST program is underway in Korea and reports 6,800 rice ESTs as of August 1, 1998 (http://bioserver.myongji.ac.kr/ricemac.html).
  The National Center for Biotechnology Information’s (GenBank) dbEST database contains (11-26-98) almost 2 million ESTs (http://www.ncbi.nlm.nih.gov/dbEST/dbEST_summary.html). Over half are human, with about 470,000 mouse + rat, 72,000 Caenorhabditis, 55,000 Drosophila, and lesser numbers for other organisms. Among plants, Arabidopsis has the most ESTs deposited with NCBI, at 37,000, and rice second at 35,000. After these two the numbers decline significantly: 2,700 for maize, 2,000 for loblolly pine, 1,400 for oilseed rape, 1,300 for upland cotton, and 1,034 for the common ice plant. Wheat and its relatives in the Triticeae tribe are pathetically low on the list for such an important crop group, with only 76 ESTs listed for barley and wheat combined.
  Examples of private sector plant EST programs include the corn EST program at Pioneer Hi-Bred (Wang and Bowen 1998), which has reported 125,000 sequences clustered into approximately 54,000 unique sequences, although an unknown percentage of these are likely to be multiple unconnected contigs (an update given by Susan Martino-Catt at the ITMI Workshop at the 1999 Plant and Animal Genome VII Conference puts the current figures at approximately 350,000 ESTs and 59,000 unique sequences). DuPont (Rafalski et al. 1998) similarly has EST programs and has described using soybean ESTs to profile soybean seed development. The private programs have generally not been accessible to public researchers, or accessible only with proprietary restrictions. However, no matter what plans are pursued by private or other international entities, it remains critical that there are public EST databases available as a fundamental resource to U.S. public researchers.
  Recently several larger public plant EST projects have been initiated in the U.S. The crops of these projects include tomato, cotton, maize, alfalfa, sorghum, soybeans, but not wheat or other Triticeae (http://www.nsf.gov/search97cgi/vtopic; search on plant genome awards.)
3. Wheat ESTs: There is wide recognition of the need for wheat EST resources. During a meeting of approximately 300 wheat and barley researchers at the 8th International Wheat Genetics Symposium (Saskatoon, Canada, August 1998), it was decided to organize an international effort to develop Triticeae genomics resources and collaborations. The effort would be carried out by individual countries, with coordination through meetings, communications, and organization/steering committees. The first, and highest, priority was agreed to be the development of "deep" wheat and barley EST resources. Phase I has as its goal immediate action for Triticeae EST production by combining what individual laboratories can contribute with existing resources, with a target from all international contributors of 20,000 unselected ESTs each for wheat and barley (details at http://wheat.pw.usda.gov/genome/) by July 1, 1999. The Phase II goal is the generation of 300,000 ESTs (mainly normalized) each for wheat and barley, with funding solicited from both public and private organizations within the contributing countries. One purpose of the current proposal is to represent the U.S. public contribution for wheat to this Phase II effort. At the initial organization of ITEC it was requested that the ARS (Albany, CA) site serve as the central data collection and EST repository site for this international effort, if feasible.
Preliminary Data:
1. Genetic stocks: Because of genetic triplication in wheat, whole or partial chromosome deletions tend not to be as lethal as in diploids. This also means that multiple fragments detected in Southern blots by probe hybridization are most efficiently mapped by deletion stocks. There is no need for polymorphism, as DNA fragments are scored for presence or absence. It was possible to isolate deletion lines in which an individual chromosome arm is missing the segment from a breakpoint outward to the original telomere. Unique to wheat is the existence of a set of 436 such deletion stocks, derived from a single original line, whose breakpoints are, with some exceptions, more or less randomly distributed along the length of the 21 chromosomes of wheat (Endo and Gill 1996; http://wheat.pw.usda.gov/mEST). This set of lines provides on average 62 different breakpoints for each of the homoeologous chromosome groups. Since the three genomes are colinear, the deletion set for each homoeologous chromosome group divides the consensus homoeologous chromosome into many small “bins” averaging less than 10 cM each.
2. Wheat ESTs: One of the co-PIs (Olin Anderson) of this proposal is the coordinator of the Agricultural Research Service’s computer database for wheat, barley, rye, oats, sugarcane, and related wild grasses (the GrainGenes database). In anticipation of the coming flood of genomic data from both the crops of GrainGenes’ responsibility and plants of comparative interest to those crops, the GrainGenes staff at both Albany, CA, and Cornell University have been discussing and planning procedures for handling such data and exploring changes in existing database models. As part of this review and preparation, the Albany staff have made some initial changes in the GrainGenes database model and displays (Lazo et al. 1998; http://wheat.pw.usda.gov/wEST/) and have initiated a small-scale EST effort to better understand both the technical and bioinformatics requirements for EST production and analysis. A total of 501 clones mapped to Triticeae genomes have been single-pass sequenced. These probes are part of the GrainGenes probe collection maintained at Albany. In addition, 2400 random wheat endosperm cDNAs have been single-pass sequenced from their 5' ends (total as of 1-26-99). Sequencing 3' ends may be performed as singletons are identified and/or when needed for specific techniques such as SAGE.
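The presence/absence logic of deletion mapping described in item 1 (Genetic stocks) can be sketched in a few lines. This is an illustration under stated assumptions, not the project's scoring software: breakpoints are expressed as a fraction of arm length measured from the centromere (0.0) to the telomere (1.0), and the function name and example values are hypothetical.

```python
def assign_bin(scores):
    """Place a locus in a deletion bin on one chromosome arm.

    scores: list of (breakpoint, fragment_present) pairs, one per deletion
        line. breakpoint is the fraction of the arm retained, measured from
        the centromere; everything distal to it is deleted in that line.
    Returns (proximal, distal): the two breakpoints that bound the locus.
    """
    proximal = 0.0  # centromere
    distal = 1.0    # telomere
    for breakpoint, present in scores:
        if present:
            # Fragment retained, so the locus is proximal to this breakpoint.
            distal = min(distal, breakpoint)
        else:
            # Fragment lost with the deleted segment, so the locus is distal.
            proximal = max(proximal, breakpoint)
    if proximal > distal:
        raise ValueError("scores are inconsistent with a single locus")
    return proximal, distal

# Hypothetical Southern scores for four deletion lines of one arm.
scores = [(0.25, False), (0.5, False), (0.75, True), (0.9, True)]
locus_bin = assign_bin(scores)

# Average number of breakpoints per homoeologous group: 436 stocks, 7 groups.
breakpoints_per_group = 436 / 7
```

Here the fragment is missing from the two lines with the most proximal breakpoints and present in the other two, which places the locus between breakpoints 0.5 and 0.75. The last line reproduces the average of about 62 breakpoints per homoeologous group quoted above.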
  These sequences are being used by Albany staff (Dr. Gerard Lazo) to test methods of automating sequence analysis and data formatting for entry into the GrainGenes database or other relevant database (detailed more below) - see Table 1 for a test batch of 684 sequences, and http://wheat.pw.usda.gov/wEST for updates. Testing of automated sequence processing on the additional endosperm cDNAs will commence shortly.
  The generation of wheat endosperm ESTs will continue at Albany and serve as both a test case for general wheat EST production and as a significant EST resource for this practically important tissue. The immediate plan is to complete 3000 random cDNAs, then proceed on a series of screenings (normalization/subtractions) to reduce redundant sequences. This screening will give the Albany personnel direct experience in several normalization procedures (described in detail below). The future plan is to sequence at least 5000 more endosperm cDNAs after normalization procedures. Thus, this current proposal assumes a basic wheat endosperm EST resource will be available through the ongoing Albany projects, and this proposal will concentrate on the other wheat tissues and conditions, although wheat endosperm subjected to specific conditions, such as heat stress, will still be included in new EST production.
3. Functional genomics: The PIs of this proposal all have records of research into the structure and function of specific genes and gene families. This research was previously restricted to such a narrow focus by technological limitations, which can now be overcome to allow a more global examination of genome function.
ITMI: An international network for wheat genomics:
The group of investigators on this project has a long and fruitful history of close collaboration. Virtually all participants on this project are members of the International Triticeae Mapping Initiative (ITMI). ITMI was founded in 1989 by the core group of investigators on this project with the specific goal of developing genetic maps of wheat and its relatives. The founding of this international organization, which has been managed from the University of California since its inception, was a response to the realization that the development of detailed comparative genetic maps for wheat and its relatives required close coordination, collaboration, and sharing of resources and results. The success of ITMI is evident from the fact that after eight years of ITMI activity, wheat and related species now rank among the best-mapped plant species. The steady progress made by ITMI is clearly evident from the voluminous progress reports prepared annually by the coordinators of individual homoeologous groups in the tribe (McGuire and Qualset 1997). This exemplary track record in collaborative work provides the best guarantee for successful accomplishment of the tasks placed in front of this group in this project. The existence of this formal, well-established collaborative organization will greatly facilitate rapid dissemination of the results of this project in the ITMI workshops that are held annually in different countries.
The wheat EST project proposed here has galvanized complementary efforts in other countries participating in ITMI as well as preliminary efforts in the U.S. In the summer of 1998, ITMI fostered the foundation of the International Triticeae EST Cooperative (ITEC) to initiate collaborative studies on gene function in wheat and the other members of the tribe Triticeae (see Wheat EST sub-section in the section below). The ITEC effort illustrates the role ITMI plays in fostering the development of international links. There is little doubt that this grant will stimulate similar investments in other countries, as indicated by the attached letters from our collaborators in Australia and the U.K., and that these investments will greatly amplify this country’s investment.

Table 1. A pilot batch of 684 wheat endosperm cDNAs was searched against the GenBank non-redundant database. Listed below are examples of the types and numbers of matches obtained for different classes:

Prolamine Storage Proteins: 94
     Alpha/Beta-gliadins
     Gamma-gliadins
     LMW glutenins
     HMW glutenins
     Other (avenin, hordein, secalin)
Non-Prolamine Storage Proteins: 30
     Globulin, CMx, CM1, CM3, CM16
Starch associated: 15
     UDP-glucose pyrophosphorylase,
     starch branching enzyme I, II
     alpha-amylase
Puroindolines, Grain softness: 20
     GSP-1a, GSP-1b
Ribosomal Sequences: 30
Ribosomal Proteins: 34
Histones: 18
     H2A, H2B, H3, H4
Tubulins: 10
     alpha-, beta-
Elongation Factors: 16
     EF-1 alpha, EF-1 beta
Viral, repeats: 27
Vector: 6
Initiation Factors: 5
G-Proteins, Kinases: 14
Environmentally-induced: 29
     Heat shock protein 70,82,90;
     low temperature
     salt-stress,
     defense
Thionins: 6
     alpha-1
     alpha-2
     purothionin
     type V
Retrotransposons,
Unknown Genomic: 79
Membrane Transport: 13
Miscellaneous Remaining: 211
     Ubiquitin
     extensin-like protein
     invertase
     actin
     victorin binding protein
     S-adenosylmethionine synthetase
     S-adenosylmethionine decarboxylase
     glyceraldehyde-3-phosphate dehydrogenase
     ribulose-1,5-bisphosphate carboxylase
     chlorophyll a/b binding
     acyl-CoA synthetase
     disulfide isomerase
     betaine aldehyde dehydrogenase
     alanine aminotransferase
     glycine decarboxylase
     cysteine proteinase
     tryptophan synthase
     peptidylprolyl cis-trans isomerase
     glyoxysomal malate dehydrogenase
     phenylalanine ammonia-lyase
     3-hydroxy-3-methylglutaryl coenzyme A reductase
     glutathione reductase
     glutathione S-transferase
     lipoxygenase
     others...

5) SPECIFIC RESEARCH OBJECTIVES AND HYPOTHESES

Structural Genomics

Objective 1. Construct and normalize new and existing cDNA libraries from wheat and close relatives.
Premise: By targeting each organ and numerous induction regimes for mRNA isolation and by normalization of the libraries, most wheat genes will become accessible for EST production.

Objective 2. Sequence a sufficient number of Triticeae clones to obtain 10,000 or more nonduplicated ESTs (singletons).
Premise: Normalization of cDNA clones will increase the efficiency of identification of unique clones by sequencing so that the targeted number of singletons can be achieved.

Objective 3. Hybridize singleton ESTs with a panel of DNAs of T. aestivum deletions to produce a high density EST map of T. aestivum.
Premises: The use of true-breeding deletion stocks and a collaborative approach will facilitate mapping of virtually all wheat loci homologous to each EST, thus beginning the process of physical mapping of Triticeae chromosomes. Comparison of nucleotide sequences of wheat ESTs with the mapped rice and maize ESTs, augmented by mapping of those heterologous ESTs that will not be detected in wheat, will allow construction of comparative EST maps of the wheat genomes and thus provide a framework for comparative functional genomics among grasses.

Functional Genomics

Objective 4. Prepare microarrays of PCR-isolated Triticeae ESTs at a central facility and distribute them to all collaborating investigators for studies of gene expression during critical processes of the reproductive phase of the wheat plant.
Premise: Microarrays will facilitate identification of numerous genes expressed during specific developmental stages and responding to specific external stimuli, or being differentially expressed in contrasting genotypes.

Genomic Informatics

Objective 5. Develop bioinformatics tools for analysis of sequence data, summarization, verification, and sharing of data produced in objectives 1 through 4.
Premise: Bioinformatics is a critical component of all genomics programs both to process the large amounts of data and make the data easily accessible and be of maximum utility to all researchers.

Genome Structure

Objective 6. Construct detailed maps of gene densities across chromosome arms in wheat and location of genes with similar developmental or induction expression patterns.
Hypotheses: Gene loci are homogeneously distributed across the euchromatic portions of chromosome arms and show no clustering with respect to their common expression pattern.

6) RESEARCH PLAN:

a) Objective 1: cDNA libraries and normalization:

cDNA libraries.
No single strategy for library selection and sequencing has been adopted. Höfte et al. (1993) utilized a set of libraries each constructed from different tissues, while Newman et al. (1994) constructed a single library from pooled mRNA samples from a number of different tissues. The latter strategy was intended to help reduce redundant sequencing of abundant cDNAs - by diluting the abundance of any cDNA specific to one or a few tissues. After discussion among Triticeae researchers and other ongoing plant EST projects we have decided on the multiple library approach. Libraries will be mainly from wheat, plus from closely related Triticeae with known potentials for trait transfer to wheat. Discussion among a number of U.S. wheat laboratories concluded library construction would best be accomplished at one site to better monitor library quality. Timothy Close (UC Riverside) was asked to take the lead in library construction because of his experience.
Based on ongoing and planned public and private plant EST projects, it is estimated that a comprehensive wheat EST program would require at least 30-50 different libraries from different tissues, developmental stages, and growth conditions. It is our intention to establish a set of at least 30 libraries from different tissues and conditions, with special emphasis on five areas: grain, floral/reproductive tissues, apex, roots, leaves and stems. These libraries will be made available to other sites for sequencing as part of an international Triticeae EST collaboration (ITEC; described above).
Acquisition of Existing cDNA Libraries: Existing libraries will be used if they match the project’s needs for appropriate source and quality. Examples of available cDNA libraries thus far identified and committed are listed below. Additional verbal, email, and letter offers of libraries and RNA preparations are being coordinated by the office of Calvin Qualset (University of California, Davis).
- A drought-induced wheat root tip cDNA library and a heat-stressed wheat leaf cDNA library, Texas Tech University (Henry Nguyen).
- Wheat anthers. Centre for Basic and Applied Plant Molecular Biology, Australia (Peter Langridge).
- Wheat endosperm. ARS, Albany CA (O. Anderson).
- Etiolated seedlings. Cornell University (M. Sorrells).
- Two rye cDNA libraries prepared from aluminum-stressed and unstressed rye root tissue, University of Missouri (Perry Gustafson).
- Two nine-day wheat shoot libraries. Purdue University (J. Anderson)
- Wheat roots and leaves. Centre for Basic and Applied Plant Molecular Biology, Australia (Peter Langridge).
- Two libraries of cold-stressed and cold-recovered wheat crowns, and two wheat libraries for carbon dioxide up- and down-regulation. ARS, USDA, Beltsville (E. Herman).
- Two libraries for control and ABA-treated dormant wheat embryos. ARS, USDA, Pullman. (M.K. Walker-Simmons).
Acquisition of RNA for New cDNA libraries: New cDNA libraries will be produced from RNA from selected plants and regimes. Examples of commitments to provide RNA include:
- Heat-stressed wheat flag leaves at the flowering stage, and heat-stressed developing grains of plants at the mid-grain filling stage (20 days after pollination) (H. Nguyen).
- Leaves of wheat plants at the full tillering stage exposed to a gradual drought stress regime; relative water content and osmotic potentials (T. Close).
- Roots and leaves of Lophopyrum elongatum stressed with NaCl for 6 hours (to capture early salt-response genes) and 14 days (late-response genes)
- Vernalized and unvernalized shoot apex and young leaves of T. monococcum winter line G3116 (J. Dubcovsky).
- Control, heat, and drought stressed developing kernels, imbibed or germinating kernels, and embryos, roots, and leaves (S. Altenbach and W. Hurkman).
- Ae. speltoides anthers from the penultimate mitosis to the tetrad stage of meiosis, and developing pistils and anthers from the microspore stage to anthesis (J. Dvorak).
- Wheat floral tissues, and developing kernels minus the endosperm (O. Anderson and S. Altenbach)
- Wheat embryos and seed aleurone treated with ABA, cold stress and control wheat crown tissue, and dehydrated + cold stressed and control wheat seedling shoots and roots (M.K. Walker-Simmons).
Construction of cDNA Libraries: As with several aspects of this proposal, we have attempted to provide backup capability in case unforeseen problems occur or a specific aspect requires more capacity/capability than initially planned. For example, libraries will be constructed in the laboratory of Timothy Close at the University of California, Riverside, with backup construction to be provided, if necessary, at Texas Tech (Henry Nguyen). Total RNA will mainly be prepared in collaborating laboratories, although in specific cases (to be determined) RNA may be prepared in the laboratories of the PIs Olin Anderson, Tim Close, and Henry Nguyen. The integrity of each RNA sample will be assessed by formaldehyde-agarose gel electrophoresis upon receipt. Investigators who provide sub-standard RNA samples will be advised on RNA extraction procedures and asked to send new samples.
PolyA RNA will be purified from total RNA using the Promega PolyATract System IV, which utilizes a biotinylated oligo dT probe and streptavidin-conjugated paramagnetic particles for separation of polyA RNA from other RNA. Synthesis of cDNA and production of directional primary cDNA libraries will be achieved using a Stratagene Uni-ZAP XR library construction kit containing XhoI and EcoRI digested lambda Uni-ZAP II. The directionality provides an advantage later in the choice of sequencing primers, when the N-terminal end of the protein-coding region of each cDNA will be desired. Another advantage of the Uni-ZAP vector is that it contains the pBluescript phagemid, which can be excised in vivo using a helper phage.
Both the amplified lambda lysates and amplified phagemid libraries will be stored in 7% DMSO at -80°C in aliquots for distribution and future use. Another portion of each unamplified lambda ZAP library will be used to produce an amplified population of phagemids by excision of phagemid DNA from the lambda ZAP vector.
Testing library quality: Quality will be determined by the average insert size of the library, an estimation of the percent full-length clones, the representation of rRNA clones, and monitoring of the percent of singletons during sequencing. Newman et al. (1994) checked the quality of a library by using the sequences of multiple isolates of two example cDNAs - 7 of 12 catalase and 12 of 15 Chl a binding protein isolates included the initiation codon. We will select similar sequence examples as one of our quality controls, and plan on test sequencing 100 clones from each selected library before continuing more extensive sequencing.
Normalization.
A fundamental problem during EST projects is the occurrence of redundant sequences. This occurs because mRNA titers, and their derived cDNA clones, vary in cellular abundance over several orders of magnitude. The range of expression may be as high as 200,000 mRNA molecules per cell or as low as <1 per cell (on the average), and with perhaps 30% of the genes expressing at levels of less than 10 mRNA molecules per cell at any given time (Bishop et al. 1974; Galau et al. 1977). This can mean that several million clones are needed to have a reasonable chance of finding any specific low-abundance transcript. In large-scale cDNA sequencing for these low-abundance classes, strategies are needed to normalize the cDNA populations. Since our goal is to generate as many unique wheat ESTs as possible, some form of normalization is required although even normalization cannot completely remove members of gene families (Hillier et al. 1996).
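The claim that several million clones may be needed for a rare transcript follows from a simple sampling argument, sketched below. If a transcript makes up a fraction f of the mRNA population, the chance of seeing it at least once among N randomly picked clones is 1 - (1 - f)^N, so N = ln(1 - P) / ln(1 - f) for a desired probability P. The figure of 500,000 total mRNA molecules per cell is an assumption chosen for illustration only, and the function name is hypothetical.

```python
import math

def clones_needed(copies_per_cell, total_mrna_per_cell, probability=0.99):
    """Number of random, non-normalized clones to sample so that a transcript
    is found at least once with the given probability.

    Assumes each clone is an independent draw from the mRNA population and
    that cloning efficiency is the same for every transcript.
    """
    f = copies_per_cell / total_mrna_per_cell  # fraction of the mRNA pool
    # log1p(-f) computes ln(1 - f) accurately when f is very small.
    return math.ceil(math.log(1.0 - probability) / math.log1p(-f))

# One copy per cell in an assumed pool of 500,000 mRNAs per cell:
rare = clones_needed(1, 500_000)        # roughly 2.3 million clones
# An abundant transcript at 5,000 copies per cell needs far fewer:
abundant = clones_needed(5_000, 500_000)
```

Normalization works by raising f for the rare classes, which reduces N in direct proportion.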
Colonies will be picked and arrayed into microtiter plates. The ARS Albany site is planning to purchase a Genetix Q-Bot arraying robot which has the capability of picking colonies and plaques, arraying them into either 96- or 384-well microtiter plates, rearraying selected wells, and DNA micro-arraying (newly released accessory head). Backup arraying is expected to be available at Texas Tech University (H. Nguyen), and several other sites have offered assistance.
There is, as yet, no single optimal strategy to maximize the relative abundance of different sequences for EST production. Different strategies each have advantages and disadvantages, which also depend on the scope and purpose of the EST project. A search of the literature and communications with researchers in large-scale EST projects lead us to three approaches that suit our preferences and the anticipated needs of the wheat research community.
Screening with abundant sequences: Probes can be either total cDNA from mRNA of the same tissue as the library or pools of common-sequence clones. Screening out the 50-100 most common sequences is the protocol settled on by Pioneer Hi-Bred’s maize EST program after testing various hybridization-based normalization procedures (Dr. Tim Helentjaris, NAS Plant Genome Colloquium, Irvine, 1997; personal communication).
Reassociation: The methods of Soares (Soares et al. 1994; Bonaldo et al. 1996) use several variations of reassociation of single-stranded plasmid/phagemid preparations and synthesized insert DNAs primed off the same clones. This is the favored procedure of DuPont’s (Delaware) plant EST program (Dr. Guo-Hua Miao, personal communication), and the main procedure proposed for the new U.S. soybean EST project (Vodkin et al. 1998; Shoemaker 1998).
Hybridization to total genomic DNA: Kopezynski et al. (1998) used single-stranded biotin-dUTP labeled (ENZO Biochem) genomic DNA bound to streptavidin-coated magnetic beads (Dynal) to select first-strand cDNA according to relative genome representation. Upon elution, conversion to double-stranded cDNA, and cloning, libraries enriched for rarer mRNA sequences are generated. An added feature of the procedure is the use of a competitor RNA during hybridization to the bead-bound genomic DNA. In their experiment, cytosolic mRNA competitor resulted in a 5-fold enrichment of membrane and secreted proteins (whose mRNAs occur mainly on ER-bound ribosomes). The procedure does overrepresent transcripts from related large families, but it requires no optimization of hybridization times or hydroxylapatite conditions, as the Soares et al. (1994) procedure does.
Our intention is to immediately begin testing these normalization procedures at Albany, CA, as part of an ongoing wheat endosperm EST sequencing project (O. Anderson) described in Preliminary Results. The initial evaluation of this testing is anticipated to be completed by Summer 1999.

b) Objective 2. Sequencing of cDNAs.

Template preparation: For DNA preparation and sequencing, master 1.5 ml cultures are grown in 2 ml deep-well 96-format blocks (Corning Costar). Blocks are sealed with a sterile mono-directional porous sheet (Qiagen) to prevent cross-contamination between wells. The blocks are shaken for 14 hours at 650 rpm in a GeneMachine HiGro shaker at 37°C. The shaker can oxygenate cultures if necessary for sufficient growth. One hundred microliters of the culture is then transferred to a standard U-well 96-format 0.3 ml sterile block and glycerol is added to make a 15% freezer stock. The main growth cultures are pelleted at 500 g and the supernatant immediately decanted. DNA is then mini-prepped using the Qiaprep 96 Turbo Miniprep Kit (Qiagen) on the Beckman Biomek 2000 liquid handling robot. The DNA preparation procedure involves minimal staff participation and costs $1.65 per sample (with volume discounts). The DNA is sufficiently pure and of an adequate concentration to perform PCR with no further action.
DNA sequencing: DNA sequencing will be performed at the ARS Albany site. We will remain receptive to other sequencing potentials if they arise, but prefer in-house sequencing at this time. We have surveyed examples of contract sequencing services and find costs typically range $8-20 per reaction. We are conservatively estimating we can perform the sequencing at $7/reaction or less (excluding staff costs) using the Beckman CEQ2000 (8-channel capillary system). Additional sequencing capability will be available by summer 1999 with an ABI 3700 96-channel capillary DNA sequencer to be installed at the Albany site. ABI has not released cost estimates as yet, but representatives say the same chemistry will be used as for the ABI 310 single-channel sequencer. Bulk purchases for an EST project would lower costs about 20%. The ABI 3700 has a capacity of about 1000 samples per day, and this project would have up to approximately 30-40% use of this capacity, as needed.
The sequencing reaction mixture is prepared from a standard Fluorescent Dye Terminator kit provided by Beckman that is specific to the CEQ2000 8-channel capillary electrophoresis autosequencer, or from a master mix provided by Perkin Elmer/Applied Biosystems for use with the ABI 310 and ABI 3700 capillary autosequencers. The processes and results are equivalent with only the total time to complete the process varying. With either chemistry, the reactions can be prepared with 50% of the recommended reagents. Backup/supplementary sequencing has been arranged with staff at Texas Tech University with several additional CEQ2000 sequencers (H. Nguyen).
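The figures quoted above allow a rough estimate of sequencing cost and instrument time. The sketch below is back-of-the-envelope arithmetic only: it uses the read totals from the project summary (80,000 5' reads and 20,000 3' reads), the $7 per reaction ceiling, and the ABI 3700 capacity and usage share stated above, and it ignores failed reactions, repeats, and any contribution from the CEQ2000 instruments. The function name is hypothetical.

```python
def sequencing_budget(reads, cost_per_reaction, daily_capacity, share):
    """Return (total reagent cost, instrument days) for a number of reads.

    daily_capacity: samples per day the sequencer can process.
    share: fraction of that capacity available to the project (0 to 1).
    """
    cost = reads * cost_per_reaction
    days = reads / (daily_capacity * share)
    return cost, days

total_reads = 80_000 + 20_000  # 5' reads plus 3' reads

# 30% and 40% of an instrument that handles about 1000 samples per day.
cost, days_at_30_percent = sequencing_budget(total_reads, 7.0, 1000, 0.30)
_, days_at_40_percent = sequencing_budget(total_reads, 7.0, 1000, 0.40)
```

Under these assumptions the reagent cost is at most $700,000 and the reads occupy between 250 and about 333 instrument days, depending on the share of capacity actually available.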
There is no fixed criterion for how many sequencing reactions should be performed on any single cDNA library before moving on to other libraries - the specifics depend on goals, resources, quality of libraries, and ongoing sequencing results. DuPont’s criterion for continuing to sequence a specific library is generally 40% new genes per sequencing session, or 20% if a deep hunt is executed (Dr. Guo-Hua Miao, personal communication). Pioneer Hi-Bred (Dr. Tim Helentjaris; NAS Colloquium, 1997 Irvine; personal communication) sequences 600 clones from a library, then assesses whether the library is of sufficient quality. If so, they proceed to sequence 3000-6000 clones from the library. Our initial strategy in this proposal will be to sequence approximately 3000 clones in each of 25-30 normalized libraries, although this plan may change as sequencing progresses. Each clone will be 5' sequenced. Each putative singleton will also be 3' sequenced to ascertain that a unique gene motif was discovered. We initially will consider >80% homology in the 3' coding region as an arbitrary criterion for declaring two different 5' ESTs as representing the same singleton.
Maintenance of clones and libraries: Copies of all clones and libraries will be maintained at the ARS Albany, CA, facility for long-term storage and initial distribution - in conjunction with staff of the GrainGenes Probe Repository. Long-term maintenance will be through other arrangements such as the I.M.A.G.E. consortium and/or facilities that may be developed by the USDA. The Albany site is also tentatively planned to be a central site for all international Triticeae ESTs that are developed by ITEC.
Quality Control: Quality control is critical throughout this effort. Reports at various scientific meetings from large-scale human genome projects have noted human error as one of their biggest problems, with as much automation as possible being a significant help. We will use the Beckman Biomek 2000 and Genetix Q-Bot for as much sample handling as possible, and sequencing will be performed on multichannel capillary systems that handle 96-well microtiter plates containing the sequencing reactions. In addition, the technician position requested for Albany will work in conjunction with an in-house technician to double-check the accuracy of sample handling.
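A stopping rule of the kind attributed to DuPont above can be expressed as a small function. This is a sketch of one possible reading, not a description of any company's software: it assumes each read has already been assigned a gene or cluster identifier, interprets the 40% and 20% figures as minimum novelty rates, and uses hypothetical names throughout.

```python
def keep_sequencing(session, seen, deep_hunt=False):
    """Decide whether a library is still yielding enough new genes.

    session: list of gene/cluster identifiers, one per read in the session.
    seen: identifiers already found in earlier sessions.
    deep_hunt: if True, accept the lower 20% novelty threshold.
    Returns (continue_flag, novelty_rate).
    """
    threshold = 0.20 if deep_hunt else 0.40
    known = set(seen)
    new_reads = 0
    for cluster in session:
        if cluster not in known:
            new_reads += 1
            known.add(cluster)  # a repeat within the session is not new
    rate = new_reads / len(session) if session else 0.0
    return rate >= threshold, rate

# Hypothetical session of 10 reads, with two genes already known.
session = ["g1", "g2", "g3", "g4", "g1", "g5", "g2", "g2", "g6", "g7"]
decision, rate = keep_sequencing(session, seen={"g1", "g2"})
```

In this example five of the ten reads are new, so the library passes the 40% threshold. The same session with only three new reads would pass only under the deep-hunt threshold.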
Contamination of the cDNAs could occur either by experimenter error or through inadvertent inclusion of cells of other eukaryotes associated with non-sterile wheat tissues (as has been reported in other projects). These potential library contaminations will become apparent during analyses of BLAST scores.
Single-pass sequencing inevitably includes errors. While this could potentially lead to problems in homology analyses, States (1992) has indicated that these errors do not significantly affect the reliability of BLAST searches.
Selection of singletons: The generated EST population will be analyzed for internal homologies using BLAST software running at the Albany site. Prospective singletons will be identified as having no internal BLAST scores greater than a specified score that is yet to be determined (updates on testing screening criteria will be posted at http://wheat.pw.usda.gov/wEST). Selected ESTs will be 3' single-pass sequenced to confirm the 5' homology and to distinguish multiple isolates of the same motif that have different 5' ends due to truncated cDNA clones. A second internal BLAST analysis will be used to select singletons to pass on to the mapping laboratories. Further characterization of each singleton EST will be obtained from Southern analysis during mapping and BLAST analysis against GenBank. Thus, we will eventually have considerable information about each mapped EST: similarity of both 5' and 3' sequences to all other ESTs, map positions, fragment pattern/complexity from the mapping blots, and homologies to all other known sequences.
Preparation of cDNA inserts for mapping, arrays, and quality control: Our initial plan is to produce all inserts at a single site (Albany) to reduce the possible introduction of errors during shipment and handling by the 9 different mapping laboratories. Once singletons are identified from the total EST collection, the appropriate clones will be consolidated by rearraying into microtiter dishes. Inserts of each clone will be amplified to produce enough insert for four uses: checking insert DNA by gel electrophoresis, duplicate mapping, resequencing for quality control, and at least the first sets of microarrays. Controls will be relaxed if and when results justify.
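The internal screen described above can be sketched as a simple greedy filter. This is an illustration only: it assumes that pairwise BLAST scores among the project's ESTs have already been tabulated, the function and variable names are hypothetical, and the cutoff of 100 is an arbitrary placeholder because the actual score is still to be determined.

```python
def select_singletons(est_ids, pair_scores, cutoff):
    """Keep one representative per group of mutually similar ESTs.

    est_ids:     EST identifiers in the order they should be considered.
    pair_scores: dict mapping frozenset({est_a, est_b}) -> internal BLAST score;
                 pairs absent from the dict are treated as having no hit.
    cutoff:      score above which two ESTs are taken to represent the same motif.
    """
    kept = []
    for est in est_ids:
        redundant = any(
            pair_scores.get(frozenset((est, other)), 0) > cutoff
            for other in kept
        )
        if not redundant:
            kept.append(est)
    return kept


if __name__ == "__main__":
    scores = {frozenset(("a", "b")): 300, frozenset(("c", "d")): 50}
    print(select_singletons(["a", "b", "c", "d"], scores, cutoff=100))
```

The same function could be applied a second time to the 3' reads, mirroring the two-pass selection described in the text.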

c) Objective 3: Deletion mapping of ESTs:

Wheat EST mapping: Singletons from this project, and later mapped rice and maize ESTs that have not been detected in the wheat EST project, will be mapped using wheat deletion lines by Southern blotting and hybridization. DNAs from 94 strategically selected deletion lines will be digested with DraI and organized on four nylon membranes. DraI sites tend to be polymorphic in wheat DNA. Deletion lines will be selected on the basis of existing mapping data (Werner et al. 1992; Gill et al. 1993,1996a; Kota et al. 1993; Hohmann et al. 1994; Delaney et al. 1995a,b; Mickelson-Young et al. 1995) so that the breakpoints dissect the genomes into smaller segments. The 94 deletion lines will provide an average of thirteen deletions per chromosome resulting in bins averaging 10 cM. Ninety per cent of the time, one restriction digest is enough to map all the fragments in the deletion lines. In addition, 14 ditelosomics, one for each of the 14 chromosome arms (x = 7), will be included to determine the arm location of the most proximal markers. Molecular weight standards will be included in three lanes of each gel to facilitate accurate calculation of fragment sizes. Each EST singleton will be hybridized once with these four membranes containing the deletion lines, ditelosomics, and molecular weight standards (94 + 14 + 12 = 120 lanes on four gels). Using alkaline blotting, a Hybond N+ membrane can routinely be used 20 times.
The following protocol will be employed to ensure high quality standards in the EST mapping. Each singleton targeted for mapping will be amplified, unincorporated nucleotides will be removed with Wizard Purification Kits (Promega), and the insert will be gel-sized at the Albany (O. Anderson) site. The 5' end of each insert will be resequenced to confirm that the correct insert was generated by PCR. An aliquot of the insert will be sent to one of the nine mapping laboratories (J. Anderson, J. Dubcovsky, J. Dvorak, B.S. Gill, K.S. Gill, P. Gustafson, S. Kianian, N. Lapitan, M. Sorrells). Probes will be prepared by the random priming method and hybridized with the four membranes. The phosphorimager files or electronic files of autoradiograms will be forwarded to UC Davis/ARS (Albany, CA), where the sizes of the fragments and their relative intensities will be computed using the FPC computer program (Soderlund et al. 1997) and the position of an EST singleton on the deletion map will be determined on the basis of lost bands. Since a large number of mapping labs are involved, initially 20% of EST singletons will be mapped in two different labs to ensure detection of potential problems. This duplication will be reduced once confidence in the results is demonstrated. The sizes of restriction DNA fragments characterizing each clone and those associated with a specific deletion will be deposited in a database such as Xcel for cross-referencing and in GrainGenes. If a singleton shares fragment sizes and map positions with another mapped singleton in the database, the history of the clone will be reexamined for potential errors.
Reaching 10,000 mapped EST singletons: Using 24 hybridization bottles/week and four membranes per bottle, hybridization of 1500 clones will take a worker about 52 weeks. At this rate of progress, nine labs can reach the target of 10,000 mapped ESTs in about 12 months. The real rate may be slower because of initial duplication for quality control and occasional failure to produce results of acceptable quality. When mapping probes from other cereals and some start-up time are included, the deletion mapping is expected to be completed within 18 months.
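The logic of placing an EST "on the basis of lost bands" can be illustrated with a short sketch. It is not the FPC program and makes simplifying assumptions: each deletion line is treated as a terminal deletion on one arm, described by a breakpoint given as a fraction of arm length from the centromere, and a band is scored simply as present or absent. The line names and breakpoint values are hypothetical.

```python
def assign_bin(breakpoints, band_present):
    """Return the (proximal, distal) interval of the arm that must contain the locus.

    breakpoints:  dict line name -> breakpoint as fraction of arm length (0 = centromere,
                  1 = telomere); the line retains everything proximal to its breakpoint.
    band_present: dict line name -> True if the band is still seen in that line.
    """
    proximal, distal = 0.0, 1.0
    for line, fraction in breakpoints.items():
        if band_present[line]:
            # Band retained: the locus lies proximal to this breakpoint.
            distal = min(distal, fraction)
        else:
            # Band lost: the locus lies in the deleted segment, distal to the breakpoint.
            proximal = max(proximal, fraction)
    if proximal >= distal:
        raise ValueError("band pattern is inconsistent with a single locus on this arm")
    return proximal, distal


if __name__ == "__main__":
    bps = {"del1": 0.25, "del2": 0.5, "del3": 0.75}
    print(assign_bin(bps, {"del1": False, "del2": False, "del3": True}))
```

An inconsistent pattern raises an error rather than returning a bin, which matches the intent of flagging clones whose history should be reexamined.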
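The planning arithmetic for the mapping effort can be checked with a few lines of code. The sketch assumes that one hybridization bottle holds the four membranes for one clone, so that 24 bottles per week corresponds to 24 clones per worker per week; that reading, and the function names, are assumptions rather than statements from the protocol.

```python
import math


def lanes_per_clone(deletion_lines=94, ditelosomics=14, standards_per_gel=3, gels=4):
    """Total lanes across the four gels used for one EST."""
    return deletion_lines + ditelosomics + standards_per_gel * gels


def weeks_to_map(n_clones, n_labs, clones_per_lab_week):
    """Weeks needed if all labs work in parallel at a constant rate, with no repeats."""
    return n_clones / (n_labs * clones_per_lab_week)


def membrane_sets_needed(n_clones, uses_per_set=20):
    """Number of four-membrane sets required if each set is reused `uses_per_set` times."""
    return math.ceil(n_clones / uses_per_set)


if __name__ == "__main__":
    print(lanes_per_clone())                # lanes on four gels
    print(weeks_to_map(10000, 9, 24))       # weeks for nine labs to reach 10,000 ESTs
    print(membrane_sets_needed(1500))       # membrane sets for 1500 clones
```

Under these assumptions nine labs would need roughly 46 weeks for 10,000 clones, before allowing for the duplicated quality-control mappings and failed hybridizations mentioned in the text.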
Comparative EST mapping to other cereals: Singletons from other cereal mapping projects will be compared to identified wheat singletons in a cooperative effort with other projects (rice - Susan McCouch; maize - Ed Coe; letters included) and unique singletons exchanged and mapped in each cereal.
Comparative EST mapping via sequence matching: EST sequences will be submitted to the NCBI BLAST (Altschul et al. 1990) mail server and compared to sequences in the nonredundant (nr) and expressed sequence tag (EST) GenBank databases v.104.0, having 316,258 and 1,364,418 sequences, respectively. Within the ACEDB format, BLAST alignments are then grouped for GrainGenes database comparisons among the wheat, barley, oat, rye, maize, rice, sugarcane, other grass species, and nongrass species genomes. In-depth evaluation of the efficacy of sequence matching for linking genomes will be conducted using programmed scripts to identify all clones that have been sequenced and mapped in multiple species. Published and in-house linkage maps will be used to compare clones mapped by Southern analysis to sequences identified in BLAST searches. The linear order and orthology of ESTs will be scrutinized by comparing map positions of other ESTs in the same general area. This procedure, together with the mapping in wheat of heterologous ESTs that were absent from the wheat EST pool, will ultimately be the foundation for comparative EST maps of wheat with other grass genomes.
Coordination of homoeologous groups: Mapping data from all collaborating laboratories will be coordinated for each of the 7 homoeologous chromosome groups by a coPI: group 1, Nora Lapitan; 2, James Anderson; 3, Mark Sorrells; 4, Perry Gustafson; 5, Jorge Dubcovsky; 6, Kulvinder Gill; 7, Shahryar Kianian. Each autoradiogram will be deposited in the database as an electronic file, and each autoradiogram will be read three times: once by the post-doc in the originating laboratory, once by the PI of that laboratory, and a third time by the homoeologous chromosome group coordinator, who will make the final EST map assignments. If necessary, final inconsistencies will be resolved by examination of original autoradiograms at the project annual meeting.
Map utilizations: The use of the generated maps is largely beyond the budgeted activities of this current proposal, but will include use in candidate gene isolations and physical contig assembly of specific chromosome regions.

d) Objective 4: Functional genomics.

The PCR-generated EST amplicons will be microarrayed on glass slides. We are currently considering the purchase of the OmniGrid arrayer ($55,000) (Genomic Instrumentation Services). Several options exist for the acquisition of the scanner: Molecular Dynamics ($85,000) and ScanArray ($70,000) (General Scanning). We will decide on a specific instrument on the basis of our initial studies utilizing the OmniGrid arrayer and scanner acquired by T. Wilkins at UC Davis (letter attached) for the NSF-funded Cotton Genome Project. We will evaluate two software packages for analysis and management of microarray data. A commercially available program, GeneSpring ($20,000) for the Windows platform (Silicon Genetics), is on the market, and publicly available programs are expected to be released by Pat Brown and David Botstein (Stanford University). The arrayer and scanner will be housed with the Albany/Davis labs (O. Anderson, J. Dvorak, J. Dubcovsky). PCR cDNA insert amplification and arraying will be done in the Albany lab. The glass arrays will be produced and distributed to collaborating laboratories (B.S. Gill, K.S. Gill, P. Gustafson, S. Kianian, H. Nguyen, M. Sorrells, K. Walker-Simmons), as well as used in the labs of O. Anderson, J. Dubcovsky, and J. Dvorak, for hybridization with fluorochrome-labeled cDNA probes. Each investigator will produce cDNA from mRNA isolated from tissues of plants at specified developmental stages, contrasting environmental regimes, or contrasting genotypes.

Space limitations prevent a detailed description of how each laboratory intends to use the arrays, but a few examples are instructive. To investigate ESTs involved in the regulation of meiosis in the presence of the Ph1 locus, genes differentially expressed during a time-course involving premeiotic stages and onset of meiosis in anthers of Ph1 and Ph1-deficient nearly isogenic wheat deletion stocks will be investigated (B.S. Gill and J. Dvorak). ESTs involved in anther development and high pollen load will be investigated and compared to variations occurring in wheat (P. Gustafson and S. Kianian). Developmental time courses and responses to known relevant stimuli will be investigated for seed development (O. Anderson, H. Nguyen, M.-K. Walker-Simmons).

Hybridized microarrays will be analyzed either at the experimental site or sent to another project laboratory for analysis. Currently at least four sites (Albany, Cornell, Davis, and Texas Tech) either already have, or are planning to purchase, readers. The scanner data will be analyzed with GeneSpring or similar software. Data will be returned to the authors, who will query them for two-fold or greater differences in the relative representation of a specific cDNA in the pair of probes. The patterns of expression of ESTs during a developmental or induction time-course will be compiled, annotated, and stored in a database (see Bioinformatics section). Initially a selection of array experiments will be read at two or more sites to judge any variability between laboratories and to further ensure quality control.
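The two-fold query described above can be sketched as follows. This is a simplified illustration, not a feature of GeneSpring: it assumes background-corrected, normalized signal intensities for the two probes of one hybridization, and the clone names and values are hypothetical.

```python
def twofold_changes(intensities, fold=2.0):
    """Return clones whose signal ratio between the two probes is >= fold in either direction.

    intensities: dict clone name -> (signal with probe A, signal with probe B).
    Spots with a non-positive signal in either channel are skipped as unreliable.
    """
    flagged = {}
    for clone, (signal_a, signal_b) in intensities.items():
        if signal_a <= 0 or signal_b <= 0:
            continue
        ratio = signal_a / signal_b
        if ratio >= fold or ratio <= 1.0 / fold:
            flagged[clone] = ratio
    return flagged


if __name__ == "__main__":
    spots = {"est1": (400, 100), "est2": (120, 100), "est3": (50, 200), "est4": (0, 10)}
    print(twofold_changes(spots))
```

Ratios below 1 indicate higher representation in probe B; reporting the ratio itself rather than a yes/no flag keeps the direction of the change visible.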

Individual laboratories will carry out other functional assays (such as SAGE) as pertinent to their individual research objectives. As with microarrays, contact with other laboratories will be emphasized and contact with the bioinformatics staff will be required. Array construction and functional studies will not wait for the complete set of ESTs to be generated. Smaller arrays, likely representing single tissues such as the endosperm, will be used as soon as possible both to analyze those tissues and to test the overall array construction, use, and analysis of results.

The five aspects of wheat reproduction that will be the focus of this work are listed below, along with the coPIs who will work on the functional genomics of each aspect. With different laboratories focusing on specific problems, the combined result is expected to be a broader picture of gene expression throughout the reproductive phase. A specific investigator may have a narrow interest in a specific tissue, but the total result from this collaborative project will be a picture of large numbers of ESTs in multiple tissues and conditions.

Flowering signals: Jorge Dubcovsky (UC Davis).
Meiosis: Jan Dvorak (UC Davis), Bikram Gill (Kansas State).
Pollen development: Shahryar Kianian (South Dakota State), Perry Gustafson (ARS, Missouri), Olin Anderson (ARS, Albany).
Seed development: Olin Anderson (ARS, Albany), Henry Nguyen (Texas Tech), Mary-Kay Walker-Simmons (ARS, Pullman).
Seed dormancy/Germination: Mark Sorrells (Cornell), Mary-Kay Walker-Simmons (ARS, Pullman).

e) Objective 5: Bioinformatics.

Needs: The acquisition of ESTs must be followed with processing of the DNA sequences to provide researchers with information in accessible and annotated formats. To accomplish this, sequences will be processed, analyzed, and formatted for inclusion in appropriate computer-based systems. All data processing, database modifications, and web page maintenance for this proposal will be performed by staff at Albany and Cornell. The initial main WWW milieu will be the GrainGenes database (at http://probe.nalusda.gov), appropriate web pages on the GrainGenes web server (at http://wheat.pw.usda.gov), and submission to GenBank (Benson et al. 1998). This emphasis on data accessibility will also fulfill the mandate to ensure technology transfer to relevant U.S. scientific and industrial sectors.
Sequence Data Processing: Each EST sequence will be processed to remove the vector and uninformative sequences. Sequences and clones failing to meet length and quality criteria will be discarded. Software to perform the initial sequence processing will be a combination of computer programs and scripts publicly available from other sites or written by in-house staff for this project (mostly in the Perl and C programming languages). In these experiments, 96-well formatted samples will be sequenced on DNA sequencers and data output will be in the standard chromatogram file (SCF, v.3) format (Dear and Staden 1992). Sequence output from other machines in other formats, such as ABI and ALF files, may be converted to SCF format if needed (Staden 1998). The SCF files will be written to a disk partition shared between the networked Pentium-II class machines of the sequencers and a Unix machine. The project is currently using a Sun Microsystems Ultra 30 (SunOS 5.6) computer to handle sequence output and processing. A series of "cron" jobs, scheduled through the crontab utility for automated program execution, will run on the Unix machine to parse the SCF files and move processed data to the next steps as needed. This is a fairly automated process, requiring only periodic monitoring. Quality checking and initial handling of data will be with the program "phred" (Ewing et al. 1998; Ewing and Green 1998). This program can read SCF trace files, call sequence bases, assign quality values to the bases, and output these values to FASTA formatted files. The quality values are useful for trimming the sequence to exclude poor quality data and for selecting good cutoff points at the 5' and 3' ends. Another program, Lucy v.1.05 (Chou 1998), will be used to perform vector trimming. Similar to the cross_match program (Green 1998), the Lucy program also accepts sequence quality files, and works from full vector sequence data, which allows detection of the reverse complement vector sequence.
Lucy is also useful for consolidating information from multiple sequencing runs of a given clone. An additional script, polytrim (in-house, G. Lazo), helps to remove poly-A and poly-T (oligo-dT primed) runs at either the 5' or 3' end of a sequence.
The submission of sequences for database searches has utilized the BLAST series of programs (Altschul et al. 1990). Pilot studies (G. Lazo) previously submitted sequences via e-mail using Perl scripts (BLASTmailnr, BLASTmailest; in-house, G. Lazo) to the National Center for Biotechnology Information (NCBI) server, which houses current sequence databases. Sequences will be searched using primarily BLASTN for nucleic acid homology and BLASTX for peptide homology searches. PAM120 scores of >80 were considered to have potentially significant homology (Newman et al., 1994). The nr database contains all non-redundant GenBank, EMBL, DDBJ, and PDB sequences (but no EST, STS, GSS, or HTGS sequences) and contains 2,837,897 sequences comprising 2,008,761,784 bases (release 109.0, Oct. 1998). The dbEST database contains a collection of non-redundant GenBank, EMBL, and DDBJ EST classified sequences with 1,868,590 sequences and 707,570,380 bases. We will continue to utilize the service at NCBI, but are doing additional searches locally for comparisons of the Triticeae EST sequences. Likewise, as EST sequence data are generated by the project, the sequences will be submitted to GenBank using the NCBI Sequin v.2.70 (NCBI, 1998) program. Local Triticeae EST searches will help classify the cDNA populations generated by the project. A Perl script, cleansearch (in-house, G. Lazo), will submit "cleaned" cDNA data to a locally housed Triticeae database and perform BLASTN and BLASTX homology searches (the utility of including searches using other algorithms will be evaluated). This will show clustering within the sequenced populations (by plant, tissue, and condition). There are other programs that work with the phred program as a package (e.g., phrap, cross_match, consed). These programs are mainly used for sequence assembly. Another similar assembly program, TIGR_Assembler (Sutton et al., 1995), is also being evaluated for use.
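The two trimming steps described above, quality trimming and removal of poly-A/poly-T runs, can be illustrated with a minimal sketch. It is not the code of phred, Lucy, or the in-house polytrim script; the function names, the quality cutoff of 20, and the minimum run length of 10 are assumptions chosen for the example.

```python
import re


def trim_low_quality(seq, quals, min_q=20):
    """Keep the span from the first to the last base whose quality is >= min_q."""
    good = [i for i, q in enumerate(quals) if q >= min_q]
    if not good:
        return ""
    return seq[good[0]:good[-1] + 1]


def trim_poly_tails(seq, min_run=10):
    """Remove a leading or trailing run of at least min_run A's or T's."""
    seq = seq.upper()
    run = r"(?:A{%d,}|T{%d,})" % (min_run, min_run)
    seq = re.sub("^" + run, "", seq)   # 5' end
    seq = re.sub(run + "$", "", seq)   # 3' end
    return seq


if __name__ == "__main__":
    print(trim_low_quality("ACGTACGT", [5, 5, 30, 30, 10, 30, 5, 5]))
    print(trim_poly_tails("T" * 12 + "GCATGC" + "A" * 12))
```

Both functions return plain strings, so they can be chained in either order; a production pipeline would also need to trim the quality list in step with the sequence.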
Each sequence will be searched against nucleic acid databases to identify gene classes using basic local alignment search tool programs. The tracking of local sequences will allow for the identification of gene families, over-abundant, and unique sequences. Identifying clusters of sequences will help in establishing library normalization.
The GrainGenes project currently uses the ACEDB (Thierry-Mieg and Durbin 1992; Dunham et al. 1994) program to display BLAST results. A Perl script, blast2ace (in-house, G. Lazo), takes the raw BLASTN or BLASTX output and writes files for the ACEDB environment. Output files are then loaded into the GrainGenes database and linked to relevant data classes. In addition, the BLAST hits are annotated by genome origin (color-coded) and by raw scores and probability values, and are presented in a graphic display (http://wheat.pw.usda.gov/wEST/). The original trace file data may be viewed with the consed package (Gordon et al. 1998) or linked with ACEDB using the Acembly program (Thierry-Mieg and Thierry-Mieg 1998). We will also evaluate whether further processing should be conducted with other gene identification programs (HMM, PSI-BLAST, FASTA-ortholog clustering).
EST homology analyses are, by nature, temporary. As more sequences are deposited in public databases, and as the function of genes becomes better defined, the result of a homology search for a specific EST will change. For this reason periodic database screens must be carried out and information updated. The current plan is to re-analyze all EST data for updates to GrainGenes quarterly, although more frequent updates may be instituted as needed.
Mapping data will be linked into the GrainGenes database by present procedures. It is anticipated that new map displays will be needed, and this will be a priority for ARS bioinformatics staff associated with GrainGenes and comparative genomics projects.
The recording, analysis, and linking of microarray data present a new challenge, and we do not believe there is any consensus on the best approaches. Our intention is to initially use existing software and database resources, but to be alert to what are expected to be rapid changes and improvements. For example, a number of software packages are now becoming available to measure such data (Eisen et al. 1998) (ImageQuant v.5.0/FluorSep v.2.0, Molecular Dynamics; ImaGene, BioDiscovery Inc.; SpotfirePro, Spotfire Inc.). Software such as ScanAlyze 1.5, Cluster, and TreeView (Eisen et al. 1998) can be used to sort, analyze, and visualize microarray data. Such data usually comprise a sequence identification (ID), sequence identity (if resolved from BLAST etc.), and fluorometric/radiometric intensity readings from a variety of experimental hybridizations. The trends exhibited over the tested conditions may be clustered, possibly grouping sequences associated with like function.
The ACEDB program currently used for displaying Triticeae-associated information may be adapted for the display of microarray data. The C. elegans ACEDB contains arrayed hybridization data, and might be adapted to display new data types such as color tables associated with assigned fluorescence intensities of array hybridizations. Each data point in the array is hyperlinked to other associated information for a given clone or probe. The display is primitive at this time, but may be able to incorporate newer displays. If it is not adaptable, then other programs and display tools will be used. As new tools become available, different ways of analyzing the data will be assessed. Eventually there will be extensive use of processed data displays to aid the researcher in mining the enormous amounts of data to be generated. Examples from recent reports include rearranging microarray results according to chromosome position (Winzeler and Davis 1997), grouping into categories based on knowledge of likely roles (Iyer et al. 1999), or grouping by time course of developmental or physiological induction (Chu et al. 1998). Exact determination of variation in expression of each two-color dot (array point) is another challenging task, but tools are in development, e.g., the Fourier transform method to assess periodicity and correlation measurements to determine if a gene is periodically regulated (Sherlock et al., 1999).
Computer Database Enhancements: A number of improvements are planned by ARS bioinformatics staff for the database's WWW interface, including new features specifically for EST data. For example, a novel graphical display is needed for comparing maps of different species in which homologous sequences have been mapped. Other needed enhancements include connections to appropriate gene families in the Mendel database and to metabolic pathway databases. Recently, ARS has decided to shift its basic bioinformatics development effort from its National Agricultural Library in Beltsville to a new center at Ithaca, New York, in a cooperative agreement with the Cornell Computer Theory Center. This new effort will be in addition to relevant existing ARS bioinformatics staff at Cornell (David Matthew and Sam Cartinhour, plus support personnel), who have also recently received additional resources to add curation and programming staff to coordinate improvements in GrainGenes plus two other ARS databases at Cornell, RiceGenes (rice) and SolGenes (Solanaceae).
The ACEDB database system has both significant strengths and weaknesses. As a vertical application designed specifically for genome research, ACEDB has had the richest set of graphical interfaces available for visualizing and interacting with genomic data, especially sequence data, although the display capabilities are becoming dated. It has nevertheless proven robust enough to support much larger projects than the one proposed here, e.g., the entire physical map and sequence of the C. elegans genome. However, the web-accessible version of the database is too limited, and the query language, while extremely powerful, has proven too abstruse for many users. Addressing these weaknesses is a major focus of ARS staff at Cornell and Albany. GrainGenes will also be joining its sister databases at Cornell in developing a parallel version of the database in a relational database management system (RDBMS) such as Oracle. The possibility of a hybrid (ACEDB/RDBMS) system is not excluded. For the RDBMS approach, all graphical interfaces will be written in Java with the WWW as the target delivery mode, and designed to be independent of the underlying database management system. Communication with the database will be via middleware such as CORBA (Dicks et al. 1999), or Java DX, which combines database access flexibility with powerful built-in visualization tools.
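Grouping sequences by the trend of their readings across conditions requires some measure of similarity between expression profiles. The sketch below uses the Pearson correlation coefficient as one common choice; it is an assumption for illustration and does not reproduce the algorithm of any package named above. The clone names and intensity values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no trend to correlate
    return sxy / math.sqrt(sxx * syy)


def most_similar(profiles, query):
    """Rank all other clones by correlation with the query clone, highest first."""
    ranked = [
        (name, pearson(profiles[query], values))
        for name, values in profiles.items()
        if name != query
    ]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    data = {"est1": [1, 2, 4, 8], "est2": [2, 4, 8, 16], "est3": [8, 4, 2, 1]}
    print(most_similar(data, "est1"))
```

Because correlation ignores absolute signal level, two clones with the same induction pattern but different abundance rank as similar, which is usually the behavior wanted when looking for co-regulated genes.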

f) Objective 6: Structure and evolution of the expressed portion of the wheat genomes.

It is anticipated that information on larger-scale organization and evolution of the Triticeae genomes will naturally flow from the other objectives of this proposal. While no specific staff is budgeted specifically for this objective, most of the collaborating laboratories have major interests in this objective and will mine the generated data. Two examples of such interests are given below:

Gene densities: Information on the numbers of loci and approximate numbers of genes per locus will be extracted from the wheat EST singleton mapping database (B.S. Gill and J. Dvorak). The approximate number of genes per locus will be inferred from the relative numbers of bands mapped to a specific deletion. The total number of genes per specific deletion will be divided by the relative length of the deletion, expressed as a percent of the arm length, thereby yielding a number of genes per percent of arm length. If the null hypothesis of homogeneity is true, there will be no statistically significant differences among the deletions across the euchromatic portion of an arm. If there are significant differences, the null hypothesis will be rejected. On the basis of the current information, we expect to find higher gene density in the distal chromosome regions. Estimated gene density in a deletion will be converted into genes/kb.
Distribution of functionally related genes: Data on the chromosomal location (Objective 3) and tentative function of EST singletons (Objective 4) will be extracted from the database (J. Dvorak and B.S. Gill). Loci will be categorized according to their time of expression during the reproductive phase, according to induction by an external stimulus, such as environmental stress, whether or not they encode a house-keeping enzyme, or whether or not they are single copy. Homogeneity of the distribution of loci sharing a common characteristic among the chromosomes and across the chromosome arms will be statistically tested by analysis of variance or chi-square test. If significant differences in the frequencies of loci are found, the null hypothesis of homogeneity will be rejected. Relationships to factors such as recombination rates across the chromosomes, expressed as coefficients of exchange (Dvorak et al. 1998), will be examined. Conservation across maps of grass genomes will be compared between chromosomes or chromosome regions that show clustering of loci sharing a common characteristic and those that do not.
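The density calculation described above amounts to two divisions, sketched here for clarity. The gene count, the deletion length as a percent of the arm, and the physical arm length in kb are hypothetical inputs; the arm length in particular is an assumed figure that would have to come from independent cytological estimates.

```python
def genes_per_percent_arm(n_genes, deletion_percent_of_arm):
    """Genes mapped to a deletion bin divided by the bin length as a percent of the arm."""
    return n_genes / deletion_percent_of_arm


def genes_per_kb(n_genes, deletion_percent_of_arm, arm_length_kb):
    """Convert the bin's gene count to genes per kb, given an assumed physical arm length."""
    bin_length_kb = arm_length_kb * deletion_percent_of_arm / 100.0
    return n_genes / bin_length_kb


if __name__ == "__main__":
    # Hypothetical bin: 30 genes in a deletion spanning 15% of a 400,000 kb arm.
    print(genes_per_percent_arm(30, 15))
    print(genes_per_kb(30, 15, 400000))
```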
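As an illustration of the homogeneity test, the sketch below computes a chi-square statistic for the counts of loci observed in a set of bins against the counts expected if loci were distributed in proportion to bin length. The counts and bin lengths are hypothetical, and the statistic is compared with a tabulated critical value (7.815 for 3 degrees of freedom at the 5% level) rather than converted to a p-value.

```python
def chi_square_homogeneity(observed, bin_lengths):
    """Chi-square statistic and degrees of freedom for loci counts versus bin lengths.

    observed:    number of loci with the characteristic of interest in each bin.
    bin_lengths: relative length of each bin (any consistent unit, e.g. percent of arm).
    """
    total_loci = sum(observed)
    total_length = sum(bin_lengths)
    expected = [total_loci * length / total_length for length in bin_lengths]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1


if __name__ == "__main__":
    stat, df = chi_square_homogeneity([5, 5, 10, 20], [25, 25, 25, 25])
    critical_5pct_df3 = 7.815  # tabulated chi-square critical value, 3 d.f., alpha = 0.05
    print(stat, df, stat > critical_5pct_df3)
```

With few loci per bin the expected counts become small and the chi-square approximation weakens, so bins may need to be pooled before testing.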

7) ROLES OF PARTICIPANTS

The role of each participant in the overall scheme of the project is indicated in Figure 1. Tim Close will be responsible for the acquisition of existing cDNA libraries and development of new cDNA libraries. Olin Anderson will be responsible for EST sequencing, microarray development and distribution, and for bioinformatics (along with Mark Sorrells). Henry Nguyen and Olin Anderson will be responsible for library normalization. Bikram Gill will distribute seeds of the deletion lines to the mapping laboratories (Jim Anderson, Jorge Dubcovsky, Jan Dvorak, Bikram Gill, Kulvinder Gill, Shahryar Kianian, Nora Lapitan, and Mark Sorrells) and coordinate the deletion mapping work and publication of these results. Coordination of each chromosome group will be handled by one coPI per group (as described above). Jan Dvorak will coordinate genome structure studies among the coPIs. The functional genomics area will include the laboratories of Olin Anderson, Jim Anderson, Jorge Dubcovsky, Jan Dvorak, Bikram Gill, Kulvinder Gill, Shahryar Kianian, Mary-Kay Walker-Simmons, and Mark Sorrells, who will coordinate the functional genomics aspects of the proposal.

8) STUDENT TRAINING AND UNDERREPRESENTED GROUPS

As indicated in the research plan and budget, staffing for this proposal is mainly via graduate students and postdoctoral positions. The mapping will be performed by postdoctoral staff working half time on mapping (with the other half time spent on aspects of structural and functional genomics), along with undergraduate students. The graduate students will concentrate on functional genomics aspects of the proposal according to the specific research emphasis of each laboratory. Training of both graduate students and postdoctoral researchers will be via laboratory training and the following: 1) A yearly 2-day meeting of all members of this research project will afford the opportunity to present research results and to consult and coordinate activities, 2) Yearly ITMI meetings will be modified to include updates and planning of international collaborations and will be attended by all postdocs, 3) Training sessions with project bioinformatics staff will acquaint staff with basic bioinformatics and provide training in data formatting and quality control, 4) Rotations of staff through other laboratories within the project will allow a broader exposure to other aspects of the project. Initial plans include four-week rotations of mapping and functional genomics staff through either the Albany or Cornell sites, where bioinformatics staff will be located. The PIs of this proposal have records of hiring from under-represented groups. In addition to their normal search for such individuals, particular attention will be given to notification of positions to both local and national schools appropriate for such recruitments.

9) RESPONSES TO CRITICISMS OF THE PREVIOUS SUBMISSION

This proposal was submitted, but declined, in the previous cycle of this NSF program. The principal criticisms (italics) and our responses follow. (1) The methodology of EST microarrays was not sufficiently developed - this section has been expanded and updated. (2) Bioinformatics section was inadequate and the capability of the ACEDB database system to handle EST data was questioned - more details on the bioinformatics plans for both EST and microarray data are given, along with plans for the ACEDB platform. (3) Two bioinformatics staff are excessive and a single programmer should be substituted. We disagree: the coordination of this proposal, the formatting and analysis of the data, maintaining connections to other genomics and database projects, and the anticipated need for some level of new software development will require at least the two staff originally proposed, and these positions have been left in the proposal. (4) The QTL candidate gene approach was questioned - this section was removed from the proposal. (5) The details of the evolutionary genetic approach were absent from the proposal although it was in the title and highlighted in the abstract - a section was added describing examples of the approaches to evolutionary genetics. (6) The budget and salary descriptions were inadequate - besides checking the budget numbers closely, a text section describes the staffing rationale and budget. (7) The descriptions of library construction and normalization were sketchy - more details were added to explain our approaches to these objectives. (8) The training component was not described adequately - more details were included on student and post-doctoral staff training. (9) The functional genomics part had no focus - the abiotic stress component per se was dropped and is being covered in a separate proposal to be submitted; this proposal will concentrate its functional genomics aspect on tissues and conditions related to reproduction.

10) TIMETABLE

Library acquisition and evaluation will occur in the first year. Construction of new libraries will be completed within 18 months. Normalization of libraries will be completed within the first two years. Sequencing of ESTs will be largely completed by the end of the second year. EST mapping will commence as soon as possible and continue throughout the term of the project. Microarray preparation and all functional genomics activities will begin as soon as possible and continue through the project term. Bioinformatics activity will be continuous through the term of the project.

11) REFERENCES CITED

Adams, M. D., M. Dubnick, A. R. Kerlavage, R. Moreno, J. M. Kelley, T. R. Utterback, J. W. Nagle, C. Fields and J. C. Venter, 1992 Sequence identification of 2375 human brain genes. Nature 355:632-634.

Adams, M. D., J. M. Kelley, J. D. Gocayne, M. Dubnick, M. H. Polymeropoulos, H. Xiao, C. R. Merril, A. Wu, B. Olde, R. F. Moreno et al., 1991b Complementary DNA sequencing: Expressed sequence tags and human genome project. Science 252:1651-1656.

Adams, M. D., J. M. Kelley, J. D. Gocayne, M. Dubnick, M. H. Polymeropoulos, H. Xiao, C. R. Merril, A. Wu, B. Olde, R. F. Moreno et al., 1991a Complementary DNA sequencing: Expressed sequence tags and human genome project. Science 252:1651-1656.

Ahn, S. N., J. A. Anderson, M. E. Sorrells and S. D. Tanksley, 1993 Homoeologous relationships of rice, wheat and maize chromosomes. Mol. Gen. Genet. 241:483-490.

Ahn, S. N., and S. D. Tanksley, 1993 Comparative linkage maps of the rice and maize genomes. Proc. Natl. Acad. Sci. USA 90:7980-7984.

Ajioka, J. W., J. C. Boothroyd, B. P. Brunk, A. Hehl, L. Hillier, I. D. Manger, M. Marra, G. C. Overton, D. S. Roos, K. L. Wan et al., 1998 Gene discovery by EST sequencing in Toxoplasma gondii reveals sequences restricted to the Apicomplexa. Genome Res. 8:18-28.

Altschul, S. F., W. Gish, W. Miller, E. W. Myers and D. J. Lipman, 1990 Basic local alignment search tool. J. Mol. Biol. 215:403-410.

Anderson, J. A., M. E. Sorrells and S. D. Tanksley, 1993 RFLP analysis of genomic regions associated with resistance to preharvest sprouting in wheat. Crop Sci. 33:453-459.

Anderson, O. D., F. C. Greene, R. E. Yip, N. G. Halford, P. R. Shewry and J.-M. Malpica-Romero, 1989 Nucleotide sequences of the two high-molecular-weight glutenin genes from the D genome of hexaploid wheat, Triticum aestivum L. cv. Cheyenne. Nucleic Acids Res. 17:461-462.

Aragón-Alcaide, L., S. Reader, T. Miller and G. Moore, 1997 Centromeric behaviour in wheat with high and low homoeologous chromosomal pairing. Chromosoma 106:327-33.

Benson, D. A., M. S. Boguski, D. J. Lipman, J. Ostell and B. F. Ouellette, 1998 GenBank. Nucleic Acids Res. 26:1-7.

Bernacchi, D., and S. D. Tanksley, 1997 An interspecific backcross of Lycopersicon esculentum x L. hirsutum: linkage analysis and a QTL study of sexual compatibility factors and floral traits. Genetics 147:861-77.

Bevan, M., I. Bancroft, E. Bent, K. Love, H. Goodman, C. Dean, R. Bergkamp, W. Dirkse, M. Van Staveren, W. Stiekema et al., 1998 Analysis of 1.9 Mb of contiguous sequence from chromosome 4 of Arabidopsis thaliana. Nature 391:485-8.

Bishop, J. O., J. G. Morton, M. Rosbash and M. Richardson, 1974 Three abundance classes in HeLa-cell messenger-RNA. Nature 250:199-204.

Bonaldo, M. F., G. Lennon and M. B. Soares, 1996 Normalization and subtraction: two approaches to facilitate gene discovery. Genome Res. 6:791-806.

Busch, R. H., and S. S. Maan, 1978 Effects of alien cytoplasms on agronomic and bread making traits of two spring wheat cultivars. Crop Sci. 18:864-866.

Cassidy, B. G., and J. Dvorak, 1991 Molecular characterization of a low-molecular-weight glutenin cDNA clone from Triticum durum. Theor. Appl. Genet. 81:653-660.

Cassidy, B. G., J. Dvorak and O. D. Anderson, 1998 The wheat low-molecular-weight glutenin genes: characterization of six new genes and progress in understanding gene family structure. Theor. Appl. Genet. 96:743-750.

Chao, S., P. J. Sharp, A. J. Worland, E. J. Warham, R. M. D. Koebner and M. D. Gale, 1989 RFLP-based genetic maps of wheat homoeologous group 7 chromosomes. Theor. Appl. Genet. 78:495-504.

Chee, M., R. Yang, E. Hubbell, A. Berno, X. C. Huang, D. Stern, J. Winkler, D. J. Lockhart, M. S. Morris and S. P. A. Fodor, 1996 Accessing genetic information with high-density DNA arrays. Science 274:610-614.

Chen, K. C., and J. Dvorak, 1984 The inheritance of genetic variation in Triticum speltoides affecting heterogenetic chromosome pairing in hybrids with Triticum aestivum. Can. J. Genet. Cytol. 26:279-287.

Cho, R. J., M. J. Campbell, E. A. Winzeler, L. Steinmetz, A. Conway, L. Wodicka, T. G. Wolfsberg, A. E. Gabrielian, D. Landsman, D. J. Lockhart et al., 1998 A genome-wide transcriptional analysis of the mitotic cell cycle. Mol. Cell 2:65-73.

Chou, H.-H., 1998 Lucy. Proceedings of the Tenth Annual Genome Sequencing and Annotation Conference (GSAC X), Miami, FL, September 1998.

Chouard, P., 1960 Vernalization and its relation to dormancy. Ann. Rev. Plant Physiol. 11:191-238.

Chu, S., J. DeRisi, M. Eisen, J. Mulholland, D. Botstein, P. O. Brown and I. Herskowitz, 1998 The transcriptional program of sporulation in budding yeast. Science 282:699-705.

Clegg, M. T., B. S. Gaut, G. H. Learn, Jr. and B. R. Morton, 1994 Rates and patterns of chloroplast DNA evolution. Proc. Natl. Acad. Sci. USA 91:6795-801.

Coe, E. H., M. G. Neuffer and D. A. Hoisington, 1988 The genetics of corn, in Corn and Corn Improvement, edited by G. F. Sprague and J. W. Dudley. American Society of Agronomy, Madison, WI.

Cooke, R., M. Raynal, M. Laudie, F. Grellet, M. Delseny, P. C. Morris, D. Guerrier, J. Giraudet, F. Quigley, G. Clabault et al., 1996 Further progress towards a catalogue of all Arabidopsis genes: analysis of a set of 5000 non-redundant ESTs. Plant J. 9:101-124.

Curtis, C. A., and A. J. Lukaszewski, 1991 Metaphase I pairing of deficient chromosomes and genetic mapping of deficient breakpoints in common wheat. Genome 34:553-560.

Dear, S., and R. Staden, 1992 A standard file format for data from DNA sequencing instruments. DNA Sequence 3:107-110.

Delaney, D. E., S. Nasuda, T. R. Endo, B. S. Gill and S. H. Hulbert, 1995a Cytologically based physical maps of the group-2 chromosomes of wheat. Theor. Appl. Genet. 91:568-573.

Delaney, D. E., S. Nasuda, T. R. Endo, B. S. Gill and S. H. Hulbert, 1995b Cytologically based physical maps of the group 3 chromosomes of wheat. Theor. Appl. Genet. 91:780-782.

DePauw, R. M., and T. N. McCaig, 1983 Recombining dormancy and white seed color in a spring wheat cross. Can. J. Plant Sci. 63:581-589.

DeRisi, J. L., V. R. Iyer and P. O. Brown, 1997 Exploring the metabolic and genetic control of gene expression on a genomic scale. Science 278:680-686.

Desprez, T., J. Amselem, M. Caboche and H. Höfte, 1998 Differential gene expression in Arabidopsis monitored using cDNA arrays. The Plant J. 14:643-652.

Devos, K. M., M. D. Atkinson, C. N. Chinoy, H. A. Francis, R. L. Harcourt, R. M. D. Koebner, C. J. Liu, P. Masoje, D. X. Xie and M. D. Gale, 1993a Chromosomal rearrangements in the rye genome relative to that of wheat. Theor. Appl. Genet. 85:673-680.

Devos, K. M., M. D. Atkinson, C. N. Chinoy, C. Liu and M. D. Gale, 1992 RFLP based genetic maps of the homoeologous group 3 chromosomes of wheat and rye. Theor. Appl. Genet. 83:931-939.

Devos, K. M., S. Chao, Q. Y. Li, M. C. Simonetti and M. D. Gale, 1994 Relationship between chromosome 9 of maize and wheat homeologous group 7 chromosomes. Genetics 138:1287-1292.

Devos, K. M., J. Dubcovsky, J. Dvorak, C. N. Chinoy and M. D. Gale, 1995a Structural evolution of wheat chromosomes 4A, 5A, and 7B and its impact on recombination. Theor. Appl. Genet. 91:282-288.

Devos, K. M., and M. D. Gale, 1993 Extended genetic maps of the homoeologous group 3 chromosomes of wheat, rye, and barley. Theor. Appl. Genet. 85:649-652.

Devos, K. M., and M. D. Gale, 1997 Comparative genetics in the grasses. Plant Mol. Biol. 35:3-15.

Devos, K. M., T. Milan and M. D. Gale, 1993b Comparative RFLP maps of the homoeologous group-2 chromosomes of wheat, rye, and barley. Theor. Appl. Genet. 85:784-792.

Devos, K. M., G. Moore and M. D. Gale, 1995b Conservation of marker synteny during evolution. Euphytica 85:367-372.

Devos, K. M., Z. M. Wang, J. Beales, Y. Sasaki and M. D. Gale, 1998 Comparative genetic maps of foxtail millet Setaria italica and rice Oryza sativa. Theor. Appl. Genet. 96:63-68.

Dicks, J., M. Anderson, L. Cardle, S. Cartinhour, M. Couchman, J. Dickson, M. Gale, D. Marshall, H. McWilliam, A. O'Malia et al., 1999 UK CropNet and CORBA, p. 62 in Plant and Animal Genome VII, San Diego. http://www.intl-pag.org/pag/7/abstracts/pg7180.html/.

Dover, G. A., and R. Riley, 1972 Prevention of pairing of homoeologous meiotic chromosomes of wheat by supernumerary chromosomes of Aegilops. Nature 240:159-161.

Dubcovsky, J., M. Echeide, F. Giancola, M. Rousset, M. C. Luo, L. R. Joppa and J. Dvorak, 1997 Seed storage protein loci and RFLP maps of diploid, tetraploid, and hexaploid wheat. Theor. Appl. Genet. 95:1169-1180.

Dubcovsky, J., D. Lijavetzky, L. Appendino and G. Tranquilli, 1998 Comparative RFLP mapping of Triticum monococcum genes controlling vernalization requirement. Theor. Appl. Genet. 97:968-975.

Dubcovsky, J., M. C. Luo and J. Dvorak, 1995a Linkage relationships among stress-induced genes in wheat. Theor. Appl. Genet. 91:795-801.

Dubcovsky, J., M. C. Luo, G. Y. Zhong, R. Bransteitter, A. Desai, A. Kilian, A. Kleinhofs and J. Dvorak, 1996 Genetic map of diploid wheat, Triticum monococcum L., and its comparison with maps of Hordeum vulgare L. Genetics 143:983-999.

Dubcovsky, J., M.-C. Luo and J. Dvorak, 1995b Differentiation between homoeologous chromosomes 1A of wheat and 1Am of Triticum monococcum and its recognition by the wheat Ph1 locus. Proc. Natl. Acad. Sci. USA 92:6645-6649.

Dunham, I., R. Durbin, J. Thierry-Mieg and D. R. Bentley, 1994 Physical mapping projects and ACEDB, pp. 111-158 in Guide to Human Genome Computing, edited by M. J. Bishop. Academic Press.

Dvorak, J., 1972 Genetic variability in Aegilops speltoides affecting homoeologous pairing in wheat. Can. J. Genet. Cytol. 14:371-380.

Dvorak, J., 1987 Chromosomal distribution of genes in Elytrigia elongata which promote or suppress pairing of wheat homoeologous chromosomes. Genome 29:34-40.

Dvorak, J., and R. Appels, 1986 Investigation of homologous crossing over and sister chromatid exchange in the wheat Nor-2 locus coding for rRNA and Gli-B2 locus coding for gliadins. Genetics 113:1037-1056.

Dvorak, J., and K.-C. Chen, 1984 Distribution of nonstructural variation between wheat cultivars along chromosome arm 6Bp: Evidence from the linkage map and physical map of the arm. Genetics 106:325-333.

Dvorak, J., P. di Terlizzi, H. B. Zhang and P. Resta, 1993 The evolution of polyploid wheats: Identification of the A genome donor species. Genome 36:21-31.

Dvorak, J., M.-C. Luo and Z.-L. Yang, 1998 Restriction fragment length polymorphism and divergence in the genomic regions of high and low recombination in self-fertilizing and cross-fertilizing Aegilops species. Genetics 148:423-434.

Dvorak, J., and H. B. Zhang, 1990 Variation in repeated nucleotide sequences sheds light on the phylogeny of the wheat B and G genomes. Proc. Natl. Acad. Sci. USA 87:9640-9644.

Eisen, M. B., P. T. Spellman, P. O. Brown and D. Botstein, 1998 Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. USA 95:14863-14868.

Endo, T. R., and B. S. Gill, 1996 The deletion stocks of common wheat. Journal of Heredity 87:295-307.

Ewing, B., and P. Green, 1998 Basecalling of automated sequencer traces using phred. II. Error probabilities. Genome Research 8:186-194.

Ewing, B., L. Hillier, M. Wendl and P. Green, 1998 Basecalling of automated sequencer traces using phred. I. Accuracy assessment. Genome Research 8:175-185.

Feldman, M., 1993 Cytogenetic activity and mode of action of the pairing homoeologous (Ph1) gene of wheat. Crop Sci. 33:894-897.

Feldman, M., and L. Avivi, 1988 Genetic control of bivalent pairing in common wheat: the mode of Ph1 action, pp. 269-279 in Kew Chromosome Conference, edited by P. E. Brandham.

Galau, G. A., W. H. Klein, R. J. Britten and E. H. Davidson, 1977 Significance of rare mRNA sequences in liver. Arch. Biochem. Biophys. 179:584-599.

Galili, G., and M. Feldman, 1984 Mapping of glutenin and gliadin genes located on chromosome 1B of common wheat. Mol. Gen. Genet. 198:293-298.

Gao, L., and W. Bushuk, 1993 Polymeric glutenin of wheat lines with varying number of high molecular weight glutenin subunits. Cereal Chemistry 70:475-480.

Gill, B. S., B. Friebe and T. R. Endo, 1991a Standard karyotype and nomenclature system for description of chromosome bands and structural aberrations in wheat (Triticum aestivum). Genome 34:830-839.

Gill, K. S., B. S. Gill and T. R. Endo, 1993 A chromosome region-specific mapping strategy reveals gene-rich telomeric ends in wheat. Chromosoma 102:374-381.

Gill, K. S., B. S. Gill, T. R. Endo and E. V. Boyko, 1996a Identification and high-density mapping of gene-rich regions in chromosome group 5 of wheat. Genetics 143:1001-1012.

Gill, K. S., B. S. Gill, T. R. Endo and T. Taylor, 1996b Identification and high-density mapping of gene rich regions in chromosome group 1 of wheat. Genetics 144:1883-1891.

Gill, K. S., D. Hassawi, W. J. Raupp, A. K. Fritz, B. S. Gill, T. S. Cox and R. G. Sears, 1992 An updated genetic linkage map of Triticum tauschii, the D-genome progenitor of wheat, pp. 27-29 in Progress in genome mapping of wheat and related species. Proceedings of the 2nd International Triticeae Mapping Initiative, edited by B. S. Gill, W. J. Raupp and H. Corke. Genetic Conservation Program of University of California, Manhattan, Kansas.

Gill, K. S., E. L. Lubbers, B. S. Gill, W. J. Raupp and T. S. Cox, 1991b A genetic linkage map of Triticum tauschii (DD) and its relationship to the D genome of bread wheat. Genome 34:362-374.

Gordon, D., C. Abajian and P. Green, 1998 Consed: A graphical tool for sequence finishing. Genome Research 8:195-202.

Green, P., 1998 swat/cross_match/phrap package, http://bozeman.mbt.washington.edu/phrap.docs/phrap.html.

Gupta, R. B., J. G. Paul, G. B. Cornish, G. A. Palmer, F. Bekes and A. J. Rathjen, 1994 Allelic variation at glutenin subunit and gliadin loci, Glu-1, Glu-3 and Gli-1, of common wheats. 1. Its additive and interaction effects on dough properties. J. Cereal Sci. 19:9-17.

Gupta, R. B., N. K. Singh and K. W. Shepherd, 1989 The cumulative effects of allelic variation in LMW and HMW glutenin subunits on dough properties in the progeny of two bread wheats. Theor. Appl. Genet. 77:57-64.

Halford, N. G., J. M. Field, H. Blair, P. Urwin, K. Moore, L. Robert, R. Thompson, R. B. Flavell, A. S. Tatham and P. R. Shewry, 1992 Analysis of HMW glutenin subunits encoded by chromosome-1A of bread wheat (Triticum aestivum L.) indicates quantitative effects on grain quality. Theor. Appl. Genet. 83:373-378.

Harushima, Y., M. Yano, P. Shomura, M. Sato, T. Shimano, Y. Kuboki, T. Yamamoto, S. Y. Lin, B. A. Antonio, A. Parco et al., 1998 A high-density rice genetic linkage map with 2275 markers using a single F-2 population. Genetics 148:479-494.

Hillier, L. D., G. Lennon, M. Becker, M. F. Bonaldo, B. Chiapelli et al., 1996 Generation and analysis of 280,000 human expressed sequence tags. Genome Res. 6:807-828.

Höfte, H., T. Desprez, J. Amselem et al., 1993 An inventory of 1152 expressed sequence tags obtained by partial sequencing of cDNAs from Arabidopsis thaliana. Plant J. 4:1051-1061.

Hohmann, U., T. R. Endo, K. S. Gill and B. S. Gill, 1994 Comparison of genetic and physical maps of group 7 chromosomes from Triticum aestivum L. Mol. Gen. Genet. 245:644-653.

Holt, L. M., R. Astin and P. I. Payne, 1981 Structural and genetic studies on the high-molecular-weight subunits of wheat glutenin. 2. Relative isoelectric points determined by two dimensional fractionation in polyacrylamide gels. Theor. Appl. Genet. 60:237-243.

Höög, C., 1991 Isolation of a large number of novel mammalian genes by a differential screening strategy. Nucl. Acids Res. 19:6123-6127.

Hulbert, S. H., T. E. Richter, J. D. Axtell and J. L. Bennetzen, 1990 Genetic mapping and characterization of sorghum and related crops by means of maize DNA probes. Proc. Natl. Acad. Sci. USA. 87:4251-4255.

Iyer, V. R., M. B. Eisen, D. T. Ross, G. Schuler, T. Moore, J. C. F. Lee, J. M. Trent, L. M. Staudt, J. J. Hudson, M. S. Boguski et al., 1999 The transcriptional program in the response of human fibroblasts to serum. Science 283:83-87.

Jackson, E. A., L. M. Holt and P. I. Payne, 1983 Characterization of high molecular weight gliadin and low-molecular-weight glutenin subunits of wheat endosperm by two-dimensional electrophoresis and the chromosomal localisation of their expressed genes. Theor. Appl. Genet. 66:29-37.

Jampates, R., and J. Dvorak, 1986 Location of the Ph1 locus in the metaphase chromosome map and the linkage map of the 5Bq arm of wheat. Canad. J. Genet. Cytol. 28:511-519.

Jende-Strid, B., 1991 Gene-enzyme relations in the pathway of flavonoid biosynthesis in barley. Theor. Appl. Genet. 81:668-674.

Jia, J., K. M. Devos, S. Chao, T. E. Miller, S. M. Reader and M. D. Gale, 1996 RFLP-based maps of the homoeologous group-6 chromosomes of wheat and their application in the tagging of Pm12, a powdery mildew resistance gene transferred from Aegilops speltoides to wheat. Theor. Appl. Genet. 92:559-565.

Kidwell, M. G., and D. Lisch, 1997 Transposable elements as sources of variation in animals and plants. Proc. Natl. Acad. Sci. USA 94:7704-11.

Kihara, H., 1944 Discovery of the DD-analyser, one of the ancestors of Triticum vulgare (Japanese). Agric. and Hort. (Tokyo) 19:13-14.

Kopczynski, C. C., J. N. Noordermeer, T. L. Serano, W.-Y. Chen, J. D. Pendleton, S. Lewis, C. S. Goodman and G. M. Rubin, 1998 A high throughput screen to identify novel secreted and transmembrane proteins involved in Drosophila embryogenesis. Proc. Natl. Acad. Sci. USA 95:9973-9978.

Kota, R. S., K. S. Gill, B. S. Gill and T. R. Endo, 1993 A cytogenetically based physical map of chromosome-1B in common wheat. Genome 36:548-554.

Kurata, N., G. Moore, T. Nagamura, T. Foote, Y. Yano, Y. Minobe and M. Gale, 1994a Conservation of genome structure between rice and wheat. Bio/Technology 12:276-278.

Kurata, N., Y. Nagamura, K. Yamamoto, Y. Harushima, N. Sue, J. Wu, B. A. Antonio, A. Shomura, T. Shimizu, S. Y. Lin et al., 1994b A 300 Kilobase Interval Genetic Map of Rice Including 883 Expressed Sequences. Nature Genetics 8:365-372.

Lagudah, E. S., R. Appels and A. D. H. Brown, 1991 The molecular-genetic analysis of Triticum tauschii, the D genome donor to hexaploid wheat. Genome 36:913-918.

Lazo, G. R., L. A. Larka, C. C. Hsia, K. F. McCue, M. E. Sorrells, D. E. Matthews, M. Au, N. A. Federspiel and O. D. Anderson, 1998 Assigning putative gene functions to mapped probe loci in the GrainGenes genome database and sequencing of wheat endosperm cDNAs. Plant and Animal Genome VI Conference. Abstract: http://www.intl-pag.org/pag/6/abstracts/404.html.

Liu, Y.-G., and K. Tsunewaki, 1991 Restriction fragment length polymorphism (RFLP) analysis in wheat. II. Linkage analysis of the RFLP sites in common wheat. Jap. J. Genet. 66:617-633.

Love, A., 1984 Conspectus of the Triticeae. Feddes Repertorium 95:425-521.

Lukaszewski, A. J., and C. A. Curtis, 1993 Physical distribution of recombination in B-genome chromosomes of tetraploid wheat. Theor. Appl. Genet. 84:121-127.

Luo, M. C., J. Dubcovsky and J. Dvorak, 1996 Recognition of homeology by the wheat Ph1 locus. Genetics 143:1195-1203.

Maan, S. S., 1992a A gene for embryo-endosperm compatibility and seed viability in alloplasmic Triticum turgidum. Genome 35:772-779.

Maan, S. S., 1992b The scs and Vi genes correct a syndrome of cytoplasmic effects in alloplasmic durum wheat. Genome 35:780-787.

Marino, C. L., Y. H. Nelson, Y. H. Lu, M. E. Sorrells, P. Leroy, N. A. Tuleen, C. R. Lopes and G. E. Hart, 1996 Molecular genetic maps of the group 6 chromosomes of hexaploid wheat (Triticum aestivum L. emend. Thell.). Genome 39:359-366.

Marshall, A., and J. Hodgson, 1998 DNA chips: An array of possibilities. Nature Biotechnol. 16:27-31.

McCombie, W. R., M. D. Adams, J. M. Kelley, M. G. FitzGerald, T. R. Utterback, M. Khan, M. Dubnick, A. R. Kerlavage, J. C. Venter and C. Fields, 1992 Caenorhabditis elegans expressed sequence tags identify gene families and potential disease gene homologues. Nature Genet. 1:124-131.

McFadden, E. S., and E. R. Sears, 1946 The origin of Triticum spelta and its free-threshing hexaploid relatives. Journal of Heredity 37:81-89, 107-116.

McGuire, P. E., and C. O. Qualset, 1997 Progress in Genome Mapping of Wheat and Related Species. Genetic Resources Conservation Program, Division of Agriculture and Natural Resources, University of California, Norwich, England and Sydney, Australia.

Mello-Sampayo, T., 1973 Somatic association of telocentric chromosomes carrying homologous centromeres in common wheat. Theor. Appl. Genet. 43:178-181.

Metzger, J. D., 1988 Localization of the site of perception of thermoinductive temperatures in Thlaspi arvense L. Plant Physiol. 88:424-428.

Mickelson-Young, L., T. R. Endo and B. S. Gill, 1995 A cytogenetic ladder-map of the wheat homoeologous group-4 chromosomes. Theor. Appl. Genet. 90:1007-1011.

Moore, G., 1998 To pair or not to pair: chromosome pairing and evolution. Curr. Opin. Plant Biol. 1:116-122.

Moore, G., K. M. Devos, Z. Wang and M. D. Gale, 1995 Cereal genome evolution - grasses, line up and form a circle. Curr. Biol. 5:737-739.

Nahm, B. H., S. H. Jeong, W. S. Kim, J.-K. Kim, S. J. Suk, M. C. Lee, Y. M. Park, K. Y. Kang, S. I. Kim and M. Y. Eun, 1998 Random sequencing analysis of rice immature seed ESTs from normal and subtracted libraries. Plant and Animal Genome VI Conference. Abstract: http://www.intl-pag.org/pag/6/abstrcts/168.html.

Nelson, J. C., M. E. Sorrells, A. E. Van Deynze, Y. H. Lu, M. Atkinson, M. Bernard, P. Leroy, J. D. Faris and J. A. Anderson, 1995a Molecular mapping of wheat. Major genes and rearrangements in homoeologous groups 4, 5, and 7. Genetics 141:721-731.

Nelson, J. C., A. E. Van Deynze, E. Autrique, M. E. Sorrells, Y. H. Lu, M. Merlino, M. Atkinson and P. Leroy, 1995b Molecular mapping of wheat. Homoeologous group 2. Genome 38:525-533.

Newman, T., F. J. de Bruijn, P. Green, K. Keegstra, H. Kende, L. McIntosh, J. Ohlrogge, N. Raikhel, S. Somerville, M. Thomashow et al., 1994 Genes galore: a summary of the methods for accessing the results of large scale partial sequencing of anonymous Arabidopsis thaliana cDNA clones. Plant Physiol. 106:1241-1255.

Nishikawa, K., 1983 Species relationship of wheat and its putative ancestors as viewed from isozyme variation, pp. 59-63 in Proc. 6th Int. Wheat Genet. Symp., Kyoto.

Noda, K., and D. J. Mares, 1995 Preharvest Sprouting in Cereals 1995. Center for Academic Societies Japan, Osaka.

Palmer, J. D., 1990 Contrasting modes and tempos of genome evolution in land plant organelles. Trends in Genet. 6:115-120.

Payne, P. I., 1987 Genetics of wheat storage proteins and the effect of allelic variation on bread-making quality. Ann. Rev. Pl. Physiol. 38:141-153.

Payne, P. I., K. G. Corfield, L. M. Holt and J. A. Blackman, 1981 Correlation between the inheritance of certain high-molecular weight subunits of glutenin and bread-making quality in progenies of six crosses of bread wheat. J. Sci. Food. Agric. 32:51-60.

Payne, P. I., J. A. Seekings, A. J. Worland, M. G. Jarvis and L. M. Holt, 1987 Allelic variation of glutenin subunits and gliadins and its effect on breadmaking quality in wheat: Analysis of F5 progeny from Chinese Spring x Chinese Spring (Hope 1A). J. Cereal Sci. 6:103-118.

Rafalski, J. A., M. Hanafey, G.-H. Miao, M. Dolan and S. V. Tingey, 1998 Electronic northerns: soybean gene expression information from EST data. Plant and Animal Genome VI Conference. Abstract: http://www.intl-pag.org/pag/6/abstracts/424.html.

Ramsay, G., 1998 DNA chips: State-of-the-art. Nature Biotechnol. 16:40-44.

Riley, R., and V. Chapman, 1958 Genetic control of the cytologically diploid behaviour of hexaploid wheat. Nature 182:713-715.

Riley, R., and C. N. Law, 1965 Genetic variation in chromosome pairing. Adv. Genet. 13:57-114.

Rounsley, S. D., A. Glodek, G. Sutton, M. D. Adams, C. R. Somerville, J. C. Venter and A. R. Kerlavage, 1996 The construction of Arabidopsis expressed sequence tag assemblies. Plant Physiol. 112:1177-1183.

Ruan, Y., J. Gilmore and T. Conner, 1998 Towards Arabidopsis genome analysis: monitoring expression profiles of 1400 genes using cDNA microarrays. The Plant J. 15:821-833.

Saghai Maroof, M. A., G. P. Yang, R. M. Biyashev, P. J. Maughan and Q. Zhang, 1996 Analysis of the barley and rice genomes by comparative RFLP linkage mapping. Theor. Appl. Genet. 92:541-551.

Sasaki, T., J. Song, Y. Koga-Ban, E. Matsui et al., 1994 Toward cataloguing all rice genes: large-scale sequencing of randomly chosen rice cDNAs from a callus cDNA library. Plant J. 6:615-624.

Schena, M., D. Shalon, R. Heller, A. Chai, P. O. Brown and R. W. Davis, 1996 Parallel human genome analysis: Microarray-based expression monitoring of 1000 genes. Proc. Natl. Acad. Sci. USA 93:10614-10619.

Sears, E. R., 1954 The aneuploids of common wheat. Research Bull. Univ. Missouri Agric. Exper. Station 572:1-59.

Sears, E. R., 1966 Nullisomic-tetrasomic combinations in hexaploid wheat, pp. 29-44 in Chromosome Manipulations and Plant Genetics, edited by R. Riley, and K. R. Lewis. Oliver & Boyd, Edinburgh.

Sears, E. R., 1976 Genetic control of chromosome pairing in wheat. Ann. Rev. Genet. 10:31-51.

Sears, E. R., and M. Okamoto, 1958 Intergenomic chromosome relationships in hexaploid wheat, pp. 258-259 in Proc. X Int. Congress Genet.

Sherman, J. D., A. L. Fenwick, D. M. Namuth and N. L. V. Lapitan, 1995 A barley RFLP map: alignment of three barley maps and comparisons to Gramineae species. Theor. Appl. Genet. 91:681-690.

Shoemaker, R. C., 1998 Welcome to the soybean EST project home page. http://129.186.26.94/soybeanest.html.

Soares, M. B., M. F. Bonaldo, P. Jelenc, L. Su, L. Lawton and A. Efstratiadis, 1994 Construction and characterization of a normalized cDNA library. Proc. Natl. Acad. Sci. USA 91.

Soderlund, C., I. Longden and R. Mott, 1997 FPC: a system for building contigs from restriction fingerprinted clones. CABIOS 13:523-535.

Soper, J. F., R. G. Cantrell and J. W. Dick, 1989 Sprouting damage and kernel color relationships in durum wheat. Crop Sci. 29:895-898.

Sorrells, M. E., and J. A. Anderson, 1996 Quantitative trait loci associated with preharvest sprouting in white wheat, in Seventh International Symposium on Preharvest Sprouting in Cereals, July 2-7, 1995, Abashiri, Japan.

Staden, R., 1998 makeSCF. ftp://ftp.mrc-lmb.cam.ac.uk/pub/staden/src/makeSCF.c.

Takahashi, N., and M. S. Ko, 1994 Toward a whole cDNA catalog: construction of an equalized cDNA library from mouse embryos. Genomics 23:202-210.

Tanksley, S. D., M. W. Ganal and G. B. Martin, 1995 Chromosome landing: a paradigm for map-based gene cloning in plants with large genomes. Trends in Genet. 11:63-8.

Tao, H. P., A. E. Adelsteins and D. D. Kasarda, 1992 Intermolecular disulphide bonds link specific high-molecular- weight glutenin subunits in wheat endosperm. Biochimica et Biophysica Acta 1159:13-21.

Thierry-Mieg, D., and J. Thierry-Mieg, 1998 Acembly. http://alpha.crbm.cnrs-mop.fr/acembly/.

Thierry-Mieg, J., and R. Durbin, 1992 ACEDB: A C. elegans Database. http://probe.nalusda.gov:8000/acedocs/users_guide.html.

Thomas, B., 1994 Internal and external controls on flowering. Oxon, UK, CAB International.

Tranquilli, G., and J. Dubcovsky, 1999 Interactions between vernalization genes of Triticum monococcum, in Plant and Animal Genome VII Conference. http://www.intl-pag.org/pag/7/abstracts/pag7351.html/.

Tsuji, S., and S. S. Maan, 1981 Differential fertility and transmission of male and female gametes in alloplasmic wheat hybrids. Can. J. Genet. Cytol. 23:337-348.

Uchimiya, H., S. Kidou, T. Shimazaki, S. Takamatsu, H. Hashimoto, R. Nishi, S. Aotsuka, Y. Matsubayashi, N. Kidou, M. Umeda et al., 1992 Random sequencing of cDNA libraries reveals a variety of expressed genes in cultured cells of rice (Oryza sativa L.). Plant J. 2:1005-1009.

Upadhya, M. D., and M. S. Swaminathan, 1967 Mechanisms regulating chromosome pairing in Triticum. Biol. Zentrbl. 86:239-255.

Van Deynze, A. E., J. C. Nelson, E. G. Yglesias, S. E. Harrington, D. P. Braga, S. R. McCouch and M. E. Sorrells, 1995 Comparative mapping in grasses. Wheat relationships. Mol. Gen. Genet. 248:744-754.

Van Deynze, A. E., J. Dubcovsky, K. S. Gill, J. C. Nelson, M. E. Sorrells, J. Dvorak, B. S. Gill, E. S. Lagudah, S. R. McCouch and R. Appels, 1995 Molecular-genetic maps for group 1 chromosomes of Triticeae species and their relation to chromosomes in rice and oat. Genome 38:45-59.

Vassarotti, A., and A. Goffeau, 1992 Sequencing the yeast genome: the European effort. Trends Biotechnol. 10:15-18.

Velculescu, V. E., L. Zhang, W. Zhou, J. Vogelstein, M. A. Basrai, D. E. Bassett, P. Hieter, B. Vogelstein and K. W. Kinzler, 1997 Characterization of the yeast transcriptome. Cell 88:243-251.

Velculescu, V. E., L. Zhang, B. Vogelstein and K. W. Kinzler, 1995 Serial analysis of gene expression. Science 270:484-487.

Vodkin, L. O., R. C. Shoemaker, E. F. Retzel, P. Keim and N. D. Young, 1998 A functional genomics program for soybean. NSF Award Abstract: http://www.nsf.gov/cgi-bin/showaward?award=9872565.

Walker-Simmons, M. K., and J. L. Ried, 1993 Pre-harvest Sprouting in Cereals 1992. American Association of Cereal Chemists, St. Paul, MN.

Wallace, D., 1982 Structure and evolution of organelle genomes. Microbiological Reviews 46:208-240.

Wang, X., and B. Bowen, 1998 Plant and Animal Genome VI Conferences. Abstracts: http://www.intl-pag.org/pag/6/abstracts/wang.html.

Waterston, R., C. Martin, M. Craxton et al., 1992 A survey of expressed genes in Caenorhabditis elegans. Nature Genet. 1:114-123.

Werner, J. E., T. R. Endo and B. S. Gill, 1992 Towards a cytogenetically based physical map of the wheat genome. Proc. Natl. Acad. Sci. USA 89:11307-11311.

Winzeler, E. A., and R. W. Davis, 1997 Functional analysis of the yeast genome. Current Opinion in Genetics and Development 7:771-6.

Wrigley, C. W., and K. W. Shepherd, 1973 Electrofocusing of grain proteins from wheat genotypes. Ann. N.Y. Acad. Sci. 209:154-162.

Xie, D. X., K. M. Devos, G. Moore and M. D. Gale, 1993 RFLP-based genetic maps of the homoeologous group 5-chromosomes of bread wheat (Triticum aestivum L.). Theor. Appl. Genet. 87:70-74.

Yamamoto, K., and T. Sasaki, 1997 Large-scale EST sequencing in rice. Plant Mol. Biol. 35:135-144.

Appendix A-1. Description of the relationship between the proposed activity and current research activities

Calvin O. Qualset

Research Management: The proposed research program complements two other ongoing research coordination activities and, in fact, is a direct extension of current work with the International Triticeae Mapping Initiative (ITMI), which is located in the UC Genetic Resources Conservation Program (GRCP) offices on the UC Davis campus. Together with Associate Director Patrick E. McGuire, we provide logistical support for ITMI, including preparation of funding proposals; receipt and disbursement of funds for research and travel; organization of annual international workshops (US, Australia, England, France, Canada) and work sessions at the Plant and Animal Genome meeting in San Diego; maintenance of a registry of some 250 ITMI-affiliated scientists; and editing of publications (Proceedings of international workshops) and research initiatives (Wheat and Barley Improvement for the Next Century: Priority Needs for Biotechnological Approaches to Wheat and Barley Improvement, April 1997).

As Director of GRCP, I serve as CoProject Director with R. Bye in Mexico on a six-year Collaborative Crop Research Program funded by the McKnight Foundation ($1.7 million), “Genetic Resource Conservation and Crop Improvement in Mexico: A Farmer-Based Approach”. This project involves collaboration among four U.S. universities (North Carolina State, Maine, Cornell, and UC Davis) and five institutions in Mexico. On the US side we coordinate graduate student training, visiting scientists, annual meetings and reports, and provide funding for research through GRCP. This project includes 18 investigators and 30 undergraduate and graduate students.

Previous NSF-funded Research: In the 1980s I served as CoPI with Professor E. Epstein on a field-based study of the potential for enhancing salinity tolerance in wheat and barley for crop production. We established genotype-specific response curves to detect salinity tolerance and the first field experiments to demonstrate salinity tolerance in wheat-Lophopyrum amphiploids and chromosome substitution lines, later used by Professor Dvorak in discovering Kna1, a major gene conferring salinity resistance in wheat.

I was PI on a small NSF International Programs grant in 1984 for exploration and collection of Dasypyrum villosum, an allogamous diploid (V) Triticeae species found in the Mediterranean area. With P. McGuire and colleagues in Italy, the resulting collections were used in studies on genetic diversity, mating system, and searches for useful traits. More than 20 papers have been published, and the useful traits discovered are now being used in breeding programs throughout the world. Interesting diversity in seed storage proteins was found at Glu 1V, with 14 electrophoretically detectable alleles identified at that locus.

More recently (1990-94) I was CoPI with Professor S. Brush on a project to study the role of in situ conservation of landrace wheats in western Turkey in relation to genetic diversity for adaptive traits and socio-economic factors. This project is nearing completion with two Ph.D. students and collaborators in Turkey.

Current Research: Formal retirement from UC Davis in 1994 has permitted a return to Triticeae research, including a current BARD-funded project with M. Feldman in Israel to study the genetic architecture of quantitative traits in single chromosome arm recombinant substitution lines. Single chromosome arms were introgressed into hexaploid Bethlehem wheat from tetraploid T. dicoccoides and subsequently used to create recombinant inbred lines. Other QTL mapping work includes stomatal and yield-related traits in the ITMI mapping populations with Prof. E. Zeiger (UCLA). Long-term development of congenic lines for traits of putative adaptive significance in wheat is in various stages of completion for several traits (liguleless, branched spike, reduced tillering, erect leaf, vernalization, HMW glutenin subunits). These stocks provide excellent materials for candidate gene detection and will be available to the various labs in this project and to others. The study of in situ landrace genetic diversity of wheat in Turkey, Iran, and Ethiopia is in progress, as are studies of the genetics of wheat end-use quality with colleagues in Spain and Hong Kong.

Olin D. Anderson

The research of Olin Anderson involves two projects. The first is engineering of altered quality and agronomic traits for cereals, particularly wheat. This project involves gene isolation and sequencing, plant transformation, and testing of the resulting transgenic plants for altered traits. Part of this work involves developing a fairly deep set of ESTs for the wheat endosperm, the major tissue related to wheat quality. The second project is the GrainGenes database, the USDA's computer database for the small grains (wheat, barley, rye, oats, and wild relatives). Olin Anderson is the Coordinator of this database, and the Albany site staff are involved in data entry, database model development, and exploring entry of genomic-level data into the database. In addition, the GrainGenes project at Albany maintains a collection of DNA probes for safekeeping and distribution. Staff on this project are also supporting the wheat endosperm EST development and are assisting other laboratories in developing ESTs for the ITEC collaboration on Triticeae ESTs. Olin Anderson is also involved in setting up a core plant genomics facility at the ARS Albany site and will have access to high-throughput DNA sequencing and sample/culture handling robotics.

The relationship to the proposed project is two-fold. First, as Coordinator of the GrainGenes database, Olin Anderson is ideally placed to ensure maximum interaction between project research and bioinformatics staff and GrainGenes staff. In addition, the GrainGenes project has a history of interest in comparative mapping and genomics, and future efforts include close interactions with other ARS database projects. Second, the specific EST portion of the proposed project is a logical extension of the ongoing wheat endosperm EST project at Albany, which is also being used as a test of handling EST data and including it in databases.

Jan Dvorak

The principal focus of my laboratory is on the mechanisms of recognition of differentiation between homoeologous chromosomes during wheat meiosis and the cytological activity of the wheat Ph1 gene. The distribution of recombination across chromosomes is investigated by comparing recombination patterns between corresponding homoeologous and homologous chromosomes. The effects of mitotic agents on chromosome pairing and recombination are being critically compared.

The second most important activity in my laboratory is the identification and mapping of genes controlling K+/Na+ selectivity in Lophopyrum elongatum. The objectives of this project are (1) to construct populations of first- and second-cycle disomic recombinant substitution lines (RSLs) for chromosomes 1, 3, and 7 of L. elongatum and to map crossover points with molecular markers, and (2) to map genes controlling salinity tolerance and K+/Na+ selectivity in these RSL populations. In parallel to this project, my laboratory is working on the enhancement of osmotic (salinity) tolerance in rice. The objective is to identify sources of salinity tolerance in exotic rice germplasm by screening for superior K+/Na+ selectivity. Salinity tolerance will be introgressed into California cultivars using molecular markers.

I collaborate with Jan Valkoun at ICARDA, Aleppo, Syria, and Harold Bockelman (the curator of the National Small Grains Collection at Aberdeen, Idaho) on the acquisition and characterization of germplasm of T. urartu and Ae. tauschii, ancestors of bread wheat. The objectives of the proposal are (1) to acquire accessions of T. urartu from ICARDA and (2) to collect germplasm of Ae. tauschii in Transcaucasia and characterize both germplasms with molecular markers prior to their deposition in the US Small Grains Germplasm Collection in Aberdeen.

My laboratory is in the initial stages of studies of the evolution of genepools and interploidy gene flow in wheat. The objectives of these studies are (1) to determine the genetic basis of the hulled habit of primitive wheats and (2) to assess genetic distances and the presence of diagnostic alleles in the A and B genomes of all forms of tetraploid and hexaploid wheat and to use this information in the interpretation of genetic relationships among the genepools of tetraploid and hexaploid wheats.

Bikram S. Gill

Our overall research activities are organized under the auspices of the Wheat Genetics Resource Center (WGRC). The WGRC has a mandate to carry out strategic research in wheat genetic resources conservation and utilization, genetic and cytogenetic stock maintenance and development, and wheat germ plasm development and crop improvement. We maintain a living collection of 2,700 wild Triticum and Aegilops species and 5,400 genetic stocks (mapping populations, isogenic lines, etc.). Under the auspices of the International Triticeae Mapping Initiative (ITMI), our laboratory participates in and coordinates map development of diploid Ae. tauschii and homoeologous group-5 chromosomes of wheat and related cereals. With a long-term aim of enhancing crop performance in the highly stressed environment of the Great Plains region, we continually introgress novel resistance genes from unadapted germ plasm into wheat. The novel resistance genes are initially characterized by monosomic and cytogenetic analysis. However, as the list of known genes expands, it has become more cumbersome to establish the novelty of new genes by these methods. Thus, in recent years, we have emphasized molecular mapping and tagging of genes in wheat breeding programs to provide entry points for molecular cloning and novel genetic manipulations of resistance genes. Under the auspices of a McKnight Foundation grant, and in collaboration with Nanjing Agricultural University, we have initiated a multifaceted program on the control of scab disease of wheat, including molecular mechanisms involved in host-pathogen interaction. Our collaborative research efforts over the years with Dr. T.R. Endo of Kyoto University, Japan, led to the production of 436 deletion stocks in wheat. One of the USDA grants is aimed at exploring the mechanism of chromosome healing in deletion chromosomes (they immediately acquire functional telomeres following the breakage).
Among the pending proposals, in the ITMI proposal (C.O. Qualset, PI), deletion stocks will be used for large-scale mapping of ESTs. In the proposal with Dr. Jan Dvorak (PI), our laboratory will be responsible for using innovative procedures to anchor a global BAC-contig map to the genetic map of Ae. tauschii, on which we have been working for the last 10 years.

Mark E. Sorrells

The proposed research is a logical extension of our historic focus on novel breeding methods and strategies at Cornell. For the past 7 years, our basic research activities have utilized comparative genetics to link the vast informational and genetic resources of the Poaceae. This work has changed the way we think about crop improvement.

In recent years, molecular genetics has provided new methods to generate, identify, characterize, and manipulate genetic variation. We have used comparative genomics to facilitate the identification and localization of gene sequences controlling specific traits in the domesticated grasses. The emerging databases of gene sequences will allow directed discovery of genes in higher plants and classification of alleles present within breeding germplasm. Our goal is to identify the genes controlling critical traits and their DNA sequences and then classify variation in the germplasm pool by gene fingerprinting or by characterization of variation in key DNA sequences. Classification of the allelic variants for a particular locus would substantially reduce the amount of work required to determine the relative breeding value and lead to the identification of superior alleles based on DNA sequence. Incorporation of direct allele selection into our breeding program allows more rapid and precise improvement of populations and breeding lines.

As shown in my current and pending support, our funding includes modest support for the small grains breeding program and 2 research projects as follows:

Application of molecular genetics for development of durum wheat varieties: This is a collaborative project with ICARDA and FCRI, Egypt. The overall objective of this project is to develop higher yielding, disease resistant durum varieties with improved pasta qualities for the primary wheat growing areas in middle and upper Egypt. Durum variety improvement will be enhanced using new breeding methods, biotechnology, and training. 1. Implement research to test new phenotypic selection criteria that are associated with higher yield and quality. 2. Construct a molecular marker map for a recombinant inbred population from the cross Jennah Ketifa x Cham1. 3. Identify, characterize, and map loci controlling yield potential, stress tolerance, disease resistance and grain quality in this durum recombinant inbred population grown under both high yield and stress environments. 4. Initiate marker assisted selection for superior alleles in backcross populations to develop improved durum varieties. 5. Provide training in breeding and genetics.

Integration of QTL, Candidate Gene, and Comparative Sequence Analyses for Crop Improvement: The overall objective of this project is to develop methodologies that integrate molecular information from different approaches and from distant taxonomic clades to facilitate genomic research on polyploids. To develop these techniques we propose to isolate and characterize candidate loci underlying QTL for preharvest sprouting (PHS) resistance in wheat by integrating information from our QTL, candidate gene, and comparative mapping research. Specific objectives are: 1. Utilize comparative genetics to identify, clone, and characterize candidate loci underlying genes affecting grain dormancy in wheat and barley. 2. Develop a catalog of sequence variation at each locus and associate variation with phenotypic effects in a selected germplasm core.

The proposed research will lead us to the characterization of candidate genes for use in our small grains breeding program.

James A. Anderson

I have been involved with the development of tools for genome analysis and their use in genetic studies and breeding applications in wheat since 1989. In my current position as Assistant Professor in the Department of Agronomy and Plant Genetics at the University of Minnesota, I have responsibilities for both wheat breeding and genetics. The genetics activities involve the use of DNA markers to genetically dissect complex traits and apply the knowledge gained to formulate efficient breeding strategies. I have 10 years of experience in mapping qualitatively and quantitatively inherited traits in wheat using RFLP, RAPD, STS, and AFLP markers.

This proposal not only adds needed genetic tools (mostly ESTs) for future use in locating genes affecting agronomically important traits, but also will reveal the degree of homology in important biochemical pathways that affect a large number of traits in grasses of agricultural importance. My contribution to the proposal will be as one of the eight laboratories assigned to map ESTs. The postdoctoral associate involved in mapping ESTs using the deletion stocks will also use these ESTs in our continued mapping of Fusarium head blight resistance genes in wheat. Although not part of this proposal, we plan to collaborate with other scientists at the University of Minnesota (see letter from Dr. Muehlbauer) to use the EST arrays to investigate the wheat genes whose expression is modified in response to challenge by Fusarium graminearum. Below are the objectives of my wheat breeding project at the University of Minnesota and the NRI-funded Fusarium head blight mapping grant. Other funded projects listed in my Current and Pending Support statement share objectives with the projects listed below.

University of Minnesota Wheat Breeding and Genetics

Development of superior spring wheat varieties.
Genetically enhance spring wheat germplasm in the Northern Plains.
Develop new breeding strategies for complexly inherited traits.

Molecular Mapping of Fusarium Head Blight Resistance Genes in Wheat

Determine the number of genes and the magnitude of their effects in conditioning resistance to Fusarium head blight in wheat.
Identify DNA markers linked to genes conditioning resistance to Fusarium head blight in wheat.

Timothy J. Close

General Description of Research. The long-term goals are to: 1) promote the development of environmental stress tolerance in plants through genetics and efficient cultural and post-harvest practices, and 2) use novel properties of plant stress proteins to develop commercial products. In general, this is a continuum of fundamental research, agricultural science, and biotechnology. Until recently, this effort has revolved around one family of proteins known as dehydrins (DHNs). DHNs are lipid-associating proteins produced in plants in response to low non-freezing temperatures or any environmental influence with a dehydrative component, including seed development, drought stress, freezing temperatures, and osmotic stress. A survey of the distribution of DHNs has revealed that organisms as distant from plants as cyanobacteria can produce immunologically related proteins during osmotic stress. There are also functionally related proteins in animals.

Current Research. DHNs have been purified from plants and genetically engineered Escherichia coli strains, as well as from the cyanobacterium Anabaena, for in vitro biochemical studies in work sponsored by the National Science Foundation and University of California Biotechnology Research and Education Program. Dr. Close’s laboratory demonstrated by immunocytochemical methods that plant dehydrins are present in the nucleus and cytoplasm of various cell types in the maize embryo, and by in vitro methods that DHNs are structured in association with lipids. The discovery that a cluster of barley dehydrin (Dhn) genes co-segregates with winter hardiness QTL is under further investigation in studies supported by the United States Department of Agriculture (USDA) to test the possibility that a Dhn gene is a growth-habit or freezing-tolerance determinant in barley and related cereal crop plants. Similarly a 35 kDa DHN of cowpea that is genetically associated with low temperature seed emergence is the subject of studies supported by the USDA through the Southwest Consortium on Plant Genetics and Water Resources. Dr. Close is active in several Triticeae genomics initiatives, from the production of cDNA libraries through allele-trait association studies and microsynteny analyses.

Relationship Between Proposed Activity and Current Research. During the studies described above, various cDNA and genomic libraries have been produced in Dr. Close’s laboratory. For example, Dr. Close’s laboratory recently produced lambda ZAP cDNA libraries from unstressed, cold-stressed and drought-stressed Morex barley seedling shoots and provided them to Rod Wing at Clemson University for arraying and sequencing. In addition, for several years Dr. Close has run a graduate laboratory course called “Plant Genomic Library Construction” and has provided a genomic library construction and distribution service for the North American Barley Genome Mapping Project. The assignment of cDNA construction duties to Dr. Close’s laboratory in the current proposal will ensure that all of the cDNA library construction needs of the group will be readily met. Dr. Close’s experience with the dispersed Dhn multigene family will also provide an important perspective in the design of EST arrays for gene expression and functional genomics studies in the Triticeae.

J. Dubcovsky

The long-term goal of the research conducted in my laboratory is to understand the genetics of vernalization in wheat. We have established detailed comparative maps of the most important vernalization genes in diploid wheat: Vrn-1 and Vrn-2. Crosses between spring and winter diploid wheats were used to generate large high-resolution genetic maps of vernalization genes. High-throughput molecular marker technologies and regional targeting strategies are being used to develop closely linked markers encompassing the vernalization genes. The most closely linked markers flanking each vernalization gene were used to screen Bacterial Artificial Chromosome (BAC) libraries. My laboratory is responsible for the construction of a Triticum monococcum BAC library that currently has 140,000 BAC clones with inserts of 115 kilobases, representing 3.0 genome equivalents. The ends of the selected BACs are being mapped in the high-resolution genetic maps to select BAC clones encompassing the vernalization genes.
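
The library figures quoted above can be cross-checked with a short calculation. The sketch below uses only the numbers stated in this paragraph (140,000 clones, 115 kb inserts, 3.0 genome equivalents). The implied genome size and the probability estimate are derived illustrations rather than figures from the proposal; the probability uses the standard Clarke-Carbon approximation, which assumes randomly distributed inserts, and all function names are hypothetical.

```python
def total_insert_mb(n_clones: int, mean_insert_kb: float) -> float:
    """Total cloned DNA in the library, in megabases."""
    return n_clones * mean_insert_kb / 1000.0


def implied_genome_mb(n_clones: int, mean_insert_kb: float,
                      genome_equivalents: float) -> float:
    """Genome size implied by the stated coverage (assumption: coverage =
    total insert DNA / genome size)."""
    return total_insert_mb(n_clones, mean_insert_kb) / genome_equivalents


def prob_sequence_present(n_clones: int, mean_insert_kb: float,
                          genome_mb: float) -> float:
    """Clarke-Carbon estimate P = 1 - (1 - f)^N, where f is the fraction of
    the genome carried by one clone. Assumes random, unbiased cloning."""
    f = (mean_insert_kb / 1000.0) / genome_mb
    return 1.0 - (1.0 - f) ** n_clones


if __name__ == "__main__":
    # Values taken from the paragraph above.
    clones, insert_kb, coverage = 140_000, 115.0, 3.0
    genome = implied_genome_mb(clones, insert_kb, coverage)
    print(f"Total insert DNA: {total_insert_mb(clones, insert_kb):.0f} Mb")
    print(f"Implied genome size: {genome:.0f} Mb")
    print(f"P(any given sequence is in the library): "
          f"{prob_sequence_present(clones, insert_kb, genome):.3f}")
```

Under these assumptions, a library of 3.0 genome equivalents would be expected to contain any given single-copy sequence with a probability of roughly 95%, which is why markers flanking each vernalization gene can reasonably be expected to recover BAC clones.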

The tools that will be developed in this proposal will greatly accelerate the positional cloning efforts of our laboratory. The construction of microarrays including cDNAs from vernalized, unvernalized, and devernalized shoot apexes and young leaves will facilitate the identification of candidate genes based on their expression patterns. The sequence information provided by this effort will be integrated with the expression patterns and chromosome locations to identify candidate genes.

This proposal is essential for incorporating new functional genomic technologies into our research program. The tools developed in this proposal, in conjunction with the available BAC libraries, will support future efforts to clone important agronomic genes. My laboratory is working on the preliminary QTL mapping of some of these traits and will greatly benefit from the availability of these genomic tools in wheat.

Kulvinder S. Gill

Currently, I am working on three major projects. The first is to understand the mechanism of chromosome pairing regulation in wheat and other polyploids. I plan to accomplish this objective by first cloning Ph1, a chromosome pairing regulator of wheat. The long-term goal is to answer the following questions about the fundamental processes of chromosome pairing and recombination: (a) how does the chromosome pairing process distinguish homologs from homoeologs, (b) during which stage of meiosis are the chromosome pairing regulator genes active, (c) how do these genes fit into the current model of chromosome pairing and recombination, (d) are these genes functional only in fertile polyploid species or are they universally present, and (e) what is the relationship between the Ph1 gene and other genes with a similar function, such as the mismatch repair genes characterized in E. coli and yeast. The second project is to understand the structural and functional organization of gene-rich regions of wheat. We have observed that most of the genes in wheat are present in clusters and that the chromosomal regions encompassing these clusters are very small and submicroscopic. We have identified a submicroscopic gene cluster region on the short arm of wheat chromosome group 1. The region contains 42 agronomically important genes and is marked with more than 60 DNA probes. The region is flanked by the breakpoints of two deletion lines. The objective is to construct a BAC-based contiguous map of the region. The proposed contiguous map construction will reveal the exact size, gene density, and relationship between physical and genetic distances for the region. This information will be valuable for molecular cloning of the genes present in the region. In conjunction with this project, we have a third project on using flow-sorted chromosomes for functional genomics of wheat.
Wheat has a wealth of cytogenetic stocks. One such type of stock is the ditelosomic lines, which carry the normal chromosome complement of wheat except for one chromosome, for which one arm is missing. Such stocks are available for all wheat chromosomes. We used ditelo 1DS of wheat for flow sorting and have isolated chromosome 1DS with 95% purity. We are currently generating chromosome arm-specific probes and libraries using these sorted 1DS chromosomes. In the current proposal, we propose to sort all wheat chromosome arms and use them as probes to screen a wheat cDNA library in order to make chromosome arm-specific cDNA pools.

The major relationship between the current research effort and the proposed project is that I am currently trying to do the same for only a part of the genome, and very inefficiently. The proposed project will target the whole wheat genome, and we are approaching it in such a way that we clone and characterize wheat-specific genes and, in the process, generate many of the resources and infrastructure required for efficient and targeted functional genomics of the world’s most important food crop.

J. Perry Gustafson

J.P. Gustafson's current research emphasizes further understanding of the genetics, cytogenetics, and evolution of wheat and its many relatives by developing technology for introgressing new and alien germplasm into wheat, which can be utilized in breeding programs, as well as by evaluating levels of alien gene expression. Specific major objectives of his current program continue to include the following: a) To investigate new approaches and procedures for obtaining potentially useful information utilizing new wheat and alien species germplasm, especially that from rye (Secale cereale L.), in order to better understand the structural organization of genes and sequences in cereals and how to better manipulate those genes for cereal improvement. b) To conduct research on mechanisms for the control of nuclear and extranuclear inheritance in wheat and that of alien species when inserted into wheat. c) To work in collaboration with U.S. breeders in order to develop more precise and efficient methods for the improvement of commercial wheat varieties. d) To assess cytological and molecular techniques for the identification and analysis of genetic variants. e) To develop techniques for the identification of alien gene complexes when they have been placed into a host background. (This has centered around the use of in situ hybridization for the detection of unique sequence DNA markers.) f) To study the mechanisms involved in controlling alien gene expression or suppression when placed in a host background. g) To continue to isolate and characterize DNA fingerprinting sequences in various cereal species for use in fingerprinting germplasm to enhance breeding programs. h) To establish that genotypic competition occurs within the environment where the material is being grown and that nucleotypic changes (i.e., amplification, deletion, or redistribution of noncoding DNA and heterochromatin) can occur in the germplasm.
Therefore, the creation of ESTs from stressed cereal cDNA libraries and the subsequent analyses of their location and possible function will fit very well into the existing program.

Shahryar F. Kianian

The objectives of my program, wheat germplasm enhancement, are to 1) identify DNA markers for genes conditioning disease resistance, grain quality and overall productivity in durum and hard red spring wheat; 2) transfer genes for disease resistance, grain quality and overall productivity into adapted durum and hard red spring wheat germplasm; and 3) study and implement improved techniques for facilitating gene transfer into adapted wheat germplasm. The approach we have taken to address the first two objectives is to develop populations and/or cytogenetic stocks segregating for Fusarium head blight resistance, tan spot resistance, grain protein, grain starch and yield components. These populations are being evaluated for the trait under investigation and subjected to DNA marker analysis to identify genomic regions containing genes associated with the phenotype. Backcross populations developed by crossing germplasm sources having the desired phenotypes with elite durum and spring wheat lines will be subjected to both phenotypic and marker-assisted selection. Sequence-tagged-site markers will be developed for the genomic regions identified and used to aid in the selection of superior genotypes.

Wild species in the Triticeae tribe represent a vast reservoir of genes for improvement of pest resistance, grain quality, and agronomic fitness of wheat. However, chromosome asynapsis and hybrid sterility are major obstacles to alien gene transfer, and genes producing nuclear-cytoplasmic (NC) interactions are directly or indirectly involved. A better understanding of NC compatibility and genic interactions in interspecific and intergeneric hybrids would broaden usage of alien species and improve gene transfer in wheat breeding. Thus, in a collaborative effort with Dr. S.S. Maan, my laboratory is investigating the molecular basis of NC interactions. The proposed project will provide the necessary tools to identify and clone the genes involved in these interactions and facilitate the ongoing effort of gene transfer from wild Triticeae into adapted wheat germplasm.

As demonstrated, “The structure, function and evolution of the expressed portion of the wheat genomes” will be an extension of ongoing wheat germplasm enhancement projects. Cloning, sequencing, tagging, and mapping important regions of the wheat genome will greatly enhance our ability to transfer important genes from wild and related species and to develop improved varieties. Wheat is the world’s most important food crop. The recent Fusarium head blight epidemic in the Northern Great Plains, which caused extensive damage to US wheat production, demonstrates how fragile our supply is. The proposed project will help us understand and devise rapid methods of obtaining better varieties, so that wheat production can keep pace with the increasing demand of an ever-growing population.

Nora Lapitan

The long-term goal of my research program is to use genetic resources to facilitate modification of wheat and barley for improved performance, yield, and quality. Two of my current projects involve genetic mapping of genes for resistance to Russian wheat aphid (RWA) in wheat and resistance to Fusarium head blight in barley. The RWA is the most significant pest problem in barley and wheat in the United States, causing economic damage to growers estimated at $475 million from 1987 to 1994 and damage to the environment through the application of thousands of gallons of pesticides for its control. Resistance genes have been identified; however, these are found in unadapted lines or wild relatives of wheat. These genes can be used to breed resistant cultivars, but this is a long process that can take at least 10 years. We are developing DNA markers for several RWA resistance genes to be used by breeders as selection tools to expedite breeding of resistant cultivars. For example, DNA marker-assisted selection can cut the breeding time by half if two or more RWA resistance genes are incorporated into a single genotype. Fusarium head blight, or scab, is the most devastating disease that threatens barley production in the Midwest, and it is also an important disease in wheat. Since scab resistance is a quantitative trait, breeding for resistance is difficult and inexact. DNA marker-assisted selection will vastly improve the accuracy and speed of this process. The longer-term goal of my research is to clone and characterize RWA resistance genes. High-resolution mapping of a resistance gene from rye that has been incorporated into wheat is being conducted using RFLP, AFLP, and microsatellite markers.

The large genome size of wheat (15,000 Mbp) makes the application of positional cloning extremely difficult and impractical. We are currently investigating the gene organization in barley to determine whether barley genes are located in specific domains of the genome. The knowledge gained from the study can be used to develop cloning strategies that are appropriate for barley and its related species, wheat.

The development and genetic mapping of wheat ESTs proposed in this project will provide a valuable resource to our current research in mapping and cloning of RWA resistance genes. The availability of wheat ESTs will open the possibility of using a candidate gene approach in cloning RWA resistance genes. Our laboratory will contribute to this project our expertise in wheat molecular mapping.

Henry T. Nguyen

The overall goal of Henry Nguyen’s laboratory is to determine the genetic basis of plant adaptation to environmental stresses, mainly drought and temperature stress. Research activities include the following goals.

Cloning and determining the role of dehydrins and heat shock proteins in the genetic control of drought and heat tolerance in wheat.
Genetic mapping of major genes and QTLs controlling environmental adaptation and crop productivity in stress environments.
High-resolution mapping and positional cloning of major QTLs conferring stress resistance.
Comparative genetics of stress resistance trait loci across the cereal grass genomes.
Development of marker-assisted selection for stress resistance improvement.

Brief descriptions of selected projects are presented below:

Molecular analysis of heritable acquired heat tolerance in wheat: This project is funded by the USDA-NRI program. The long-term goal of the project is to determine if heat shock protein (HSP) genes or other genetic components can be used as genetic markers and whether any specific gene can be identified for direct genetic manipulation to improve heat tolerance of this important cereal crop. In this project, we will use a population of RILs to investigate the molecular basis of acquired thermotolerance. A combination of protein gel electrophoresis, RNA analysis, and differential display methods will be employed to investigate the extent of genetic association between the production of unique HSPs, level of gene expression, and acquired thermotolerance.

High-resolution mapping of QTLs controlling the stay green trait in sorghum: This project is supported by the USDA-NRI Plant Genome program. Previous work has identified two major QTLs controlling the stay green trait, an adaptive post-flowering drought resistance trait in sorghum, from a RIL population of B35 and Tx7000. The current work is aimed at the development of high-resolution genetic and physical maps of these two regions in the sorghum genome. The research includes development of near-isogenic lines for these QTLs through marker-assisted backcrossing and physical mapping using BAC libraries in collaboration with the BAC Center at Texas A&M University. Comparative mapping with the maize genome is being carried out.

Mapping of QTLs controlling drought resistance in rice (osmotic adjustment, root development, and yield-related traits): This project is supported by the Rockefeller Foundation and is conducted in collaboration with the International Rice Research Institute (IRRI). Doubled-haploid and recombinant inbred populations have been developed for genetic mapping and phenotypic evaluation. Several advanced backcross populations are being developed for QTL detection and introgression of desirable traits into elite breeding lines. Mapping work is done using DNA probes from the Japan Rice Genome Program and Cornell rice linkage maps.

Molecular mapping of osmotic adjustment and drought resistance in wheat: This project is supported by the BARD program in collaboration with Dr. A. Blum in Israel. Two RIL populations were developed for this purpose, and comparative genetic mapping between wheat and rice will be pursued with particular interest in loci influencing the expression of osmotic adjustment capacity under drought. The PI is a member of the International Triticeae Mapping Initiative with a focus on abiotic stress. Recently, the PI assumed a leadership role in coordinating the Plant Genomics Program at Texas Tech University. The proposed NSF project is a logical extension of these activities and will provide an opportunity for large-scale EST sequencing of the wheat genome, including genes expressed under stress.

Appendix A-2. Management of Intellectual Property

Information developed with support of this NSF-funded project will be promptly submitted for publication, with authorship representative of the actual contributions by individual investigators, following NSF Grant Proposal Guide, October 1997, Section H. The success of this project depends on linkages among several laboratories; hence, scientific papers will often carry multi-laboratory authorship. During the planning of research components, the investigators will be requested to agree upon a publication policy for each potential research product, including the names of potential authors and the order of authorship.

A major aspect of this proposal is the development of molecular genetic resources for Triticeae researchers and for the broader plant science community. This will involve the development of EST clones and sequences and cDNA EST arrays, which will be available to the entire research community. All data generated by the project will be stored in public databases such as the NCBI dbEST and the USDA/ARS small grains computer database project (GrainGenes). Initially, these materials will be maintained and distributed from the USDA Triticeae molecular genetics resources facility at Albany, CA. After the completion of this project, the genetic resources will be transferred to other facilities: the US distributors for the I.M.A.G.E. consortium (American Type Culture Collection, P.O. Box 1549, Manassas, VA 20108; Genome Systems, Inc., 4633 World Parkway Circle, St. Louis, MO 63134; and Research Genetics, Inc., 2130 Memorial Pkwy., SW, Huntsville, AL 35801), or USDA facilities as structured at that time.

The conduct of this research will utilize information and materials from various public and private sources. In some instances the investigators will be requested to sign agreements with owners of intellectual property to gain access to protected materials. Such agreements must be approved by all investigators in this proposal to prevent misunderstandings about the use of materials and third party distributions.

The ownership of intellectual property rights will be respected according to NSF Grant Proposal Guide, October 1998, Section VII-K. Investigators will be requested to acknowledge the status of their developments which occurred prior to the support of this grant and to clearly identify those discoveries which benefited from financial support of this grant. Since a key element of this proposal is multi-investigator, and hence multi-institutional, collaboration, the investigators directly involved in collaboration will be co-inventors of patentable discoveries, and ownership will be respected in the filing of patents and ultimate licensing of products derived from those discoveries. This principle must be respected by all institutions employing investigators contributing to this proposal to ensure equitable recognition of the inventors.

Appendix A-3: Management Plan

Technical Steering Committee:

The project is organized according to the schema of Figure 1 in the Project Description. The PI, C.O. Qualset, will serve as the Chair of the Steering Committee. Each of the four research areas will be represented by a Technical Coordinator:

Olin Anderson for EST development.
Bikram Gill for deletion stock development and distribution and EST mapping.
Mark Sorrells for functional genomics.
Jan Dvorak for structural genomics and comparative genetics.

The Steering Committee will review progress and limitations for each of the project objectives, and each Technical Coordinator will be responsible for coordinating their area at research meetings. They will also work with the investigators in each area to develop publication plans for the results. They will also be apprised of the financial situation at each laboratory and develop budgets for each year that reflect the needs of each laboratory. The short-term training course Microarray Technology to be offered by Olin Anderson will likely have more applicants than spaces available, so it will be the task of the Steering Committee to select trainees from the project and to determine if persons outside the project can be accepted.

The Management Office:

This project will be facilitated and coordinated through the UC Genetic Resources Conservation Program (GRCP), a statewide UC program in the Office of the Vice President for Agriculture and Natural Resources located on the UC Davis campus. GRCP has been the home of the International Triticeae Mapping Initiative (ITMI) since its inception in 1989. During these years, ITMI has provided many of the coordination services required for this proposed project, such as organization of meetings, editing and publishing reports, preparing research proposals, receiving and disbursing funds for research subcontracts and PI travel, and employment and deployment of staff to participating laboratories. Based on this experience, ITMI is able to accept the additional responsibility of this new effort in Triticeae genomic research.

C.O. Qualset (PI) will serve as the project director and Chair of the Steering Committee, as noted above. P.E. McGuire will provide management services (25% time), including preparation of subcontracts, detailed planning of workshops and annual meetings, and writing and editing reports. He will serve as webmaster for the project. Dr. McGuire is a key person for the project because of his personal interest in this Triticeae research and because, as Associate Director of GRCP, he has great knowledge of UC administrative procedures and authority for management of financial accounts. The project will provide 25% financing of his salary and benefits. No funds are requested for the project director. He has partial recall to service as Director of GRCP and will volunteer his time for work on this project. Based on experience with a similar extramurally funded project (by the McKnight Foundation) managed by GRCP, the workload for supporting a project of 14 CoPIs and numerous students and postdoctorals requires the part-time service of a program assistant (0.5 FTE requested). The program assistant will be required to handle project-generated purchase orders for the collaborative research of CoPI O. Anderson at nearby Albany, CA, to make arrangements for providing stipends to trainees and subsistence funds to visiting scientists, and to distribute information.

Genetic Resources and Information

The Project Description gives details about the flow of materials to the laboratory producing cDNA libraries, the flow of seeds of wheat deletion stocks to the mapping laboratories, and coordination of the EST clone development and sequencing. The EST clones and sequences and cDNA EST arrays will be available to the entire research community. All data generated by the project will be stored in public databases such as the NCBI dbEST and the USDA/ARS small grains computer database project (GrainGenes). Initially, these materials will be maintained and distributed from a central facility at the USDA/ARS laboratory at Albany, CA. After the completion of this project, these functions will be transferred to the US distributors for the I.M.A.G.E. consortium (American Type Culture Collection, P.O. Box 1549, Manassas, VA 20108; Genome Systems, Inc., 4633 World Parkway Circle, St. Louis, MO 63134; and Research Genetics, Inc., 2130 Memorial Pkwy., SW, Huntsville, AL 35801), or to USDA facilities as structured at that time. Special consideration will be given to the microarrays of Triticeae cDNAs. These will be an invaluable resource to Triticeae researchers worldwide. Since the development of microarrays requires PCR amplification of thousands of cDNA inserts, it will not be economical for each interested investigator or a public sector research laboratory to perform this task repeatedly. The possibility of engaging a commercial enterprise to undertake this will be explored, with any such enterprise chosen on a competitive basis.

Research Progress and Coordination

This project will hold annual research reporting and coordination meetings prior to the PAG meetings each January in San Diego, CA and will also organize annual working sessions through ITMI. This venue will also be used for 2-day project meetings to be held in advance of the PAG meetings for critical review of progress, coordination of results, planning research publications, review of budgets, and training activities. Formal reports will be prepared from these meetings and distributed to participating labs, international collaborators, and advisors to the project. Funds have been budgeted through the Coordination Office for CoPI travel to PAG to support the project meeting and participation in PAG. As noted in the Budget Justification section, we have budgeted funds for each PI to travel to one or more labs each year for coordination. The informatics staff, 2 professionals, will also travel to the data centers in New York and California for implementation of programs and data coordination. Graduate students, postdocs and faculty will have the opportunity to take the Microarray Technology short course offered by Olin Anderson.

Appendix A-4. Collaboration with Outside Groups

International Triticeae Mapping Initiative (ITMI) collaboration

ITMI was formed in 1989 to facilitate collaboration among Triticeae researchers to produce public RFLP maps as quickly as possible. It was greatly enhanced by a five-year grant from NSF/USDA/DOE for research collaboration and a USDA/NRI grant for mapping four chromosome groups. In ITMI the seven chromosome groups were assigned to investigators for coordination of the data and for mapping. The resulting maps, practically finished in 1996 (McGuire and Qualset, 1997), demonstrated that ITMI was a valid concept. It became obvious that additional genetic resources, such as SSRs, were a critical need for advancing the utilization of the genetic maps and to enhance them further. To this end, a research initiative was proposed (ITMI/NABGMP, 1997) which outlined the needs for advancing genomic research in wheat and barley. Part of the needs expressed in that document, namely mapped ESTs, are addressed in this proposal. Investigators in this proposal (B. Gill, J. Dvorak, P. Gustafson, M. Sorrells, and O. Anderson) were original mapping coordinators in the ITMI efforts. New investigators to this proposal are Nora Lapitan, James Anderson, Timothy Close, Kulvinder Gill, Shahryar Kianian, Kay Walker-Simmons, Henry Nguyen, and Jorge Dubcovsky.

There is great interest in the proposed work on ESTs from the international members of the ITMI community, with letters expressing intent to collaborate from Australia, Canada, England, and Mexico. Further, ITMI has originated another collaborative group, ITEC, expressly to develop and share ESTs. The present proposal represents the capability of the US laboratories to participate. ITEC is briefly described below:

International Triticeae EST Cooperative (ITEC)

At the ITMI Workshop held in conjunction with the 9th International Wheat Genetics Symposium at the University of Saskatchewan, Saskatoon, August 2-7, 1998, a proposal was developed to establish a public database of Expressed Sequence Tags (ESTs) from species of the Triticeae. The target is to have at least 40,000 ESTs available publicly from July 1, 2000. To reach this goal it was suggested that each contributing scientist or organization would contribute 1,000 sequenced ESTs to a common pool by July 1, 1999. A steering committee [Peter Langridge, Univ. of Adelaide, Australia; Olin Anderson, USDA-ARS; Perry Gustafson, USDA-ARS; Mike Gale, John Innes Centre, UK; Cal Qualset, ITMI, UC Davis; Pat McGuire, ITMI, UC Davis, USA] has developed and published guidelines for participation. More than 20 labs have joined the effort and more are expected to enroll in the next few months.

This activity was seen as the first stage in developing an international effort to produce a public set of information and materials for Triticeae genome research.

Complementary Proposals [NSF 99-13]

We have discussed the progress of other proposals with their PIs, and several investigators on this proposal also participate in others. In all of the proposals there is direct need and use of ESTs being developed in this proposal for work on physical mapping and functional genomics on traits other than the ones related to reproductive fitness under study in this proposal. In particular, the following proposals are specifically planned for collaboration: “Drought and Temperature stress in the Triticeae” [T. Close, senior PI], “An integrated approach to tame the Triticeae genomes” [A. Kleinhofs, senior PI], “Physical Mapping of the Wheat D Genome, A Large Plant Genome Model” [J. Dvorak, senior PI], and “Comparative Genomics for Transferring Tools and Information to Wheat and Barley” [P. S. Baenziger, senior PI].

The present project will support the national effort to solve a crisis-level problem in wheat and barley production (Fusarium head blight). The disease can only be practically controlled by host plant resistance, and locating resistance genes has proven particularly difficult for this disease. The EST libraries will have immediate value to researchers working on Fusarium.

Student training and promotion of involvement of underrepresented groups

Training is strongly addressed in the proposed research. For example, funds are requested for most of the laboratories to employ undergraduate students to gain experience in all aspects of this project. At UC Riverside, an undergraduate class will have direct involvement in the cDNA library preparations. All institutions represented have active graduate programs in plant sciences. Graduate students affiliated with this project will be offered a training course in Microarray Technology at the laboratory of Olin Anderson, which will directly support their thesis research. Visiting scientists (not funded on this project) will be encouraged to join the participating labs for short-term training. Several universities have summer programs for underrepresented groups. These students will be recruited to participating labs for special training opportunities and to enhance the graduate student recruitment pool.
