Annotation

From AAAWiki

Annotation Plan

  • Please see the summary of the workshop held at the April 2006 Fly meeting for a description of the annotation plans for these 12 species.
  • Please see the Annotation Coordination page for ongoing updates to these plans and other relevant information (such as ID codes for annotation submissions).

Externally available annotations of CAF1 assemblies

  • Chris Smith of the DHGP and Robert Edgar have generated de novo repeat predictions for the 12 Drosophilids and repeat-masked their genomes. GFF3 and FASTA files are available here.
  • Don Gilbert at Indiana University has generated protein-coding gene predictions using Ian Korf's SNAP gene prediction system. The results are available here.
  • Sourav Chatterji and Lior Pachter (UC Berkeley) have generated protein-coding gene predictions with 5'/3' UTR annotations using GeneMapper. GeneMapper annotations are obtained by mapping the FlyBase D. melanogaster annotations onto the other species using Mercator homology maps. The results are available here.
  • Casey Bergman (University of Manchester) and Dave Ardell (Uppsala University) have generated tRNA gene predictions using a tRNAscan-SE/Aragorn pipeline. The results are available here.
  • Charles Robin, Robert Good, and Lloyd Low (University of Melbourne) have generated manual annotations of all cytochrome P450s and cytosolic GSTs in the CAF1 assemblies of all 12 Drosophila species. The results are available here.
  • Venky Iyer, Daniel Pollard and Michael Eisen (UC Berkeley) have generated protein coding gene annotations using GeneMapper and Exonerate. Results are available here. GeneWise annotations and orthology/paralogy assignments are coming soon.
  • Anat Caspi and Lior Pachter (UC Berkeley) have posted annotations of repeat elements here.
  • Chris Ponting's group at the University of Oxford have generated a set of gene predictions by mapping D. melanogaster peptides to other species using Guy Slater's Exonerate. The results are available here.
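
Several of the annotation sets above are distributed as GFF3. As a quick sanity check after downloading one, the following minimal sketch tallies the feature types in a GFF3 stream. It assumes only the standard nine tab-separated GFF3 columns and the `##FASTA` directive; the function name `count_feature_types` and the scaffold names, sources, and coordinates in the sample records are hypothetical and are not taken from any of the files listed here.

```python
from collections import Counter
from typing import Iterable


def count_feature_types(lines: Iterable[str]) -> Counter:
    """Count GFF3 features by their type column (column 3)."""
    counts: Counter = Counter()
    for line in lines:
        line = line.rstrip("\n")
        # Everything after the ##FASTA directive is sequence, not features.
        if line.startswith("##FASTA"):
            break
        # Skip blank lines, directives, and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A feature line has exactly nine tab-separated columns.
        if len(fields) != 9:
            continue
        counts[fields[2]] += 1
    return counts


# Hypothetical sample records, for illustration only.
example = [
    "##gff-version 3",
    "scaffold_1\texample_source\trepeat_region\t100\t250\t.\t+\t.\tID=rep1",
    "scaffold_1\texample_source\trepeat_region\t900\t1200\t.\t-\t.\tID=rep2",
    "scaffold_2\texample_source\tgene\t50\t2000\t.\t+\t.\tID=gene1",
    "##FASTA",
    ">scaffold_1",
    "ACGTACGT",
]

if __name__ == "__main__":
    for feature_type, n in count_feature_types(example).most_common():
        print(feature_type, n)
```

To run it on a downloaded file, pass an open file handle instead of the sample list, for example `count_feature_types(open(path))`, where `path` is wherever the file was saved.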

Annotations of older assemblies

  • UCSC has annotations and browsers for most of the sequenced Drosophila species, which can be accessed from the individual species pages or directly at UCSC.
  • Venky Iyer and Daniel Pollard in Michael Eisen's group at UC Berkeley/LBNL have developed a homology-based gene-finding pipeline and applied it to all available genomes. These annotations can be browsed and searched through the GBrowse browsers accessible on each species page, or can be downloaded here.
  • Ian Holmes at UC Berkeley has performed whole-genome scans for conserved RNA secondary structures using Jotun Hein's PFOLD. The results are available here.

Annotation Desirables

  • Gene predictions for all 12 species
  • The EisenLab annotation pipeline builds protein-coding gene models in each species based on D. melanogaster annotations
  • Sourav's GeneMapper transfers annotations from D. melanogaster to the 11 other species
  • Lior Pachter's group will run de novo gene predictors on all species
  • [note: UCSC has Genscan, etc. in their pipeline; Don Gilbert has SNAP predictions on CAF1]

Desirable analyses