The DDBJ/EMBL/GenBank Feature Table Definition

Version 8  Oct 2008



DNA Data Bank of Japan, Mishima, Japan.
EMBL Nucleotide Sequence Database, Cambridge, UK.
GenBank, NCBI, Bethesda, MD, USA.


1 Introduction
2 Overview of the Feature Table format
2.1 Format Design
2.2 Key aspects of this feature table design
2.3 Feature Table Terminology
3 Feature table components and format
3.1 Naming conventions
3.2 Feature keys
3.2.1 Purpose
3.2.2 Format and conventions
3.2.3 Key groups and hierarchy
3.2.4 Feature key examples
3.3 Qualifiers
3.3.1 Purpose
3.3.2 Format and conventions
3.3.3 Qualifier values
3.3.4 Qualifier examples
3.4 Feature labels
3.4.1 Purpose
3.4.2 Format and conventions
3.4.3 Examples of feature labels
3.5 Location
3.5.1 Purpose
3.5.2 Format and conventions
3.5.3 Location examples
4 Feature table Format
4.1 Format examples
4.2 Definition of line types
4.3 Data item positions
4.4 Use of blanks
5 Examples of sequence annotation
5.1 Eukaryotic gene
5.2 Bacterial operon
5.3 Artificial cloning vector (circular)
5.4 Plasmid
5.5 Repeat element
5.6 Immunoglobulin heavy chain
5.7 T-cell receptor
5.8 Transfer RNA
6 Limitations of this feature table design
7 Appendices
7.1 Appendix I EMBL, GenBank and DDBJ entries
7.1.1 EMBL Format
7.1.2 GenBank Format
7.1.3 DDBJ Format
7.2 Appendix II Feature table: Backus-Naur form
7.3 Appendix III: Feature keys reference
7.4 Appendix IV: Summary of qualifiers for feature keys
7.4.1 Qualifier List
7.4.2 Feature qualifiers - mapped to Feature keys
7.5 Appendix V: Controlled vocabularies
7.5.1 Nucleotide base codes (IUPAC)
7.5.2 Modified base abbreviations
7.5.3 Amino acid abbreviations
7.5.4 Modified and unusual Amino Acids
7.5.5 Genetic Code Tables
7.5.6 Country Names



1 Introduction

Nucleic acid sequences provide the fundamental starting point for describing 
and understanding the structure, function, and development of genetically 
diverse organisms. The GenBank, EMBL, and DDBJ nucleic acid sequence data 
banks have from their inception used tables of sites and features to describe 
the roles and locations of higher order sequence domains and elements within 
the genome of an organism. 
In February, 1986, GenBank and EMBL began a collaborative effort (joined by 
DDBJ in 1987) to devise a common feature table format and common standards for 
annotation practice. 

2 Overview of the Feature Table format

The overall goal of the feature table design is to provide an extensive 
vocabulary for describing features in a flexible framework for manipulating 
them. The Feature Table documentation represents the shared rules that allow 
the three databases to exchange data on a daily basis. 
The range of features to be represented is diverse, including regions which: 
* perform a biological function, 
* affect or are the result of the expression of a biological function, 
* interact with other molecules, 
* affect replication of a sequence, 
* affect or are the result of recombination of different sequences, 
* are a recognizable repeated unit, 
* have secondary or tertiary structure,
* exhibit variation, or have been revised or corrected.


2.1 Format Design
 
The format design is based on a tabular approach and consists of the following 
items: 

Feature key - a single word or abbreviation indicating functional group  
Location - instructions for finding the feature 
Qualifiers - auxiliary information about a feature 
 

2.2 Key aspects of this feature table design 

* Feature keys allow specific annotation of important sequence features.

* Related features can be easily specified and retrieved.
Feature keys are arranged hierarchically, allowing complex and compound 
features to be expressed. Both location operators and the feature keys show 
feature relationships even when the features are not contiguous. The hierarchy 
of feature keys allows broad categories of biological functionality, such as 
rRNAs, to be easily retrieved.

* Generic feature keys provide a means for entering new or undefined features.
A number of "generic" or miscellaneous feature keys have been added to permit 
annotation of features that cannot be adequately described by existing feature 
keys. These generic feature keys will serve as an intermediate step in the 
identification and addition of new feature keys. The syntax has been designed 
to allow the addition of new feature keys as they are required. 

* More complex locations (fuzzy and alternate ends, for example) can be specified.
Each end point of a feature may be specified as a single point, an alternate 
set of possible end points, a base number beyond which the end point lies, or 
a region which contains the end point. 

* Features can be combined and manipulated in many different ways.
The location field can contain operators or functional descriptors specifying 
what must be done to the sequence to reproduce the feature. For example, a 
series of exons may be "join"ed into a full coding sequence. 

* Standardized qualifiers provide precision and parsability of descriptive details.
A combination of standardized qualifiers and their controlled-vocabulary 
values enables free-text descriptions to be avoided.
 
* The nature of supporting evidence for a feature can be explicitly indicated.
Features, such as open reading frames or sequences showing sequence similarity 
to consensus sequences, for which there is no direct experimental evidence can 
be annotated. Therefore, the feature table can incorporate contributions from 
researchers doing computational analysis of the sequence databases. However, 
all features that are supported by experimental data will be clearly marked as 
such. 

* The table syntax has been designed to be machine parsable.
A consistent syntax allows machine extraction and manipulation of sequences 
coding for all features in the table.
 
2.3 Feature Table Terminology 
The format and wording in the feature table use common biological research 
terminology whenever possible. For example, an item in the feature table such as: 

Key             Location/Qualifiers
CDS             23..400
                /product="alcohol dehydrogenase" 
                /gene="adhI"
 
might be read as: 
The feature CDS is a coding sequence beginning at base 23 and ending at base 
400, has a product called "alcohol dehydrogenase" and is coded for by a gene 
called "adhI".

A more complex description:
Key             Location/Qualifiers
CDS             join(544..589,688..>1032)
                /product="T-cell receptor beta-chain"

which might be read as: 
This feature, which is a partial coding sequence, is formed by joining the 
indicated elements to form one contiguous sequence encoding a product called 
T-cell receptor beta-chain. 

The following sections contain detailed explanations of the feature table 
design showing conventions for each component of the feature table, examples 
of how the format might be implemented, a description of the exact column 
placement of all the data items and examples of complete sequence entries that 
have been annotated using the new format. The last section of this document 
describes known limitations of the current feature table design. 

Appendix I gives an example database entry for the DDBJ, GenBank and EMBL 
formats. 

Appendix II describes the format in Backus-Naur Form (BNF). This information
will not be presented in future editions of this document. 

Appendices III and IV provide reference manuals for the feature table keys and 
qualifiers, respectively. 

Appendix V includes controlled vocabularies such as nucleotide base codes, 
modified base abbreviations, genetic code tables etc.

This document defines the syntax and vocabulary of the feature table. The 
syntax is sufficiently flexible to allow expression of a single biological 
entity in numerous ways. In such cases, the annotation staffs at the databases 
will propose conventions for standard means of denoting the entities. 
This feature table format is shared by GenBank, EMBL and DDBJ. Comments, 
corrections, and suggestions may be submitted to any of the database staffs. 
New format specifications will be added as needed. 
 
3 Feature table components and format
3.1 Naming conventions

Feature table components, including feature keys, qualifiers, accession 
numbers, database name abbreviations, feature labels, and location operators, 
are all named following the same conventions. Component names may be no more 
than 20 characters long (feature keys 15, feature qualifiers 20) and must 
contain at least one letter. Case should not be regarded as significant in 
comparing feature labels ("Prot1" and "pROT1" are the same). The following 
characters are permitted to occur in feature table component names: 

* Uppercase letters (A-Z) 
* Lowercase letters (a-z) 
* Numbers (0-9) 
* Underscore (_) 
* Hyphen (-) 
* Single quotation mark or apostrophe (') 
* Asterisk (*) 
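
The naming rules above can be checked mechanically. The following is a minimal 
Python sketch, not part of the definition itself; the function names 
is_valid_component_name and same_label are hypothetical and chosen only for 
illustration. 

```python
import re

# Characters permitted by Section 3.1: letters, digits, underscore,
# hyphen, apostrophe and asterisk.
_ALLOWED = re.compile(r"[A-Za-z0-9_\-'*]+")


def is_valid_component_name(name: str, max_length: int = 20) -> bool:
    """Return True if `name` follows the Section 3.1 naming conventions.

    `max_length` defaults to the general limit of 20; pass 15 when
    checking a feature key.
    """
    if not name or len(name) > max_length:
        return False
    if _ALLOWED.fullmatch(name) is None:
        return False
    # At least one letter is required.
    return any(ch.isascii() and ch.isalpha() for ch in name)


def same_label(first: str, second: str) -> bool:
    """Compare two feature labels, ignoring letter case."""
    return first.lower() == second.lower()
```

With these definitions, is_valid_component_name("-35_signal") is accepted 
because the name contains letters, while a purely numeric name such as "12345" 
is rejected. 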


3.2 Feature keys
3.2.1 Purpose

Feature keys indicate 
(1) the biological nature of the annotated feature or 
(2) information about changes to or other versions of the sequence. 
The feature key permits a user to quickly find or retrieve similar features or 
features with related functions. 

3.2.2 Format and conventions

There is a defined list of allowable feature keys, which is shown in Appendix 
III. Each feature must contain a feature key. 
 
3.2.3 Key groups and hierarchy

The feature keys fall into families which are in some sense similar in 
function and which are annotated in a similar manner. A functional family may 
have a "generic" or miscellaneous key, which can be recognized by the 'misc_' 
prefix, that can be used for instances not covered by the other defined keys of 
that group. 

The feature key groups are listed below with a short definition and an 
annotation example: 

1. Difference and change features 
Indicate ways in which a sequence should be changed to produce a different 
"version": 
misc_difference location
              /replace="change_location"

2. Expression signal features
Indicate regions containing a signal that alters a biological function: 
misc_signal     location

3. Transcript features
Indicate products made by a region: 
misc_RNA        location

4. Binding features
Indicate that a sequence or nucleotide is covalently, non-covalently, or 
otherwise bound to something else: 
misc_binding    location
              /bound_moiety="bound molecule" 

5. Repeat features
Indicate repetitive sequence elements: 
repeat_region   location

6. Recombination features
Indicate regions that have been either inserted or deleted by recombination: 
misc_recomb     location

7. Structure features
Indicate sequence for which there is secondary or tertiary structural 
information: 
misc_structure  location

In addition to the functional groupings shown above, the feature keys can also 
be arranged in a hierarchical tree based on the degree of specificity or level 
of detail known about a feature. This hierarchy is shown in outline form in 
Appendix III where the most general level is the 'misc_feature' key and other 
keys are arranged in increasing level of detail. By using more general keys, 
features can be annotated even if their biological functions are 
insufficiently well characterized to assign them more specific keys. 

3.2.4 Feature key examples

Key                     Description     

CDS                     Protein-coding sequence 
RBS                     ribosome binding site
rep_origin              Origin of replication
protein_bind            Protein binding site on DNA
tRNA                    mature transfer RNA

See Appendix III for descriptions of all feature keys. 

3.3 Qualifiers

3.3.1 Purpose

Qualifiers provide a general mechanism for supplying information about 
features in addition to that conveyed by the key and location. 

3.3.2 Format and conventions

Qualifiers take the form of a slash (/) followed by the qualifier name and, if 
applicable, an equal sign (=) and a value. Each qualifier should have a single 
value; if multiple values are necessary, these should be represented by 
iterating the same qualifier, e.g.: 
Key             Location/Qualifiers

CDS             1..1000
                /codon=(seq:"cug",aa:Ser)
                /codon=(seq:"tga",aa:Trp)

If the location descriptor does not need a continuation line, the first 
qualifier begins a new line in the feature location column. If the location 
descriptor requires a continuation line, the first qualifier may follow 
immediately after the location. Any necessary continuation lines begin in the 
same column. See Section 4 for a complete description of data item positions. 
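
As an illustration of the qualifier syntax, the sketch below splits a qualifier 
into its name and value and groups iterated qualifiers under one name. It 
assumes that continuation lines have already been merged into a single string 
per qualifier; split_qualifier and collect_qualifiers are hypothetical helper 
names, not part of any database toolkit. 

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def split_qualifier(text: str) -> Tuple[str, Optional[str]]:
    """Split '/name=value' into (name, value); value is None for '/name'."""
    text = text.strip()
    if not text.startswith("/"):
        raise ValueError(f"qualifier must start with a slash: {text!r}")
    name, separator, value = text[1:].partition("=")
    if not name:
        raise ValueError(f"qualifier has no name: {text!r}")
    return name, (value if separator else None)


def collect_qualifiers(qualifiers: List[str]) -> Dict[str, List[Optional[str]]]:
    """Group values by qualifier name, keeping iterated qualifiers in order."""
    collected = defaultdict(list)
    for text in qualifiers:
        name, value = split_qualifier(text)
        collected[name].append(value)
    return dict(collected)
```

Applied to the two /codon qualifiers in the example above, collect_qualifiers 
returns a single "codon" entry holding both values. 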
 

3.3.3 Qualifier values 

Since qualifiers convey many different types of information, there are several value formats: 
1. Free text 
2. Controlled vocabulary or enumerated values 
3. Citation or reference numbers 
4. Sequences 
5. Feature labels 

3.3.3.1 Free text

Most qualifier values will be a descriptive text phrase which must be enclosed 
in double quotation marks. When the text occupies more than one line, a single 
set of quotation marks is required at the beginning and at the end of the 
text. The text itself may be composed of any printable characters (ASCII 
values 32-126 decimal). If double quotation marks are used within a free text 
string, each set (") must be 'escaped' by placing a second double quotation 
mark immediately before it (""). For example: 
              /note="This is an example of ""escaped"" quotation marks"
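
The escaping rule can be sketched in Python as follows. The helper names 
quote_free_text and unquote_free_text are hypothetical; the sketch assumes the 
value has already been assembled from its continuation lines. 

```python
def quote_free_text(text: str) -> str:
    """Wrap free text in double quotes, doubling any embedded double quote."""
    if any(not 32 <= ord(ch) <= 126 for ch in text):
        raise ValueError("free text must be printable ASCII (32-126 decimal)")
    return '"' + text.replace('"', '""') + '"'


def unquote_free_text(value: str) -> str:
    """Reverse quote_free_text: strip the outer quotes and undouble the rest."""
    if len(value) < 2 or value[0] != '"' or value[-1] != '"':
        raise ValueError("free text must be enclosed in double quotation marks")
    inner = value[1:-1]
    # Any quote left after removing escaped pairs was not escaped.
    if '"' in inner.replace('""', ''):
        raise ValueError("unescaped double quotation mark inside free text")
    return inner.replace('""', '"')
```

Quoting the text of the /note example above reproduces the qualifier value 
shown there, and unquoting it restores the original text. 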

3.3.3.2 Controlled vocabulary or enumerated values

Some qualifiers require values from a controlled vocabulary and are entered 
without quotation marks. For example, the '/direction' qualifier has only 
three values: 'left', 'right' or 'both'. Qualifier value controlled 
vocabularies, like feature table component names, must be treated as 
completely case insensitive: they may be entered and displayed in any 
combination of upper and lower case ('/direction=Left', '/direction=left' and 
'/direction=LEFT' are all legal and all convey the same meaning). The database 
staffs reserve the right to regularize the case of qualifier values in the 
interest of readability, unlike the case of feature labels where the databases 
will maintain the case as originally entered (see Section 3.4.2). Qualifier 
value controlled vocabularies will be maintained by the cooperating database 
staffs. Examples of controlled vocabularies can be found in Appendices IV and 
V. The database staff should be contacted for the current lists. 
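
A case-insensitive check of a controlled-vocabulary value might look like the 
sketch below. It covers only the three '/direction' values named above; the 
choice to regularize to lower case is an assumption made for illustration, and 
normalize_direction is a hypothetical name. 

```python
# The three values the text gives for the '/direction' qualifier.
DIRECTION_VALUES = frozenset({"left", "right", "both"})


def normalize_direction(value: str) -> str:
    """Return the lower-case form of a '/direction' value, or raise ValueError."""
    normalized = value.strip().lower()
    if normalized not in DIRECTION_VALUES:
        raise ValueError(f"not a legal /direction value: {value!r}")
    return normalized
```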

3.3.3.3 Citation or reference numbers

The citation or published reference number (as enumerated in the entry 
'REFERENCE' or 'RN' data item) should be enclosed in square brackets 
(e.g., [3]) to distinguish it from other numbers. 

3.3.3.4 Sequences

Literal sequences of nucleotide bases in location descriptors, e.g., 
join(12..45,"atgcatt",988..1050), have been illegal since the implementation of 
version 2.1 of the Feature Table Definition Document (December 15, 1998). 

3.3.4 Qualifier examples

Key             Location/Qualifiers

source          1..1509
                /organism="Mus musculus"
                /strain="CD1"
                /mol_type="genomic DNA"
promoter        <1..9
                /gene="ubc42"
mRNA            join(10..567,789..1320)
                /gene="ubc42"
CDS             join(54..567,789..1254)
                /gene="ubc42"
                /product="ubiquitin conjugating enzyme"
                /function="cell division control"

3.4 Feature labels

The /label= qualifier takes as its value a feature label. Feature labels 
follow the same naming conventions as other feature table components (e.g., 
keys and qualifiers). While feature labels are optional, attaching a label to 
a feature allows it to be referred to unambiguously. For example, the feature 
label can be used to refer unambiguously to a coding region that exists in a 
different entry from the exons of which it is composed.

3.4.1 Purpose

The feature label identifies a feature item within an entry and, when combined 
with the entry's primary accession number and the name of the database from 
which it came, is a permanent internationally unique tag for that feature. 
There are, however, certain situations in which a "permanent" feature may 
"disappear" from the distributed version of the database and others in which it 
may be desirable to change a feature's label.  

3.4.2 Format and conventions

Each feature in a feature table may have a label which must be unique within 
that entry, but which may be the same as feature labels used in other entries. 
A feature can be given any label. However, labels containing meaningful 
abbreviations will be much more easily remembered than non-descriptive labels. 
Because letter case is not significant, two features within one entry cannot 
have labels that differ only in case: '16S_rRNA' and '16s_rRNA' could not both 
be used in the same entry. 
The full feature name syntax is as follows: 
          Database name::primary accession number:feature label
References to a feature should use as much of the full feature name as 
required to unambiguously identify the feature. 

3.4.3 Examples of feature labels

Feature label           Description     

adhI                    adhI gene coding for alcohol dehydrogenase
tfp35                   tail fiber protein 35
3'-ltr                  long terminal repeat
a1col_x51               prepro-alpha-1-collagen, exon 51
X10045:diff1            first conflict for the sequence of entry X10045
GB::K10675:catexA       feature with label catexA in entry K10675 of the
                        GenBank databank
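
The full feature name syntax of Section 3.4.2 can be taken apart with a few 
string operations. The following Python sketch is illustrative only; the 
FeatureName type and parse_feature_name function are hypothetical names, and 
parts omitted from a reference are reported as None. 

```python
from typing import NamedTuple, Optional


class FeatureName(NamedTuple):
    database: Optional[str]    # e.g. "GB", or None when omitted
    accession: Optional[str]   # primary accession number, or None when omitted
    label: str                 # the feature label itself


def parse_feature_name(text: str) -> FeatureName:
    """Parse 'database::accession:label', where leading parts may be omitted."""
    database = None
    rest = text.strip()
    if "::" in rest:
        database, rest = rest.split("::", 1)
    accession = None
    if ":" in rest:
        accession, label = rest.split(":", 1)
    else:
        label = rest
    if not label:
        raise ValueError(f"feature name has no label: {text!r}")
    return FeatureName(database, accession, label)
```

For the last example in the table, parse_feature_name("GB::K10675:catexA") 
yields the database "GB", the accession "K10675" and the label "catexA". 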

3.5 Location
3.5.1 Purpose

The location indicates the region of the presented sequence which corresponds 
to a feature. 

3.5.2 Format and conventions
The location contains at least one sequence location descriptor and may 
contain one or more operators with one or more sequence location descriptors. 
Base numbers refer to the numbering in the entry. This numbering designates 
the first base (5' end) of the presented sequence as base 1. 
Base locations beyond the range of the presented sequence may not be used in 
location descriptors, the only exception being location in a remote entry (see 
3.5.2.1, e).  

Location operators and descriptors are discussed in more detail below.  

3.5.2.1 Location descriptors
The location descriptor can be one of the following: 
(a) a single base number
(b) a site between two indicated adjoining bases
(c) a single base chosen from within a specified range of bases (not allowed for new
    entries)
(d) the base numbers delimiting a sequence span
(e) a remote entry identifier followed by a local location descriptor
    (i.e., a-d)

A site between two adjoining nucleotides, such as an endonucleolytic cleavage 
site, is indicated by listing the two points separated by a caret (^). The 
permitted formats for this descriptor are n^n+1 (for example 55^56), or, for 
circular molecules, n^1, where "n" is the full length of the molecule, i.e., 
1000^1 for a circular molecule with length 1000.

A single base chosen from a range of bases is indicated by the first base 
number and the last base number of the range separated by a single period 
(e.g., '12.21' indicates a single base taken from between the indicated 
points). Since October 2006 the use of this descriptor has been restricted: 
it is illegal to use "a single base from a range" (c) either on its own or 
in combination with the "sequence span" (d) descriptor for newly created entries. 
Existing entries that contain such descriptors are to be retrofitted. 

Sequence spans are indicated by the starting base number and the ending base 
number separated by two periods (e.g., '34..456'). The '<' and '>' symbols may 
be used with the starting and ending base numbers to indicate that an end 
point is beyond the specified base number. The starting and ending base 
positions can be represented as distinct base numbers ('34..456') or a site 
between two indicated adjoining bases. 

A location in a remote entry (not the entry to which the feature table 
belongs) can be specified by giving the accession number and sequence version 
of the remote entry, followed by a colon ":", followed by a location 
descriptor which applies to that entry's sequence (e.g., J12345.1:1..15; see 
also the examples below). 
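
The five descriptor forms can be recognized with a single regular expression. 
The Python sketch below is one possible reading of the rules in this section, 
not a reference parser: the Descriptor type and parse_descriptor function are 
hypothetical names, the accession pattern is a simplifying assumption, and the 
'<' and '>' symbols are accepted only on sequence spans. 

```python
import re
from typing import NamedTuple, Optional


class Descriptor(NamedTuple):
    kind: str               # "base", "between", "one_of" or "span"
    start: int
    end: int
    start_open: bool        # True when the start carries '<'
    end_open: bool          # True when the end carries '>'
    remote: Optional[str]   # e.g. "J00194.1" for a remote entry, else None


# Optional "accession.version:" prefix, then one of: n, n^m, n.m, n..m
_DESCRIPTOR = re.compile(
    r"(?:(?P<remote>[A-Za-z0-9_]+\.\d+):)?"
    r"(?P<lt><)?(?P<start>\d+)"
    r"(?:(?P<sep>\.\.|\.|\^)(?P<gt>>)?(?P<end>\d+))?"
)
_KINDS = {None: "base", "..": "span", ".": "one_of", "^": "between"}


def parse_descriptor(text: str) -> Descriptor:
    """Parse one location descriptor (forms a to e of Section 3.5.2.1)."""
    match = _DESCRIPTOR.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a location descriptor: {text!r}")
    kind = _KINDS[match["sep"]]
    start = int(match["start"])
    end = int(match["end"]) if match["end"] else start
    start_open = match["lt"] is not None
    end_open = match["gt"] is not None
    if kind != "span" and (start_open or end_open):
        raise ValueError("'<' and '>' are only handled on sequence spans here")
    # n^n+1, or n^1 on a circular molecule (the length is not checked here).
    if kind == "between" and end not in (start + 1, 1):
        raise ValueError("a between-bases site must be n^n+1 or n^1")
    return Descriptor(kind, start, end, start_open, end_open, match["remote"])
```

A caller that creates new entries would additionally reject the "one_of" kind, 
since that form is not allowed for new entries. 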

3.5.2.2 Operators

The location operator is a prefix that specifies what must be done to the 
indicated sequence to find or construct the location corresponding to the 
feature. A list of operators is given below with their definitions and most 
common format. 

complement(location) 
Find the complement of the presented sequence in the span specified by 
"location" (i.e., read the complement of the presented strand in its 5'-to-3' 
direction) 

join(location,location, ... location) 
The indicated elements should be joined (placed end-to-end) to form one 
contiguous sequence 

order(location,location, ... location) 
The elements can be found in the specified order (5' to 3' direction), but 
nothing is implied about the reasonableness of joining them 

Note: the location operator "complement" can be used in combination with either 
"join" or "order" within the same location; combinations of "join" and "order" 
within the same location (nested operators) are illegal.
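
To make the effect of "complement" and "join" concrete, the following Python 
sketch applies a location to a sequence string. It is a simplified 
illustration under several assumptions: only the unambiguous bases a, c, g and 
t are complemented, the '<' and '>' symbols are ignored, and "order" and remote 
entries are rejected because no single sequence is implied for them. The 
function names are hypothetical. 

```python
from typing import List

_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(sequence: str) -> str:
    """Complement each base and read the result in its own 5'-to-3' direction."""
    return sequence.translate(_COMPLEMENT)[::-1]


def split_top_level(arguments: str) -> List[str]:
    """Split an operand list on commas that are not inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in arguments:
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
            continue
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        current.append(ch)
    parts.append("".join(current))
    return parts


def extract(location: str, sequence: str) -> str:
    """Return the bases of `sequence` (base 1 first) selected by `location`."""
    location = location.strip()
    if location.startswith("complement(") and location.endswith(")"):
        inner = location[len("complement("):-1]
        return reverse_complement(extract(inner, sequence))
    if location.startswith("join(") and location.endswith(")"):
        parts = split_top_level(location[len("join("):-1])
        return "".join(extract(part, sequence) for part in parts)
    if location.startswith("order(") or ":" in location:
        raise ValueError(f"not handled in this sketch: {location!r}")
    start, _, end = location.replace("<", "").replace(">", "").partition("..")
    return sequence[int(start) - 1:int(end or start)]
```

On any sequence, complement(join(a,b)) and join(complement(b),complement(a)) 
select the same bases, which is why both forms describe the same feature on 
the complementary strand. 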



3.5.3 Location examples 

The following is a list of common location descriptors with their meanings: 

Location                  Description   

467                       Points to a single base in the presented sequence 

340..565                  Points to a continuous range of bases bounded by and
                          including the starting and ending bases

<345..500                 Indicates that the exact lower boundary point of a feature
                          is unknown. The location begins at some base previous to
                          the first base specified (which need not be contained in 
                          the presented sequence) and continues to and includes the 
                          ending base 

<1..888                   The feature starts before the first sequenced base and 
                          continues to and includes base 888

1..>888                   The feature starts at the first sequenced base and 
                          continues beyond base 888

102.110                   Indicates that the exact location is unknown but that it is 
                          one of the bases between bases 102 and 110, inclusive

123^124                   Points to a site between bases 123 and 124

join(12..78,134..202)     Regions 12 to 78 and 134 to 202 should be joined to form 
                          one contiguous sequence


complement(34..126)       Start at the base complementary to 126 and finish at the 
                          base complementary to base 34 (the feature is on the strand 
                          complementary to the presented strand)


complement(join(2691..4571,4918..5163))
                          Joins regions 2691 to 4571 and 4918 to 5163, then 
                          complements the joined segments (the feature is on the 
                          strand complementary to the presented strand) 

join(complement(4918..5163),complement(2691..4571))
                          Complements regions 4918 to 5163 and 2691 to 4571, then 
                          joins the complemented segments (the feature is on the 
                          strand complementary to the presented strand)
  
J00194.1:100..202         Points to bases 100 to 202, inclusive, in the entry (in 
                          this database) with primary accession number 'J00194'
 
join(1..100,J00194.1:100..202)
                          Joins region 1..100 of the existing entry with the region
                          100..202 of remote entry J00194


4 Feature table Format

This section describes the columnar format used to write the feature table in 
"flat-file" form for distributions of the database. 

4.1 Format examples

Feature table format example (EMBL): 
     source          1..1859
                     /db_xref="taxon:3899"
                     /organism="Trifolium repens"
                     /tissue_type="leaves"
                     /clone_lib="lambda gt10"
                     /clone="TRE361"
                     /mol_type="genomic DNA"
     CDS             14..1495
                     /db_xref="MENDEL:11000"
                     /db_xref="SWISS-PROT:P26204"
                     /note="non-cyanogenic"
                     /EC_number="3.2.1.21"
                     /product="beta-glucosidase"
                     /protein_id="CAA40058.1"
                     /translation="MDFIVAIFALFVISSFTITSTNAVEASTLLDIGNLSR.......
---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79

Feature table format example (GenBank):

     source          1..8959
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /mol_type="genomic DNA"
     gene            212..8668
                     /gene="NF1"
     CDS             212..8668
                     /gene="NF1"
                     /note="putative"
                     /codon_start=1
                     /product="GAP-related protein"
                     /protein_id="AAA59924.1"
                     /translation="MAAHRPVEWVQAVVSRFDEQLPIKTGQQNTHTKVSTE.......
---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79

Feature table format example (DDBJ):

 
     source          1..2136
                     /clone="pK28"
                     /organism="Rattus norvegicus"
                     /strain="Sprague-Dawley"
                     /tissue_type="kidney"
                     /mol_type="genomic DNA" 
     mRNA            19..2128
     CDS             31..1212
                     /codon_start=1
                     /function="Dual specificity protein tyrosine/threonine
                     kinase"
                     /product="MAP kinase kinase"
                     /protein_id="BAA02603.1"
                     /translation="MPKKKPTPIQLNPAPDGSAVNGTSSAETNLEALQKKL.......
---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79


4.2 Definition of line types
The feature table consists of a header line, which contains the column titles 
for the table, and the individual feature entries. Each feature entry is 
composed of a feature descriptor line and qualifier and continuation lines, 
if needed. The feature descriptor line contains the feature's name, key, and 
location. If the location cannot be contained on the first line of the feature 
descriptor, it is continued on a continuation line immediately following the 
descriptor line. If the feature requires further attributes, feature qualifier 
lines immediately follow the corresponding feature descriptor line (or its 
continuation). Qualifier information that cannot be contained on one line 
continues on the following continuation lines as necessary.
 
Thus, there are 4 types of feature table lines: 
      Line type            Content                 #/entry     #/feature
      ---------            -------                 -------     ---------

      Header               Column titles           1*          N/A
      Feature descriptor   Key and location        1 to many*  1
      Feature qualifiers   Qualifiers and values   N/A         0 to many
      Continuation lines   Feature descriptor or   0 to many   0 to many
                           qualifier continuation


4.3 Data item positions

The position of the data items within the feature descriptor line is as follows: 
     column position    data item
     ---------------    ---------

     1-5                blank 
     6-20               feature key
     21                 blank
     22-80              location

Data on the qualifier and continuation lines begins in column position 22 (the 
first 21 columns contain blanks). The EMBL format for all lines differs from 
the GenBank / DDBJ formats in that it includes a line type abbreviation in 
columns 1 and 2. 
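
The column positions above translate directly into string slices. The Python 
sketch below groups feature table lines into features; it is an illustration 
under stated assumptions rather than a complete reader. It assumes the header 
line has already been removed, ignores columns 1-5 (so an EMBL line type 
abbreviation is skipped), joins qualifier continuation lines with a single 
space, and treats any continuation line that begins with a slash as a new 
qualifier. The function name parse_feature_lines is hypothetical. 

```python
from typing import Dict, Iterable, List


def parse_feature_lines(lines: Iterable[str]) -> List[Dict]:
    """Group feature table lines into dicts with key, location and qualifiers."""
    features: List[Dict] = []
    for raw in lines:
        line = raw.rstrip()
        if not line:
            continue
        key = line[5:20].strip()    # columns 6-20: feature key
        data = line[21:].strip()    # columns 22 onward: location or qualifier
        if key:
            # Feature descriptor line: starts a new feature.
            features.append({"key": key, "location": data, "qualifiers": []})
            continue
        if not features:
            raise ValueError("continuation line before any feature descriptor")
        current = features[-1]
        if data.startswith("/"):
            current["qualifiers"].append(data)        # new qualifier line
        elif current["qualifiers"]:
            current["qualifiers"][-1] += " " + data   # qualifier continuation
        else:
            current["location"] += data               # location continuation
    return features
```

Joining with a space suits wrapped free text; a wrapped /translation value 
would instead need its lines joined without a separator. 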

4.4 Use of blanks

Blanks (spaces) may, in general, be used within the feature location and 
qualifier values to make the construction more readable. The following rules 
should be observed: 
* Names of feature table components may not contain blanks (see Section 3.1) 
* Operator names may not be separated from the following open parenthesis (the 
  beginning of the operand list) by blanks. 
* Qualifiers may not be separated from the preceding slash or the following 
  equals sign (if one) by blanks 


5 Examples of sequence annotation

The examples below show the preferred sequence annotations for a number of 
commonly occurring sequence types. These examples may not be appropriate in 
all cases but should be used as a guide whenever possible.

5.1 Eukaryotic gene 

source             1..1509
                   /organism="Mus musculus"
                   /strain="CD1"
                   /mol_type="genomic DNA"
promoter           <1..9
                   /gene="ubc42"
mRNA               join(10..567,789..1320)
                   /gene="ubc42"
CDS                join(54..567,789..1254)
                   /gene="ubc42"
                   /product="ubiquitin conjugating enzyme"
                   /function="cell division control"
                   /translation="MVSSFLLAEYKNLIVNPSEHFKISVNEDNLTEGPPDTLY
                   QKIDTVLLSVISLLNEPNPDSPANVDAAKSYRKYLYKEDLESYPMEKSLDECS
                   AEDIEYFKNVPVNVLPVPSDDYEDEEMEDGTYILTYDDEDEEEDEEMDDE"
exon               10..567
                   /gene="ubc42"
                   /number=1
intron             568..788
                   /gene="ubc42"
                   /number=1
exon               789..1320
                   /gene="ubc42"
                   /number=2
polyA_signal       1310..1317
                   /gene="ubc42"



 
5.2 Bacterial operon

source                  1..9430
                        /organism="Lactococcus sp."
                        /strain="MG1234"
                        /mol_type="genomic DNA"
operon                  160..6865
                        /operon="gal"
-35_signal              160..165
                        /operon="gal"
                        /experiment="experimental evidence, no additional details
                        recorded" 
-10_signal              179..184
                        /operon="gal"
                        /experiment="experimental evidence, no additional details
                        recorded" 
CDS                     405..1934
                        /operon="gal"
                        /gene="galA"
                        /product="galactose permease"
                        /function="galactose transporter"
                        /experiment="experimental evidence, no additional details
                        recorded" 
CDS                     2003..3001
                        /operon="gal"
                        /gene="galM"
                        /product="aldose 1-epimerase"
                        /EC_number="5.1.3.3"
                        /function="mutarotase"
CDS                     3235..4537
                        /operon="gal"
                        /gene="galK"
                        /product="galactokinase"
                        /EC_number="2.7.1.6"
                        /experiment="experimental evidence, no additional details
                        recorded" 
mRNA                    189..6865
                        /operon="gal"
                        /experiment="experimental evidence, no additional details
                        recorded" 


5.3 Artificial cloning vector (circular)

source                  1..5300
                        /organism="Cloning vector pABC"
                        /lab_host="Escherichia coli"
                        /mol_type="other DNA"
                        /focus
source                  1..5138
                        /organism="Escherichia coli"
                        /mol_type="other DNA"
                        /strain="K12"
source                  5139..5247
                        /organism="Aequorea victoria"
                        /mol_type="other DNA"
                        /dev_stage="adult"
source                  5248..5300
                        /organism="Escherichia coli"
                        /mol_type="other DNA"
                        /strain="K12"
CDS                     join(complement(<1..799),complement(5080..5120))
                        /gene="mob1"
                        /product="mobilization protein 1"
CDS                     complement(1697..2512)
                        /gene="Km"
                        /product="kanamycin resistance protein"
CDS                     3037..3711
                        /gene="rep1"
                        /product="replication protein 1"
CDS                     complement(4170..4829)
                        /gene="Cm"
                        /product="chloramphenicol resistance protein"
CDS                     5139..5247
                        /gene="GFP"
                        /product="green fluorescent protein" 



5.4 Plasmid

source                  1..2245
                        /organism="Escherichia coli"
                        /plasmid="Plasmid XYZ"
                        /strain="K12"
                        /mol_type="genomic DNA"
rep_origin              6
                        /direction=LEFT
                        /note="ori"
CDS                     join(complement(567..795),complement(21..349))
                        /gene="trbC"
                        /product="transfer protein C"
CDS                     803..1344
                        /gene="traN"
                        /product="transfer protein N"
CDS                     1559..1985
                        /gene="incA"
                        /product="incompatibility protein A"
CDS                     join(2004..2195,3..20)
                        /gene="finP"
                        /product="fertility inhibition protein P"


5.5 Repeat element

source                  1..1011
                        /organism="Homo sapiens"
                        /clone="pha281u/1DO"
                        /mol_type="genomic DNA"
repeat_region           80..401
                        /rpt_type=DISPERSED
                        /rpt_family="Alu-J"


5.6 Immunoglobulin heavy chain

source                  1..321
                        /organism="Mus musculus"
                        /strain="BALB/c"
                        /cell_line="hybridoma 1A4"
                        /rearranged
                        /mol_type="mRNA"
CDS                     <1..>321
                        /codon_start=1
                        /gene="VFM1-DFL16.1-JH4"
                        /product="immunoglobulin heavy chain"
V_region                1..277
                        /gene="VFM1"
                        /product="immunoglobulin heavy chain variable region" 


5.7 T-cell receptor

source                  1..402
                        /organism="Homo sapiens"
                        /sex="male"
                        /cell_type="CD4+ T-lymphocyte"
                        /rearranged
                        /clone="TCR1A.12"
                        /mol_type="mRNA"
sig_peptide             1..54
                        /gene="TCR1A"
CDS                     1..402
                        /gene="TCR1A"
                        /product="T-cell receptor alpha chain"
mat_peptide             55..399
                        /gene="TCR1A"
                        /product="T-cell receptor alpha chain"
V_region                55..327
                        /gene="TCR1A"
J_segment               328..393
                        /gene="TCR1A"
C_region                394..399
                        /gene="TCR1A" 




5.8 Transfer RNA

source          1..2345
                /organism="Yersinia sp."
                /strain="IP134"
                /mol_type="genomic DNA"
-35_signal      644..650
                /gene="tRNA-Leu(UUR)"
tRNA            655..730
                /gene="tRNA-Leu(UUR)"
                /anticodon=(pos:678..680,aa:Leu)
                /product="transfer RNA-Leu(UUR)"
 
6 Limitations of this feature table design

During the development of the feature table design, numerous choices had to be 
made between simplicity and representational power. A design capable of 
representing the most common features of biological significance unavoidably 
requires a certain degree of syntactic complexity. To limit that complexity, 
certain limitations of the design syntax have been accepted. 
 
7 Appendices

7.1 Appendix I EMBL, GenBank and DDBJ entries 

7.1.1 EMBL Format

ID   X64011; SV 1; linear; genomic DNA; STD; PRO; 756 BP.
XX   
AC   X64011; S78972;
XX
SV   X64011.1
XX
DT   28-APR-1992 (Rel. 31, Created)
DT   30-JUN-1993 (Rel. 36, Last updated, Version 6)
XX
DE   Listeria ivanovii sod gene for superoxide dismutase
XX
KW   sod gene; superoxide dismutase.
XX
OS   Listeria ivanovii
OC   Bacteria; Firmicutes; Bacillus/Clostridium group;
OC   Bacillus/Staphylococcus group; Listeria.
XX
RN   [1]
RX   MEDLINE; 92140371.
RA   Haas A., Goebel W.;
RT   "Cloning of a superoxide dismutase gene from Listeria ivanovii by
RT   functional complementation in Escherichia coli and characterization of the
RT   gene product.";
RL   Mol. Gen. Genet. 231:313-322(1992).
XX
RN   [2]
RP   1-756
RA   Kreft J.;
RT   ;
RL   Submitted (21-APR-1992) to the EMBL/GenBank/DDBJ databases.
RL   J. Kreft, Institut f. Mikrobiologie, Universitaet Wuerzburg, Biozentrum Am
RL   Hubland, 8700 Wuerzburg, FRG
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..756
FT                   /db_xref="taxon:1638"
FT                   /organism="Listeria ivanovii"
FT                   /strain="ATCC 19119"
FT                   /mol_type="genomic DNA"
FT   RBS             95..100
FT                   /gene="sod"
FT   terminator      723..746
FT                   /gene="sod"
FT   CDS             109..717
FT                   /db_xref="SWISS-PROT:P28763"
FT                   /transl_table=11
FT                   /gene="sod"
FT                   /EC_number="1.15.1.1"
FT                   /db_xref="GOA:P28763"
FT                   /db_xref="HSSP:P00448"
FT                   /db_xref="InterPro:IPR001189"
FT                   /db_xref="UniProtKB/Swiss-Prot:P28763"
FT                   /product="superoxide dismutase"
FT                   /protein_id="CAA45406.1"
FT                   /translation="MTYELPKLPYTYDALEPNFDKETMEIHYTKHHNIYVTKLNEAVSG
FT                   HAELASKPGEELVANLDSVPEEIRGAVRNHGGGHANHTLFWSSLSPNGGGAPTGNLKAA
FT                   IESEFGTFDEFKEKFNAAAAARFGSGWAWLVVNNGKLEIVSTANQDSPLSEGKTPVLGL
FT                   DVWEHAYYLKFQNRRPEYIDTFWNVINWDERNKRFDAAK"
XX
SQ   Sequence 756 BP; 247 A; 136 C; 151 G; 222 T; 0 other;
     cgttatttaa ggtgttacat agttctatgg aaatagggtc tatacctttc gccttacaat   60
     gtaatttctt ..........                                               120
// 
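
The FT lines of an entry like the one above can be read mechanically. The
following Python sketch is an illustration only, not part of the format
definition. It assumes the layout seen in the sample entry: the feature key
in columns 6-20 and the location or qualifier text from column 22 onward.
The function names (parse_ft_lines, _split_qualifier) are hypothetical, and
joining continuation lines with a space (or with nothing for /translation)
is a simplifying assumption.

```python
def _split_qualifier(raw):
    """Split '/name=value' into (name, value); flag qualifiers get value None."""
    name, eq, value = raw[1:].partition("=")
    return (name, value if eq else None)


def parse_ft_lines(lines):
    """Group EMBL 'FT' lines into features with a location and qualifiers."""
    features = []
    for line in lines:
        if not line.startswith("FT"):
            continue  # skip FH, XX and all other line types
        key = line[5:20].strip()       # columns 6-20 (assumed layout)
        content = line[21:].strip()    # column 22 onward (assumed layout)
        if key:
            features.append({"key": key, "location": content, "qualifiers": []})
            continue
        if not content:
            continue
        if not features:
            raise ValueError("continuation line before any feature key")
        quals = features[-1]["qualifiers"]
        # An odd number of double quotes means a quoted value is still open.
        in_open_string = bool(quals) and quals[-1].count('"') % 2 == 1
        if content.startswith("/") and not in_open_string:
            quals.append(content)
        elif quals:
            # Assumption: sequence data is rejoined without spaces, text with one.
            sep = "" if quals[-1].startswith("/translation=") else " "
            quals[-1] += sep + content
        else:
            features[-1]["location"] += content  # location wrapped over lines
    for feature in features:
        feature["qualifiers"] = [_split_qualifier(q) for q in feature["qualifiers"]]
    return features
```

Quoted values keep their surrounding double quotes, so a caller can still
distinguish "text" values from unquoted ones such as /transl_table=11.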
 
7.1.2 GenBank Format

LOCUS       LISOD                    756 bp    DNA     linear   BCT 30-JUN-1993
DEFINITION  Listeria ivanovii sod gene for superoxide dismutase.
ACCESSION   X64011 S78972
VERSION     X64011.1  GI:44010
KEYWORDS    sod gene; superoxide dismutase.
SOURCE      Listeria ivanovii
  ORGANISM  Listeria ivanovii
            Bacteria; Firmicutes; Bacillales; Listeriaceae; Listeria. 
REFERENCE   1  (bases 1 to 756)
  AUTHORS   Haas,A. and Goebel,W.
  TITLE     Cloning of a superoxide dismutase gene from Listeria ivanovii by
            functional complementation in Escherichia coli and characterization
            of the gene product
  JOURNAL   Mol. Gen. Genet. 231 (2), 313-322 (1992)
  MEDLINE   92140371
REFERENCE   2  (bases 1 to 756)
  AUTHORS   Kreft,J.
  TITLE     Direct Submission
  JOURNAL   Submitted (21-APR-1992) J. Kreft, Institut f. Mikrobiologie,
            Universitaet Wuerzburg, Biozentrum Am Hubland, 8700 Wuerzburg, FRG
FEATURES             Location/Qualifiers
     source          1..756
                     /organism="Listeria ivanovii"
                     /strain="ATCC 19119"
                     /db_xref="taxon:1638"
                     /mol_type="genomic DNA"
     RBS             95..100
                     /gene="sod"
     gene            95..746
                     /gene="sod"
     CDS             109..717
                     /gene="sod"
                     /EC_number="1.15.1.1"
                     /codon_start=1
                     /transl_table=11
                     /product="superoxide dismutase" 
                     /db_xref="GI:44011"
                     /db_xref="GOA:P28763"
                     /db_xref="InterPro:IPR001189"
                     /db_xref="UniProtKB/Swiss-Prot:P28763"
                     /protein_id="CAA45406.1"
                     /translation="MTYELPKLPYTYDALEPNFDKETMEIHYTKHHNIYVTKLNEAVS
                     GHAELASKPGEELVANLDSVPEEIRGAVRNHGGGHANHTLFWSSLSPNGGGAPTGNLK
                     AAIESEFGTFDEFKEKFNAAAAARFGSGWAWLVVNNGKLEIVSTANQDSPLSEGKTPV
                     LGLDVWEHAYYLKFQNRRPEYIDTFWNVINWDERNKRFDAAK"
     terminator      723..746
                     /gene="sod"
ORIGIN      
        1 cgttatttaa ggtgttacat agttctatgg aaatagggtc tatacctttc gccttacaat
       61 gtaatttctt ..........
// 


7.1.3 DDBJ Format

LOCUS       LISOD                    756 bp    DNA     linear   BCT 30-JUN-1993
DEFINITION  Listeria ivanovii sod gene for superoxide dismutase.
ACCESSION   X64011 S78972
VERSION     X64011.1  GI:44010
KEYWORDS    sod gene; superoxide dismutase.
SOURCE      Listeria ivanovii
  ORGANISM  Listeria ivanovii
            Bacteria; Firmicutes; Bacillales; Listeriaceae; Listeria. 
REFERENCE   1  (bases 1 to 756)
  AUTHORS   Haas,A. and Goebel,W.
  TITLE     Cloning of a superoxide dismutase gene from Listeria ivanovii by
            functional complementation in Escherichia coli and characterization
            of the gene product
  JOURNAL   Mol. Gen. Genet. 231 (2), 313-322 (1992)
  MEDLINE   92140371
REFERENCE   2  (bases 1 to 756)
  AUTHORS   Kreft,J.
  TITLE     Direct Submission
  JOURNAL   Submitted (21-APR-1992) J. Kreft, Institut f. Mikrobiologie,
            Universitaet Wuerzburg, Biozentrum Am Hubland, 8700 Wuerzburg, FRG
FEATURES             Location/Qualifiers
     source          1..756
                     /organism="Listeria ivanovii"
                     /strain="ATCC 19119"
                     /db_xref="taxon:1638"
                     /mol_type="genomic DNA"
     RBS             95..100
                     /gene="sod"
     gene            95..746
                     /gene="sod"
     CDS             109..717
                     /gene="sod"
                     /EC_number="1.15.1.1"
                     /codon_start=1
                     /transl_table=11
                     /product="superoxide dismutase" 
                     /db_xref="GOA:P28763"
                     /db_xref="HSSP:P00448"
                     /db_xref="InterPro:IPR001189"
                     /db_xref="UniProtKB/Swiss-Prot:P28763"
                     /protein_id="CAA45406.1"
                     /db_xref="SWISS-PROT:P28763"
                     /translation="MTYELPKLPYTYDALEPNFDKETMEIHYTKHHNIYVTKLNEAVS
                     GHAELASKPGEELVANLDSVPEEIRGAVRNHGGGHANHTLFWSSLSPNGGGAPTGNLK
                     AAIESEFGTFDEFKEKFNAAAAARFGSGWAWLVVNNGKLEIVSTANQDSPLSEGKTPV
                     LGLDVWEHAYYLKFQNRRPEYIDTFWNVINWDERNKRFDAAK"
     terminator      723..746
                     /gene="sod"
BASE COUNT          247 a          136 c          151 g          222 t
ORIGIN      
        1 cgttatttaa ggtgttacat agttctatgg aaatagggtc tatacctttc gccttacaat
       61 gtaatttctt ..........
// 




7.2 Appendix II Feature table: Backus-Naur form

This information will not be presented in future editions of this document. 

The feature table is a mandatory part of an entry. The full entry syntax is
specified elsewhere.

feature_table ::= <feature_table_header><feature_table_body>

feature_table_header ::= FH Key Location/Qualifiers |
                         FEATURES Location/Qualifiers

feature_table_body ::= <feature> | <feature_table_body><feature>
At least one feature is required.

feature ::= <feature_key><feature_details>
Key is required, location required, qualifier list optional

feature_key ::= <symbol> | -
feature_details ::= <location><qualifier_list> | <location>
There exists a table of legal keys.

location ::= <absolute_location> | <feature_name> |
             <functional_operator>(<location_list>)

absolute_location ::= <local_location> | <path> : <local_location>

path ::= <database> :: <primary_accession> | <primary_accession>

feature_name ::= <path>:<feature_label> | <feature_label>

feature_label ::= <symbol>

local_location ::= <base_position> | <between_position> | <base_range> 

location_list ::= <location> | <location_list>,<location>

functional_operator ::= <symbol>

base_position ::= <integer> | <low_base_bound> | <high_base_bound> |
                  <two_base_bound>

low_base_bound ::= > <integer>

high_base_bound ::= < <integer>

two_base_bound ::= <base_position>.<base_position>

between_position ::= <base_position>^<base_position>

base_range ::= <base_position>..<base_position>

database  ::= <symbol>

primary_accession ::= <symbol>

sequence_character ::= a | b | c | d | g | h | k | m | n | r | s | t | u | v | w | y

qualifier_list ::= <qualifier> | <qualifier_list><qualifier>

qualifier ::= /<qualifier_name> | /<qualifier_name>=<value>

qualifier_name ::= <symbol>

value ::= <simple_value> | (<value_list>) | (<tagged_value_list>)

simple_value ::= <integer> | <location> | <reference_number> | "<text_string>" |
                 <symbol>

value_list ::= <value> | <value_list>,<value>

tagged_value_list ::= <tagged_value> | <tagged_value_list>,<tagged_value>

tagged_value ::= <tag>:<value>

tag ::= <symbol>

reference_number ::= [ <unsigned_integer> ]

symbol  ::= <letter> | <symbol><symbol_character> | <symbol_character><symbol>

text_string ::= <string_character>| <text_string><string_character>

unsigned_integer ::= <digit> |  <unsigned_integer><digit>

integer ::= <unsigned_integer> | - <unsigned_integer>

string_character ::= <letter> | <digit> | <punctuation> | ""

symbol_character ::= <up_case_letter> | <low_case_letter> | <digit> | _ | - | ' | *

letter ::= <up_case_letter> | <low_case_letter> 

up_case_letter ::= A | B | ... | Z

low_case_letter ::= a | b | ... | z

digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

punctuation ::= <space> | ! | # | $ | % | & | ' | ( | ) | * | + | , |
 - | . | / | : | ; | < | = | > | ? | @ | [ | \ | ] | ^ | _ | ` | { |
 <bar> | } | ~

bar ::= |

space ::= ascii 32
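
As an illustration of how the location rules above compose, the following
Python sketch parses a subset of the grammar: single bases, base ranges
(..), between positions (^), the < and > bound markers, and nested
functional operators such as join and complement. It is a minimal sketch,
not a reference implementation. Paths, feature names and the two_base_bound
form (a single dot) are deliberately left out and are rejected with a
ValueError. The function names and the tuple layout of the result are
assumptions made for this example.

```python
import re

# Tokens for the subset handled here: operator names, integers, "..",
# and the single characters < > ^ ( ) ,
_TOKEN = re.compile(r"\s*(\.\.|[A-Za-z_]+|\d+|[<>^(),])")


def tokenize(text):
    """Split a location string into tokens, rejecting anything unexpected."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}: {text[pos:]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def _position(tokens, i):
    """base_position: an integer with an optional '<' or '>' marker."""
    marker = ""
    if i < len(tokens) and tokens[i] in ("<", ">"):
        marker = tokens[i]
        i += 1
    if i >= len(tokens) or not tokens[i].isdigit():
        raise ValueError("expected a base position")
    return (marker, int(tokens[i])), i + 1


def _location(tokens, i):
    """location: operator(location_list), a range, a between, or one base."""
    if i >= len(tokens):
        raise ValueError("unexpected end of location")
    tok = tokens[i]
    if tok[0].isalpha():
        if i + 1 >= len(tokens) or tokens[i + 1] != "(":
            raise ValueError(f"expected '(' after operator {tok!r}")
        args, i = [], i + 2
        while True:
            node, i = _location(tokens, i)
            args.append(node)
            if i < len(tokens) and tokens[i] == ",":
                i += 1
                continue
            break
        if i >= len(tokens) or tokens[i] != ")":
            raise ValueError("expected ')'")
        return (tok, args), i + 1
    start, i = _position(tokens, i)
    if i < len(tokens) and tokens[i] in ("..", "^"):
        kind = "range" if tokens[i] == ".." else "between"
        end, i = _position(tokens, i + 1)
        return (kind, start, end), i
    return ("base", start), i


def parse_location(text):
    """Parse a location string into nested tuples; raise ValueError on error."""
    tokens = tokenize(text)
    node, i = _location(tokens, 0)
    if i != len(tokens):
        raise ValueError("unexpected trailing input")
    return node
```

For example, parse_location("join(2004..2195,3..20)") yields a ("join",
[...]) tuple holding two "range" nodes, one per segment.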



7.3 Appendix III: Feature keys reference

The feature key entries below are organized according to the following format: 
Feature Key             the feature key name
Definition              the definition of the key
Mandatory qualifiers    qualifiers required with the key; if there are no
                        mandatory qualifiers, this field is omitted.
Optional qualifiers     optional qualifiers associated with the key
Organism scope          valid organisms for the key; if the scope is any
                        organism, this field is omitted.
Molecule scope          valid molecule types; if the scope is any molecule
                        type, this field is omitted.
References              citations of published reports, usually supporting the
                        feature consensus sequence
Comment                 comments and clarifications
Abbreviations: 
accnum                  an entry primary accession number
<amino_acid>            abbreviation for amino acid
<base_range>            location descriptor for a simple range of bases
<bool>                  Boolean truth value.  Valid values are yes and no
feature_label           the feature label (follows naming conventions for all
                        feature table components)
<integer>               unsigned integer value
<location>              general feature location descriptor
<modified_base>         abbreviation for modified nucleotide base
[number]                integer representing number of citation in entry's
                        reference list
<repeat_type>           value indicating the organization of a repeated
                        sequence.  
"text"                  any text or character string. Since the string is
                        delimited by double quotes, double quotes may only
                        appear as part of the string if they appear in pairs.
                        For example, the sentence:

                        The feature label "ops-tata" is used with the
                        "promoter" feature key

                        would be formatted thus:

                        "The feature label ""ops-tata"" is used with the
                        ""promoter"" feature key"

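The quote-doubling rule for "text" values can be sketched in Python as
follows. This is an illustrative sketch only; the helper names quote_value
and unquote_value are hypothetical and not part of the feature table
definition.

```python
def quote_value(text):
    """Wrap text in double quotes, doubling every embedded double quote."""
    return '"' + text.replace('"', '""') + '"'


def unquote_value(value):
    """Reverse quote_value; reject values with unpaired embedded quotes."""
    if len(value) < 2 or value[0] != '"' or value[-1] != '"':
        raise ValueError("value must be delimited by double quotes")
    inner = value[1:-1]
    # After removing all pairs, any quote left over was unpaired.
    if '"' in inner.replace('""', ''):
        raise ValueError("embedded double quotes must appear in pairs")
    return inner.replace('""', '"')
```

Applying quote_value to a string and then unquote_value to the result
returns the original string unchanged.
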

           



Feature Key           attenuator


Definition            1) region of DNA at which regulation of termination of
                         transcription occurs, which controls the expression
                         of some bacterial operons;
                      2) sequence segment located between the promoter and the
                         first structural gene that causes partial termination
                         of transcription

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /phenotype="text"

Organism scope        prokaryotes

Molecule scope        DNA


Feature Key           C_region


Definition            constant region of immunoglobulin light and heavy 
                      chains, and T-cell receptor alpha, beta, and gamma 
                      chains; includes one or more exons depending on the 
                      particular chain

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"


Parent Key            CDS

Organism scope        eukaryotes


Feature Key           CAAT_signal


Definition            CAAT box; part of a conserved sequence located about 75
                      bp up-stream of the start point of eukaryotic
                      transcription units which may be involved in RNA
                      polymerase binding; consensus=GG(C or T)CAATCT [1,2].

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)

Organism scope        eukaryotes and eukaryotic viruses

Molecule scope        DNA

References            [1]  Efstratiadis, A.  et al.  Cell 21, 653-668 (1980)
                      [2]  Nevins, J.R.  "The pathway of eukaryotic mRNA formation"  
                           Ann Rev Biochem 52, 441-466 (1983)


Feature Key           CDS

Definition            coding sequence; sequence of nucleotides that
                      corresponds with the sequence of amino acids in a
                      protein (location includes stop codon); 
                      feature includes amino acid conceptual translation.

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /codon=(seq:"codon-sequence",aa:<amino_acid>)
                      /codon_start=<1 or 2 or 3>
                      /db_xref="<database>:<identifier>"
                      /EC_number="text"
                      /exception="text"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /number=unquoted text (single token)
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /product="text"
                      /protein_id="<identifier>"
                      /pseudo
                      /ribosomal_slippage
                      /standard_name="text"
                      /translation="text"
                      /transl_except=(pos:<base_range>,aa:<amino_acid>)
                      /transl_table=<integer>
                      /trans_splicing

Comment               /codon_start has valid value of 1 or 2 or 3, indicating
                      the offset at which the first complete codon of a coding
                      feature can be found, relative to the first base of
                      that feature;
                      /transl_table defines the genetic code table used if
                      other than the universal genetic code table;
                      genetic code exceptions outside the range of the specified
                      tables are reported in /codon or /transl_except qualifiers;
                      /protein_id consists of a stable ID portion (3+5 format:
                      3 letters followed by 5 digits) plus a version 
                      number after the decimal point; when the protein 
                      sequence encoded by the CDS changes, only the version 
                      number of the /protein_id value is incremented; the
                      stable part of the /protein_id remains unchanged and as 
                      a result will permanently be associated with a given 
                      protein.
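
The effect of /codon_start can be illustrated with a short Python sketch.
It assumes the CDS nucleotide sequence has already been extracted from the
entry as a plain string; the function name codons is hypothetical. Trailing
bases that do not form a complete codon are dropped, which is a choice made
for this example rather than a rule of the format.

```python
def codons(cds_sequence, codon_start=1):
    """Split a CDS sequence into complete codons, honouring /codon_start."""
    if codon_start not in (1, 2, 3):
        raise ValueError("/codon_start must be 1, 2 or 3")
    offset = codon_start - 1                 # bases to skip at the 5' end
    usable = len(cds_sequence) - offset
    end = offset + usable - usable % 3       # stop before an incomplete codon
    return [cds_sequence[i:i + 3] for i in range(offset, end, 3)]
```

With /codon_start=2, for instance, the first base of the feature is skipped
and the reading frame begins at the second base.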



Feature Key           conflict


Definition            independent determinations of the "same" sequence differ
                      at this site or region;

Mandatory qualifiers  /citation=[number]
                      or
                      /compare=[accession-number.sequence-version]
                      

Optional qualifiers   /allele="text"
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /replace="text"

Comment               use /replace="" to annotate deletion, e.g. 
                      conflict    4..5
                                  /replace=""  
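
The deletion convention can be made concrete with a small Python sketch
that applies a /replace value to a sequence. It assumes a simple location
given as 1-based inclusive start and end positions; the function name
apply_replace is hypothetical.

```python
def apply_replace(sequence, start, end, replacement):
    """Return sequence with bases start..end (1-based, inclusive) replaced."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("location lies outside the sequence")
    # An empty replacement removes the span, i.e. annotates a deletion.
    return sequence[:start - 1] + replacement + sequence[end:]
```

Calling apply_replace("acgtacgt", 4, 5, "") removes bases 4 and 5, which is
the deletion described by the conflict example above.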



Feature Key           D-loop


Definition            displacement loop; a region within mitochondrial DNA in
                      which a short stretch of RNA is paired with one strand
                      of DNA, displacing the original partner DNA strand in
                      this region; also used to describe the displacement of a
                      region of one strand of duplex DNA by a single stranded
                      invader in the reaction catalyzed by RecA protein

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)

Molecule scope        DNA


Feature Key           D_segment


Definition            Diversity segment of immunoglobulin heavy chain, and 
                      T-cell receptor beta chain;  

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      
Parent Key            CDS

Organism scope        eukaryotes



Feature Key           enhancer


Definition            a cis-acting sequence that increases the utilization of
                      (some)  eukaryotic promoters, and can function in either
                      orientation and in any location (upstream or downstream)
                      relative to the promoter;

Optional qualifiers   /allele="text"
                      /bound_moiety="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /label=feature_label
                      /gene="text"  
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /standard_name="text"

Organism scope        eukaryotes and eukaryotic viruses


Feature Key           exon


Definition            region of genome that codes for portion of spliced mRNA, 
                      rRNA and tRNA; may contain 5' UTR, all CDSs and 3' UTR; 


Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /EC_number="text"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label   
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /number=unquoted text (single token)
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"



Feature Key           gap

Definition            gap in the sequence

Mandatory qualifiers  /estimated_length=unknown or <integer>

Optional qualifiers   /experiment="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /map="text"
                      /note="text"

Comment               the location span of the gap feature for an unknown 
                      gap is 100 bp, with the 100 bp indicated as 100 "n"'s in 
                      the sequence.  Where estimated length is indicated by 
                      an integer, this is indicated by the same number of 
                      "n"'s in the sequence. 
                      No upper or lower limit is set on the size of the gap.
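
The relationship between /estimated_length and the run of "n" bases can be
checked with a sketch like the following. It is an illustration under
stated assumptions: the gap location is a simple 1-based inclusive range,
and the function name gap_is_consistent is hypothetical.

```python
UNKNOWN_GAP_SPAN = 100  # span used for /estimated_length=unknown


def gap_is_consistent(sequence, start, end, estimated_length):
    """Check that bases start..end are all 'n' and match /estimated_length."""
    span = sequence[start - 1:end]
    if estimated_length == "unknown":
        expected = UNKNOWN_GAP_SPAN
    else:
        expected = int(estimated_length)
    return len(span) == expected and set(span.lower()) <= {"n"}
```

A gap annotated with /estimated_length=unknown therefore passes only if its
location covers exactly 100 "n" bases.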



Feature Key           GC_signal


Definition            GC box; a conserved GC-rich region located upstream of
                      the start point of eukaryotic transcription units which
                      may occur in multiple copies or in either orientation;
                      consensus=GGGCGG;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label   
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)

Organism scope        eukaryotes and eukaryotic viruses


Feature Key           gene


Definition            region of biological interest identified as a gene 
                      and for which a name has been assigned;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label   
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /product="text"
                      /pseudo
                      /phenotype="text"
                      /standard_name="text"
                      /trans_splicing

        
Comment               the gene feature describes the interval of DNA that 
                      corresponds to a genetic trait or phenotype; the feature is,
                      by definition, not strictly bound to its positions at the 
                      ends;  it is meant to represent a region where the gene is 
                      located.
 




Feature Key           iDNA


Definition            intervening DNA; DNA which is eliminated through any of
                      several kinds of recombination;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /number=unquoted text (single token)
                      /old_locus_tag="text" (single token)
                      /standard_name="text"

Molecule scope        DNA

Comment               e.g., in the somatic processing of immunoglobulin genes.

Feature Key           intron


Definition            a segment of DNA that is transcribed, but removed from
                      within the transcript by splicing together the sequences
                      (exons) on either side of it;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /number=unquoted text (single token)
                      /old_locus_tag="text" (single token)
                      /pseudo
                      /standard_name="text"


Feature Key           J_segment
 

Definition            joining segment of immunoglobulin light and heavy 
                      chains, and T-cell receptor alpha, beta, and gamma 
                      chains;  

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"

Parent Key            CDS

Organism scope        eukaryotes


Feature Key           LTR


Definition            long terminal repeat, a sequence directly repeated at
                      both ends of a defined sequence, of the sort typically
                      found in retroviruses;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /standard_name="text"

Feature Key           mat_peptide


Definition            mature peptide or protein coding sequence; coding
                      sequence for the mature or final peptide or protein
                      product following post-translational modification; the
                      location does not include the stop codon (unlike the
                      corresponding CDS);

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /EC_number="text"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"





Feature Key           misc_binding


Definition            site in nucleic acid which covalently or non-covalently
                      binds another moiety that cannot be described by any
                      other binding key (primer_bind or protein_bind);

Mandatory qualifiers  /bound_moiety="text"

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)

Comment               note that the key RBS is used for ribosome binding sites



Feature Key           misc_difference


Definition            feature sequence is different from that presented 
                      in the entry and cannot be described by any other 
                      Difference key (conflict, unsure, old_sequence, 
                      variation, or modified_base);

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /clone="text"
                      /compare=[accession-number.sequence-version]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text" 
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /phenotype="text"
                      /replace="text" 
                      /standard_name="text"

Comment               the misc_difference feature key should be used to 
                      describe variability that arises as a result of 
                      genetic manipulation (e.g. site directed mutagenesis);
                      use /replace="" to annotate deletion, e.g. 
                      misc_difference 412..433
                                      /replace=""  
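
The effect of /replace on the presented sequence can be illustrated with a
short sketch. The Python below is not part of this definition: the function
name apply_replace and the 12-base sequence are hypothetical, and the
location is assumed to be a simple 1-based, inclusive span such as 412..433.

```python
def apply_replace(sequence: str, start: int, end: int, replacement: str) -> str:
    """Return sequence with bases start..end (1-based, inclusive) replaced."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("location out of range")
    # Python slices are 0-based and end-exclusive, hence start - 1 and end.
    return sequence[:start - 1] + replacement + sequence[end:]


# Hypothetical sequence; an empty replacement mirrors /replace="" (deletion).
original = "AACCGGTTAACC"
print(apply_replace(original, 5, 8, ""))    # AACCAACC
print(apply_replace(original, 5, 8, "tt"))  # AACCttAACC
```

An empty replacement string removes the span entirely, which is how a
deletion is expressed with /replace="".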




Feature Key           misc_feature


Definition            region of biological interest which cannot be described
                      by any other feature key; a new or rare feature;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /number=unquoted text (single token)
                      /old_locus_tag="text" (single token)
                      /phenotype="text"
                      /product="text"
                      /pseudo
                      /standard_name="text"

Comment               this key should not be used when the need is merely to 
                      mark a region in order to comment on it or to use it in 
                      another feature's location


Feature Key           misc_recomb

Definition            site of any generalized, site-specific or replicative
                      recombination event where there is a breakage and
                      reunion of duplex DNA that cannot be described by other
                      recombination keys or qualifiers of source key 
                      (/proviral);

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /standard_name="text"

Molecule scope        DNA
 



Feature Key           misc_RNA


Definition            any transcript or RNA product that cannot be defined by
                      other RNA keys (prim_transcript, precursor_RNA, mRNA,
                      5'UTR, 3'UTR, exon, CDS, sig_peptide, transit_peptide,
                      mat_peptide, intron, polyA_site, ncRNA, rRNA and tRNA);

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /trans_splicing



Feature Key           misc_signal


Definition            any region containing a signal controlling or altering
                      gene function or expression that cannot be described by
                      other signal keys (promoter, CAAT_signal, TATA_signal,
                      -35_signal, -10_signal, GC_signal, RBS, polyA_signal,
                      enhancer, attenuator, terminator, and rep_origin).

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /phenotype="text"
                      /standard_name="text"


Feature Key           misc_structure


Definition            any secondary or tertiary nucleotide structure or 
                      conformation that cannot be described by other Structure
                      keys (stem_loop and D-loop);

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /standard_name="text"


Feature Key           modified_base


Definition            the indicated nucleotide is a modified nucleotide and
                      should be substituted for by the indicated molecule
                      (given in the mod_base qualifier value)
 
Mandatory qualifiers  /mod_base=<modified_base>

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /frequency="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)

Comment               the /mod_base value is limited to the restricted
                      vocabulary for modified base abbreviations (section
                      7.5.2);


Feature Key           mRNA


Definition            messenger RNA; includes 5'untranslated region (5'UTR),
                      coding sequences (CDS, exon) and 3'untranslated region
                      (3'UTR);

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /trans_splicing


Feature Key           ncRNA

Definition            a non-protein-coding gene, other than ribosomal RNA and
                      transfer RNA, the functional molecule of which is the RNA
                      transcript;

Mandatory qualifiers  /ncRNA_class="TYPE"
                      
Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /trans_splicing

Example               /ncRNA_class="miRNA"
                      /ncRNA_class="siRNA"
                      /ncRNA_class="scRNA"       

Comment               the ncRNA feature is not used for ribosomal and transfer
                      RNA annotation, for which the rRNA and tRNA feature keys
                      should be used, respectively;


Feature Key           N_region


Definition            extra nucleotides inserted between rearranged 
                      immunoglobulin segments.

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"

Parent Key            CDS

Organism scope        eukaryotes


Feature Key           old_sequence


Definition            the presented sequence revises a previous version of the
                      sequence at this location;

Mandatory qualifiers  /citation=[number]
                      or
                      /compare=[accession-number.sequence-version]

Optional qualifiers   /allele="text"
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /replace="text"

Comment               /replace="" is used to annotate deletion, e.g. 
                      old_sequence 12..15
                                   /replace="" 
                      NOTE: This feature key is not valid in entries/records
                      created from 15-Oct-2007.


Feature Key           operon

Definition            region containing a polycistronic transcript, including
                      genes that encode enzymes in the same metabolic pathway,
                      and the associated regulatory sequences;

Mandatory qualifiers  /operon="text"
 
Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /map="text"
                      /note="text"
                      /phenotype="text"
                      /pseudo
                      /standard_name="text"
        



Feature Key           oriT


Definition            origin of transfer; region of a DNA molecule where
                      transfer is initiated during the process of conjugation
                      or mobilization;

Optional qualifiers   /allele="text"
                      /bound_moiety="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>" 
                      /direction=value
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /rpt_family="text"
                      /rpt_type=<repeat_type>
                      /rpt_unit_range=<base_range>
                      /rpt_unit_seq="text"
                      /standard_name="text"


Molecule scope        DNA

Comment               rep_origin should be used for origins of replication; 
                      /direction has legal values RIGHT, LEFT and BOTH; however,
                      only RIGHT and LEFT are valid when used in conjunction
                      with the oriT feature;
                      origins of transfer can be present in the chromosome; 
                      plasmids can contain multiple origins of transfer
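
As a minimal sketch of the /direction rule in the comment above, the Python
below checks a value against the set allowed for each key. The names
VALID_DIRECTIONS and is_valid_direction are hypothetical, and exact
upper-case matching is an assumption of this sketch, not a rule stated here.

```python
# Allowed /direction values per feature key, as described in the comment:
# rep_origin accepts all three values, oriT only RIGHT and LEFT.
VALID_DIRECTIONS = {
    "rep_origin": {"RIGHT", "LEFT", "BOTH"},
    "oriT": {"RIGHT", "LEFT"},
}


def is_valid_direction(key: str, value: str) -> bool:
    """Return True if value is an allowed /direction for the feature key."""
    # Keys that do not list /direction get an empty set, so nothing is valid.
    return value in VALID_DIRECTIONS.get(key, set())


print(is_valid_direction("oriT", "BOTH"))        # False
print(is_valid_direction("rep_origin", "BOTH"))  # True
```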


 
Feature Key           polyA_signal


Definition            recognition region necessary for endonuclease cleavage
                      of an RNA transcript that is followed by polyadenylation;
                      consensus=AATAAA [1];

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)

Organism scope        eukaryotes and eukaryotic viruses

References            [1] Proudfoot, N. and Brownlee, G.G. Nature 263, 211-214
                      (1976)
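
The consensus given in the definition can be used to locate candidate
polyA_signal spans. The Python below is a sketch only: find_polya_signals is
a hypothetical helper, it reports exact matches to AATAAA and nothing else,
and treating U as T so that RNA-style input also matches is an assumption.
Real signals may deviate from the consensus, so hits are candidates rather
than annotations.

```python
def find_polya_signals(seq: str, motif: str = "AATAAA"):
    """Return 1-based inclusive (start, end) spans of exact motif matches."""
    seq = seq.upper().replace("U", "T")  # accept lower case and RNA input
    hits = []
    pos = seq.find(motif)
    while pos != -1:
        hits.append((pos + 1, pos + len(motif)))
        pos = seq.find(motif, pos + 1)   # step by one to keep overlapping hits
    return hits


for start, end in find_polya_signals("GGAATAAACC"):
    print(f"polyA_signal    {start}..{end}")
```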


Feature Key           polyA_site


Definition            site on an RNA transcript to which adenine residues will
                      be added by post-transcriptional polyadenylation;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>" 
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)

Organism scope        eukaryotes and eukaryotic viruses


Feature Key           precursor_RNA


Definition            any RNA species that is not yet the mature RNA product;
                      may include 5' untranslated region (5'UTR), coding
                      sequences (CDS, exon), intervening sequences (intron)
                      and 3' untranslated region (3'UTR);

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"  
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /product="text"
                      /standard_name="text"
                      /trans_splicing

Comment               used for RNA which may be the result of 
                      post-transcriptional processing;  if the RNA in question 
                      is known not to have been processed, use the 
                      prim_transcript key.


Feature Key           prim_transcript


Definition            primary (initial, unprocessed) transcript;  includes 5'
                      untranslated region (5'UTR), coding sequences
                      (CDS, exon), intervening sequences (intron) and 3'
                      untranslated region (3'UTR);

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /standard_name="text"


Feature Key           primer_bind


Definition            non-covalent primer binding site for initiation of
                      replication, transcription, or reverse transcription;
                      includes site(s) for synthetic elements, e.g., PCR
                      primers;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /PCR_conditions="text"
                      /standard_name="text"

Comment               used to annotate the site on a given sequence to which a primer 
                      molecule binds - not intended to represent the sequence of the
                      primer molecule itself; PCR components and reaction times may 
                      be stored under the "/PCR_conditions" qualifier; 
                      since PCR reactions most often involve pairs of primers,
                      a single primer_bind key may use the order() operator
                      with two locations, or a pair of primer_bind keys may be
                      used.
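
The single-key form described above, one primer_bind with an order()
location, might be built as follows. This is a sketch under stated
assumptions: primer_pair_location is a hypothetical helper, both binding
sites are given as 1-based (start, end) spans on the presented strand, and
strand handling (for example complement()) is deliberately left out.

```python
def primer_pair_location(first, second):
    """Build an order() location string from two (start, end) spans."""
    spans = sorted([first, second])  # list the upstream site first
    for start, end in spans:
        if start < 1 or end < start:
            raise ValueError(f"invalid span {start}..{end}")
    return "order(" + ",".join(f"{s}..{e}" for s, e in spans) + ")"


# Hypothetical primer binding sites at 1..20 and 481..500.
print(primer_pair_location((481, 500), (1, 20)))  # order(1..20,481..500)
```

The alternative mentioned in the comment, a pair of primer_bind keys, would
simply use each span as the location of its own feature.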


Feature Key           promoter


Definition            region on a DNA molecule involved in RNA polymerase
                      binding to initiate transcription;

Optional qualifiers   /allele="text"
                      /bound_moiety="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /phenotype="text"
                      /pseudo
                      /standard_name="text"

Molecule scope        DNA


Feature Key           protein_bind


Definition            non-covalent protein binding site on nucleic acid;

Mandatory qualifiers  /bound_moiety="text"

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /standard_name="text"

Comment               note that RBS is used for ribosome binding sites.
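
Several keys listed above carry mandatory qualifiers (misc_binding,
modified_base, ncRNA, old_sequence, operon and protein_bind). The Python
below sketches one way such a rule could be checked. MANDATORY and
missing_mandatory are hypothetical names, the table covers only those six
keys, and reading the "or" between /citation and /compare on old_sequence as
"at least one of" is an assumption of this sketch.

```python
# Each key maps to a list of groups; at least one qualifier from every
# group must be present on the feature.
MANDATORY = {
    "misc_binding": [{"bound_moiety"}],
    "modified_base": [{"mod_base"}],
    "ncRNA": [{"ncRNA_class"}],
    "old_sequence": [{"citation", "compare"}],
    "operon": [{"operon"}],
    "protein_bind": [{"bound_moiety"}],
}


def missing_mandatory(key, qualifiers):
    """Return the mandatory groups not satisfied by the qualifier names."""
    present = set(qualifiers)
    return [sorted(group) for group in MANDATORY.get(key, [])
            if not (group & present)]


print(missing_mandatory("protein_bind", ["note"]))     # [['bound_moiety']]
print(missing_mandatory("old_sequence", ["compare"]))  # []
```

Keys without an entry in the table are treated as having no mandatory
qualifiers, so the function returns an empty list for them.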


Feature Key           RBS


Definition            ribosome binding site;


Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /standard_name="text"

References            [1] Shine, J. and Dalgarno, L.  Proc Natl Acad Sci USA
                          71, 1342-1346 (1974)
                      [2] Gold, L. et al.  Ann Rev Microb 35, 365-403 (1981)

Comment               in prokaryotes, known as the Shine-Dalgarno sequence; it
                      is located 5 to 9 bases upstream of the initiation codon;
                      consensus GGAGGT [1,2].
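
The spacing rule in the comment above can be turned into a simple candidate
search. The Python below is a sketch, not a prediction method:
find_rbs_candidates is a hypothetical helper, only exact matches to the
consensus are reported, and "5 to 9 bases upstream" is interpreted here as
5 to 9 bases lying between the last base of the motif and the first base of
the initiation codon, which is an assumption.

```python
CONSENSUS = "GGAGGT"


def find_rbs_candidates(seq, start_codon_pos, min_gap=5, max_gap=9):
    """Return 1-based (start, end) spans of exact consensus matches that end
    min_gap..max_gap bases before start_codon_pos (1-based)."""
    seq = seq.upper()
    hits = []
    for gap in range(min_gap, max_gap + 1):
        end = start_codon_pos - gap - 1    # 1-based last base of the motif
        start = end - len(CONSENSUS) + 1   # 1-based first base of the motif
        if start >= 1 and seq[start - 1:end] == CONSENSUS:
            hits.append((start, end))
    return sorted(hits)


# Hypothetical sequence: motif at 3..8, seven spacer bases, ATG at 16..18.
example = "TT" + "GGAGGT" + "AAAAAAA" + "ATG" + "GCT"
print(find_rbs_candidates(example, 16))  # [(3, 8)]
```

Naturally occurring sites often differ from the consensus, so a practical
search would need to tolerate mismatches; that is outside this sketch.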


Feature Key           repeat_region


Definition            region of genome containing repeating units;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>" 
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /mobile_element="<mobile_element_type>
                      [:<mobile_element_name>]"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /rpt_family="text"
                      /rpt_type=<repeat_type>
                      /rpt_unit_range=<base_range>
                      /rpt_unit_seq="text"
                      /satellite="<satellite_type>[:<class>][ <identifier>]"
                      /standard_name="text"

Comment               mobile_element qualifier replaced /transposon and 
                      /insertion_seq qualifiers in December 2006
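
The /mobile_element value format, "<mobile_element_type>[:<mobile_element_name>]",
can be split into its two parts as sketched below. parse_mobile_element is a
hypothetical helper; it checks only the shape of the value and does not
validate the type against any controlled vocabulary. The values used in the
example are illustrative.

```python
def parse_mobile_element(value: str):
    """Split '<type>[:<name>]' into (type, name); name is None if absent."""
    element_type, sep, name = value.partition(":")
    if not element_type or (sep and not name):
        raise ValueError(f"malformed /mobile_element value: {value!r}")
    return element_type, (name if sep else None)


print(parse_mobile_element("transposon:Tn5"))  # ('transposon', 'Tn5')
print(parse_mobile_element("transposon"))      # ('transposon', None)
```

Splitting on the first colon only means that a name which itself contains a
colon is kept intact.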



Feature Key           rep_origin


Definition            origin of replication; starting site for duplication of
                      nucleic acid to give two identical copies; 

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /direction=value
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /standard_name="text"

Comment               /direction has valid values: RIGHT, LEFT, or BOTH.


Feature Key           rRNA


Definition            mature ribosomal RNA; RNA component of the
                      ribonucleoprotein particle (ribosome) which assembles
                      amino acids into proteins.

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /product="text"
                      /pseudo
                      /standard_name="text"

Comment               rRNA sizes should be annotated with the /product
                      qualifier.


Feature Key           S_region


Definition            switch region of immunoglobulin heavy chains;  
                      involved in the rearrangement of heavy chain DNA leading 
                      to the expression of a different immunoglobulin class 
                      from the same B-cell;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /gene="text"
                      /gene_synonym="text"
                      /experiment="text"
                      /label=feature_label
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"

Parent Key            misc_signal

Organism scope        eukaryotes


Feature Key           sig_peptide


Definition            signal peptide coding sequence; coding sequence for an
                      N-terminal domain of a secreted protein; this domain is
                      involved in attaching nascent polypeptide to the
                      membrane; leader sequence;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"




Feature Key           source


Definition            identifies the biological source of the specified span of
                      the sequence; this key is mandatory; more than one source
                      key per sequence is allowed; every entry/record will have, as a
                      minimum, either a single source key spanning the entire
                      sequence or multiple source keys, which together, span the
                      entire sequence.

Mandatory qualifiers  /organism="text"
                      /mol_type="genomic DNA", "genomic RNA", "mRNA", "tRNA",
                                "rRNA", "other RNA", "other DNA", "transcribed
                                RNA", "viral cRNA", "unassigned DNA",
                                "unassigned RNA"


Optional qualifiers   /bio_material="[<institution-code>:[<collection-code>:]]<material_id>"
                      /cell_line="text"
                      /cell_type="text"
                      /chromosome="text"
                      /citation=[number]
                      /clone="text"
                      /clone_lib="text"
                      /collected_by="text" 
                      /collection_date="text"
                      /country="<country_value>[:<region>][, <locality>]"
                      /cultivar="text"
                      /culture_collection="<institution-code>:[<collection-code>:]<culture_id>"
                      /db_xref="<database>:<identifier>"
                      /dev_stage="text"
                      /ecotype="text"
                      /environmental_sample
                      /focus
                      /frequency="text"
                      /germline
                      /haplotype="text"
                      /host="text"
                      /identified_by="text"
                      /isolate="text"
                      /isolation_source="text"
                      /label=feature_label
                      /lab_host="text"
                      /lat_lon="text"
                      /macronuclear
                      /map="text"
                      /mating_type="text"
                      /note="text"
                      /organelle=<organelle_value>
                      /PCR_primers="[fwd_name: XXX, ]fwd_seq: xxxxx, 
                      [rev_name: YYY, ]rev_seq: yyyyy"
                      /plasmid="text"
                      /pop_variant="text"
                      /proviral
                      /rearranged
                      /segment="text"
                      /serotype="text"
                      /serovar="text"
                      /sex="text"
                      /specimen_voucher="[<institution-code>:[<collection-code>:]]<specimen_id>"
                      /strain="text"
                      /sub_clone="text"
                      /sub_species="text"
                      /sub_strain="text"
                      /tissue_lib="text"
                      /tissue_type="text"
                      /transgenic
                      /variety="text"

Molecule scope        any

Comment               transgenic sequences must have at least two source feature
                      keys; in a transgenic sequence the source feature key
                      describing the organism that is the recipient of the DNA
                      must span the entire sequence;
                      see Appendix IV /organelle for a list of <organelle_value>
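
The coverage rule stated in the definition above (a single source key spanning
the entire sequence, or several source keys that together span it) can be
checked mechanically. The following Python sketch is illustrative only; the
function name sources_cover_sequence and the representation of source locations
as 1-based inclusive (start, end) pairs are assumptions, not part of this
definition.

```python
def sources_cover_sequence(spans, seq_length):
    """Return True if the 1-based inclusive spans together cover 1..seq_length."""
    next_uncovered = 1  # first base not yet covered by any source span
    for start, end in sorted(spans):
        if start > next_uncovered:
            return False  # gap before this span
        next_uncovered = max(next_uncovered, end + 1)
    return next_uncovered > seq_length
```

Overlapping spans are accepted, so the sketch also passes a transgenic entry in
which the recipient organism spans the whole sequence and the inserted DNA
covers part of it.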






Feature Key           stem_loop


Definition            hairpin; a double-helical region formed by base-pairing
                      between adjacent (inverted) complementary sequences in a
                      single strand of RNA or DNA. 

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /standard_name="text"


Feature Key           STS

Definition            sequence tagged site; short, single-copy DNA sequence
                      that characterizes a mapping landmark on the genome and
                      can be detected by PCR; a region of the genome can be
                      mapped by determining the order of a series of STSs;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /standard_name="text"

Molecule scope        DNA

Parent Key            misc_binding

Comment               STS location to include primer(s) in primer_bind key or
                      primers.


Feature Key           TATA_signal


Definition            TATA box; Goldberg-Hogness box; a conserved AT-rich
                      septamer found about 25 bp before the start point of
                      each eukaryotic RNA polymerase II transcript unit which
                      may be involved in positioning the enzyme for correct
                      initiation; consensus=TATA(A or T)A(A or T) [1,2];

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token) 
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)

Organism scope        eukaryotes and eukaryotic viruses

Molecule scope        DNA

References            [1] Efstratiadis, A.  et al.  Cell 21, 653-668 (1980)
                      [2] Corden, J., et al.  "Promoter sequences of
                          eukaryotic protein-encoding genes"  Science 209,
                          1406-1414 (1980)


Feature Key           terminator


Definition            sequence of DNA located at the end of the transcript
                      that causes RNA polymerase to terminate
                      transcription;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /operon="text"
                      /old_locus_tag="text" (single token)
                      /standard_name="text"

Molecule scope        DNA


Feature Key           tmRNA

Definition            transfer messenger RNA; tmRNA acts as a tRNA first,
                      and then as an mRNA that encodes a peptide tag; the
                      ribosome translates this mRNA region of tmRNA and attaches
                      the encoded peptide tag to the C-terminus of the
                      unfinished protein; this attached tag targets the protein for
                      destruction or proteolysis;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /tag_peptide=<base_range>

Comment               the tmRNA feature key became valid on 15-Dec-2007


Feature Key           transit_peptide


Definition            transit peptide coding sequence; coding sequence for an
                      N-terminal domain of a nuclear-encoded organellar
                      protein; this domain is involved in post-translational
                      import of the protein into the organelle;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"



Feature Key           tRNA


Definition            mature transfer RNA, a small RNA molecule (75-85 bases
                      long) that mediates the translation of a nucleic acid
                      sequence into an amino acid sequence;

Optional qualifiers   /allele="text"
                      /anticodon=(pos:<base_range>,aa:<amino_acid>)
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /trans_splicing



Feature Key           unsure


Definition            author is unsure of exact sequence in this region;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /compare=[accession-number.sequence-version]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /replace="text"

Comment               use /replace="" to annotate deletion, e.g.
                      unsure      11..15
                                  /replace=""




Feature Key           V_region
 

Definition            variable region of immunoglobulin light and heavy
                      chains, and T-cell receptor alpha, beta, and gamma
                      chains;  codes for the variable amino terminal portion;
                      can be composed of V_segments, D_segments, N_regions,
                      and J_segments;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"

Parent Key            CDS

Organism scope        eukaryotes



Feature Key           V_segment


Definition            variable segment of immunoglobulin light and heavy
                      chains, and T-cell receptor alpha, beta, and gamma
                      chains; codes for most of the variable region (V_region)
                      and the last few amino acids of the leader peptide;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"

Parent Key            CDS

Organism scope        eukaryotes


Feature Key           variation

Definition            a related strain contains stable mutations from the same
                      gene (e.g., RFLPs, polymorphisms, etc.) which differ
                      from the presented sequence at this location (and
                      possibly others);

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /compare=[accession-number.sequence-version]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /frequency="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /phenotype="text"
                      /product="text"
                      /replace="text"
                      /standard_name="text"

Comment               used to describe alleles, RFLPs, and other naturally occurring
                      mutations and polymorphisms; variability arising as a result
                      of genetic manipulation (e.g. site directed mutagenesis) should
                      be described with the misc_difference feature;
                      use /replace="" to annotate deletion, e.g. 
                      variation   4..5
                                  /replace=""  
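
As a sketch of how a /replace value relates to the presented sequence:
substituting the bases of the feature span with the qualifier value gives the
variant sequence, and an empty value removes the span. The helper name
apply_replace and the 1-based inclusive coordinates are assumptions made for
illustration.

```python
def apply_replace(sequence, start, end, replacement):
    """Replace bases start..end (1-based, inclusive) with replacement."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("span lies outside the sequence")
    return sequence[:start - 1] + replacement + sequence[end:]

# variation 4..5 with /replace="" deletes bases 4 and 5
print(apply_replace("acgtacgt", 4, 5, ""))  # acgcgt
```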




Feature Key           3'UTR


Definition            region at the 3' end of a mature transcript (following 
                      the stop codon) that is not translated into a protein;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /standard_name="text"
                      /trans_splicing



Feature Key           5'UTR


Definition            region at the 5' end of a mature transcript (preceding 
                      the initiation codon) that is not translated into a 
                      protein;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /function="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /standard_name="text"
                      /trans_splicing



Feature Key           -10_signal


Definition            Pribnow box; a conserved region about 10 bp upstream of
                      the start point of bacterial transcription units which
                      may be involved in binding RNA polymerase;
                      consensus=TAtAaT [1,2,3,4];

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /standard_name="text"

Organism scope        prokaryotes

Molecule scope        DNA

References            [1] Schaller, H., Gray, C., and Herrmann, K.  Proc Natl
                          Acad Sci USA 72, 737-741 (1975)
                      [2] Pribnow, D.  Proc Natl Acad Sci USA 72, 784-788 (1975)
                      [3] Hawley, D.K. and McClure, W.R.  "Compilation and
                          analysis of Escherichia coli promoter DNA sequences" 
                          Nucl Acid Res 11, 2237-2255 (1983)
                      [4] Rosenberg, M. and Court, D.  "Regulatory sequences
                          involved in the promotion and termination of RNA
                          transcription"  Ann Rev Genet 13, 319-353 (1979)


Feature Key           -35_signal


Definition            a conserved hexamer about 35 bp upstream of the start
                      point of bacterial transcription units; consensus=TTGACa
                      or TGTTGACA;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /experiment="text"
                      /gene="text"
                      /gene_synonym="text"
                      /inference="TYPE[ (same species)][:EVIDENCE_BASIS]"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /standard_name="text"

Organism scope        prokaryotes

Molecule scope        DNA

References            [1] Takanami, M., et al.  Nature 260, 297-302 (1976)
                      [2] Moran, C.P., Jr., et al.  Molec Gen Genet 186,
                          339-346 (1982)
                      [3] Maniatis, T., et al.  Cell 5, 109-113 (1975)

 
7.4 Appendix IV: Summary of qualifiers for feature keys
7.4.1 Qualifier List

The following is a list of available qualifiers for feature keys and their usage. 
The information is arranged as follows:


Qualifier       name of qualifier; qualifier requires a value if followed by an equal 
                sign
Definition      definition of the qualifier
Value format    format of value, if required
Example         example of qualifier with value
Comment         comments, questions and clarifications


Qualifier       /allele=
Definition      name of the allele for the given gene 
Value format    "text"
Example         /allele="adh1-1"
Comment         all gene-related features (exon, CDS etc) for a given 
                gene should share the same /allele qualifier value; 
                the /allele qualifier value must, by definition, be 
                different from the /gene qualifier value; when used with 
                the variation feature key, the allele qualifier value 
                should be that of the variant.


Qualifier       /anticodon=
Definition      location of the anticodon of tRNA and the amino acid for which
                it codes
Value format    (pos:<base_range>,aa:<amino_acid>)  where base_range
                is the position of the anticodon and amino_acid is the
                abbreviation for the amino acid encoded
Example         /anticodon=(pos:34..36,aa:Phe)
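
A minimal Python sketch of reading this value format follows. It assumes the
base_range is a simple start..end span (no complement or other location
operators), and the names ANTICODON_RE and parse_anticodon are hypothetical.

```python
import re

# Matches (pos:<start>..<end>,aa:<amino_acid>) with a simple numeric range only.
ANTICODON_RE = re.compile(r"^\(pos:(\d+)\.\.(\d+),aa:([A-Za-z]+)\)$")

def parse_anticodon(value):
    """Return (start, end, amino_acid) for a simple /anticodon value."""
    match = ANTICODON_RE.match(value)
    if match is None:
        raise ValueError(f"not a simple /anticodon value: {value!r}")
    start, end, amino_acid = match.groups()
    return int(start), int(end), amino_acid

print(parse_anticodon("(pos:34..36,aa:Phe)"))  # (34, 36, 'Phe')
```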


Qualifier       /bio_material=
Definition      identifier for the biological material from which the nucleic
                acid sequenced was obtained, with optional institution code and
                collection code for the place where it is currently stored.
Value format    "[<institution-code>:[<collection-code>:]]<material_id>"
Example         /bio_material="CGC:CB3912"      <- Caenorhabditis stock centre
Comment         the bio_material qualifier should be used to annotate the
                identifiers of material in biological collections that are not
                appropriate to annotate as either /specimen_voucher or
                /culture_collection; these include zoos and aquaria, stock
                centres, seed banks, germplasm repositories and DNA banks;
                material_id is mandatory, institution_code and collection_code
                are optional; institution code is mandatory where collection
                code is present;
                the /bio_material qualifier became legal on 15-Dec-2007;
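
The optional-prefix rule above (material_id alone, institution code plus
material_id, or all three parts) can be sketched in Python as follows. The
function name parse_bio_material is hypothetical, and the sketch assumes that
none of the three parts contains a colon.

```python
def parse_bio_material(value):
    """Split "[<institution-code>:[<collection-code>:]]<material_id>".

    Returns (institution_code, collection_code, material_id); absent
    optional parts are returned as None.
    """
    parts = value.split(":")
    if len(parts) > 3 or not all(parts):
        raise ValueError(f"malformed /bio_material value: {value!r}")
    material_id = parts[-1]
    institution = parts[0] if len(parts) >= 2 else None
    collection = parts[1] if len(parts) == 3 else None
    return institution, collection, material_id

print(parse_bio_material("CGC:CB3912"))  # ('CGC', None, 'CB3912')
```

Because a collection code can only appear as the middle of three parts, the
rule that the institution code is mandatory where a collection code is present
holds by construction.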


Qualifier       /bound_moiety=
Definition      name of the molecule/complex that may bind to the 
                given feature 
Value format    "text"
Example         /bound_moiety="GAL4" 
Comment         Multiple /bound_moiety qualifiers are legal on "promoter" 
                and "enhancer" features. A single /bound_moiety qualifier 
                is legal on the "misc_binding", "oriT" and "protein_bind"
                features.


Qualifier       /cell_line=
Definition      cell line from which the sequence was obtained
Value format    "text"
Example         /cell_line="MCF7"


Qualifier       /cell_type=
Definition      cell type from which the sequence was obtained
Value format    "text"
Example         /cell_type="leukocyte"


Qualifier       /chromosome=
Definition      chromosome (e.g. Chromosome number) from which
                the sequence was obtained
Value format    "text"
Example         /chromosome="1"


Qualifier       /citation=
Definition      reference to a citation listed in the entry reference field
Value format    [integer-number] where integer-number is the number of the
                reference as enumerated in the reference field
Example         /citation=[3]
Comment         used to indicate the citation providing the claim of and/or
                evidence for a feature; brackets are used for conformity.


Qualifier       /clone=
Definition      clone from which the sequence was obtained
Value format    "text"
Example         /clone="lambda-hIL7.3"
Comment         not more than one clone should be specified for a given source 
                feature;  to indicate that the sequence was obtained from
                multiple clones, multiple source features should be given.


Qualifier       /clone_lib=
Definition      clone library from which the sequence was obtained
Value format    "text"
Example         /clone_lib="lambda-hIL7"


Qualifier       /codon=
Definition      specifies a codon which is different from any found in the
                reference genetic code
Value format    (seq:"codon-sequence",aa:<amino_acid>) where
                "codon-sequence" contains the bases of the codon and <amino_acid> is
                the abbreviation for the translated amino acid, the abbreviation
                for a modified or unusual amino acid from section 7.5,
                or the word OTHER
Example         /codon=(seq:"ttt",aa:Leu)
Comment         used to specify unusual genetic codes, organellar codes, etc,
                that are different from the "normal" code for the organism;
                the codon specified by "seq" codes for the amino acid or stop
                codon specified by "aa";
                the codon that is specified is used throughout the CDS;
                amino acids that are not on the controlled vocabulary list can be
                annotated by using "aa:OTHER" as the amino acid designation, and
                by giving the name of the residue in a /note qualifier;
                only nucleotides a, g, c or t can be used in "codon-sequence";
                multiple /codon qualifiers should be used to describe ambiguous
                nucleotides.
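
Since each /codon qualifier applies throughout the CDS, a set of them can be
collected into a lookup of exceptions to the reference genetic code. The
Python sketch below is one possible reading, not a prescribed implementation;
the names CODON_RE and codon_overrides are hypothetical, and optional white
space after the comma is tolerated as an assumption.

```python
import re

# Only a, c, g and t are allowed in the codon sequence.
CODON_RE = re.compile(r'^\(seq:"([acgt]{3})",\s*aa:([A-Za-z]+)\)$')

def codon_overrides(values):
    """Map codon -> amino-acid abbreviation from a list of /codon values."""
    overrides = {}
    for value in values:
        match = CODON_RE.match(value)
        if match is None:
            raise ValueError(f"malformed /codon value: {value!r}")
        codon, amino_acid = match.groups()
        overrides[codon] = amino_acid
    return overrides

print(codon_overrides(['(seq:"ttt",aa:Leu)']))  # {'ttt': 'Leu'}
```

A translation routine would consult this mapping first and fall back to the
reference genetic code for every other codon.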


Qualifier       /codon_start=
Definition      indicates the offset at which the first complete codon of a
                coding feature can be found, relative to the first base of that
                feature.
Value format    1 or 2 or 3
Example         /codon_start=2
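
To illustrate the offset: with /codon_start=2 the first base of the feature is
skipped and the first complete codon begins at the second base. The Python
sketch below assumes the bases of the feature are already available as a
string in the coding orientation; the function name complete_codons is
hypothetical.

```python
def complete_codons(feature_bases, codon_start=1):
    """Split a coding feature into complete codons, honouring /codon_start."""
    if codon_start not in (1, 2, 3):
        raise ValueError("codon_start must be 1, 2 or 3")
    offset = codon_start - 1  # bases to skip before the first complete codon
    usable = len(feature_bases) - offset
    stop = offset + usable - usable % 3  # drop a trailing partial codon
    return [feature_bases[i:i + 3] for i in range(offset, stop, 3)]

print(complete_codons("gatgaaatag", codon_start=2))  # ['atg', 'aaa', 'tag']
```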


Qualifier       /collected_by= 
Definition      name of the person who collected the specimen 
Value format    "text" 
Example         /collected_by="Dan Janzen" 


Qualifier       /collection_date= 
Definition      date that the specimen was collected 
Value format    "DD-Mmm-YYYY", "Mmm-YYYY" or "YYYY" 
Example         /collection_date="21-Oct-1952" 
                /collection_date="Oct-1952" 
                /collection_date="1952" 
Comment         full date format DD-Mmm-YYYY is preferred; where day and/or month
                of collection is not known either "Mmm-YYYY" or "YYYY" can be used;
                three-letter month abbreviation can be one of the following: Jan,
                Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec. 
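
The three permitted date formats can be validated with a short routine. The
Python sketch below compares month abbreviations against the list above rather
than relying on locale-dependent date parsing; the function name
parse_collection_date is hypothetical.

```python
import datetime

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

def parse_collection_date(value):
    """Return (year, month, day); month and day are None when absent."""
    parts = value.split("-")
    if len(parts) > 3:
        raise ValueError(f"malformed /collection_date value: {value!r}")
    # Pad on the left so the fields are always (day, month, year).
    day_text, month_text, year_text = [None] * (3 - len(parts)) + parts
    if len(year_text) != 4 or not year_text.isdigit():
        raise ValueError(f"year must be YYYY: {value!r}")
    year, month, day = int(year_text), None, None
    if month_text is not None:
        if month_text not in MONTHS:
            raise ValueError(f"month must be one of {MONTHS}: {value!r}")
        month = MONTHS.index(month_text) + 1
    if day_text is not None:
        if len(day_text) != 2 or not day_text.isdigit():
            raise ValueError(f"day must be DD: {value!r}")
        day = int(day_text)
        datetime.date(year, month, day)  # raises ValueError for impossible dates
    return year, month, day

print(parse_collection_date("21-Oct-1952"))  # (1952, 10, 21)
```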


Qualifier       /compare=
Definition      Reference details of an existing public INSD entry 
                to which a comparison is made
Value format    [accession-number.sequence-version]
Example         /compare=AJ634337.1
Comment         This qualifier may be used on the following features:
                misc_difference, conflict, unsure, old_sequence 
                and variation. The features "old_sequence" and "conflict" must
                have either a /citation or a /compare qualifier. Multiple /compare
                qualifiers with different contents are allowed within a 
                single feature. 
                This qualifier is not intended for large-scale annotation 
                of variations, such as SNPs.


Qualifier       /country=
Definition      locality of isolation of the sequenced organism indicated in
                terms of political names for nations, oceans or seas, followed
                by regions and localities
Value format    "<country_value>[:<region>][, <locality>]" where 
                country_value is any value from the controlled vocabulary at 
                http://www.insdc.org/page.php?page=country
Example         /country="Canada:Vancouver"
                /country="France:Cote d'Azur, Antibes"
                /country="Atlantic Ocean:Charlie Gibbs Fracture Zone"
Comment         Intended to provide a reference to the site where the source
                organism was isolated or sampled. Regions and localities should
                be indicated where possible. Note that the physical geography of
                the isolation or sampling site should be represented in
                /isolation_source.
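
A value in this format can be separated into its parts as sketched below in
Python. The sketch assumes that the country value and region contain no
comma-plus-space sequence and that the country value contains no colon; it
does not check the country against the controlled vocabulary. The function
name parse_country is hypothetical.

```python
def parse_country(value):
    """Split "<country_value>[:<region>][, <locality>]" into three parts.

    Returns (country, region, locality); absent parts are None.
    """
    head, _, locality = value.partition(", ")
    country, _, region = head.partition(":")
    if not country:
        raise ValueError(f"missing country value: {value!r}")
    return country, region or None, locality or None

print(parse_country("France:Cote d'Azur, Antibes"))
# ('France', "Cote d'Azur", 'Antibes')
```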


Qualifier       /cultivar=
Definition      cultivar (cultivated variety) of plant from which sequence was 
                obtained. 
Value format    "text"
Example         /cultivar="Nipponbare"
                /cultivar="Tenuifolius"
                /cultivar="Candy Cane"
                /cultivar="IR36"
Comment         'cultivar' is applied solely to products of artificial 
                selection;  use the variety qualifier for natural, named 
                plant and fungal varieties;  


Qualifier       /culture_collection=
Definition      institution code and identifier for the culture from which the
                nucleic acid sequenced was obtained, with optional collection
                code.
Value format    "<institution-code>:[<collection-code>:]<culture_id>"
Example         /culture_collection="ATCC:26370"
Comment         the /culture_collection qualifier should be used to annotate
                live microbial and viral cultures, and cell lines that have been
                deposited in curated culture collections; microbial cultures in
                personal or laboratory collections should be annotated in strain
                qualifiers;
                annotation with a culture_collection qualifier implies that the
                sequence was obtained from a sample retrieved (by the submitter
                or a collaborator) from the indicated culture collection, or
                that the sequence was obtained from a sample that was deposited
                (by the submitter or a collaborator) in the indicated culture
                collection; annotation with more than one culture_collection
                qualifier indicates that the sequence was obtained from a sample
                that was deposited (by the submitter or a collaborator) in more
                than one culture collection.
                culture_id and institution_code are mandatory, collection_code
                is optional;
                the /culture_collection qualifier becomes legal on 15-Dec-2007;
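
The value format above can be split mechanically. The following is a
minimal sketch, not part of the definition; the function name
parse_culture_collection is hypothetical, and it assumes that none of
the three parts contains a colon.

```python
def parse_culture_collection(value):
    """Split '<institution-code>:[<collection-code>:]<culture_id>'.

    Assumption: the parts themselves never contain ':'.
    """
    parts = value.split(":")
    if len(parts) == 2:
        institution, culture_id = parts
        collection = None  # collection code is optional
    elif len(parts) == 3:
        institution, collection, culture_id = parts
    else:
        raise ValueError("expected 2 or 3 colon-separated parts: %r" % value)
    # institution code and culture id are mandatory
    if not institution or not culture_id or collection == "":
        raise ValueError("empty component in %r" % value)
    return {
        "institution_code": institution,
        "collection_code": collection,
        "culture_id": culture_id,
    }
```

For /culture_collection="ATCC:26370" this yields institution code
ATCC, no collection code, and culture id 26370.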


Qualifier       /db_xref=
Definition      database cross-reference: pointer to related information in 
                another database.
Value format    "<database:identifier>" where database is
                the name of the database containing related information, and 
                identifier is the internal identifier of the related information
                according to the naming conventions of the cross-referenced 
                database.
Example         /db_xref="UniProtKB/Swiss-Prot:P28763"
Comment         the complete list of allowed database types is kept at 
                http://www.insdc.org/page.php?page=db_xref


Qualifier       /dev_stage=
Definition      if the sequence was obtained from an organism in a specific 
                developmental stage, it is specified with this qualifier
Value format    "text"
Example         /dev_stage="fourth instar larva"


Qualifier       /direction=
Definition      direction of DNA replication
Value format    left, right, or both where left indicates toward the 5' end of
                the entry sequence (as presented) and right indicates toward
                the 3' end
Example         /direction=LEFT


Qualifier       /EC_number=
Definition      Enzyme Commission number for enzyme product of sequence
Value format    "text"
Example         /EC_number="1.1.2.4"
                /EC_number="1.1.2.-"
                /EC_number="1.1.2.n"
Comment         valid values for EC numbers are defined in the list prepared by the 
                Nomenclature Committee of the International Union of Biochemistry and
                Molecular Biology (NC-IUBMB) (published in Enzyme Nomenclature 1992,
                Academic Press, San Diego, or a more recent revision thereof). 
                The format represents a string of four numbers separated by full
                stops; up to three numbers starting from the end of the string can 
                be replaced by dash "-" to indicate uncertain assignment. 
                Symbol "n" can be used in the last position instead of a number 
                where the EC number is awaiting assignment. Please note that such
                incomplete EC numbers are not approved by NC-IUBMB.
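
The format rules in the comment above can be checked as follows. This
is a minimal sketch under stated assumptions: the function name
is_valid_ec_number is hypothetical, dashes are assumed to form one
unbroken run at the end of the string, and "n" is accepted only in the
fourth position. It checks the format only, not whether the number
appears in the NC-IUBMB list.

```python
import re

# ASCII digits only; str.isdigit() would also accept other Unicode digits.
_is_number = re.compile(r"[0-9]+").fullmatch


def is_valid_ec_number(value):
    """Check the four-field EC number format described above."""
    fields = value.split(".")
    # four fields; the first can never be replaced by a dash
    if len(fields) != 4 or not _is_number(fields[0]):
        return False
    seen_dash = False
    for position, field in enumerate(fields[1:], start=1):
        if field == "-":
            seen_dash = True
        elif seen_dash:
            # assumption: nothing but dashes may follow a dash
            return False
        elif field == "n" and position == 3:
            # "n" marks a number awaiting assignment, last field only
            continue
        elif not _is_number(field):
            return False
    return True
```

All three example values above pass this check, while a value such as
"1.-.2.4" is rejected.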


Qualifier       /ecotype=
Definition      a population within a given species displaying genetically 
                based, phenotypic traits that reflect adaptation to a local habitat.
Value format    "text"
Example         /ecotype="Columbia"
Comment         an example of such a population is one that has adapted hairier
                than normal leaves as a response to an especially sunny habitat.
                'Ecotype' is often applied to standard genetic stocks of
                Arabidopsis thaliana, but it can be applied to any sessile 
                organism.


Qualifier       /environmental_sample
Definition      identifies sequences derived by direct molecular
                isolation from a bulk environmental DNA sample
                (by PCR with or without subsequent cloning of the
                product, DGGE, or other anonymous methods) with no
                reliable identification of the source organism.
                Environmental samples include clinical samples,
                gut contents, and other sequences from anonymous
                organisms that may be associated with a particular
                host. They do not include endosymbionts that can be
                reliably recovered from a particular host, organisms
                from a readily identifiable but uncultured field sample
                (e.g., many cyanobacteria), or phytoplasmas that can be 
                reliably recovered from diseased plants (even though 
                these cannot be grown in axenic culture).
Value format    none
Example         /environmental_sample
Comment         used only with the source feature key; source feature 
                keys containing the /environmental_sample qualifier 
                should also contain the /isolation_source qualifier.
                entries including /environmental_sample must not include 
                the /strain qualifier


Qualifier       /estimated_length=
Definition      estimated length of the gap in the sequence
Value format    unknown or <integer>
Example         /estimated_length=unknown
                /estimated_length=342


Qualifier       /exception=
Definition      indicates that the coding region cannot be translated using
                standard biological rules
Value format    "RNA editing", "reasons given in citation",
                "rearrangement required for product"
Example         /exception="RNA editing"
                /exception="reasons given in citation"
                /exception="rearrangement required for product"
Comment         only to be used to describe biological mechanisms such 
                as RNA editing;  where the exception cannot easily be described 
                a published citation must be referred to; protein translation of
                /exception CDS will be different from the corresponding conceptual 
                translation; 
                - must not be used where transl_except would be adequate,
                  e.g. in case of stop codon completion use:
                /transl_except=(pos:6883,aa:TERM)
                /note="TAA stop codon is completed by addition of 3' A residues to   
                mRNA".
                - must not be used for ribosomal slippage, instead use join operator, 
                  e.g.: CDS   join(486..1784,1787..4810)
                              /note="ribosomal slip on tttt sequence at 1784..1787"


Qualifier       /experiment=
Definition      a brief description of the nature of the experimental 
                evidence that supports the feature identification or assignment.
Value format    "text"
Example         /experiment="Northern blot"
                /experiment="heterologous expression system of Xenopus laevis
                oocytes"       
Comment         experimental details should not be included, and would
                normally be found in the cited publications; value 
                "experimental evidence, no additional details recorded" was used to 
                replace instances of /evidence=EXPERIMENTAL in December 2005


Qualifier       /focus
Definition      identifies the source feature of primary biological
                interest for records that have multiple source features
                originating from different organisms and that are not
                transgenic.
Value format    none
Example         /focus
Comment         the source feature carrying the /focus qualifier
                identifies the main organism of the entry, this
                determines: a) the name displayed in the organism
                lines, b) if no translation table is specified, the
                translation table, c) the DDBJ/EMBL/GenBank taxonomic
                division in which the entry will appear; only one
                source feature with /focus is allowed in an entry; the
                /focus and /transgenic qualifiers are mutually exclusive
                in an entry.


Qualifier       /frequency=
Definition      frequency of the occurrence of a feature
Value format    text representing the proportion of a population carrying the
                feature expressed as a fraction
Example         /frequency="23/108"
                /frequency="1 in 12"
                /frequency=".85"


Qualifier       /function=
Definition      function attributed to a sequence
Value format    "text"
Example         /function="essential for recognition of cofactor"
Comment         /function is used when the gene name and/or product name do not 
                convey the function attributable to a sequence.


Qualifier       /gene=
Definition      symbol of the gene corresponding to a sequence region
Value format    "text"
Example         /gene="ilvE" 

Qualifier       /gene_synonym=
Definition      synonymous, replaced, obsolete or former gene symbol
Value format    "text"
Example         /gene_synonym="Hox-3.3"
                in a feature where /gene="Hoxc6"
Comment         used where it is helpful to indicate a gene symbol
                synonym; when used, a primary gene symbol must always be
                indicated in /gene


Qualifier       /germline
Definition      the sequence presented in the entry has not undergone somatic
                rearrangement as part of an adaptive immune response; it is the
                unrearranged sequence that was inherited from the parental
                germline
Value format    none
Example         /germline
Comment         /germline should not be used to indicate that the source of
                the sequence is a gamete or germ cell;
                /germline and /rearranged cannot be used in the same source
                feature;
                /germline and /rearranged should only be used for molecules that
                can undergo somatic rearrangements as part of an adaptive immune 
                response; these are the T-cell receptor (TCR) and immunoglobulin
                loci in the jawed vertebrates, and the unrelated variable 
                lymphocyte receptor (VLR) locus in the jawless fish (lampreys
                and hagfish);
                /germline and /rearranged should not be used outside of the
                Craniata (taxid=89593)


Qualifier       /haplotype=
Definition      name for a specific set of alleles that are linked together
                on the same physical chromosome. In the absence of
                recombination, each haplotype is inherited as a unit, and may
                be used to track gene flow in populations.
Value format    "text"
Example         /haplotype="Dw3 B5 Cw1 A1"


Qualifier       /host=
Definition      natural (as opposed to laboratory) host to the organism from
                which sequenced molecule was obtained
Value format    "text"
Example         /host="Homo sapiens"
                /host="Homo sapiens 12 year old girl"
                /host="Rhizobium NGR234"


Qualifier       /identified_by= 
Definition      name of the taxonomist who identified the specimen 
Value format    "text" 
Example         /identified_by="John Burns" 


Qualifier       /inference=
Definition      a structured description of non-experimental evidence that supports
                the feature identification or assignment.

Value format   "TYPE[ (same species)][:EVIDENCE_BASIS]"
where TYPE is one of the following:
"non-experimental evidence, no additional details recorded"
     "similar to sequence"
          "similar to AA sequence"
          "similar to DNA sequence"
          "similar to RNA sequence"
              "similar to RNA sequence, mRNA"
              "similar to RNA sequence, EST"
              "similar to RNA sequence, other RNA"
     "profile"
          "nucleotide motif"
          "protein motif"
          "ab initio prediction"
     "alignment"
where the optional text "(same species)" is included when the inference comes 
from the same species as the entry.
where the optional "EVIDENCE_BASIS" is either a reference to a database entry 
(including accession and version) or an algorithm (including version) , eg 
'INSD:AACN010222672.1', 'InterPro:IPR001900', 'ProDom:PD000600', 
'Genscan:2.0', etc.

Example         /inference="similar to DNA sequence:INSD:AY411252.1"
                /inference="similar to RNA sequence, mRNA:RefSeq:NM_000041.2"
                /inference="similar to DNA sequence (same
                species):INSD:AACN010222672.1"
                /inference="profile:tRNAscan:2.1"
                /inference="protein motif:InterPro:IPR001900"
                /inference="ab initio prediction:Genscan:2.0"
                /inference="alignment:Splign:1.0"

Comment         /inference="non-experimental evidence, no additional details 
                recorded" was used to replace instances of 
                /evidence=NOT_EXPERIMENTAL in December 2005;
                recommendations for choice of resource acronym for
                [EVIDENCE_BASIS] are provided in the /inference qualifier
                vocabulary recommendation document
                (http://www.insdc.org/page.php?page=inference);
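
The structured value "TYPE[ (same species)][:EVIDENCE_BASIS]" can be
taken apart as sketched below. This is an illustration only; the
function name parse_inference is hypothetical. It assumes the value
has already been unwrapped onto a single line with single spaces, and
it relies on the fact that none of the TYPE terms listed above
contains a colon.

```python
INFERENCE_TYPES = frozenset([
    "non-experimental evidence, no additional details recorded",
    "similar to sequence",
    "similar to AA sequence",
    "similar to DNA sequence",
    "similar to RNA sequence",
    "similar to RNA sequence, mRNA",
    "similar to RNA sequence, EST",
    "similar to RNA sequence, other RNA",
    "profile",
    "nucleotide motif",
    "protein motif",
    "ab initio prediction",
    "alignment",
])

SAME_SPECIES = " (same species)"


def parse_inference(value):
    """Return (type, same_species, evidence_basis or None)."""
    # the first colon separates TYPE (plus optional flag) from the basis
    head, sep, basis = value.partition(":")
    same_species = head.endswith(SAME_SPECIES)
    inference_type = head[:-len(SAME_SPECIES)] if same_species else head
    if inference_type not in INFERENCE_TYPES:
        raise ValueError("unknown inference TYPE: %r" % inference_type)
    return inference_type, same_species, (basis if sep else None)
```

For example, "similar to DNA sequence (same
species):INSD:AACN010222672.1", once unwrapped, gives the type
"similar to DNA sequence", the same-species flag set, and the evidence
basis "INSD:AACN010222672.1".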


Qualifier       /isolate=
Definition      individual isolate from which the sequence was obtained
Value format    "text"
Example         /isolate="Patient #152"
                /isolate="DGGE band PSBAC-13"


Qualifier       /isolation_source=
Definition      describes the physical, environmental and/or local
                geographical source of the biological sample from which
                the sequence was derived
Value format    "text"
Examples        /isolation_source="rumen isolates from standard 
                Pelleted ration-fed steer #67"
                /isolation_source="permanent Antarctic sea ice"
                /isolation_source="denitrifying activated sludge from
                carbon_limited continuous reactor" 
Comment         used only with the source feature key;
                source feature keys containing an /environmental_sample
                qualifier should also contain an /isolation_source
                qualifier; the /country qualifier should be used to 
                describe the country and major geographical sub-region.


Qualifier       /label=
Definition      a label used to permanently tag a feature
Value format    feature_label  
Example         /label=Alb1_exon1
Comment         feature labels follow the naming conventions
                for all feature table objects
                (see Sections 3.1 and 3.4)


Qualifier       /lab_host=
Definition      scientific name of the laboratory host used to propagate the
                source organism from which the sequenced molecule was obtained
Value format    "text"
Example         /lab_host="Gallus gallus"
                /lab_host="Gallus gallus embryo"
                /lab_host="Escherichia coli strain DH5 alpha"
                /lab_host="Homo sapiens HeLa cells"
Comment         the full binomial scientific name of the host organism should
                be used when known; extra conditional information relating to
                the host may also be included


Qualifier       /lat_lon= 
Definition      geographical coordinates of the location where the specimen was
                collected 
Value format    "text" 
Example         /lat_lon="47.94 N 28.12 W" 
                /lat_lon="45.01 S 4.12 E"
Comment         degrees latitude and longitude in format "d[d.dd] N|S d[dd.dd] W|E"
                (see the examples)
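
A value in the format above can be converted to signed decimal degrees
as sketched below. The function name parse_lat_lon is hypothetical,
and the sign convention (south and west negative) is an assumption of
this sketch, not something the definition prescribes. Any number of
decimal places is accepted.

```python
import re

_LAT_LON = re.compile(
    r"([0-9]{1,2}(?:\.[0-9]+)?) ([NS]) ([0-9]{1,3}(?:\.[0-9]+)?) ([WE])"
)


def parse_lat_lon(value):
    """Return (latitude, longitude) as signed floats."""
    match = _LAT_LON.fullmatch(value)
    if match is None:
        raise ValueError("not in 'd[d.dd] N|S d[dd.dd] W|E' format: %r" % value)
    latitude = float(match.group(1))
    longitude = float(match.group(3))
    if latitude > 90 or longitude > 180:
        raise ValueError("coordinate out of range: %r" % value)
    # assumed convention: south and west are negative
    if match.group(2) == "S":
        latitude = -latitude
    if match.group(4) == "W":
        longitude = -longitude
    return latitude, longitude
```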
 

Qualifier       /locus_tag=
Definition      a submitter-supplied, systematic, stable identifier for a gene
                and its associated features, used for tracking purposes
Value format    "text" (single token) 
                but not "<1-5 letters><5-9 digit integer>[.<integer>]"
Example         /locus_tag="ABC_0022" 
                /locus_tag="A1C_00001"
Comment         /locus_tag can be used with any feature that /gene 
                can be used with;  
                identical /locus_tag values may be used within an entry/record, 
                but only if the identical /locus_tag values are associated 
                with the same gene; in all other circumstances the /locus_tag 
                value must be unique within that entry/record. Multiple /locus_tag 
                values are not allowed within one feature for entries created 
                after 15-OCT-2004. 
                If a /locus_tag needs to be re-assigned the /old_locus_tag qualifier 
                should be used to store the old value. The /locus_tag value should
                not be in a format which resembles INSD accession numbers,
                accession.version, or /protein_id identifiers.
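
The exclusion pattern in the value format can be tested with a regular
expression. This is a minimal sketch; the function name
is_acceptable_locus_tag is hypothetical, and "single token" is
interpreted here as "non-empty and free of whitespace", which is an
assumption.

```python
import re

# "<1-5 letters><5-9 digit integer>[.<integer>]", the shape to avoid
_ACCESSION_LIKE = re.compile(r"[A-Za-z]{1,5}[0-9]{5,9}(?:\.[0-9]+)?")


def is_acceptable_locus_tag(value):
    """True if value is a single token that does not look like an
    accession number, accession.version, or protein identifier."""
    if not value or any(ch.isspace() for ch in value):
        return False
    return _ACCESSION_LIKE.fullmatch(value) is None
```

Both example values above are accepted, while a value such as
"AAA12345.1" is rejected because it has the shape of a protein
identifier.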


Qualifier       /map=
Definition      genomic map position of feature
Value format    "text"
Example         /map="8q12-13"


Qualifier       /macronuclear
Definition      if the sequence shown is DNA and from an organism which 
                undergoes chromosomal differentiation between macronuclear and
                micronuclear stages, this qualifier is used to denote that the 
                sequence is from macronuclear DNA. 
Value format    none
Example         /macronuclear


Qualifier       /mating_type=
Definition      mating type of the organism from which the sequence was
                obtained; mating type is used for prokaryotes, and for
                eukaryotes that undergo meiosis without sexually dimorphic
                gametes
Value format    "text"
Examples        /mating_type="MAT-1"
                /mating_type="plus"
                /mating_type="-"
                /mating_type="odd"
                /mating_type="even"
Comment         /mating_type="male" and /mating_type="female" are
                valid in the prokaryotes, but not in the eukaryotes;
                for more information, see the entry for /sex.


Qualifier       /mobile_element=
Definition      type and name or identifier of the mobile element which is
                described by the parent feature
Value format    "<mobile_element_type>[:<mobile_element_name>]" where
                mobile_element_type is one of the following:
                "transposon", "retrotransposon", "integron", 
                "insertion sequence", "non-LTR retrotransposon", 
                "SINE", "MITE", "LINE", "other".
Example         /mobile_element="transposon:Tnp9"
Comment         /mobile_element is legal on repeat_region feature key only.  
                Mobile element should be used to represent both elements which 
                are currently mobile, and those which were mobile in the past.  
                Value "other" requires a mobile_element_name. 
                /mobile_element qualifier replaced /transposon and /insertion_seq
                qualifiers in December 2006
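
The rules above (a controlled type, an optional name, and a mandatory
name for type "other") can be sketched as follows. The function name
parse_mobile_element is hypothetical; the type list is copied from the
value format of this entry.

```python
MOBILE_ELEMENT_TYPES = frozenset([
    "transposon", "retrotransposon", "integron",
    "insertion sequence", "non-LTR retrotransposon",
    "SINE", "MITE", "LINE", "other",
])


def parse_mobile_element(value):
    """Return (mobile_element_type, mobile_element_name or None)."""
    element_type, sep, name = value.partition(":")
    if element_type not in MOBILE_ELEMENT_TYPES:
        raise ValueError("unknown mobile_element_type: %r" % element_type)
    if sep and not name:
        raise ValueError("empty mobile_element_name in %r" % value)
    if element_type == "other" and not name:
        raise ValueError('type "other" requires a mobile_element_name')
    return element_type, (name or None)
```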


Qualifier       /mod_base=
Definition      abbreviation for a modified nucleotide base
Value format    modified_base
Example         /mod_base=m5c
Comment         modified nucleotides not found in the restricted vocabulary
                list can be annotated by entering '/mod_base=OTHER' with
                '/note="name of modified base"'


Qualifier       /mol_type=
Definition      in vivo molecule type of sequence  
Value format    "genomic DNA", "genomic RNA", "mRNA", "tRNA", "rRNA", "other
                RNA", "other DNA", "transcribed RNA", "viral cRNA", "unassigned
                DNA", "unassigned RNA"
Example         /mol_type="genomic DNA"
Comment         all values refer to the in vivo or synthetic molecule for
                primary entries and the hypothetical molecule in Third Party
                Annotation entries; the value "genomic DNA" does not imply that
                the molecule is nuclear (e.g. organelle and plasmid DNA should
                be described using "genomic DNA"); ribosomal RNA genes should be
                described using "genomic DNA"; "rRNA" should only be used if the
                ribosomal RNA molecule itself has been sequenced; /mol_type is
                mandatory on every source feature key; all /mol_type values
                within one entry/record must be the same; values "other RNA" and
                "other DNA" should be applied to synthetic molecules, values
                "unassigned DNA", "unassigned RNA" should be applied where in
                vivo molecule is unknown


Qualifier       /ncRNA_class=
Definition      a structured description of the classification of the
                non-coding RNA described by the ncRNA parent key
Value format   "TYPE"
Example         /ncRNA_class="miRNA"
                /ncRNA_class="siRNA"
                /ncRNA_class="scRNA"       
Comment         TYPE is a term taken from the INSDC controlled vocabulary for ncRNA
                classes (http://www.insdc.org/page.php?page=rna_vocab); on
                15-Oct-2008, the following terms were valid:

                      "antisense_RNA"
                      "autocatalytically_spliced_intron" 
                      "ribozyme"
                      "hammerhead_ribozyme" 
                      "RNase_P_RNA"
                      "RNase_MRP_RNA"
                      "telomerase_RNA"
                      "guide_RNA"
                      "rasiRNA"
                      "scRNA"
                      "siRNA"
                      "miRNA"
                      "piRNA"
                      "snoRNA"
                      "snRNA"
                      "SRP_RNA"
                      "vault_RNA"
                      "Y_RNA"
                      "other"

                ncRNA classes not yet in the INSDC /ncRNA_class controlled
                vocabulary can be annotated by entering
                '/ncRNA_class="other"' with '/note="[brief explanation of
                novel ncRNA_class]"';



Qualifier       /note=
Definition      any comment or additional information
Value format    "text"
Example         /note="This qualifier is equivalent to a comment."


Qualifier       /number=
Definition      a number to indicate the order of genetic elements (e.g.,
                exons or introns) in the 5' to 3' direction
Value format    unquoted text (single token) 
Example         /number=4
                /number=6B
Comment         text limited to integers, letters or combination of integers and/or 
                letters represented as an unquoted single token (e.g. 5a, XIIb);
                any additional terms should be included in /standard_name.
                Example:  /number=2A
                          /standard_name="long"


Qualifier       /old_locus_tag=
Definition      feature tag assigned for tracking purposes 
Value format    "text" (single token)
Example         /old_locus_tag="RSc0382"
                /locus_tag="YPO0002"
Comment         /old_locus_tag can be used with any feature where /gene is valid and 
                where a /locus_tag qualifier is present.  
                Identical /old_locus_tag values may be used within an entry/record, 
                but only if the identical /old_locus_tag values are associated 
                with the same gene; in all other circumstances the /old_locus_tag 
                value must be unique within that entry/record. 
                Multiple /old_locus_tag qualifiers with distinct values are 
                allowed within a single feature; /old_locus_tag and /locus_tag 
                values must not be identical within a single feature.


Qualifier       /operon=
Definition      name of the group of contiguous genes transcribed into a 
                single transcript to which that feature belongs.
Value format    "text"
Example         /operon="lac"
Comment         currently valid only on Prokaryota-specific features


Qualifier       /organelle= 
Definition      type of membrane-bound intracellular structure from which the 
                sequence was obtained
Value format    chromatophore, hydrogenosome, mitochondrion, nucleomorph, plastid,
                mitochondrion:kinetoplast, plastid:chloroplast, plastid:apicoplast,
                plastid:chromoplast, plastid:cyanelle, plastid:leucoplast,
                plastid:proplastid
Examples        /organelle="chromatophore"
                /organelle="hydrogenosome"
                /organelle="mitochondrion"
                /organelle="nucleomorph"
                /organelle="plastid"
                /organelle="mitochondrion:kinetoplast"
                /organelle="plastid:chloroplast"
                /organelle="plastid:apicoplast"
                /organelle="plastid:chromoplast"
                /organelle="plastid:cyanelle"
                /organelle="plastid:leucoplast"
                /organelle="plastid:proplastid"
Comments        modifier text limited to values from controlled list


Qualifier       /organism=
Definition      scientific name of the organism that provided the 
                sequenced genetic material.  
Value format    "text"
Example         /organism="Homo sapiens"
Comment         the organism name which appears on the OS or ORGANISM line 
                will match the value of the /organism qualifier of the 
                source key in the simplest case of a one-source sequence.  


Qualifier       /partial
Definition      differentiates between complete regions and partial ones
Value format    none
Example         /partial
Comment         not to be used for new entries from 15-DEC-2001;
                use '<' and '>' signs in the location descriptors to
                indicate that the sequence is partial. 


Qualifier       /PCR_conditions=
Definition      description of reaction conditions and components for PCR 
Value format    "text" 
Example         /PCR_conditions="Initial denaturation:94degC,1.5min"
Comment         used with primer_bind key


Qualifier       /PCR_primers=
Definition      PCR primers that were used to amplify the sequence.
                A single /PCR_primers qualifier should contain all the primers used  
                for a single PCR reaction. If multiple forward or reverse primers are                   
                present in a  single PCR reaction, multiple sets of fwd_name/fwd_seq 
                or rev_name/rev_seq values will be  present.
Value format    /PCR_primers="[fwd_name: XXX1, ]fwd_seq: xxxxx1,[fwd_name: XXX2,]
                fwd_seq: xxxxx2, [rev_name: YYY1, ]rev_seq: yyyyy1, 
                [rev_name: YYY2, ]rev_seq: yyyyy2"

Example         /PCR_primers="fwd_name: CO1P1, fwd_seq: ttgattttttggtcayccwgaagt,
                rev_name: CO1R4, rev_seq: ccwvytardcctarraartgttg"
                /PCR_primers=" fwd_name: hoge1, fwd_seq: cgkgtgtatcttact, 
                rev_name: hoge2, rev_seq: cg<i>gtgtatcttact" 
                /PCR_primers="fwd_name: CO1P1, fwd_seq: ttgattttttggtcayccwgaagt,
                fwd_name: CO1P2, fwd_seq: gatacacaggtcayccwgaagt, rev_name: CO1R4,  
                rev_seq: ccwvytardcctarraartgttg" 

Comment         fwd_seq and rev_seq are both mandatory; fwd_name and rev_name are
                both optional. Both sequences should be presented in 5'>3' order. 
                The sequences should be given in the IUPAC degenerate-base alphabet,
                except for the modified bases; those must be enclosed within angle
                brackets <> 
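
The value format above can be read into name/sequence pairs as
sketched below. This is an illustration under assumptions: the
function name parse_pcr_primers is hypothetical, primer names are
assumed not to contain commas, and each optional name is taken to
belong to the next sequence of the same direction.

```python
import re

_FIELD = re.compile(r"(fwd|rev)_(name|seq):\s*([^,]+)")


def parse_pcr_primers(value):
    """Return {'fwd': [(name, seq), ...], 'rev': [(name, seq), ...]}.

    name is None where the optional fwd_name/rev_name was omitted.
    """
    primers = {"fwd": [], "rev": []}
    pending_name = {"fwd": None, "rev": None}
    for match in _FIELD.finditer(value):
        direction, kind = match.group(1), match.group(2)
        text = match.group(3).strip()
        if kind == "name":
            pending_name[direction] = text
        else:
            primers[direction].append((pending_name[direction], text))
            pending_name[direction] = None
    # fwd_seq and rev_seq are both mandatory
    if not primers["fwd"] or not primers["rev"]:
        raise ValueError("fwd_seq and rev_seq are both required")
    return primers
```

Modified bases enclosed in angle brackets, as in the second example
above, are kept as part of the sequence text and are not interpreted.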


Qualifier       /phenotype=
Definition      phenotype conferred by the feature, where phenotype is defined as a 
                physical, biochemical or behavioural characteristic or set of 
                characteristics
Value format    "text"
Example         /phenotype="erythromycin resistance"


Qualifier       /pop_variant=
Definition      name of a variation that characterizes a particular
                sub-population within a given species. The variation could be
                in the genotype or the phenotype. 
Value format    "text"
Example         /pop_variant="pop1" 
                /pop_variant="Bear Paw"


Qualifier       /plasmid=
Definition      name of naturally occurring plasmid from which the sequence was 
                obtained, where plasmid is defined as an independently replicating
                genetic unit that cannot be described by /chromosome or /segment
Value format    "text"
Example         /plasmid="C-589"


Qualifier       /product=
Definition      name of the product associated with the feature, e.g. the mRNA of an 
                mRNA feature, the polypeptide of a CDS, the mature peptide of a 
                mat_peptide, etc.
Value format    "text"
Example         /product="trypsinogen" (when qualifier appears in CDS feature)
                /product="trypsin" (when qualifier appears in mat_peptide feature)
                /product="XYZ neural-specific transcript" (when qualifier appears in 
                mRNA feature)


Qualifier       /protein_id=
Definition      protein identifier, issued by the International collaborators;
                this qualifier consists of a stable ID portion (3+5 format:
                three letters followed by five digits) plus a version number
                after the decimal point.
Value format    <identifier>
Example         /protein_id="AAA12345.1"
Comment         when the protein sequence encoded by the CDS changes, only 
                the version number of the /protein_id value is incremented; 
                the stable part of the /protein_id remains unchanged and as a
                result will permanently be associated with a given protein;
                this qualifier is valid only on CDS features which translate
                into a valid protein. 
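
The 3+5 format with a version suffix can be split as sketched below.
The function name split_protein_id is hypothetical, and the
restriction to upper-case letters is an assumption based on the
example above.

```python
import re

# three letters, five digits, a decimal point, then the version number
_PROTEIN_ID = re.compile(r"([A-Z]{3}[0-9]{5})\.([0-9]+)")


def split_protein_id(value):
    """Return (stable_id, version) for a value such as 'AAA12345.1'."""
    match = _PROTEIN_ID.fullmatch(value)
    if match is None:
        raise ValueError("not a 3+5 protein identifier with version: %r" % value)
    return match.group(1), int(match.group(2))
```

When the encoded protein sequence changes, only the second element of
the returned pair would be expected to differ between the old and new
values.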


Qualifier       /proviral
Definition      this qualifier is used to flag sequence obtained from a virus or
                phage that is integrated into the genome of another organism
Value format    none
Example         /proviral


Qualifier       /pseudo
Definition      indicates that this feature is a non-functional version of the
                element named by the feature key
Value format    none
Example         /pseudo


Qualifier       /rearranged
Definition      the sequence presented in the entry has undergone somatic
                rearrangement as part of an adaptive immune response; it is not
                the unrearranged sequence that was inherited from the parental
                germline
Value format    none
Example         /rearranged
Comment         /rearranged should not be used to annotate chromosome
                rearrangements that are not involved in an adaptive immune
                response;
                /germline and /rearranged cannot be used in the same source
                feature;
                /germline and /rearranged should only be used for molecules that
                can undergo somatic rearrangements as part of an adaptive immune 
                response; these are the T-cell receptor (TCR) and immunoglobulin
                loci in the jawed vertebrates, and the unrelated variable 
                lymphocyte receptor (VLR) locus in the jawless fish (lampreys
                and hagfish);
                /germline and /rearranged should not be used outside of the
                Craniata (taxid=89593)


Qualifier       /replace=
Definition      indicates that the sequence identified in a feature's intervals is  
                replaced by the sequence shown in "text"; if no sequence is 
                contained within the qualifier, this indicates a deletion.
Value format    "text"
Example         /replace="a"
                /replace=""
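
The effect of /replace on a sequence can be illustrated as follows.
This is a minimal sketch; the function name apply_replace is
hypothetical, and it assumes the feature location is a single simple
span start..end, with 1-based inclusive coordinates.

```python
def apply_replace(sequence, start, end, replacement):
    """Replace bases start..end (1-based, inclusive) with replacement.

    An empty replacement corresponds to /replace="" (a deletion).
    """
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("span %d..%d lies outside the sequence" % (start, end))
    return sequence[:start - 1] + replacement + sequence[end:]
```

With the sequence "acgtacgt" and a feature at 3..4, /replace="a" gives
"acaacgt" and /replace="" gives "acacgt".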


Qualifier       /ribosomal_slippage
Definition      during protein translation, certain sequences can program
                ribosomes to change to an alternative reading frame by a 
                mechanism known as ribosomal slippage 
Value format    none 
Example         /ribosomal_slippage 
Comment         a join operator, e.g. join(486..1784,1787..4810), should be used 
                in the CDS spans to indicate the location of ribosomal_slippage 


Qualifier       /rpt_family=
Definition      type of repeated sequence; "Alu" or "Kpn", for example
Value format    "text"
Example         /rpt_family="Alu"


Qualifier       /rpt_type=
Definition      organization of repeated sequence
Value format    tandem, inverted, flanking, terminal, direct, dispersed, and other
Example         /rpt_type=INVERTED
Comment         the values are case-insensitive, i.e. both "INVERTED" and "inverted" 
                are valid;
                Definitions of the values:
                tandem, a repeat that exists adjacent to another in the same
                orientation;
                inverted, a repeat which occurs as part of a set (normally a pair)
                organized in the reverse orientation;
                flanking, a repeat lying outside the sequence for which it has
                functional significance (e.g. transposon insertion target sites);
                terminal, a repeat at the ends of and within the sequence for which
                it has functional significance (e.g. transposon LTRs);
                direct, a repeat that exists not always adjacent but is in the same
                orientation;
                dispersed, a repeat that is found dispersed throughout the genome;
                other, a repeat exhibiting important attributes that cannot be
                described by other values.
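
A case-insensitive check against the vocabulary listed above might look like
the following sketch; the names RPT_TYPES and is_valid_rpt_type are
hypothetical.

```python
# Controlled vocabulary for /rpt_type as listed in the Value format line.
RPT_TYPES = {
    "tandem", "inverted", "flanking", "terminal",
    "direct", "dispersed", "other",
}


def is_valid_rpt_type(value: str) -> bool:
    """Return True if `value` is a legal /rpt_type value, ignoring case."""
    return value.lower() in RPT_TYPES


print(is_valid_rpt_type("INVERTED"))     # True
print(is_valid_rpt_type("palindromic"))  # False
```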


Qualifier       /rpt_unit_range=
Definition      identity of a repeat range
Value format    <base_range>
Example         /rpt_unit_range=202..245
Comment         used to indicate the base range of the sequence that constitutes 
                a repeated sequence specified by the feature keys oriT and
                repeat_region; qualifiers /rpt_unit_range and /rpt_unit_seq
                replaced qualifier /rpt_unit in December 2005


Qualifier       /rpt_unit_seq=
Definition      identity of a repeat sequence
Value format    "text"
Example         /rpt_unit_seq="aagggc"
                /rpt_unit_seq="ag(5)tg(8)"
                /rpt_unit_seq="(AAAGA)6(AAAA)1(AAAGA)12"
Comment         used to indicate the literal sequence that constitutes a
                repeated sequence specified by the feature keys oriT and
                repeat_region; qualifiers /rpt_unit_range and /rpt_unit_seq
                replaced qualifier /rpt_unit in December 2005


Qualifier       /satellite=
Definition      identifier for a satellite DNA marker, composed of many tandem
                repeats (identical or related) of a short basic repeated unit
Value format    "<satellite_type>[:<class>][ <identifier>]"
                where satellite_type is one of the following 
                    "satellite", "microsatellite", "minisatellite"
Example         /satellite="satellite: S1a"
                /satellite="satellite: alpha"
                /satellite="satellite: gamma III"
                /satellite="microsatellite: DC130"
Comment         many satellites have base composition or other properties
                that differ from those of the rest of the genome, which allows
                them to be identified.
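
The value format above can be checked with a small sketch such as the one
below. It only verifies the leading satellite type and returns the rest of
the value unchanged; it does not try to separate <class> from <identifier>,
since the examples do not make that split unambiguous. The name
split_satellite is hypothetical.

```python
import re
from typing import Optional, Tuple

# Leading type, then optionally a ':' or whitespace separator and free text.
_SATELLITE = re.compile(
    r"(microsatellite|minisatellite|satellite)(?:(?:\s*:\s*|\s+)(.+))?"
)


def split_satellite(value: str) -> Optional[Tuple[str, str]]:
    """Return (satellite_type, remainder), or None if the type is not allowed."""
    match = _SATELLITE.fullmatch(value)
    if match is None:
        return None
    return match.group(1), match.group(2) or ""


print(split_satellite("satellite: gamma III"))
print(split_satellite("microsatellite: DC130"))
print(split_satellite("megasatellite: X1"))
```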


Qualifier       /segment=
Definition      name of viral or phage segment sequenced
Value format    "text"
Example         /segment="6"


Qualifier       /serotype=
Definition      serological variety of a species characterized by its
                antigenic properties
Value format    "text"
Example         /serotype="B1"
Comment         used only with the source feature key;
                the Bacteriological Code recommends the use of the
                term 'serovar' instead of 'serotype' for the 
                prokaryotes; see the International Code of Nomenclature
                of Bacteria (1990 Revision) Appendix 10.B "Infraspecific
                Terms".


Qualifier       /serovar=
Definition      serological variety of a species (usually a prokaryote)
                characterized by its antigenic properties
Value format    "text"
Example         /serovar="O157:H7"
Comment         used only with the source feature key;
                the Bacteriological Code recommends the use of the
                term 'serovar' instead of 'serotype' for prokaryotes;
                see the International Code of Nomenclature of Bacteria
                (1990 Revision) Appendix 10.B "Infraspecific Terms".


Qualifier       /sex=
Definition      sex of the organism from which the sequence was obtained;
                sex is used for eukaryotic organisms that undergo meiosis
                and have sexually dimorphic gametes
Value format    "text"
Examples        /sex="female"
                /sex="male"
                /sex="hermaphrodite"
                /sex="unisexual"
                /sex="bisexual"
                /sex="asexual"
                /sex="monoecious" [or monecious]
                /sex="dioecious" [or diecious]
Comment         /sex should be used (instead of /mating_type)
                in the Metazoa, Embryophyta, Rhodophyta & Phaeophyceae;
                /mating_type should be used (instead of /sex)
                in the Bacteria, Archaea & Fungi;
                neither /sex nor /mating_type should be used
                in the viruses;
                outside of the taxa listed above, /mating_type
                should be used unless the value of the qualifier
                is taken from the vocabulary given in the examples
                above


Qualifier       /specimen_voucher=
Definition      identifier for the specimen from which the nucleic acid
                sequenced was obtained
Value format    "[<institution-code>:[<collection-code>:]]<specimen_id>"
Example         /specimen_voucher="UAM:Mamm:52179"
                /specimen_voucher="AMCC:101706"
                /specimen_voucher="USNM:field series 8798"
                /specimen_voucher="personal collection:Dan Janzen:99-SRNP-2003"
                /specimen_voucher="99-SRNP-2003"
Comment         the /specimen_voucher qualifier is intended to annotate a
                reference to the physical specimen that remains after the
                sequence has been obtained;
                if the specimen was destroyed in the process of sequencing,
                electronic images (e-vouchers) are an adequate substitute for a
                physical voucher specimen; ideally the specimens will be
                deposited in a curated museum, herbarium, or frozen tissue
                collection, but often they will remain in a personal or
                laboratory collection for some time before they are deposited in
                a curated collection;
                there are three forms of specimen_voucher qualifiers; if the
                text of the qualifier includes one or more colons it is a
                'structured voucher'; structured vouchers include
                institution-codes (and optional collection-codes) taken from a
                controlled vocabulary that denotes the museum or herbarium
                collection where the specimen resides;
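
The three shapes permitted by the value format can be separated with a
sketch like the following. It splits on colons only and does not check the
institution or collection codes against any controlled vocabulary; the names
Voucher and parse_specimen_voucher are hypothetical.

```python
from typing import NamedTuple, Optional


class Voucher(NamedTuple):
    institution_code: Optional[str]
    collection_code: Optional[str]
    specimen_id: str


def parse_specimen_voucher(value: str) -> Voucher:
    """Split a /specimen_voucher value into its optional codes and specimen id."""
    # At most two splits, so any further colons stay in the specimen id.
    parts = value.split(":", 2)
    if len(parts) == 3:
        return Voucher(parts[0], parts[1], parts[2])
    if len(parts) == 2:
        return Voucher(parts[0], None, parts[1])
    return Voucher(None, None, parts[0])


print(parse_specimen_voucher("UAM:Mamm:52179"))
print(parse_specimen_voucher("AMCC:101706"))
print(parse_specimen_voucher("99-SRNP-2003"))
```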


Qualifier       /standard_name=
Definition      accepted standard name for this feature
Value format    "text"
Example         /standard_name="dotted"
Comment         use /standard_name to give full gene name, but use /gene to
                give gene symbol (in the above example /gene="Dt").


Qualifier       /strain=
Definition      strain from which sequence was obtained
Value format    "text"
Example         /strain="BALB/c"
Comment         entries including /strain must not include
                the /environmental_sample qualifier


Qualifier       /sub_clone=
Definition      sub-clone from which sequence was obtained
Value format    "text"
Example         /sub_clone="lambda-hIL7.20g"
Comment         the comments on /clone apply to /sub_clone


Qualifier       /sub_species=
Definition      name of sub-species of organism from which sequence was
                obtained
Value format    "text"
Example         /sub_species="lactis"


Qualifier       /sub_strain=
Definition      name or identifier of a genetically or otherwise modified 
                strain from which sequence was obtained, derived from a 
                parental strain (which should be annotated in the /strain 
                qualifier)
Value format    "text"
Example         /sub_strain="abis"
Comment         If the parental strain is not given, the modified strain should
                be annotated in the /strain qualifier instead of /sub_strain.
                Either:
                /strain="K-12"
                /sub_strain="MG1655"
                or:
                /strain="MG1655"


Qualifier       /tag_peptide=
Definition      base location encoding the polypeptide for proteolysis tag of 
                tmRNA and its termination codon;
Value format    <base_range>
Example         /tag_peptide=90..122
Comment         it is recommended that the amino acid sequence corresponding
                to the /tag_peptide be annotated by describing a 5' partial 
                CDS feature; e.g. CDS    <90..122;
                the /tag_peptide qualifier (and tmRNA feature) became
                valid on 15-Dec-2007


Qualifier       /tissue_lib=
Definition      tissue library from which sequence was obtained
Value format    "text"
Example         /tissue_lib="tissue library 772"


Qualifier       /tissue_type=
Definition      tissue type from which the sequence was obtained
Value format    "text"
Example         /tissue_type="liver"


Qualifier       /transgenic
Definition      identifies the source feature of the organism which was 
                the recipient of transgenic DNA.
Value format    none
Example         /transgenic
Comment         transgenic sequences must have at least two source feature keys; 
                the source feature key having the /transgenic qualifier must 
                span the whole sequence; the source feature carrying the 
                /transgenic qualifier identifies the main organism of the entry, 
                this determines: a) the name displayed in the organism lines, 
                b) if no translation table is specified, the translation table;
                only one source feature with /transgenic is allowed in an entry; 
                the /focus and /transgenic qualifiers are mutually exclusive in 
                an entry.


Qualifier       /translation=
Definition      automatically generated one-letter abbreviated amino acid
                sequence derived from either the universal genetic code or the
                table as specified in /transl_table and as determined by
                exceptions in the /transl_except and /codon qualifiers
Value format    IUPAC one-letter amino acid abbreviation, "X" is to be used
                for AA exceptions.
Example         /translation="MASTFPPWYRGCASTPSLKGLIMCTW"
Comment         to be used with CDS feature only; this is a mandatory qualifier 
                in the CDS feature key except where /pseudo is shown;
                see /transl_table for definition and location of genetic code
                tables.
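
How a conceptual translation is derived can be sketched as follows. This is
an illustration only, not the databases' own procedure: it uses the standard
genetic code (table 1), starts at the first base, stops at the first stop
codon, writes "X" for any codon it cannot resolve, and ignores
/transl_except, /codon and /codon_start. All names are hypothetical.

```python
BASES = "tcag"
# Standard genetic code (table 1); codons ordered with bases in t, c, a, g order.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence up to, but not including, the first stop."""
    cds = cds.lower().replace("u", "t")
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE.get(cds[pos:pos + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("atggcctggtaa"))  # MAW
```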


Qualifier       /transl_except=
Definition      translational exception: single codon the translation of which
                does not conform to genetic code defined by Organism and /codon=
Value format    (pos:location,aa:<amino_acid>) where amino_acid is the
                amino acid coded by the codon at the base_range position
Example         /transl_except=(pos:213..215,aa:Trp)
                /transl_except=(pos:1017,aa:TERM)
                /transl_except=(pos:2000..2001,aa:TERM)
                /transl_except=(pos:X22222:15..17,aa:Ala)
Comment         if the amino acid is not on the restricted vocabulary list use
                e.g., '/transl_except=(pos:213..215,aa:OTHER)' with
                '/note="name of unusual amino acid"';
                for modified amino-acid selenocysteine use three letter code
                'Sec'  (one letter code 'U' in amino-acid sequence)
                /transl_except=(pos:1002..1004,aa:Sec);
                for partial termination codons where TAA stop codon is
                completed by the addition of 3' A residues to the mRNA
                either a single base_position or a base_range is used, e.g.
                if partial stop codon is a single base:
                /transl_except=(pos:1017,aa:TERM)
                if partial stop codon consists of two bases:
                /transl_except=(pos:2000..2001,aa:TERM) with
                '/note="stop codon completed by the addition of 3' A residues 
                to the mRNA"'.
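
The value format can be read with a sketch like the one below. It assumes
the value is the text after "/transl_except=" and handles only the forms
shown in the examples: a single position or a range, optionally prefixed by
an accession. The names TranslExcept and parse_transl_except are
hypothetical.

```python
import re
from typing import NamedTuple, Optional


class TranslExcept(NamedTuple):
    accession: Optional[str]
    start: int
    end: int
    amino_acid: str


_PATTERN = re.compile(
    r"\(pos:(?:(?P<acc>[^:,]+):)?(?P<start>\d+)(?:\.\.(?P<end>\d+))?,"
    r"aa:(?P<aa>[A-Za-z]+)\)"
)


def parse_transl_except(value: str) -> TranslExcept:
    """Parse '(pos:<location>,aa:<amino_acid>)' into its components."""
    match = _PATTERN.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"unsupported /transl_except value: {value!r}")
    start = int(match["start"])
    # A single base position is treated as a range of length one.
    end = int(match["end"]) if match["end"] else start
    return TranslExcept(match["acc"], start, end, match["aa"])


print(parse_transl_except("(pos:213..215,aa:Trp)"))
print(parse_transl_except("(pos:1017,aa:TERM)"))
print(parse_transl_except("(pos:X22222:15..17,aa:Ala)"))
```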


Qualifier       /transl_table=
Definition      definition of genetic code table used if other than universal
                genetic code table. Tables used are described in appendix V,
                section 7.5.5.
Value format    <integer>; 1=universal table 1; 2=non-universal table 2; ...
Example         /transl_table=4
Comment         genetic code exceptions outside range of specified tables are
                reported in /codon or /transl_except qualifiers.


Qualifier       /trans_splicing 
Definition      indicates that exons from two RNA molecules are ligated in an
                intermolecular reaction to form mature RNA 
Value format    none 
Example         /trans_splicing 
Comment         should be used on features such as CDS, mRNA and other features
                that are produced as a result of a trans-splicing event. This
                qualifier should be used only when the splice event is indicated in
                the "join" operator, e.g. join(complement(69611..69724),139856..140087) 


Qualifier       /variety=
Definition      variety (= varietas, a formal Linnaean rank) of organism 
                from which sequence was derived.
Value format    "text"
Example         /variety="insularis"
Comment         use the cultivar qualifier for cultivated plant 
                varieties, i.e., products of artificial selection;
                varieties other than plant and fungal varietas should be
                annotated via /note, e.g. /note="breed:Cukorova"



7.4.2 Feature qualifiers - mapped to Feature keys

The following is a list of available qualifiers mapped to the list of feature keys on which each qualifier is legal.
QUALIFIER FEATURE KEY
/allele  -10_signal
/allele  -35_signal
/allele  3'UTR
/allele  5'UTR
/allele  attenuator
/allele  C_region
/allele  CAAT_signal
/allele  CDS
/allele  conflict
/allele  D_segment
/allele  D-loop
/allele  enhancer
/allele  exon
/allele  GC_signal
/allele  gene
/allele  iDNA
/allele  intron
/allele  J_segment
/allele  LTR
/allele  mat_peptide
/allele  misc_binding
/allele  misc_difference
/allele  misc_feature
/allele  misc_recomb
/allele  misc_RNA
/allele  misc_signal
/allele  misc_structure
/allele  modified_base
/allele  mRNA
/allele  N_region
/allele  old_sequence
/allele  operon
/allele  oriT
/allele  polyA_signal
/allele  polyA_site
/allele  precursor_RNA
/allele  prim_transcript
/allele  primer_bind
/allele  promoter
/allele  protein_bind
/allele  RBS
/allele  rep_origin
/allele  repeat_region
/allele  rRNA
/allele  S_region
/allele  sig_peptide
/allele  stem_loop
/allele  STS
/allele  TATA_signal
/allele  terminator
/allele  transit_peptide
/allele  tRNA
/allele  unsure
/allele  V_region
/allele  V_segment
/allele  variation
/anticodon  tRNA
/bio_material  source
/bound_moiety  enhancer
/bound_moiety  misc_binding
/bound_moiety  oriT
/bound_moiety  promoter
/bound_moiety  protein_bind
/cell_line  source
/cell_type  source
/chromosome  source
/citation  -10_signal
/citation  -35_signal
/citation  3'UTR
/citation  5'UTR
/citation  attenuator
/citation  C_region
/citation  CAAT_signal
/citation  CDS
/citation  conflict
/citation  D_segment
/citation  D-loop
/citation  enhancer
/citation  exon
/citation  GC_signal
/citation  gene
/citation  iDNA
/citation  intron
/citation  J_segment
/citation  LTR
/citation  mat_peptide
/citation  misc_binding
/citation  misc_difference
/citation  misc_feature
/citation  misc_recomb
/citation  misc_RNA
/citation  misc_signal
/citation  misc_structure
/citation  modified_base
/citation  mRNA
/citation  N_region
/citation  old_sequence
/citation  operon
/citation  oriT
/citation  polyA_signal
/citation  polyA_site
/citation  precursor_RNA
/citation  prim_transcript
/citation  primer_bind
/citation  promoter
/citation  protein_bind
/citation  RBS
/citation  rep_origin
/citation  repeat_region
/citation  rRNA
/citation  S_region
/citation  sig_peptide
/citation  source
/citation  stem_loop
/citation  STS
/citation  TATA_signal
/citation  terminator
/citation  transit_peptide
/citation  tRNA
/citation  unsure
/citation  V_region
/citation  V_segment
/citation  variation
/clone  misc_difference
/clone  source
/clone_lib  source
/codon  CDS
/codon_start  CDS
/collected_by  source
/collection_date  source
/compare  conflict
/compare  misc_difference
/compare  old_sequence
/compare  variation
/compare  unsure
/country  source
/cultivar  source
/culture_collection  source
/db_xref  -10_signal
/db_xref  -35_signal
/db_xref  3'UTR
/db_xref  5'UTR
/db_xref  attenuator
/db_xref  C_region
/db_xref  CAAT_signal
/db_xref  CDS
/db_xref  conflict
/db_xref  D_segment
/db_xref  D-loop
/db_xref  enhancer
/db_xref  exon
/db_xref  GC_signal
/db_xref  gene
/db_xref  iDNA
/db_xref  intron
/db_xref  J_segment
/db_xref  LTR
/db_xref  mat_peptide
/db_xref  misc_binding
/db_xref  misc_difference
/db_xref  misc_feature
/db_xref  misc_recomb
/db_xref  misc_RNA
/db_xref  misc_signal
/db_xref  misc_structure
/db_xref  modified_base
/db_xref  mRNA
/db_xref  N_region
/db_xref  old_sequence
/db_xref  operon
/db_xref  oriT
/db_xref  polyA_signal
/db_xref  polyA_site
/db_xref  precursor_RNA
/db_xref  prim_transcript
/db_xref  primer_bind
/db_xref  promoter
/db_xref  protein_bind
/db_xref  RBS
/db_xref  rep_origin
/db_xref  repeat_region
/db_xref  rRNA
/db_xref  S_region
/db_xref  sig_peptide
/db_xref  source
/db_xref  stem_loop
/db_xref  STS
/db_xref  TATA_signal
/db_xref  terminator
/db_xref  transit_peptide
/db_xref  tRNA
/db_xref  unsure
/db_xref  V_region
/db_xref  V_segment
/db_xref  variation
/dev_stage  source
/direction  oriT
/direction  rep_origin
/EC_number  CDS
/EC_number  exon
/EC_number  mat_peptide
/ecotype  source
/environmental_sample  source
/estimated_length  gap
/exception  CDS
/experiment  -10_signal
/experiment  -35_signal
/experiment  3'UTR
/experiment  5'UTR
/experiment  attenuator
/experiment  C_region
/experiment  CAAT_signal
/experiment  CDS
/experiment  conflict
/experiment  D_segment
/experiment  D-loop
/experiment  enhancer
/experiment  exon
/experiment  GC_signal
/experiment  gene
/experiment  iDNA
/experiment  intron
/experiment  J_segment
/experiment  LTR
/experiment  mat_peptide
/experiment  misc_binding
/experiment  misc_difference
/experiment  misc_feature
/experiment  misc_recomb
/experiment  misc_RNA
/experiment  misc_signal
/experiment  misc_structure
/experiment  modified_base
/experiment  mRNA
/experiment  N_region
/experiment  old_sequence
/experiment  operon
/experiment  oriT
/experiment  polyA_signal
/experiment  polyA_site
/experiment  precursor_RNA
/experiment  prim_transcript
/experiment  primer_bind
/experiment  promoter
/experiment  protein_bind
/experiment  RBS
/experiment  rep_origin
/experiment  repeat_region
/experiment  rRNA
/experiment  S_region
/experiment  sig_peptide
/experiment  stem_loop
/experiment  STS
/experiment  TATA_signal
/experiment  terminator
/experiment  transit_peptide
/experiment  tRNA
/experiment  unsure
/experiment  V_region
/experiment  V_segment
/experiment  variation
/focus  source
/frequency  modified_base
/frequency  source
/frequency  variation
/function  3'UTR
/function  5'UTR
/function  CDS
/function  exon
/function  gene
/function  iDNA
/function  intron
/function  LTR
/function  mat_peptide
/function  misc_binding
/function  misc_feature
/function  misc_RNA
/function  misc_signal
/function  misc_structure
/function  mRNA
/function  operon
/function  precursor_RNA
/function  prim_transcript
/function  promoter
/function  protein_bind
/function  repeat_region
/function  rRNA
/function  sig_peptide
/function  stem_loop
/function  transit_peptide
/function  tRNA
/gene  -10_signal
/gene  -35_signal
/gene  3'UTR
/gene  5'UTR
/gene  attenuator
/gene  C_region
/gene  CAAT_signal
/gene  CDS
/gene  conflict
/gene  D_segment
/gene  D-loop
/gene  enhancer
/gene  exon
/gene  GC_signal
/gene  gene
/gene  iDNA
/gene  intron
/gene  J_segment
/gene  LTR
/gene  mat_peptide
/gene  misc_binding
/gene  misc_difference
/gene  misc_feature
/gene  misc_recomb
/gene  misc_RNA
/gene  misc_signal
/gene  misc_structure
/gene  modified_base
/gene  mRNA
/gene  N_region
/gene  old_sequence
/gene  oriT
/gene  polyA_signal
/gene  polyA_site
/gene  precursor_RNA
/gene  prim_transcript
/gene  primer_bind
/gene  promoter
/gene  protein_bind
/gene  RBS
/gene  rep_origin
/gene  repeat_region
/gene  rRNA
/gene  S_region
/gene  sig_peptide
/gene  stem_loop
/gene  STS
/gene  TATA_signal
/gene  terminator
/gene  transit_peptide
/gene  tRNA
/gene  unsure
/gene  V_region
/gene  V_segment
/gene  variation
/gene_synonym  -10_signal
/gene_synonym  -35_signal
/gene_synonym  3'UTR
/gene_synonym  5'UTR
/gene_synonym  attenuator
/gene_synonym  C_region
/gene_synonym  CAAT_signal
/gene_synonym  CDS
/gene_synonym  conflict
/gene_synonym  D_segment
/gene_synonym  D-loop
/gene_synonym  enhancer
/gene_synonym  exon
/gene_synonym  GC_signal
/gene_synonym  gene
/gene_synonym  iDNA
/gene_synonym  intron
/gene_synonym  J_segment
/gene_synonym  LTR
/gene_synonym  mat_peptide
/gene_synonym  misc_binding
/gene_synonym  misc_difference
/gene_synonym  misc_feature
/gene_synonym  misc_recomb
/gene_synonym  misc_RNA
/gene_synonym  misc_signal
/gene_synonym  misc_structure
/gene_synonym  modified_base
/gene_synonym  mRNA
/gene_synonym  N_region
/gene_synonym  old_sequence
/gene_synonym  oriT
/gene_synonym  polyA_signal
/gene_synonym  polyA_site
/gene_synonym  precursor_RNA
/gene_synonym  prim_transcript
/gene_synonym  primer_bind
/gene_synonym  promoter
/gene_synonym  protein_bind
/gene_synonym  RBS
/gene_synonym  rep_origin
/gene_synonym  repeat_region
/gene_synonym  rRNA
/gene_synonym  S_region
/gene_synonym  sig_peptide
/gene_synonym  stem_loop
/gene_synonym  STS
/gene_synonym  TATA_signal
/gene_synonym  terminator
/gene_synonym  transit_peptide
/gene_synonym  tRNA
/gene_synonym  unsure
/gene_synonym  V_region
/gene_synonym  V_segment
/gene_synonym  variation
/germline  source
/haplotype  source
/host  source
/identified_by  source
/inference  -10_signal
/inference  -35_signal
/inference  3'UTR
/inference  5'UTR
/inference  attenuator
/inference  C_region
/inference  CAAT_signal
/inference  CDS
/inference  conflict
/inference  D_segment
/inference  D-loop
/inference  enhancer
/inference  exon
/inference  GC_signal
/inference  gene
/inference  iDNA
/inference  intron
/inference  J_segment
/inference  LTR
/inference  mat_peptide
/inference  misc_binding
/inference  misc_difference
/inference  misc_feature
/inference  misc_recomb
/inference  misc_RNA
/inference  misc_signal
/inference  misc_structure
/inference  modified_base
/inference  mRNA
/inference  N_region
/inference  old_sequence
/inference  operon
/inference  oriT
/inference  polyA_signal
/inference  polyA_site
/inference  precursor_RNA
/inference  prim_transcript
/inference  primer_bind
/inference  promoter
/inference  protein_bind
/inference  RBS
/inference  rep_origin
/inference  repeat_region
/inference  rRNA
/inference  S_region
/inference  sig_peptide
/inference  stem_loop
/inference  STS
/inference  TATA_signal
/inference  terminator
/inference  transit_peptide
/inference  tRNA
/inference  unsure
/inference  V_region
/inference  V_segment
/inference  variation
/isolate  source
/isolation_source  source
/lab_host  source
/label  -10_signal
/label  -35_signal
/label  3'UTR
/label  5'UTR
/label  attenuator
/label  C_region
/label  CAAT_signal
/label  CDS
/label  conflict
/label  D_segment
/label  D-loop
/label  enhancer
/label  exon
/label  GC_signal
/label  gene
/label  iDNA
/label  intron
/label  J_segment
/label  LTR
/label  mat_peptide
/label  misc_binding
/label  misc_difference
/label  misc_feature
/label  misc_recomb
/label  misc_RNA
/label  misc_signal
/label  misc_structure
/label  modified_base
/label  mRNA
/label  N_region
/label  old_sequence
/label  operon
/label  oriT
/label  polyA_signal
/label  polyA_site
/label  precursor_RNA
/label  prim_transcript
/label  primer_bind
/label  promoter
/label  protein_bind
/label  RBS
/label  rep_origin
/label  repeat_region
/label  rRNA
/label  S_region
/label  sig_peptide
/label  source
/label  stem_loop
/label  STS
/label  TATA_signal
/label  terminator
/label  transit_peptide
/label  tRNA
/label  unsure
/label  V_region
/label  V_segment
/label  variation
/lat_lon  source
/locus_tag  -10_signal
/locus_tag  -35_signal
/locus_tag  3'UTR
/locus_tag  5'UTR
/locus_tag  attenuator
/locus_tag  C_region
/locus_tag  CAAT_signal
/locus_tag  CDS
/locus_tag  conflict
/locus_tag  D_segment
/locus_tag  D-loop
/locus_tag  enhancer
/locus_tag  exon
/locus_tag  GC_signal
/locus_tag  gene
/locus_tag  iDNA
/locus_tag  intron
/locus_tag  J_segment
/locus_tag  LTR
/locus_tag  mat_peptide
/locus_tag  misc_binding
/locus_tag  misc_difference
/locus_tag  misc_feature
/locus_tag  misc_recomb
/locus_tag  misc_RNA
/locus_tag  misc_signal
/locus_tag  misc_structure
/locus_tag  modified_base
/locus_tag  mRNA
/locus_tag  N_region
/locus_tag  old_sequence
/locus_tag  oriT
/locus_tag  polyA_signal
/locus_tag  polyA_site
/locus_tag  precursor_RNA
/locus_tag  prim_transcript
/locus_tag  primer_bind
/locus_tag  promoter
/locus_tag  protein_bind
/locus_tag  RBS
/locus_tag  rep_origin
/locus_tag  repeat_region
/locus_tag  rRNA
/locus_tag  S_region
/locus_tag  sig_peptide
/locus_tag  stem_loop
/locus_tag  STS
/locus_tag  TATA_signal
/locus_tag  terminator
/locus_tag  transit_peptide
/locus_tag  tRNA
/locus_tag  unsure
/locus_tag  V_region
/locus_tag  V_segment
/locus_tag  variation
/macronuclear  source
/map  -10_signal
/map  -35_signal
/map  3'UTR
/map  5'UTR
/map  attenuator
/map  C_region
/map  CAAT_signal
/map  CDS
/map  conflict
/map  D_segment
/map  D-loop
/map  enhancer
/map  exon
/map  GC_signal
/map  gap
/map  gene
/map  iDNA
/map  intron
/map  J_segment
/map  LTR
/map  mat_peptide
/map  misc_binding
/map  misc_difference
/map  misc_feature
/map  misc_recomb
/map  misc_RNA
/map  misc_signal
/map  misc_structure
/map  modified_base
/map  mRNA
/map  N_region
/map  old_sequence
/map  operon
/map  oriT
/map  polyA_signal
/map  polyA_site
/map  precursor_RNA
/map  prim_transcript
/map  primer_bind
/map  promoter
/map  protein_bind
/map  RBS
/map  rep_origin
/map  repeat_region
/map  rRNA
/map  S_region
/map  sig_peptide
/map  source
/map  stem_loop
/map  STS
/map  TATA_signal
/map  terminator
/map  transit_peptide
/map  tRNA
/map  unsure
/map  V_region
/map  V_segment
/map  variation
/mating_type  source
/mobile_element  repeat_region
/mod_base  modified_base
/mol_type  source
/ncRNA_class  ncRNA
/note  -10_signal
/note  -35_signal
/note  3'UTR
/note  5'UTR
/note  attenuator
/note  C_region
/note  CAAT_signal
/note  CDS
/note  conflict
/note  D_segment
/note  D-loop
/note  enhancer
/note  exon
/note  GC_signal
/note  gap
/note  gene
/note  iDNA
/note  intron
/note  J_segment
/note  LTR
/note  mat_peptide
/note  misc_binding
/note  misc_difference
/note  misc_feature
/note  misc_recomb
/note  misc_RNA
/note  misc_signal
/note  misc_structure
/note  modified_base
/note  mRNA
/note  N_region
/note  old_sequence
/note  operon
/note  oriT
/note  polyA_signal
/note  polyA_site
/note  precursor_RNA
/note  prim_transcript
/note  primer_bind
/note  promoter
/note  protein_bind
/note  RBS
/note  rep_origin
/note  repeat_region
/note  rRNA
/note  S_region
/note  sig_peptide
/note  source
/note  stem_loop
/note  STS
/note  TATA_signal
/note  terminator
/note  transit_peptide
/note  tRNA
/note  unsure
/note  V_region
/note  V_segment
/note  variation
/number  CDS
/number  exon
/number  iDNA
/number  intron
/number  misc_feature
/old_locus_tag  -10_signal
/old_locus_tag  -35_signal
/old_locus_tag  3'UTR
/old_locus_tag  5'UTR
/old_locus_tag  attenuator
/old_locus_tag  C_region
/old_locus_tag  CAAT_signal
/old_locus_tag  CDS
/old_locus_tag  conflict
/old_locus_tag  D_segment
/old_locus_tag  D-loop
/old_locus_tag  enhancer
/old_locus_tag  exon
/old_locus_tag  GC_signal
/old_locus_tag  gene
/old_locus_tag  iDNA
/old_locus_tag  intron
/old_locus_tag  J_segment
/old_locus_tag  LTR
/old_locus_tag  mat_peptide
/old_locus_tag  misc_binding
/old_locus_tag  misc_difference
/old_locus_tag  misc_feature
/old_locus_tag  misc_recomb
/old_locus_tag  misc_RNA
/old_locus_tag  misc_signal
/old_locus_tag  misc_structure
/old_locus_tag  modified_base
/old_locus_tag  mRNA
/old_locus_tag  N_region
/old_locus_tag  old_sequence
/old_locus_tag  oriT
/old_locus_tag  polyA_signal
/old_locus_tag  polyA_site
/old_locus_tag  precursor_RNA
/old_locus_tag  prim_transcript
/old_locus_tag  primer_bind
/old_locus_tag  promoter
/old_locus_tag  protein_bind
/old_locus_tag  RBS
/old_locus_tag  rep_origin
/old_locus_tag  repeat_region
/old_locus_tag  rRNA
/old_locus_tag  S_region
/old_locus_tag  sig_peptide
/old_locus_tag  stem_loop
/old_locus_tag  STS
/old_locus_tag  TATA_signal
/old_locus_tag  terminator
/old_locus_tag  transit_peptide
/old_locus_tag  tRNA
/old_locus_tag  unsure
/old_locus_tag  V_region
/old_locus_tag  V_segment
/old_locus_tag  variation
/operon  -10_signal
/operon  -35_signal
/operon  attenuator
/operon  CDS
/operon  gene
/operon  misc_RNA
/operon  misc_signal
/operon  mRNA
/operon  operon
/operon  precursor_RNA
/operon  prim_transcript
/operon  promoter
/operon  protein_bind
/operon  rRNA
/operon  stem_loop
/operon  terminator
/organelle  source
/organism  source
/PCR_conditions  primer_bind
/PCR_primers  source
/phenotype  attenuator
/phenotype  gene
/phenotype  misc_difference
/phenotype  misc_feature
/phenotype  misc_signal
/phenotype  operon
/phenotype  promoter
/phenotype  variation
/plasmid  source
/pop_variant  source
/product  C_region
/product  CDS
/product  D_segment
/product  exon
/product  gene
/product  J_segment
/product  mat_peptide
/product  misc_feature
/product  misc_RNA
/product  mRNA
/product  N_region
/product  precursor_RNA
/product  rRNA
/product  S_region
/product  sig_peptide
/product  transit_peptide
/product  tRNA
/product  V_region
/product  V_segment
/product  variation
/protein_id  CDS
/proviral  source
/pseudo  C_region
/pseudo  CDS
/pseudo  D_segment
/pseudo  exon
/pseudo  gene
/pseudo  intron
/pseudo  J_segment
/pseudo  mat_peptide
/pseudo  misc_feature
/pseudo  misc_RNA
/pseudo  mRNA
/pseudo  N_region
/pseudo  operon
/pseudo  promoter
/pseudo  rRNA
/pseudo  S_region
/pseudo  sig_peptide
/pseudo  transit_peptide
/pseudo  tRNA
/pseudo  V_region
/pseudo  V_segment
/rearranged  source
/replace  conflict
/replace  misc_difference
/replace  old_sequence
/replace  unsure
/replace  variation
/ribosomal_slippage  CDS
/rpt_family  oriT
/rpt_family  repeat_region
/rpt_type  oriT
/rpt_type  repeat_region
/rpt_unit_range  oriT
/rpt_unit_range  repeat_region
/rpt_unit_seq  oriT
/rpt_unit_seq  repeat_region
/satellite  repeat_region
/segment  source
/serotype  source
/serovar  source
/sex  source
/specimen_voucher  source
/standard_name  -10_signal
/standard_name  -35_signal
/standard_name  3'UTR
/standard_name  5'UTR
/standard_name  C_region
/standard_name  CDS
/standard_name  D_segment
/standard_name  enhancer
/standard_name  exon
/standard_name  gene
/standard_name  iDNA
/standard_name  intron
/standard_name  J_segment
/standard_name  LTR
/standard_name  mat_peptide
/standard_name  misc_difference
/standard_name  misc_feature
/standard_name  misc_recomb
/standard_name  misc_RNA
/standard_name  misc_signal
/standard_name  misc_structure
/standard_name  mRNA
/standard_name  N_region
/standard_name  operon
/standard_name  oriT
/standard_name  precursor_RNA
/standard_name  prim_transcript
/standard_name  primer_bind
/standard_name  promoter
/standard_name  protein_bind
/standard_name  RBS
/standard_name  rep_origin
/standard_name  repeat_region
/standard_name  rRNA
/standard_name  S_region
/standard_name  sig_peptide
/standard_name  stem_loop
/standard_name  STS
/standard_name  terminator
/standard_name  transit_peptide
/standard_name  tRNA
/standard_name  V_region
/standard_name  V_segment
/standard_name  variation
/strain  source
/sub_clone  source
/sub_species  source
/sub_strain  source
/tag_peptide  tmRNA
/tissue_lib  source
/tissue_type  source
/transgenic  source
/transl_except  CDS
/transl_table  CDS
/translation  CDS
/trans_splicing  CDS
/trans_splicing  gene
/trans_splicing  misc_RNA
/trans_splicing  mRNA
/trans_splicing  precursor_RNA
/trans_splicing  tRNA
/trans_splicing  3'UTR
/trans_splicing  5'UTR
/variety  source
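
The two-column listing above pairs each qualifier with one feature key that
accepts it, so it can be loaded into a simple lookup structure. The Python
sketch below is illustrative only: the function name build_qualifier_map and
the choice of sample lines are assumptions made for this example and are not
part of the Feature Table definition.

```python
from collections import defaultdict


def build_qualifier_map(lines):
    """Map each qualifier to the set of feature keys listed for it."""
    mapping = defaultdict(set)
    for line in lines:
        parts = line.split()
        # Expect exactly "<qualifier> <feature key>"; skip anything else.
        if len(parts) != 2 or not parts[0].startswith("/"):
            continue
        qualifier, key = parts
        mapping[qualifier].add(key)
    return dict(mapping)


# A few lines copied from the listing above.
sample = [
    "/replace  conflict",
    "/replace  misc_difference",
    "/replace  old_sequence",
    "/replace  unsure",
    "/replace  variation",
    "/ribosomal_slippage  CDS",
    "/tag_peptide  tmRNA",
]

qualifier_map = build_qualifier_map(sample)
print(sorted(qualifier_map["/replace"]))
```

A check such as `"CDS" in qualifier_map["/ribosomal_slippage"]` then answers
whether a qualifier is listed for a given feature key.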
 
7.5 Appendix V: Controlled vocabularies

This appendix contains information on the restricted vocabulary fields used in 
the Feature Table. The information contained in this appendix is subject to 
change; please contact the database staff for the most recent information 
concerning controlled vocabularies. This appendix is organized as follows: 

Authority       The organization with authority to define the vocabulary
Reference       Publications of (or about) the vocabulary
Contact         Name of database staff responsible for maintaining 
                the database copy of the vocabulary
Scope           Feature Table qualifiers which take members of this vocabulary 
                as values
Listing         A listing of the current vocabulary with definitions or
                explanations

This appendix includes reference lists for the following controlled vocabulary 
fields: 
- Nucleotide base codes (IUPAC)
- Modified base abbreviations 
- Amino acid abbreviations 
- Modified and unusual Amino Acids 
- Genetic Code Tables 
- Country Names
 
7.5.1 Nucleotide base codes (IUPAC)

Authority       Nomenclature Committee of the International Union of 
                Biochemistry 
Reference       Cornish-Bowden, A.  Nucl Acid Res 13, 3021-3030 (1985)
Contact         EMBL-EBI
Scope           Location descriptors 

Listing

        Symbol  Meaning
        ------  -------

        a       a; adenine
        c       c; cytosine
        g       g; guanine
        t       t; thymine in DNA; uracil in RNA
        m       a or c
        r       a or g
        w       a or t
        s       c or g
        y       c or t
        k       g or t
        v       a or c or g; not t
        h       a or c or t; not g
        d       a or g or t; not c
        b       c or g or t; not a
        n       a or c or g or t
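
The symbol table above can be read as a mapping from each code to the set of
bases it stands for. The following Python sketch shows one way to use it; the
names IUPAC_BASES, matches and expand are assumptions introduced for this
example, and lower-case symbols are used as in the listing.

```python
from itertools import product

# Each IUPAC symbol and the bases it may represent (from the listing above).
IUPAC_BASES = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "m": "ac", "r": "ag", "w": "at", "s": "cg", "y": "ct", "k": "gt",
    "v": "acg", "h": "act", "d": "agt", "b": "cgt",
    "n": "acgt",
}


def matches(symbol, base):
    """Return True if the IUPAC symbol can stand for the given base."""
    return base.lower() in IUPAC_BASES[symbol.lower()]


def expand(symbols):
    """List every unambiguous sequence covered by a string of IUPAC symbols."""
    choices = [IUPAC_BASES[s] for s in symbols.lower()]
    return ["".join(combo) for combo in product(*choices)]


print(expand("ry"))
```

Note that expand grows exponentially with the number of ambiguous symbols, so
it is only practical for short strings.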


7.5.2 Modified base abbreviations

Authority       Sprinzl, M. and Gauss, D.H.
Reference       Sprinzl, M. and Gauss, D.H.  Nucl Acid Res  10, r1 (1982).
                (note that in Cornish-Bowden, A.  Nucl Acid Res  13, 3021-3030
                (1985) the IUPAC-IUB declined to recommend a set of
                abbreviations for modified nucleotides)
Contact         NCBI
Scope           /mod_base

        Abbreviation    Modified base description
        ------------    -------------------------
        ac4c            4-acetylcytidine
        chm5u           5-(carboxyhydroxylmethyl)uridine
        cm              2'-O-methylcytidine
        cmnm5s2u        5-carboxymethylaminomethyl-2-thiouridine
        cmnm5u          5-carboxymethylaminomethyluridine
        d               dihydrouridine
        fm              2'-O-methylpseudouridine
        gal q           beta,D-galactosylqueuosine
        gm              2'-O-methylguanosine
        i               inosine
        i6a             N6-isopentenyladenosine
        m1a             1-methyladenosine
        m1f             1-methylpseudouridine
        m1g             1-methylguanosine
        m1i             1-methylinosine
        m22g            2,2-dimethylguanosine
        m2a             2-methyladenosine
        m2g             2-methylguanosine
        m3c             3-methylcytidine
        m5c             5-methylcytidine
        m6a             N6-methyladenosine
        m7g             7-methylguanosine
        mam5u           5-methylaminomethyluridine
        mam5s2u         5-methoxyaminomethyl-2-thiouridine
        man q           beta,D-mannosylqueuosine
        mcm5s2u         5-methoxycarbonylmethyl-2-thiouridine
        mcm5u           5-methoxycarbonylmethyluridine
        mo5u            5-methoxyuridine
        ms2i6a          2-methylthio-N6-isopentenyladenosine
        ms2t6a          N-((9-beta-D-ribofuranosyl-2-methylthiopurine-6-yl)
                        carbamoyl)threonine
        mt6a            N-((9-beta-D-ribofuranosylpurine-6-yl)N-methyl-
                        carbamoyl)threonine
        mv              uridine-5-oxyacetic acid-methylester
        o5u             uridine-5-oxyacetic acid (v)
        osyw            wybutoxosine
        p               pseudouridine
        q               queuosine
        s2c             2-thiocytidine
        s2t             5-methyl-2-thiouridine
        s2u             2-thiouridine
        s4u             4-thiouridine
        t               5-methyluridine
        t6a             N-((9-beta-D-ribofuranosylpurine-6-yl)carbamoyl)
                        threonine
        tm              2'-O-methyl-5-methyluridine
        um              2'-O-methyluridine
        yw              wybutosine
        x               3-(3-amino-3-carboxypropyl)uridine, (acp3)u
        OTHER           (requires /note= qualifier)


7.5.3 Amino acid abbreviations

Authority       IUPAC-IUB Joint Commission on Biochemical Nomenclature.
Reference       IUPAC-IUB Joint Commission on Biochemical Nomenclature.
                Nomenclature and Symbolism for Amino Acids and Peptides.
                Eur. J. Biochem. 138:9-37(1984).
                IUPAC-IUBMB JCBN Newsletter, 1999
                http://www.chem.qmul.ac.uk/iubmb/newsletter/1999/item3.html
Scope           /anticodon, /codon, /transl_except
Contact         EMBL-EBI

Listing (note that the abbreviations are legal values for amino acids, not the full names)
        Abbreviation    Amino acid name
        ------------    ---------------

        Ala     A       Alanine
        Arg     R       Arginine
        Asn     N       Asparagine
        Asp     D       Aspartic acid (Aspartate)
        Cys     C       Cysteine
        Gln     Q       Glutamine
        Glu     E       Glutamic acid (Glutamate)
        Gly     G       Glycine
        His     H       Histidine
        Ile     I       Isoleucine
        Leu     L       Leucine
        Lys     K       Lysine
        Met     M       Methionine
        Phe     F       Phenylalanine
        Pro     P       Proline
        Pyl     O       Pyrrolysine
        Ser     S       Serine
        Sec     U       Selenocysteine
        Thr     T       Threonine
        Trp     W       Tryptophan
        Tyr     Y       Tyrosine
        Val     V       Valine
        Asx     B       Aspartic acid or Asparagine
        Glx     Z       Glutamine or Glutamic acid
        Xaa     X       Any amino acid
        Xle     J       Leucine or Isoleucine
        TERM            termination codon
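
As noted above, only the abbreviations are legal values, not the full amino
acid names. A minimal Python sketch of such a check follows; the names
THREE_TO_ONE and is_legal_value are assumptions made for this example, and the
sketch accepts only the three-letter abbreviations and TERM.

```python
# Three-letter abbreviation -> one-letter symbol, from the listing above.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Pyl": "O", "Ser": "S", "Sec": "U", "Thr": "T", "Trp": "W",
    "Tyr": "Y", "Val": "V", "Asx": "B", "Glx": "Z", "Xaa": "X",
    "Xle": "J",
}


def is_legal_value(value):
    """Return True for a listed three-letter abbreviation or TERM."""
    # TERM (termination codon) has no one-letter symbol in the listing.
    return value == "TERM" or value in THREE_TO_ONE


for candidate in ("Sec", "TERM", "Selenocysteine"):
    print(candidate, is_legal_value(candidate))
```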


7.5.4 Modified and unusual Amino Acids
        Abbreviation    Amino acid
        ------------    ---------

        Aad             2-Aminoadipic acid
        bAad            3-Aminoadipic acid
        bAla            beta-Alanine, beta-Aminopropionic acid
        Abu             2-Aminobutyric acid
        4Abu            4-Aminobutyric acid, piperidinic acid
        Acp             6-Aminocaproic acid
        Ahe             2-Aminoheptanoic acid
        Aib             2-Aminoisobutyric acid
        bAib            3-Aminoisobutyric acid
        Apm             2-Aminopimelic acid
        Dbu             2,4-Diaminobutyric acid
        Des             Desmosine
        Dpm             2,2'-Diaminopimelic acid
        Dpr             2,3-Diaminopropionic acid
        EtGly           N-Ethylglycine
        EtAsn           N-Ethylasparagine
        Hyl             Hydroxylysine
        aHyl            allo-Hydroxylysine
        3Hyp            3-Hydroxyproline
        4Hyp            4-Hydroxyproline
        Ide             Isodesmosine
        aIle            allo-Isoleucine
        MeGly           N-Methylglycine, sarcosine
        MeIle           N-Methylisoleucine
        MeLys           6-N-Methyllysine
        MeVal           N-Methylvaline
        Nva             Norvaline
        Nle             Norleucine
        Orn             Ornithine
        OTHER           (requires /note=)


7.5.5 Genetic Code Tables

Authority      International Nucleotide Sequence Database Collaboration
Contact        NCBI
Scope          /transl_table qualifier
URL            http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi?mode=c

  Genetic Code [1]
  Standard Code (transl_table=1)  
 
    AAs  = FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = ---M---------------M---------------M----------------------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG

 
  Genetic Code [2]
  Vertebrate Mitochondrial Code (transl_table=2)

    AAs  = FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG
  Starts = --------------------------------MMMM---------------M------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG
 
 
  Genetic Code [3]
  Yeast Mitochondrial Code (transl_table=3)
 
    AAs  = FFLLSSSSYY**CCWWTTTTPPPPHHQQRRRRIIMMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = ----------------------------------MM----------------------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG 

 
  Genetic Code [4]
  Mold, Protozoan, Coelenterate Mitochondrial Code & Mycoplasma/Spiroplasma  
  Code (transl_table=4) 
  
    AAs  = FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = --MM---------------M------------MMMM---------------M------------ 
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG


  
  Genetic Code [5] 
  Invertebrate Mitochondrial Code (transl_table=5) 
  
    AAs  = FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG
  Starts = ---M----------------------------MMMM---------------M------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG



  Genetic Code [6]
  Ciliate, Dasycladacean and Hexamita Nuclear Code (transl_table=6) 
    
    AAs  = FFLLSSSSYYQQCC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = -----------------------------------M----------------------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG
 
   
  Genetic Code [9]  
  Echinoderm and Flatworm Mitochondrial Code (transl_table=9)  
       
    AAs  = FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNNKSSSSVVVVAAAADDEEGGGG
  Starts = -----------------------------------M---------------M------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG


  Genetic Code [10]   
  Euplotid Nuclear Code (transl_table=10) 
    
    AAs  = FFLLSSSSYY**CCCWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = -----------------------------------M----------------------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG
  

  Genetic Code [11]
  Bacterial and Plant Plastid Code (transl_table=11) 
 
    AAs  = FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = ---M---------------M------------MMMM---------------M------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG


  Genetic Code [12]
  Alternative Yeast Nuclear Code (transl_table=12) 

    AAs  = FFLLSSSSYY**CC*WLLLSPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = -------------------M---------------M----------------------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG
                                                                   
 
  Genetic Code [13]
  Ascidian Mitochondrial Code (transl_table=13)

    AAs  = FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSGGVVVVAAAADDEEGGGG
  Starts = ---M------------------------------MM---------------M------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG

  Genetic Code [14]
  Alternative Flatworm Mitochondrial Code (transl_table=14) 
  
    AAs  = FFLLSSSSYYY*CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNNKSSSSVVVVAAAADDEEGGGG
  Starts = -----------------------------------M----------------------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG


  Genetic Code [15]
  Blepharisma Nuclear Code (transl_table=15) 

    AAs  = FFLLSSSSYY*QCC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = -----------------------------------M----------------------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG

 
  Genetic Code [16]
  Chlorophycean Mitochondrial Code (transl_table=16)  

    AAs  = FFLLSSSSYY*LCC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = -----------------------------------M----------------------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG

  
  Genetic Code [21]
  Trematode Mitochondrial Code (transl_table=21) 

    AAs  = FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNNKSSSSVVVVAAAADDEEGGGG
  Starts = -----------------------------------M---------------M------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG


  Genetic Code [22]
  Scenedesmus obliquus Mitochondrial Code (transl_table=22)

    AAs  = FFLLSS*SYY*LCC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = -----------------------------------M----------------------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG


  Genetic Code [23]
  Thraustochytrium Mitochondrial Code (transl_table=23) 
  
    AAs  = FF*LSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = --------------------------------M--M---------------M------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG
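
Each table above is read column by column: position i of Base1, Base2 and
Base3 spells a codon, position i of AAs gives its translation ("*" marks a
termination codon), and an "M" at position i of Starts marks that codon as a
possible initiator. The Python sketch below shows this reading for tables 1
and 2; the names AAS, STARTS, codon_table, start_codons and translate are
assumptions introduced for this example.

```python
BASE1 = "TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG"
BASE2 = "TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG"
BASE3 = "TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG"

# AAs and Starts lines copied from tables 1 and 2 above, keyed by transl_table.
AAS = {
    1: "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
    2: "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG",
}
STARTS = {
    1: "---M---------------M---------------M----------------------------",
    2: "--------------------------------MMMM---------------M------------",
}


def codons():
    """Yield the 64 codons in table order (column by column)."""
    return ("".join(triple) for triple in zip(BASE1, BASE2, BASE3))


def codon_table(transl_table):
    """Return a codon -> amino acid dict for the given table number."""
    return dict(zip(codons(), AAS[transl_table]))


def start_codons(transl_table):
    """Return the codons flagged with 'M' in the Starts line."""
    flags = STARTS[transl_table]
    return [codon for codon, flag in zip(codons(), flags) if flag == "M"]


def translate(sequence, transl_table=1):
    """Translate complete codons of an unambiguous DNA sequence."""
    table = codon_table(transl_table)
    sequence = sequence.upper()
    usable = len(sequence) - len(sequence) % 3  # ignore a trailing partial codon
    return "".join(table[sequence[i:i + 3]] for i in range(0, usable, 3))


print(translate("ATGGCCTGA", 1))
print(translate("ATGGCCTGA", 2))
print(start_codons(1))
```

The same sequence translates differently under the two tables because TGA is
a termination codon in the Standard Code but encodes Trp in the Vertebrate
Mitochondrial Code. The sketch does not handle ambiguity codes or
/transl_except overrides.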



7.5.6 Country Names

Authority       International Nucleotide Sequence Database Collaboration
Contact         INSDC member databases
Scope           /country qualifier
URL             http://www.insdc.org/page.php?page=country





Feature Table Definition        
Version 8  Oct, 2008