Using Sequin to Prepare an HTG Submission
This document assumes that you are already familiar with the Sequin program. If you are not, please visit the Sequin home page at http://www.ncbi.nlm.nih.gov/Sequin.
To create HTG records in Sequin, you must add some information to the active Sequin configuration file. This is not the configuration file that may be included when you download Sequin: the config files in the config directory of the Sequin archive are copies that must be moved to the right place in order to work. On Unix, the active configuration file is called .sequinrc; on the Macintosh, it is called sequin.cnf; and on the PC, it is called sequin.ini. To the active config file, add the lines
[SETTINGS]
GENOMECENTER=genome_center_abbreviation
where genome_center_abbreviation is usually your FTP login name. If the configuration file already has a section called [SETTINGS], add the GENOMECENTER line to that section.
On the PC, the only active config file is the one in the "Windows" directory. This directory can have different names, such as WINNT, depending on which version of Microsoft Windows is being used. The easiest way to find the active config file is to download the Sequin archive, extract it, run sequin.exe, choose Misc->Net Configure, click on Normal, press Accept, and then Quit Program. This should save a sequin.ini file in the correct place. You should then search for sequin.ini. Once you find the right sequin.ini file, edit that one.
On the Macintosh, the active config file is the sequin.cnf file in the "System Folder:Preferences" folder, not the one in the "Sequin Folder:config" folder. Furthermore, any config file in the "System Folder" itself will override the normal one, so you need to edit the one in "System Folder:Preferences". If there isn't a file there already, you can move the one from the "Sequin Folder:config" folder, even if you have already edited this file.
Be sure to do the editing while Sequin isn't running, so that your changes are not overwritten when you quit Sequin. When you restart Sequin, there will be a new button on the first Welcome to Sequin form called "New FA2HTGS Submission".
If you have previously set up the configuration file with the GENOMETAG setting and you see this button in Sequin, there is no need to change anything.
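The configuration edit described above can also be scripted. The following is a minimal Python sketch, not part of Sequin: the function name add_genome_center and the abbreviation "mycenter" are hypothetical, and it assumes the config file is plain text with one KEY=value entry per line and that section names can be matched without regard to case. It works on the file contents as a string, so reading and writing the active config file (with Sequin closed) is left to the caller.

```python
def add_genome_center(config_text: str, abbreviation: str) -> str:
    """Return config_text with GENOMECENTER set in the [SETTINGS] section.

    Assumption: the file uses one KEY=value entry per line under
    [SECTION] headers. All other lines are passed through unchanged.
    """
    new_line = f"GENOMECENTER={abbreviation}"
    out = []
    in_settings = False
    found_section = False
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A section header: note whether it is [SETTINGS].
            in_settings = stripped.upper() == "[SETTINGS]"
            out.append(line)
            if in_settings and not found_section:
                # Write the entry directly under the first [SETTINGS] header.
                out.append(new_line)
                found_section = True
            continue
        if in_settings and stripped.upper().startswith("GENOMECENTER="):
            # Drop any older value; the new one is already written.
            continue
        out.append(line)
    if not found_section:
        # No [SETTINGS] section yet: append one at the end of the file.
        out.extend(["[SETTINGS]", new_line])
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # "mycenter" is a placeholder for your genome center abbreviation.
    print(add_genome_center("[SETTINGS]\nOTHER=1\n", "mycenter"), end="")
```

Running the function twice with the same abbreviation leaves the text unchanged, so it is safe to reapply.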
Before you create your first HTG submission, you need to make a Sequin submission template that contains contact, citation, and organism information. You can then use this template for subsequent submissions. To create the template, click on the "Start New Submission" button on the first Welcome to Sequin form. All of the information from the subsequent Submitting Authors form will be used for the HTG record. Fill out the Submission, Contact, Authors, and Affiliation pages carefully. Instructions are provided in the Sequin help documentation. On the Sequence Format form, select Single sequence and FASTA format. On the next form, Organism and Sequences, the only information that will be read into the HTG record is the name of the organism. Select the scientific name on the Organism page, and import a dummy nucleotide sequence on the Nucleotide page. This sequence will not be included in any submission, so any sequence, even a single nucleotide, is sufficient. There is no need to provide any Protein information. When you reach the record viewer (the GenBank flatfile view), save the file and close the window.
Single, unsegmented sequences should be in standard FASTA format. Segmented sequences for phase 1 or 2 HTGS should be in a modified FASTA format such as this:
>P74A8 pcr product joining p130c12 and p91c10
gatcagcccaaagcattgattaggggaacttacctgtagagggctgcagcaatggggaac
acctggctgggtcacagagtggtcaatgcactccatgacttttgggtcaggacacagaaa
gaaagagcggggaaccggggggccctacagtgatgaattatactaactgattttagaatg
>segment2
ttaaacaaacattgcatttccagaataaaccccatttagtaacgcatagtgtgcttgtat
ctcagcctcccaaagtgctgggattatagacatgagccagcgcacctggctttgttagcc
>segment3
ttttcaaataactttttgaactttgttaattttttaattgcacgttttctccttcattta
ctaattccattcaaaagtagcatcaatgagaataaattacttaggaatacatttaattaa
aaagtgctagacttgtacactgaaaattacaaagtactctggagatatattc
The first line has the Sequence Id (P74A8) and a title. Segments are separated by a line of the form >segmentx (e.g., ">segment2"). This line must be unique among all the lines of FASTA-formatted sequence being processed.
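If the segments are already held as separate strings, the layout shown above can be produced with a short script. This is a sketch only: the function names format_segmented_fasta and duplicate_header_lines are hypothetical, the 60-character line width simply mirrors the example, and numbering the separators from ">segment2" upward is an assumption drawn from that example rather than a stated requirement.

```python
from collections import Counter


def format_segmented_fasta(seq_id, title, segments, width=60):
    """Build modified-FASTA text: an ID/title line, then each segment,
    with a >segmentN separator line before every segment after the first."""
    if not segments:
        raise ValueError("at least one segment is required")
    lines = [f">{seq_id} {title}".rstrip()]
    for number, sequence in enumerate(segments, start=1):
        if number > 1:
            # Numbered separators keep each '>' line unique.
            lines.append(f">segment{number}")
        # Split the sequence into fixed-width lines.
        lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"


def duplicate_header_lines(fasta_text):
    """Return the '>' lines that occur more than once, in first-seen order."""
    counts = Counter(line for line in fasta_text.splitlines() if line.startswith(">"))
    return [header for header, n in counts.items() if n > 1]


if __name__ == "__main__":
    # Placeholder sequences, not real data.
    text = format_segmented_fasta("P74A8", "example title", ["acgt" * 20, "ttga" * 10])
    print(text, end="")
    print(duplicate_header_lines(text))
```

The second function is a quick check that no separator line is repeated before the file is read into Sequin.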
You can also import sequence files in the .ace file format, which is an output of Phrap. This format is not described here.
On the Welcome to Sequin form, select "New FA2HTGS Submission" if you are submitting sequences in FASTA format, and "New PHRAP Submission" if you are submitting sequences in PHRAP .ace format. On the next page, click on "Read Seq-submit template" to import the Sequin submission template file. Next, read in the FASTA or PHRAP-formatted sequence file. On the final page, enter details about the record.
When you are finished with the submission, deposit it on your FTP account under the "SEQSUBMIT" directory. Our software will look for it there every day, validate the center and sequence_name IDs, check whether the record is an update, and write a report that you can pick up the next day. Further information about how HTG records are processed is available from http://www.ncbi.nlm.nih.gov/HTGS/processing.html.
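The deposit step can be sketched with Python's standard ftplib module. This is an illustration under stated assumptions, not an official procedure: the function name deposit_submission is hypothetical, the host, login, and file name in the commented usage are placeholders, and the sketch assumes the "SEQSUBMIT" directory is reachable directly from the login directory.

```python
from ftplib import FTP
from pathlib import Path


def deposit_submission(ftp: FTP, local_path: str, directory: str = "SEQSUBMIT") -> str:
    """Upload one submission file into `directory` on an open, logged-in
    FTP connection and return the remote file name."""
    path = Path(local_path)
    ftp.cwd(directory)  # change into the deposit directory
    with path.open("rb") as handle:
        # Binary transfer so the file arrives byte for byte.
        ftp.storbinary(f"STOR {path.name}", handle)
    return path.name


# Hypothetical usage; replace the host, credentials, and file name with
# the values issued to your genome center.
#
# with FTP("ftp.example.org") as ftp:
#     ftp.login("mycenter", "password")
#     deposit_submission(ftp, "P74A8.sqn")
```

Passing in an already open connection keeps the credentials out of the function and allows several files to be deposited in one session.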
Revised: January 6, 2003.