Consed instructions

CONSED 11.0 DOCUMENTATION

CONTENTS:
WHAT IS NEW IN CONSED 11.0
WHAT IS AUTOFINISH
QUICK TOUR OF CONSED
USING AUTOFINISH
ADVANCED PHRAP/CONSED USAGE
INSTALLING CONSED
NOTE TO SGI USERS
FOR PROGRAMMERS AND FELLOW TRAVELLERS ONLY
MONITORS AND MICE FOR CONSED
HUGE ASSEMBLIES
AUTOFINISH AND PRIMER PICKING PARAMETERS
NEW ACE FILE FORMAT
WHAT THE COLORS MEAN

----------------------------------------------------------------------------

WHAT IS NEW IN CONSED 11.0

This section is mainly intended for advanced Consed users. Novice
users should consult the Quick Tour (below).

Consed/Autofinish is the finishing portion of the
Phred/Phrap/Consed/Autofinish package.

Consed 11.0 is now available. Every Consed/Autofinish site should
immediately discontinue using older versions of Consed/Autofinish that
contain bugs fixed in Consed 11.0. Older versions of
Consed/Autofinish are no longer supported.

Bug fixes:
The Linux font problem (in the navigate windows) is now fixed, as
are many other Consed bugs.

As more sites have used Autofinish in a wider variety of ways,
I've fixed Autofinish bugs that were not apparent in
testing. Thus this version of Autofinish is far more robust.

Each of these new features is explained in README.txt supplied with
Consed.

New Autofinish Features:

* PCR can be used to close gaps

* detects runs and stops and suggests special chemistry for reads
crossing such areas

* ability to set a *minimum* number of templates for a primer.
For example, you might specify that if a primer has only one
acceptable template, you don't want the primer chosen.

* ability to not pick primers in single subclone regions,
unaligned high quality regions, or regions with high quality
discrepancies

* you can specify a maximum size for finishing reads (useful if
your finishing reads are shorter than your shotgun reads)

New Consed Features:

* you can have backgrounds other than black--good for making color
copies

* chromatograms can be kept in some location other than
../chromat_dir
For example, they can be kept in an external database.

* ability to prevent 2 Consed users from making edits to the same
project at the same time

* can define keys to add tags and to run external programs. This
is helpful for connecting Consed to external databases for
recording judgments made by a user.

* if read names are too long to fit in the Aligned Reads Window,
you can change them in the General Preferences Window without
needing to restart Consed

* traces can be automatically scaled so a project can contain ABI
and MegaBACE traces and the user doesn't need to change the
scaling each time he/she views a trace

* can type in part of a read name and find it. E.g.,
read djs422_8064.s1 can be found by just typing 8064

----------------------------------------------------------------------------

WHAT IS AUTOFINISH?

Autofinish automatically chooses reads for finishing. Autofinish
sometimes is able to completely finish a project with no human
decisions. In other cases Autofinish mostly finishes a project, and a
human just needs to do the final difficult problems since all the
routine problems have already been completed by Autofinish. Thus a
human finisher is able to complete far more projects in the same
length of time.

Autofinish is flexible to the finishing strategy of your lab. It can
be used to finish with just universal primer reads, just oligo walks,
just minilibraries, or a combination of these. It can be used to
finish either genomic or cDNA.

Autofinish will do the following:

-close gaps
-improve sequence quality
-determine the relative orientation of contigs
-ensure that, at each consensus base, at least 2 reads from different
templates are aligned

(You can configure Autofinish to do any combination of these tasks.)

Autofinish will suggest the following types of experiments:

-universal primer reads (forward or reverse)
-custom primer reads with subclone templates
-custom primer reads with whole clone templates
-minilibraries (transposon or shatter) from subclone templates

(You can configure Autofinish to suggest any combination of these
experiments.)

----------------------------------------------------------------------------

QUICK TOUR OF CONSED

Release 11.0

Consed is a program for viewing and editing assemblies assembled with
the phrap assembly program.

If you are already an advanced Consed user, you should read through
this and do any of the exercises on features that you are unfamiliar
with. I frequently run across people who are doing something in
Consed a hard way month after month, and request a new feature to make
things easier, when that new feature is already in Consed.

If you have never used Consed before, following this Quick Tour will
take you less than 4 hours. However, it will save you approximately 2
days of agony. If you have 2 extra days to spare, and prefer to waste
them in agony, then do not do this Quick Tour and instead immediately
skip down to 'INSTALLING CONSED' below.

If you do the Quick Tour, have your system administrator start
installing Consed now (see INSTALLING CONSED, about 35 pages below)
because that installation must be complete before you can do some of
the more advanced sections of the Quick Tour.

When you do the quick tour, I encourage you to be free about changing
the data set. If you really mess things up (such as changing all of a
read's bases to N's), no problem--just delete the data set and start
again with a fresh copy.

1) After downloading the distribution with netscape (see www.phrap.org
and click on 'Consed'), copy the distribution to a unix computer (if
it is not already on one). Copy the distribution to a directory where
you can put the sample datasets to be used by anyone learning Consed,
and then unpack the files by typing the appropriate line below (which
one depends on what you named the file downloaded by netscape):

Note: You must run tar on a UNIX computer--not on an NT computer,
due to a difference in the handling of breaks between lines.

2) The only unix commands you must learn are the following 3:
pwd -- this tells you where you are
ls -- this tells you what files are there (Same as DIR in DOS)
cd -- this moves you (Same as CD in DOS)
That's it--use them a lot!

USING CONSED GRAPHICALLY

3) Type the following:

cd standard/edit_dir

4) Start Consed by typing the appropriate command below:

../../consed_solaris
../../consed_alpha
../../consed_hp
../../consed_sgi
../../consed_linux

Two windows will appear. One of these will have the list of .ace
files and say 'select assembly file to open' and
'standard.fasta.screen.ace.1'. Double click on that name. The first
window goes away.

You will now see a list of one contig and a list of reads. This is the
'Main Consed Window'.

Double click on 'Contig1'.

The 'Aligned Reads Window' will appear.

Try scrolling back and forth. Try scrolling by dragging the thumb of
the scrollbar. Also try scrolling by clicking on the 4 << < > >>
buttons for scrolling by small amounts. For scrolling by tiny
amounts, click on the arrows at either end of the scrollbar. For
scrolling by huge amounts, use the middle mouse button and just click
on some location on the scrollbar. For scrolling to the beginning or
end of the contig, use the <<< or >>> buttons.

(Question: why can't you just move the scrollbar to the extreme left
in order to go to the beginning of the contig? Answer: in typical
assemblies, there are reads that protrude beyond the beginning of the
contig and reads that protrude beyond the end of the contig. Moving
the scrollbar to the extreme left will scroll the contig to the
beginning of the leftmost read--typically far to the left of the
beginning of the contig. Thus you should get in the habit of using
the <<< and >>> buttons.)

Notice the colors. Scroll to position 937 and notice the read 'a'.
The red bases are the ones that disagree with the consensus.

Notice the different shades of grey background (around the bases).
They have the following meanings, but first, you need to understand
the meaning of the quality values:

A quality value of 10 means 1 error in ten to the 1.0 power
A quality value of 20 means 1 error in ten to the 2.0 power
A quality value of 30 means 1 error in ten to the 3.0 power
A quality value of 40 means 1 error in ten to the 4.0 power

and for quality values in between:

A quality value of 25 means 1 error in ten to the 2.5 power

Get the idea?
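
The rule above can be written as a one-line formula: a quality value Q
corresponds to an error probability of 10 to the power -(Q/10). The
following is a minimal Python sketch of that conversion. The function
name is made up for illustration and is not part of Consed or phred.

```python
def error_probability(quality):
    """Probability that a base call is wrong, given its quality value."""
    return 10 ** (-quality / 10.0)


# Print the error rate for the quality values listed above.
for q in (10, 20, 25, 30, 40):
    p = error_probability(q)
    print(f"quality {q}: about 1 error in {1 / p:,.0f} bases")
```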

(These have actually been empirically verified--if you are interested
in the gory details, read the phred papers:

Ewing B, Hillier L, Wendl M, Green P: Basecalling of automated
sequencer traces using phred. I. Accuracy assessment. Genome Research
8, 175-185 (1998).

Ewing B, Green P: Basecalling of automated sequencer traces using
phred. II. Error probabilities. Genome Research 8, 186-194 (1998).

In that same copy of the journal is a paper about Consed, as well.)

Also notice the upper and lowercase. This is just a cruder indication
of the quality of the bases.

5) To see the quality value of a particular base, point at it and
click with the left mouse button. You will see the quality displayed
in the Info Box on the Aligned Reads Window.

These quality values are shown in grey scales:

Quality 0 through 4 is given by dark grey
Quality 5 through 9 is given by a shade lighter
Quality 10 through 14 is given by a shade still lighter
.
.
.
Quality of 40 through 97 is given by white (the brightest shade)

A quality value of 99 is reserved for bases that have been edited and
the user is absolutely sure of the base ('high quality edited').

A quality value of 98 is reserved for bases that have been edited and
the user is not sure of the base ('low quality edit').
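
As a rough illustration of the scheme above, here is a small Python
sketch that maps a quality value to the way it is displayed. It is not
Consed's actual code. It assumes that the bands elided in the list
above (qualities 15 through 39) continue in the same steps of 5, and
the shade numbering (0 is darkest, 8 is white) is invented for this
example.

```python
def display_class(quality):
    """Describe how a base of the given quality value is displayed."""
    if quality == 99:
        return "high quality edit"
    if quality == 98:
        return "low quality edit"
    if not 0 <= quality <= 97:
        raise ValueError(f"quality out of range: {quality}")
    # Bands are 5 wide (0-4, 5-9, 10-14, ...); everything from 40 up
    # shares the brightest shade. ASSUMPTION: the bands between 15 and
    # 39 continue in the same steps of 5.
    shade = min(quality // 5, 8)
    return f"grey shade {shade} of 8"


for q in (3, 7, 12, 40, 97, 98, 99):
    print(q, display_class(q))
```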

The ends of the reads show bases that are grey and have a black
background. These are the low quality ends of the reads or the
unaligned ends of reads, as determined by phrap.

6) Click on a base on a read. Then hold down the control key and
type 'a'. You will move to the beginning of the read. Hold down the
control key and type 'e'. You will move to the end of the read.
(Emacs users will recognize these commands.)

7) Scroll so that location 490 is about in the middle of the aligned
reads window. Push the left mouse button down on the menu item 'Dim'.
There will be a list of choices that will appear. Drag the cursor
down to 'Dim Nothing' and release. Now look what happened to the
color of the bases. The ends of the reads that used to have a
black background now appear red with a grey background. You are
seeing the clipped-off bases with all the same information as any
other base. Since there are a huge number of red (discrepant) bases,
the screen becomes distracting and busy. Thus by default the low
quality clipped-off bases are shown with a black background and a grey
foreground so they don't distract you.

Notice there is a distinction here between 'low quality ends of
reads' and 'unaligned ends of reads'. Unaligned ends of reads can be
low quality as well, or they can be high quality, as in the case of
chimeric reads.

Point with the mouse to a read name and hold down the right mouse
button. You will notice there is a line that says "high quality from
nnn to nnn; aligned from nnn to nnn; chem: prim". This is giving the
same information in number form. Check that the numbers agree with
the dimming.

You can play with the dimming options a bit. Then return it to 'Dim
Low Quality' for the rest of this tour.

TRACES AND EDITING

8) Point with the mouse at a base of one of the reads and click with the
middle mouse button. (If you have a 2 button mouse, see MONITORS AND
MICE FOR CONSED below.) The Trace Window showing the traces for that
stretch of read should pop up.

There are 2 rows of numbers:

'con' are the consensus positions
'rd' are the read positions

There are 3 rows of bases in the trace window:

'con' is the consensus
'edt' is where you can edit the base calls of the read
'phd' is the original phred base calls

Notice that a red rectangle blinks (the 'cursor') in the corresponding
positions of the Aligned Reads Window and the Trace Window.

9) Try editing in the Trace Window. You can click the left mouse
button on a base in the 'edt' line to set the cursor (a blinking red
rectangle). You can directly overstrike a base by typing a letter.
Try this. Try undoing it (by clicking on 'undo'). If you want to
undo more than one edit, you will have to go back to the main Consed
window and click on the button labeled 'Undo Edit...'--you will learn
that later.

You can move left and right with the arrow keys.

We believe that the user should change a base call only while
examining the traces. That is why editing is done here--not in the
Aligned Reads Window.

10) You can insert a column of pads by pushing the space bar. Try
this. (You may need to click on a base on the 'edt' line first.)

(For those of you new to editing assemblies, a 'pad', which in Consed
and phrap is represented by the '*' character, is used to align
two or more sequences such as these:
gttgacagtaatcta
gttgacataatcta
in which one sequence has an inserted or deleted base with respect to
the other. By inserting the pad character, it is possible to get a
good alignment:
gttgacagtaatcta
gttgaca*taatcta
This is the purpose of the pad character--it is just a placeholder.)

You can then overstrike a pad with a base. In this way you
can insert a base, and still preserve the alignment.
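
The effect of the pad in the example above can be checked with a short
Python sketch. The helper names are hypothetical and this is only an
illustration of the idea, not how Consed or phrap handle pads
internally.

```python
PAD = "*"


def strip_pads(padded):
    """Remove pad characters, giving the sequence as it would be exported."""
    return padded.replace(PAD, "")


def count_disagreements(a, b):
    """Count columns where two sequences differ, ignoring pad columns."""
    return sum(1 for x, y in zip(a, b) if x != y and PAD not in (x, y))


read1 = "gttgacagtaatcta"
read2 = "gttgacataatcta"
read2_padded = "gttgaca*taatcta"

# Without the pad, everything after the missing base is shifted by one.
print(count_disagreements(read1, read2))
# With the pad as a placeholder, the remaining columns line up.
print(count_disagreements(read1, read2_padded))
print(strip_pads(read2_padded) == read2)
```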

11) Try highlighting a stretch of a read on the edt line by holding
down the middle mouse button and dragging the cursor over some bases.
They will turn yellow as you drag. Then release the mouse button. A
window will pop up giving you some choices of what to do with those
(yellow) bases:

Make High Quality--makes the highlighted bases edited high quality
(99). This tells phrap (when it reassembles) that you are
sure of the sequence here.
Change Consensus--make the highlighted bases edited high quality and
change the consensus to agree with that stretch of the read.
This is a directive to phrap (upon reassembly) to use that
stretch of that read to be the consensus.
Make low quality--makes the highlighted bases edited low quality.
This tells phrap (when it reassembles) that you are not sure
of the bases here and phrap can go ahead and make a join even
if the bases in this region don't match perfectly.
Make Low Quality to Left End--same as above, but all the way to
the left end of the read.
Make Low Quality to Right End--same as above, but all the way to
the right end of the read.
Change to n's--Change the highlighted bases to n's which means
they are unknown bases. This tells phrap (when it
reassembles) to not make any join based on these bases. It is
useful when you believe the bases may be in the chimeric
portion of a read.
Change to n's to left--same as above but to left end.
Change to n's to right--same as above but to right end.
Add Tag--allows user to add any tag to a stretch of read bases.
Dismiss--you decided you don't really want to do anything with
this stretch of bases.

This popup is made so that nothing else works until you choose
something. Try each of these choices, except for tags, which you'll
try below.

'Change Consensus' has an additional function--if a read extends out
on the right beyond the end of the consensus, you can extend the
consensus by using this function. You might want to do this, for
example, if crossmatch did not correctly find the cloning site and
thus clipped too much. You can add these bases to the consensus
by using 'Change Consensus'. Typically, the quality of these bases in
the read and in the consensus is 99. That is so that next time phrap
runs, it will correctly extend the consensus.

However, if you aren't going to reassemble, you might want to just
leave the quality values the way phred originally called them. You
can do this by using a Consed resource
(consed.extendConsensusWithHighQuality), which you will learn more
about later (see CONSED CUSTOMIZATION).

12) To delete a base, overstrike it with a '*' character. (Phrap
ignores '*', so this is the same as deleting the character.) If you
overstrike all bases in a column with * characters so the entire
column consists of *'s (including the consensus base), there is no way
to remove the column. This is OK since when you export the
consensus (try the exercise on EXPORTING THE CONSENSUS), the
*'s are not exported. While you are editing in Consed,
we believe there should be a visual indication that a base was
deleted.

SAVING THE ASSEMBLY

13) To save the assembly, pull down the 'File' menu on the Aligned
Reads Window, and release on 'Save assembly'. A box will pop up with
a suggested name. I suggest you always use the one it suggests. The
idea is that the ace files:

(project).fasta.screen.ace.1
(project).fasta.screen.ace.2
(project).fasta.screen.ace.3
(project).fasta.screen.ace.4
(project).fasta.screen.ace.5

are in order of how old they are. If you feel you are taking up too
much disk space, then start deleting the ace files starting at the
oldest. I do not recommend that you overwrite existing ace files.
The version numbers just keep growing, and that is not a problem.
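
If you ever need to work out by script which ace file is the oldest or
what the next name in the series would be, the naming scheme above is
easy to parse. The following Python sketch is only an illustration
(Consed itself suggests the name for you). It assumes all the ace files
in the list belong to one project, and the function names are
hypothetical. Note that it compares version numbers as numbers, so
that .10 sorts after .9.

```python
import re

# Matches names such as standard.fasta.screen.ace.3
ACE_PATTERN = re.compile(r"^(?P<base>.+\.ace)\.(?P<version>\d+)$")


def ace_versions(filenames):
    """Return (version, filename) pairs, oldest (lowest number) first."""
    found = []
    for name in filenames:
        match = ACE_PATTERN.match(name)
        if match:
            found.append((int(match.group("version")), name))
    return sorted(found)


def next_ace_name(filenames):
    """Return the name that follows the newest ace file in the list."""
    versions = ace_versions(filenames)
    if not versions:
        raise ValueError("no ace files found")
    newest_version, newest_name = versions[-1]
    base = ACE_PATTERN.match(newest_name).group("base")
    return f"{base}.{newest_version + 1}"


files = [
    "standard.fasta.screen.ace.9",
    "standard.fasta.screen.ace.10",
    "standard.fasta.screen.ace.1",
    "Contig1.fasta",
]
print(ace_versions(files)[0][1])  # oldest ace file
print(next_ace_name(files))       # next name in the series
```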

EXPORTING THE CONSENSUS

14) Exporting the consensus. Bring the Aligned Reads Window into view
again. Hold down the left mouse button on the 'File' menu and
release the button on 'Export consensus sequence'. Notice that the
consensus will be stored (in this case) in a file called
'Contig1.fasta'. Click 'OK'. There is now a file in your edit_dir
directory called 'Contig1.fasta' that has the consensus sequence in
it. If you want to see the file, bring up another Xterm (if you are
UNIX literate), and type:

cd standard/edit_dir
more Contig1.fasta

15) Fancier exporting the consensus. Bring the Aligned Reads Window
into view again. Hold down the left mouse button on the 'File' menu
but this time release on 'Export consensus sequence (with
options)...'. Just export a little snip of the consensus, from 400
to 410. (You will notice this contains a pad * character.) Ask for
both the bases file and the quality file. Click 'OK'. Consed will
want to call this file 'Contig1.fasta' again. You can overwrite the
existing file.

Look in your other Xterm at these files:

more Contig1.fasta
more Contig1.fasta.qual

The one file contains the bases (but no * pads) and the other
contains the corresponding qualities of those bases.
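
The relationship between the two files can be sketched in Python: the
pad columns are dropped from the bases and from the qualities together,
so the two files stay in step. This is an illustration only. The exact
layout of the files Consed writes (line lengths, header contents) is
not described here, so the layout below is an assumption, as are the
function name and the sample data.

```python
def export_consensus(name, padded_bases, padded_quals):
    """Return (bases_text, quals_text) with pad columns removed from both."""
    kept = [(b, q) for b, q in zip(padded_bases, padded_quals) if b != "*"]
    bases = "".join(b for b, _ in kept)
    quals = " ".join(str(q) for _, q in kept)
    # ASSUMED layout: a '>' header line followed by one line of data.
    return f">{name}\n{bases}\n", f">{name}\n{quals}\n"


# Made-up consensus snippet with one pad and one quality per column.
bases_text, quals_text = export_consensus(
    "Contig1", "acg*t", [40, 38, 35, 0, 42]
)
print(bases_text, end="")
print(quals_text, end="")
```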

16) Exporting the consensus of all contigs at once: Go to the Main
Consed Window. Point to 'File', hold down the left mouse button, and
release on 'Write all contigs to fasta file'. You then can choose a
filename for all contigs to be written to. (In this project there is
only 1 contig, so there is no difference between this option and just
exporting a contig at a time.)

17) (For this step, first click on the 'Dim' menu and release on 'Dim
Nothing'.) Point to the 'Color' menu, hold down the left mouse button
and release on 'Color Means Edited and Tags'. Notice that the bases
that you have edited (make sure you have edited some bases) will stand
out in either white or grey (depending on whether the base was made
high quality or low quality). Observe this both in the Trace Window
and the Aligned Reads window. This colormode is useful if you are
interested in easily spotting which bases are edited.

Return to the 'Color Means Quality and Tags' colormode by the
following: point to the 'Color' menu, hold down the left mouse button
and release on 'Color Means Quality and Tags'.

FIND MAIN WINDOW

18) On the Aligned Reads window, click on 'Find Main Win'. This will
cause the Consed Main Window to pop up in the event you have buried it under
other windows or iconified it. (This may not work with some settings of
your X emulator. In that case you will have to find and click on the
Main Window to bring it up.)

MULTIPLE UNDO EDIT

19) Now that the Consed Main Window is visible, click the 'Undo
Edit...' button. There will be a popup indicating the most recent
edit. (If it says "no edits so far", then bring up a trace and make
several edits. Then click on 'Undo Edit...' again.) Click 'undo'.
Then you will see the edit that was done before that. Click 'undo'.
You can continue undoing if you like. You now know how to undo more
than one edit. You cannot choose which edits to undo and which to not
undo--edits can only be undone in precisely reverse order from the
order you made them. Once you save the assembly, you cannot undo
prior edits.
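
The undo behaviour described above is that of a simple stack: the last
edit made is the first one undone, and saving empties the stack. The
Python sketch below models just that behaviour. It is a toy model for
illustration, not Consed's implementation, and the class and method
names are invented.

```python
class EditHistory:
    """Toy model of an undo list: strictly last-in, first-out."""

    def __init__(self):
        self._edits = []

    def record(self, description):
        """Remember an edit so that it can be undone later."""
        self._edits.append(description)

    def undo(self):
        """Undo the most recent edit; return None if there are no edits."""
        if not self._edits:
            return None
        return self._edits.pop()

    def save(self):
        """Saving the assembly means prior edits can no longer be undone."""
        self._edits.clear()


history = EditHistory()
history.record("changed base 1 of read A")
history.record("inserted pad in read B")
print(history.undo())  # the most recent edit comes back first
print(history.undo())
print(history.undo())  # nothing left to undo
```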

SCROLLING TRACES AND ALIGNED READS TOGETHER

20) In the Aligned Reads window, scroll along the contig to a
different point. Click the left mouse button on a read whose trace is
already up. Notice that the existing trace instantly scrolls to the
corresponding location. Now go to the Trace Window and scroll the
traces to a new location. Click on the edt line with the left mouse
button. You will notice that the Aligned Reads window will instantly
scroll to the corresponding location. Thus you can keep the Aligned
Reads window and the traces scrolled to the same location.

EXAMINING ALL TRACES

21) Go to a region where there are lots of reads, say base 1660. Push
down the right mouse button and release on 'Display traces for all
reads'. You will see all traces displayed in a scrolling window. You
can drag the scrollbar on the right down and up to see all the traces.
This feature is particularly useful for polymorphism/mutation
detection work. This feature was added to work in cooperation with
polyphred. To see it in action, exit Consed.

EXITING CONSED

22) On the Aligned Reads Window, point to 'File' menu, hold down the
left button and release on 'Quit Consed'. If it asks you some
questions, answer 'Quit Without Saving and Discard .wrk File'.

CONSED-POLYPHRED INTERACTION

Polyphred is a program for finding polymorphic sites; it was developed by
Debbie Nickerson's group (contact them at http://droog.mbt.washington.edu).

We have a test database, 'polyphred', which has had polyphred run on
it already. Polyphred has put a polymorphism tag on each polymorphic
site.

Type:

cd ../../polyphred/edit_dir
ls
../../consed_(computer type)

where (computer type) is one of solaris, hp, alpha, sgi, or linux.

Double click on example2.fasta.screen.ace.1

When Consed comes up, you should see 2 contigs.
Double click on Contig2

In the Aligned Reads Window, push the left mouse button while pointing
to the 'Navigate' menu and release on:

'Toggle feature: when navigating to consensus location, pop up all
traces (currently off)'

That will turn this feature on.

Now push the left mouse button while pointing to the 'Navigate' menu
and release on 'Tags'. Up should pop a list of tag types. Double
click on 'polymorphism'. Polyphred has already been run so the
consensus is tagged with polymorphism tags at each polymorphic site.
Up will pop a window labelled 'Polymorphism Tags' with a list of
sites. Click on 'Next'.

If you correctly followed the instructions above, all the traces should
pop up at the first polymorphic site. You may want to reposition the
traces window to see it better.

Now ignore the original 'Polymorphism Tags' window and instead click
on 'Next' in the *traces* window. This will take you to the next
polymorphic site. Pretty nice, huh?

23) ALPHABETICAL ORDERING OF READS

The reads can be ordered in two ways:

a) alphabetically
b) first all the top strand reads and then all the bottom
strand reads. The top strand reads are then ordered
by the left end of the reads. Same with the bottom
strand reads.

Try changing between a) and b). In the Main Consed Window (click on
'Find Main Win' on the Aligned Reads Window if you can't find the Main
Consed Window because it is covered up with other windows), pull down
the 'Options' menu, and release on 'General Preferences'. Scroll down
until you find 'Display reads sorted alphabetically or by strand/left
end of read.' Switch it between 'alpha' and 'strand'. Then click
'Apply and Dismiss'. Notice the effect in the Aligned Reads Window.
Many polymorphism and mutation detection labs find that alphabetical
sorting is most useful, while many genomic sequencing labs find that
sorting by strand/left end of read is most useful.
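
The two orderings amount to two different sort keys. The Python sketch
below shows the idea with made-up reads. The Read record and its field
names are hypothetical and do not correspond to anything in the ace
file or in Consed's code.

```python
from typing import NamedTuple


class Read(NamedTuple):
    name: str
    strand: str    # "top" or "bottom"
    left_end: int  # consensus position of the left end of the read


def sort_alpha(reads):
    """Ordering a): alphabetically by read name."""
    return sorted(reads, key=lambda r: r.name)


def sort_by_strand(reads):
    """Ordering b): top strand reads first, each group by left end."""
    return sorted(reads, key=lambda r: (r.strand != "top", r.left_end))


reads = [
    Read("c.s1", "bottom", 10),
    Read("a.s1", "top", 300),
    Read("b.s1", "top", 50),
]
print([r.name for r in sort_alpha(reads)])
print([r.name for r in sort_by_strand(reads)])
```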

After you are done playing with these features, exit Consed and go back
to the previous database:

cd ../../standard/edit_dir
ls
../../consed_(computer type)
Double click on standard.fasta.screen.ace.1

When it says "There is an edit history file (a .wrk file)...Do you
want to apply those edits?", click on "no".

Double click on Contig1 to bring up the Aligned Reads Window again in
preparation for the next step.

NAVIGATING

24) In the Aligned Reads window, pull down the Navigate menu and
release on 'Low consensus quality'. You will see a list of locations.
Move the 'Low consensus quality' window down so you can see the
Aligned Reads window.

Repeatedly click on 'Next' until you reach the end of the list. (Low
consensus quality means an area in which each base has too high a
probability of being wrong.) This saves you from having to look
through large amounts of high quality data trying to find problem
areas.

There are 2 'Next' buttons--one on the Aligned Reads Window and one on
the Low Consensus Quality Window. You can click on either, but it is
probably more convenient to use the 'Next' button on the Aligned Reads
Window. Thus you can keep the Aligned Reads Window in
front with input focus and keep the Low consensus quality window
pushed out of the way.

You may want to click on the 'Save' button in the Low consensus
quality Window to save to a file a copy of this list of problem areas
as you work through them.

In our experience, this will be the most important navigate list you
will use. In fact, finishing consists mainly of adding reads and
rephrapping until this list is reduced to nothing.

25) Dismiss the Low consensus quality window. Pull down the
'Navigate' menu again and release on 'High quality discrepancies as
above, but omitting tagged compressions and G_dropouts'. You will
probably notice there are no entries (unless you created some yourself
by editing). That is because there are no high quality discrepancies
with this dataset. So let's force there to be some by lowering the
quality threshold. First, dismiss the High quality discrepancies
window.

Click on 'Find Main Win'. In the Main Consed Window, pull down the
'Options' menu and release on 'General Preferences'. Notice that the
default for 'Threshold for High Quality Discrepancy' is 40. Change it
to 15 and click 'Apply & Dismiss'.

Then follow the steps above to bring up the High quality discrepancies
menu. Now you will see several entries. Click 'next' repeatedly to
go successively to the next high quality discrepancy in the Aligned
Reads Window.
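
To see why lowering the threshold produces more entries, here is a
small Python sketch. It assumes that a high quality discrepancy is a
read base that disagrees with the consensus and whose quality is at or
above the threshold. That definition, the function name, and the sample
data are assumptions made for illustration. Consed's own test may
differ in detail.

```python
def high_quality_discrepancies(consensus, read_bases, read_quals,
                               threshold=40):
    """Return positions where a confident read base disagrees."""
    hits = []
    columns = zip(consensus, read_bases, read_quals)
    for position, (cons_base, read_base, quality) in enumerate(columns):
        disagrees = read_base.lower() != cons_base.lower()
        if disagrees and quality >= threshold:
            hits.append(position)
    return hits


consensus = "acgtacgt"
read = "acgaacgc"  # differs from the consensus at positions 3 and 7
quals = [45, 45, 45, 20, 45, 45, 45, 42]

print(high_quality_discrepancies(consensus, read, quals))      # default 40
print(high_quality_discrepancies(consensus, read, quals, 15))  # lowered
```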

You can also double click on a particular line in the High quality
discrepancies window to go to that location. Alternatively, you can
single click on a line and then click the 'Go' button.

Dismiss the High quality discrepancies window.

26) Similarly, try the other navigate lists: Unaligned high quality
regions (this list will be empty with this data set), Edits, Regions
covered by only 1 strand and only 1 chemistry, and Regions covered by only 1
subclone.

Unaligned high quality regions are regions in which the traces are
high quality so there is no question of the bases, but the region
differs so much from other reads that phrap has given up trying to
align the region with the consensus. This could be due to a chimeric
read, or perhaps the read belongs somewhere else.

We believe that regions covered by only 1 subclone should be covered
by a 2nd subclone to prevent the possibility of there being a deletion
in the single subclone.

There are so many different problem lists that you may forget to check
one of them and thus miss a serious problem. Thus we combined them
all into a single list. This is the first menu item: 'Low Cons/High
Qual Discrep/Single Stranded/Single Subclone/Unaligned High'. We
suggest you use this list.

27) Also try navigate by tags by selecting 'tags' under navigate: when
the Select Tag Type Window appears, double click on 'compression'.
(Note that you can't do anything else until you deal with this
window.) This gives a list of a particular tag type in a particular
contig.

28) There is also a way of getting a list of a particular tag type in
all contigs: Click on 'Find Main Win'. In the Main Consed Window,
point to the 'Navigate' menu, hold down the left mouse button, and
release on 'Tags in all contigs'. Continue as in the previous step.
(Since there is only one contig, this list will not be any different
than the corresponding list for Contig1.)

PRIMER-PICKING

To do this step, you must first have completed the INSTALLING CONSED
section (below). If you haven't done that yet, please complete it
first.

29) Go to some location near the right end of the contig, say base
2470. Click with the right mouse button on the consensus and click on
either one of the top strand primer choices (either from subclone
template or from clone template). Consed will pause a moment, and
then there will appear a selection of primers that pass all of
Consed's requirements. Templates are also chosen for each primer.
You may have to scroll the primer list to the right to see the
templates. Consed lists these templates in order of quality--all of
them will cover the read you want to make.

Double click on one of the primers in the Primers Window. That will
cause the Aligned Reads Window to scroll to show that oligo in
context. Click on 'Accept Primer'. A comment box will pop up. Enter
some comment and click 'OK'. Notice that a yellow oligo tag, with a
little red end, is created on the consensus for that primer. The red
end points in the direction of the oligo. The tag contains all the
information you need to order that oligo and do the reaction--you will
learn how to pop it up below under 'tags'.

What is the difference between 'Pick Primer from Subclone Template'
and 'Pick Primer from Clone Template'?

There are 3 differences:

A. which vector file the primers are screened against. In the former
case, the primer is screened against the file primerSubcloneScreen.seq
and in the latter case against the file primerCloneScreen.seq

B. In checking for false matches elsewhere in the assembly, if the
template is the whole clone, then Consed must check for false matches
in the *entire* assembly, including all other contigs. But if the
template is just going to be a subclone, Consed only needs to check
elsewhere in that subclone. Actually, to be conservative, Consed
checks for false matches +/- the maximum insert size of a subclone.

C. If you are picking primers for subclone template, then the primer
picker can also pick the subclone templates. If it doesn't find any
suitable subclone template, it will reject the primer. (By default,
picking of subclone templates is turned on. If you prefer to pick
your own primers, and want Consed's primer picker to be much faster,
you can turn it off temporarily or permanently. To turn it off
temporarily, go to the Consed Main Window, point to the Options menu,
hold down the left mouse button and release on 'Primer Picking
Preferences'. Scroll down to 'Pick Subclone Templates for Primers'
and click 'False'. Click on 'Apply and Dismiss'. To change this
permanently, see CONSED CUSTOMIZATION below. Beware: you must
correctly customize determineReadTypes.perl for template picking to
work. See INSTALLING CONSED below.)

If you are interested in the details of primer-picking, see the
section 'AUTOFINISH AND PRIMER PICKING PARAMETERS' (below).

When you are done editing and have saved the assembly and exited
Consed, run ace2Oligos.perl (supplied with this distribution--make
sure your system administrator installed it) which will extract all
the oligos you just created. This is handy for email ordering of
oligos.

In the xterm, type:

ace2Oligos.perl standard.fasta.screen.ace.2 oligos.txt

where standard.fasta.screen.ace.2 is the name of the ace
file you just saved.

30) PICKING PCR PRIMER PAIRS

In the Aligned Reads Window, go to the location where you want to pick
the first PCR primer, say base 500. Point to the consensus, hold down
the right mouse button and release on "Top Strand PCR Primer". Then
scroll to the location where you want to pick the second PCR primer,
say base 2200. Point to the consensus, hold down the right mouse
button and release on "Bottom Strand PCR Primer". There will be a
pause and then there will be a list of PCR primer pairs. Click on the
pair you want and click "Accept Pair".

You can modify the parameters for choosing PCR primer pairs by going
to the Main Consed Window, pointing to "Options", holding down the
left mouse button, and releasing on "Primer Picking Preferences." For
example, by default Consed does not display all PCR primer pairs--this
would take too long and give you too many. However, you can ask it to
show you all such pairs. In the Primer Picking Preferences, scroll
down to "Check All PCR Pairs (huge) or Just Sample?" and click on
"All". Then click on "Apply and Dismiss". Then pick PCR primers
again, as above. Don't be surprised if you get 10,000 or more pairs
of primers!

SEARCH FOR STRING

31) Try the 'Search for String' button (left side of the Aligned Reads
Window). Type in a string (such as aaaca), and click 'ok'. There
should be a list of 'hits'. Double click on one of the hits (or
single click on it and click on 'go'.) Notice that the Aligned Reads
Window scrolls to that position and has the cursor on the found
string. (It might be complemented.)

Dismiss this window. Try this again, only this time in the Search For
String Window select 'Search Just Reads'. Then click 'OK'. You will
notice there are many more hits. This is because this shows hits in
each read, even if they are at the same consensus position.

You can also try the approximate match search for string by clicking
on 'Approximate' instead of 'Exact'. The 'Per Cent Mismatch' only
applies to the Approximate match search.
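
The search behavior described above (hits that may be complemented,
and an approximate search controlled by a per cent mismatch) can be
illustrated with a small Python sketch. This is not Consed's
implementation. The function names, the example sequence, and the
simple rule of counting mismatched bases without allowing gaps are
all assumptions made for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (a, c, g, t, n)."""
    pairs = {"a": "t", "c": "g", "g": "c", "t": "a", "n": "n"}
    return "".join(pairs[base] for base in reversed(seq.lower()))


def find_hits(consensus, query, percent_mismatch=0.0):
    """Return sorted (start, end, strand) hits with 1-based inclusive positions.

    A window is a hit when its number of mismatching bases is at most
    percent_mismatch per cent of the query length. Gaps are ignored,
    which is a simplification made for this sketch.
    """
    consensus = consensus.lower()
    query = query.lower()  # lowercasing makes the search case-insensitive
    allowed = int(len(query) * percent_mismatch / 100.0)
    hits = []
    for strand, pattern in (("uncomplemented", query),
                            ("complemented", reverse_complement(query))):
        for start in range(len(consensus) - len(query) + 1):
            window = consensus[start:start + len(query)]
            mismatches = sum(1 for a, b in zip(window, pattern) if a != b)
            if mismatches <= allowed:
                hits.append((start + 1, start + len(query), strand))
    return sorted(hits)
```

With percent_mismatch left at 0 this behaves like an exact search on
both strands. Raising it admits windows with a few differing bases.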

COPY AND PASTE

32) In the Aligned Reads Window, swipe some bases by holding down the
left mouse button. You should see the bases turn yellow, at least
temporarily. Then click the 'Search for String' button. Use the
middle mouse button to paste the bases you have just swiped into the
'Query string:' box. Notice that you can swipe bases either from the
consensus or from a read.

The search for string is case-insensitive so don't worry about the
pasting being upper or lowercase.

CORRECTING FALSE JOINS MADE BY PHRAP

33) Phrap may put several reads together that you believe do not belong
together. (For example, you may see several high quality
discrepancies between the reads.) If you are sure these reads do not
belong together, you can force a subsequent reassembly by phrap to not
assemble those reads together. You do this by finding a location
where there is a high quality discrepancy. Then click on the read
with the right mouse button and release on 'Tell phrap not to overlap
reads discrepant at this location'. There are no high quality
discrepancies with this dataset so Consed won't let you do this.
(Try it and see.) However, when you use your own data, you may get
the chance!

ADDING READS

34) For this to work, your system administrator must have set up
everything correctly. (See below in INSTALLING CONSED.) Assuming you
have set everything up correctly, you can now experiment with adding
reads.

From a UNIX prompt, copy the new chromatograms into the chromat_dir
directory:

cp ../chromats_to_add/* ../chromat_dir

Exit Consed and bring it up again using the original ace file
standard.fasta.screen.ace.1

If it asks if you want to apply edits, just say 'no'.

On the Main Window, click on the Add New Reads button. There will
appear a list of files ending with .fof. These are files that contain
lists of chromatograms. Double click on 'reads_to_add.fof'. Then
Consed will ask "If a read doesn't align against any existing contig,
do you want to have it go into a contig by itself? (otherwise it will
just not be put into the assembly)" Answer yes or no--I don't care.
There should be lots of progress output in the xterm from which you
started Consed. When it completes, there will be a Reads Added Window
popup with a report of which reads were added. In this case, it
should say that 9 reads were successfully added and list them.

If you get an error message, look carefully at the full error message
in the xterm to diagnose the problem. Probably there is some mistake
in how you installed Consed. See INSTALLING CONSED (below).

TEARS AND JOINS

Just so you get the same results as I do, exit Consed and bring it up
again using the original ace file

standard.fasta.screen.ace.1

If it asks if you want to apply edits, just say 'no'.

35) When phrap really screws up, you may want to just tear the contig
apart in several places and then join the pieces back together in a
different way. Let's try it:

Go to location 1500. Point the mouse at the consensus base at 1500
and push the right mouse button down. Release the button on 'Tear
Contig at This Consensus Position'. Up will pop a list of reads with
2 little buttons next to them <- and ->. Leave everything as it is
and just click 'Do Tear'. (If you want to play around with which
reads go into which contig, do that another time.)

Now you should have 2 Aligned Reads Windows on top of each other. One
should contain 'Contig2' and the other 'Contig3'. Dismiss the little
window that says 'Tear Complete'.

Now let's join these 2 contigs back together:

Click on 'Search for String' and type in the following bases:
agctgccatc

Click 'OK'.

Search for string should find 2 locations, one in Contig2 and one in
Contig3:

Contig2 (consensus) 1447-1456 (uncomplemented)
Contig3 (consensus) 829-838 (uncomplemented)

Double click on the first one. The Aligned Reads Window for Contig2
will scroll to location 1447 and the window will raise up. In that
Aligned Reads Window, click on 'Compare Cont'.

Now double click on the 'Contig3' line in the above Search for String
results. The Aligned Reads Window for Contig3 will scroll to location
829 and lift up. In that Aligned Reads Window, click on 'Compare
Cont'.

Now the Compare Contigs Window should be visible. In the Compare
Contigs Window, try scrolling back and forth. You can change the
cursors (blinking red), but if you do, please return them to the
locations 1447 and 829 for the next step. The cursors 'pin' these
bases together when doing an alignment. (The algorithm is a pinned
Smith-Waterman alignment.)

Click on Align. Try scrolling the alignment by dragging the thumb in
the lower half of the Compare Contigs Window. An 'X' means there is a
discrepancy between the 2 contigs. There is also a 'P' (see if you
can find it!) The P indicates the bases that you pinned together.

Click with the left mouse button on either contig in the bottom
alignment. You will notice that both contigs will have the red
blinking cursor in the same position. Click on 'Scroll Both Aligned
Reads Windows' and look at the Aligned Reads Windows to see that they
scroll to the corresponding positions. You can have traces up for the
contigs, and they will scroll as well. Experiment with this. Then
click 'Join Contigs'. The 2 previous Aligned Reads Windows will disappear and
there will be a new one which has a new contig 'Contig4'. You have
made a join!

It is possible to have more than one Compare Contigs Window up at a
time. This allows you to investigate a repeat that has more than 2 copies.

Compare Contigs is one method of exploring joins of contigs that were
not made by phrap. Another method is to use phrapview, supplied with
phrap. phrapview gives a high level view of all internal joins while
'compare contigs' shows the alignment of a single internal join. Some
users have found them to work well together--phrapview to find a join
and, having found it, 'compare contigs' to examine it in more detail.

REMOVING READS

36) You can also remove individual reads and put them into their own
contigs. For example, in the Aligned Reads Window, go to location
2000. Point to the read name of read djs74_2664.s1 and hold down the
right mouse button. Release on 'Put read djs74_2664.s1 into its own
contig.' Consed will ask you 'Are you sure...?' Answer 'yes'.
Presto-chango! The read is put into its own contig and the old
contig is redrawn without the read in it. At this point you should
save the assembly--you should always save the assembly after removing
reads.

TAGS

37) Bring up a trace for a read (as above). Swipe some bases on the
'edt' line with the middle mouse button. A list of choices will pop
up. Select 'Add Tag'. Type in a comment in the box at the bottom,
and select 'comment' from the list of tag types. You will now see a
blue box both in the Aligned Reads Window and in the Traces Window on
that read.

To see the comment, you can just point to it in the Aligned Reads
Window and you will see the comment in the lower right hand corner of
the Aligned Reads Window. Alternatively, you can click on that blue
tag in the Aligned Reads Window with the right mouse button and
release on 'Tag: comment Show more info?'. Alternatively, you can
click on the blue tag in the Traces Window with the right mouse
button.

Try creating some other kinds of tags: again swipe some bases in the
Trace Window, but this time select a different tag type. You will
notice that
different tags are in different colors. You can always use the
methods above to see what kind of tag it is if you forget what a
particular color means.

You can also define your own tag types. See below CREATING CUSTOM TAG
TYPES for how to do that.

38) You can create really, really long tags as follows: Just create a
short version of the tag as above for where you want the tag to start.
Then figure out the consensus position of where you want the tag to
end. In the Aligned Reads Window, click on the short tag with the
right mouse button and release on 'tag: show more info?' (as above).
A Tag Window will appear for that tag. In the Tag Window, simply
change the End Unpadded Consensus Position to the place you want it to
end. Then click 'OK'. You will now notice that the tag will be as
long as you wanted.

39) You can create tags on the consensus in the same way. In the
Aligned Reads Window, use the middle mouse button to swipe some bases
on the consensus in the Aligned Reads Window. Up will pop a list of
tag types. Click on one of them. Try it again somewhere else. Try
it with the tag type being 'comment'. In this case, you must enter a
comment. Notice the pretty colors! If you forget what a particular
color means, you can click on the colored tag with the right mouse
button and it will tell you.

40) Try creating some tags that overlap each other. You will notice
that the overlapping region will be purple. If you want to know which
tags overlap, you can click with the right mouse button on the purple
and you will be told all tags that are on that base.

41) If you have many tags that overlap and thus are purple, you can
hide some less relevant tag types so there is less purple and there is
less distraction. Make sure you have a few tags visible. Then click
on 'Find Main Win'. In the Main Window, open the Options menu, and
release on 'Hide Some Tag Types'. A list of tag types will pop up.
Select the type that you have visible (above). Then click 'OK'. Go
back to the Aligned Reads Window. That tag should still be visible.
Click on the button 'Some Tags' in the upper right part of the Aligned
Reads Window. Your tag should disappear. The 'Some Tags' button
should have changed to 'Sh All Tags'. Click on it again. Your tags
should have reappeared.

42) Normally, when you re-assemble, phrap will name the contigs
differently--what was Contig31 before may become Contig32. To help
you know which contig is which, Consed allows you to give a name
(e.g., "A") to a contig which will persist after re-assembling. To
do this, swipe some consensus bases with the middle mouse button (as
above). When the "Select Tag Type" box pops up, click on
"contigName" and also type a name into the "Contig Name:" field and
then click "OK". The next time you re-assemble, the name "A" will
appear in the list of contigs on the Main Consed Window.

SEARCH FOR READ NAME

43) Restart Consed using the original ace file

standard.fasta.screen.ace.1

If it asks if you want to apply edits, just say 'no'.

Instead of clicking on a read or contig name, type a read name into
the 'Find read:' box. Try typing djs74-2. You will notice that as you
type each letter, the first item in the list that matches the letters
typed will be highlighted. Experiment with deleting a few letters and
typing others. This is a powerful method of quickly getting to the
read name you are interested in. When you get to the name in the
list, you do not have to type the rest of the name--just type carriage
return or else click on 'OK'.

Even more powerful is the "Find read (with *'s):". In this case you
can just type "2689" and then push the "Enter" key and Consed will
immediately bring up the Aligned Reads Window with the cursor on read
djs74-2689.s1. Suppose that there were more than one read that
matched? For example, suppose you type: "26" and then push the
"Enter" key. This matches 3 reads:

djs74-2689.s1
djs74-2679.s1
djs74-2664.s1

Try it and see what happens...

Try entering "26*9" and see what happens. What does the "*" mean?

ONLINE DOCUMENTATION

44) On the Aligned Reads Window, click on the 'Help' menu and release
on 'Show Documentation'. You will see this document. You can search
for keywords in it.

GOTO POSITION

45) In the Aligned Reads Window, click in the 'Pos:' box in the upper
right-hand corner. Type in a number, such as 540, and push the
'Return' or 'Enter' key. The Aligned Reads Window will scroll to
position 540. We find this feature is particularly useful when one
person wants another person to look at something in the sequence.

HIGHLIGHTING READ NAMES

46) In the Aligned Reads Window, click on a read name with the left
mouse button. The name will turn magenta. Click again and it will
turn yellow again. Try turning it magenta and then scrolling. This
feature is helpful in keeping track of a particular read as you scroll.

If you have an emacs window open (or any editor window), you can paste
the read name in by just clicking with the middle mouse button.
When you clicked on the read name in the Aligned Reads Window with the
left mouse button, the read name was loaded into the paste buffer.

COMPLEMENTING THE CONTIG

47) Push 'Comp Contig' in the Aligned Reads Window to complement the
contig. This displays the opposite strand of the contig including the
consensus and all reads. Push this button again to uncomplement it.

RECOVERY FROM CRASHES

48) It is important to feel that your data are safe, even if the
computer (or Consed) were to crash. Consed will recover your data
from such a crash.

Make an edit (remember, edits are made in the Trace Window) and jot
down its location. Also note the name of the ace file which is
displayed in the upper left box in the Aligned Reads Window. Then
simulate a crash by going to the xterm where you started Consed and
typing control-C. Restart Consed and select the same ace file you
noted (above). A box will come up saying 'There is an edit history (a
.wrk file) Consed may have crashed during a previous session with this
same file. Do you want to apply those edits?' Click on 'yes'. Go
and find the edits you made before Consed crashed--you will find them.

This is the purpose of the .wrk files--they are a log file of your
edits and they are added to as you make edits.

49) You should save your edits by pulling open the 'File' menu on the
Aligned Reads Window, and releasing on 'Save assembly'.

PROTEIN TRANSLATION AND OPEN READING FRAMES

50) If you would like, you can see the amino acid translation of the
consensus in all reading frames. In the Aligned Reads Window, push
down the left mouse button on the 'Misc' menu and release on 'Show Top
Strand Protein Translation'. Try again but this time release on 'Show
Bottom Strand Protein Translation'. Notice that there are 2
characters that are in magenta color. What are those characters? Why
are they made in a different color? To not show the protein
translation, push down the left mouse button on the 'Misc' menu and
release on 'Don't show protein translation'.

51) You can search for open reading frames within a contig. In the
Aligned Reads Window, push the left mouse button on 'Navigate' and
release on 'Search for Open Reading Frames'. Notice that the open
reading frames are shown for all 6 reading frames and are sorted by
length.
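
To make the idea of searching all 6 reading frames concrete, here is
a small Python sketch. It is not Consed's algorithm: the definition
of an open reading frame used here (an ATG followed by the first
in-frame stop codon), the function names, and the tuple layout are
all assumptions for illustration. The codon table is the standard
genetic code.

```python
from itertools import product

BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, with codons enumerated in t, c, a, g order.
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (a, c, g, t, n)."""
    pairs = {"a": "t", "c": "g", "g": "c", "t": "a", "n": "n"}
    return "".join(pairs[base] for base in reversed(seq.lower()))


def translate(seq):
    """Translate complete codons; codons not in the table become 'X'."""
    seq = seq.lower()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def find_orfs(seq):
    """Return (length, strand, frame, start, end) tuples, longest first.

    Positions are 1-based and refer to the strand being read. The end
    position includes the stop codon.
    """
    orfs = []
    for strand, text in (("top", seq.lower()),
                         ("bottom", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(text) - 2, 3):
                codon = text[pos:pos + 3]
                if start is None and codon == "atg":
                    start = pos
                elif start is not None and CODON_TABLE.get(codon) == "*":
                    orfs.append((pos + 3 - start, strand, frame + 1,
                                 start + 1, pos + 3))
                    start = None
    return sorted(orfs, key=lambda orf: orf[0], reverse=True)
```

For example, find_orfs("ccatggcttaaggg") reports a single 9-base
open reading frame in frame 3 of the top strand.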

ERROR RATE

52) In the Aligned Reads Window is a box (upper right) labelled
'Err/10kb'. This is the estimated error rate for this contig, and it
is a good indicator of when you are done (or not done) finishing.
In addition, you can find the error rate for a particular region of
contig as follows: Point at 'Misc' menu, hold down the left mouse
button, pull down and release on 'Show Error Info For Region'. Fill
in the boxes for left and right consensus position, click on
'Calculate' and you will be given the error and single subclone data
for that region.

RUNNING PHRED and PHRAP

phred and phrap *must* be run via the phredPhrap perl script. If you
run phred on its own, and then you run phrap on its own, you will get
an ace file that will not be usable by Consed, and you are on your
own. If you run into problems that way (and you probably will), then
do not email us--instead please use the phredPhrap script. To use the
phredPhrap script to run phred and phrap:

53) Type:
phredPhrap -V

It should say:
000726
(or newer).

If it does not, then you probably have not installed all the perl
scripts from the scripts directory, as directed in INSTALLING CONSED.

54) Make a copy of the standard dataset. E.g.,

cp -r standard test
cd test

55) Delete all the files in phd_dir and edit_dir:

rm phd_dir/*
rm edit_dir/*

56) cd edit_dir

57) Run phredPhrap by typing

phredPhrap

That's it--you no longer need to type *any* arguments, and generally
you should not. (Please do *not* use the -notags option any longer.)
If you want to add phrap options, you can do that:

e.g.,

phredPhrap -forcelevel 3

Then run Consed on the resulting ace file as indicated in the beginning of
the Quick Tour (above). If you have any problems, this is the time to
diagnose them before you use your own data.

COMMON PROBLEMS RUNNING PHREDPHRAP

58) Problems due to polyphred. To check this, in the phredPhrap
script, make sure the following line reads:

$bUsingPolyPhred = 0;

This will make polyphred not be used. If the problem then goes away,
you will know the problem has something to do with polyphred so do not
contact any of the phred/phrap/Consed people. Instead, contact the
polyphred people: http://droog.mbt.washington.edu and
dpc@u.washington.edu and debnick@u.washington.edu

59) Permission problems. Check that you have write access to the
phd_dir and edit_dir directories. You can do this by trying to create
a file in those directories:

touch ../phd_dir/xxx
which creates a file

ls -l ../phd_dir/xxx
which checks if the file was created.

Do the same with ../edit_dir/xxx

If you get a permission problem, do not contact me. UNIX permission
problems are very simple for anyone who knows UNIX--get someone
locally who understands UNIX and can help you solve the permission
problem.

----------------------------------------------------------------------------

USING AUTOFINISH

Note: Before you use Autofinish on your own data, you must modify
determineReadTypes.perl. See INSTALLING CONSED below for information
about this.

To do the exercises in this section, you must be able to edit a file
under UNIX and run a program under UNIX. If you can't do that, have
someone teach you.

60) cd to autofinish/edit_dir

61) Try starting Consed by typing:

../../consed -ace autofinish.fasta.screen.ace.1 -autofinish

(Note 'consed' above may be 'consed_solaris', 'consed_alpha',
'consed_hp', 'consed_sgi', or 'consed_linux' depending on your
executable. If you have trouble, use that 'ls' command (see above)! )

If Autofinish says:

Run-time exception error; current exception: InputDataError
No handler for exception.
Abort

that means that you have not followed the instructions under
'INSTALLING CONSED' below. Please follow those instructions and then
try this again.

Consed will create 7 files:

autofinish.fof
(project name).001014.155627.customPrimers
(project name).001014.155627.nav
(project name).001014.155627.out
(project name).001014.155627.sorted
(project name).001014.155627.univForwards
(project name).001014.155627.univReverses

The '001014.155627' is the date and time in format YYMMDD.HHMISS.
The first file, autofinish.fof, is a file of filenames. It contains
the names of the other files.
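
If you process these files with your own scripts, the date and time
can be recovered from the file name. The following Python sketch
assumes the name has exactly the form shown above, that is, (project
name), date, time, and extension separated by periods. The function
name and the example project name are made up for illustration.

```python
from datetime import datetime


def parse_autofinish_name(filename):
    """Split '<project>.<YYMMDD>.<HHMMSS>.<kind>' into its parts."""
    # Split from the right so that a project name containing periods
    # is kept in one piece.
    project, date_part, time_part, kind = filename.rsplit(".", 3)
    stamp = datetime.strptime(date_part + "." + time_part,
                              "%y%m%d.%H%M%S")
    return project, stamp, kind
```

For the example names above, the stamp 001014.155627 is read as
October 14, 2000 at 15:56:27.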

(project name).001014.155627.univForwards
is the summary file of the suggested universal forward subclone reads
(project name).001014.155627.univReverses
is the summary file of the suggested universal reverse subclone reads
(project name).001014.155627.customPrimers
is the summary file of the suggested custom primer reads

These are the files you will typically use for directing your bench
work. If you like, you can import these files into Excel since the
fields are separated by commas.

The .out file is the Autofinish output file. This is the most
important file to examine while you are evaluating Autofinish. If you
want to know *why* Autofinish picked the reads it did, it will tell
you. Consult this file before you start complaining about
Autofinish's choices. I've had people complain, and then, once they
look in the .out file, they learn information that persuades them that
Autofinish was correct all along. This is hard to over-emphasize, but
I will try:

CONSULT THIS FILE CAREFULLY IF YOU DISAGREE WITH SOME OF AUTOFINISH'S
CHOICES!

It will tell you lots more, such as the orientation of the contigs.
It will also tell you the value of all Autofinish parameters used. If you
try to customize one of the parameters, check in the .out file to be
sure that Autofinish used the value you intended.

The .sorted file gives the reads sorted by contig and position. This
file is useful if you want to find what reads Autofinish suggested for a
particular location.

The .nav file is a custom navigation file. This file allows a Consed
user to just click 'next', 'next', ... to review all of Autofinish's
suggestions in context. This is a great way to quickly and easily
review all of the reads suggested by Autofinish. See below "CUSTOM
NAVIGATION" for an explanation of this.

This finishing tool is designed to be run in batch after each
assembly. In a high throughput operation, the production people can
make these reads without anyone using Consed to examine the assembly
interactively. Only when Autofinish cannot help you any longer
(generally after 3 or more times of running Autofinish, making the
reads, and re-assembling), must you bring up Consed graphically and
examine the assembly.

We suggest that you write some of your own software to parse the
summary files to automatically order primers and reads. The summary
files (.customPrimers, .univForwards, .univReverses) will not change
much but the .out file is constantly changing, so don't try to parse
it.
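
As a starting point for such software, here is a minimal Python
sketch that reads one of the comma-separated summary files into a
list of rows. It makes no assumption about what each column means,
since the columns are not described in this section. The function
name and the file name in the usage note are hypothetical.

```python
import csv
import io


def read_summary(handle):
    """Return the non-empty rows of a comma-separated summary file."""
    return [row for row in csv.reader(handle) if row]
```

Typical use would be something like:

    with open("myproject.001014.155627.customPrimers") as handle:
        rows = read_summary(handle)

Inspect a few rows from your own files before relying on any
particular column position.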

AUTOFINISH: MINIMUM NUMBER OF ERRORS FIXED PER READ

62) By default, the minimum number of errors fixed by an experiment is
0.02

Human finishers typically look for low consensus quality
regions--regions that have one or more bases below a particular
quality threshold. However, Autofinish can do better: it can find
regions where the *total* number of errors is greater than some
particular cutoff value. This method can find regions where none of
the bases are low quality, but many are nearly low quality and thus
the total number of errors in the region is high. This is a better
criterion because it is the total number of errors that you are trying
to reduce when finishing.

Two bases of quality 20 have 0.02 errors (on average). Similarly, 20
bases of quality 30 have 0.02 errors (on average). (Quality values
were explained at the beginning of this document.) Suppose that you
want Autofinish to suggest an additional read for an area that even
just has one quality 20 base. (Be aware that Autofinish will consider
10 quality 30 bases to be just as severe as 1 quality 20 base since,
on average, they will both have precisely the same number of errors:
0.01)

In .consedrc, add the following line:

consed.autoFinishMinNumberOfErrorsFixedByAnExp: 0.01

Then run Autofinish again:

consed -ace autofinish.fasta.screen.ace.1 -autofinish

Look at the files just created and check that
consed.autoFinishMinNumberOfErrorsFixedByAnExp is indeed 0.01 by
looking in the .out file.

Then compare the .sorted files from this run of Autofinish and the
previous run of Autofinish, in which the
consed.autoFinishMinNumberOfErrorsFixedByAnExp value was 0.02. You will
notice that there is an additional read suggested when the parameter
is 0.01. This read is a resequence with dye terminator chemistry of
the djs228_474 template. Look at the .out file to see why Autofinish
chose this read. It will indicate that it is mainly to fix 0.01
errors in the region from 2536 to 2545.

Bring up Consed to see what is in that 10 base region. You will see
that there is a quality 25 base at 2539 and a quality 21 base at 2540.
After that come some bases whose qualities are in the high 30s.

In the Aligned Reads Window, point at the Misc menu, hold down the
left mouse button, and release on 'Show Error Info For Region'. Enter
2539 and 2549 for the "Left Consensus Position of Region" and "Right
Consensus Position of Region" respectively and click on "Calculate".
You will see that there are 0.0135 errors in this region. This is less
than 0.02 so Autofinish will not try to fix this region unless you
reduce consed.autoFinishMinNumberOfErrorsFixedByAnExp to 0.01

The default is 0.02 because most labs do not want to fix regions that
have less than 0.02 errors.
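
The arithmetic in this step can be checked with a few lines of
Python. The sketch uses the standard phred relationship, in which a
base of quality q has error probability 10^(-q/10), and adds those
probabilities over a region. The function names are made up for
illustration, and the comparison against a cutoff is a simplified
stand-in for whatever Autofinish does internally.

```python
def error_probability(quality):
    """Error probability of one base with the given phred quality."""
    return 10 ** (-quality / 10.0)


def expected_errors(qualities):
    """Expected number of errors in a region (sum over its bases)."""
    return sum(error_probability(q) for q in qualities)


def region_exceeds_cutoff(qualities, min_errors=0.02):
    """True if the region's expected errors reach the cutoff."""
    return expected_errors(qualities) >= min_errors
```

Two quality 20 bases and twenty quality 30 bases both come to 0.02
expected errors. A quality 25 base next to a quality 21 base comes to
about 0.011, which is above a cutoff of 0.01 but below the default of
0.02.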

63) DIVERSION: UNIX LESSON

Note for UNIX novices: Earlier, I said that you only needed to know 3
UNIX commands: pwd, ls, and cd. Now I want you to learn one more:

ls -tlr

This is the same as ls, but it puts one file on a line and prints the
lines so that the most recent files are on the bottom. Since you will
be creating many, many files as you work through these Autofinish
exercises, this command gives an easy way to see the files you have
just created, without having to always look at autofinish.fof to look
for the names of the files you just created.
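
If you later script these exercises, the same ordering (most recent
files last) can be produced in Python. This is only a sketch of the
idea behind ls -tlr, not a replacement for it, and the function name
is made up.

```python
import os


def files_oldest_first(directory="."):
    """List regular files in a directory, most recently modified last."""
    paths = [os.path.join(directory, name)
             for name in os.listdir(directory)]
    files = [path for path in paths if os.path.isfile(path)]
    return sorted(files, key=os.path.getmtime)
```

The last entries of the returned list are the files you have just
created.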

AUTOFINISH: CHANGING MELTING TEMPERATURES

64) Look near the top of the .out file and you will see the following
lines:

consed.primersMinMeltingTemp: 55
consed.primersMaxMeltingTemp: 60

Some labs prefer to use primers with lower melting temperatures. In
your .consedrc file, put the following lines:

consed.primersMinMeltingTemp: 50
consed.primersMaxMeltingTemp: 55

Then run Autofinish again and check that it now says:

consed.primersMinMeltingTemp: 50
consed.primersMaxMeltingTemp: 55

near the top of the .out file that was just created.
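
This check can be automated. The Python sketch below collects lines
of the form shown above (consed.name: value) and compares them with
the values you intended. It assumes the .out file prints the
resources exactly in that form. The function names are hypothetical,
and since the rest of the .out file changes between versions, the
sketch deliberately ignores every other line.

```python
def read_resources(lines):
    """Collect 'consed.<name>: <value>' lines into a dict of strings."""
    resources = {}
    for line in lines:
        line = line.strip()
        if line.startswith("consed.") and ":" in line:
            name, value = line.split(":", 1)
            resources[name.strip()] = value.strip()
    return resources


def mismatches(resources, expected):
    """Return {name: (wanted, found)} for values that differ."""
    return {name: (wanted, resources.get(name))
            for name, wanted in expected.items()
            if resources.get(name) != wanted}
```

An empty result from mismatches() means Autofinish used the values
you intended.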

Compare the .sorted files from this run of Autofinish and the previous
run. The differences should be the custom primer reads:

The previous .sorted file had:
tcttttgtctttccatatacatttt,56
which means the melting temperature is 56.

The latest .sorted file had:
cattttagaatcagtttgttg,50
which means the melting temperature is 50.

The other custom primer read also changed.

65) AUTOFINISH: JUST CLOSING GAPS

You could use Autofinish to just close gaps. Add the following to the
.consedrc file:

consed.autoFinishCoverLowConsensusQualityRegions: false
consed.autoFinishCoverSingleSubcloneRegions: false

and run Autofinish again.

Now you should see in the .sorted file just 4 reads: one custom
primer read pointing out the left end of the contig and 3 reverses off
the left end of the contig. The right end is not extended because
Autofinish recognizes that it is the end of the BAC.

You can change any of the parameters listed at the top of the
Autofinish output file (or actually any of the more exhaustive list of
resources listed in the 'Info' menu, 'Show Consed Resources' list.)

We believe the defaults are an excellent starting point.

66) AUTOFINISH: NOT REPEATING FAILED EXPERIMENTS

If you are serious about doing the experiments Autofinish
suggests, run:

consed -ace (ace file name) -autofinish -doExperiments

-doExperiments causes Autofinish to record its suggestions in the ace
file. If one of these suggested reads fails to fix a problem, when
Autofinish is run again it won't pick the same read again.

If a forward or reverse universal primer read failed, Autofinish (when
run in a subsequent round) will not suggest that same experiment. If
a custom primer read fails, Autofinish will not pick that same
experiment again, and it won't pick a custom primer read that is even
close to the failed one. 'Close' is defined by the resource:

consed.autoFinishNewCustomPrimerReadThisFarFromOldCustomPrimerRead: 50

You can change the default of 50 if you like.

In addition, Autofinish (the next time it is run) will tell you how
well each experiment did in solving the problem it was intended to
solve.

See the

EVALUATING EXPERIMENTS

section of the Autofinish .out file.

(Note to programmers: the format of the autoFinishExp tags is likely
to change--parse them at your peril!)

-doExperiments will also cause oligos to be tagged. You can turn
this off by setting:

consed.autoFinishTagOligosWhenDoExperiments: false

Primer id's created by Autofinish use the same naming scheme as
primers created in Consed and they will not conflict with each other.
For example, if Autofinish creates oligos djs14.1, djs14.2, and
djs14.3, then the next primer that a user accepts will be djs14.4. If
Autofinish is run a second time, it will start with primer djs14.5.
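
The numbering in this example can be sketched in Python as follows.
This only illustrates the behavior described above (the next id is
one more than the highest existing number for that prefix). It is
not Consed's code, and the function name is made up.

```python
def next_oligo_name(prefix, existing_names):
    """Return the next '<prefix>.<n>' name after the highest one in use."""
    highest = 0
    for name in existing_names:
        head, _, tail = name.rpartition(".")
        # Only names of the form '<prefix>.<digits>' are counted.
        if head == prefix and tail.isdigit():
            highest = max(highest, int(tail))
    return "%s.%d" % (prefix, highest + 1)
```

With djs14.1, djs14.2, and djs14.3 already present, the next name is
djs14.4, as in the example above.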

You should not type '-doExperiments' if you do not intend to do the
experiments Autofinish suggests. If you use -doExperiments, but you
don't really do the experiments, and then you run Autofinish again,
Autofinish will be very upset--it will think that all of its suggested
experiments failed (because it can't find them). It will see that all
of the problems are still present but it will think that it should not
choose any of those same experiments again so it will suggest
different experiments that will not be as good as its original
suggestions.

67) AUTOFINISH: doNotFinish particular regions

If there is a region that you don't care to finish (e.g., it has
already been finished by an overlapping clone or you know there is no
gene there), then you can put a doNotFinish tag on the consensus and
Autofinish will not try to finish this area.

First, delete the .consedrc file and run Autofinish again.

Now put a doNotFinish tag on the region from 2000 to 4000. (If you
don't know how to do that, read through the Consed Quick Tour, above.)
Save the assembly as autofinish.fasta.screen.ace.2

Run Autofinish again:

consed -ace autofinish.fasta.screen.ace.2 -autofinish

Look at the .sorted file. You will notice that, other than the
experiments to extend the contig to the left, there is only one
experiment which is from 315 to 1662. If you find that experiment in
the .out file, it will say that this is mainly to fix errors from 969
to 978. If you look with Consed, you will see that there is a quality
12 base at 974.

68) AUTOFINISH: NOT USING PARTICULAR SUBCLONE TEMPLATES

If you no longer have a template that was used in shotgun, and thus
you don't want Autofinish to pick that template, you can put it in a
file badTemplates.txt in edit_dir. This is a simple file with one
name per line.

In addition to the badTemplates.txt file, you can use a
badLibraries.txt file which contains a list of all libraries that are
off-limits to Autofinish (e.g., you threw away all subclone templates
from this library). Autofinish determines the library of a read by
the following in the PHD file:

WR{
template determineReadTypes 990603:090231
name: djs366_101
lib: library1
}

where "library1" is replaced by the actual library name.
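
The lookup described above can be sketched in a few lines of Python.
This is only an illustration, not Autofinish's own code: the function
names template_info and is_off_limits are hypothetical, and the sketch
assumes that each WR item has the 'WR{', header line, 'key: value'
lines, '}' layout shown above.

```python
import re


def template_info(phd_text):
    """Return (template name, library) from the 'template' WR item, or None."""
    # Assumed layout: "WR{", a header line, "key: value" lines, then "}".
    for body in re.findall(r"WR\{\s*\n(.*?)\n\}", phd_text, flags=re.DOTALL):
        lines = [ln.strip() for ln in body.splitlines() if ln.strip()]
        if not lines or not lines[0].startswith("template "):
            continue  # e.g. a 'primer' WR item
        fields = {}
        for ln in lines[1:]:
            key, sep, value = ln.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        return fields.get("name"), fields.get("lib")
    return None


def is_off_limits(phd_text, bad_templates, bad_libraries):
    """True if the read's template or library is on either exclusion list."""
    info = template_info(phd_text)
    if info is None:
        return False
    name, lib = info
    return name in bad_templates or lib in bad_libraries
```

Here bad_templates and bad_libraries would be the sets of names read
(one per line) from badTemplates.txt and badLibraries.txt.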

69) MULTIPLE LIBRARIES WITH DIFFERENT INSERT SIZES

If some of your subclone templates have small inserts and some of your
subclone templates have large inserts, Autofinish must know which is
which. Modify determineReadTypes.perl so that it puts the library
name into the phd file like this:

WR{
template phredPhrap 990224:045110
name: ab08a29
lib: ab08
}

where ab08a29 is the name of the subclone template and ab08 is the
name of the library it came from.

Then you must construct a file in the same directory as the ace file
called 'librariesInfo.txt' that lists the insert sizes of the
different libraries like this:

LIB{
name: ab08
avgInsertSize: 1500
maxInsertSize: 3000
stranded: double
cost: 600.0
}

LIB{
name: ab09
avgInsertSize: 1500
maxInsertSize: 3000
stranded: double
cost: 600.0
}

LIB{
name: ab10
avgInsertSize: 3000
maxInsertSize: 5000
stranded: single
cost: 1200.0
}

'name' is the name of the library. This is the name that goes into
the PHD files after the 'lib:' keyword. 'avgInsertSize' is the
average insert size of the library--the figure to be used by
Autofinish if there are not enough forward/reverse pairs.
'maxInsertSize' is the maximum insert size--if forward/reverse pairs
are further apart than this, Autofinish will assume these reads are
misassembled. 'stranded' is whether this template is single or double
stranded. 'cost' is the cost of making a minilibrary out of a
template from this library.

For help in debugging your use of the librariesInfo.txt file, on
Consed's Main Window, point to 'Info', hold down the left mouse
button, and release on 'Show Library Info'. You should see the names
of your libraries and the correct number of reads in each library.
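
If you have many libraries, you may prefer to generate
librariesInfo.txt rather than type it. The following Python sketch
writes LIB{ } blocks in the layout shown above. The function names are
hypothetical, and the sketch assumes the five fields listed above are
all that is needed.

```python
def format_library(name, avg_insert, max_insert, stranded, cost):
    """Return one LIB{ } block in the layout shown in this section."""
    return (
        "LIB{\n"
        f"name: {name}\n"
        f"avgInsertSize: {avg_insert}\n"
        f"maxInsertSize: {max_insert}\n"
        f"stranded: {stranded}\n"
        f"cost: {cost:.1f}\n"
        "}\n"
    )


def format_libraries_info(libraries):
    """Join several blocks, separated by a blank line."""
    return "\n".join(format_library(*lib) for lib in libraries)


# The three libraries from the example above.
EXAMPLE_LIBRARIES = [
    ("ab08", 1500, 3000, "double", 600),
    ("ab09", 1500, 3000, "double", 600),
    ("ab10", 3000, 5000, "single", 1200),
]
```

Writing format_libraries_info(EXAMPLE_LIBRARIES) to librariesInfo.txt
in the directory of the ace file should reproduce the example above.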

70) AUTOFINISH: TOO MANY UNIVERSAL PRIMER READS

St Louis wanted more universal primer reads, so I put in a feature
that allows for redundant universal primer reads. If you get too
many for your taste, then put this into your .consedrc file:

consed.autoFinishRedundancy: 1.0

The default is 2.0, meaning that Autofinish will try to fix every
problem area twice--once by some universal primer reads and once again
by other universal primer reads. Then, and only then, will it try
oligo walks to finish remaining problems.

Baylor wanted more reverses to close gaps, so I put a feature into
Autofinish that calls *all* reverses near gaps:

(contig) ___________________________

<- reverse 1
<- reverse 2
<- reverse 3
<- reverse 4

(including reverses that are likely to fall into the gap) in the hope
that enough of them will hook onto each other that the gap will be
closed. (If there is already a reverse pointing out but no forward,
Autofinish will suggest the forward.) If this feature gives you too
many reverses for your taste, then in your .consedrc file:

consed.autoFinishNearGapsSuggestEachMissingReadOfReadPairs: false

71) AUTOFINISH CLOSING GAPS WITH MINILIBRARIES

Use the following parameters:

consed.autoFinishAllowMinilibraries: true

consed.autoFinishPrintMinilibrariesSummaryFile: true

This will cause Autofinish to print a file with name similar to:
(project name).001014.155627.minilibraries

The following parameter can be set to true or false, depending on your
preference:

consed.autoFinishAlwaysCloseGapsUsingMinilibraries: false

If the parameter above is set to false, then Autofinish will only
choose minilibraries if the gap is the size below or larger:

consed.autoFinishSuggestMinilibraryIfGapThisManyBasesOrLarger: 800

Autofinish can suggest more than one minilibrary per gap:

consed.autoFinishSuggestThisManyMinilibrariesPerGap: 2
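
One way to read how these three parameters interact is sketched below
in Python. This is an interpretation of the descriptions above, not
Autofinish's actual logic. In particular, it assumes that setting
consed.autoFinishAlwaysCloseGapsUsingMinilibraries to true means the
gap size is not checked at all. The function name is hypothetical.

```python
def minilibraries_for_gap(gap_size, always_close=False,
                          min_gap=800, per_gap=2):
    """Return how many minilibraries would be suggested for one gap.

    always_close: autoFinishAlwaysCloseGapsUsingMinilibraries
    min_gap:      autoFinishSuggestMinilibraryIfGapThisManyBasesOrLarger
    per_gap:      autoFinishSuggestThisManyMinilibrariesPerGap
    """
    if always_close or gap_size >= min_gap:
        return per_gap
    return 0
```

With the values shown above, a gap estimated at 800 bases or more
would get 2 minilibraries and a smaller gap would get none.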

72) AUTOFINISH FOR CDNA ASSEMBLIES

The way to use Autofinish for cDNA assemblies is to pretend that the
cDNA is a BAC and that you are only going to allow whole clone custom
primer BAC reads. To do this, put the following into your .consedrc
file:

consed.autoFinishAllowResequencingReads: false
consed.autoFinishAllowWholeCloneReads: true
consed.autoFinishAllowCustomPrimerSubcloneReads: false
consed.autoFinishAllowDeNovoUniversalPrimerSubcloneReads: false
consed.autoFinishCDNANotGenomic: true
consed.autoFinishCheckThatReadsFromTheSameTemplateAreConsistent: false
consed.checkIfTooManyWalks: true
consed.autoFinishExcludeContigIfOnlyThisManyReadsOrLess: 0
consed.autoFinishExcludeContigIfDepthOfCoverageOutOfLine: false
consed.autoFinishExcludeContigIfThisManyBasesOrLess: 0
consed.autoFinishCoverSingleSubcloneRegions: false
consed.autoFinishContinueEvenThoughReadInfoDoesNotMakeSense: true
consed.autoFinishCallReversesToFlankGaps: false

You don't want Autofinish to try to extend off the 3' end or the 5'
end of the cDNA, right? How is Autofinish going to determine that?
It determines it as follows:

In the 5' end read, put the following into the phd file:

WR{
primer determineReadTypes 001019:112654
type: univ fwd
}

WR{
template determineReadTypes 001019:112654
name: cDNA1
}

In the 3' end read (the read that is primed off the polyA tail), put
the following into the phd file:

WR{
primer determineReadTypes 001019:112654
type: univ rev
}

WR{
template determineReadTypes 001019:112654
name: cDNA1
}

For all other reads, such as transposon reads and custom primer walks,
put the following into the phd file:

WR{
primer dscript 001019:112654
type: walk
}

WR{
template determineReadTypes 001019:112654
name: cDNA2
type: bac
}

If you are going to finish many cDNA's, you will find it will work
better to modify determineReadTypes.perl than to go editing every phd
file.

So Autofinish finds the univ fwd read and assumes it indicates the 5'
end of the cDNA and it finds the univ rev read and assumes it
indicates the 3' end of the cDNA. (The resource
consed.autoFinishCDNANotGenomic: true
tells it to try to find the end of the cDNA in this manner.)
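
If you script this rather than edit each phd file by hand, the WR
items above can be generated as follows. This Python sketch is only an
illustration: the function name and the labels '5prime', '3prime', and
'internal' are hypothetical, and the text it produces simply mirrors
the three examples above.

```python
PRIMER_TYPES = {
    "5prime": "univ fwd",    # read at the 5' end of the cDNA
    "3prime": "univ rev",    # read primed off the polyA tail
    "internal": "walk",      # transposon reads, custom primer walks
}


def cdna_wr_items(end, template, stamp, program="determineReadTypes"):
    """Return the primer and template WR items for one cDNA read."""
    template_lines = [f"name: {template}"]
    if end == "internal":
        # The example above marks the template of such reads as a bac.
        template_lines.append("type: bac")
    return (
        f"WR{{\nprimer {program} {stamp}\ntype: {PRIMER_TYPES[end]}\n}}\n\n"
        f"WR{{\ntemplate {program} {stamp}\n"
        + "\n".join(template_lines)
        + "\n}\n"
    )
```

For example, cdna_wr_items("5prime", "cDNA1", "001019:112654") gives
the two WR items shown above for the 5' end read.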

There is one additional problem when using Autofinish for cDNA
assemblies: initially, the ace file created by phrap is empty since
the 3' and 5' reads don't overlap enough. You have *no* contigs for
Autofinish to finish. So phrap is of no use initially.

But you can use Consed to create the
assembly:

First run phredPhrap to phred both reads and run
determineReadTypes.perl. Then pick the 3' read and run phd2Ace.perl on
it:

phd2Ace.perl (name of phd file)

This will give you an ace file with one read in it.

Now suppose that you have other reads from the same cDNA. You can use
this technique to add them to the ace file:

To add all the reads phrap has neglected to put into the ace file, do
the following:

1. create a file of phd filenames. E.g.,

djs74_1180.s1.phd.1
djs74_1432.s1.phd.1
djs74_1455.s1.phd.1
djs74_1465.s1.phd.1
djs74_1532.s1.phd.1
djs74_1802.s1.phd.1
djs74_1803.s1.phd.1

Typically, you will get this list of phd files by looking in the
singlets file.

2. Then run consed:

consed -ace old_ace.ace -addReads fileOfPhdFiles.txt -newAceFilename new_ace.ace

where:
fileOfPhdFiles.txt is the name of the file (above) containing
the phd filenames
new_ace.ace is whatever you want the new ace file to be named
old_ace.ace is the name of the old ace file
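
The list of phd filenames can be built from the singlets file with a
short script. The Python sketch below assumes that the singlets file
is in FASTA format with the read name as the first word of each '>'
line, and that you want version 1 of each phd file. Check both
assumptions against your own data. The function name is hypothetical.

```python
def phd_names_from_singlets(singlets_text, version=1):
    """Return one phd filename per read found in the singlets text."""
    names = []
    for line in singlets_text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                # The read name is assumed to be the first word.
                names.append(f"{tokens[0]}.phd.{version}")
    return names
```

Write the returned names, one per line, to fileOfPhdFiles.txt.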

Now you have an ace file that contains all the reads you have
sequenced for that cDNA. You can now run Autofinish on it:

consed -ace new_ace.ace -autofinish

----------------------------------------------------------------------------

ADVANCED PHRAP/CONSED USAGE

73) BACKING OUT EDITS AFTER YOU HAVE SAVED THE ASSEMBLY

If you decide that all your edits are terrible and you want to start
over (perhaps you have been training a new finisher), the cleanest
solution is to delete everything in phd_dir and edit_dir, but leave
everything in chromat_dir and just run
phredPhrap again.

74) SELECTIVELY BACKING OUT EDITS AND REMOVING READS

If you want to back out all edits in just particular reads, I have
provided a perl script to do this:

revertToUneditedRead (read name)

What it does is copy the .phd.1 file to a new version whose number is
1 greater than the highest existing version.

Then you must reassemble using the phredPhrap script to create an ace
file that has no edits for that particular read. It will have all
edits for all other reads.

Why doesn't it just delete all phd files except for the
.phd.1? In that case, Consed could not read any previous ace file
since all previous versions of ace files would refer to phd files that
have been deleted.
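
The version rule can be illustrated with a short Python sketch. This
is not the revertToUneditedRead script itself. It only shows which
file would be copied to which name, given the phd filenames present in
phd_dir. The function name is hypothetical.

```python
import re


def revert_copy_plan(read_name, phd_filenames):
    """Return (source, destination) for reverting one read."""
    pattern = re.compile(re.escape(read_name) + r"\.phd\.(\d+)$")
    versions = []
    for filename in phd_filenames:
        match = pattern.match(filename)
        if match:
            versions.append(int(match.group(1)))
    if 1 not in versions:
        raise ValueError(f"{read_name}.phd.1 not found")
    # The unedited calls (.phd.1) become the newest version.
    return f"{read_name}.phd.1", f"{read_name}.phd.{max(versions) + 1}"
```

For a read with versions .phd.1 through .phd.3, the plan is to copy
.phd.1 to .phd.4, leaving the earlier versions in place so that older
ace files can still be read.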

75) REMOVING READS FROM AN ASSEMBLY

Create a file containing the filenames of all the reads you want to
remove, one filename per line.
Then use the perl script

removeReads <file of filenames>

Then reassemble using the phredPhrap script.

76) ADDING READS WITHOUT CHROMATOGRAM FILES

This may happen if you, for example, download sequence from Genbank
and want to assemble it along with your reads.

There are 2 ways to do this, depending on whether you want to edit the
read or not.

a) If you want to edit the read, run mktrace to produce a fake trace. It
will have all perfect peaks.

Run:

mktrace (name of file with fasta sequence)

Then run the phredPhrap script normally. You will be able to bring up
the traces in Consed and edit the read.

b) If it is not important to edit the reads, there is a method that
is a little faster. Create just a fake phd file using:

fasta2Phd.perl (name of file with fasta sequence)

It will create a file whose name is taken from the fasta file name:
for example, if the fasta filename is Contig1.c.fasta, then the phd file
will be called Contig1.c.phd.1. The fasta name in the file is ignored.
You can then put this in the phd_dir, and reassemble using the
phredPhrap script.

If the reads are really fake (you don't want the templates to be
chosen by Consed/Autofinish as a template for a primer), then the read
should end with an extension .c or .a or .c1 or
.c2 ... or .a1 or .a2 or ... This indicates to Consed/Autofinish
that the read is a fake read.
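
If you want to check your read names before assembling, the extension
rule above can be written as a regular expression. The Python sketch
below is an illustration of that rule only, not Consed's own test. The
function name is hypothetical.

```python
import re

# ".c" or ".a", optionally followed by digits, at the end of the name.
_FAKE_EXTENSION = re.compile(r"\.[ac]\d*$")


def is_fake_read(read_name):
    """True if the read name ends with .c, .a, .c1, .a2, and so on."""
    return bool(_FAKE_EXTENSION.search(read_name))
```

Pass the read name without the .phd.1 suffix, e.g. Contig1.c rather
than Contig1.c.phd.1.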

Note: when you are creating phd files such as this, you must start with
(read name).phd.1. Do not start with (read name).phd.2 or any higher
version number. This is because Consed looks for the .1 version in
order to find the original phred calls so it expects there to be a .1
version.

77) FASTER CONSED STARTUP

You can greatly speed up Consed startup if you are willing to use more
disk space. The disk space used will be about equal to the total
space used by the PHD files. Try this with a large dataset (you won't
notice any difference with the test datasets that come with Consed.)

To use this method of startup:

1) cd to directory where ace file is kept
2) type: catPhdFiles.perl
3) start consed normally

In many situations, this will greatly speed up Consed startup. The
amount of speedup depends on which operating system is used: on Linux,
the time to read phd files dropped from 75 seconds to 8 seconds, and
thus the total time to start up consed dropped from 86 seconds to 17
seconds. I saw similar speedups on Solaris where the phd files are on
an nfs mounted disk. However, there was another situation in which
the startup time was the same.

78) WHY ARE ALL THE READS NOT IN THE ASSEMBLY?

You will notice that there are some contigs that contain only one
read. You will also notice that there are some reads that are not
shown by Consed at all, since phrap did not put them into the ace
file. Why?

If a read does not have a significant match (with Smith-Waterman score
exceeding minscore) to any other read, that read is not included in
the ace file. Instead, that read is put in the '.singlets' file.
That read will not appear in Consed.

If a read does have a significant match to any other read, then it
will appear in the ace file and be shown by Consed. However, such a
read might have other problems: it might not be possible to assemble
such a read with other reads. In the case of EST's, this read may be a
unique representative of a particular gene (or a genomic sequence
contaminant) that happens to contain an Alu repeat and thus happens to
match other reads in the data set; or it may represent the only read
of a particular alternatively spliced form; or it may have data
anomalies of some sort (chimeras, etc.). Such a read would end up in
a contig all of its own.

79) ARE THERE READS THAT ARE TOTALLY UNALIGNED?

Unfortunately, yes. In my opinion, Phrap shouldn't have put them in
the assembly at all. But we just have to live with it. You can find
if a read is totally unaligned by pointing to the read name in the
Aligned Reads Window and holding down the right mouse button. Consed
will tell you the aligned positions, the high quality position, and
the chemistry of the read.

80) VIEWING THE CHROMATOGRAM OF SINGLETS OR NON-ASSEMBLED READS

If you have a chromatogram, you can use Consed to view it, even if it
hasn't been assembled into the ace file. This is common with cDNA
assemblies in which the reads don't overlap and thus phrap doesn't put
them together into a contig.

To do this, make the same edit_dir, phd_dir,
and chromat_dir as above, put the chromatogram into chromat_dir, run
phred on it to generate the phd file which goes into phd_dir.

Then go to edit_dir and run:

phd2Ace.perl (name of phd file)

For example, if your phd file is myRead.phd.1
from edit_dir, type:

phd2Ace.perl myRead.phd.1

This will produce myRead.ace

Then just start Consed normally:
consed -ace myRead.ace
and you can view the chromatogram.

MULTIPLE TRACE POPUP

81) Bring up dataset standard. In the Aligned Reads window, scroll to
a region that has many reads and that has some discrepancies--try
position 1162. Hold down the shift key, and click with the middle
mouse button on the consensus. At this location 3 traces will
pop up--these are the 2 highest quality traces that agree with the
consensus (on each strand) and the highest quality trace that
disagrees with the consensus. This feature is useful in areas of high
coverage when you want to rapidly examine just the most significant
traces rather than looking at all of them.

MAXIMUM NUMBER OF TRACES DISPLAYED

82) Bring up dataset standard. Scroll to position 1162. Bring up 4
reads and then try bringing up additional reads. You will notice that
new reads are put at the top of the stack of traces and, once there
are 4 traces displayed, traces are automatically removed from the
bottom of the stack. If you want to change this maximum number of
traces to something besides 4, you can do that: In the Main Consed
Window (click on 'Find Main Win' on the Aligned Reads window), pull
down the 'Options' menu, and release on 'General Preferences'. Try
changing the 'Max Number of Traces Shown' to 3. Then click 'Apply and
Dismiss'. Now dismiss the Trace Window and again start adding
additional traces to the Trace Window. You will notice that now the
number of traces shown will not exceed 3.

HOTKEYS FOR EDITING

83) If you do a lot of editing, you will want to have a faster method
of doing these edits than having the popup and selecting an option.
Thus the following hot keys exist:

< and > (less than and greater than) to make n's to the left
and the right (respectively) of the cursor
control-l and control-r to make low quality to the left and
the right (respectively) of the cursor
overstriking with a capital letter (e.g., C instead of c) causes
the base to become high quality rather than low quality
overstriking with a lower case letter causes the base to become
low quality

Give these a try.

84) Now go to the menu labelled 'color', and pulldown and release on
'color means match'.

Now you notice different colors: The
colors have the following meaning:

Blue: agrees with consensus
Orange: disagrees with consensus
Yellow: this stretch of this read was used to form the consensus
Grey: Low quality or unaligned ends of reads

Now go back to the colormode 'color means quality and tags' (the
default) for the next exercise.

(The other colormodes will mean more to you later.)

SCROLLING TRACES INDEPENDENTLY

85) Dismiss all of your Trace Windows. Then pop up traces for 2
different reads in approximately the same location. Scroll one of
them. You may want to scroll by clicking the arrows or clicking to
the left or right of the thumb. You will notice that both will
scroll. Consed will do its best to have corresponding peaks lined up.
(Consed can't line all of them up because the peak spacing is not
uniform and differs from read to read.) Try removing a trace by
clicking on one of the 'Remove' buttons in the Trace Window. Try
adding other traces. Then click on 'No' for scrolling the traces
together and try scrolling. You will now observe that they scroll
separately.

ABI BASE CALLS

86) If you want to see the ABI base calls, no problem. Just go to the
Main Consed Window. Pull down the 'Options' menu and release on
'General Preferences'. Click on 'True' for 'Show ABI Bases in Trace
Window' and then click 'OK' at the bottom of the window. The ABI
bases will not be shown immediately--you must first dismiss the trace
window and bring it up again. You will then see an additional line
with the ABI base calls.

MEASURING ERROR RATE AND SINGLE SUBCLONE BASES FOR A REGION

87) Some contigs have long tails of low quality bases and you would
like to find out the error rate for the contig without that long
tail. On the Aligned Reads Window, pull down the Misc menu, and release
on 'Show Errors for a Region'. This will tell you both the error rate
for the region and the number of single subclone bases for that region.

88) LONG, LONG, LONG READ NAMES

If you have very long read names, you might not be able to see the
whole name in the Aligned Reads Window. To fix this, go to the Main
Consed Window, pulldown the 'Options' menu and release on 'General
Preferences'. Scroll down until you see "Max Chars for Read Names in
Aligned Reads Window". Increase the number and click on "Apply".
When you are satisfied with how the read names look in the Aligned
Reads Window, click on "Cancel" in the General Preferences Window.

You can make this change permanent with the resource:

consed.alignedReadsWindowMaxCharsForReadNames: 20

(see CONSED CUSTOMIZATION)

89) PREVENTING 2 USERS FROM MAKING CONFLICTING EDITS

If there are 2 users that are both editing in the same directory,
there is the possibility they will both make edits to the same read.
Whoever saves their assembly last will wipe out the edits of the other
person, even if they were using different ace files. To help prevent
this, consed can warn you if someone else is making edits in the same
directory. Set the consed parameter:

consed.onlyAllowOneReadWriteConsedAtATime: true

The default is "false" so you have to turn this to true to make it
work (see CONSED CUSTOMIZATION).

This will usually work even if the 2 users are on different computers
(and the directory is nfs-mounted between them) and even if the
different computers have different operating systems. I've tested the
following combinations:
user 1 on Solaris; user 2 on Solaris
user 1 on Linux; user 2 on Linux
user 1 on Solaris; user 2 on Alpha (Digital Unix)
user 1 on Linux; user 2 on Solaris <--- does not work

Only the last combination doesn't work.

90) PRINTING CONSED WINDOWS

There is a free (or nearly free) program called "xv" (one web site is
http://www.trilon.com/xv). It is written by one of that dying breed of
UNIX programmers who just *loved* UNIX and programming and sharing it.
His web site is enjoyable because some of his passion comes through.
With xv, you can make a postscript file from a Consed window. Then
you can print the postscript file on a color printer.

However, since some Consed windows are mostly black (Aligned Reads
Window and Traces Window), this uses up a lot of toner and is
difficult to read. So go to the Main Consed Window, pulldown the
'Options' menu and release on 'General Preferences'. Scroll down to
"Make light background in Aligned Reads Window..." and click on "Do it
now". Dismiss any Aligned Reads Windows or Traces Windows and then
bring them back up. You will notice the light background. A few
other things (traces colors and thickness) are also customized for
making color prints.

------------------------------------------------------------------------

INSTALLING CONSED

You MUST have the following versions of phred, phrap, phd2fasta, and
crossmatch in order to use this version of Consed:

000925.c or later for phred
0.990319 or later for phrap and crossmatch
0.990622.d or later for phd2fasta (supplied with this version of consed)
000802 or later for addReads2Consed.perl (supplied with this version
of consed)
000726 or later for phredPhrap (supplied with this version of consed)
990823 or later for transferConsensusTags.perl (supplied with this
version of consed)
000727 or later for tagRepeats.perl (supplied with this version of consed)
001205 or later for determineReadTypes.perl (or your own custom
modified version)

For phred, contact bge@u.washington.edu (Brent Ewing)

For phrap and crossmatch, contact phg@u.washington.edu (Phil Green)

In order to run the gauntlet of phred/phd2fasta/crossmatch/phrap,
there is a perl script phredPhrap supplied with Consed (above). YOU
MUST USE THIS PERL SCRIPT. If you try to run each of these programs
directly, you are on your own and you will probably spend a lot of
time needlessly.

91) Follow the first few steps of USING CONSED GRAPHICALLY of the
Quick Tour (above). If you have problems, it may be due to your X
emulator. See 'MONITORS AND MICE FOR CONSED' below.

92) I suggest you put Consed, phred, crossmatch, phrap, the perl
scripts, and other executables into /usr/local/genome/bin. So create
/usr/local/genome/bin and /usr/local/genome/lib

If you can't actually use /usr/local/genome, then you could make
/usr/local/genome be a link to the real location--that will work just
as well.

If you want to have another location xxx, then put:

setenv CONSED_HOME xxx

into the .cshrc (or equivalent) of all Consed users

and create $CONSED_HOME/bin and $CONSED_HOME/lib and put all of these
programs into $CONSED_HOME/bin
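
If you write your own helper scripts, they can follow the same
convention. The Python sketch below falls back to /usr/local/genome
when CONSED_HOME is not set. That fallback is an assumption based on
the description above, not a statement about how Consed itself
resolves the location. The function name is hypothetical.

```python
import os
import posixpath


def consed_dirs(environ=None):
    """Return the (bin, lib) directories for the Consed installation."""
    env = os.environ if environ is None else environ
    # Assumed default when CONSED_HOME is unset.
    home = env.get("CONSED_HOME", "/usr/local/genome")
    return posixpath.join(home, "bin"), posixpath.join(home, "lib")
```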

93) Make sure that /usr/local/genome/bin (or $CONSED_HOME/bin) is in
every Consed user's PATH.

94) Put the Consed executable in /usr/local/genome/bin (or $CONSED_HOME/bin)

95) Check this by logging on as a user and typing:

consed -V

You should see 'Version 11.0'. If you see something else, you have
some debugging to do.

96) Build phd2fasta:
Go to the misc/phd2fasta directory and type 'make'
Move the phd2fasta executable to /usr/local/genome/bin (or $CONSED_HOME/bin)

97) Build mktrace:
Go to the misc/mktrace directory and type 'make'
Move the mktrace executable to /usr/local/genome/bin (or $CONSED_HOME/bin)

98) Move all perl scripts from the scripts directory to
/usr/local/genome/bin (or $CONSED_HOME/bin)
Make sure all are executable (chmod a+x *)

DELETE ANY PREVIOUS VERSIONS OF THESE SCRIPTS OR YOU WILL BE SORRY!
(Bugs have been fixed.)

99) Get perl 5. You can check where to get perl via the perl web
site:

http://www.perl.com/perl/info/software.html

(If you don't know about perl, try it--it will save you a
huge amount of time over developing the same utilities in C, awk, or
csh or sh.) Regardless where you put perl, put a link to it in
/usr/local/bin so that all of the scripts with
#!/usr/local/bin/perl
will work and you won't have to edit all of them every time a new
Consed release comes out.

100) Create a subdirectory /usr/local/genome/lib/screenLibs. (If you
are using a location other than /usr/local/genome for the root of all
Phred/Phrap/Consed programs, create $CONSED_HOME/lib/screenLibs). From
the misc subdirectory, copy primerCloneScreen.seq and
primerSubcloneScreen.seq to the directory
/usr/local/genome/lib/screenLibs (or $CONSED_HOME/lib/screenLibs).

Take a look at these files. They are dummy files indicating the fasta
format of the sequences that should be put in them. You should put
into primerCloneScreen.seq the vector sequence of the cloning vectors
you are using (BAC or cosmid) and into primerSubcloneScreen.seq the
sequencing vectors you are using (plasmid, M13, etc). Don't be too
generous in putting lots of vectors into the files! The larger they
are, the slower primer picking will be. Our files are only this big:

-rw-r--r-- 1 root root 29938 Nov 7 1997 primerCloneScreen.seq
-rw-r--r-- 1 root root 7381 Aug 13 1997 primerSubcloneScreen.seq

and primer picking is quite fast enough.

Now that you have set this up, you should try the PRIMER PICKING
sections in the Quick Tour (above) to make sure this works.

101) You should also create a file

/usr/local/genome/lib/screenLibs/vector.seq

(or $CONSED_HOME/lib/screenLibs/vector.seq if you are not using
/usr/local/genome for the root of the Phred/Phrap/Consed files.)

This contains all the vector sequences (in FASTA format) that you want
to mask out before phrapping. In general, it is the combination of
primerCloneScreen.seq and primerSubcloneScreen.seq

102) You should also create a file
/usr/local/genome/lib/screenLibs/repeats.fasta

(or $CONSED_HOME/lib/screenLibs/repeats.fasta if you are not using
/usr/local/genome for the root of the Phred/Phrap/Consed files.)

In this file, put any sequences (in FASTA format) that you want to
have automatically tagged. These typically are ALU sequences. If you
don't want to tag anything, then comment out (put '#' as the first
character of the line) the following lines in phredPhrap:

Change:
!system( "$tagRepeats $szAceFileToBeProduced" )
|| die "some problem running $tagRepeats";

to:
#!system( "$tagRepeats $szAceFileToBeProduced" )
# || die "some problem running $tagRepeats";

103) determineReadTypes.perl

Phrap, Consed's primer picking, and Consed/Autofinish all need the
following information for each read:
is it a universal primer forward, a universal primer reverse,
or a walking read?
what is its template name?

Generally this information can be determined from the read name, using
*your* naming convention. Modify the perl script
determineReadTypes.perl to put this information at the end of the phd
file using WR info items.

If you don't want to do any perl programming, you have the option of
using the St Louis naming convention as is. But what is the St Louis
naming convention? Most of it (but not all) is explained in the phrap
documentation. In addition, you must never use an underscore in the
name if the read is a universal primer forward or universal primer
reverse read. If the read is a walk, then you must have an underscore
(_) follow the template name and then have a number (the oligo
number).

Examples of reads in the St Louis naming convention:

read eeq03a01.g1.phd.1 is univ rev template: eeq03a01 library: eeq03
read eeq03a02.b1.phd.1 is univ fwd template: eeq03a02 library: eeq03
read eeq03a02.g1.phd.1 is univ rev template: eeq03a02 library: eeq03
read eeq03a03.b1.phd.1 is univ fwd template: eeq03a03 library: eeq03
read eej45h07_2.i1.phd.1 is walk template: eej45h07 library: eej45
read eej46c12_1.i1.phd.1 is walk template: eej46c12 library: eej46
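
The examples above can be reproduced by a few lines of Python. This
sketch is not determineReadTypes.perl and covers only what the
examples show. It assumes that an extension starting with 'b' means
univ fwd and 'g' means univ rev (any other letter is reported as
'unknown'), and that the library is the template name without its
last 3 characters. The function name is hypothetical.

```python
import re

# Only the two extension letters that appear in the examples above.
UNIVERSAL_TYPES = {"b": "univ fwd", "g": "univ rev"}


def classify_read(phd_filename):
    """Return (read type, template, library) for one phd filename."""
    read = re.sub(r"\.phd\.\d+$", "", phd_filename)
    stem, dot, extension = read.rpartition(".")
    if not dot:
        stem, extension = read, ""
    if "_" in stem:
        # Walks: template name, underscore, oligo number.
        template = stem.rsplit("_", 1)[0]
        read_type = "walk"
    else:
        template = stem
        read_type = UNIVERSAL_TYPES.get(extension[:1], "unknown")
    library = template[:-3]  # assumed: drop the 3-character well, e.g. a01
    return read_type, template, library
```

If your naming convention differs, this is the kind of logic you would
put into your customized determineReadTypes.perl.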

Once you have correctly customized determineReadTypes.perl, then
uncomment the line in phredPhrap which calls determineReadTypes.perl

Consed allows you to check that you have correctly modified
determineReadTypes.perl: On the Main Consed Window, point to 'Info',
hold down the left mouse button, and release on 'Show Info for Each
Read'. Check that the information presented is correct. If, for
example, Consed thinks that there are templates that have 9 or more
reads, it is likely that you have not correctly customized
determineReadTypes.perl

If you think you have made a mistake in customizing
determineReadTypes.perl, it is best to delete the PHD files and run
phredPhrap again since otherwise the incorrect WR items will be left
in the PHD files.

See the script determineReadTypes.perl for more information about how
to customize it.

104) Fake Reads

In the past, any read that ended with a .a2 or .c3 (where 2 and 3
could be any numbers), was considered a fake read. Now you can make
Autofinish not assume this using the .consedrc resource:

consed.fakeReadsSpecifiedByFilenameExtension: false

Instead, you must have determineReadTypes.perl put "fake" into the
"type:" field of a template WR item. See determineReadTypes.perl for
more information.

TEST RUNNING PHREDPHRAP

105) See the section RUNNING PHRED and PHRAP in the Quick Tour (above)

TESTING ADDING NEW READS

106) It will make your life easier if phred, phrap, and crossmatch are
all where Consed expects them: in /usr/local/genome/bin

107) Decide where to put phred's parameter file and edit both
addReads2Consed.perl and phredPhrap to reflect this location. I
generally prefer to put it in /usr/local/genome/lib to keep all of the
Phred/Phrap/Consed files in one place. Alternatively, you could put
it in /usr/local/etc/PhredPar/phredpar.dat which is the historical
location of this file.

108) Next you should test the ADDING NEW READS step in the Quick Tour
(above). This step requires that everything be set up correctly and
in the correct location. Hopefully the error messages are clear
enough to help you if you have set up anything incorrectly.

USING YOUR OWN DATA

109) Create the following directory structure, which can be anywhere
on any disk:

Directory structure:
top level directory (generally named after the BAC or cosmid)
subdirectory 'chromat_dir'--chromatograms go in here
subdirectory 'phd_dir'--phd files will automatically be put here
subdirectory 'edit_dir'--ace files will automatically be put here

If you already have your chromatograms somewhere else, you can make
chromat_dir be a link to wherever you have them.

The various phrap and crossmatch files will be put into edit_dir by
the phredPhrap script.
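
If you set up many projects, the directory structure can be created by
a script. A minimal Python sketch follows. The function name is
hypothetical and the top level directory name is whatever you choose.

```python
import os


def make_project_dirs(top):
    """Create chromat_dir, phd_dir, and edit_dir under the directory top."""
    paths = []
    for sub in ("chromat_dir", "phd_dir", "edit_dir"):
        path = os.path.join(top, sub)
        os.makedirs(path, exist_ok=True)  # harmless if it already exists
        paths.append(path)
    return paths
```

If your chromatograms already live somewhere else, make chromat_dir a
link instead, as described above.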

110) cd to the edit_dir directory, and type:

phredPhrap

If you are successful, the script will tell you so and you can bring
up Consed on the ace file:

111) Type:

consed

You should see a file with the extension .ace.1
Double click on it.

You should see a list of contigs.

Double click on the one you want to see.

Follow the first few steps of the Quick Tour under USING CONSED
GRAPHICALLY (above). You should at least go as far as viewing traces.

112) Appending expid to the phd files

If you are using Autofinish, and would like Autofinish to tell you how
well your reads are succeeding, then the phd files must be appended
with the experiment id's. In the 3 Autofinish summary files
(*.univReverse, *.univForwards, and *.customPrimers), you will see
information like this:

univ rev,,,->,-329,-249,71,Contig1,3,djs228_1034

or this:

tgaagaaatggctgactcc,56,1,->,3258,3338,3658,Contig1,4,djs228_2813,5,djs228_168,6,djs228_1248

The '3' just before the djs228_1034 is an experiment id. There is
also an expid '4' just before djs228_2813, an expid '5' before
djs228_168, and an expid '6' just before djs228_1248.
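
To pull the expid's out of such a line with a program, something like
the following Python sketch may be enough. It is based only on the two
example lines above: it assumes that the first 8 comma-separated
fields (ending with the contig name) are always present and that
everything after them is a series of expid,template pairs. The
function name is hypothetical.

```python
def expids_in_summary_line(line):
    """Return a list of (expid, template name) pairs from one line."""
    fields = line.strip().split(",")
    # Assumed: 8 leading fields, the last being the contig name.
    tail = fields[8:]
    return [(int(tail[i]), tail[i + 1]) for i in range(0, len(tail) - 1, 2)]
```

For the second example line above, this returns the pairs
(4, djs228_2813), (5, djs228_168), and (6, djs228_1248).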

Autofinish doesn't know what you will end up calling these reads it is
telling you to make. Autofinish only knows those reads by the numbers
3, 4, 5, and 6. So when you make the reads, Autofinish needs to be
informed that this is 'experiment 3' or whatever. You do this by
appending in the phd file the following structure:

WR{
expid addExpid 990811:140818
5
}

where WR stands for 'whole read item',
expid for 'experiment id'
addExpid is the name of the program that you will write that
will append this information
990811:140818 is the date and time in format YYMMDD:HHMISS
5 is the expid
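
A program such as addExpid is something you write yourself. As a
starting point, here is a minimal Python sketch that builds the WR
item above and appends it to the end of a phd file. The function names
are hypothetical, and how you determine the expid for each read is up
to you.

```python
from datetime import datetime


def expid_wr_item(expid, when, program="addExpid"):
    """Return the expid WR item, time-stamped as YYMMDD:HHMISS."""
    stamp = when.strftime("%y%m%d:%H%M%S")
    return f"WR{{\nexpid {program} {stamp}\n{expid}\n}}\n"


def append_wr_item(phd_path, item):
    """Append one WR item, preceded by a blank line, to a phd file."""
    with open(phd_path, "a") as phd_file:
        phd_file.write("\n" + item)
```

For example, expid_wr_item(5, datetime(1999, 8, 11, 14, 8, 18)) gives
the WR item shown above.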

This program must be run *after* phred runs to create the phd files.
Thus your program must have some method of determining what the expid
of each read is. What the University of Washington Genome Center does
is to have the finishers put the expid as part of the filename. This
makes it easy for a program to look at the phd file and figure out
what the expid is and then write the WR item into that phd file.

Alternatively, you could keep a database and, after the phd file is
created, look into the database to see what the expid is.

When you have successfully added expid's to the phd files, the next
time you run Autofinish on this project, it will have in the
'EVALUATE' section of the Autofinish output file, lots of interesting
information about how well the reads succeeded.
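
As an illustration only, here is a minimal Python sketch of such an
addExpid program. It assumes a hypothetical naming convention in which
the expid follows "_exp" in the phd file name (for example,
djs228_2813_exp4.phd.1); your site's convention will differ. The
function names are made up for this sketch. The WR layout and the
timestamp format are the ones shown above.

```python
import re
import time
from pathlib import Path

# Hypothetical convention: the expid is the number after "_exp" in the
# file name, e.g. "djs228_2813_exp4.phd.1" -> "4". Adapt to your site.
EXPID_IN_NAME = re.compile(r"_exp(\d+)\.")


def expid_from_filename(phd_path):
    """Return the expid embedded in the file name, or None if absent."""
    match = EXPID_IN_NAME.search(Path(phd_path).name)
    return match.group(1) if match else None


def make_wr_item(expid, program="addExpid", when=None):
    """Build the WR item text: tag type, program name, YYMMDD:HHMMSS, expid."""
    stamp = time.strftime("%y%m%d:%H%M%S", when or time.localtime())
    return "WR{\n" + f"expid {program} {stamp}\n" + f"{expid}\n" + "}\n"


def append_expid(phd_path):
    """Append a WR expid item to an existing phd file (never rewrite it)."""
    expid = expid_from_filename(phd_path)
    if expid is None:
        return False
    with open(phd_path, "a") as phd:
        phd.write("\n" + make_wr_item(expid))
    return True
```

Run something like this once per new phd file, after phred has
finished. The sketch does not check whether a WR expid item is already
present, so running it twice on the same file appends two items.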

--------------------------------------------------------------------------
NOTE TO SGI USERS

In /usr/lib, there must be a file: libCsup.so

If you don't have this file, you must get it from SGI. To get it, if
you are on Irix 6.2 through 6.4, request:

SG0001637 'C++ Exception handling patch for 7.00 (and above) compilers
on irix 6.2' (it's on the 'Development Options 7.1' CD).

If you are on Irix 5.3, install patch 1600.

To make things easier for you, I've included my libCsup.so.
This might save you having to get the patches above.

consed_sgi64 is for 64 bit computers. If you have a 64 bit computer,
use it instead of consed_sgi since it will allow you to use very large
datasets (over 100,000 reads).

----------------------------------------------------------------------------

FOR PROGRAMMERS AND FELLOW TRAVELLERS ONLY

113) CONSED CUSTOMIZATION

Click on the 'Info' menu on the Main Consed Window and release on menu
item 'Show Consed Resources'. This shows you what is available to be
changed by putting in your ~/.consedrc file.

Click on the 'Info' menu on the Main Consed Window and release on menu
item 'Show Default X Resources'. This shows you what is available to
be changed by putting in your ~/.Xdefaults file.

Changes in ~/.consedrc only affect one user. If you want to make a
change to affect all Consed users on the system, put a file in some
central location (e.g., /usr/local/genome/lib/.consedrc ) and then
have every user set the environment variable CONSED_PARAMETERS to
that location:

setenv CONSED_PARAMETERS /usr/local/genome/lib/.consedrc

Anything the user puts in ~/.consedrc will override whatever is in the
CONSED_PARAMETERS file.

You can also have different parameters for different projects. Put a
.consedrc file in the edit_dir of a particular project. When you are
working on that project, whatever is in that .consedrc will override
whatever is in your ~/.consedrc file or the CONSED_PARAMETERS file.
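
The override order described above (central CONSED_PARAMETERS file,
then ~/.consedrc, then the project's edit_dir/.consedrc) can be
pictured with a small Python model. This is not Consed's own code, and
the function names are invented for illustration. It assumes each
setting is a "name: value" line and that lines starting with "!" are
comments, as in the listings in this document.

```python
def parse_consedrc(text):
    """Parse 'consed.name: value' lines; lines starting with '!' are comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("!"):
            continue
        name, separator, value = line.partition(":")
        if separator:
            settings[name.strip()] = value.strip()
    return settings


def effective_settings(site_text, user_text, project_text):
    """Merge the three files; later sources override earlier ones."""
    merged = {}
    for text in (site_text, user_text, project_text):
        merged.update(parse_consedrc(text))
    return merged
```

With a central file that sets consed.primersMaxMeltingTemp to 60 and a
~/.consedrc that sets it to 62, the merged value is 62; a project
.consedrc would override both.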

CUSTOMIZING NAVIGATE BY SINGLE STRANDED REGIONS AND NAVIGATE BY SINGLE
SUBCLONE REGIONS

You can set the parameters:

consed.searchFunctionsUseUnalignedEndsOfReads: false
consed.searchFunctionsUseLowQualityEndsOfReads: true

If you set consed.searchFunctionsUseUnalignedEndsOfReads to be false,
then the unaligned ends of a read are not considered to cover the
consensus.

If you set consed.searchFunctionsUseLowQualityEndsOfReads to false,
then the low quality ends of a read are not considered to cover the
consensus.

For example, if the settings are:

consed.searchFunctionsUseUnalignedEndsOfReads: false
consed.searchFunctionsUseLowQualityEndsOfReads: false

then a base in a read is only considered to cover the consensus if it
is both in the aligned portion of the read and the high quality
portion of the read.

Although most Consed parameters now go into .consedrc, there are still
a very few that need to stay in .Xdefaults. Here is the rule: if the
parameter starts with

consed.

such as

consed.gunzipFullPath: /bin/uncompress

then it goes into .consedrc

If the parameter starts with

consed*

such as

consed*contigwin.background: Black

then it goes in .Xdefaults

If you are upgrading from Consed version 8.0 or older:

Consed in version 8.0 and older used .Xdefaults for Consed
parameters--no longer. Now Consed uses ~/.consedrc for most of
the same parameters. Thus you should remove Consed parameters
from .Xdefaults and put them in .consedrc in your home directory.

Before, when you made a typo with one of the Consed parameters, it
was just silently ignored. Now Consed makes a big fuss. So you
need to be prepared to find out all of the parameters that have
not been working all this time.

114) COMPRESSING CHROMATOGRAMS

If you are interested in compressing your chromatogram files, go into
chromat_dir and gzip one of the chromatogram files. Make sure that
gunzip is in /usr/local/bin. You can change this location via the
Consed resource

consed.gunzipFullPath: /usr/local/bin/gunzip

(see CONSED CUSTOMIZATION, above), but it will be easiest for
you and your users if you just put gunzip (or a link to it) in
/usr/local/bin so that you do not have to bother with Consed resources.

Restart Consed and bring up the corresponding trace. You will notice
no appreciable delay.

115) READING CHROMATOGRAMS OUT OF AN EXTERNAL DATABASE

Normally, chromatograms are kept in ../chromat_dir. If you want to
keep them somewhere else (such as in an external database), you can do
that. When the chromatogram is needed (when the user asks to view a
trace), Consed will call an external program, passing it the name of
the read required, and then look for the chromatogram in /tmp (by
default). It will read the chromatogram and then delete it. Use the
resources:

consed.alwaysRunProgramToGetChromats: true
consed.programToRunToGetChromats: /usr/local/bin/programToGetChromat

In this case, "programToGetChromat" is the name of the program that
gets the chromatogram and puts it into /tmp.
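
A minimal Python sketch of what a programToGetChromat might look like
follows. Several things here are assumptions, not documented behavior:
that the read name arrives as the only command-line argument, that the
file placed in /tmp should be named after the read, and that the
"external database" is simply a directory (the hypothetical
/data/chromat_archive). A real version would query your own storage.

```python
import shutil
import sys
from pathlib import Path

# Hypothetical locations: ARCHIVE_DIR stands in for your external store;
# OUTPUT_DIR is /tmp, where Consed looks by default.
ARCHIVE_DIR = Path("/data/chromat_archive")
OUTPUT_DIR = Path("/tmp")


def fetch_chromat(read_name, archive_dir=ARCHIVE_DIR, output_dir=OUTPUT_DIR):
    """Copy the chromatogram for read_name into output_dir; return its path."""
    source = Path(archive_dir) / read_name
    if not source.is_file():
        raise FileNotFoundError(f"no chromatogram for read {read_name!r}")
    destination = Path(output_dir) / read_name
    shutil.copyfile(source, destination)
    return destination


if __name__ == "__main__" and len(sys.argv) == 2:
    # Assumed calling convention: the read name is the only argument.
    fetch_chromat(sys.argv[1])
```

Because Consed deletes the chromatogram after reading it, the program
should always copy the file rather than move or link the archived
original.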

116) CONSED -ACE

Try bringing up Consed like this:

consed -ace (name of ace file)

This can be useful if you are going to have Consed brought up from
some other program.

117) NO PHD FILES

Try bringing up Consed like this:

consed -nophd

This mode does not allow editing and does not show quality
information, since that information is kept in the phd files. It
allows you to view an assembly when you have only the ace file and no
phd files or chromatograms. I neither recommend nor support this
option!

118) CREATING CUSTOM TAG TYPES

The following Consed resources are available for creating custom tag
types:

consed.tagColorCustomTag1:
consed.tagColorCustomTag2:
consed.tagColorCustomTag3:
consed.tagColorCustomTag4:
consed.tagColorCustomTag5:
consed.tagColorCustomTag6:
consed.tagColorCustomTag7:
consed.tagColorCustomTag8:
consed.tagColorCustomTag9:
consed.tagColorCustomTag10:
consed.tagColorCustomTag11:
consed.tagColorCustomTag12:
consed.tagColorCustomTag13:
consed.tagColorCustomTag14:
consed.tagColorCustomTag15:
consed.customTag1:
consed.customTag2:
consed.customTag3:
consed.customTag4:
consed.customTag5:
consed.customTag6:
consed.customTag7:
consed.customTag8:
consed.customTag9:
consed.customTag10:
consed.customTag11:
consed.customTag12:
consed.customTag13:
consed.customTag14:
consed.customTag15:
consed.tagColorCustomConsensusTag1:
consed.tagColorCustomConsensusTag2:
consed.tagColorCustomConsensusTag3:
consed.tagColorCustomConsensusTag4:
consed.tagColorCustomConsensusTag5:
consed.tagColorCustomConsensusTag6:
consed.tagColorCustomConsensusTag7:
consed.tagColorCustomConsensusTag8:
consed.tagColorCustomConsensusTag9:
consed.tagColorCustomConsensusTag10:
consed.tagColorCustomConsensusTag11:
consed.tagColorCustomConsensusTag12:
consed.tagColorCustomConsensusTag13:
consed.tagColorCustomConsensusTag14:
consed.tagColorCustomConsensusTag15:
consed.customConsensusTag1:
consed.customConsensusTag2:
consed.customConsensusTag3:
consed.customConsensusTag4:
consed.customConsensusTag5:
consed.customConsensusTag6:
consed.customConsensusTag7:
consed.customConsensusTag8:
consed.customConsensusTag9:
consed.customConsensusTag10:
consed.customConsensusTag11:
consed.customConsensusTag12:
consed.customConsensusTag13:
consed.customConsensusTag14:
consed.customConsensusTag15:

When you create a custom tag type, you specify its name and the color
you want it displayed in.

For example:

consed.tagColorCustomTag1: SlateBlue2
consed.tagColorCustomTag2: SlateBlue2
consed.tagColorCustomTag3: SlateBlue2
consed.tagColorCustomTag4: brown
consed.tagColorCustomTag5: MediumPurple
consed.tagColorCustomTag6: purple
consed.customTag1: polymorphismInsertion
consed.customTag2: polymorphismDeletion
consed.customTag3: polymorphismSubstitution
consed.customTag4: qualityCoreComment
consed.customTag5: coordinatorApproval
consed.customTag6: coordinatorComment

(All of these tag types are read tag types. Consensus tag types are
specified separately--see the Consed resource names (above).)

Once you have done this, the user of Consed can add tags of these
types in the method described in TAGS of the Quick Tour (above).

119) ADDING TAGS FROM OTHER PROGRAMS

You can also write external programs that add tags to the ace file
and/or the phd files. Both RT (read) and CT (consensus) tags can be
appended to the end of the ace file. BEGIN_TAG tags can be appended
to the end of the phd files. Do not rewrite the ace file or the phd
file--there is no need to do so and it will cause problems.

120) CONTROL OF CONSED FROM SOME OTHER PROGRAM

Consed can be controlled by some other program. For example, you
might have a program that displays mapping data and you would like the
user to be able to click on a location and have Consed come up showing
the bases in that region. This feature allows a programmer to do
this.

The external program can start up Consed as follows:

consed -socket (local port number) -ace (ace filename)

For example,

consed -socket 5432 -ace standard.fasta.screen.ace.1

After Consed finishes starting up (including your answering whether
you want to apply edits), you will see this message in the xterm:

success bind to local port number: 5432

Consed then creates a file called consedSocketLocalPortNumber in the
default directory.

This gives the port number of the Berkeley socket that Consed has
opened and is listening on. Thus your program can read this file and
create a connection to the Berkeley socket created by Consed.

Once the connection is established, your program can send commands to
Consed at that socket indicating to Consed which contig to display and
what consensus position to scroll to. Currently, the only acceptable
commands are:

Scroll (contigname) (consensus position)<return>
PopupTraces (read name) (unpadded read position in the direction of sequencing)<return>

'Unpadded read position in the direction of sequencing' is the
position from the right end, if the read is a bottom strand read.

Just send such a command to the Berkeley socket, and Consed will
respond appropriately. (Currently, Consed doesn't like it if another
process establishes a connection and then terminates without first
terminating the connection.)
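
The client side of this can be sketched in Python as follows. The
sketch makes assumptions that you should check against your own
installation: that consedSocketLocalPortNumber holds the port number
as plain text, that <return> means a single newline character, and
that Consed is running on the same machine ("localhost"). The function
names are invented for this example.

```python
import socket
from pathlib import Path


def read_consed_port(directory="."):
    """Read the port number from the file Consed writes at startup.

    Assumption: the file holds the port number as plain text.
    """
    port_file = Path(directory, "consedSocketLocalPortNumber")
    return int(port_file.read_text().split()[0])


def scroll_command(contig, consensus_position):
    # Assumption: <return> is sent as a single newline character.
    return f"Scroll {contig} {consensus_position}\n"


def popup_traces_command(read_name, read_position):
    return f"PopupTraces {read_name} {read_position}\n"


def send_command(port, command, host="localhost"):
    """Connect, send one command, and close the connection before returning."""
    with socket.create_connection((host, port)) as connection:
        connection.sendall(command.encode("ascii"))
```

For example, send_command(read_consed_port(), scroll_command("Contig1",
1809)) would ask Consed to scroll Contig1 to consensus position 1809.
The "with" block closes the connection explicitly, which addresses the
caution above about terminating without first closing the connection.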

121) AUTOMATIC ORDERING OF OLIGOS

I heard of a finisher who manually ordered 72 oligos. She had to
cut/paste the bases of each oligo. That is not only painful, but also
error prone. I've supplied you a script that you can use to
automatically determine which oligos have been newly requested since
the last order, aggregate them into a single order, and email the
request off.

The script is ace2Oligos.perl. It takes as parameters the name of an
ace file and the name of the oligo file. The oligo file is a list of
oligos that have been ordered for that particular project, and looks
like this:

name=G1980A181.1
sequence=ctgcatggctaggga
template=seq from subclone
date=980427 temp=52

name=G1980A181.2
sequence=tcttactttctgactttcattt
template=seq from clone
date=980427 temp=50

ace2Oligos.perl finds all oligo tags in the ace file and makes sure
that all of them are in this oligo file.

To automatically order oligos each night, there is an additional
script you will have to write. I suggest that you run your script
each night under cron and that it do the following:

For each project, it will look for the most recent ace file. It will
run ace2Oligos.perl on that ace file and direct the oligo file to be
in the parent directory of edit_dir, phd_dir, and chromat_dir for that
project. Thus there will be one oligos file for each project. Your
script will run ace2Oligos.perl once for each project.

Then your script would, for each project, look in the oligos file for
new oligos, and aggregate the unordered oligos into a central file,
which it would email to the oligo company. If it finds any new oligos
in an oligo file, it draws a line at the bottom:

-------------------------------

which indicates that all oligos have been ordered. When this script
looks at this file the next night, it uses this line to determine
whether any additional oligos have been requested since the previous
order. (The idea of this line came from St Louis.) Thus the oligos
file tells you which oligos have been ordered and which have not yet
been ordered.
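
The part of the nightly script that handles the dashed line can be
sketched in Python as below. This is only an illustration of the idea,
not a supplied script: the function names are invented, records are
assumed to be separated by blank lines as in the example above, and
any line made up only of dashes is treated as the "ordered" marker.

```python
import re

MARKER = "-" * 31  # same width as the line shown above
# key=value pairs; a value may contain spaces ("seq from subclone") and
# two pairs may share a line ("date=980427 temp=52").
PAIR = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|$)")


def is_marker(line):
    stripped = line.strip()
    return len(stripped) >= 3 and set(stripped) == {"-"}


def parse_record(block):
    """Turn one oligo record into a dict of its key=value fields."""
    fields = {}
    for line in block.splitlines():
        for key, value in PAIR.findall(line):
            fields[key] = value.strip()
    return fields


def unordered_oligos(oligo_text):
    """Return the records that come after the last dashed line."""
    lines = oligo_text.splitlines()
    start = 0
    for index, line in enumerate(lines):
        if is_marker(line):
            start = index + 1
    tail = "\n".join(lines[start:])
    blocks = [b for b in re.split(r"\n\s*\n", tail) if b.strip()]
    return [parse_record(b) for b in blocks]


def mark_all_ordered(oligo_text):
    """Append the dashed line so that every oligo so far counts as ordered."""
    return oligo_text.rstrip("\n") + "\n\n" + MARKER + "\n"
```

The nightly script would call unordered_oligos on each project's
oligos file, add the results to the central order, and write back
mark_all_ordered only for files that had new oligos.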

122) CUSTOM NAVIGATION

In the Main Window, there is also a Navigate menu. Pull it down and
release on the Custom Navigation menu item. A box will pop up saying
'Select custom navigation file:'
There will be a file:
custom_navigation.nav
Double click on it.

You will see the now-familiar custom navigation box. Click 'Next'
repeatedly until you get to the end of the list.

Consed doesn't write such a file--it just reads it. This feature
lets you write your own programs that select locations that you want
your finishers to examine. Your program writes a file, the user reads
that file into Consed in this manner, and the user can then go to each
of the locations.

123) DEFINING KEYS TO CALL EXTERNAL PROGRAMS AND APPLY TAGS AND
INTEGRATING CONSED WITH EXTERNAL DATABASES

You now can define keys to call external programs when the key is
pressed in the trace window. As an example, I have control-N and
control-O ("oh"--not zero) call "/bin/echo" by default. Try these and
see. Watch in the xterm where you started Consed for output like
this:

argument_for_first_key djs74_2231.s1 79 Contig1 1809
argument_for_second_key djs74_2231.s1 79 Contig1 1809

The djs74_2231.s1 is the read the user was viewing, Contig1 is the
contig, 79 is the unpadded read position in the direction of
sequencing, and 1809 is the unpadded consensus position.

You will also see that control-O will automatically add a tag.

Several groups that are doing polymorphism detection have expressed
interest in this feature because it enables them to have Consed
directly write into an external database (e.g., Oracle or Sybase) by
calling a program that then writes to the database.

The resources in .consedrc that allow you to customize the calling of
external programs are:

consed.userDefinedKeys: 14 15
! make a space-separated list of the decimal ASCII values of the keys
! 14 means control-N, 15 means control-O

consed.programsForUserDefinedKeys: /bin/echo /bin/echo
! a space-separated list of the full pathnames of the commands to run

consed.argumentsToPassToUserDefinedPrograms: argument_for_first_key argument_for_second_key
! a space-separated list of the arguments to pass to each user-defined program

consed.tagsToApplyWithUserDefinedKeys: none polymorphismConfirmed
! a space-separated list of the tag types to apply when the user
! presses a user-defined key. If a key is to have no associated tag,
! then enter "none" for that key.
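
A program called by a user-defined key receives the values shown in
the echo output above. The Python sketch below parses them in that
order and records them in a database. SQLite is used here only as a
stand-in for Oracle or Sybase so that the example is self-contained;
the table layout and function names are invented for illustration.

```python
import sqlite3
from collections import namedtuple

# Field order follows the echo output shown earlier:
# (configured argument) (read) (read position) (contig) (consensus position)
KeyPress = namedtuple(
    "KeyPress",
    "user_argument read_name read_position contig consensus_position",
)


def parse_key_press(arguments):
    """Parse the argument list (without the program name) passed by Consed."""
    if len(arguments) != 5:
        raise ValueError(f"expected 5 arguments, got {len(arguments)}")
    user_argument, read_name, read_position, contig, consensus_position = arguments
    return KeyPress(user_argument, read_name, int(read_position),
                    contig, int(consensus_position))


def record_key_press(connection, press):
    """Store one key press; the table layout is hypothetical."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS key_presses ("
        "user_argument TEXT, read_name TEXT, read_position INTEGER, "
        "contig TEXT, consensus_position INTEGER)"
    )
    connection.execute("INSERT INTO key_presses VALUES (?, ?, ?, ?, ?)", press)
    connection.commit()
```

In a real program you would call parse_key_press(sys.argv[1:]) and pass
the result to record_key_press with a connection to your own database.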

124) USING FILES CREATED ON WINDOWS OR WINDOWS NT.

Don't. (E.g., phd files generated by a Beckman CEQ-2000.) These
files initially had <CR><LF> at end of line instead of <LF>. CONSED
chokes every time it tries to read something from these phd files.
If you must use these files, you must first convert them to UNIX
format, which means stripping out the CR's so that only \n (decimal 10)
separates lines.
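
If you do have to convert such files, a short Python sketch like the
following replaces each <CR><LF> pair with a single <LF>. The function
names are made up for this example. It works on raw bytes so that
nothing else in the file is altered.

```python
def crlf_to_lf(data):
    """Replace every <CR><LF> pair in a bytes object with a single <LF>."""
    return data.replace(b"\r\n", b"\n")


def convert_file(path):
    """Convert one file in place; return True if anything changed."""
    with open(path, "rb") as handle:
        original = handle.read()
    converted = crlf_to_lf(original)
    if converted != original:
        with open(path, "wb") as handle:
            handle.write(converted)
    return converted != original
```

Run convert_file on each phd file before starting Consed. Files that
are already in UNIX format are left untouched.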

--------------------------------------------------------------------------

MONITORS AND MICE FOR CONSED

If your monitor is part of a Unix computer (a Sun, an HP, a DEC, an
SGI, or a Linux box) or is an Xterminal, then you will have absolutely
no problems.

You must have a 3 button mouse or 3 button emulation. 3 Button
emulation is tricky since Consed uses all 3 buttons of the mouse and
it also uses Control-Middle-Mouse-button, Shift-Middle-Mouse-Button
and Control-Right-Mouse-Button. So if you are going to try to just
use a 2 button mouse (or, God-forbid, a 1 button mouse), you should
make sure that you can emulate each of those. Often, if you push the
left and right mouse buttons at the same time, your X server will
interpret that to be the middle mouse button. But you must consult
your X emulator or X server to know what it will do--that is out of
Consed's control.

If your monitor is a PC running Windows or NT, then you must have an X
emulator installed and running. X emulators include: Exceed, XWin32,
Reflection X, and OpenNT. Any of these will work if configured
correctly (and the 'correctly' is the key). I encourage you to use
single window mode and then use a Unix window manager such as CDE,
fvwm, or mwm.

If your monitor is a MAC, then you must also have an X emulator, such
as Exodus or MACX installed and running. You *must* use this emulator
in single window mode, and then use a Unix window manager such as CDE,
fvwm, or mwm. (If you don't use single window mode, Consed might
crash in some circumstances.)

--------------------------------------------------------------------------

HUGE ASSEMBLIES

If you have an assembly of more than about 100,000 reads, you might
need a 64 bit computer. Consed's Alpha version is 64 bit and Consed
has a 64-bit SGI version. The Linux version is currently not available
in 64 bit. 64-bit Solaris and HP versions can probably be made if there
is enough interest.

--------------------------------------------------------------------------

AUTOFINISH AND PRIMER-PICKING PARAMETERS

Some of the parameters below are used by Autofinish, some by Consed's
primer picker, and some by both.

You should use the default values of these parameters unless you have
a particular reason for changing them. The defaults have been chosen
very carefully based on theory and experimentation and are the ones
being used at the major genome centers.

You can set these via the .consedrc file.

In addition, for a particular Consed session, you can interactively
change many of these in the following manner: On the main window, point
to 'Options', hold down the left mouse button and release on 'Primer
Picking Preferences.' You can modify the resource of interest and
then click on 'Apply and Dismiss'. The new value of the resource will
be in effect only until you restart Consed.

For the most current list, in the Consed Main Window, point to 'Info',
hold down the left button, and release on 'Show Current Consed
Resources'.

! If you want to modify any of these parameters, just cut/paste
! the relevant line into your ~/.consedrc file
! (or into the edit_dir/.consedrc file)
! In the following, I have annotated the parameters with the following
! symbols:
!
! (YES) freely customize to your own site
! (OK) don't change unless you have a specific need and know what you
! are doing
! (NO) don't change this!
!
!
!
! resources in the (YES) category:
!
consed.autoFinishMinNumberOfErrorsFixedByAnExp: 0.020
! if an experiment solves fewer errors than this, it isn't worth doing
! so won't be chosen. This parameter controls when Autofinish stops
! choosing experiments.
! (YES)
consed.autoFinishRedundancy: 2.000
! This number should be between 1.0 and 2.0. If you want more reads
! for each area, increase the number towards 2.0. If you want fewer
! reads per area, decrease it towards 1.0. This only affects
! universal primer reads--not custom primer reads.
!
! (YES)
consed.autoFinishAverageInsertSize: 1500
! If a template has a forward but no reverse, when deciding whether to
! allow this template for a particular primer or reverse, we need to
! make an assumption about where the end of the template is. If we
! do not have enough forward/reverse pairs to determine the mean, then
! this parameter is used.
! (YES)
consed.primersMaxInsertSizeOfASubclone: 3000
! check +/- this distance from the primer for false-annealing
! and check at most this distance for templates for a primer.
! Thus if you have more than one library, make this the max of
! all libraries.
! (YES)
consed.primersMaxMeltingTemp: 60
! (YES)
consed.primersMaxMeltingTempForPCR: 60
! (YES)
consed.primersPickTemplatesForPrimers: true
! when picking primers for subclone templates, pick templates also.
! If there is no suitable template for a primer, do not pick the
! primer. If you like to pick your own templates, you might want to
! turn this off for a little improvement in speed.
! This has no effect on Autofinish--just on interactive primer picking
! in Consed.
! (YES)
consed.primersSubcloneFullPathnameOfFileOfSequencesForScreening: /usr/local/genome/lib/screenLibs/primerSubcloneScreen.seq
! vector sequence file if choosing subclone (e.g., M13, plasmid)
! templates
! (YES)
consed.primersCloneFullPathnameOfFileOfSequencesForScreening: /usr/local/genome/lib/screenLibs/primerCloneScreen.seq
! vector sequence file if choosing clone (e.g., cosmid, BAC) template
! (YES)
consed.primersMinMeltingTemp: 55
! (YES)
consed.primersMinMeltingTempForPCR: 55
! (YES)
consed.autoFinishMaxAcceptableErrorsPerMegabase: 0
! target error rate. This parameter used to be the one that stopped
! Autofinish from calling more reads. However, consider a BAC that is
! nearly perfect except for one region with 3 quality 10 bases in a
! row. In this case the global errors per megabase is very
! low--perhaps lower than 1 error per megabase. Despite this, most
! labs would like to do one more read to fix this problem. Thus we
! set this parameter to zero (to disable it) so Autofinish will use
! the parameter consed.autoFinishMinNumberOfErrorsFixedByAnExp to stop
! calling more reads--it is a local error rate.
! (OK)
consed.autoFinishIfNotEnoughFwdRevPairsUseThisPerCentOfInsertSize: 90
! If a template has a forward but no reverse, when deciding whether to
! allow this template for a particular primer, we need to make an assumption
! of where is the end of the template. If the template comes from a library
! with insert size 1500, it would be reasonable to assume that the end of
! template will be 1500 bases from the forward read. But if this template
! has an insert that is shorter than average, the walk may walk into vector.
! To be conservative, we may want to assume that the insert is somewhat
! shorter than average. By default, we assume that it is 90% as large as
! the average. This parameter gives that percentage. This parameter
! is used both by Consed and Autofinish.
! (OK)
consed.primersNumberOfBasesToBackUpToStartLooking: 50
! e.g., if this is 50 and you want a read at position 1000, primers
! will be searched before base 950 but not in the region 950 to 1000
! This has no effect on Autofinish--just on interactively picking primers.
! (OK)
consed.primersMakePCRPrimersThisManyBasesBackFromEndOfHighQualitySegment: 100
! When a PCR product is made, you want it to overlap by this many bases
! the high quality part of the existing consensus. Thus choose PCR
! primers this many bases back (or more)
! (OK)
consed.primersOKToChoosePrimersInSingleSubcloneRegion: true
! (OK)
consed.primersOKToChoosePrimersWhereHighQualityDiscrepancies: false
! (OK)
consed.primersOKToChoosePrimersWhereUnalignedHighQualityRegion: false
! (OK)
consed.autoFinishCallReversesToFlankGaps: true
! if there is a forward-reverse pair flanking a gap, print it out
! if there is not, suggest reverses to flank the gap
! (OK)
consed.autoFinishAllowWholeCloneReads: false
! ok to call reads whose template for sequencing reaction is the
! entire clone (BAC or cosmid)
! (OK)
consed.autoFinishAllowCustomPrimerSubcloneReads: true
! ok to call reads with custom primers and subclone template
! (OK)
consed.autoFinishAllowDeNovoUniversalPrimerSubcloneReads: true
! Allows calling reverse when there is just a forward.
! Allows calling a forward when there is just a reverse.
! (OK)
consed.autoFinishAllowMinilibraries: false
! Allows calling minilibraries (shatter libraries or transposon
! libraries) of subclone templates for closing gaps
! (OK)
consed.autoFinishAllowPCR: true
! Allows calling PCR for closing gaps, but only as a last resort
! (OK)
consed.autoFinishAllowResequencingAUniversalPrimerAutofinishRead: false
! if Autofinish suggests a de novo universal primer read,
! do not allow Autofinish to suggest a resequence of this read
! (OK)
consed.autoFinishAlwaysCloseGapsUsingMinilibraries: false
! "Minilibraries" includes transposing a subclone template or
! making a shatter library from a subclone template
! (OK)
consed.autoFinishMaximumFinishingReadLength: 2000
! Change this only if your finishing reads are typically shorter
! than your shotgun reads. Otherwise, leave it unrealistically long,
! and Autofinish will set its model read based on your existing
! shotgun reads.
! (OK)
consed.autoFinishSuggestMinilibraryIfGapThisManyBasesOrLarger: 800
! (OK)
consed.autoFinishSuggestSpecialChemistryForRunsAndStops: true
! Suggest special chemistry such as dGTP for reads that cross
! mononucleotide or dinucleotide repeats that cause reads to fail or
! stops (structure) that cause reads to fail and thus dye terminator
! reads won't work.
! (OK)
consed.autoFinishSuggestThisManyMinilibrariesPerGap: 2
! (OK)
consed.primersWindowSizeInLooking: 450
! e.g., if this is 300, with example above, primers will be searched
! from base 650 to 950. This has no effect on Autofinish--it is just
! used for interactive primer picking in Consed.
! (OK)
consed.primersAssumeTemplatesAreDoubleStrandedUnlessSpecified: false
! you can put the template type in the phd file in a WR template item
! consed will have a list of these and know which are single and
! double stranded
! (OK)
consed.autoFinishAllowResequencingReads: true
! (OK)
consed.autoFinishAllowResequencingReadsToExtendContigs: false
! if false, a resequencing read is not called to extend a contig--only
! custom primer reads and de novo universal primer reads are called
! for this purpose.
! (OK)
consed.autoFinishCallHowManyReversesToFlankGaps: 2
! If less than this many fwd/rev pairs flank a gap, Autofinish will
! suggest additional reverses until there are this many. If there are
! this many fwd/rev pairs flanking a gap, Autofinish will print out
! the contig ends that flank the gap.
! (OK)
consed.autoFinishCloseGaps: true
! this allows you to turn off choosing reads to close gaps
! (OK)
consed.autoFinishContinueEvenThoughReadInfoDoesNotMakeSense: false
! this allows you to override the checks that autofinish makes on the
! read info, such as checking there are not more than 5 or so reads
! from the same subclone template
! (OK)
consed.autoFinishCostOfResequencingUniversalPrimerSubcloneReaction: 20.000
! compares universal primer subclone reaction, custom primer subclone
! reaction, and custom primer clone reaction to decide which to favor
! (OK)
consed.autoFinishCostOfCustomPrimerSubcloneReaction: 60.000
! see above
! (OK)
consed.autoFinishCostOfCustomPrimerCloneReaction: 80.000
! see above
! (OK)
consed.autoFinishCostOfDeNovoUniversalPrimerSubcloneReaction: 60.000
! cost of reverse where there is only a forward or cost of forward
! when there is only a reverse
! (OK)
consed.autoFinishCostOfMinilibrary: 500.000
! cost of making a minilibrary (transposon library or shatter library)
! from a subclone template
! (OK)
consed.autoFinishCoverSingleSubcloneRegions: true
! this allows you to turn off choosing reads to cover single subclone regions
! (OK)
consed.autoFinishCoverLowConsensusQualityRegions: true
! this allows you to turn off choosing reads to cover low consensus
! quality regions
! (OK)
consed.autoFinishDebugUniversalPrimerReadsFile: gordon_debug.txt
! for debugging Autofinish
! put a file with this name in the same directory as the ace file
! format:
! fcalld09 fwd
! fgj74f01 rev
! (template name) (fwd or rev)
! (OK)
consed.autoFinishDoNotAllowSubcloneCustomPrimerReadsCloserThanThisManyBases: 200
! see consed.autoFinishDoNotAllowSubcloneCustomPrimerReadsCloseTogether
! (OK)
consed.autoFinishDoNotAllowWholeCloneCustomPrimerReadsCloserThanThisManyBases: 300
! see consed.autoFinishDoNotAllowWholeCloneCustomPrimerReadsCloseTogether
! (OK)
consed.autoFinishDoNotFinishWhereTheseTagsAre: doNotFinish
! list of tag types separated by spaces. E.g.,
! repeat
! tells autofinish that you are not interested in finishing in this region
! (OK)
consed.autoFinishDumpTemplates: false
! for debugging, this allows you to dump all information about the
! templates--insert locations
! (OK)
consed.autoFinishExcludeContigIfOnlyThisManyReadsOrLess: 10
! exclude contigs that are probably E. coli contamination
! (OK)
consed.autoFinishExcludeContigIfDepthOfCoverageOutOfLine: true
! (OK)
consed.autoFinishExcludeContigIfDepthOfCoverageThisMuchMoreThanLargestContig: 6.000
! exclude contig if its depth of coverage is much greater than other
! contigs (this indicates contamination)
! (OK)
consed.autoFinishExcludeContigIfTooShort: true
! exclude contig if it has too few bases in the consensus
! (OK)
consed.autoFinishExcludeContigIfThisManyBasesOrLess: 1000
! consed.autoFinishExcludeContigIfTooShort must be set to true for
! this to have any effect
! (OK)
consed.autoFinishHowManyTemplatesYouIntendToUseForCustomPrimerSubcloneReactions: 3
! this tells autofinish which templates you are planning on using
! which is necessary to figure out which regions will still be single
! subclone regions
! (OK)
consed.primersMinNumberOfTemplatesForPrimers: 1
! if there are fewer templates than this, the primer is rejected
consed.autoFinishMinBaseOverlapBetweenAReadAndHighQualitySegmentOfConsensus: 70
! when extending the consensus, a read that is too far from the
! consensus will not be assembled by phrap with this contig and thus
! will not be useful for extending the consensus. This gives the
! minimum overlap of a read with the high quality segment of the
! consensus. As reads are picked, then additional reads may be picked
! further out.
! (OK)
consed.autoFinishNumberOfVectorBasesAtBeginningOfAUniveralPrimerRead: 40
! used to figure out where the beginning of a reverse will be. Not
! important to be accurate because the insert size is so uncertain
! (OK)
consed.autoFinishCDNANotGenomic: false
! If this is set to true, the whole clone is assumed to be cDNA and,
! rather than the normal method of detecting the end of the clone,
! Autofinish detects the end of the cDNA as follows:
! the user is expected to add whole read items of type 'template',
! with 'type: univ fwd' for the 5' end and 'type: univ rev' for the 3'
! end of the cDNA.
! (OK)
consed.autoFinishConfidenceThatReadWillCoverSingleSubcloneRegion: 90
! Autofinish computes the per cent of existing reads that are aligned
! at each base position. Typically, this number starts at around 0% at
! base position 1, rises to close to 100% at around base position 300,
! and then drops again to 0% at base position 800 or so. This number
! specifies how high the number must be for Autofinish to consider an
! Autofinish read to cover a single subclone region.
! (OK)
consed.autoFinishPrintForwardOrReverseStrandWhenPrintingSubcloneTemplatesForCustomPrimerReads: true
! If this is true, then custom primer reads are printed out like this:
! tccagaaaactaattcaaaataatg,56,standard.2,->,2413,2413,3681,Contig1,9,djs74_690 (fwd),10,djs74_1803 (fwd),11,djs74_1861 (fwd)
! If this is false, then custom primer reads are printed out like this:
! tccagaaaactaattcaaaataatg,56,standard.2,->,2413,2413,3681,Contig1,9,djs74_690,10,djs74_1803,11,djs74_1861
! The difference is the (fwd) or (rev) that indicates which strand of
! the subclone template is to be used. This is particularly important if
! you use M13 and thus must make the reverse strand.
! (OK)
consed.autoFinishPrintMinilibrariesSummaryFile: false
! If this is true, Autofinish will print a file with name
! xxx.minilibraries just as it prints one as xxx.univReverses and
! xxx.univForwards
! (OK)
consed.autoFinishNearGapsSuggestEachMissingReadOfReadPairs: true
! This is set to true to increase the chance of closing a gap. For
! every subclone template that has just one universal primer read
! (either just a forward or just a reverse) that might protrude off
! the end of the contig, Autofinish suggests the universal primer read
! off the opposite end of the subclone template.
! If this parameter is set false, then
! Autofinish may still choose some of these reads, but it won't
! necessarily choose them all.
! (OK)
consed.primersMinimumLengthOfAPrimer: 15
! (OK)
consed.primersMaximumLengthOfAPrimer: 25
! (OK)
consed.primersMinimumLengthOfAPrimerForPCR: 25
! (OK)
consed.primersMaximumLengthOfAPrimerForPCR: 30
! (OK)
consed.primersMaxMeltingTempDifferenceForPCR: 3.000
! how large can the difference of melting temperatures be between
! two primers of a PCR primer pair
! (OK)
consed.primersMaxPCRPrimerPairsToDisplay: 100000
! there is a limit here, because there could possibly be millions
! (OK)
consed.primersCheckJustSomePCRPrimerPairsRatherThanAll: true
! If there are 1000 1st primers, and 1000 2nd primers, that gives
! a million pairs for Consed to check, which takes a long time. So
! instead, just check some of the pairs
! (OK)
consed.primersNumberOfTemplatesToDisplayInFront: 2
! the number of templates to show in the interactive primer
! picking window
! (OK)
consed.primersMaxLengthOfMononucleotideRepeat: 4
! (OK)
consed.primersBadLibrariesFile: badLibraries.txt
! file of libraries, one per line
! If any template is from any one of these libraries, then
! consed/autofinish will not use this template for walking or
! suggesting any universal primer reads
! (OK)
consed.primersLibrariesInfoFile: librariesInfo.txt
! file of libraries, with one entry for each library of the following
! format:
! LIB{
! name: library1
! insertSize: 1500
! stranded: single
! }
! (OK)
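
If you need to read such a libraries file from your own scripts, a minimal
Python sketch might look like the following. It assumes that each entry is
written exactly as shown above (a LIB{ line, "key: value" lines, and a
closing } line) and that the file itself does not carry the leading "!"
used in this listing. The function name is hypothetical.

```python
def parse_libraries_info(text):
    """Parse LIB{ ... } blocks into a list of dicts of key -> value strings."""
    libraries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "LIB{":
            current = {}  # start a new library entry
        elif line == "}" and current is not None:
            libraries.append(current)
            current = None
        elif current is not None and ":" in line:
            key, value = line.split(":", 1)
            current[key.strip()] = value.strip()
    return libraries


example = """LIB{
name: library1
insertSize: 1500
stranded: single
}
"""
libs = parse_libraries_info(example)
```

Values are kept as strings here; convert insertSize to an integer only if
your own use requires it.
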
consed.primersBadTemplatesFile: badTemplates.txt
! file of templates that you have tried, that did not work, and that you
! do not want to try again
! (OK)
consed.primersChooseTemplatesByPositionInsteadOfQuality: true
! Templates for subclone custom primer walks can be chosen either on
! the basis of the quality of the template (as determined by the quality
! of existing reads from that template) or by the location of the end of
! the template. If this resource is false, templates will be chosen
! based solely on quality. If this resource is true, then templates
! with forward/reverse pairs will be picked first, followed by templates
! that have the beginning of the insert closest to the primer.
! (OK)
consed.primersWhenChoosingATemplateMinPotentialReadLength: 350
! when choosing templates for a custom primer, only choose a template
! if the resulting read could be at least this long
! (OK)
consed.primersWindowSizeInLookingForPCR: 2000
! will look this many bases back from the pointer when looking for a PCR
! primer. This has no effect on Autofinish--it is just used for
! interactive PCR primer picking in Consed.
! (OK)

----------------------------------------------------------------------------

NEW ACE FILE FORMAT

There is a new ace file format (since early 1998). If you still
haven't changed to the new ace file format, you must do so now since
it contains information that is not contained in the old ace file
format. This additional information (e.g., the alignment and quality
clipping values) is essential for some of the Consed functions (e.g.,
navigate by single stranded, navigate by single subclone, Autofinish)
to work correctly.

Another reason to switch to the new ace format is that you will get
faster Consed startup performance. The new ace file format is also
much smaller (about 60% as big as the old).

The new phrap (Aug 1998 and later) writes the new ace format (using
the -new_ace switch). Since Consed now uses the additional
information found only in the new ace format, if you are editing an
assembly, you should first re-phrap to take advantage of this
additional information.

Consed can read either old or new ace format.
Consed can also write either new or old ace format. It writes the new
ace format by default--see 'Options'/'General Preferences'. Also see
the Consed resource:

consed.writeThisAceFormat: 2

(where 2 means 'new' and 1 means 'old')

If you have scripts that read the ace file, you will need to modify
those scripts for the new ace format. Here is the format:

Ace File Format

Refer to the accompanying sample_ace_file.txt (below)

AS <number of contigs> <total number of reads in ace file>

CO <contig name> <# of bases> <# of reads in contig> <# of base segments in contig> <U or C>

The U or C indicates whether the contig has been complemented from the
way phrap originally created it. Thus this is always U for an ace
file created by phrap.
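
For scripts that read the ace file, the AS and CO header lines can be split
on whitespace. The following Python sketch shows one way to do so; the
function names and dictionary keys are hypothetical, chosen only for this
example.

```python
def parse_as_line(line):
    """Parse 'AS <number of contigs> <total number of reads in ace file>'."""
    fields = line.split()
    if len(fields) != 3 or fields[0] != "AS":
        raise ValueError("not an AS line: %r" % line)
    return {"num_contigs": int(fields[1]), "num_reads": int(fields[2])}


def parse_co_line(line):
    """Parse 'CO <name> <# bases> <# reads> <# base segments> <U or C>'."""
    fields = line.split()
    if len(fields) != 6 or fields[0] != "CO":
        raise ValueError("not a CO line: %r" % line)
    if fields[5] not in ("U", "C"):
        raise ValueError("expected U or C, got %r" % fields[5])
    return {
        "name": fields[1],
        "num_bases": int(fields[2]),
        "num_reads": int(fields[3]),
        "num_base_segments": int(fields[4]),
        "complemented": fields[5] == "C",
    }


header = parse_as_line("AS 1 8")
contig = parse_co_line("CO Contig1 1475 8 156 U")
```
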

BQ

This starts the list of base qualities for the unpadded consensus
bases. The contig is the one from the previous CO, hence no name is
needed here.
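
Because the qualities refer to unpadded consensus bases, a script can check
that the number of quality values equals the number of consensus bases that
are not pads. The Python sketch below assumes that pads are written as '*'
(as in the sample ace file) and uses a short made-up consensus fragment; the
function names are hypothetical.

```python
def unpadded_length(padded_consensus):
    """Count the consensus bases that are not pad characters ('*')."""
    return sum(1 for base in padded_consensus if base != "*")


def parse_qualities(lines):
    """Flatten whitespace-separated quality values from several lines."""
    return [int(token) for line in lines for token in line.split()]


# Made-up fragment: 13 padded positions, one of which is a pad.
consensus = "GTCAGC*ATCCAG"
quality_lines = ["30 31 24 24 22 24", "25 28 23 27 24 27"]
qualities = parse_qualities(quality_lines)
counts_match = len(qualities) == unpadded_length(consensus)
```
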

AF <read name> <C or U> <padded start consensus position>

This line replaces the 'AssembledFrom*' line in the previous ace file
format. C or U means complemented or uncomplemented. The <read name>
is the true read name (no .comp on it as with the previous ace file
format.)
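
A script might read the AF lines as in the following Python sketch, which
also picks out the complemented reads. The function name and dictionary
keys are hypothetical.

```python
def parse_af_line(line):
    """Parse 'AF <read name> <C or U> <padded start consensus position>'."""
    fields = line.split()
    if len(fields) != 4 or fields[0] != "AF":
        raise ValueError("not an AF line: %r" % line)
    if fields[2] not in ("U", "C"):
        raise ValueError("expected U or C, got %r" % fields[2])
    return {
        "read": fields[1],
        "complemented": fields[2] == "C",
        "padded_start": int(fields[3]),
    }


af_lines = ["AF K26-217c U 498", "AF K26-572c C 1", "AF K26-766c C 408"]
reads = [parse_af_line(line) for line in af_lines]
complemented_reads = [r["read"] for r in reads if r["complemented"]]
```
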

BS <padded start consensus position> <padded end consensus position> <read name>

This replaces the 'BaseSegment*' line from the previous ace file format.
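
The BS lines can be read in the same way. In the sample ace file each base
segment starts one position after the previous one ends; the Python sketch
below parses BS lines and checks that property. The contiguity check
reflects an observation about the sample rather than a documented guarantee,
and the function names are hypothetical.

```python
def parse_bs_line(line):
    """Parse 'BS <padded start> <padded end> <read name>'."""
    fields = line.split()
    if len(fields) != 4 or fields[0] != "BS":
        raise ValueError("not a BS line: %r" % line)
    start, end = int(fields[1]), int(fields[2])
    if end < start:
        raise ValueError("segment ends before it starts: %r" % line)
    return start, end, fields[3]


def segments_are_contiguous(segments):
    """True if each segment starts one past the end of the previous one."""
    return all(
        nxt[0] == prev[1] + 1 for prev, nxt in zip(segments, segments[1:])
    )


bs_lines = [
    "BS 1 515 K26-572c",
    "BS 516 516 K26-217c",
    "BS 517 521 K26-572c",
    "BS 522 529 K26-217c",
]
segments = [parse_bs_line(line) for line in bs_lines]
```
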

RD <read name> <# of padded bases> <# of whole read info items> <# of read tags>

QA <qual clipping start> <qual clipping end> <align clipping start> <align clipping end>

This is new information not found in the previous ace file. If the
entire read is low quality, then <qual clipping start> and <qual
clipping end> will both be -1. These positions are offsets from the
left end of the read (left, as shown in Consed). Hence for bottom
strand reads, the offsets are from the end of the read. The offsets
are 1-based. That is, if the left-most base is in the aligned,
high-quality region, <qual clipping start> = 1 and <align clipping
start> = 1 (not zero).
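
Since the offsets are 1-based and the -1 values mark a read that is entirely
low quality, a script has to adjust them before slicing. The Python sketch
below shows one way; it assumes that the offsets index the read bases as
they are listed after the RD line, and the function names are hypothetical.

```python
def parse_qa_line(line):
    """Parse 'QA <qual start> <qual end> <align start> <align end>'."""
    fields = line.split()
    if len(fields) != 5 or fields[0] != "QA":
        raise ValueError("not a QA line: %r" % line)
    return tuple(int(value) for value in fields[1:])


def high_quality_segment(read_bases, qual_start, qual_end):
    """Return the high-quality part of the read, or '' if there is none.

    qual_start and qual_end are 1-based and inclusive; both are -1 when
    the entire read is low quality.
    """
    if qual_start == -1 and qual_end == -1:
        return ""
    # Convert 1-based inclusive offsets to a 0-based half-open slice.
    return read_bases[qual_start - 1:qual_end]


qual_start, qual_end, align_start, align_end = parse_qa_line("QA 19 349 19 424")
```
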

DS CHROMAT_FILE: <name of chromat file> PHD_FILE: <name of phd file> TIME: <date/time of the phd file> CHEM: <prim, term, unknown, etc> DYE: <usually ET, big, etc> TEMPLATE: <template name> DIRECTION: <fwd or rev>

There can be additional information on this line.
This replaces the DESCRIPTION line from the old ace file.
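
Because the TIME value contains both spaces and colons, the DS line cannot
simply be split on whitespace or on ':'. The Python sketch below treats any
word made of capital letters and underscores that is followed by a colon as
a key; that rule is an assumption based on the keys shown above, and the
function name is hypothetical.

```python
import re

# A key is a whole word of capitals/underscores immediately followed by ':'.
DS_KEY = re.compile(r"(?<!\S)([A-Z_]+):(?=\s|$)")


def parse_ds_line(line):
    """Split a DS line into a dict mapping each key to its value."""
    if not line.startswith("DS "):
        raise ValueError("not a DS line: %r" % line)
    body = line[3:]
    matches = list(DS_KEY.finditer(body))
    result = {}
    for i, match in enumerate(matches):
        # A value runs up to the next key, or to the end of the line.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(body)
        result[match.group(1)] = body[match.end():end].strip()
    return result


ds = parse_ds_line(
    "DS CHROMAT_FILE: K26-217c PHD_FILE: K26-217c.phd.1 "
    "TIME: Thu Sep 12 15:42:38 1996"
)
```
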

The following is for transient read tags (those generated by
crossmatch and phrap). They are not fully implemented, and the format
may eventually change. The read is implied by the location of the
whole read info item within the ace file. They are found after the DS
line for a read.

RT{
<read name> <tag type> <what program created tag> <padded read pos start> <padded read pos end> <date when tag was created in form YYMMDD:HHMISS>
}

for example:

RT{
djs14_680.s1 matchElsewhereLowQual phrap 904 933 990823:114356
}
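
The line inside an RT block can be parsed as in the Python sketch below,
which also converts the YYMMDD:HHMISS date. The function name and dictionary
keys are hypothetical. Note that Python's %y maps two-digit years 69-99 to
1969-1999 and 00-68 to 2000-2068, which may or may not match what the
program that wrote the tag intended.

```python
from datetime import datetime


def parse_rt_line(line):
    """Parse the single data line found inside an RT{ ... } block."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError("expected 6 fields in RT line: %r" % line)
    return {
        "read": fields[0],
        "type": fields[1],
        "program": fields[2],
        "padded_start": int(fields[3]),
        "padded_end": int(fields[4]),
        # YYMMDD:HHMISS, where MI is minutes
        "created": datetime.strptime(fields[5], "%y%m%d:%H%M%S"),
    }


tag = parse_rt_line(
    "djs14_680.s1 matchElsewhereLowQual phrap 904 933 990823:114356"
)
```
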

There are consensus tags now in the ace file. All consensus tags have
the following format:

CT{
<contig name> <tag type> <what program created tag> <padded cons pos start> <padded cons pos end> <date when tag was created in form YYMMDD:HHMISS> <NoTrans>
(possibly additional information)
}

The NoTrans is optional--it indicates that, when you reassemble, this
tag should not be transferred to the new assembly. This is true with
tags that should be recreated each time because they have to do with
the assembly (e.g., repeat tags).

e.g.,

CT{
Contig206 repeat tagRepeats.perl 118732 119060 990823:115033 NoTrans
AluY
}
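
A consensus tag block, including the optional NoTrans field and any
additional lines, could be parsed as in the following Python sketch. The
function name and dictionary keys are hypothetical, and the date is kept as
the string found in the file.

```python
def parse_ct_block(block_lines):
    """Parse the lines of one consensus tag, from 'CT{' through '}'."""
    if block_lines[0].strip() != "CT{" or block_lines[-1].strip() != "}":
        raise ValueError("not a complete CT block")
    fields = block_lines[1].split()
    if len(fields) not in (6, 7):
        raise ValueError("unexpected CT header: %r" % block_lines[1])
    return {
        "contig": fields[0],
        "type": fields[1],
        "program": fields[2],
        "padded_start": int(fields[3]),
        "padded_end": int(fields[4]),
        "date": fields[5],
        # NoTrans is optional; when present it is the seventh field.
        "no_trans": len(fields) == 7 and fields[6] == "NoTrans",
        # Anything between the header and the closing brace.
        "extra": [line.rstrip("\n") for line in block_lines[2:-1]],
    }


block = [
    "CT{",
    "Contig206 repeat tagRepeats.perl 118732 119060 990823:115033 NoTrans",
    "AluY",
    "}",
]
ct = parse_ct_block(block)
```
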

In the case of most consensus tag types, there is only 1 line for the
consensus tag. In the case of comment tags and oligo tags, there are
additional lines of information. The comment tag includes the comment
on the additional lines. The oligo tag has the following information:
<oligo name> <oligo bases from 5' to 3'> <melting temp> <C or U
indicating whether the oligo is top strand or bottom strand relative
to the orientation of the contig as created by phrap>
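
The oligo information line has four whitespace-separated fields, so a script
can read it as in the Python sketch below. The function name and dictionary
keys are hypothetical; the strand letter is kept exactly as written in the
file.

```python
def parse_oligo_info(line):
    """Parse '<oligo name> <bases 5' to 3'> <melting temp> <C or U>'."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError("expected 4 fields in oligo line: %r" % line)
    name, bases, melting_temp, strand = fields
    if strand not in ("U", "C"):
        raise ValueError("expected U or C, got %r" % strand)
    return {
        "name": name,
        "bases": bases,
        "melting_temp": float(melting_temp),
        "strand": strand,
    }


oligo = parse_oligo_info("standard.1 acataagacattctaaatttttact 50 U")
```
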

WA{
<tag type> <what program created tag> <date tag was created in form YYMMDD:HHMISS>
1 or more lines of data
}

This is a 'whole assembly' tag. It is used for information
referring to the assembly as a whole. Currently, phrap puts its
version and phrap command line options in a WA tag.

You can append CT, WA, and RT tags to the end of the ace file in any
order you like.
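
Since the tags may appear in any order, a script can scan for them in a
single pass. The Python sketch below yields each CT, WA, or RT block it
finds together with the lines inside it. It assumes that the opening marker
and the closing brace each stand alone on a line, as in the examples above,
so a comment line consisting only of "}" would end a block early. The
function name is hypothetical.

```python
def iter_tag_blocks(lines):
    """Yield (kind, body_lines) for each CT{, WA{ or RT{ block in lines."""
    kind = None
    body = []
    for raw in lines:
        line = raw.rstrip("\n")
        if kind is None:
            if line.strip() in ("CT{", "WA{", "RT{"):
                kind = line.strip()[:2]  # 'CT', 'WA' or 'RT'
                body = []
        elif line.strip() == "}":
            yield kind, body
            kind = None
        else:
            body.append(line)


tail = [
    "WA{",
    "phrap_params phrap 990621:161947",
    "phrap version 0.990319",
    "}",
    "",
    "CT{",
    "Contig1 repeat consed 976 986 971218:180623",
    "}",
]
blocks = list(iter_tag_blocks(tail))
```
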

Sample Ace File:

AS 1 8

CO Contig1 1475 8 156 U
agccccgggccgtggggttccttgagcactcccaaagttccaacccagga
tgtccccgacgcttaaaccttccaagtctgaaacgggaaatttgatttgc
gggctaggataaacgccggggagaaaggcagaactgccttttacccccca
aggatatcccttgggaagggcccctttgcactcagctgctccctaattat
ggcgatcctccctctatctttgtccccctgtctttcaggatccctctcAA
CAACAgaccaCTCccattaaaGAAATCtccttctgatctgcgggatcACA
TAAAACAGTGCCattcAAaAcgtcccttcCcccAATGTCtaagtgTggtg
gagcCcttcctgcCCggctctgtgcacccacggtgcctgcatgaccccgg
atGCAGTGTGCACCAGctCCCATCATTCAAgagCATGACTGTGTTGCCAA
CCAGCcacCAGGCACTGGGGAGGGAGCtgaGGGAGCAcaaAAGGGATGAG
CCACCCTCTGTcCcagAAGTGGAGGGCATGGGGCTTGGCTGGGCTTAGAG
CTAACATACACAGGATGCTGAAAAAGAACAACACAAggtGTGTGGAGCAA
AGGAAAGGGAAATCAGCTTGAAGCTGATGTTAGTGTGCTTGGGCTGAGTA
CAGCCATGCTCTCAGTTGAGGCACGGTTGGCTCCCCATGGGCAAGATCCC
TCCTGGCCCATCTCTCCTCTTATTCTCTATCCCTTCCCCAGGTCCCTGCC
TTAGAGGTTTCACCAGAGCACAGCTCCTGCCTGTGGCCAAAACAGTATTT
GGCCACTCACCGACCCAGTGTCAGC*ATCCAGATGGGTTCCACATCTCAC
AACCCT*GAGCAGCAGAGAAGGGTTTGAAAGGCCAGGGGAG*AATGAAGA
CGAAGGAGG*TGTTGGCAACAACACAGA*G*AGTCAGCAGCCAGAACGCC
AGGTATCCACACACATAAGACATTCTAAATTTTTACTCAACAGAAATTGT
CTATGTCTGTGTCTGGGCACCATGGCAACACCTTATCTCTACAAAAATTA
GCGGAATGTAGTGGTGCCTGTGTGTAGTCCCAGCTATTCAAGAGGCTGAA
GTGGGAGGATTGCTTGAGCCATGGAAGTCAAGGCTGTAGTGAGCCATGAT
TGTGTCAATGCACTCCAGACAGAGCAAGACCCTGCTCCCACCACACACCT
CaaacgaaAAAAAAaaagggcaaagatatgaactgaaatggaatatag*a
gcagcaaaaggaacagaaaattgtctatgcctggttctctagtcatgtgc
agaacagacagtatcccggccctattgagttcttggggcagttaggcttg
tgcacccttgcttctatgccacagttagggcattcgggattcccatcctt
ttccccggggttgctttttgtttgcgattaccttttcggaacaatggggg
gaaattattttccaagttgggtttg

BQ
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 23 22
23 26 24 25 25 17 17 13 14 19 21 22 22 17 17 11 8 7 10 13 18 23 28 28 31 31 32 18 18 10 10 10 12 15 8 6 6 8 8 10 15 9 11 12 15 14 15 20 20 28
30 31 24 24 22 24 25 28 23 27 24 27 18 15 15 16 21 23 18 20 13 8 7 7 12 10 9 10 10 21 12 14 14 28 27 32 24 23 20 19 15 17 15 17 19 20 13 13 13 14
14 10 10 10 23 10 10 10 10 10 11 11 18 25 24 10 10 10 10 10 14 10 11 11 11 13 12 12 10 12 10 10 10 10 10 10 14 10 12 10 10 10 10 10 10 10 14 13 15 15
17 19 24 32 37 37 37 37 32 30 30 30 28 23 23 25 15 15 20 27 32 23 22 22 27 32 34 34 21 21 12 12 12 24 32 41 45 45 37 45 45 45 45 45 37 37 37 41 41 37
37 37 41 32 32 14 14 19 32 28 37 37 41 41 45 45 37 37 37 30 30 32 32 37 37 32 28 16 16 17 32 32 37 45 37 25 25 9 9 9 25 25 37 37 37 37 37 45 40 37
37 37 45 45 37 37 37 37 38 25 25 12 25 10 10 15 32 47 52 62 62 55 43 43 34 43 43 58 58 78 77 72 72 70 70 70 74 77 69 68 55 55 55 57 61 65 70 73 68 61
64 58 56 56 64 65 67 70 70 75 79 70 70 70 70 70 70 67 71 71 71 84 63 63 62 62 62 59 59 61 61 64 64 49 42 32 10 6 18 32 35 46 47 48 47 47 47 55 55 55
55 49 46 47 47 55 55 55 54 47 47 47 48 48 54 54 54 48 48 55 47 47 47 55 49 48 48 48 55 47 48 48 47 47 47 46 48 48 48 50 44 43 44 44 49 49 73 75 82 78
74 66 66 58 54 60 68 68 61 63 47 57 45 74 85 78 70 65 62 61 61 55 73 65 59 61 75 77 80 86 81 81 83 85 85 85 90 84 78 78 73 75 78 77 86 75 76 83 79 84
87 78 72 75 72 72 76 79 82 88 90 89 89 89 89 89 90 90 90 85 85 79 83 83 90 90 90 90 90 90 90 90 90 90 90 90 90 89 89 89 90 90 90 90 90 90 90 90 90 90
90 90 90 81 66 66 62 62 62 73 89 90 90 86 86 86 86 88 88 90 90 90 90 90 90 90 88 71 68 61 61 66 66 70 65 64 70 70 76 90 90 90 90 90 90 85 90 90 90 87
87 79 79 79 79 89 74 65 71 72 79 73 73 70 75 79 76 81 81 83 80 87 89 90 82 82 90 88 88 88 88 89 86 77 77 80 79 79 79 90 90 90 90 79 79 61 58 53 76 63
57 65 76 76 76 80 89 89 89 90 90 90 90 88 88 88 88 88 88 90 90 90 90 90 90 90 90 90 90 88 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 83 79 58 43
45 68 70 61 75 76 73 68 84 88 90 90 90 90 90 89 72 54 62 62 53 55 55 80 83 80 80 83 85 83 87 83 83 83 85 85 86 86 84 81 83 82 77 78 76 76 77 77 80 88
88 87 90 90 90 90 85 84 82 71 75 62 62 37 68 75 77 74 70 71 70 72 72 80 80 80 84 83 82 66 70 55 55 55 37 55 55 55 55 55 55 55 55 54 55 55 55 48 47 47
47 47 47 47 47 47 47 47 55 50 50 50 47 47 47 47 44 44 55 48 51 51 54 54 54 54 54 55 54 54 55 55 55 55 55 55 55 55 55 55 55 55 55 51 51 51 54 51 61 61
61 61 61 61 44 42 34 34 37 37 37 44 47 47 47 61 61 61 61 61 61 61 47 49 48 47 55 54 55 55 55 55 55 44 44 44 44 46 43 43 44 44 44 51 44 47 44 34 44 44
44 44 39 39 43 42 50 42 42 38 37 38 41 50 52 55 47 47 39 44 44 46 41 42 40 43 40 41 42 38 37 42 55 50 44 44 46 48 55 55 55 37 34 34 33 42 47 42 42 42
42 55 46 46 46 48 47 48 46 43 41 39 42 39 44 44 44 48 48 38 36 36 38 38 38 44 44 44 44 44 44 42 42 36 41 40 36 36 30 33 32 29 28 28 23 12 16 10 8 8
13 14 23 20 21 28 28 31 16 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

AF K26-217c U 498
AF K26-526t U 510
AF K26-961c U 577
AF K26-394c U 797
AF K26-291s U 828
AF K26-822c U 883
AF K26-572c C 1
AF K26-766c C 408
BS 1 515 K26-572c
BS 516 516 K26-217c
BS 517 521 K26-572c
BS 522 529 K26-217c
BS 530 538 K26-572c
BS 539 569 K26-217c
BS 570 571 K26-526t
BS 572 573 K26-217c
BS 574 579 K26-526t
BS 580 584 K26-217c
BS 585 591 K26-526t
BS 592 592 K26-217c
BS 593 601 K26-526t
BS 602 604 K26-217c
BS 605 606 K26-526t
BS 607 607 K26-217c
BS 608 621 K26-526t
BS 622 628 K26-217c
BS 629 629 K26-526t
BS 630 630 K26-217c
BS 631 633 K26-526t
BS 634 634 K26-217c
BS 635 635 K26-526t
BS 636 639 K26-217c
BS 640 646 K26-526t
BS 647 648 K26-217c
BS 649 649 K26-526t
BS 650 650 K26-217c
BS 651 654 K26-766c
BS 655 655 K26-961c
BS 656 656 K26-217c
BS 657 669 K26-961c
BS 670 675 K26-217c
BS 676 676 K26-961c
BS 677 688 K26-217c
BS 689 693 K26-526t
BS 694 696 K26-217c
BS 697 698 K26-526t
BS 699 700 K26-961c
BS 701 706 K26-217c
BS 707 707 K26-961c
BS 708 708 K26-217c
BS 709 709 K26-961c
BS 710 710 K26-526t
BS 711 775 K26-961c
BS 776 776 K26-766c
BS 777 777 K26-961c
BS 778 834 K26-766c
BS 835 837 K26-961c
BS 838 840 K26-394c
BS 841 882 K26-766c
BS 883 884 K26-394c
BS 885 898 K26-766c
BS 899 899 K26-961c
BS 900 900 K26-766c
BS 901 901 K26-961c
BS 902 934 K26-766c
BS 935 935 K26-394c
BS 936 936 K26-766c
BS 937 937 K26-394c
BS 938 940 K26-766c
BS 941 944 K26-394c
BS 945 945 K26-291s
BS 946 948 K26-822c
BS 949 949 K26-766c
BS 950 951 K26-822c
BS 952 954 K26-766c
BS 955 955 K26-822c
BS 956 957 K26-394c
BS 958 962 K26-822c
BS 963 963 K26-394c
BS 964 970 K26-822c
BS 971 971 K26-394c
BS 972 972 K26-822c
BS 973 973 K26-394c
BS 974 976 K26-822c
BS 977 979 K26-394c
BS 980 986 K26-291s
BS 987 987 K26-394c
BS 988 1004 K26-822c
BS 1005 1009 K26-394c
BS 1010 1012 K26-291s
BS 1013 1014 K26-394c
BS 1015 1021 K26-822c
BS 1022 1022 K26-394c
BS 1023 1026 K26-822c
BS 1027 1028 K26-291s
BS 1029 1036 K26-822c
BS 1037 1052 K26-291s
BS 1053 1053 K26-822c
BS 1054 1060 K26-291s
BS 1061 1061 K26-822c
BS 1062 1062 K26-291s
BS 1063 1065 K26-394c
BS 1066 1068 K26-822c
BS 1069 1079 K26-291s
BS 1080 1081 K26-822c
BS 1082 1082 K26-291s
BS 1083 1084 K26-822c
BS 1085 1089 K26-291s
BS 1090 1094 K26-822c
BS 1095 1096 K26-394c
BS 1097 1099 K26-822c
BS 1100 1100 K26-291s
BS 1101 1104 K26-822c
BS 1105 1105 K26-394c
BS 1106 1110 K26-822c
BS 1111 1115 K26-291s
BS 1116 1122 K26-822c
BS 1123 1124 K26-291s
BS 1125 1135 K26-822c
BS 1136 1136 K26-394c
BS 1137 1139 K26-822c
BS 1140 1140 K26-291s
BS 1141 1150 K26-822c
BS 1151 1155 K26-291s
BS 1156 1161 K26-822c
BS 1162 1164 K26-291s
BS 1165 1167 K26-822c
BS 1168 1173 K26-291s
BS 1174 1175 K26-822c
BS 1176 1189 K26-291s
BS 1190 1196 K26-822c
BS 1197 1199 K26-291s
BS 1200 1221 K26-822c
BS 1222 1225 K26-291s
BS 1226 1227 K26-822c
BS 1228 1228 K26-394c
BS 1229 1231 K26-291s
BS 1232 1233 K26-822c
BS 1234 1235 K26-291s
BS 1236 1236 K26-394c
BS 1237 1239 K26-291s
BS 1240 1242 K26-822c
BS 1243 1244 K26-291s
BS 1245 1247 K26-394c
BS 1248 1255 K26-822c
BS 1256 1256 K26-291s
BS 1257 1257 K26-394c
BS 1258 1258 K26-291s
BS 1259 1259 K26-822c
BS 1260 1260 K26-394c
BS 1261 1265 K26-291s
BS 1266 1266 K26-822c
BS 1267 1268 K26-394c
BS 1269 1269 K26-822c
BS 1270 1275 K26-291s
BS 1276 1280 K26-822c
BS 1281 1281 K26-394c
BS 1282 1290 K26-822c
BS 1291 1292 K26-291s
BS 1293 1294 K26-822c
BS 1295 1297 K26-291s
BS 1298 1301 K26-822c
BS 1302 1302 K26-291s
BS 1303 1475 K26-822c

RD K26-217c 563 0 0
tcccCgtgagatcatcctgaAGTGGAGGGCATGGGGCTTGGCTGGGCTTA
GAGCTAACATACACAGGATGCTGAAAAAGAACAACACAAgntGTGTGGAG
CAAAGGAAAGGGAAATCAGCTTGAAGCTGATGTTAGTGTGCTTGGGCTGA
GTACAGCCATGctntCAGTTGAGGCACGGTTGGCTCCCCATGGGCAAGAT
CCCTCCTGGCCCATCTCTCCTCTTATTCTCTATCCCTTCCCCAGGTCCCT
GCCTTAGAGGTTTCACCAGAGCACAGCTCCTGcctgtggccaAAACAGTA
TTTGGCCACTCACcGAcccagTGTCAGC*atccaGatggGtTccacatct
cacaaccct*gggcagcagagaaggggtttaaaggccagggggg*tatta
agccgaaggagg*ttttggaaacaccaaggg*g*ggtcagaccccaacgc
cagtttccccaaaaaggggcattcaaatttttttctcagagattttcttt
ccttttttgggccccgggaaccttttttaaaaaatgggggattgggcccc
cttggcccccctc

QA 19 349 19 424
DS CHROMAT_FILE: K26-217c PHD_FILE: K26-217c.phd.1 TIME: Thu Sep 12 15:42:38 1996
RD K26-526t 687 0 0
ccgtcctgagtggAGggcatggggcttggctggGCTTAGAGCTAACATAC
ACAGGATGCTGAAAAAGAACAACACAAggtGTGTGGAGCAAAGGAAAGGG
AAATCAGCTTGAAGCTGATGTTAGTGTGCTTGGGCTGAGTACagcnatgc
tntgaGTTGAggaacgGTTGGCTCCCCATGGGCAAGATCCCTCCTGGCCC
ATCTCTCCTCTTATTCTCTATCCCTTCCCCAGGTCCCTGCCTTAGAGGTT
TCACcAgAGCACAgCTCctgcctgtggccaAAACAGTATTTGGccACTCA
CCGAcCCAGTGTcagt*atccAGATGGGttccACATCtcacagcccT*Ga
gcAgcagngaaGGGTttgaaagggcAgggggggaatgaaGacggaggagg
gtgttggcaaccacacaga*ggagtcaggaggcaggacggcaggtatccA
Cacacattaggcattttaaatttttacttaacaggaattgtctatggctg
ggtttgggaac*atgggaacacctattcttt*caaaa*ttggggggat*t
agtggtgc*tgt*tatagtcccgttattaaGggttaagtggggtttcttt
gccaggaggtaaggtttggggccctatttttaattacttggaaggaagcc
ttttcccagataaggaaaaaggaggtTTtttgtttta

QA 12 353 9 572
DS CHROMAT_FILE: K26-526t PHD_FILE: K26-526t.phd.1 TIME: Thu Sep 12 15:42:33 1996
RD K26-961c 517 0 0
aatattaccggcgcggggttCcgTCGGAAAGGGAAATCAGCTTGAAGCTG
ATGTTAGTGTGCTTGgGCTGAGTacaGCCATGCTCTCAGTTGAGGCACGG
TTGGCTCCCCATGGGCAAGATCCCTCCTGGCCCATCTCTCCTCTTATTCT
CTATCCCTTCCCCAGGTCCCTGCCTTAGAGGTTTCACCAGAGCACAGCTC
CTGccTGTGGCCAAAACAGTATTTGGccactgaccGACCCagtGTCAGC*
ATCCAGATGGGTTCCACATCTCacaaccCT*GAGCAGCAGAGAAGGGTTT
GAaagGcCAGGGGAG*AATGAAGACgaaggaGG*TGTTgGcaacaacaca
gA*G*AGTCAGCAGccAgaacgccaggtatccacACACATaaggCATtct
aaatttttaCtcaACaggaattgtctATgtctgtgTCtgggcaccagggc
a*cacctTATCTCTAcaaaaat*agcgggatttagtggtgcttgtgtg**
g*cccagctattcaggg

QA 20 415 26 514
DS CHROMAT_FILE: K26-961c PHD_FILE: K26-961c.phd.1 TIME: Thu Sep 12 15:42:37 1996
RD K26-394c 628 0 0
ctgcgtatcgtcacc*accCAGTGTCagctatcCAGATGGGTTCCACATC
TcacaacCCT*GAGCAGCAGAGAAGGGTTTGAAAGGCCAGGGGAG*AATG
AAGACga*gGAGG*tgTTGGCAACAacacagA*G*AGTCAGCAGCCAGAA
CGCCAGGTATCCACACACATAAGACATTCTAAATTTTTACTCAACAGAAA
TTGTCTATGTCTGTGTCTGGgcaCCATGGCAACACCTTATCTCTACAAAA
ATTAGCGGAATGTAGTGGTGCCTGtgtGTAGTCCCAGCTATTCaaGAGGC
TGAAGTGGGAGGATTGCTTGagccaTggaagtcaagGCTGTAGTGagCCa
TGattgtgtCaATGCACtcnagAcagagcaaGACCctgctcccaccacac
aacttaanaggaaaaaaaaaaaggaaaagaaatgaaatgaaatgggatat
ag*aa*aggaaaagga*cagaaa*ttgtctatgcctggt*ctctagtaat
gtcagtcagccagtttccagccttttggtcttgggcattctgctgtcaca
atctcttggaacgttgggcagggaatcccatttttcccccgtttTttttt
gtggcaattaccttttggaaccctgggt

QA 18 368 11 502
DS CHROMAT_FILE: K26-394c PHD_FILE: K26-394c.phd.1 TIME: Thu Sep 12 15:42:32 1996
RD K26-291s 556 0 0
gaggatcgcttTCCacatctcaCAaccctcgagCAgCagagAAgggTTTG
AAAGGCCAGGGGAG*AATGAAGACGa*ggAGG*TGTTGGCAACAacacag
a*G*AGTCAGCAGCCAGAACGCCAggtaTCCAcacacataAgccatTCTA
AATTTTTACTCAAcagAAATTGTCTAtgTCTGTGTCTGggcacCATGGCA
ACACCTTATCTCTACAAAAATTAGCGGAATGTAGTggtGCCTGTGTGTAG
TCCCAGCTATTCAAgaggctGAAGTgcgaggatTGCTTgagCCATGGAAG
TcaaggctgtAGTGAgccatgatTGTGTCAATGCACTCCAGACAGAGCAA
GACCCTGCTCCCAccaCACAcctcaaaaggtattgattaaaGGAaAagaa
atgaaAtgaaatgagataaaggaaaaggaaaaagaacaggatattgTCtA
Tgcctgat*ctctagt*atgtgcagacagaagtttccagccactgagttc
ttgccccagctaactttttacaaatccccctggggaaggtttggcccagg
cagatg

QA 11 373 11 476
DS CHROMAT_FILE: K26-291s PHD_FILE: K26-291s.phd.1 TIME: Thu Sep 12 15:42:31 1996
RD K26-822c 593 0 0
ggggatccg*tcatgagacga*ggAGG*TGTTGGCAACa*ca*agaag*A
GTCAGCAGCCAGAACGCCAGGTATCCACACACATAAGACATTCTAAATTT
TTACTCAACAGAAATTGTCTATGTCTGtgtCTGGGCACCATGGCAACACC
TTATCTCTACAAAAATTAGCGGAATGTAGTggTGCCTGtgtGTAGTCCCA
GCTATTCAAGAGGCTGAAGTGGGAGGATTGCTTGAGCCATGGAAGTCAAG
GCTGTAGTGAGCCATGATTGtgtCAATGCACTCCAGAcAgAGCaAgacCC
tgCTCccACCACACacctCaaacgaaAAAAAAaaagggcaaagatatgaa
ctgaaatggaatatag*agcagcaaaaggaacagaaaattgtcTATGcct
ggttctctagtcatgtgcagaacagacagtatcccggccctattgagttc
ttggggcagttaggcttgtgcacccttgcttctatgccacagttagggca
ttcgggattcccatccttttccccggggttgctttttgtttgcgattacc
ttttcggaacaatggggggaaattattttccaagttgggtttg

QA 25 333 16 593
DS CHROMAT_FILE: K26-822c PHD_FILE: K26-822c.phd.1 TIME: Thu Sep 12 15:42:36 1996
RD K26-572c 594 0 0
agccccgggccgtggggttccttgagcactcccaaagttccaacccagga
tgtccccgacgcttaaaCcttccaagtctgaaacgggaaAtttgatttgc
gggctaggataaacgccggggagaaaggcagaactgccttttaccCCcca
aggatatcccttgggaagggcccctttgcactcagctgctccctaattat
ggcgatcctccctctatctttgtccccctgtctttcaggatccctctcAA
CAACAgaccaCTCccattaaaGAAATCtccttctgatctgcgggatcACA
TAAAACAGTGCCattcAAaAcgtcccttcCcccAATGTCtaagtgTggtg
gagcCcttcctgcCCggctctgtgcacccacggtgcctgcatgaccccgg
atGCAGTGTGCACCAGctCCCATCATTCAAgagCATGACTGTGTTGCCAA
CCAGCcacCAGGCACTGGGGAGGGAGCtgaGGGAGCAcaaAAGGGATGAG
CCACCCTCTGTcCcagAAGTGGAgcgcATGGGGCTTGGCTgggcTTAGAG
CtaacaTACACAGGATGCTGAAaaagaaCAACACaatagtaaca

QA 249 584 1 586
DS CHROMAT_FILE: K26-572c PHD_FILE: K26-572c.phd.1 TIME: Thu Sep 12 15:42:34 1996
RD K26-766c 603 0 0
gaataattggaatcacggcaaaaatttggggacaaatattatttccaaaa
ttcccccagcaatcacacaggccctcaagcccatcaactcggtcattcac
cgattttcctaaatcaagggtattagcttg*ctgggcttacacctaacat
acacagcatgctcaatgagaAcaatacgagctgtgtggagcacaggaagg
ggaAAtcagcctgaagctgctgttagtgtgcttgg*ctgAGTACAGCcaT
GCTctCAGTTgaggcAcggTTGGCTCCCCATGGgCAAGATCCCTCCTggC
CCATCTCTCCTCTTaTTCTCTATCCCTTCCCCAGGTCCCTGCCTTAGagg
tttCACCAGAGCACAGCTCCTGCCTGTGGCCAAAACAGTATTTGGCCACT
CACCGACCCAGTGTCAGC*ATCCAGATGGGTTCCACATCTCACAACCCT*
GAGCAGCAGAGAAGGGTTTGAAAGGCCAGGGGAG*AATGAAGACGAAGGA
GG*TGTTGGCAACAACACAGA*G*AGTCAGCAGCCAGAACGCCAGGTATC
CACACACATAagaCATtctaAATTTTTACTCAAacgatcCccggaaccac
acg

QA 240 584 126 583
DS CHROMAT_FILE: K26-766c PHD_FILE: K26-766c.phd.1 TIME: Thu Sep 12 15:42:35 1996

WA{
phrap_params phrap 990621:161947
/usr/local/genome/bin/phrap standard.fasta.screen -new_ace -view
phrap version 0.990319
}

CT{
Contig1 repeat consed 976 986 971218:180623
}

CT{
Contig1 comment consed 996 1007 971218:180623
This is line 1 of a comment
There may be any number of lines
}

CT{
Contig1 oligo consed 963 987 971218:180623
standard.1 acataagacattctaaatttttact 50 U
seq from clone
}

----------------------------------------------------------------------------

WHAT THE COLORS MEAN

See the beginning of the Quick Tour (above). But here is a very partial list
of the colors:

Greyscale of background indicates quality
Grey base with black background--clipped off part of read (either due
to low quality or due to alignment)
Red base--discrepant with consensus
Black base--agrees with consensus
Colored area covering half of a base--tag (see Quick Tour)
Purple tag--more than 1 tag covering a base