Abstract
We present a method for estimating the most general reversible substitution matrix corresponding to a given collection of aligned DNA sequences. This matrix can then be used to calculate evolutionary distances between pairs of sequences in the collection. Our algorithms are designed for fast execution, even on large data sets. In a test case on a primate pseudogene, the matrix we arrived at resembles one obtained using maximum likelihood, and the resulting distance measure is shown to have better linearity than that obtained with a less general model.
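As a rough illustration of the kind of input an estimate like this starts from, the sketch below tallies the aligned base pairs of two sequences into a symmetric 4x4 table and reports the fraction of sites that differ. It is not the authors' implementation and does not perform the matrix estimation itself. The function names (`base_index`, `count_pairs`, `mismatch_fraction`) are hypothetical, and the choices to skip gaps and ambiguity codes and to split each count evenly between `[i][j]` and `[j][i]` are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Map a nucleotide to an index 0..3 (A, C, G, T); return -1 for anything else. */
static int base_index(char c)
{
    switch (c) {
    case 'A': case 'a': return 0;
    case 'C': case 'c': return 1;
    case 'G': case 'g': return 2;
    case 'T': case 't': return 3;
    default:            return -1;  /* gap or ambiguity code: site is skipped */
    }
}

/* Tally the aligned base pairs of two equal-length sequences.
 * Each usable site adds 0.5 to counts[i][j] and 0.5 to counts[j][i],
 * so the table is symmetric by construction (an assumption of this sketch).
 * Returns the number of sites that were counted. */
size_t count_pairs(const char *a, const char *b, double counts[4][4])
{
    assert(a != NULL && b != NULL);
    size_t n = strlen(a);
    assert(strlen(b) == n);          /* sequences must already be aligned */

    size_t used = 0;
    memset(counts, 0, sizeof(double[4][4]));
    for (size_t k = 0; k < n; k++) {
        int i = base_index(a[k]);
        int j = base_index(b[k]);
        if (i < 0 || j < 0)
            continue;
        counts[i][j] += 0.5;
        counts[j][i] += 0.5;
        used++;
    }
    return used;
}

/* Fraction of counted sites at which the two sequences differ
 * (the sum of the off-diagonal entries divided by the number of sites). */
double mismatch_fraction(double counts[4][4], size_t used)
{
    if (used == 0)
        return 0.0;
    double off_diagonal = 0.0;
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            if (i != j)
                off_diagonal += counts[i][j];
    return off_diagonal / (double)used;
}
```

For a whole collection, one such table could be accumulated per sequence pair. The raw mismatch fraction saturates as sequences diverge, which is why a distance derived from a fitted substitution matrix is of interest.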
The paper has been submitted (BibTeX citation). Contact the authors if you would like to join our mailing list.
This is the C implementation of the method.
Instructions on how to use the implementation.
The psi-eta-globin pseudogenes and their alignments that were used.
Graphs and diagrams from the article, as well as some related images.
Related links.