TREE-PUZZLE is a computer program to reconstruct phylogenetic trees from molecular sequence data by maximum likelihood. It implements a fast tree search algorithm, quartet puzzling, that allows analysis of large data sets and automatically assigns estimates of support to each internal branch. TREE-PUZZLE also computes pairwise maximum likelihood distances as well as branch lengths for user-specified trees. Branch lengths can also be calculated under the clock assumption. In addition, TREE-PUZZLE offers likelihood mapping, a method to investigate the support of a hypothesized internal branch without computing an overall tree and to visualize the phylogenetic content of a sequence alignment. TREE-PUZZLE also conducts a number of statistical tests on the data set (chi-square test for homogeneity of base composition, likelihood ratio test of the clock hypothesis, Kishino-Hasegawa test). The models of substitution provided by TREE-PUZZLE are TN, HKY, F84, and SH for nucleotides; Dayhoff, JTT, mtREV24, BLOSUM 62, VT, and WAG for amino acids; and F81 for two-state data. Rate heterogeneity is modeled by a discrete Gamma distribution and by allowing invariable sites. The corresponding parameters can be inferred from the data set.
Tree-Puzzle Documentation
Tree-Puzzle website
Tree-Puzzle on Biowulf has been built with MPI for parallel runs. To submit a job on Biowulf, create a command file similar to the following:
-------------------Sample command file for Tree-Puzzle-----------------------
#!/bin/csh
#PBS -N Ppuzzle
#PBS -m be
#PBS -k oe

set path = (/usr/local/mpich/bin $path)
cd /data/username/tree/
mpirun -machinefile $PBS_NODEFILE -np $np /usr/local/bin/ppuzzle << EOF
primates.b
y
EOF
-----------------------------------------------------------------------------
Submit this job using the qsub command, for example:
qsub -v np=4 -l nodes=2 command-file

where 'command-file' is the file you created above.
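The example above pairs np=4 with nodes=2. As a rough sketch of how the two values might be kept consistent, the following POSIX sh snippet prints (but does not run) a qsub command line for a given process count. The function name qsub_line and the value PROCS_PER_NODE=2 are assumptions made for illustration; the per-node count is only inferred from the np=4 / nodes=2 example and should be checked against the actual node configuration.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print (do not execute) a qsub command line.
# PROCS_PER_NODE=2 is an assumption read off the np=4 / nodes=2 example.
PROCS_PER_NODE=2

qsub_line() {
    np=$1      # number of MPI processes, passed to the job as $np
    file=$2    # name of the command file to submit
    # Round up so that every process has a slot on some node.
    nodes=$(( (np + PROCS_PER_NODE - 1) / PROCS_PER_NODE ))
    printf 'qsub -v np=%s -l nodes=%s %s\n' "$np" "$nodes" "$file"
}

qsub_line 4 command-file
```

Printing the line first makes it easy to inspect the node request before submitting the job by hand.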
Tree-Puzzle has many options. A summary is below:
GENERAL OPTIONS
 b                     Type of analysis?  Tree reconstruction
 k                Tree search procedure?  Quartet puzzling
 v       Approximate quartet likelihood?  No
 u             List unresolved quartets?  No
 n             Number of puzzling steps?  1000
 j             List puzzling step trees?  No
 o                  Display as outgroup?  Gibbon
 z     Compute clocklike branch lengths?  No
 e                  Parameter estimates?  Approximate (faster)
 x            Parameter estimation uses?  Neighbor-joining tree
SUBSTITUTION PROCESS
 d          Type of sequence input data?  Nucleotides
 m                Model of substitution?  HKY (Hasegawa et al. 1985)
 t    Transition/transversion parameter?  Estimate from data set
 f               Nucleotide frequencies?  Estimate from data set
RATE HETEROGENEITY
 w          Model of rate heterogeneity?  Uniform rate

Details about all options are available in the Tree-Puzzle documentation. Options are specified in the command file by simply entering the interactive menu options and values as needed. For example, to change the number of puzzling steps in your run to 8000, the command file would look like this:
--------------------------------------------------------
#!/bin/csh
#PBS -N Ppuzzle
#PBS -m be
#PBS -k oe

set path = (/usr/local/mpich/bin $path)
cd /data/username/tree/
mpirun -machinefile $PBS_NODEFILE -np $np /usr/local/bin/ppuzzle << EOF
primates.b
n
8000
y
EOF
--------------------------------------------------------
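The lines between "<< EOF" and "EOF" are the keystrokes that ppuzzle would otherwise read from the interactive menu: the alignment file name, any menu keys with their values, and a final y. As a minimal sketch, the POSIX sh function below builds that input from its arguments. The name puzzle_input is hypothetical and is not part of Tree-Puzzle; the sketch assumes each menu key and each value goes on its own line, as in the sample above.

```shell
#!/bin/sh
# Hypothetical helper: print the lines that ppuzzle reads from standard input.
# Usage: puzzle_input ALIGNMENT [MENU_ENTRY ...]
puzzle_input() {
    alignment=$1
    shift
    printf '%s\n' "$alignment"    # first line: alignment file, as in the samples
    for entry in "$@"; do
        printf '%s\n' "$entry"    # one menu key or value per line
    done
    printf 'y\n'                  # closing y, as in the sample command files
}

# Same input as the 8000-step example above.
puzzle_input primates.b n 8000
```

The output could be saved to a file and redirected into ppuzzle instead of using a here-document, which keeps the menu answers separate from the PBS directives.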