Available topics.
Tutorial.
An example-based tutorial on using Datamonkey. An excellent 'Getting Started' resource!
Data files.
Preparing data files for Datamonkey.org: general guidelines and common mistakes to avoid.
Site-by-site selection.
The following methods are available to test for selection at individual sites:
SLAC
(up to 150 sequences)
The fastest and most conservative method. Use it for large datasets (50 sequences or more) and to obtain substitution maps at each site - a useful feature for visualizing the evolutionary process.
FEL/IFEL
(up to 100 sequences)
The best overall method in terms of the tradeoff between statistical performance and computational expense. Use it for intermediate to large datasets (50 sequences or more) and if you wish to obtain good site-by-site substitution rate estimates. Use IFEL to test for sitewise selection on internal branches of the tree.
REL
(up to 75 sequences)
REL is an extension of the familiar codon-based selection analyses pioneered by Nielsen and Yang and implemented in PAML. Importantly, REL allows synonymous rate variation. It is often the only method that can infer selection from small (5-15 sequence) or low-divergence alignments, but it is also the method that makes the most assumptions and is susceptible to high rates of false positives in extreme cases.
TOGGLE
(up to 100 sequences and 50 sites)
TOGGLE analysis evaluates selection associated with the host immune response. This model was developed to identify sites which toggle between a wild-type and an escaped amino acid state. Typically, these sites have lower levels of amino acid diversity and are not detected by standard tests of diversifying selection. However, the analysis is computationally expensive, since site-wise tests of escape from wild-type amino acid residues are evaluated for each of the 20 potential wild types. Note that alignments with more than 50 sites can be uploaded, but only 50 sites will be tested for toggling. Indeed, we recommend that alignments of more than 50 sites are used for the estimation of branch lengths (the first phase of TOGGLE).
DEPS
(up to 75 sequences)
Directional Evolution of Protein Sequences uses amino acid sequences to identify directional evolution towards particular residues at individual sites. Useful for the detection of selective sweeps.
Overall signature of selection.
PARRIS
(up to 40 sequences)
An extension of standard likelihood ratio tests to deal with recombinant data. Useful for answering the question: is there evidence of positive selection anywhere in my alignment?
Evolutionary Fingerprinting.
ESD
(up to 100 sequences)
This method fits a versatile general discrete bivariate model of site to site variation in selection. The evolutionary fingerprint comprises a description of the number of selective classes, the dN/dS rates for each class and the assignment of sites to classes.
Lineage specific selection.
GA Branch
(up to 25 sequences)
A genetic algorithm based data mining procedure which automatically partitions all branches in the tree into several selective regimes (and infers the most appropriate regimes), and performs multi-model inference for increased robustness.
Evolutionary interactions between sites.
Spidermonkey
(up to 150 sequences)
Use a Bayesian Graphical Model (BGM) applied to reconstructed evolutionary histories of individual sites to find evidence of co-evolution between sites in an alignment.
Codon Model Selection.
CMS
(up to 100 sequences)
Use a Genetic Algorithm to identify the best model of codon evolution which allows for multiple non-synonymous substitution rates.
Recombination detection.
SBP/GARD
(up to 100/400 sequences)
Determine whether recombination has acted on your alignment, and identify recombination breakpoints using a Genetic Algorithm.
HIV-1 subtype assignment.
SCUEAL
(up to 500 sequences)
Assign HIV-1 subtypes based on HIV-1 pol alignments.
Ancestral Sequence Reconstruction.
ASR
(up to 400 sequences)
Reconstruct ancestral sequences using three methods.
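As a quick aid for choosing among the analyses listed above, the sequence limits can be collected into a lookup table. The following is only an illustrative sketch and not part of Datamonkey: the names `MAX_SEQUENCES` and `analyses_within_limit` are hypothetical, the "100/400" limit for SBP/GARD is read here as 100 for SBP and 400 for GARD (an assumption), and the 50-site testing limit for TOGGLE is not checked.

```python
# Sequence limits per analysis, transcribed from the topic list above.
# Assumption: "SBP/GARD (up to 100/400 sequences)" means SBP=100, GARD=400.
# The TOGGLE limit of 50 tested sites is not modelled here.
MAX_SEQUENCES = {
    "SLAC": 150,
    "FEL/IFEL": 100,
    "REL": 75,
    "TOGGLE": 100,
    "DEPS": 75,
    "PARRIS": 40,
    "ESD": 100,
    "GA Branch": 25,
    "Spidermonkey": 150,
    "CMS": 100,
    "SBP": 100,
    "GARD": 400,
    "SCUEAL": 500,
    "ASR": 400,
}


def analyses_within_limit(n_sequences):
    """Return the analyses whose sequence limit admits n_sequences, sorted by name."""
    if n_sequences < 1:
        raise ValueError("n_sequences must be a positive integer")
    return sorted(
        name for name, limit in MAX_SEQUENCES.items() if n_sequences <= limit
    )


if __name__ == "__main__":
    # Example: an alignment of 120 sequences.
    print(analyses_within_limit(120))
```

A limit only says that an analysis will accept the alignment; whether the method suits the data (for example, REL for small or low-divergence alignments) still follows the guidance given under each topic.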
Other topics.
Post your questions on our user assistance message boards if none of the above topics match your query.
Wayne Delport, Art Poon, Simon D.W. Frost and Sergei L. Kosakovsky Pond 2004-2010