Probalign uses partition function posterior probability estimates to compute
maximum expected accuracy multiple sequence alignments. It performs statistically
significantly better than the leading alignment programs Probcons v1.1, MAFFT
v5.851, and MUSCLE v3.6 on the BAliBASE 3.0, HOMSTRAD, and OXBENCH
benchmarks. Probalign improvements are largest
on datasets containing N/C-terminal extensions and on datasets with long
and heterogeneous-length sequences. On heterogeneous-length datasets
containing repeats, Probalign alignment accuracy is 10% and 15% higher than
that of the other three methods when the standard deviation of sequence
length is at least 300 and 400, respectively.
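The maximum expected accuracy step mentioned above can be illustrated with a small pairwise sketch. This is not Probalign's code: it assumes a posterior probability matrix is already available (Probalign estimates these from the partition function, which is not computed here), and the function name mea_alignment and the toy matrix in the usage note are hypothetical. Given post[i][j], the assumed probability that residue i of the first sequence is aligned to residue j of the second, a standard dynamic program finds the alignment that maximizes the sum of posteriors over aligned pairs, with gaps contributing nothing.

```python
def mea_alignment(post):
    """Return (score, pairs) for a maximum expected accuracy alignment.

    post[i][j] is an assumed posterior probability that residue i of the
    first sequence is aligned to residue j of the second (0-based).
    """
    n = len(post)
    m = len(post[0]) if n else 0

    # best[i][j]: highest summed posterior over alignments of the first
    # i residues of one sequence with the first j residues of the other.
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j - 1] + post[i - 1][j - 1],  # align i with j
                best[i - 1][j],  # residue i of first sequence left unaligned
                best[i][j - 1],  # residue j of second sequence left unaligned
            )

    # Trace back from the bottom-right cell to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + post[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif best[i][j] == best[i - 1][j]:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best[n][m], pairs
```

For example, calling mea_alignment([[0.9, 0.05, 0.0], [0.05, 0.1, 0.8]]) pairs the first residue of the two-residue sequence with the first residue of the other, and its second residue with the third, leaving the middle residue unaligned. A multiple alignment method needs further steps beyond this pairwise recursion; the sketch covers only the expected-accuracy objective.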
Citation: U. Roshan and D. R. Livesay, Probalign: Multiple sequence alignment using partition function posterior probabilities, Bioinformatics, in press, 2006.
Data used in the paper:
Related: Probalign study for RNA-genome alignment here
Last updated December 2nd, 2006