Online Lectures on Bioinformatics
Alignment statistics

Statistical Significance of Local Smith-Waterman-Alignments

According to a theorem of Arratia and Waterman [AW90], the local SW-alignment score can grow in only two ways as the sequence length increases (for a given gap-cost function): either linearly or logarithmically. The two regions are separated by a sharp phase transition.
The goal is to obtain a value for the statistical significance of a local
SW-alignment by modelling the scores by a Poisson distribution
analogously to the HSPs.
By introducing non-overlapping local suboptimal alignments, the logic
of HSP-statistics can be applied to local alignments.
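
For reference, HSP statistics are usually written in the Karlin-Altschul form: the number of HSPs with score at least S between sequences of lengths m and n is treated as Poisson distributed with mean E = K m n exp(-lambda S), so the probability of at least one such HSP is 1 - exp(-E). The following is a minimal Python sketch of that calculation. The function names are illustrative, and the values of `lam` and `K` in the usage lines are placeholders, not estimates for any real scoring system.

```python
import math


def hsp_e_value(score, m, n, lam, K):
    """Expected number of HSPs with score >= `score` (Karlin-Altschul form)."""
    return K * m * n * math.exp(-lam * score)


def hsp_p_value(score, m, n, lam, K):
    """Probability of at least one HSP with score >= `score`.

    Uses the Poisson approximation P = 1 - exp(-E); expm1 keeps
    precision when E is very small.
    """
    return -math.expm1(-hsp_e_value(score, m, n, lam, K))


if __name__ == "__main__":
    # Placeholder parameters (assumptions for illustration only).
    lam, K = 0.3, 0.1
    for s in (20, 40, 60):
        print(s, hsp_e_value(s, 300, 300, lam, K), hsp_p_value(s, 300, 300, lam, K))
```

For gapped local alignments, `lam` and `K` are not available in closed form and would have to be estimated, for example from alignments of simulated sequences.
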
Doing this in the same way presumes
that the score of the local SW-alignment grows logarithmically with
the length of the sequences (which in turn requires strong gap penalties).
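
The dependence of the growth behaviour on the gap penalty can be explored empirically. The sketch below is an illustration under stated assumptions, not the procedure used in the lecture: it aligns uniformly random DNA sequences with a plain Smith-Waterman recursion, a linear gap cost, and a +1/-1 match/mismatch scoring, all of which are choices made here for simplicity. The helper names `sw_score` and `mean_local_score` are hypothetical.

```python
import random


def sw_score(a, b, match=1, mismatch=-1, gap=2):
    """Best local alignment score (Smith-Waterman, linear gap cost per gap position)."""
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    best = 0
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            s = max(0, diag, prev[j] - gap, cur[j - 1] - gap)
            cur.append(s)
            if s > best:
                best = s
        prev = cur
    return best


def mean_local_score(length, gap, trials=3, alphabet="ACGT", seed=0):
    """Mean local score over `trials` pairs of random sequences of equal length."""
    rng = random.Random(seed)  # fixed seed so that runs are reproducible
    total = 0
    for _ in range(trials):
        a = [rng.choice(alphabet) for _ in range(length)]
        b = [rng.choice(alphabet) for _ in range(length)]
        total += sw_score(a, b, gap=gap)
    return total / trials


if __name__ == "__main__":
    # Compare a weak and a strong gap penalty at doubling sequence lengths.
    for gap in (0.25, 2):
        scores = [mean_local_score(n, gap) for n in (50, 100, 200)]
        print("gap", gap, "mean scores", scores)
```

If the mean score roughly doubles when the length doubles, the chosen gap penalty would be expected to lie in the linear region. If it increases by a roughly constant amount per doubling, that points to the logarithmic region. A handful of short sequences only gives a rough indication, so longer sequences and more trials are needed near the transition.
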
Recalling Arratia and Waterman again, there is a connection between the
regions of linear and logarithmic growth of the local SW-alignment scores
and the global alignment scores, which is summarized in the following table:
This connection is used to determine the logarithmic region, depending on the gap-cost function, from global alignments of simulated data.

A more detailed presentation of the subject as well as estimations for ![]()

Comments are very welcome. luz@molgen.mpg.de