I'm trying to understand the basic steps the FASTA algorithm uses to search a database for sequences similar to a query sequence. These are the steps of the algorithm:
- Identify common k-words between the two sequences I and J
- Score diagonals with k-word matches, identify the 10 best diagonals
- Rescore initial regions with a substitution score matrix
- Join initial regions using gaps, penalise for gaps
- Perform dynamic programming to find final alignments
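
For concreteness, here is how I read the first two steps, as a minimal Python sketch. The function names (`kword_positions`, `diagonal_hits`, `best_diagonals`) are my own, and scoring a diagonal by a plain count of k-word matches is a simplification I am assuming for illustration, not the actual FASTA scoring scheme.

```python
from collections import defaultdict


def kword_positions(seq, k):
    """Map each k-word to the list of start positions where it occurs in seq."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return positions


def diagonal_hits(seq_i, seq_j, k):
    """Count shared k-words per diagonal, where diagonal d = i - j.

    Assumption: a diagonal's score is simply its number of k-word matches.
    """
    index = kword_positions(seq_i, k)
    counts = defaultdict(int)
    for j in range(len(seq_j) - k + 1):
        # Every occurrence of this k-word in seq_i gives one hit on diagonal i - j.
        for i in index.get(seq_j[j:j + k], []):
            counts[i - j] += 1
    return dict(counts)


def best_diagonals(counts, n=10):
    """Return the n highest-scoring (diagonal, score) pairs, best first."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


if __name__ == "__main__":
    hits = diagonal_hits("HARFYAAQIVL", "VDMAAQIA", k=2)
    print(best_diagonals(hits))
```

In this toy example the shared 2-words AA, AQ and QI all fall on the same diagonal (offset 2), which is the kind of region I understand the later steps then rescore and join.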
I'm confused by the 3rd and 4th steps: how the PAM250 score matrix is used to rescore the initial regions, and how to "join using gaps".
Can somebody explain these two steps as specifically as possible? Thanks.