Finding the optimal alignment of two nucleotide sequences is an important challenge in bioinformatics. Since there are (m+n)!/(m! n!) ways of aligning two sequences of lengths m and n, exhaustive comparison is infeasible and smarter ways of comparing DNA sequences had to be developed. This program implements a multithreaded sequence alignment algorithm based on the algorithm of Smith and Waterman [4] and Rognes's improvements [3], both available in the literature. [1] provides more information on the input files used.
This program was written with Hamza Nougba and Hossein Jaidi at the Ecole Polytechnique de Bruxelles, 2016.
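To make the Smith-Waterman recurrence mentioned above concrete, here is a minimal sketch in Python. It is not the code of this program: the function name `smith_waterman_score`, the fixed match/mismatch scores (standing in for a substitution matrix such as BLOSUM) and the linear gap penalty are all assumptions chosen for illustration, and it omits multithreading and the SIMD optimisations described in [3].

```python
def smith_waterman_score(query, subject, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between two sequences.

    Illustrative scoring only: a real run would look up each pair of
    residues in a substitution matrix instead of using match/mismatch.
    """
    # prev is the previous row of the score matrix H; row 0 is all zeros.
    prev = [0] * (len(subject) + 1)
    best = 0
    for q in query:
        curr = [0]  # column 0 is always zero
        for j, s in enumerate(subject, start=1):
            diag = prev[j - 1] + (match if q == s else mismatch)
            up = prev[j] + gap        # gap in the subject
            left = curr[j - 1] + gap  # gap in the query
            # Clamping at 0 is what makes the alignment local.
            score = max(0, diag, up, left)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best
```

Only two rows of the matrix are kept in memory, which is enough when just the score is needed; recovering the aligned subsequences would require storing the full matrix or traceback pointers.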
Sources:
[1] Farrar, Michael S. 2010. “NCBI BLAST Database Format” 4: 1–9.
[2] Gotoh, Osamu. 1982. “An Improved Algorithm for Matching Biological Sequences.” Journal of Molecular Biology 162 (3): 705–8. https://doi.org/10.1016/0022-2836(82)90398-9.
[3] Rognes, Torbjørn. 2011. “Faster Smith-Waterman Database Searches with Inter-Sequence SIMD Parallelisation.” BMC Bioinformatics 12 (1). BioMed Central Ltd: 221. https://doi.org/10.1186/1471-2105-12-221.
[4] Smith, T.F., and M.S. Waterman. 1981. “Identification of Common Molecular Subsequences.” Journal of Molecular Biology 147 (1): 195–97. https://doi.org/10.1016/0022-2836(81)90087-5.
How to start the program?
1) Compilation: $ make
2) Execution:
PS: make sure that the following files are located in the folder where the program is executed:
1. the query sequence file
2. the three database files (.pin, .psq and .phr)
3. the BLOSUM matrix file
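The file checklist above can be verified before launching the program. The following Python sketch is only a convenience helper and not part of this repository; the function name `missing_inputs` and its parameters are hypothetical, and it assumes the three database files share one base name and differ only in their extension.

```python
from pathlib import Path


def missing_inputs(folder, query_name, db_base, matrix_name):
    """Return the names of required input files that are absent from folder.

    All arguments are placeholders: pass the actual names of your query
    file, database base name and BLOSUM matrix file.
    """
    folder = Path(folder)
    required = [query_name, matrix_name]
    # Assumption: the database files are <db_base>.pin, .psq and .phr.
    required += [f"{db_base}{ext}" for ext in (".pin", ".psq", ".phr")]
    return [name for name in required if not (folder / name).is_file()]
```

For example, `missing_inputs(".", "query.fasta", "database", "BLOSUM62")` returns an empty list when everything is in place in the current folder; the file names in this call are made up and should be replaced with your own.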