This repo consists of two modules:
- A data parser that streams UniProt protein sequences.
- A Sequence Alignment tool implementing the Needleman-Wunsch algorithm.
See SeqAlign.py.
Performs pairwise global sequence alignment and reports the optimal alignment score, the optimal alignment, and the corresponding sequence identity.
- Adjustable substitution score matrix (default BLOSUM62)
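As a rough illustration of the Needleman-Wunsch idea, here is a minimal sketch. It is not the code in SeqAlign.py: the function name `needleman_wunsch`, the linear gap penalty, and the toy +1/-1 scoring (used here instead of BLOSUM62 to keep the example dependency-free) are all assumptions. Sequence identity is taken as matches divided by alignment length, which is only one of several common conventions.

```python
def needleman_wunsch(a, b, score=None, gap=-1):
    """Global alignment of strings a and b (illustrative sketch only)."""
    if score is None:
        # Assumed toy scoring: +1 for a match, -1 for a mismatch (not BLOSUM62).
        score = lambda x, y: 1 if x == y else -1
    n, m = len(a), len(b)

    # F[i][j] holds the best score for aligning a[:i] with b[:j].
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(
                F[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # align a[i-1] with b[j-1]
                F[i - 1][j] + gap,                            # gap in b
                F[i][j - 1] + gap,                            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + score(a[i - 1], b[j - 1]):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    aligned_a = "".join(reversed(out_a))
    aligned_b = "".join(reversed(out_b))

    # Identity: identical aligned residues over alignment length (assumed convention).
    matches = sum(x == y for x, y in zip(aligned_a, aligned_b) if x != "-")
    identity = matches / len(aligned_a) if aligned_a else 0.0
    return F[n][m], aligned_a, aligned_b, identity
```

Passing a different `score` callable (for example, a lookup into a substitution matrix) is one way the "adjustable substitution score matrix" could be modelled in such a sketch.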
See CompareProteinSeq.py.
Given a sequence ID, fetches https://rest.uniprot.org/uniprotkb/${seq_id}.fasta and parses the returned UniProt protein sequence (FASTA format).
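The fetch-and-parse step might look roughly like the sketch below. The names `parse_fasta` and `fetch_uniprot_sequence` are hypothetical and not taken from CompareProteinSeq.py; the sketch uses only the standard library and assumes the response holds a single FASTA record.

```python
from urllib.request import urlopen

# URL pattern from the description above; seq_id is a UniProt accession.
UNIPROT_URL = "https://rest.uniprot.org/uniprotkb/{seq_id}.fasta"


def parse_fasta(text):
    """Return (header, sequence) for the first record in FASTA text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                break  # stop at a second record; only the first is used
            header = line[1:]
        elif header is not None:
            chunks.append(line)  # sequence lines are concatenated
    if header is None:
        raise ValueError("no FASTA header line found")
    return header, "".join(chunks)


def fetch_uniprot_sequence(seq_id, timeout=10):
    """Download one UniProt entry in FASTA format and parse it (needs network)."""
    with urlopen(UNIPROT_URL.format(seq_id=seq_id), timeout=timeout) as resp:
        return parse_fasta(resp.read().decode("utf-8"))
```

With network access, `fetch_uniprot_sequence("P69905")` would request the entry for that accession; `parse_fasta` can be exercised offline on any FASTA string.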
Runs the global sequence alignment experiments by calling SeqAlign on a comparison set containing the protein sequence names (IDs).
To get the comparison results, execute: $ python __main__.py
- Implement pairwise local sequence alignment using the Smith–Waterman algorithm.