In this repository I'm gathering useful functions for genomics data analysis. I have been following the Johns Hopkins University specialisation in data science for genomics, and its courses are a source of inspiration for this project.
For now, here are the features I have added:
- (Import_fasta and Parse FASTA with Biopython) FASTA file reading and sequence pre-processing, in two scripts: one that uses Biopython and one that does not.
- (Biopython) BLAST alignment script and processing of results.
- The main program, which accepts single sequences or FASTA files and returns information about the input, including an ORF finder and the most frequent repeats in your file.
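
To give an idea of what these features involve, here is a minimal pure-Python sketch. It is not the code from the scripts in this repository: the function names (`parse_fasta`, `longest_orf`, `most_frequent_repeat`) are hypothetical, and it assumes that an ORF runs from an `ATG` to the first in-frame stop codon on the forward strand only, and that repeats are counted with overlaps.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}


def parse_fasta(text):
    """Parse FASTA-formatted text into a {record_id: sequence} dict."""
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Assumption: the record id is the first word of the header line.
            header = line[1:].split()
            current_id = header[0] if header else ""
            records[current_id] = []
        elif current_id is not None:
            records[current_id].append(line.upper())
    return {rid: "".join(parts) for rid, parts in records.items()}


def longest_orf(seq, frame):
    """Return (start_index, orf) for the longest ORF in forward frame 0, 1 or 2.

    Returns (-1, "") when the frame contains no complete ORF.
    """
    best = (-1, "")
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i
        elif start is not None and codon in STOP_CODONS:
            orf = seq[start:i + 3]
            if len(orf) > len(best[1]):
                best = (start, orf)
            start = None
    return best


def most_frequent_repeat(sequences, n):
    """Return (repeat, count) for the most common substring of length n.

    Overlapping occurrences are counted across all sequences.
    """
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            counts[seq[i:i + n]] += 1
    return counts.most_common(1)[0] if counts else None


if __name__ == "__main__":
    example = ">seq1 demo\nCCATGAAATAGGG\n>seq2\nATGATGATG\n"
    records = parse_fasta(example)
    print(longest_orf(records["seq1"], 2))
    print(most_frequent_repeat(records.values(), 3))
```

Biopython offers `SeqIO.parse` for the reading step, so a version with that dependency would only need the ORF and repeat functions.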