
A collection of Biopython scripts.

Primary Language: Python



Creates the fasta file kmers.fasta.


Finds the first open reading frame (ORF) in the non-local FASTA file /scratch/Drosophila/dmel-all-chromosome-r6.17.fasta, translates the ORF to protein, and prints the protein.


Finds the ORF from a DNA sequence.
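As a rough illustration of ORF finding, here is a minimal plain-Python sketch. It assumes an ORF means the first ATG followed by an in-frame stop codon on the forward strand only; the function name `find_first_orf` is hypothetical and the actual script may define an ORF differently.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_first_orf(dna):
    """Return the first ORF (ATG through the first in-frame stop), or None."""
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon from this ATG until an in-frame stop appears.
        for i in range(start, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return dna[start:i + 3]
        # No in-frame stop after this ATG, so try the next ATG.
        start = dna.find("ATG", start + 1)
    return None


print(find_first_orf("ccatgaaatttgggtagcc"))  # ATGAAATTTGGGTAG
```

The sketch ignores the reverse strand; a fuller search would repeat it on the reverse complement.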


Creates a template to print out an organized message.


Takes a non-local fasta file and prints out the record id.
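For reference, the record id that Biopython reports for a FASTA entry is the header text up to the first whitespace. The plain-Python sketch below mimics that behaviour on a list of lines; `record_ids` and the example headers are made up for illustration.

```python
def record_ids(lines):
    """Yield the id of each FASTA record: header text up to the first whitespace."""
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            yield fields[0] if fields else ""


# Invented example records, not taken from any real file.
example = [">seq1 first example record", "ACGT", ">seq2 second example record", "GGCC"]
for rid in record_ids(example):
    print(rid)
```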


Uses the template from format.py and prints out an organized message about a non-local FASTA file.


Transcribes a line of random DNA and prints out the RNA.
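A minimal sketch of the idea, in plain Python rather than Biopython: it generates a random DNA string and transcribes it by replacing T with U, which treats the input as the coding strand (the same convention as Biopython's `Seq.transcribe()`). The names `random_dna` and `transcribe`, the length, and the seed are assumptions for illustration.

```python
import random


def random_dna(length, seed=None):
    """Build a random DNA string of the given length."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))


def transcribe(dna):
    """Transcribe coding-strand DNA to RNA by replacing T with U."""
    return dna.upper().replace("T", "U")


dna = random_dna(30, seed=1)
rna = transcribe(dna)
print(dna)
print(rna)
```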


Prints out a protein sequence that has been translated from transcribed RNA.
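The translation step can be sketched without Biopython by building the standard genetic code (NCBI translation table 1) by hand. This is only an illustration; the script itself presumably relies on Biopython's `Seq.translate()`, and the `translate` function and `to_stop` flag below are hypothetical stand-ins.

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(rna, to_stop=True):
    """Translate RNA codon by codon; optionally stop at the first stop codon."""
    rna = rna.upper()
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = CODON_TABLE[rna[i:i + 3]]
        if aa == "*" and to_stop:
            break
        protein.append(aa)
    return "".join(protein)


print(translate("AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG"))  # MAIVMGR
```

With `to_stop=False` the same input yields `MAIVMGR*KGAR*`, where `*` marks a stop codon.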


This Biopython script creates a SeqRecord object with the following parameters: seq: "aaaatgggggggggggccccgtt"; id: "#12345"; description: "example 1"; alphabet: "generic_dna". The object is then written to a sequence file in GenBank format named BioPyhon_seq.gb.


This Biopython script creates a list containing two Seq objects: a sequence retrieved from GenBank by GI number (515056) and a sequence retrieved from GenBank by accession (J01673.1). It then prints the sequences in the list, as well as the type, location, and strand of each feature.


This Biopython script reads a multi-sequence FASTA file and outputs a FASTA file containing the reverse complements of the original sequences. It takes two command-line arguments: the name of the original multi-sequence FASTA file and the desired name of the new multi-sequence FASTA file. Tested on yeast.fasta, producing the output yeast_reverse.fasta.
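The reverse-complement workflow might look roughly like the plain-Python sketch below. It is not the script itself, which presumably uses Biopython's `SeqIO` and `Seq.reverse_complement()`; all function names here are hypothetical, and only unambiguous A, C, G, and T bases are complemented.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def reverse_complement(seq):
    """Complement each base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def reverse_complement_lines(lines):
    """Return FASTA output lines holding the reverse complement of each record."""
    out = []
    for header, seq in read_fasta(lines):
        out.append(f">{header}")
        out.append(reverse_complement(seq))
    return out


def reverse_complement_fasta(in_path, out_path):
    """Read one FASTA file and write its reverse complements to another."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.write("\n".join(reverse_complement_lines(src)) + "\n")
```

Under these assumptions, calling `reverse_complement_fasta("yeast.fasta", "yeast_reverse.fasta")` would mirror the tested run described above.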


This Biopython script revises the sliding_window_fasta.py script from the SlidingWindow repo to open its input with SeqIO. It modifies the script sliding_window.py and uses the same functions but, instead of taking a k-mer size and a string, it takes a k-mer size and a FASTA file. It runs separately for each header, printing the header name on one line and then each k-mer from that FASTA entry, followed by a tab and the GC content of that k-mer rounded to two decimal places. Tested with the dengue.fasta file.
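The per-record output described above can be sketched as follows. The function names `sliding_window`, `gc_content`, and `kmer_report` are assumptions, since the original functions are not shown here, and FASTA parsing is left out so the example stays self-contained.

```python
def sliding_window(k, sequence):
    """Return every k-mer in sequence, advancing one base at a time."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def gc_content(kmer):
    """Return the fraction of G and C bases in kmer, rounded to two decimals."""
    kmer = kmer.upper()
    return round((kmer.count("G") + kmer.count("C")) / len(kmer), 2)


def kmer_report(header, sequence, k):
    """Build the output lines for one record: header, then k-mer<TAB>GC content."""
    lines = [header]
    for kmer in sliding_window(k, sequence):
        lines.append(f"{kmer}\t{gc_content(kmer):.2f}")
    return lines


print("\n".join(kmer_report(">example", "ATGCGC", 4)))
```

For the made-up record above, this prints the header followed by ATGC (0.50), TGCG (0.75), and GCGC (1.00), each separated from its GC content by a tab.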

Getting Started



Victoria R. Liebsch-Aljawahiri

Date Created

11 October 2021