Last Updated: 05/09/2016
cDNA_Cupcake is a miscellaneous collection of Python and R scripts used for analyzing sequencing data. Most of the scripts require only Biopython. Scripts that require additional libraries say so in their documentation.
- Python >= 2.7
- Biopython
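To confirm that the two prerequisites above are met before running any script, a minimal check could look like the sketch below. The function name `check_prerequisites` is hypothetical and not part of cDNA_Cupcake. The only assumption about Biopython is that it is importable as the `Bio` package.

```python
import sys


def check_prerequisites(min_python=(2, 7)):
    """Return a dict mapping each prerequisite to True/False.

    Hypothetical helper for illustration; not part of cDNA_Cupcake.
    """
    try:
        import Bio  # noqa: F401  (Biopython installs as the "Bio" package)
        has_biopython = True
    except ImportError:
        has_biopython = False

    return {
        "python": tuple(sys.version_info[:2]) >= tuple(min_python),
        "biopython": has_biopython,
    }


if __name__ == "__main__":
    for name, ok in sorted(check_prerequisites().items()):
        print("{0}: {1}".format(name, "OK" if ok else "MISSING"))
```

The import is wrapped in `try`/`except` rather than using `importlib.util`, so the same check runs on both Python 2.7 and Python 3.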
Since most of the scripts are independent (do not depend on each other), you can either clone the whole directory, or, if you are only interested in a specific script, just download that specific script to your local drive.
You can clone the GitHub repository, then add the GitHub repo path to your $PATH variable. The scripts are organized into different sub-directories (e.g. sequence/, rarefaction/), so you will have to add each sub-directory individually.
git clone https://github.com/Magdoll/cDNA_Cupcake.git
export PATH=$PATH:<path_to_Cupcake>/sequence/
export PATH=$PATH:<path_to_Cupcake>/rarefaction/
Please report any issues or bugs to Issues.
Please see wiki for the latest maintained list of scripts.
A brief list of the current scripts:
- get_seq_stats.py: Summarize length distribution of a FASTA/FASTQ file.
- rev_comp.py: Reverse complement a sequence from command line.
- fa2fq.py and fq2fa.py: Convert between FASTA and FASTQ format.
- sort_fasta_by_len.py: Sort FASTA file by length (increasing or decreasing).
- get_seqs_from_list.py: Extract list of sequences given a FASTA file and a list of IDs.
- simulate.py: Simulate error in sequences.
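To give a feel for the kind of operations these scripts perform, here is a minimal standard-library sketch of reverse complementing a sequence and sorting FASTA records by length. It is an illustration only, not the code in rev_comp.py or sort_fasta_by_len.py (which rely on Biopython). The function names `reverse_complement`, `read_fasta`, and `sort_by_length` are hypothetical, and the sketch assumes Python 3 and plain DNA sequences containing only A, C, G, T, and N.

```python
# Translation table for DNA bases; assumes only A/C/G/T/N in either case.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fasta(lines):
    """Yield (id, sequence) tuples from an iterable of FASTA-formatted lines.

    The id is the first whitespace-delimited token after '>'.
    """
    seq_id, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            header = line[1:].split()
            seq_id, chunks = (header[0] if header else ""), []
        elif seq_id is not None:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def sort_by_length(records, decreasing=False):
    """Return (id, sequence) records sorted by sequence length."""
    return sorted(records, key=lambda rec: len(rec[1]), reverse=decreasing)


if __name__ == "__main__":
    example = [">seq1 example", "ACGT", "AC", ">seq2", "A"]
    for seq_id, seq in sort_by_length(read_fasta(example), decreasing=True):
        print(seq_id, len(seq), reverse_complement(seq))
```

For real data, prefer the scripts themselves or Biopython's parsers, which handle FASTQ input and ambiguous bases that this sketch ignores.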