Scripts and a Makefile to generate an infile set for the Evolver suite of genome evolution simulation tools.

evolverInfileGeneration

Dent Earl

2009-2012

Introduction

This repo is intended to assist in the creation of an infile set for use with the EVOLVER suite of genome evolution tools written by Robert C. Edgar, George Asimenos, Serafim Batzoglou and Arend Sidow. http://www.drive5.com/evolver/

Dependencies

Installation

  1. Download the project.
  2. cd into the project directory.
  3. Type make.
  4. Edit your PYTHONPATH environment variable to contain the parent directory of the evolverInfileGeneration/ directory.
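
As a quick sanity check for step 4, the following minimal sketch tests whether the parent directory of the project is among a list of path entries (by default, Python's own sys.path, which includes PYTHONPATH entries). The function name parent_on_path and the example paths are hypothetical and are not part of this repo.

```python
import os
import sys


def parent_on_path(project_dir, path_entries=None):
    """Return True if the parent directory of project_dir is in path_entries.

    path_entries defaults to sys.path, which Python populates in part
    from the PYTHONPATH environment variable.
    """
    if path_entries is None:
        path_entries = sys.path
    # Normalize so that trailing slashes and relative paths compare equal.
    parent = os.path.dirname(os.path.abspath(project_dir))
    return any(os.path.abspath(p) == parent for p in path_entries if p)


if __name__ == "__main__":
    # Hypothetical location; substitute wherever you downloaded the project.
    print(parent_on_path("/home/user/src/evolverInfileGeneration"))
```

If this prints False, the directory that contains evolverInfileGeneration/ has not yet been added to PYTHONPATH in the current shell.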

Operation

  1. Copy bin/infileMakefile into a new directory where you want all the infiles to be created.

  2. Edit the chromosome list on line 58 of the copied infileMakefile, replacing the default shown below with the chromosomes you want:

    chrs:= 20 21 22

  3. To test, type: make -f infileMakefile testSet=YES MODEL=path/to/evolver/model/model.txt

  4. To run, type: make -f infileMakefile MODEL=path/to/evolver/model/model.txt

Pro tip: The Makefile has been written so that you can use make's parallel option, -j, for a speedup, provided you have extra processors to spare.
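
For scripted runs, the test and run commands above, combined with -j, can be assembled as in the sketch below. The helper build_make_command is hypothetical and not part of this repo; it uses only the arguments shown above (-f infileMakefile, testSet=YES, MODEL=...) plus make's standard -j option, and it defaults the job count to the number of processors Python reports.

```python
import os


def build_make_command(model_path, test_set=False, jobs=None):
    """Build the argument list for a (possibly parallel) infileMakefile run.

    model_path: path to the Evolver model file, passed as MODEL=...
    test_set:   if True, add testSet=YES to do the test run instead.
    jobs:       value for make's -j option; defaults to the CPU count.
    """
    if jobs is None:
        jobs = os.cpu_count() or 1  # cpu_count() may return None
    cmd = ["make", "-f", "infileMakefile", "-j", str(jobs)]
    if test_set:
        cmd.append("testSet=YES")
    cmd.append("MODEL=" + model_path)
    return cmd


if __name__ == "__main__":
    # Print the command rather than running it; pass the list to
    # subprocess.run(cmd, check=True) to execute it.
    print(" ".join(build_make_command("path/to/evolver/model/model.txt",
                                      test_set=True, jobs=4)))
```

Using every available processor is only a default for this sketch; choose a smaller jobs value on a shared machine.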

Extras

  • singleRegionGenerator.sh - A scaled-down version of infileMakefile, for when you don't want a full chromosome but just a small piece of one. Used in the Cactus publication: Paten, Earl, Nguyen, Diekhans, Zerbino and Haussler. Cactus: Algorithms for genome multiple sequence alignment. Genome Research, 2011. http://genome.cshlp.org/content/early/2011/06/09/gr.123356.111.abstract
  • splitEvolverInfiles.py - Used to cut a paired FASTA and GFF into smaller FASTAs and GFFs with correct new coordinates for the GFF files.
  • src/testSplitEvolverInfiles.py - Unit tests for splitEvolverInfiles.py. Invoke with: python src/testSplitEvolverInfiles.py --verbose
  • subsetRemapGP.py - Takes a .gp (genePred) file and arguments that define a subset region, and returns just that region, with all elements' coordinates transformed to the subset, in .gp format.
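
The coordinate correction that splitEvolverInfiles.py performs on GFF files can be illustrated with the sketch below. This is not the script's actual code: the function names are hypothetical, the window is assumed to be given as 1-based inclusive positions on the original sequence (matching GFF's own convention), and features that are not fully inside the window are simply dropped, which may differ from how the real script treats them.

```python
def slice_sequence(sequence, win_start, win_end):
    """Return the sub-sequence for a 1-based, inclusive window."""
    return sequence[win_start - 1:win_end]


def remap_gff_line(line, win_start, win_end):
    """Shift one GFF feature into the coordinates of a sub-sequence.

    GFF columns 4 and 5 (indices 3 and 4) hold the 1-based, inclusive
    start and end. Returns the rewritten line, or None if the feature
    is not fully contained in the window (an assumption of this sketch).
    """
    fields = line.rstrip("\n").split("\t")
    start, end = int(fields[3]), int(fields[4])
    if start < win_start or end > win_end:
        return None
    offset = win_start - 1  # position win_start becomes position 1
    fields[3] = str(start - offset)
    fields[4] = str(end - offset)
    return "\t".join(fields)


if __name__ == "__main__":
    # Illustrative feature only; not taken from any real annotation.
    feature = "chr20\texample\tCDS\t1500\t1600\t.\t+\t0\tid=1"
    print(remap_gff_line(feature, 1001, 2000))
```

The same offset subtraction is the core idea behind transforming coordinates to a subset region, though genePred files use a different column layout and coordinate convention than GFF, so this sketch should not be applied to .gp files as written.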