SEQXTANT README
seqxtant organizes related genomic sequences in order to study orthologs and
paralogs in greater detail.
Overview
- Install blast-legacy via bioconda
- Get grimoire from KorfLab GitHub
- Get seqxtant from KorfLab GitHub
- Create a directory where you want to store the seqxtant database files
- Point SEQXTANT to the database directory or use the --env option
- Download genomic fasta, masked, and gff3 files for your favorite genomes
  - Possible custom post-processing required here
- Use seqxtant create to make a new database
- Use seqxtant add to add fasta and gff files to the database
- Use seqxtant status to get information about the database
- Use seqxtant validate to check overall genome stats
- Use seqxtant cluster to cluster related sequences...
Database
The "database" is a file called seqxtant.json that lives inside a directory.
That directory is either specified by the SEQXTANT environment variable or
passed in with the --env option. Probably should be a SQLite database...
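
The lookup described above can be sketched in a few lines of Python. This is only an illustration, not seqxtant's actual code: the helper names resolve_db_dir and load_db are hypothetical, and the rule that an explicit --env value takes precedence over the SEQXTANT variable is an assumption. Nothing is assumed about the layout of seqxtant.json beyond it being valid JSON.

```python
import json
import os
from pathlib import Path

DB_FILENAME = "seqxtant.json"  # file name stated in this README


def resolve_db_dir(env_option=None, environ=None):
    """Return the database directory (hypothetical helper).

    Assumption: a value passed via --env wins over the SEQXTANT variable.
    """
    environ = os.environ if environ is None else environ
    location = env_option or environ.get("SEQXTANT")
    if not location:
        raise RuntimeError("set SEQXTANT or pass --env")
    return Path(location)


def load_db(db_dir):
    """Read seqxtant.json from db_dir and return its parsed contents."""
    with open(Path(db_dir) / DB_FILENAME) as fh:
        return json.load(fh)
```

Calling load_db(resolve_db_dir()) would then read the database from wherever SEQXTANT points, and raise an error if neither the variable nor the option is set.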
Fasta and GFF3
Dev Notes
Some clades used for testing
- worms
  - C. angaria
  - C. brenneri
  - C. briggsae
  - C. elegans
  - C. japonica
  - C. remanei
  - C. tropicalis
- plants
- flies