Bash script that takes reference sequences (e.g., exons, UCE loci) and maps assembled contigs to them.
Requires:
Inputs:
- Directory of fasta contigs (or other fasta). Could be modified to work on read files (to do?)
- Fasta reference file to map the contigs to
This runs on directories of contigs, but may also work on single contig files (untested).
To run:
chmod +x generate-consensus.sh
./generate-consensus.sh ./path/to/contigs-dir ./path/to/output-dir ./path/to/reference_fasta_file.fasta num_processors
Multi-faceted Python script. Originally intended to convert the fastq output of the above bash script into fasta files sorted by locus.
It was later expanded to generate phyluce inputs (after matching / single fasta file generation), convert to nexus, concatenate nexus files, and prepare a phylip file.
Run generate-alignments.py -h for additional information on options.
Requires:
Inputs:
- Directory containing fastq output from above script.
- Fasta reference file used in above script.
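The original purpose of the Python script (fastq in, per-locus fasta out) can be illustrated with a minimal sketch. This is not the code in generate-alignments.py: the function names are hypothetical, and it assumes plain four-line fastq records whose header is the locus name, with one fastq per sample. If the consensus fastq wraps sequences over several lines, the parser would need to be adapted.

```python
from collections import defaultdict


def read_fastq(lines):
    """Yield (locus_name, sequence) pairs.

    Assumption: each record is exactly four lines (header, sequence, '+', quality).
    """
    lines = [ln.strip() for ln in lines if ln.strip()]
    for i in range(0, len(lines) - 3, 4):
        header, seq = lines[i], lines[i + 1]
        if not header.startswith("@"):
            raise ValueError(f"expected '@' header, got: {header!r}")
        # Keep only the first word of the header as the locus name.
        yield header[1:].split()[0], seq


def group_by_locus(samples):
    """Regroup per-sample records into per-locus lists.

    `samples` maps a sample name to an iterable of fastq lines
    (for example, an open file handle). Hypothetical helper.
    """
    loci = defaultdict(list)
    for sample, lines in samples.items():
        for locus, seq in read_fastq(lines):
            loci[locus].append((sample, seq))
    return dict(loci)


def to_fasta(records):
    """Format (name, sequence) pairs as fasta text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)
```

Each entry of the result of group_by_locus can then be written to its own file, one fasta per locus, with the sample names as sequence headers.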
Need to look more into pruning / cleaning the mapped 'alignments'. Right now, the 'alignment' is generated by position during the mapping, not by a traditional multiple alignment program. This creates problems with low-coverage contigs and large references (i.e., many Ns are present in the datasets I've tried). GBLOCKS may help with this, but it is currently untested.
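One simple way to approach the N problem, short of running GBLOCKS, is to drop sequences and columns that are mostly N. The sketch below is only an illustration of that idea and is not part of either script: the function names and the 0.5 threshold are assumptions, and it expects all sequences of a locus to have the same length (which holds for position-based 'alignments').

```python
def n_fraction(seq):
    """Fraction of positions that are N (case-insensitive); 0.0 for an empty sequence."""
    if not seq:
        return 0.0
    return seq.upper().count("N") / len(seq)


def drop_n_rich_sequences(alignment, max_n=0.5):
    """Keep only sequences whose N fraction is at most max_n.

    `alignment` maps sample name -> sequence. The threshold is an assumed default.
    """
    return {name: seq for name, seq in alignment.items() if n_fraction(seq) <= max_n}


def drop_n_rich_columns(alignment, max_n=0.5):
    """Remove columns in which more than max_n of the samples have an N."""
    names = list(alignment)
    if not names:
        return {}
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all sequences must have the same length")
    keep = [
        i
        for i in range(length)
        if sum(alignment[name][i] in "Nn" for name in names) / len(names) <= max_n
    ]
    return {name: "".join(alignment[name][i] for i in keep) for name in names}
```

Filtering sequences first and columns second avoids letting a single very poor sample decide which columns are dropped. Whether this is adequate compared with GBLOCKS is untested.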
Aspects of the Python code are taken from Phyluce.