/phylogenomics_tools

Scripts that can come in handy when analysing phylogenomic data

Primary LanguagePythonMIT LicenseMIT

phylogenomics_tools

Scripts that can come in handy when analysing phylogenomic data

test_for_base_composition_bias.py

This script performs chi-squared tests to test whether there is base composition bias across all codon positions, and at each codon position. There are three optional features:

  • RY-recoding - if composition bias is found at a given codon position(s), RY-recodng is performed at that position(s). The recoded alignment is then tested again for base composition bias.
  • RY-recoding and deletion - removal of codon positions which still show bias after recoding.
  • Generate partition files - output a partition file for the input alignment where each sequence is partitioned by codon position. This assumes the input alignment has all three codon positions present. It will also output a partition file for the alignment generated by removing given codon position(s) if this behaviour is enabled and, deletions are necessary.

The default behaviour is just to test for base composition bias, without performing recoding or deleting positions.

The default output is output_prefix.base_composition_results.tsv. If RY recoding is enabled, then the recoded alignment is also output output_prefix.recoded. If deletion is enabled, then the modified alignment is also output output_prefix.recoded.positions.deleted.

The last line in output_prefix.base_composition_results.tsv gives a handy summary of what adjustments were performed, if recoding and deletion are enabled. This is useful for deciding which modified alignment file and partition file should be chosen for downstream analyses.

test_for_base_composition_bias.py -f alignment.aln -p output_prefix # just test for bias
test_for_base_composition_bias.py -f alignment.aln -p output_prefix -r True #  test for bias and perform RY-recoding is bias is found
test_for_base_composition_bias.py -f alignment.aln -p output_prefix -r True -d True #  test for bias and perform RY-recoding is bias is found, if positions still show bias, remove them from the alignment.
test_for_base_composition_bias.py -f alignment.aln -p output_prefix -r True -d True -P True #  as above but also generate a partition file for the input alignment and, if generated, the alignment with codon position(s) removed.

Full usage:

  -h, --help            show this help message and exit
  -f FILE, --file FILE  Path to the input file
  -p PREFIX, --prefix PREFIX
                        Prefix for the output files
  -r RECODE, --recode RECODE
                        Enable recoding of codon positions which show composition bias to RY
  -d DELETE, --delete DELETE
                        Enable removal of codon positions which show composition bias after RY recoding
  -P PARTITION, --partition PARTITION
                        Generate partition files for the input alignment and, if generated, for the alignment which has one or more codon positions removed.

test_for_base_composition_bias.py

Filter an alignment file to only retain a specific codon position or position(s).

Full usage:

usage: filter_alignments_by_codon_position.py [-h] -f FILE -p PREFIX -P POSITIONS

options:
  -h, --help            show this help message and exit
  -f FILE, --file FILE  Path to the input file
  -p PREFIX, --prefix PREFIX
                        Prefix for the output files
  -P POSITIONS, --positions POSITIONS
                        Codon positions to retain separated by commas.