BEASTifier

Generates a heap of BEAST-formatted XML files for simulation experiments. Takes Nexus-formatted alignments and (optionally) Newick-formatted ultrametric trees. For some Nexus file:

someFilePrefix.NEX

if there exists a tree file with the same prefix:

someFilePrefix.phy

then it will be used to initialize the BEAST analysis. Failing this, a starting tree will be generated by BEAST itself. Ultrametricity of a user-provided initializing tree is not checked, but is required by BEAST (at least for the simulation experiments currently in mind). File suffixes should not matter (e.g. tree files can be *.phy or *.tre), but file formatting is strictly enforced, and Unix line endings are assumed. BEASTifier loops over the following:

  1. alignment files
  2. substitution models
  3. clock flavours
  4. tree priors
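
As a rough illustration of what looping over these four settings implies, the sketch below enumerates one run per combination. The function name `enumerateRuns` and the underscore-joined labels are hypothetical, chosen for illustration only; they are not taken from the BEASTifier source, and the real output filenames may be built differently.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the BEASTifier source): builds one label per
// combination of alignment, substitution model, clock flavour and tree prior.
std::vector<std::string> enumerateRuns(const std::vector<std::string>& alignments,
                                       const std::vector<std::string>& models,
                                       const std::vector<std::string>& clocks,
                                       const std::vector<std::string>& treePriors) {
    std::vector<std::string> runs;
    for (const auto& aln : alignments) {
        for (const auto& mod : models) {
            for (const auto& clk : clocks) {
                for (const auto& prior : treePriors) {
                    // The label format is an assumption made for this sketch.
                    runs.push_back(aln + "_" + mod + "_" + clk + "_" + prior);
                }
            }
        }
    }
    return runs;
}
```

The number of runs is the product of the four list sizes, so two alignments with the six default models, the default clock (ucln) and the default tree prior (bd) would give 2 x 6 x 1 x 1 = 12 combinations.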

Compile

In a terminal prompt in the src directory, type:

make

Usage

Type:

./BEASTifier -h

for help (presented below).

BEASTifier utilizes a configuration file for all analysis parameters. Call as:

./BEASTifier -config config_filename

where 'config_filename' contains all analysis settings. Parameters are listed one per line, in any order. The character '#' is used for comments.
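
To make the file format concrete, here is a minimal sketch of how such a file could be read: one parameter per line, any order, with everything after '#' ignored. This is not the parser BEASTifier actually uses. The function name `parseConfig` is hypothetical, and the sketch simply treats the first token on each line as the parameter name, whatever form it takes; consult 'config.example' for the exact syntax.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Sketch only: maps each parameter name to the values that follow it on the line.
std::map<std::string, std::vector<std::string>> parseConfig(std::istream& in) {
    std::map<std::string, std::vector<std::string>> settings;
    std::string line;
    while (std::getline(in, line)) {
        // Drop a comment, whether it starts the line or trails a setting.
        const std::string::size_type hash = line.find('#');
        if (hash != std::string::npos) {
            line.erase(hash);
        }
        std::istringstream tokens(line);
        std::string key;
        if (!(tokens >> key)) {
            continue;  // blank or comment-only line
        }
        std::vector<std::string> values;
        std::string value;
        while (tokens >> value) {
            values.push_back(value);
        }
        // Line order does not matter; a repeated parameter overwrites the earlier one.
        settings[key] = values;
    }
    return settings;
}
```

A parameter that takes no values ends up with an empty value list, so a caller can test for its presence with `settings.count(...)`.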

Arguments:

-alist: filename
   - name of text file listing alignment filenames.
   - one alignment filename per line.
   - Required; all other arguments are optional.
-mods: list substitution model(s) to analyze data
   - supported models: JC, K80, HKY, TrNef, TrN, K3P, K3Puf, TIMef, TIM, TVMef, TVM, SYM, GTR.
   - if more than one model, separate by spaces.
   - models themselves must contain no spaces and at most one '+'.
      - e.g. 'GTR+IG' = good; 'GTR+I+G' = not good.
   - default: -mods JC HKY GTR JC+G HKY+G GTR+G
-clock: list of flavour(s) of clock model to implement.
   - supported models: 'strict' or 'ucln' or 'uced' or 'randlocal'
   - default = -clock ucln
-mcmc: the number of mcmc generations to run analysis.
   - default: -mcmc 20000000
-tsamp: the interval (in generations) for sampling trees.
   - default: -tsamp 5000
-psamp: the interval (in generations) for sampling parameter values.
   - default: -psamp 1000
-ssamp: the interval (in generations) for printing results to standard output.
   - default: -ssamp 500
-logphy: turn on logging of phylograms (in addition to chronograms).
   - default: don't log.
-tprior: list of tree prior(s)
   - supported: 'bd', 'yule', 'concoal', 'expcoal', 'logcoal'.
   - default = -tprior bd
-fixtree: turn off topology manipulation operators (i.e. fix to input topology).
   - default = estimate topology.
-rprior: specify prior for root age.
   - supported: 'unif' or 'norm'.
   - if 'unif', expecting '-rprior unif min_value max_value'.
   - if 'norm', expecting '-rprior norm mean_value stdev_value'.
-overwrite: overwrite existing files.
   - default = don't overwrite; warn instead.
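
The naming rules given for -mods above (no spaces, at most one '+', a supported base model) can be checked with a few lines of code. The sketch below is an illustration under assumptions, not BEASTifier's own validation: the function name `isWellFormedModel` is hypothetical, matching is assumed to be case-sensitive, and whatever follows the '+' is not inspected.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical check of a single model name against the rules listed for -mods.
bool isWellFormedModel(const std::string& name) {
    static const std::vector<std::string> baseModels = {
        "JC", "K80", "HKY", "TrNef", "TrN", "K3P", "K3Puf",
        "TIMef", "TIM", "TVMef", "TVM", "SYM", "GTR"};

    // Models must contain no spaces.
    if (name.find(' ') != std::string::npos) {
        return false;
    }
    // At most one '+' is allowed, so 'GTR+I+G' is rejected.
    if (std::count(name.begin(), name.end(), '+') > 1) {
        return false;
    }
    // Everything before the '+' (or the whole name) must be a supported model.
    // The suffix after '+' is deliberately left unchecked in this sketch.
    const std::string base = name.substr(0, name.find('+'));
    return std::find(baseModels.begin(), baseModels.end(), base) != baseModels.end();
}
```

With this check, 'GTR+IG' and 'JC' pass, while 'GTR+I+G' fails on the second '+'.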

Consult 'config.example' as a, well, example.