Simulations for the Topic modeling paper: [the directory old contains an earlier coalescent-simulator, in a major revision we replaced that with the current one using the sofware DAWG, see below] Simulation of sequences from a tree with gaps using the software DAWG (1). 1. Download DAWG from https://github.com/CartwrightLab/dawg.git 2. Install dawg (follow the gihub instructions) 3. There are two sets of runX: run7 and run14, for recreating our table you will need to create the following directory: 7tip-1000 7tip-1000-moreindel 14tip-1000 14tip-1000-moreindel 4. For each directory: #a. cd 7tip-1000; cp ../*.py .;cp ../run .; cp ../7tiptree.tre; cp ../run7 .; cp ../dna-with-gaps.dawg . . run7 cd .. #b. cd 7tip-1000-moreindel; cp ../*.py .;cp ../run .; cp ../7tiptree.tre; cp ../run7 .; cp ../dna-with-more-gaps.dawg dna-with-gaps.dawg . run7 #c. cd 7tip-1000; cp ../*.py .;cp ../run .; cp ../14tiptree.tre; cp ../run14 .; cp ../dna-with-gaps.dawg . . run14 cd .. #d. cd 7tip-1000-moreindel; cp ../*.py .;cp ../run .; cp ../14tiptree.tre; cp ../run14 .; cp ../dna-with-more-gaps.dawg dna-with-gaps.dawg . run14 cd .. 5. Collect the information for the table #a. cd 7tip-1000; . ./gettable > rawresults-7tip-1000 ... #do this for every directory the rawresults now contain the values that we show in the table #the raw results are ordered bu loci 1,2,5,10,20,...1000 # the table contain the highlighted parts for all 1..1000 loci. # example # +++ +++++++++++ # RF=2/10,RF<=2,4,6,8:0.5 0.9 1.0 1.0, mean(wRF)=1.130189611 [1] Reed A Cartwright, DNA assembly with gaps (Dawg): simulating sequence evolution, Bioinformatics 21, Suppl 3 (2005), iii31–8.
pbeerli/simulations_for_topiccontml
This repository describes the tools needed to construct the simulated data used for the package TopicContml by Marzieh (Tara) Khodaei
PythonMIT