tamuri/treesub

bracket error on pamlout output

thierryjanssens opened this issue · 6 comments

Hi,

I am running treesub separately on a pre-calculated RAxML tree with the command:

java -cp dist/treesub.jar treesub.ancestral.ParseRST /path/to/paml/results

This generally works well, but now I get output in pamlout with only nan as branch lengths, and the error:

Exception in thread "main" java.io.IOException: Parse exception:pal.tree.TreeParseException: Missing closing bracket
at pal.tree.TreeTool.readTree(Unknown Source)
at treesub.ancestral.ParseRST.getTrees(Unknown Source)
at treesub.ancestral.ParseRST.run(Unknown Source)
at treesub.ancestral.ParseRST.main(Unknown Source)

What is going on?

Kind regards,

Thierry

Please attach the offending results file to the issue.

BASEML (in paml version 4.9h, March 2018) alignment.paml.phylip HKY85 dGamma (ncatG=5) (3 genes: diff. rate & pi & kappa)
Frequencies..

Gene 1 (len 1009)
T C A G
Seq_1 0.18571 0.18961 0.32727 0.29740
Seq_2 0.20948 0.25810 0.25810 0.27431
Seq_3 0.12979 0.16931 0.31422 0.38668
Seq_4 0.23744 0.19977 0.27511 0.28767
Seq_5 0.18848 0.20768 0.30612 0.29772
Seq_6 0.19848 0.19523 0.29826 0.30803
Seq_7 0.19695 0.20457 0.29489 0.30359
Seq_8 0.19459 0.21405 0.29946 0.29189
Seq_9 0.19177 0.22319 0.30011 0.28494
Seq_10 0.18904 0.22342 0.28786 0.29968
Seq_11 0.18830 0.20532 0.29043 0.31596
Seq_12 0.19643 0.19853 0.30042 0.30462
Seq_13 0.18590 0.21154 0.29594 0.30662
Seq_14 0.19144 0.20642 0.29412 0.30802
Seq_15 0.18297 0.22082 0.28391 0.31230
Seq_16 0.18710 0.21882 0.28118 0.31290
Seq_17 0.18484 0.21080 0.29491 0.30945
Seq_18 0.18592 0.21744 0.29517 0.30147
Seq_19 0.17208 0.22759 0.20197 0.39836

Mean 0.1893 0.2106 0.2894 0.3106

Homogeneity statistic: X2 = 0.15986 G = 0.16108

Gene 2 (len 1009)
T C A G
Seq_1 0.24773 0.25162 0.34112 0.15953
Seq_2 0.24190 0.30672 0.22569 0.22569
Seq_3 0.15614 0.25494 0.44596 0.14297
Seq_4 0.25934 0.20952 0.33407 0.19707
Seq_5 0.25361 0.23317 0.36659 0.14663
Seq_6 0.26464 0.24187 0.33297 0.16052
Seq_7 0.26333 0.24810 0.33079 0.15778
Seq_8 0.25946 0.25730 0.32432 0.15892
Seq_9 0.26219 0.24919 0.33044 0.15818
Seq_10 0.26638 0.25671 0.31472 0.16219
Seq_11 0.26170 0.24362 0.33936 0.15532
Seq_12 0.25840 0.25210 0.33929 0.15021
Seq_13 0.25962 0.25214 0.33547 0.15278
Seq_14 0.25989 0.25027 0.33797 0.15187
Seq_15 0.25868 0.24606 0.33754 0.15773
Seq_16 0.25793 0.24630 0.33932 0.15645
Seq_17 0.25545 0.25130 0.34787 0.14538
Seq_18 0.25840 0.24055 0.35504 0.14601
Seq_19 0.23612 0.24039 0.32151 0.20197

Mean 0.2516 0.2490 0.3368 0.1625

Homogeneity statistic: X2 = 0.19598 G = 0.19914

Gene 3 (len 1009)
T C A G
Seq_1 0.29961 0.29313 0.22179 0.18547
Seq_2 0.16086 0.38776 0.19328 0.25810
Seq_3 0.33627 0.16047 0.36231 0.14094
Seq_4 0.24689 0.37144 0.15970 0.22198
Seq_5 0.26946 0.33653 0.19760 0.19641
Seq_6 0.29284 0.28308 0.21475 0.20933
Seq_7 0.29053 0.28183 0.19804 0.22960
Seq_8 0.24108 0.36541 0.16216 0.23135
Seq_9 0.23619 0.38245 0.15601 0.22535
Seq_10 0.17508 0.42535 0.15360 0.24597
Seq_11 0.27660 0.33830 0.17340 0.21170
Seq_12 0.28151 0.31303 0.20063 0.20483
Seq_13 0.16774 0.48932 0.10150 0.24145
Seq_14 0.18824 0.46845 0.10802 0.23529
Seq_15 0.15352 0.49106 0.11146 0.24395
Seq_16 0.13953 0.50106 0.11099 0.24841
Seq_17 0.23884 0.35722 0.18795 0.21599
Seq_18 0.25840 0.34979 0.17647 0.21534
Seq_19 0.32151 0.21051 0.20624 0.26174

Mean 0.2408 0.3582 0.1787 0.2223

Homogeneity statistic: X2 = 1.10481 G = 1.11683

Average 0.22726 0.27783 0.26669 0.22822
(Ambiguity characters are used to calculate freqs.)

constant sites: 0 (0.00%)

Distances:HKY85 (kappa) (alpha set at 0.50)
This matrix is not used in later m.l. analysis.

(Pairwise deletion.)
Seq_1
Seq_2 9.0000( 0.0000)
Seq_3 0.8909( 1.9740) 9.0000( 1.9740)
Seq_4 0.1701( 1.3508) 9.0000( 1.3508) 9.0000( 1.3508)
Seq_5 0.6366( 2.4930) 9.0000( 2.4930) 2.4284( 0.0801) 0.1192(34.9444)
Seq_6 0.6972( 3.1836) 0.4143( 3.8778) 1.7256( 0.4773) 0.4980( 5.6864) 0.6060( 3.2628)
Seq_7 0.7672( 3.5484) 0.3340( 5.4589) 2.0981( 0.8620) 0.3381( 3.1694) 0.6855( 3.7118) 0.2553( 5.2684)
Seq_8 0.7692( 3.7433) 0.1614( 1.5316) 2.3224( 2.3305) 0.3465( 4.4144) 0.6394( 3.4590) 0.5236( 3.5654) 0.5396( 3.4825)
Seq_9 0.7169( 3.3966) 0.1131( 2.7318) 2.3103( 1.6403) 0.4162( 5.0056) 0.6290( 3.0239) 0.4904( 3.3308) 0.5147( 3.1278) 0.2801( 3.6063)
Seq_10 0.7022( 3.1695) 0.1730( 3.0683) 2.3256( 0.6594) 0.3385( 4.2882) 0.5701( 3.1725) 0.4822( 3.5444) 0.4982( 3.4212) 0.4118( 3.5259) 0.4444( 3.3767)
Seq_11 0.3302( 3.4900) 0.1382( 4.0000) 0.4839( 2.1585) 0.2944( 3.4493) 0.6185( 2.8418) 0.6041( 3.0806) 0.6650( 3.3195) 0.5783( 3.2859) 0.5875( 3.5972) 0.5423( 2.6223)
Seq_12 0.3861( 3.1541) 0.3645( 9.0163) 0.6842( 3.9492) 0.2599( 1.6398) 0.6256( 2.3740) 0.5792( 2.9993) 0.6341( 2.9524) 0.5938( 3.0307) 0.5950( 3.0129) 0.5417( 2.8028) 0.2350( 4.2186)
Seq_13 0.4651( 3.1308) 0.1719( 3.0904) 1.6197( 2.1375) 0.2592( 3.3780) 0.5246( 2.1698) 0.6091( 3.2297) 0.7199( 3.7844) 0.5520( 3.4339) 0.5325( 3.1645) 0.4458( 2.1435) 0.3852( 3.2855) 0.4282( 3.4306)
Seq_14 0.4842( 3.1844) 0.1149( 1.3255) 1.5887( 2.1515) 0.2516( 2.6683) 0.5640( 2.2265) 0.6289( 3.2087) 0.7104( 3.6915) 0.5580( 3.2434) 0.5706( 3.3304) 0.4842( 2.4875) 0.4185( 3.6162) 0.4367( 3.3716) 0.0348( 5.3919)
Seq_15 0.4846( 2.9104) 0.1318( 1.9257) 1.5415( 2.4999) 0.3704( 3.7086) 0.5485( 2.3206) 0.5673( 2.9236) 0.6326( 3.2116) 0.5500( 3.4728) 0.5000( 3.4399) 0.4446( 2.1425) 0.4073( 3.0011) 0.4406( 2.7145) 0.2749( 2.2401) 0.2956( 2.3848)
Seq_16 0.4764( 2.9572) 0.0954( 0.7122) 1.6397( 2.6963) 0.3136( 2.4125) 0.5385( 2.4049) 0.5841( 3.0415) 0.6280( 3.2763) 0.5568( 3.7127) 0.5098( 3.4453) 0.4294( 2.3164) 0.3984( 2.8713) 0.4387( 2.7801) 0.2658( 2.1696) 0.2913( 2.4406) 0.0928( 2.3938)
Seq_17 0.6867( 2.8901) 0.0000( 2.8901) 2.5035( 0.9481) 0.2227(14.2633) 0.2753( 3.3202) 0.5638( 2.3875) 0.5769( 2.5724) 0.5873( 3.2548) 0.5441( 2.7664) 0.5191( 2.5769) 0.5716( 2.9073) 0.6286( 2.6612) 0.4696( 2.2311) 0.5022( 2.3641) 0.4711( 2.4013) 0.4880( 2.4649)
Seq_18 0.6146( 2.4026) 0.0180(999.0000) 2.7834( 0.1894) 0.1192(34.9444) 0.0381( 2.6270) 0.5597( 3.1741) 0.5825( 3.2575) 0.5442( 3.2656) 0.5468( 2.7926) 0.5246( 2.8833) 0.5522( 2.9159) 0.5723( 2.4955) 0.4566( 2.0815) 0.4863( 2.1905) 0.4876( 2.6479) 0.4807( 2.5810) 0.2281( 3.2741)
Seq_19 4.0930( 0.9862) -nan(999.0000) 3.4303( 0.0759) 1.0306( 1.6204) 4.9337( 4.2871) 4.1651( 2.9269) 4.7424( 3.1038) 5.0482( 2.1284) 5.8607( 2.8961) 4.3352( 1.8575) 4.1306( 2.0533) 4.2513( 2.0837) 4.8588( 1.6584) 5.2519( 2.2112) 4.6746( 2.2741) 4.3035( 1.5374) 4.1357( 2.2244) 4.4999( 3.5548)

TREE # 1: (1, (((13, 14), ((16, 15), (((2, 17), (19, (4, (18, 5)))), ((9, 8), ((7, 6), 10))))), (12, (3, 11)))); MP score: -1.00
lnL(ntime: 36 np: 42): -2090977.522948 +0.000000
20..1 20..21 21..22 22..23 23..13 23..14 22..24 24..25 25..16 25..15 24..26 26..27 27..28 28..2 28..17 27..29 29..19 29..30 30..4 30..31 31..18 31..5 26..32 32..33 33..9 33..8 32..34 34..35 35..7 35..6 34..10 21..36 36..12 36..37 37..3 37..11
nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan 1.000000 1.000000 5.659855 5.171312 5.992299 0.496696

tree length = nan (1st gene)

(Seq_1, (((Seq_13, Seq_14), ((Seq_16, Seq_15), (((Seq_2, Seq_17), (Seq_19, (Seq_4, (Seq_18, Seq_5)))), ((Seq_9, Seq_8), ((Seq_7, Seq_6), Seq_10))))), (Seq_12, (Seq_3, Seq_11))));

(Seq_1: nan, (((Seq_13: nan, Seq_14: nan): nan, ((Seq_16: nan, Seq_15: nan): nan, (((Seq_2: nan, Seq_17: nan): nan, (Seq_19: nan, (Seq_4: nan, (Seq_18: nan, Seq_5: nan): nan): nan): nan): nan, ((Seq_9: nan, Seq_8: nan): nan, ((Seq_7: nan, Seq_6: nan): nan, Seq_10: nan): nan): nan): nan): nan): nan, (Seq_12: nan, (Seq_3: nan, Seq_11: nan): nan): nan): nan);

Detailed output identifying parameters

rates for 3 genes: 1 1.00000 1.00000

Parameters (kappa) in the rate matrix (HKY85) (Yang 1994 J Mol Evol 39:105-111):

Gene #1: 5.65986
Gene #2: 5.17131
Gene #3: 5.99230

alpha (gamma, K=5) = 0.49670
rate: 0.02076 0.15373 0.46446 1.10519 3.25584
freq: 0.20000 0.20000 0.20000 0.20000 0.20000

Tree with branch lengths for codon models:

(Seq_1: nan, (((Seq_13: nan, Seq_14: nan): nan, ((Seq_16: nan, Seq_15: nan): nan, (((Seq_2: nan, Seq_17: nan): nan, (Seq_19: nan, (Seq_4: nan, (Seq_18: nan, Seq_5: nan): nan): nan): nan): nan, ((Seq_9: nan, Seq_8: nan): nan, ((Seq_7: nan, Seq_6: nan): nan, Seq_10: nan): nan): nan): nan): nan): nan, (Seq_12: nan, (Seq_3: nan, Seq_11: nan): nan): nan): nan);

Strange - looks like PAML is not estimating the branch lengths properly.

  • Can you analyse the data using the GUI (rather than separately)? Does it still error?
  • Does RAxML give sensible branch lengths?
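One quick way to answer the branch-length question is to pull the lengths out of the Newick string and flag anything that is not a finite, non-negative number. The following is only a rough sketch: the function names `branch_lengths` and `suspicious_branches` are made up for illustration and are not part of treesub, PAML or RAxML, and it assumes a plain Newick string where each branch length follows a colon.

```python
import math
import re

# Matches the token after each ':' in a Newick string, e.g. "0.0123" or "nan".
# Stops at '[' so bracketed labels such as "0.2[100]" do not end up in the token.
BRANCH_RE = re.compile(r":\s*([^,():;\s\[]+)")


def branch_lengths(newick):
    """Return every branch-length token in a Newick string as a float."""
    return [float(tok) for tok in BRANCH_RE.findall(newick)]


def suspicious_branches(newick):
    """Return the branch lengths that are NaN, infinite, or negative."""
    return [b for b in branch_lengths(newick)
            if not math.isfinite(b) or b < 0]


if __name__ == "__main__":
    # Hypothetical trees, used only to show the call.
    good = "(Seq_1: 0.10, (Seq_2: 0.20, Seq_3: 0.05): 0.01);"
    bad = "(Seq_1: nan, (Seq_2: nan, Seq_3: nan): nan);"
    print(len(suspicious_branches(good)))
    print(len(suspicious_branches(bad)))
```

Running the same check on the RAxML input tree and on the tree lines in the PAML output should show whether the nan values are already present in the input or only appear after PAML has run.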

I can take a closer look if you can share the baseml.ctl file, alignment and tree that you are analysing with PAML.

There might also be something odd with your alignment: PAML reports no constant sites (0.00%). Check your alignment.
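As a sanity check on the alignment, the constant sites can be counted independently of PAML. This is a minimal sketch that assumes the sequences are already loaded as equal-length strings; `constant_sites` is a hypothetical helper, and it simply treats a column as constant when every sequence has the same character there, which may differ from how PAML handles gaps and ambiguity codes.

```python
def constant_sites(seqs):
    """Count alignment columns where every sequence carries the same character."""
    seqs = [s.upper() for s in seqs]  # compare case-insensitively
    lengths = {len(s) for s in seqs}
    if len(lengths) != 1:
        raise ValueError("sequences are empty or have different lengths")
    # zip(*seqs) walks the alignment column by column.
    return sum(1 for column in zip(*seqs) if len(set(column)) == 1)


if __name__ == "__main__":
    # Toy alignment, not taken from the data in this issue.
    toy = ["ACGT", "ACGA", "ACTT"]
    print(constant_sites(toy))  # 2: the first two columns are constant
```

If a count like this also comes out at zero for a real alignment of closely related sequences, that would point to a problem with the alignment file itself (for example misaligned or shifted sequences) rather than with PAML.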

Happy you found a solution!