The assessment of novel phylogenetic models and inference methods is routinely conducted via experiments on simulated as well as empirical data. When generating synthetic data, it is often unclear how to set the simulation parameters of the models and how to generate tree shapes so that both appropriately reflect empirical parameter distributions and tree shapes. As a solution, we present and make available a new database called 'RAxML Grove', currently comprising more than 100,000 inferred trees and the respective model parameter estimates from fully anonymized empirical datasets that were analyzed using RAxML and RAxML-NG on two web servers (https://raxml-ng.vital-it.ch and https://www.phylo.org/index.php).
Example scripts using this database can be found here.
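As a rough illustration of the intended use, the sketch below draws simulation settings by resampling whole records, so that each tree size stays paired with the parameter estimates it was observed with. It does not reflect the actual database schema: the field names (`num_taxa`, `alpha`, `tree_length`), the placeholder values, and the helper `sample_simulation_settings` are all assumptions made for this example.

```python
import random

# Placeholder records standing in for rows exported from the database.
# Field names and values are illustrative assumptions, not the real schema
# and not real estimates.
records = [
    {"num_taxa": 12, "alpha": 0.45, "tree_length": 1.8},
    {"num_taxa": 87, "alpha": 1.20, "tree_length": 6.3},
    {"num_taxa": 240, "alpha": 0.75, "tree_length": 15.1},
]


def sample_simulation_settings(rows, n, min_taxa=0, max_taxa=float("inf"), seed=None):
    """Draw n whole records (with replacement) whose taxon count is in range.

    Resampling complete records, rather than each field independently,
    keeps parameters that were estimated together on the same dataset paired.
    """
    rng = random.Random(seed)
    eligible = [r for r in rows if min_taxa <= r["num_taxa"] <= max_taxa]
    if not eligible:
        raise ValueError("no records match the requested taxon range")
    # Copy each drawn record so callers can modify the result safely.
    return [dict(rng.choice(eligible)) for _ in range(n)]


# Five settings for simulations with at most 100 taxa.
settings = sample_simulation_settings(records, n=5, max_taxa=100, seed=1)
```

Each returned dictionary could then be passed to a sequence simulator of choice; the taxon-range filter is only one possible way to restrict the draw.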
RAxML Grove is made available under the Open Database License: http://opendatacommons.org/licenses/odbl/1.0/. Any rights in individual contents of the database are licensed under the Database Contents License: http://opendatacommons.org/licenses/dbcl/1.0/.
D. Höhler, W. Pfeiffer, V. Ioannidis, H. Stockinger, A. Stamatakis (2022). RAxML Grove: an empirical phylogenetic tree database. Bioinformatics, 38(6):1741–1742. https://doi.org/10.1093/bioinformatics/btab863