This folder contains all the analysis scripts used in Sethuraman et al., 2021: https://doi.org/10.1101/2021.06.30.450623
dcocc_analyses.sh - Shell script with detailed analyses for all annotations, gene predictions, phylogeny, and ancestral state reconstruction
Folder: hymenoptera_oto - contains all the FASTA files for running phylogenetic analyses; DCOCC.fa is >25 Mb and can be accessed via the Galaxy Project history at:
https://usegalaxy.org/u/rykamae/h/dcoccinellaegenome
Folder: mcmctree - contains all the ancestral state reconstruction configuration files, and the R code to run phytools
Results, genome files:
augustus_genes.fasta.gz - gzipped FASTA file of all protein coding genes predicted by AUGUSTUS (ab initio)
dupliphytree.tre - .TRE tree file generated by the ML analyses in DupliPhy
results_hym_desc.xlsx - XLSX file containing the gene family evolution results (expansions and reductions) from CAFE5
UNI246_report.html - Scaffolding report provided by Dovetail Genomics
QUASTLG_Dcocc_report.pdf - Assembly quality report generated by running QUAST-LG v.5.0.2 on the final assembly file
Dcocc_gemoma_final_v1.gff - GFF annotation track from homology-based gene prediction
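
To sanity-check the downloaded result files, the short Python sketch below counts the records in the gzipped FASTA file and tallies the feature types in the GFF file. It is not part of the published analysis pipeline. The function names are illustrative, and it assumes that both files sit in the current directory and that the GFF is tab-delimited with the feature type in column 3.

```python
import gzip
import os
from collections import Counter


def count_fasta_records(path):
    """Count sequences in a gzipped FASTA file by counting '>' header lines."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for line in handle if line.startswith(">"))


def count_gff_features(path):
    """Tally feature types (column 3) in a GFF file, skipping comments and blanks."""
    counts = Counter()
    with open(path) as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 3:
                counts[fields[2]] += 1
    return counts


if __name__ == "__main__":
    # File names are taken from the listing above; the location is an assumption.
    fasta_path = "augustus_genes.fasta.gz"
    gff_path = "Dcocc_gemoma_final_v1.gff"
    if os.path.exists(fasta_path):
        print("AUGUSTUS sequences:", count_fasta_records(fasta_path))
    if os.path.exists(gff_path):
        print("GFF feature counts:", dict(count_gff_features(gff_path)))
```

Only the Python standard library is used, so no additional packages are needed for this check.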