/Foragers_vs_Nurses

Data and scripts from the manuscript: Gene expression and epigenetics reveal species-specific mechanisms acting upon common molecular pathways in the evolution of task division in bees


Data and scripts repository from the manuscript:

[Scientific Reports] (2021) Gene expression and epigenetics reveal species-specific mechanisms acting upon common molecular pathways in the evolution of task division in bees
Natalia de Souza Araujo and Maria Cristina Arias

[Pre-print] Multiple lineages, same molecular basis: task specialization is commonly regulated across all eusocial bee groups. bioRxiv 2020.04.01.020461; doi: https://doi.org/10.1101/2020.04.01.020461

Repository content

Data files

  • Bt_fornur_dez16_lncCod_renamed_one.fasta.bz2: superTranscripts assembled from RNA-Seq data of B. terrestris foragers and nurses. The assembly method is described in the manuscript.

  • Jt_fornur_dez16_lncCod_renamed.fasta.gz: superTranscripts assembled from RNA-Seq data of T. angustula foragers and nurses. The assembly method is described in the manuscript.

  • Bt_fornur.mstat.data: Estimation of DNA methylation in the supertranscriptome of B. terrestris nurse. Analysis method described in the manuscript.

  • Jt_fornur.mstat.data: Estimation of DNA methylation in the supertranscriptome of T. angustula nurse. Analysis method described in the manuscript.

Statistics

  • common.stats.R: R function used to test the significance of the overlap between the DET of the two species. The parameters used in the manuscript are included.

  • expected_GCmethylation.R: R script used to test whether the mean amount of CG methylation observed is greater than expected based on the proportion of CG sites in the transcriptome.

  • methylation_mean_dev.R: R script used to test whether the mean amount of C methylation observed in a subset of transcripts is significantly greater than the general mean.

  • cor_meth-expression.R: R script to estimate Spearman's correlation coefficient between gene expression and mC.

  • GO_enrichment.R: R script to estimate enriched GO terms among the DET using TopGO.
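
As an illustration of the kind of overlap test that common.stats.R is described as performing, here is a minimal sketch in Python (the language of the repository's helper scripts). It assumes a one-sided hypergeometric model, which is a common choice for this question but is an assumption here: the test actually used is the one implemented in common.stats.R and described in the manuscript. The function name, argument names, and the toy numbers are hypothetical.

```python
from math import comb


def overlap_pvalue(universe, set_a, set_b, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    universe: number of comparable genes shared by both species (assumed background)
    set_a:    number of DET in species A that fall in the universe
    set_b:    number of DET in species B that fall in the universe
    overlap:  number of DET observed in both species
    """
    max_overlap = min(set_a, set_b)
    total = comb(universe, set_b)
    # Sum the probability of every overlap size at least as large as observed.
    tail = sum(
        comb(set_a, k) * comb(universe - set_a, set_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Toy numbers (not from the manuscript): 10 genes, 3 DET per species, all 3 shared.
p = overlap_pvalue(universe=10, set_a=3, set_b=3, overlap=3)
print(p)  # equals 1/120
```

The choice of background (`universe`) drives the result, so it should be restricted to genes that could have been detected as DET in both species.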

Figures

  • mCbarplot.R: R script to create the DNA methylation barplot.

  • mC_waffle.R: R script to create the DNA methylation waffle plot.

  • GOplot_fig.R: R script to create the GOplot graph for third level terms from subgraphs of enriched terms.

  • euler_fig.R: R script to create the Euler diagram of genes in common.

  • donutPlot.R: R script to create the donut plots of conserved/taxonomically restricted genes.

  • REViGO_cytoscape_Bt-Jt-Am.xgmml: Cytoscape input file generated by REVIGO to draw the similarity network of the GO terms enriched among the DET of all species.

  • all_enriched_subgraphs_Am.pdf: Subgraph induced by all the enriched GO terms in A. mellifera head.

Others

  • Annocript2TopGo.py: Python script used to format Annocript output to TopGO input.

  • filter_orthogroups.py: Python script used to filter the orthogroups (Orthogroups.tsv) that contain a set of genes of interest.

  • correct_rooted.tree: Manually rooted tree used with OrthoFinder.
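
The filtering step described for filter_orthogroups.py can be sketched as follows. This is not the repository's script; it is a minimal illustration that assumes the usual OrthoFinder Orthogroups.tsv layout (tab-separated, a header row, the orthogroup ID in the first column, and one column per species holding comma-separated gene IDs). The function name and the example gene IDs are hypothetical.

```python
import csv
import io


def filter_orthogroups(handle, genes_of_interest):
    """Return the header and the rows that contain at least one gene of interest."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    kept = []
    for row in reader:
        # Column 0 is the orthogroup ID; the remaining columns list gene IDs per species.
        genes = {
            gene.strip()
            for cell in row[1:]
            for gene in cell.split(",")
            if gene.strip()
        }
        if genes & genes_of_interest:
            kept.append(row)
    return header, kept


# Made-up table standing in for Orthogroups.tsv.
example = io.StringIO(
    "Orthogroup\tspeciesA\tspeciesB\n"
    "OG0000000\tA_g1, A_g2\tB_g1\n"
    "OG0000001\tA_g3\t\n"
    "OG0000002\t\tB_g7, B_g8\n"
)
header, kept = filter_orthogroups(example, {"A_g2", "B_g8"})
for row in kept:
    print(row[0])
```

For a real file, the `io.StringIO` object would be replaced by an open file handle, for example `open("Orthogroups.tsv", newline="")`.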

Orthogroups

Folder containing the IDs of the orthogroups in each taxonomic class.

License

This work is distributed under the GPLv3 license. Reuse of code derived from this repository is permitted under two conditions:

  • Proper attribution (i.e., citation of the associated publication; see CITATION.cff and above).

  • Publication of reused scripts on an open-access platform, such as GitHub.