Our raw and processed data are available at: https://figshare.com/s/4f071118f2fde3f00a26

This repository contains the code used for the paper: https://www.biorxiv.org/content/10.1101/2022.07.29.501997v1

#### These steps are optional; the data is already present in the FigShare directory ####

First, run get_somatic_variants.sh to obtain a VCF for each sample containing its unique SNVs and indels. Then, run get_txt_files.sh to obtain txt files of these VCFs.

To filter the discrete set of SNVs further, run filter_snvs.R. To filter the discrete set of indels further, run filter_indels.R. Each of these scripts creates new VCF files.

Run get_Zou_Kucab_indels.R to get VCFs for the Zou and Kucab indels. Run get_Zou_Kucab.R to get SNV counts for Zou and Kucab.

Run sigprofiler_snvmatrix.sh to generate the input matrices for SigProfilerExtractor. To extract SNV and indel signatures with SigProfiler, run sigprofiler_cpu.sh. To decompose these signatures, run sigprofiler_decompose.sh.

#### #### #### #### #### #### #### #### #### #### #### #### #### #### #### #### #### #### #####

To analyse samples and generate plots, run SNV_ID_sample_plots.R.

To analyse the SNV signatures and generate plots, run SNVsig_plots.R. For indel signatures, run IDsig_plots.R.

To perform randomizations of mutation burdens and of SNV and indel signature activities, run randomizations.R, then run randomizations_plots.R to generate the plots.

To analyse structural variant calls from Manta, run manta_analysis.R. To analyse clustered SNVs, run analysis_clusters.R.

For regional enrichment, see https://github.com/tdelhomme/RegionalEnrichment-nf/. BED files and input mutation files for SNVs and SVs are provided in the FigShare directory.
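
The optional preprocessing steps can be chained in the order they are listed above. The following is only a minimal sketch, not part of the repository: it assumes that all scripts sit in the current directory, that they take no command-line arguments, and that the R scripts are launched with Rscript. None of this is stated in this README, so check each script before relying on it. The wrapper name run_preprocessing.sh and the DRY_RUN variable are hypothetical.

```shell
#!/bin/sh
# run_preprocessing.sh (hypothetical wrapper, not part of the repository).
# Runs the optional preprocessing scripts in the order the README lists them.
# ASSUMPTIONS: scripts are in the current directory, take no arguments,
# and R scripts are run with Rscript.
set -eu

# DRY_RUN=1 (default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

STEPS="
get_somatic_variants.sh
get_txt_files.sh
filter_snvs.R
filter_indels.R
get_Zou_Kucab_indels.R
get_Zou_Kucab.R
sigprofiler_snvmatrix.sh
sigprofiler_cpu.sh
sigprofiler_decompose.sh
"

run_step() {
  script="$1"
  # Pick the interpreter from the file extension.
  case "$script" in
    *.sh) runner="bash" ;;
    *.R)  runner="Rscript" ;;
    *)    echo "unknown script type: $script" >&2; return 1 ;;
  esac
  echo "$runner $script"
  if [ "$DRY_RUN" = "0" ]; then
    "$runner" "$script"
  fi
}

# Because of 'set -e', the loop stops at the first step that fails.
for step in $STEPS; do
  run_step "$step"
done
```

By default the wrapper only prints the commands it would run, so the order can be reviewed first; calling it as `DRY_RUN=0 sh run_preprocessing.sh` executes the steps.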