laurahspencer/DuMOAR

Align MBDBS data to the publicly available mito genome

Closed this issue · 10 comments

To estimate conversion efficiency

@sr320 I presume I can simply use the file Mmag_mito.fa you added to our repo for this task?

sr320 commented

Yes

I aligned the MBDBS data to the mitochondrial genome using the script bismark_mitochondria.sh. The per-sample Bismark reports, the summary report, and a MultiQC report are all in results/bismark/mito. Mitochondrial % CpG methylation is not zero: it ranges from 2.8% to 27.5% and averages 13.9%. It also correlates closely with % CpG methylation across the whole genome; see the figure ("included" in the figure legend means "included in analysis"):

[Figure: mitochondrial vs. whole-genome % CpG methylation by sample]

Thanks for doing this, Laura! Wow, that is definitely not "zero methylation". What's really crazy here is how high the CHG and CHH methylation is: I think CHG is <2% for the nuclear genome and >20% for mito. More questions than answers :)

sr320 commented

I think we need a table of CG, CHG, and CHH methylation for all samples, for both the full genome and the mitochondria, to confirm sample integrity.

| Prep Sample # | Collection Sample # | pCO2 | Rep | Tank | Stage | DNA batch | % alignment (genome) | % CpG meth (genome) | % CHG meth (genome) | % CHH meth (genome) | % alignment (mito) | % CpG meth (mito) | % CHG meth (mito) | % CHH meth (mito) | Included in analysis? |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | CH01-06 | Low | LC | 3 | J7 | 1a | 31.1 | 73.4 | 1.3 | 7 | 0.11 | 20.6 | 26.5 | 47.8 | yes |
| 2 | CH01-14 | Low | LC | 3 | J6 | 3a | 26.5 | 67.7 | 1.7 | 12.8 | 0.13 | 20.6 | 25.7 | 46.8 | yes |
| 3 | CH01-22 | Low | LB | 1 | J6 | 2b | 30 | 66.2 | 1.2 | 8.2 | 0.12 | 14.6 | 19.9 | 45.3 | yes |
| 4 | CH01-38 | Low | LB | 1 | J6 | 1b | 30.4 | 68.1 | 1.6 | 12.2 | 0.13 | 24.8 | 31.2 | 48.3 | yes |
| 7 | CH03-33 | Low | LB | 1 | J6 | 3b | 30.8 | 72.5 | 1.1 | 5.6 | 0.09 | 11.6 | 16.8 | 44.7 | yes |
| 12 | CH07-06 | Low | LC | 3 | J7 | 2a | 30.7 | 64.1 | 1.3 | 9 | 0.13 | 13 | 19.7 | 45.4 | yes |
| 10 | CH05-21 | High | HB | 2 | J7 | 3a | 31.1 | 64.1 | 1.3 | 12.5 | 0.12 | 12.2 | 18.7 | 47.1 | yes |
| 11 | CH05-24 | High | HC | 4 | J6 | 1a | 31.5 | 71.6 | 1.6 | 13.9 | 0.13 | 27.5 | 31.4 | 50.3 | yes |
| 15 | CH09-02 | High | HC | 4 | J6 | 3b | 29.7 | 68.4 | 1.8 | 15.6 | 0.16 | 23.1 | 26.4 | 47.6 | yes |
| 17 | CH09-28 | High | HA | 6 | J6 | 4a | 31.7 | 67.6 | 1.7 | 7.8 | 0.13 | 14.8 | 22.9 | 46.3 | yes |
| 18 | CH10-01 | High | HB | 2 | J6 | 4a | 35.3 | 70.8 | 1.4 | 7.7 | 0.16 | 19.2 | 26 | 47.6 | yes |
| 20 | CH10-11 | High | HA | 6 | J6 | 3b | 31.9 | 67.7 | 1.5 | 12.3 | 0.16 | 21.9 | 26.6 | 46.1 | yes |
| 5 | CH03-04 | Low | LB | 1 | J7 | 1b | 23.5 | 11.4 | 2.5 | 35.4 | 0.16 | 5 | 9.1 | 40.8 | no |
| 6 | CH03-15 | Low | LB | 1 | J6 | 2a | 19.1 | 19.4 | 1.5 | 18.4 | 0.13 | 4.4 | 8.7 | 37.5 | no |
| 13 | CH07-11 | Low | LC | 3 | J6 | 3a | 18.1 | 7.8 | 1.6 | 12.7 | 0.12 | 3.5 | 7 | 33.1 | no |
| 14 | CH07-24 | Low | LB | 1 | J7 | 2a | 18.1 | 7 | 1.6 | 13.7 | 0.15 | 5.1 | 9.7 | 37.4 | no |
| 8 | CH05-01 | High | HB | 2 | J7 | 1a | 28.2 | 51.8 | 2.1 | 26.6 | 0.14 | 16.2 | 25 | 48.2 | no |
| 9 | CH05-06 | High | HA | 6 | J7 | 3a | 16 | 7 | 1.4 | 16 | 0.12 | 2.8 | 5.5 | 30.9 | no |
| 16 | CH09-13 | High | HA | 6 | J7 | 2b | 12 | 45.8 | 1.9 | 19.7 | 0.05 | 10.4 | 13.6 | 41 | no |
| 19 | CH10-08 | High | HA | 6 | J6 | 2b | 27.9 | 31.2 | 1.8 | 26.8 | 0.14 | 6.7 | 13 | 43.9 | no |

Also uploaded the table to the repo here: results/alignment-meth-summary.csv
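For anyone rebuilding this table, here is a minimal sketch of how the four percentages for one sample could be pulled out of a Bismark report. It assumes the report uses Bismark's usual wording ("Mapping efficiency:" and "C methylated in CpG/CHG/CHH context:"). The function name, the dictionary keys, and the example excerpt are hypothetical; this is not necessarily how the CSV above was produced.

```python
import re

# Patterns assume Bismark's usual report wording; adjust them if your
# Bismark version words these lines differently.
PATTERNS = {
    "alignment_pct": r"Mapping efficiency:\s*([\d.]+)%",
    "cpg_pct": r"C methylated in CpG context:\s*([\d.]+)%",
    "chg_pct": r"C methylated in CHG context:\s*([\d.]+)%",
    "chh_pct": r"C methylated in CHH context:\s*([\d.]+)%",
}


def parse_bismark_report(text):
    """Return the percentages found in one report; missing fields are None."""
    row = {}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        row[key] = float(match.group(1)) if match else None
    return row


# Illustrative excerpt only, filled with sample 1's genome values from the
# table above; a real report contains many more lines.
example_report = """\
Mapping efficiency:\t31.1%
C methylated in CpG context:\t73.4%
C methylated in CHG context:\t1.3%
C methylated in CHH context:\t7.0%
"""

example_row = parse_bismark_report(example_report)
print(example_row)
```

Running the same function over the genome reports and the reports in results/bismark/mito, then joining the rows on sample ID, would give one row per sample in the layout above.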

Simpler table, sorted by genome % CpG methylation (highest first):

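| Sample | pCO2 | Genome % alignment | Genome % CpG | Genome % CHG | Genome % CHH | Mito % alignment | Mito % CpG | Mito % CHG | Mito % CHH | Included? |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | L | 31.1 | 73.4 | 1.3 | 7 | 0.11 | 20.6 | 26.5 | 47.8 | yes |
| 7 | L | 30.8 | 72.5 | 1.1 | 5.6 | 0.09 | 11.6 | 16.8 | 44.7 | yes |
| 11 | H | 31.5 | 71.6 | 1.6 | 13.9 | 0.13 | 27.5 | 31.4 | 50.3 | yes |
| 18 | H | 35.3 | 70.8 | 1.4 | 7.7 | 0.16 | 19.2 | 26 | 47.6 | yes |
| 15 | H | 29.7 | 68.4 | 1.8 | 15.6 | 0.16 | 23.1 | 26.4 | 47.6 | yes |
| 4 | L | 30.4 | 68.1 | 1.6 | 12.2 | 0.13 | 24.8 | 31.2 | 48.3 | yes |
| 20 | H | 31.9 | 67.7 | 1.5 | 12.3 | 0.16 | 21.9 | 26.6 | 46.1 | yes |
| 2 | L | 26.5 | 67.7 | 1.7 | 12.8 | 0.13 | 20.6 | 25.7 | 46.8 | yes |
| 17 | H | 31.7 | 67.6 | 1.7 | 7.8 | 0.13 | 14.8 | 22.9 | 46.3 | yes |
| 3 | L | 30 | 66.2 | 1.2 | 8.2 | 0.12 | 14.6 | 19.9 | 45.3 | yes |
| 10 | H | 31.1 | 64.1 | 1.3 | 12.5 | 0.12 | 12.2 | 18.7 | 47.1 | yes |
| 12 | L | 30.7 | 64.1 | 1.3 | 9 | 0.13 | 13 | 19.7 | 45.4 | yes |
| 8 | H | 28.2 | 51.8 | 2.1 | 26.6 | 0.14 | 16.2 | 25 | 48.2 | no |
| 16 | H | 12 | 45.8 | 1.9 | 19.7 | 0.05 | 10.4 | 13.6 | 41 | no |
| 19 | H | 27.9 | 31.2 | 1.8 | 26.8 | 0.14 | 6.7 | 13 | 43.9 | no |
| 6 | L | 19.1 | 19.4 | 1.5 | 18.4 | 0.13 | 4.4 | 8.7 | 37.5 | no |
| 5 | L | 23.5 | 11.4 | 2.5 | 35.4 | 0.16 | 5 | 9.1 | 40.8 | no |
| 13 | L | 18.1 | 7.8 | 1.6 | 12.7 | 0.12 | 3.5 | 7 | 33.1 | no |
| 14 | L | 18.1 | 7 | 1.6 | 13.7 | 0.15 | 5.1 | 9.7 | 37.4 | no |
| 9 | H | 16 | 7 | 1.4 | 16 | 0.12 | 2.8 | 5.5 | 30.9 | no |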
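The summary numbers quoted earlier in the thread (range and mean of mito % CpG, and its relationship to genome % CpG) can be re-derived from the table above. The sketch below hard-codes the genome and mito % CpG columns as transcribed from that table and computes a plain Pearson correlation by hand. The variable names are illustrative, and Pearson is only one reasonable choice of correlation measure, not necessarily the one behind the figure.

```python
from math import sqrt

# (sample, genome % CpG, mito % CpG, included in analysis?)
# Values transcribed from the table above.
rows = [
    (1, 73.4, 20.6, True), (7, 72.5, 11.6, True), (11, 71.6, 27.5, True),
    (18, 70.8, 19.2, True), (15, 68.4, 23.1, True), (4, 68.1, 24.8, True),
    (20, 67.7, 21.9, True), (2, 67.7, 20.6, True), (17, 67.6, 14.8, True),
    (3, 66.2, 14.6, True), (10, 64.1, 12.2, True), (12, 64.1, 13.0, True),
    (8, 51.8, 16.2, False), (16, 45.8, 10.4, False), (19, 31.2, 6.7, False),
    (6, 19.4, 4.4, False), (5, 11.4, 5.0, False), (13, 7.8, 3.5, False),
    (14, 7.0, 5.1, False), (9, 7.0, 2.8, False),
]


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


genome_cpg = [r[1] for r in rows]
mito_cpg = [r[2] for r in rows]

mito_min, mito_max = min(mito_cpg), max(mito_cpg)
mito_mean = mean(mito_cpg)
r_value = pearson(genome_cpg, mito_cpg)

# Mean mito % CpG split by whether the sample was included in the analysis.
included_mean = mean([r[2] for r in rows if r[3]])
excluded_mean = mean([r[2] for r in rows if not r[3]])

print(f"mito % CpG range: {mito_min}-{mito_max}, mean: {mito_mean:.1f}")
print(f"Pearson r (genome vs. mito % CpG): {r_value:.2f}")
print(f"mean mito % CpG, included: {included_mean:.1f}, excluded: {excluded_mean:.1f}")
```

Splitting the mean by the "included" flag gives a quick numeric view of how the excluded samples differ from the included ones on the mitochondrial side as well.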

The mitochondrial alignment is not helpful for estimating conversion efficiency, but it is helpful for checking consistency across samples.