Finding DMR
MagdalenaWinklhofer opened this issue · 8 comments
Hi,
I want to use defiant to identify DMRs in my WGBS data (non-model organism). I used the command below but cannot identify any DMRs. Please let me know whether and how I could relax the conditions further. Am I doing something wrong?
Installation was done with the bioconda package (https://anaconda.org/bioconda/defiant). The input files are the bismark2coverage.report.txt files. Those files only contain positions of CpG sites. I checked the run_info_comparison.txt file, and 100% of my nucleotide pairs passed the coverage condition. I noticed that my chromosome names don't start with "chr", but I saw that you fixed that in the issue 'Chromosome names need to start with "chr"'. Additionally, I compared the "file_type5.txt" from the issue "Clarify input format (for Bismark output)" with my file, and I have the same columns. The only difference I noticed is that the numbers in columns 4 and 5 are much lower: mine are between 0 and 5, whereas yours go up to 30. Also, you only included the + strand in your file, whereas I have both the + and - strands.
Could that be why it does not find any DMR in my dataset?
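On the strand question: in a Bismark genome-wide CpG report, the two cytosines of one CpG appear as separate rows (the + strand C at position p and the - strand C at position p+1), so the read counts in columns 4 and 5 are split between them. Bismark's coverage2cytosine offers a --merge_CpG option that pools the two rows. Whether defiant expects merged input is not stated in this thread, so the following is only a minimal sketch of the pooling idea, assuming the 7-column tab-separated report layout; the function names are hypothetical.

```python
def parse_report_line(line):
    """Parse one row of a Bismark CpG report (assumed 7 tab-separated columns)."""
    chrom, pos, strand, meth, unmeth = line.rstrip("\n").split("\t")[:5]
    return chrom, int(pos), strand, int(meth), int(unmeth)


def merge_cpg_strands(rows):
    """Pool + and - strand counts of each CpG under the + strand coordinate.

    rows: iterable of (chrom, pos, strand, meth, unmeth) tuples.
    Returns a dict {(chrom, plus_strand_pos): (meth, unmeth)}.
    """
    merged = {}
    for chrom, pos, strand, meth, unmeth in rows:
        # The - strand cytosine of a CpG sits one base after the + strand one.
        key = (chrom, pos if strand == "+" else pos - 1)
        m, u = merged.get(key, (0, 0))
        merged[key] = (m + meth, u + unmeth)
    return merged
```

With counts of 0 to 5 per strand, pooling can at most double the per-site coverage, so it may help but is unlikely to be the whole explanation.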
Command used:
defiant -c 10 -p 0.05,0.10 -L Normoxia,Anoxia -l comparison -i N1-D14-1.CpG_report.txt,N2-D15-1.CpG_report.txt,N3-D16-2.CpG_report.txt,N7-D36-1.CpG_report.txt A1-D17-1.CpG_report.txt,A2-D18-1.CpG_report.txt,A4-D26-2.CpG_report.txt,A7-D37-1.CpG_report.txt
Please let me know if you need additional information and thank you for your help!
Hi Magdalena,
sure, I recommend using the Defiant version from GitHub; I think it's more up to date.
Different sequencing sets will have different levels of coverage and smaller or greater percent differences, so the defaults that I chose for my rat study may not be appropriate for your data.
You can also try:
defiant -c 5,10,1 -CpG 1,5,1 -p 0.05,0.10 -L Normoxia,Anoxia -l comparison -i N1-D14-1.CpG_report.txt,N2-D15-1.CpG_report.txt,N3-D16-2.CpG_report.txt,N7-D36-1.CpG_report.txt A1-D17-1.CpG_report.txt,A2-D18-1.CpG_report.txt,A4-D26-2.CpG_report.txt,A7-D37-1.CpG_report.txt
This will decrease the minimum coverage and the minimum number of CpGs.
Perhaps single CpG changes are interesting for your project?
Please let me know how that works out!
Hi,
I git cloned the repository and installed defiant. When I start my slurm script with the code you suggested, I get the error that "-CpG" is not a recognized option.
Am I correctly understanding that if you set the -CpG option to 1 I can detect single CpG changes?
Error:
[DEFIANT ASCII-art banner]
Differential methylation: Easy, Fast, Identification and ANnoTation
by David E. Condon, University of Pennsylvania, 2015-2020
The minimum coverage used will be 5
Coverage cutoffs will be incremented from 5 to 10 in steps of 1
"-CpG" isn't a recognized option.
I started the program without the -CpG option, and all the output files look as follows:
Chromosome Start End #mCpN #Diff.CpN Mean_Difference Normoxia Anoxia
No_DMR_was_found_with_current_parameters
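The log line "Coverage cutoffs will be incremented from 5 to 10 in steps of 1" suggests that a three-value argument such as 5,10,1 is read as start, stop, step. The sketch below only illustrates that reading and how two such ranges multiply into a grid of runs; it assumes the stop value is inclusive, and expand_range is a hypothetical helper, not part of defiant.

```python
from itertools import product


def expand_range(spec):
    """Expand 'start,stop,step' into a list of integers (stop assumed inclusive).

    A single value such as '10' yields a one-element list. The two-value form
    is not handled because the thread does not show how it is interpreted.
    """
    parts = [int(x) for x in spec.split(",")]
    if len(parts) == 1:
        return parts
    if len(parts) != 3:
        raise ValueError(f"expected 1 or 3 comma-separated values, got {spec!r}")
    start, stop, step = parts
    if step <= 0:
        raise ValueError("step must be positive")
    return list(range(start, stop + 1, step))


coverage_cutoffs = expand_range("5,10,1")  # the range reported in the log
second_cutoffs = expand_range("1,5,1")     # a second range of the same form
grid = list(product(coverage_cutoffs, second_cutoffs))
```

Under these assumptions the two ranges give 6 x 5 = 30 parameter combinations, which is worth keeping in mind when reading the output files.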
Hi Magdalena,
sorry about that, I meant "-CpN".
And yes, with the "-CpN" option, you'll be able to see single CpG/CpN changes.
Let me know how it works out!
Hi,
it works now, but I still can't find any DMRs. I only get DMRs when I relax the conditions a lot (p-value = 0.15, CpN = 1). Could it be that I don't have any DMR that contains 4 or more CpG changes? Maybe it is just my data, but I am still in denial that this could be the reason. Would you happen to have any other suggestions?
Hi Magdalena,
try the "-S" option, which allows skips of low-coverage sites. For example, start with "-S 1" and increment upwards.
Does that give you any DMRs?
Hi,
I don't know if you meant it like this, but the following command does not give me any DMRs:
./defiant -c 10 -CpN 1,5,1 -p 0.05,0.15,0.05 -S 1 -L Normoxia,Anoxia -l comparison -i N1-D14-1.CpG_report.txt,N2-D15-1.CpG_report.txt,N3-D16-2.CpG_report.txt,N7-D36-1.CpG_report.txt A1-D17-1.CpG_report.txt,A2-D18-1.CpG_report.txt,A4-D26-2.CpG_report.txt,A7-D37-1.CpG_report.txt
Hi Magdalena,
try increasing the accepted number of skips. Also, can you make and share a plot of your coverage distribution? I'd be curious to see it; I suspect that you have reduced-representation data or something similar.
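A coverage distribution can be tabulated directly from the report files before plotting. The following is a minimal sketch, assuming each input is a tab-separated Bismark CpG report with the methylated and unmethylated read counts in columns 4 and 5; the function names are hypothetical and not part of defiant.

```python
from collections import Counter


def coverage_histogram(lines):
    """Count cytosines per total coverage (methylated + unmethylated reads)."""
    hist = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # skip blank or malformed rows
        try:
            meth, unmeth = int(fields[3]), int(fields[4])
        except ValueError:
            continue  # skip a header row, if present
        hist[meth + unmeth] += 1
    return hist


def fraction_at_least(hist, cutoff):
    """Fraction of cytosines whose coverage is >= cutoff."""
    total = sum(hist.values())
    if total == 0:
        return 0.0
    return sum(n for cov, n in hist.items() if cov >= cutoff) / total


# Example use on one sample (file name taken from the command in this thread):
# with open("N1-D14-1.CpG_report.txt") as fh:
#     hist = coverage_histogram(fh)
# for cov in sorted(hist):
#     print(cov, hist[cov])
# print(fraction_at_least(hist, 10))
```

If columns 4 and 5 never exceed 5, no site can have more than 10 reads in total, so the fraction of sites reaching a cutoff of 10 in every sample is a useful number to report alongside the plot.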