Running wg-blimp with one control and one experiment sample, without replicates
Thank you for developing a WGBS data analysis tool.
I want to use wg-blimp to analyze a WGBS data set consisting of one control and one experiment sample.
All earlier steps ran smoothly. However, at the following step
Rscript --vanilla /media/wooje/epi-T/Peggy_wg_blimp/.snakemake/scripts/tmpz6x06vto.bsseq.R
Activating conda environment: /home/wooje/anaconda3/envs/wg-blimp/lib/python3.9/site-packages/snakemake_wrapper/conda/8c03d5578c6dd7b4f0accc99ba7b7c00
I received the following message
Error in rule bsseq:
jobid: 4
output: /media/wooje/epi-T/Peggy_wg_blimp/results/dmr/bsseq/bsseq.Rdata, /media/wooje/epi-T/Peggy_wg_blimp/results/dmr/bsseq/dmrs.csv, /media/wooje/epi-T/Peggy_wg_blimp/results/dmr/bsseq/top100.pdf
log: /media/wooje/epi-T/Peggy_wg_blimp/results/logs/bsseq.log (check log file(s) for error message)
conda-env: /home/wooje/anaconda3/envs/wg-blimp/lib/python3.9/site-packages/snakemake_wrapper/conda/8c03d5578c6dd7b4f0accc99ba7b7c00
RuleException:
CalledProcessError in line 410 of /home/wooje/anaconda3/envs/wg-blimp/lib/python3.9/site-packages/snakemake_wrapper/Snakefile:
Command 'source /home/wooje/anaconda3/envs/wg-blimp/bin/activate '/home/wooje/anaconda3/envs/wg-blimp/lib/python3.9/site-packages/snakemake_wrapper/conda/8c03d5578c6dd7b4f0accc99ba7b7c00'; Rscript --vanilla /media/wooje/epi-T/Peggy_wg_blimp/.snakemake/scripts/tmpz6x06vto.bsseq.R' returned non-zero exit status 1.
File "/home/wooje/anaconda3/envs/wg-blimp/lib/python3.9/concurrent/futures/thread.py", line 52, in run
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /media/wooje/epi-T/Peggy_wg_blimp/.snakemake/log/2021-08-24T182610.612489.snakemake.log
In the log file, I found the following:
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
Attaching package: ‘Biobase’
The following object is masked from ‘package:MatrixGenerics’:
rowMedians
The following objects are masked from ‘package:matrixStats’:
anyMissing, rowMedians
[1] "Filtering out 0 rows containing NA"
Error in BSmooth.tstat(smoothedData[!invalidRows], group1 = group1Samples, :
length(group1) + length(group2) >= 3 is not TRUE
Calls: callDmrs -> BSmooth.tstat -> stopifnot
Execution halted
Could you give me some advice on how to run the pipeline with one control and one experiment sample, without replicates?
Any comments will help us proceed with the analysis.
Thank you!!
Thanks for reaching out!
The issue here is that you are attempting to call DMRs using only two samples. wg-blimp uses bsseq for DMR calling by default, and bsseq refuses to analyse data with only two samples, which is why you are seeing this error. Note that an analysis with only two samples may be underpowered, but it is technically still possible to run it.
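The condition from the log (length(group1) + length(group2) >= 3) can be checked before launching the pipeline. Here is a minimal sketch in Python, assuming the two groups are available as plain lists of sample names. The function name and the sample names are hypothetical and not part of wg-blimp or bsseq:

```python
def bsseq_can_run(group1, group2):
    """Mirror the check reported in the log:
    length(group1) + length(group2) >= 3."""
    return len(group1) + len(group2) >= 3


# Hypothetical sample names, for illustration only.
one_vs_one = bsseq_can_run(["control"], ["experiment"])
two_vs_one = bsseq_can_run(["control_1", "control_2"], ["experiment"])

print("one control, one experiment:", one_vs_one)
print("two controls, one experiment:", two_vs_one)
```

With one sample per group the check fails, which matches the stopifnot error shown in the log.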
You can simply remove bsseq from the list of DMR calling tools by changing the parameter dmr_tools in your configuration file. With the changed configuration file, you can then use wg-blimp run-snakemake-from-config to re-run the analysis without bsseq.
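The change to dmr_tools can be sketched in Python as follows, assuming the configuration has already been loaded into a plain dictionary with whatever YAML library you prefer. The helper name without_tool and the example tool list are illustrative assumptions; the real list is whatever your configuration file contains:

```python
import copy


def without_tool(config, tool, key="dmr_tools"):
    """Return a copy of the config with `tool` removed from config[key]."""
    updated = copy.deepcopy(config)
    updated[key] = [name for name in updated.get(key, []) if name != tool]
    if not updated[key]:
        # Guard against ending up with no DMR caller at all.
        raise ValueError(f"removing {tool!r} would leave {key!r} empty")
    return updated


# Assumed example values, not taken from a real configuration file.
config = {"dmr_tools": ["bsseq", "camel", "metilene"]}
new_config = without_tool(config, "bsseq")
print(new_config["dmr_tools"])
```

The original dictionary is left untouched, so the two versions can be compared before the changed list is written back to the configuration file and the analysis is re-run.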
I hope that helps, let me know if there are any further issues!
Thank you very much for the quick reply.
It will be a great help in analyzing my data!!
I'll close this issue due to inactivity. Feel free to re-open if necessary :)