MarWoes/wg-blimp

Running wg-blimp on one control and one experiment sample, without replicates


Thank you for developing a WGBS data analysis tool.

I want to use wg-blimp to analyze a WGBS data set consisting of one control and one experiment sample.

All steps ran smoothly. However, at this step:

Rscript --vanilla /media/wooje/epi-T/Peggy_wg_blimp/.snakemake/scripts/tmpz6x06vto.bsseq.R
Activating conda environment: /home/wooje/anaconda3/envs/wg-blimp/lib/python3.9/site-packages/snakemake_wrapper/conda/8c03d5578c6dd7b4f0accc99ba7b7c00

I received the following error message:

Error in rule bsseq:
    jobid: 4
    output: /media/wooje/epi-T/Peggy_wg_blimp/results/dmr/bsseq/bsseq.Rdata, /media/wooje/epi-T/Peggy_wg_blimp/results/dmr/bsseq/dmrs.csv, /media/wooje/epi-T/Peggy_wg_blimp/results/dmr/bsseq/top100.pdf
    log: /media/wooje/epi-T/Peggy_wg_blimp/results/logs/bsseq.log (check log file(s) for error message)
    conda-env: /home/wooje/anaconda3/envs/wg-blimp/lib/python3.9/site-packages/snakemake_wrapper/conda/8c03d5578c6dd7b4f0accc99ba7b7c00
RuleException:
CalledProcessError in line 410 of /home/wooje/anaconda3/envs/wg-blimp/lib/python3.9/site-packages/snakemake_wrapper/Snakefile:
Command 'source /home/wooje/anaconda3/envs/wg-blimp/bin/activate '/home/wooje/anaconda3/envs/wg-blimp/lib/python3.9/site-packages/snakemake_wrapper/conda/8c03d5578c6dd7b4f0accc99ba7b7c00'; Rscript --vanilla /media/wooje/epi-T/Peggy_wg_blimp/.snakemake/scripts/tmpz6x06vto.bsseq.R' returned non-zero exit status 1.
  File "/home/wooje/anaconda3/envs/wg-blimp/lib/python3.9/concurrent/futures/thread.py", line 52, in run
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /media/wooje/epi-T/Peggy_wg_blimp/.snakemake/log/2021-08-24T182610.612489.snakemake.log

In the log file, I found the following:

Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.


Attaching package: ‘Biobase’

The following object is masked from ‘package:MatrixGenerics’:

    rowMedians

The following objects are masked from ‘package:matrixStats’:

    anyMissing, rowMedians

[1] "Filtering out 0 rows containing NA"
Error in BSmooth.tstat(smoothedData[!invalidRows], group1 = group1Samples,  : 
  length(group1) + length(group2) >= 3 is not TRUE
Calls: callDmrs -> BSmooth.tstat -> stopifnot
Execution halted

Could you give me some advice on how to run the pipeline with one control and one experiment sample, without replicates?

Any comments will help us proceed with the analysis.

Thank you!!

Thanks for reaching out!

The issue here is that you are attempting to call DMRs using only two samples. wg-blimp uses bsseq for DMR calling by default, and bsseq refuses to analyse data with fewer than three samples in total. That is the check that fails in your log: length(group1) + length(group2) >= 3 is not TRUE. Note that using only two samples might result in an underpowered analysis, but it is technically still possible to run it.
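To illustrate why the run stops, the condition from the error message can be mirrored in a few lines of Python. This is only a sketch of the check quoted in the log, not bsseq's actual code; the function name and the sample labels are hypothetical.

```python
def passes_sample_count_check(group1, group2):
    """Mirror the condition quoted in the log:
    length(group1) + length(group2) >= 3."""
    return len(group1) + len(group2) >= 3


# One control and one experiment sample: two samples in total, so the check fails.
print(passes_sample_count_check(["control"], ["experiment"]))  # False

# Adding a single replicate to either group would satisfy the condition.
print(passes_sample_count_check(["control_1", "control_2"], ["experiment"]))  # True
```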

You can simply remove bsseq from the list of DMR calling tools by changing the parameter dmr_tools in your configuration file. With the changed configuration file you can then use wg-blimp run-snakemake-from-config to re-run the analysis without bsseq.
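The configuration change can be sketched in Python as follows. This is a minimal illustration under assumptions: the configuration is represented as an already-parsed dictionary, the helper name drop_dmr_tool is hypothetical, and the tool names other than bsseq are placeholders, so check your own configuration file for the actual entries. In practice, editing the dmr_tools entry by hand in the configuration file achieves the same result.

```python
import copy


def drop_dmr_tool(config, tool="bsseq"):
    """Return a copy of the config whose dmr_tools list no longer contains `tool`."""
    updated = copy.deepcopy(config)
    updated["dmr_tools"] = [t for t in updated.get("dmr_tools", []) if t != tool]
    return updated


# Hypothetical parsed configuration; only the dmr_tools key comes from the thread.
# "tool_a" and "tool_b" stand in for whatever other DMR callers are configured.
config = {"dmr_tools": ["bsseq", "tool_a", "tool_b"]}

new_config = drop_dmr_tool(config)
print(new_config["dmr_tools"])  # ['tool_a', 'tool_b']
```

After saving the changed configuration, the analysis can be re-run with wg-blimp run-snakemake-from-config as described above.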

I hope that helps, let me know if there are any further issues!

Thank you very much for the quick reply.

This will be a great help in analyzing my data!

I'll close this issue due to inactivity. Feel free to re-open if necessary :)