Issue merging many large bsseq objects with biscuiteer::unionize()
Working on my first analysis with BISCUIT/biscuiteer, but I've encountered some issues handling the data. I have 20 gzipped/tabixed VCFs (15-20 GB each) with accompanying bed.gz files. biscuiteer works fine with small/toy datasets, but I've had trouble merging all of these samples into a single bsseq object. I think part of the issue is simply the number of samples and the amount of data per sample. I have tried two approaches, both of which have failed so far:
1. Run `biscuiteer::readBiscuit()` on each sample individually, then use `biscuiteer::unionize()` to combine the results into a single object.
2. Merge the vcf.gz and bed.gz files on the command line, then import them together using `biscuiteer::readBiscuit()`.
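For concreteness, approach 1 looks roughly like this. This is a sketch, not a definitive recipe: the file names are placeholders for my actual data, I'm assuming `unionize()` accepts multiple bsseq objects passed via `do.call()`, and the `hdf5` argument (to keep matrices disk-backed rather than in memory) may not exist or may differ in your version of `readBiscuit()`:

```r
library(biscuiteer)

# Placeholder sample names -- my real files follow a similar pattern.
samples <- paste0("sample", 1:20)

# Read each sample into its own bsseq object. hdf5 = TRUE (if supported
# by your biscuiteer version) keeps the methylation/coverage matrices
# HDF5-backed on disk, which matters at 15-20 GB per sample.
bsseq_list <- lapply(samples, function(s) {
  readBiscuit(BEDfile = paste0(s, ".bed.gz"),
              VCFfile = paste0(s, ".vcf.gz"),
              merged  = FALSE,
              hdf5    = TRUE)
})

# Combine the per-sample objects into one bsseq object.
combined <- do.call(unionize, bsseq_list)
```

This is the step that falls over for me at scale, presumably because `unionize()` has to reconcile loci across 20 large objects.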
Do you have any advice on a better/ideal approach in this situation?
Thanks in advance!
Tim,
I suppose this isn't so much an issue as a question, hence my lack of a `sessionInfo()` and error message. I think the package is working as intended; I was just hoping to understand best practice for improving performance/speed.
I'll move forward with your suggestion of jointly calling variants with BISCUIT
into a single VCF. Feel free to close this issue unless you'd like further info from my experience.
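For what it's worth, my plan for the single-VCF route is roughly the following. This is only a sketch: the file names are placeholders, and I'm assuming the multi-sample bed.gz companion file has been produced alongside the jointly-called VCF:

```r
library(biscuiteer)

# Import the jointly-called, multi-sample VCF together with its
# companion tabixed BED file in a single readBiscuit() call, yielding
# one bsseq object with all samples as columns.
combined <- readBiscuit(BEDfile = "all_samples.bed.gz",
                        VCFfile = "all_samples.vcf.gz",
                        merged  = FALSE)
```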
Thanks much!
Dean
Just to clarify, I don't have any errors. I'm not used to handling objects of this magnitude in R, so I was just looking for direction on an optimal approach. :)