GMOD/vcf-js

Error: Too much data. chunkSizeLimit

Hi,

Is it possible to use vcf-js on a huge VCF file (more than 1000 samples)?
When I try to fetch just one line:

await tbiIndexed.getLines('25', 223, 224, line =>
  variants.push(tbiVCFParser.parseLine(line)),
)

I get the following error:

(node:22444) UnhandledPromiseRejectionWarning: Error: Too much data. Chunk size 3,876,100 bytes exceeds chunkSizeLimit of 2,000,000.
    at TabixIndexedFile._callee2
[...]

Thanks for your help
Regards

Hi there, thanks for your interest in vcf-js! The error you're seeing actually comes from tbi-js, not vcf-js, but luckily it should be pretty easy to fix. When creating the TabixIndexedFile, you can pass a chunkSizeLimit to use instead of the default of 2,000,000 bytes. Raising the chunkSizeLimit should be fine: the default is pretty conservative and is only there to keep it from using too much memory unintentionally. The chunk in your error is 3,876,100 bytes, so any limit above that should work.

So it would look something like:

const tbiIndexed = new TabixIndexedFile({
  path: 'path/to/my/file.gz',
  chunkSizeLimit: 5000000 // or whatever you need it to be
})
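
If you'd rather not guess at a value, a rough sketch like the one below could read the reported chunk size out of the error message and add some headroom. Note that `suggestChunkSizeLimit` and its `headroom` parameter are hypothetical, not part of tbi-js or vcf-js, and the parsing assumes the message keeps the format shown in the error above, which may not hold in other versions.

```javascript
// Hypothetical helper, not part of tbi-js or vcf-js.
// Assumes the error message has the format quoted in this issue:
//   "Too much data. Chunk size 3,876,100 bytes exceeds chunkSizeLimit of 2,000,000."
function suggestChunkSizeLimit(message, headroom = 1.5) {
  // Capture the reported chunk size, including its thousands separators
  const match = /Chunk size ([\d,]+) bytes/.exec(message)
  if (!match) return undefined // message format not recognized

  // Strip the commas, then scale up so slightly larger chunks still fit
  const chunkSize = Number(match[1].replace(/,/g, ''))
  return Math.ceil(chunkSize * headroom)
}

const message =
  'Too much data. Chunk size 3,876,100 bytes exceeds chunkSizeLimit of 2,000,000.'
console.log(suggestChunkSizeLimit(message)) // 5814150
```

The result can then be passed as chunkSizeLimit when constructing the TabixIndexedFile. Keep in mind that a larger limit means more memory per request, so it is worth keeping the headroom modest.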

Let me know if this helps.

Perfect, thanks for your answer!