Compression of intermediate bam file
matthdsm opened this issue · 3 comments
matthdsm commented
Hi,
Is there an option to compress the intermediate bam file?
I'm aligning 80GB's worth of fastq files and the intermediate bam's grown to about 500GB, which is becoming quite a load on our FS.
Current command is
snap-aligner paired ./snapaligner sample_R1.fastp.fastq.gz sample_R2.fastp.fastq.gz -o sample.bam -t 18 -so -b- -sm 10 -I -hc-
Thanks
M
bolosky commented
By "intermediate" BAM file do you mean the sort temporary file that gets created during alignment but before the final output? The one that's got a name like sample.bam.tmp?
No, there's no option to compress it. Compressing is really slow and this is the first time that I, at least, have heard that it's been a big problem to find enough space for it so I'd never considered doing it.
If you mean the final output, that is always compressed. If you're getting a 500GB BAM from 80GB of gzipped FASTQ then something's wrong and we should follow up on it.
…--Bill
From: Matthias De Smet ***@***.***>
Sent: Thursday, December 8, 2022 4:00 AM
To: amplab/snap ***@***.***>
Cc: Subscribed ***@***.***>
Subject: [amplab/snap] Compression of intermediate bam file (Issue #163)
matthdsm commented
Hi!
The intermediate is the bam.tmp file; the final bam is only about 50GB, so that's acceptable.
Anyway, I just wanted to know if it was possible. I prefer speed over size anyways.
Thanks for the reply!
M
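For anyone who runs into the same space problem, here is a rough pre-flight check in Python. It is only a sketch: the expansion factor is taken from the single data point in this thread (about 500GB of bam.tmp for 80GB of gzipped FASTQ), and the function names `estimate_tmp_bytes` and `has_room` are made up for illustration; they are not part of SNAP.

```python
import os
import shutil

# Assumption: one observation from this thread, ~500GB tmp for ~80GB gzipped FASTQ.
# Other data sets and options may behave very differently.
TMP_EXPANSION = 500 / 80


def estimate_tmp_bytes(fastq_paths, expansion=TMP_EXPANSION):
    """Guess the sort temporary file size from the total input size."""
    total_input = sum(os.path.getsize(p) for p in fastq_paths)
    return int(total_input * expansion)


def has_room(fastq_paths, out_dir=".", expansion=TMP_EXPANSION):
    """Return (needed, free, ok) for the filesystem that holds out_dir."""
    needed = estimate_tmp_bytes(fastq_paths, expansion)
    free = shutil.disk_usage(out_dir).free
    return needed, free, free >= needed
```

Calling `has_room(["sample_R1.fastp.fastq.gz", "sample_R2.fastp.fastq.gz"], out_dir=".")` before starting the aligner would give an early warning instead of a full filesystem halfway through the run.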
ghuls commented
I think SAMtools compresses temporary BAM files, but only with compression level 1 (to have some compression, but not too much CPU overhead).
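To get a feel for that trade-off, here is a small Python sketch that compresses the same synthetic read data with zlib at level 1 and at level 6 and reports size and time. BAM's BGZF blocks are deflate-based, so zlib is a reasonable stand-in, but this is not how SAMtools or SNAP write their files, the reads are randomly generated rather than real, and `make_reads` and `compare_levels` are names invented for this example.

```python
import random
import time
import zlib


def make_reads(n_reads=20000, read_len=100, ref_len=50000, seed=0):
    """Build synthetic reads by sampling substrings of a random 'reference'."""
    rng = random.Random(seed)
    ref = "".join(rng.choice("ACGT") for _ in range(ref_len))
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(ref_len - read_len)
        reads.append(ref[start:start + read_len])
    return "\n".join(reads).encode("ascii")


def compare_levels(data, levels=(1, 6)):
    """Return {level: (compressed_size, seconds)} for each zlib level."""
    results = {}
    for level in levels:
        t0 = time.perf_counter()
        packed = zlib.compress(data, level)
        elapsed = time.perf_counter() - t0
        # Sanity check: compression must be lossless.
        assert zlib.decompress(packed) == data
        results[level] = (len(packed), elapsed)
    return results


if __name__ == "__main__":
    data = make_reads()
    for level, (size, secs) in compare_levels(data).items():
        print(f"level {level}: {size / len(data):.1%} of original, {secs:.3f}s")
```

The exact numbers depend on the machine and the data, so treat the output as an indication only.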