Output data from specific region
Hi,
How do I output data only from a specific region (chrom:start-end)?
Here is my base command:
wiggletools write_bg - compress gte 0.2 test.bw | less
Thanks
Is this the way to do it?
wiggletools seek chr1 1 1000000 write_bg - compress gte 0.2 test.bw
What if I want all positions on chr1 without having to specify end position?
This does not work:
wiggletools seek chr1 write_bg - compress gte 0.2 test.bw
And is there a way to provide a file with a list of regions? If so, what format should this file be in (BED, 1-based, etc.)?
Hello @gevro ,
Sorry, pulling out a whole chromosome was not an implemented feature.
You may however want to look at the overlaps function, which you provide with regions and is used to filter another iterator.
Hope this helps,
Daniel
Thanks. I don't see the documentation for it - how do I use it, and what format is the regions file input? 0-based or 1-based? BED format or UCSC format?
Also, wouldn't trim also work instead of overlaps, if I give a BED file with coordinates of the whole chromosome?
Yes, trim would work too, just with different rules with respect to boundaries.
Does trim intelligently seek specific coordinates based on the BED file?
It seems it does not. For example, if I give trim a BED file with chr1 coordinates, the output appears quickly. But if the BED file given to trim starts at chr2, it takes a long time, which suggests it is reading sequentially through the bigWig file until it reaches chr2.
Isn't there a way to seek directly to the desired coordinates?
It looks like overlaps doesn't do it either, only seek does.
But the problem is that seek doesn't take a BED file.
Is there any way to get rapid seek but with a BED file input? Or can this be added?
Hello @gevro ,
that's an interesting idea, but alas I don't think it will be implemented any time soon, as WiggleTools is in maintenance mode only.
Best regards,
Daniel
Thanks. My workaround is to run seek once per region and append each output to a single file with '>>'.
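For anyone who wants to script this, here is a minimal sketch of the loop in bash. It reuses the seek command from earlier in this thread unchanged. The function name extract_regions and the regions file layout are my own assumptions: one region per line as whitespace-separated "chrom start end", already in whatever coordinate convention seek expects (the thread does not settle whether that matches BED, so no conversion is attempted).

```shell
#!/usr/bin/env bash
# Sketch only: one `wiggletools seek` call per region, results appended to one file.
# ASSUMPTIONS (not from the WiggleTools docs):
#   - extract_regions is a hypothetical helper name
#   - the regions file holds "chrom start end" per line, in the same
#     coordinate convention that `seek` expects (no BED conversion is done)

extract_regions() {
    local regions_file=$1 bigwig=$2 out=$3
    local chrom start end _

    : > "$out"   # start from an empty file so reruns do not append to old output

    # `|| [ -n "$chrom" ]` keeps a final line that lacks a trailing newline
    while read -r chrom start end _ || [ -n "$chrom" ]; do
        # skip blank lines and '#' comment lines
        if [[ -z $chrom || $chrom == \#* ]]; then
            continue
        fi
        # same command as in the thread, with the region filled in;
        # stdin is detached so the command cannot consume the regions file
        wiggletools seek "$chrom" "$start" "$end" \
            write_bg - compress gte 0.2 "$bigwig" < /dev/null >> "$out"
    done < "$regions_file"
}
```

Usage would be something like `extract_regions regions.txt test.bw out.bg` after sourcing the script. Each region costs a separate wiggletools start-up, but every call seeks directly to its region rather than scanning the file from the beginning.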