Ensembl/WiggleTools

Output data from specific region

Closed this issue · 11 comments

gevro commented

Hi,
How do I output data only from a specific region (chrom:start-end)?

Here is my base command:

wiggletools write_bg - compress gte 0.2 test.bw | less

Thanks

gevro commented

Is this the way to do it?
wiggletools seek chr1 1 1000000 write_bg - compress gte 0.2 test.bw

Hello @gevro, absolutely, that's correct!

gevro commented

What if I want all positions on chr1 without having to specify end position?
This does not work:
wiggletools seek chr1 write_bg - compress gte 0.2 test.bw

And is there a way to provide a file with a list of regions? If so, what format is this file (BED, 1-based, etc)?

Hello @gevro,

Sorry, pulling out a whole chromosome is not an implemented feature: seek needs a start and an end position along with the chromosome.

You may however want to look at the overlaps function: you give it a set of regions, and it uses them to filter another iterator.

Hope this helps,

Daniel

gevro commented

Thanks. I don't see the documentation for it - how do I use it, and what format is the regions file input? 0-based or 1-based? BED format or UCSC format?

gevro commented

Also, wouldn't trim also work instead of overlaps, if I give a BED file with coordinates of the whole chromosome?

Yes, trim would work too, just with different rules with respect to boundaries.

gevro commented

Does trim intelligently seek specific coordinates based on the BED file?

It seems it does not. For example, if I give trim a BED file with chr1 coordinates, it displays the output quickly. But if the trim BED file begins at chr2, it takes a long time, which suggests it is reading through the bigwig file sequentially until it reaches chr2.

Isn't there a way to seek directly to the desired coordinates?

gevro commented

It looks like overlaps doesn't do it either, only seek does.

But the problem is that seek doesn't take a BED file.

Is there any way to get rapid seek but with a BED file input? Or can this be added?

Hello @gevro,

that's an interesting idea, but alas I don't think it will be implemented any time soon, as WiggleTools is in maintenance mode only.

Best regards,

Daniel

gevro commented

Thanks. My solution is to run seek separately for each region and append each output to a single file with '>>'.
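
For anyone landing here later, the per-region loop could be sketched roughly as below. This is only a sketch under some assumptions: the regions file has chrom, start and end in its first three whitespace-separated columns; the function name seek_regions and the WIGGLETOOLS variable are made up for illustration and are not part of WiggleTools; and the coordinates are passed to seek unchanged, which may or may not match the convention seek expects (this thread does not settle that), so adjust the start value if needed.

```shell
#!/usr/bin/env bash
# Sketch: run one `wiggletools seek` per region and append every result to one file.
# WIGGLETOOLS and seek_regions are illustrative names, not part of WiggleTools.
WIGGLETOOLS="${WIGGLETOOLS:-wiggletools}"

seek_regions() {
    local bed="$1" bigwig="$2" out="$3"
    local chrom start end _rest

    : > "$out"    # start from an empty output file

    # The `|| [ -n "$chrom" ]` part keeps a last line that lacks a trailing newline.
    while read -r chrom start end _rest || [ -n "$chrom" ]; do
        # Skip blank lines, comments, and BED track/browser header lines.
        case "$chrom" in
            ''|'#'*|track|browser) continue ;;
        esac

        # Assumption: coordinates go to seek as-is; convert here if your
        # regions file uses a different convention than seek expects.
        # stdin is redirected so the command cannot swallow the region list.
        "$WIGGLETOOLS" seek "$chrom" "$start" "$end" \
            write_bg - compress gte 0.2 "$bigwig" >> "$out" < /dev/null
    done < "$bed"
}

# Usage (after sourcing this file):
#   seek_regions regions.bed test.bw out.bg
```

Each region starts a separate wiggletools process, so this keeps the fast jump that seek gives at the cost of one process start per region. The output follows the order of the regions file, so sort that file first if sorted output is needed.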