Bug in 'bin'
gevro opened this issue · 10 comments
Hi,
I found a bug in bin. I have a bigwig whose max values are 1.0:
$wiggletools write_bg - fillIn genome.bed blah.bw | head -n 1000000 | cut -f 4 | sort -nr | head
1.000000
1.000000
1.000000
1.000000
However, when I do bin 20 scale 0.05, I am getting values > 1 (either with or without write_bg), which should be impossible:
$wiggletools bin 20 scale 0.05 fillIn genome.bed blah.bw
chr1 109840 109860 0.982538
chr1 109860 110020 1.000000
chr1 110020 110040 1.321969
chr1 110040 110060 0.959469
chr1 110060 110080 0.961057
$wiggletools write_bg - bin 20 scale 0.05 fillIn genome.bed
chr1 109980 110000 1.000000
chr1 110000 110020 1.000000
chr1 110020 110040 1.321969
chr1 110040 110060 0.959469
chr1 110060 110080 0.961057
It is happening due to bin, because without scale, it is still happening (max value should be 20):
$ wiggletools bin 20 fillIn genome.bed blah.bw
chr1 109820 109840 19.169865
chr1 109840 109860 19.650755
chr1 109860 110020 20.000000
chr1 110020 110040 26.439389
chr1 110040 110060 19.189385
chr1 110060 110080 19.221145
This is the original area of that issue without binning:
$wiggletools fillIn genome.bed blah.bw
chr1 109845 109847 0.954545
chr1 109847 109848 0.950000
chr1 109848 110027 1.000000
chr1 110027 109848 0.000000
chr1 109848 110027 1.000000
chr1 110027 110032 0.954545
chr1 110032 110041 0.958333
I'm not sure the source of the bug, but I suspect that binning is not weighting properly the proportion of the bin that is overlapped by the bigwig values.
Hello @gevro ,
Actually, there appears to be a bug in fillIn
, judging by the repeated sequences (chr1:109848-110027), which would allow for double counting. I'll have a look.
Cheers,
Daniel
Ah yes, good catch!
Hello @gevro ,
I'm struggling to reproduce the bug, could you please share the genome.bed and blah.bw files with me? Or at least the relevant region on chr1.
Cheers,
Daniel
Actually, the bug might be upstream. See below:
$wiggletools blah.bw
fixedStep chrom=chr1 start=109810 step=1
0.925000
0.966666
0.964285
0.964285
0.964285
chr1 109814 109821 0.961538
chr1 109821 109845 0.958333
chr1 109845 109847 0.954545
fixedStep chrom=chr1 start=109848 step=1
0.950000
chr1 109848 110027 1.000000
chr1 109848 110027 1.000000
chr1 110027 110032 0.954545
chr1 110032 110041 0.958333
chr1 110041 110046 0.954545
This is showing up twice:
chr1 109848 110027 1.000000
chr1 109848 110027 1.000000
blah.bw was made from blah.wig using kent tools. Here is the area from blah.wig:
$cat blah.wig
fixedStep chrom=chr1 start=109810 step=1
0.925000
0.966666
0.964285
0.964285
0.964285
chr1 109814 109821 0.961538
chr1 109821 109845 0.958333
chr1 109845 109847 0.954545
fixedStep chrom=chr1 start=109848 step=1
0.950000
chr1 109848 110027 1.000000
chr1 110027 110032 0.954545
chr1 110032 110041 0.958333
chr1 110041 110046 0.954545
Origin of the above blah.wig file is from:
$wiggletools scale 0.5 sum blah1.bw blah2.bw > blah.wig
Without scale 0.5 it shows:
$wiggletools sum blah1.bw blah2.bw
fixedStep chrom=chr1 start=109810 step=1
1.850000
1.933333
1.928571
1.928571
1.928571
chr1 109814 109821 1.923077
chr1 109821 109845 1.916667
chr1 109845 109847 1.909091
fixedStep chrom=chr1 start=109848 step=1
1.900000
chr1 109848 110027 2.000000
chr1 110027 110032 1.909091
chr1 110032 110041 1.916667
chr1 110041 110046 1.909091
Here are the regions from the two original bw files before sum:
$blah1.bw
fixedStep chrom=chr1 start=109803 step=1
0.909091
chr1 109803 109810 0.916667
chr1 109810 110140 1.000000
chr1 110140 110146 0.875000
chr1 110146 110171 0.857143
chr1 110171 110175 0.833333
chr1 110175 110179 0.800000
chr1 110179 110197 0.833333
chr1 110197 110208 0.800000
chr1 110208 110226 0.750000
chr1 110226 110228 0.500000
$blah2.bw
fixedStep chrom=chr1 start=109794 step=1
0.866667
0.857143
0.857143
0.857143
0.857143
0.923077
0.923077
0.923077
0.916667
chr1 109802 109809 0.928571
chr1 109809 109811 0.933333
chr1 109811 109814 0.928571
chr1 109814 109821 0.923077
chr1 109821 109845 0.916667
chr1 109845 109847 0.909091
fixedStep chrom=chr1 start=109848 step=1
0.900000
chr1 109848 110027 1.000000
chr1 110027 110032 0.909091
chr1 110032 110041 0.916667
chr1 110041 110046 0.909091
blah1.bw and blah2.bw were made from bedgraph using kent tools:
$blah1.bedgraph
chr1 109845 109846 1
chr1 109846 109847 1
chr1 109847 109848 1
chr1 109848 109849 1
chr1 109849 109850 1
chr1 109850 109851 1
chr1 109851 109852 1
chr1 109852 109853 1
chr1 109853 109854 1
$blah2.bedgraph
chr1 109845 109846 0.909091
chr1 109846 109847 0.909091
chr1 109847 109848 0.9
chr1 109848 109849 1
chr1 109849 109850 1
chr1 109850 109851 1
chr1 109851 109852 1
chr1 109852 109853 1
Does this help find the bug? Is the issue with 'sum'?
Sure! Thanks, I will e-mail you now.
Hello @gevro ,
As mentioned by email, I think I found and corrected the bug, please pull in my latest commits on master.
Let me know how it goes.
Thanks for spotting this,
Daniel
Installation via brew is not working:
- First I uninstalled: brew uninstall wiggletools.
2)Then install from --HEAD
$ brew install wiggletools --HEAD
Updating Homebrew...
==> Auto-updated Homebrew!
Updated 2 taps (homebrew/core and homebrew/cask).
==> New Formulae
pkger
==> Updated Formulae
Updated 104 formulae.
==> New Casks
operator silicon
==> Updated Casks
Updated 49 casks.
==> Installing wiggletools from brewsci/bio
==> Cloning https://github.com/Ensembl/WiggleTools.git
Updating /Users/evrong01/Library/Caches/Homebrew/wiggletools--git
From https://github.com/Ensembl/WiggleTools
5b4f2a4..94e11c4 master -> origin/master
==> Checking out branch master
Already on 'master'
Your branch is behind 'origin/master' by 9 commits, and can be fast-forwarded.
(use "git pull" to update your local branch)
HEAD is now at 94e11c4 Implemented compression keyword
==> make
Last 15 lines from /Users/evrong01/Library/Logs/Homebrew/wiggletools/01.make:
apply = newWiggleIterator(data, &LooseBufferedWiggleIteratorPop, &BufferedWiggleIteratorSeek, data->default_value);
~~~~~~~~~~~~~~~~~ ^
./wiggleIterator.h:37:1: note: 'newWiggleIterator' declared here
WiggleIterator * newWiggleIterator(void * data, void (*pop)(WiggleIterator *), void (*seek)(WiggleIterator *, const char *, int, int), double default_value, bool overlapping);
^
apply.c:169:77: error: too few arguments to function call, expected 5, have 4
return newWiggleIterator(data, &FillInUnaryPop, NULL, source->default_value);
~~~~~~~~~~~~~~~~~ ^
./wiggleIterator.h:37:1: note: 'newWiggleIterator' declared here
WiggleIterator * newWiggleIterator(void * data, void (*pop)(WiggleIterator *), void (*seek)(WiggleIterator *, const char *, int, int), double default_value, bool overlapping);
^
3 errors generated.
make[1]: *** [apply.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [Wiggletools] Error 2
If reporting this issue please do so at (not Homebrew/brew or Homebrew/core):
https://github.com/brewsci/homebrew-bio/issues
Please create pull requests instead of asking for help on Homebrew's GitHub,
Twitter or any other official channels.
D'oh! I had forgotten to commit the changes in that file!
Now fixed, please try again.
Works! I am truly appreciative for all the quick fixes and improvements!