zyndagj/BSMAPz

methdiff.py error

Closed this issue · 15 comments

Hello,
I research Whole-Genome Bisulfite Sequencing in Arabidopsis thaliana.

I have a question about methdiff.py.
When I carried out methdiff.py, program was spewed following error.

python methdiff.py -o diff.txt -b 1 -d GCF_000001735.3_TAIR10_genomic.fna WT.txt SI.txt
[methdiff] @TUE Jul 18 17:37:52 2017 reading reference file: ../test_1/GCF_000001735.3_TAIR10_genomic_1.fna ...
[methdiff] @TUE Jul 18 17:37:54 2017 processing NC_000932.1 ...
[methdiff] @TUE Jul 18 17:37:54 2017 processing NC_001284.2 ...
Traceback (most recent call last):
File "/home/yhanda/software/bsmap-2.90/methdiff.py", line 133, in
cmp_chrom(cr)
File "/home/yhanda/software/bsmap-2.90/methdiff.py", line 112, in cmp_chrom
pval = get_pval(m[0], d[0], m[1], d[1])
File "/home/yhanda/software/bsmap-2.90/methdiff.py", line 86, in get_pval
l0, u0 = conf_intv(m0, d0, z0)
File "/home/yhanda/software/bsmap-2.90/methdiff.py", line 81, in conf_intv
span = z * (p * (1 - p) / d + z2 / (4 * d * d)) ** 0.5
ValueError: negative number cannot be raised to a fractional power

I think that this error has been attributed to python version from following site (I use Python 2.7.5).
https://stackoverflow.com/questions/17747124/valueerror-negative-number-cannot-be-raised-to-a-fractional-power
However, I don’t know how to rewrite the methdiff.py.

Please let me know corrective strategy.

Hello, I was not the original author of this software, but I will try to help.

I am fairly certain that this is not a python problem but an issue in the code itself. It looks like a negative number is being raised to the power 1/2, or (more commonly) a square root. This would yield a complex number, and is probably not the intention.

I will try to figure out why this would occur in the next week and get back to you.

Thanks,
Greg Zynda

Thank you very much for your advice.
I have carried out without error, if I changed from 0.5 to 1/2.

Thanks,
Yoshihiro

@Yoshihiro-handa, please not, that this is likely incorrect. In Python dividing two integers, will result in a truncated integer. In your case:

$ python
Python 2.7.13 (default, May  3 2017, 12:33:43) 
[GCC 5.3.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 1/2
0
>>> 3/4
0
>>> 1.0/2
0.5

So, taking something to the power of zero always results in 1. You need to fix the program error. For example, hire a programmer to improve bsmap. Your workaround could give you incorrect results.

I have a feeling the error occurred because you used a bin size of 1 (-b 1), but I have been unable to reproduce.

Please send me your input files if you would like an official fix because your floating point truncation is not correct.

Hi,
I'm also using methdiff.py after methratio.py and I have the same error than Yoshihiro, did you finally solve the problem ?

Thanks,

Bernadette

I'll try to take a look at it.

Similar to what Yoshihiro shared, can you show me how you invoked the program and the exact error you received?

It would also be helpful if I had access to your inputs, or knew how they were generated.

Thanks,
Greg

Thank you for the information. I'll pull some S. lycopersicum data from NCBI and try to reproduce.

I reproduced your error

(team-py2) wireless-10-146-4-21:methdiff_error gzynda$ python methdiff.py -o out.txt -d S_lycopersicum_chromosomes.2.50.fa out_ratio_WTB.txt out_ratio_WTC.txt 
[methdiff] @Tue Sep  3 13:05:52 2019 	reading reference file: S_lycopersicum_chromosomes.2.50.fa ...
[methdiff] @Tue Sep  3 13:06:06 2019 	processing SL2.50ch00 ...
[methdiff] @Tue Sep  3 13:06:47 2019 	processing SL2.50ch01 ...
[methdiff] @Tue Sep  3 13:09:47 2019 	processing SL2.50ch02 ...
[methdiff] @Tue Sep  3 13:11:24 2019 	processing SL2.50ch03 ...
[methdiff] @Tue Sep  3 13:13:22 2019 	processing SL2.50ch04 ...
[methdiff] @Tue Sep  3 13:15:27 2019 	processing SL2.50ch05 ...
[methdiff] @Tue Sep  3 13:17:37 2019 	processing SL2.50ch06 ...
[methdiff] @Tue Sep  3 13:19:08 2019 	processing SL2.50ch07 ...
[methdiff] @Tue Sep  3 13:21:16 2019 	processing SL2.50ch08 ...
[methdiff] @Tue Sep  3 13:23:23 2019 	processing SL2.50ch09 ...
[methdiff] @Tue Sep  3 13:25:36 2019 	processing SL2.50ch10 ...
[methdiff] @Tue Sep  3 13:27:39 2019 	processing SL2.50ch11 ...
[methdiff] @Tue Sep  3 13:29:19 2019 	processing SL2.50ch12 ...
Traceback (most recent call last):
  File "methdiff.py", line 133, in <module>
    cmp_chrom(cr)
  File "methdiff.py", line 112, in cmp_chrom
    pval = get_pval(m[0], d[0], m[1], d[1])
  File "methdiff.py", line 86, in get_pval
    l0, u0 = conf_intv(m0, d0, z0)
  File "methdiff.py", line 81, in conf_intv
    span = z * (p * (1 - p) / d + z2 / (4 * d * d)) ** 0.5
ValueError: negative number cannot be raised to a fractional power

and I believe it is because some floating point calculations are being floored to integers during the wilson score calculation.

span = z * (p * (1 - p) / d + z2 / (4 * d * d)) ** 0.5

I'll modify that code and report back.

That change did not fix it, so I'm going to figure out what specific data values are causing the error

That change did not fix it, so I'm going to figure out what specific data values are causing the error

Dear zyngaji,

Have you fixed this issue? if so, could send me a copy of the script? (xiaenhua@gmail.com)

Enhua

So methdiff.py calculates the methylation frequency from columns

Column Reason Description
6 Effective depth eff_CT = CT * (G / GA)
7 Methylation counts C

The effective depth can be smaller than C, so methratio.py calculates the final methylation frequency with

frequency = min(C, eff_CT) / eff_CT

but methdiff.py has been calculating it as

frequency = C / eff_CT

which was allowing the frequency to go above 1 when eff_CT was reduced below the raw CT values. This then was causing the following calculation

span = z * (p * (1 - p) / d + z2 / (4 * d * d)) ** 0.5

to take the square root of a negative number from the (1-p) part, leading to

ValueError: negative number cannot be raised to a fractional power

I am updating the methdiff.py code to use the same calculating that methratio.py does and will update you when a new release is pushed.

The latest release that I just pushed out fixes this issue. Please let me know if you encounter anything else!

https://github.com/zyndagj/BSMAPz/releases/tag/1.1.1

https://anaconda.org/zyndagj/bsmapz