No calls in window
JesseBNL opened this issue · 7 comments
Dear,
The tool worked great for me, but for one sample it is not working. This is what I get:
`methplotlib -m meth_calls2_LILBR1.tsv meth_freqs2_LILRB1.tsv -n Calls Frequencies -w "19:5440600-54424828" -o lilrb1.html
Reading meth_calls2_LILBR1.tsv would be faster with bgzip and tabix.
Please index with 'tabix -S1 -s1 -b3 -e4'.
/home/algemeen/anaconda3/lib/python3.7/site-packages/pandas/core/ops.py:1649: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
result = method(y)
Problem parsing nanopolish file meth_calls2_LILBR1.tsv!
Could it be that there are no calls in your selected window?
Detailed error:
Error processing meth_calls2_LILBR1.tsv!
Detailed error:
Traceback (most recent call last):
File "/home/algemeen/anaconda3/bin/methplotlib", line 10, in
sys.exit(main())
File "/home/algemeen/anaconda3/lib/python3.7/site-packages/methplotlib/methplotlib.py", line 17, in main
meth_data = get_data(args.methylation, args.names, window, args.smooth)
File "/home/algemeen/anaconda3/lib/python3.7/site-packages/methplotlib/import_methylation.py", line 30, in get_data
return [read_meth(f, n, window, smoothen) for f, n in zip(methylation_files, names)]
File "/home/algemeen/anaconda3/lib/python3.7/site-packages/methplotlib/import_methylation.py", line 30, in
return [read_meth(f, n, window, smoothen) for f, n in zip(methylation_files, names)]
File "/home/algemeen/anaconda3/lib/python3.7/site-packages/methplotlib/import_methylation.py", line 47, in read_meth
return parse_nanopolish(filename, file_type, name, window, smoothen=smoothen)
File "/home/algemeen/anaconda3/lib/python3.7/site-packages/methplotlib/import_methylation.py", line 101, in parse_nanopolish
gr.pos = np.floor(gr.drop().df[["Start", "End"]].mean(axis=1))
File "/home/algemeen/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py", line 2934, in getitem
raise_missing=True)
File "/home/algemeen/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 1354, in _convert_to_indexer
return self._get_listlike_indexer(obj, axis, **kwargs)[1]
File "/home/algemeen/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 1161, in _get_listlike_indexer
raise_missing=raise_missing)
File "/home/algemeen/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 1246, in _validate_read_indexer
key=key, axis=self.obj._get_axis_name(axis)))
KeyError: "None of [Index(['Start', 'End'], dtype='object')] are in the [columns]"`
It says there are no calls in the selected window, but when I check my called tsv file, there are calls in the window: see attachment for a sample.
What could be going wrong here?
Dag Jesse,
I wanted to replicate your error, but when opening your file things look a little weird (see below). Is everything looking alright on your end? Could you perhaps email the full file?
As methplotlib also told you, things would be a whole lot faster with tabix indexing :-)
Cheers,
Wouter
Hi Wouter,
Thank you for the fast reply. I have added the nanopolish output files for you.
meth_calls2_LILBR1.tsv.gz
meth_freqs2_LILRB1.tsv.gz
Where can I find a tool to convert to tabix? This data set is not too big, so methplotlib was fast anyway.
You would have to sort your tsv files by chromosome and start position and compress them with bgzip (e.g. `cat <(head -n1 meth_calls2_LILBR1.tsv) <(tail -n +2 meth_calls2_LILBR1.tsv | sort -k1,1 -k3,3n) | bgzip > meth_calls2_LILBR1.tsv.gz`), and then just run `tabix -S1 -s1 -b3 -e4` for calls or `tabix -S1 -s1 -b2 -e3` for frequencies.
If you don't have tabix already you can install it with conda (https://anaconda.org/bioconda/tabix).
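As a rough illustration of the sorting step, here is a minimal Python sketch that keeps the header line and sorts the remaining rows by chromosome and then numerically by start position, which is the order tabix expects for `-s1 -b3`. The function name `sort_tsv_rows`, the column defaults, and the demo rows are assumptions made for this example; the result would still need to be compressed with bgzip and indexed with tabix.

```python
import csv
import io


def sort_tsv_rows(lines, chrom_col=0, start_col=2):
    """Return the header row followed by the data rows, sorted by
    chromosome name and then numerically by start position.

    Column indices are 0-based. The defaults assume the chromosome is in
    column 1 and the start in column 3 (as for `tabix -s1 -b3`).
    """
    header, *rows = list(csv.reader(lines, delimiter="\t"))
    rows.sort(key=lambda r: (r[chrom_col], int(r[start_col])))
    return [header] + rows


# Made-up rows, only to show the ordering (not real nanopolish output).
demo = io.StringIO(
    "chromosome\tstrand\tstart\tend\n"
    "19\t+\t54410000\t54410005\n"
    "19\t-\t9000000\t9000004\n"
    "2\t+\t500\t510\n"
)
for row in sort_tsv_rows(demo):
    print("\t".join(row))
```

Converting the start to an integer matters: a plain text sort would place 54410000 before 9000000, which is not position order.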
But the cause of the error you saw is this: you specified a very long interval, "19:5440600-54424828", and I think you want 19:54406000-54424828 (the start coordinate is missing a trailing 0).
This leads to the error because methplotlib will start chopping up your interval into chunks if it's too long to plot in one go. The first sub-interval of your very large window would then be 19:5440600-6440279, which is indeed empty, and thus throws the error.
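To make the effect of the missing digit concrete, here is a small sketch that parses both window strings and splits them into sub-intervals. It is not methplotlib's own code: `parse_window`, `split_window`, and the 1 Mb `max_size` are hypothetical names and values chosen for illustration, so the chunk boundaries will not match methplotlib's exactly.

```python
def parse_window(region):
    """Split a 'chrom:start-end' string into (chrom, start, end)."""
    chrom, span = region.split(":")
    start, end = (int(x) for x in span.split("-"))
    return chrom, start, end


def split_window(region, max_size=1_000_000):
    """Yield (chrom, start, end) sub-intervals no longer than max_size.

    max_size is an assumed value, not methplotlib's real threshold.
    """
    chrom, start, end = parse_window(region)
    for sub_start in range(start, end, max_size):
        yield chrom, sub_start, min(sub_start + max_size, end)


for region in ("19:5440600-54424828", "19:54406000-54424828"):
    chrom, start, end = parse_window(region)
    chunks = list(split_window(region))
    print(region, "spans", end - start, "bp,",
          len(chunks), "chunk(s), first:", chunks[0])
```

The mistyped window covers roughly 49 Mb and its first chunk starts about 49 Mb upstream of the region of interest, while the intended window is under 19 kb and fits in a single chunk.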
I think it would make more sense if methplotlib then just skipped ahead to the next sub-interval, but that's a bit hard with how I wrote the code right now. I'll think about it :)
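One way the "skip ahead" idea could look, purely as a sketch and not as a description of how methplotlib is or will be implemented: given the sorted call positions for a chromosome, keep only the sub-intervals that contain at least one call. The helper names `has_calls` and `non_empty_chunks` and the example coordinates are hypothetical.

```python
from bisect import bisect_left


def has_calls(sorted_positions, start, end):
    """True if any position lies within [start, end] (inclusive)."""
    i = bisect_left(sorted_positions, start)
    return i < len(sorted_positions) and sorted_positions[i] <= end


def non_empty_chunks(chunks, sorted_positions):
    """Yield only the (start, end) chunks that contain at least one call."""
    for start, end in chunks:
        if has_calls(sorted_positions, start, end):
            yield start, end


# Made-up call positions and chunks, for illustration only.
positions = [54406120, 54410450, 54424001]
chunks = [(5440600, 6440600), (6440600, 7440600), (54406000, 54424828)]
print(list(non_empty_chunks(chunks, positions)))
```

The binary search keeps the check cheap even when a long window is split into many chunks.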
Thank you for the response. I will try with the tabix files.
Fixing the interval didn't solve the problem:
methplotlib -m meth_calls2_LILBR1.tsv meth_freqs2_LILRB1.tsv -n Calls Frequencies -w "19:54406000-54424828" -o lilrb1_new.html
Reading meth_calls2_LILBR1.tsv would be faster with bgzip and tabix.
Please index with 'tabix -S1 -s1 -b3 -e4'.
/home/algemeen/anaconda3/lib/python3.7/site-packages/pandas/core/ops.py:1649: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
result = method(y)
Problem parsing nanopolish file meth_calls2_LILBR1.tsv!
Could it be that there are no calls in your selected window?
Detailed error:
Error processing meth_calls2_LILBR1.tsv!
Detailed error:
Traceback (most recent call last):
File "/home/algemeen/anaconda3/bin/methplotlib", line 10, in
sys.exit(main())
File "/home/algemeen/anaconda3/lib/python3.7/site-packages/methplotlib/methplotlib.py", line 17, in main
meth_data = get_data(args.methylation, args.names, window, args.smooth)
File "/home/algemeen/anaconda3/lib/python3.7/site-packages/methplotlib/import_methylation.py", line 30, in get_data
return [read_meth(f, n, window, smoothen) for f, n in zip(methylation_files, names)]
File "/home/algemeen/anaconda3/lib/python3.7/site-packages/methplotlib/import_methylation.py", line 30, in
return [read_meth(f, n, window, smoothen) for f, n in zip(methylation_files, names)]
File "/home/algemeen/anaconda3/lib/python3.7/site-packages/methplotlib/import_methylation.py", line 47, in read_meth
return parse_nanopolish(filename, file_type, name, window, smoothen=smoothen)
File "/home/algemeen/anaconda3/lib/python3.7/site-packages/methplotlib/import_methylation.py", line 101, in parse_nanopolish
gr.pos = np.floor(gr.drop().df[["Start", "End"]].mean(axis=1))
File "/home/algemeen/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py", line 2934, in getitem
raise_missing=True)
File "/home/algemeen/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 1354, in _convert_to_indexer
return self._get_listlike_indexer(obj, axis, **kwargs)[1]
File "/home/algemeen/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 1161, in _get_listlike_indexer
raise_missing=raise_missing)
File "/home/algemeen/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 1246, in _validate_read_indexer
key=key, axis=self.obj._get_axis_name(axis)))
KeyError: "None of [Index(['Start', 'End'], dtype='object')] are in the [columns]"
Aha, I identified another issue that I hadn't encountered in my testing. It appears that using "19" as a chromosome (where I commonly have chr19) caused some unforeseen problems. These should now be fixed in methplotlib 0.18.1.
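For anyone hitting a similar "no calls in window" message, a quick check that the chromosome name in the window matches the names used in the file can rule out a naming mismatch such as "19" versus "chr19". The sketch below is a stand-alone diagnostic, not the fix that went into 0.18.1 (the thread does not describe that change); `chromosome_names` and `check_window_chromosome` are hypothetical helpers, and the chromosome is assumed to be in the first column of a tab-separated file with one header line.

```python
import csv
import io


def chromosome_names(lines, chrom_col=0):
    """Collect the distinct chromosome names from tab-separated lines,
    skipping the single header line."""
    reader = csv.reader(lines, delimiter="\t")
    next(reader)  # skip the header
    return {row[chrom_col] for row in reader if row}


def check_window_chromosome(window, names):
    """Return the window's chromosome if present in names, else raise
    ValueError, suggesting the chr-prefixed or unprefixed variant."""
    chrom = window.split(":")[0]
    if chrom in names:
        return chrom
    alt = chrom[3:] if chrom.startswith("chr") else "chr" + chrom
    hint = f" (did you mean {alt!r}?)" if alt in names else ""
    raise ValueError(f"chromosome {chrom!r} not found in file{hint}")


# Made-up example row; a real check would pass an open file handle instead.
demo = io.StringIO("chromosome\tstrand\tstart\tend\n19\t+\t54406120\t54406125\n")
names = chromosome_names(demo)
for window in ("19:54406000-54424828", "chr19:54406000-54424828"):
    try:
        print(window, "->", check_window_chromosome(window, names))
    except ValueError as err:
        print(window, "->", err)
```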
This fixed it indeed, thank you!
Good to hear! Please let me know if you encounter further issues.