calls file not recognised
Coracollar opened this issue · 11 comments
Hi, I've been using methplotlib to plot my methylation frequencies. However, I am having problems plotting the calls. The file is in the nanopolish calls format and does not appear to have any problems when I inspect it (I tried more than one calls file).
$ methplotlib -m /data/cephfs/punim1048/allbarcoded/CORT/CORT13/methylation_chromosomes/chr10_methylation_calls.tsv -n CORT13 -w ch10:63273264-63314576 -g /data/cephfs/punim1048/GRCm38_genome/annotation/mm10.ensGene.gtf --simplify
Input file /data/cephfs/punim1048/allbarcoded/CORT/CORT13/methylation_chromosomes/chr10_methylation_calls.tsv not recognized!
Detailed error:
Traceback (most recent call last):
  File "/home/coracollar/anaconda3/envs/methplot/bin/methplotlib", line 10, in <module>
    sys.exit(main())
  File "/home/coracollar/anaconda3/envs/methplot/lib/python3.7/site-packages/methplotlib/methplotlib.py", line 18, in main
    meth_data = get_data(args.methylation, args.names, window, args.smooth)
  File "/home/coracollar/anaconda3/envs/methplot/lib/python3.7/site-packages/methplotlib/import_methylation.py", line 165, in get_data
    return [read_meth(f, n, window, smoothen) for f, n in zip(methylation_files, names)]
  File "/home/coracollar/anaconda3/envs/methplot/lib/python3.7/site-packages/methplotlib/import_methylation.py", line 165, in <listcomp>
    return [read_meth(f, n, window, smoothen) for f, n in zip(methylation_files, names)]
  File "/home/coracollar/anaconda3/envs/methplot/lib/python3.7/site-packages/methplotlib/import_methylation.py", line 32, in read_meth
    return parse_nanopolish(filename, file_type, name, window, smoothen=smoothen)
  File "/home/coracollar/anaconda3/envs/methplot/lib/python3.7/site-packages/methplotlib/import_methylation.py", line 51, in parse_nanopolish
    gr.pos = np.floor(gr.drop().df[["Start", "End"]].mean(axis=1))
  File "/home/coracollar/.local/lib/python3.7/site-packages/pandas/core/frame.py", line 2806, in __getitem__
    indexer = self.loc._get_listlike_indexer(key, axis=1, raise_missing=True)[1]
  File "/home/coracollar/.local/lib/python3.7/site-packages/pandas/core/indexing.py", line 1552, in _get_listlike_indexer
    keyarr, indexer, o._get_axis_number(axis), raise_missing=raise_missing
  File "/home/coracollar/.local/lib/python3.7/site-packages/pandas/core/indexing.py", line 1639, in _validate_read_indexer
    raise KeyError(f"None of [{key}] are in the [{axis_name}]")
KeyError: "None of [Index(['Start', 'End'], dtype='object')] are in the [columns]"
Do you think it would be possible to share that file with me to debug?
There is a column of index numbers in the file which shouldn't be there. You can fix it with:
zcat chr10_methylation_calls.tsv.gz | cut -f2- | gzip > fixed_calls.tsv.gz
The output of that command worked for me with methplotlib.
Did you manipulate that file manually, or is it straight from nanopolish? If it is straight from nanopolish, which version did you use?
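For anyone who prefers to do the same cleanup from Python, here is a minimal sketch of the `cut -f2-` step above. It is not part of methplotlib. The function names `drop_leading_index` and `fix_calls_file` are made up for illustration, and it assumes that a clean nanopolish calls file has `chromosome` as its first header field. Unlike `cut -f2-`, it leaves a file untouched if the header already starts with `chromosome`.

```python
import csv
import gzip

# Assumption: the header of a clean nanopolish calls file starts with "chromosome".
EXPECTED_FIRST_COLUMN = "chromosome"


def drop_leading_index(rows):
    """Yield rows (lists of fields, header first) without a stray leading index column."""
    rows = iter(rows)
    header = next(rows, None)
    if header is None:
        return  # empty input, nothing to yield
    if header[:1] == [EXPECTED_FIRST_COLUMN]:
        offset = 0  # header is already clean, keep every column
    elif header[1:2] == [EXPECTED_FIRST_COLUMN]:
        offset = 1  # one extra column in front, drop it from every row
    else:
        raise ValueError(f"unexpected header fields: {header[:3]}")
    yield header[offset:]
    for row in rows:
        yield row[offset:]


def fix_calls_file(src, dst):
    """Rewrite the gzipped TSV `src` to `dst` without the leading index column."""
    with gzip.open(src, "rt", newline="") as fin, \
            gzip.open(dst, "wt", newline="") as fout:
        reader = csv.reader(fin, delimiter="\t")
        writer = csv.writer(fout, delimiter="\t", lineterminator="\n")
        writer.writerows(drop_leading_index(reader))
```

With the file names from the command above, this would be called as `fix_calls_file("chr10_methylation_calls.tsv.gz", "fixed_calls.tsv.gz")`.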
-w ch10:6,000-7,000
Is that a typo here on GitHub, or did you use chr10?
But did you use -w ch10 or -w chr10? Which exact command caused the problem?
Thanks for following up! That sounds likely to be the issue, but methplotlib should give you a better error message there. I'll take care of it.
Thanks for your patience. methplotlib should now give you a better error message (in version 0.14.1).
Please let me know if you encounter further issues.
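For readers who hit the same ch10 / chr10 mix-up, a quick pre-flight check of the -w argument can catch it before plotting. The sketch below is only an illustration and is not how methplotlib validates the window. `parse_window` and `check_window` are hypothetical helpers, and the set of chromosome names is assumed to have been collected from the calls file beforehand.

```python
def parse_window(region):
    """Split a region such as 'chr10:63273264-63314576' into (chromosome, begin, end)."""
    chromosome, sep, interval = region.partition(":")
    if not sep:
        raise ValueError(f"expected 'chrom:begin-end', got {region!r}")
    # Thousands separators such as '6,000' are tolerated and removed.
    begin, dash, end = interval.replace(",", "").partition("-")
    if not dash:
        raise ValueError(f"expected 'chrom:begin-end', got {region!r}")
    return chromosome, int(begin), int(end)


def check_window(region, chromosomes):
    """Return the parsed window, or raise if its chromosome is not among `chromosomes`."""
    chromosome, begin, end = parse_window(region)
    if chromosome not in chromosomes:
        known = ", ".join(sorted(chromosomes))
        raise ValueError(
            f"chromosome {chromosome!r} from the window is not in the calls file "
            f"(found: {known})"
        )
    if begin >= end:
        raise ValueError(f"window begin {begin} is not smaller than end {end}")
    return chromosome, begin, end
```

Calling `check_window("ch10:63273264-63314576", {"chr10"})` raises a ValueError that names both `ch10` and `chr10`, which makes the typo easy to spot.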