zyndagj/BSMAPz

methratio.py 'S not a valid CIGAR character'

When I run this script on my BAM file, I get a ValueError and the script exits with the message:
'S not a valid CIGAR character'. The script only handles the M, I, and D operations and crashes on any other; however, S, H, and X are also allowed by the SAM file format.
https://drive5.com/usearch/manual/cigar.html
I added some print statements inside the script, and my seq and cigar look like this, for example:
seq: ATCCCAACAACACTCCAACCTCAACATAAACCAACCCCAACATAAACCAACCCCAACATAAACCTACCTCAACAT
cigar: 43M32S
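
For reference, here is a minimal sketch of how a CIGAR string like this one could be tokenized and checked against the read length. The names `parse_cigar` and `query_length` are hypothetical and are not taken from methratio.py; the set of read-consuming operations (M, I, S, =, X) follows the SAM specification.

```python
import re

# All nine CIGAR operations defined by the SAM specification.
CIGAR_FULL = re.compile(r"(\d+[MIDNSHP=X])+")
CIGAR_TOKEN = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume bases of the read (query) sequence.
QUERY_OPS = set("MIS=X")

def parse_cigar(cigar):
    """Split a CIGAR string into a list of (length, operation) tuples."""
    if not CIGAR_FULL.fullmatch(cigar):
        raise ValueError("%r is not a valid CIGAR string" % cigar)
    return [(int(n), op) for n, op in CIGAR_TOKEN.findall(cigar)]

def query_length(ops):
    """Number of read bases the CIGAR accounts for (hard clips excluded)."""
    return sum(n for n, op in ops if op in QUERY_OPS)

seq = "ATCCCAACAACACTCCAACCTCAACATAAACCAACCCCAACATAAACCAACCCCAACATAAACCTACCTCAACAT"
ops = parse_cigar("43M32S")
# 43 aligned bases plus 32 soft-clipped bases cover the whole 75-base read.
assert query_length(ops) == len(seq)
```

So the read itself is consistent with its CIGAR; the last 32 bases are simply soft-clipped rather than aligned.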

You are correct, my methratio.py parser should be handling those characters.

https://github.com/zyndagj/BSMAPz/blob/master/methratio.py#L431

I'll work on modifying that script to correctly support those additional cases after the new year.

I can implement the fix; however, I am not sure how to treat these operations. So far, for my own purposes, I simply ignore them, but I do not think that is the best solution. What do you think should be done with these unmatched parts of the reads? Sorry for the mess with reopening.

The original version ignored them and assumed they only existed at the end of a read. I don't currently have time to implement a fix, but am happy to accept a pull request if you want to give it a try.
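
One possible way to ignore clipped bases without assuming they sit only at the end of a read is sketched below. This is an illustration under stated assumptions, not the code in methratio.py: `strip_clips` is a hypothetical helper, and it expects the CIGAR already split into `(length, operation)` tuples. Per the SAM specification, soft clips (S) are present in the read sequence while hard clips (H) are not, and both may only appear at the ends of the CIGAR. X (and =) consume both read and reference, so they could plausibly share the M code path.

```python
def strip_clips(seq, ops):
    """Remove soft-clipped bases from seq and clip operations from ops.

    ops is a list of (length, operation) tuples. Hypothetical helper:
    S bases are trimmed from the sequence, H entries are dropped without
    touching the sequence because hard-clipped bases are not stored in it.
    """
    start, end = 0, len(seq)
    kept = list(ops)
    # Leading clips: an H, if present, comes before any S.
    while kept and kept[0][1] in ("S", "H"):
        n, op = kept.pop(0)
        if op == "S":
            start += n
    # Trailing clips: an S, if present, comes before any H.
    while kept and kept[-1][1] in ("S", "H"):
        n, op = kept.pop()
        if op == "S":
            end -= n
    return seq[start:end], kept

# Example with clips on both ends: 2H3S4M2S on a 9-base stored sequence.
trimmed, remaining = strip_clips(
    "TTTACGTGG", [(2, "H"), (3, "S"), (4, "M"), (2, "S")]
)
assert trimmed == "ACGT"
assert remaining == [(4, "M")]
```

After this step, only operations that describe the alignment itself remain, so an existing M/I/D loop could run on the trimmed sequence unchanged. The reference start position does not need adjusting, since the SAM POS field already refers to the first aligned (non-clipped) base.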

Hi, sorry to bother you with this issue, but I'm running into the same problem. Unfortunately, I do not have the knowledge to implement the fix. Is there any plan to update the script in the near future? I'm aware this is probably not the best moment given the COVID-19 crisis, but I thought I would ask. Apologies, and thank you very much for your work updating BSMAP and for your help.

Hi, any updates on this issue? I'm running into the same problem. Thanks for keeping BSMAP alive!