biocommons/hgvs

Support for circular genomic identifiers

mbiokyle29 opened this issue · 4 comments

I am getting a parse error when using the o. (circular genomic DNA) reference sequence type described at https://varnomen.hgvs.org/bg-material/refseq/. The error can be reproduced with the following code:

from hgvs.easy import parse

# parse a variant from a plasmid using the circular genome
parse('plasmid:o.10A>G')
---------------------------------------------------------------------------
ParseError                                Traceback (most recent call last)
~/Dev/project/.venv/lib/python3.7/site-packages/hgvs/parser.py in rule_fxn(s)
    124                 try:
--> 125                     return self._grammar(s).__getattr__(rule_name)()
    126                 except ometa.runtime.ParseError as exc:

~/Dev/project/.venv/lib/python3.7/site-packages/parsley.py in invokeRule(*args, **kwargs)
     97                                      [["message", "expected EOF"]], err.trail)
---> 98             raise err
     99         return invokeRule

~/Dev/project/.venv/lib/python3.7/site-packages/parsley.py in invokeRule(*args, **kwargs)
     84             try:
---> 85                 ret, err = self._grammar.apply(name, *args)
     86             except ParseError as e:

~/Dev/project/.venv/lib/python3.7/site-packages/ometa/runtime.py in apply(self, ruleName, *args)
    461         if r is not None:
--> 462             val, err = self._apply(r, ruleName, args)
    463             return val, err

~/Dev/project/.venv/lib/python3.7/site-packages/ometa/runtime.py in _apply(self, rule, ruleName, args)
    494                 memoRec = self.input.setMemo(ruleName,
--> 495                                          [rule(), self.input])
    496             except ParseError as e:

/pymeta_generated_code/pymeta_grammar__Grammar.py in rule_hgvs_variant(self)
     37                 return (_G_apply_12, self.currentError)
---> 38             _G_or_13, lastError = self._or([_G_or_1, _G_or_3, _G_or_5, _G_or_7, _G_or_9, _G_or_11])
     39             self.considerError(lastError, 'hgvs_variant')

~/Dev/project/.venv/lib/python3.7/site-packages/ometa/runtime.py in _or(self, fns)
    603                 self.input = m
--> 604         raise joinErrors(errors)
    605

ParseError:
plasmid:o.10A>G
        ^
Parse error at line 1, column 8: expected one of 'c', 'g', 'm', 'n', 'p', or 'r'. trail: [p_variant hgvs_variant]


During handling of the above exception, another exception occurred:

HGVSParseError                            Traceback (most recent call last)
~/Dev/project/.venv/lib/python3.7/site-packages/hgvs/shell.py in <module>
----> 1 parse('plasmid:o.10A>G')

~/Dev/project/.venv/lib/python3.7/site-packages/hgvs/parser.py in parse(self, v)
    105
    106         """
--> 107         return self.parse_hgvs_variant(v)
    108
    109     def _expose_rule_functions(self, expose_all_rules=False):

~/Dev/project/.venv/lib/python3.7/site-packages/hgvs/parser.py in rule_fxn(s)
    126                 except ometa.runtime.ParseError as exc:
    127                     raise HGVSParseError(
--> 128                         "{s}: char {exc.position}: {reason}".format(s=s, exc=exc, reason=exc.formatReason())
    129                     )
    130

HGVSParseError: plasmid:o.10A>G: char 8: expected one of 'c', 'g', 'm', 'n', 'p', or 'r'
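
As a stopgap, the type prefix can be checked before calling parse. The snippet below is a minimal sketch and does not use hgvs at all: the helper names (reference_type, is_supported_type) and the regular expression are hypothetical, and the set of accepted prefixes is copied from the error message above rather than from the package's grammar.

```python
import re

# Type prefixes the parser reported as accepted in the error message above.
# Assumption: this set mirrors the grammar; it is not read from hgvs itself.
SUPPORTED_TYPES = frozenset("cgmnpr")

# Hypothetical, loose pattern for "<accession>:<type>." at the start of a string.
_PREFIX_RE = re.compile(r"^(?P<ac>[^:]+):(?P<type>[A-Za-z])\.")


def reference_type(variant):
    """Return the one-letter type prefix of an HGVS-style string, or None."""
    match = _PREFIX_RE.match(variant)
    return match.group("type") if match else None


def is_supported_type(variant):
    """Return True if the type prefix is one the error message lists as accepted."""
    return reference_type(variant) in SUPPORTED_TYPES
```

With this, is_supported_type('plasmid:o.10A>G') is False, so a caller could skip or flag circular-genome variants instead of hitting HGVSParseError. It only inspects the prefix and does not validate the rest of the variant string.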

It seems that o. is not supported by the package. This is not a huge deal, but I thought it was worth opening an issue. I'd be interested in providing a PR, assuming I could get a little guidance on the changes required. It seems that at a bare minimum I would need to:

I am guessing there are other things that I am not aware of. Either way, thanks for all the hard work on this package. It has been great to work with; the shell and the .easy module are both great!