biolink/ontobio

Unknown GAF qualifier/relation breaks parser

dustine32 opened this issue · 1 comment

The GafParser.to_association() function is failing while attempting to parse this line:

UniProtKB       P0AFD6  nuoI    Contributes_to  GO:0003954      PMID:3122832    IDA             F                       gene    taxon:83333     20080722        EcoliWiki

Stack trace:

  File "/Users/ebertdu/go/go-site/pipeline/env/lib/python3.6/site-packages/ontobio/io/assocparser.py", line 521, in association_generator
    parsed_result = self.parse_line(line)
  File "/Users/ebertdu/go/go-site/pipeline/env/lib/python3.6/site-packages/ontobio/io/gafparser.py", line 181, in parse_line
    parsed = to_association(list(vals), report=self.report, qualifier_parser=self.qualifier_parser(), bio_entities=self.bio_entities)
  File "/Users/ebertdu/go/go-site/pipeline/env/lib/python3.6/site-packages/ontobio/io/gafparser.py", line 397, in to_association
    qualifiers = [association.Curie.from_str(curie_util.contract_uri(relations.lookup_label(q), strict=False)[0]) for q in qualifiers]
  File "/Users/ebertdu/go/go-site/pipeline/env/lib/python3.6/site-packages/ontobio/io/gafparser.py", line 397, in <listcomp>
    qualifiers = [association.Curie.from_str(curie_util.contract_uri(relations.lookup_label(q), strict=False)[0]) for q in qualifiers]
  File "/Users/ebertdu/go/go-site/pipeline/env/lib/python3.6/site-packages/prefixcommons/curie_util.py", line 113, in contract_uri
    if (uri.startswith(v)):
AttributeError: 'NoneType' object has no attribute 'startswith'

Failing line:

qualifiers = [association.Curie.from_str(curie_util.contract_uri(relations.lookup_label(q), strict=False)[0]) for q in qualifiers]
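
Reading the trace, the `uri` that reaches `contract_uri` is `None`, which means `relations.lookup_label(q)` returned `None` for the qualifier and the result was passed on unchecked. Below is a minimal sketch of that failure mode. The two functions and both tables are hypothetical stand-ins written for illustration, not the real ontobio or prefixcommons code; they only mimic the behavior visible in the trace.

```python
# Hypothetical stand-ins: they only mimic what the stack trace shows,
# not the real ontobio / prefixcommons implementations.
PREFIX_MAP = {"RO": "http://purl.obolibrary.org/obo/RO_"}
LABEL_TO_URI = {"contributes_to": "http://purl.obolibrary.org/obo/RO_0002326"}


def lookup_label(label):
    """Return the URI for a known label, or None when the label is unknown."""
    return LABEL_TO_URI.get(label)


def contract_uri(uri):
    """Contract a URI to CURIEs; assumes uri is a str, as the traced code does."""
    curies = []
    for prefix, base in PREFIX_MAP.items():
        if uri.startswith(base):  # raises AttributeError when uri is None
            curies.append(prefix + ":" + uri[len(base):])
    return curies


def reproduce(label):
    """Chain the two calls the way the failing list comprehension does."""
    try:
        return contract_uri(lookup_label(label))[0]
    except AttributeError as err:
        return "AttributeError: " + str(err)


print(reproduce("contributes_to"))
print(reproduce("Contributes_to"))
```

In this sketch the second call fails only because the toy table has no entry for that exact string; why the real lookup does not recognize `Contributes_to` is not established here.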

For the Contributes_to case above, the next few lines are already set up to catch and report the invalid qualifier, but the code dies before it reaches them.

parsed_qualifiers = qualifier_parser.validate(gaf_line[3])
if not parsed_qualifiers.valid:
    report.error(source_line, Report.INVALID_QUALIFIER, parsed_qualifiers.original, parsed_qualifiers.message, taxon=gaf_line[TAXON_INDEX], rule=1)
    return assocparser.ParseResult(source_line, [], True, report=report)

I believe simply moving this code above the list comprehension line should fix this for us.
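
The reordering can be sketched as follows: validate the qualifier field first, record an error and bail out if it is invalid, and only then do the label lookup. This is a self-contained illustration under stated assumptions, not the ontobio code. `LABEL_TO_CURIE`, `qualifiers_to_curies`, and the plain `errors` list are hypothetical names standing in for the relations table, `to_association`, and the `Report` object.

```python
from typing import List, Optional

# Hypothetical label table standing in for the relations lookup.
LABEL_TO_CURIE = {
    "contributes_to": "RO:0002326",
    "enables": "RO:0002327",
}


def qualifiers_to_curies(field: str, errors: List[str]) -> Optional[List[str]]:
    """Validate a pipe-separated qualifier field, then map labels to CURIEs.

    Returns None and appends to `errors` when any label is unknown, so the
    lookup below never sees a label it cannot resolve.
    """
    labels = [q for q in field.split("|") if q]

    # Validation runs first, mirroring the proposed move of the
    # qualifier_parser.validate(...) check above the list comprehension.
    unknown = [q for q in labels if q not in LABEL_TO_CURIE]
    if unknown:
        errors.append("Invalid qualifier: " + ", ".join(unknown))
        return None

    # Safe: every label is known to be present in the table.
    return [LABEL_TO_CURIE[q] for q in labels]


errors: List[str] = []
print(qualifiers_to_curies("Contributes_to", errors), errors)
print(qualifiers_to_curies("enables", errors))
```

With this ordering, the unrecognized qualifier ends up as a reported error for that line instead of an unhandled `AttributeError` that stops the whole parse.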

kltm commented

@dustine32 As we no longer have this testable upstream and you have tests and the code is live, I'm just going to call this closed for now. Please reopen if I'm mistaken.