nomad-coe/nomad

JOSS Review: `match_parser` does not work if `magic` is not installed

Closed this issue · 3 comments

In parsers.py, the _compressions variable is never defined if python-magic (or one of the other imports in the same try block) cannot be imported. However, _compressions is used on line 80 regardless, so you get the following error until you install python-magic:

Cell In[1], line 5
      2 from nomad.client import parse, normalize_all
      4 # match and run the parser
----> 5 archive = parse('vasprun.xml')
      6 # run all normalizers
      7 normalize_all(archive)

File ~/anaconda3/envs/nomad/lib/python3.9/site-packages/nomad/client/processing.py:49, in parse(mainfile_path, parser_name, backend_factory, strict, logger)
     47 else:
     48     mainfile_path = os.path.abspath(mainfile_path)
---> 49     parser = parsers.match_parser(mainfile_path, strict=strict)
     50     if isinstance(parser, parsing.MatchingParser):
     51         parser_name = parser.name

File ~/anaconda3/envs/nomad/lib/python3.9/site-packages/nomad/parsing/parsers.py:71, in match_parser(mainfile_path, strict)
     68     return None
     70 with open(mainfile_path, 'rb') as f:
---> 71     compression, open_compressed = _compressions.get(f.read(3), (None, open))
     73 with open_compressed(mainfile_path, 'rb') as cf:  # type: ignore
     74     buffer = cf.read(config.parser_matching_size)

NameError: name '_compressions' is not defined

TLCFEM commented

magic is needed to determine file types. It should be installed with parsing as in

'python-magic==0.4.24',

Since you are running on a Windows machine, is libmagic available so that python-magic installs successfully? Just FYI, Windows is not officially supported/maintained, so there could be more errors here and there.
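One way to check this from Python, on any platform, is sketched below. `magic_status` is a hypothetical helper, not part of nomad. It assumes that python-magic raises `ImportError` when the libmagic shared library cannot be found, which may differ between python-magic versions.

```python
import importlib.util


def magic_status():
    """Hypothetical helper: report whether python-magic can be imported."""
    # Step 1: is a module named "magic" installed at all?
    if importlib.util.find_spec("magic") is None:
        return "python-magic is not installed"
    # Step 2: the package may be present while libmagic itself is missing;
    # in that case the import is assumed to fail with ImportError.
    try:
        import magic  # noqa: F401
    except ImportError as exc:
        return f"python-magic is installed but failed to import: {exc}"
    return "python-magic imported successfully"


print(magic_status())
```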

This was actually on Ubuntu. I'll dig into what happened here and report back, but either way I think the logic is broken, because _compressions remains undefined whenever the try/except block falls through to except.
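To illustrate the point, here is a minimal sketch of one way to avoid the NameError. It is not the actual nomad code. It guards only the optional import and defines `_compressions` unconditionally. The table contents (3-byte signatures for gzip, xz, and bz2) and the `detect_compression` helper are assumptions for illustration, modeled on the `_compressions.get(f.read(3), (None, open))` call in the traceback.

```python
import bz2
import gzip
import lzma

# Guard only the optional import, nothing else.
try:
    import magic  # provided by python-magic; requires libmagic
except ImportError:
    magic = None  # callers can check for None before using it

# Defined outside the try block, so the name exists even when the
# optional import above fails. Keys are the first 3 bytes of each format.
_compressions = {
    b'\x1f\x8b\x08': ('gz', gzip.open),
    b'\xfd7z': ('xz', lzma.open),
    b'BZh': ('bz2', bz2.open),
}


def detect_compression(path):
    """Hypothetical helper: return (name, opener) for the file at path."""
    with open(path, 'rb') as f:
        # Fall back to the builtin open for uncompressed files.
        return _compressions.get(f.read(3), (None, open))
```

With this layout a missing python-magic only disables the features that need `magic`, and the compression lookup keeps working.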

I switched from an older version to the main branch, and the issue no longer occurs.