XENONnT/utilix

JSON decode error

Closed this issue · 4 comments

While running 200 reprocessing jobs, some are getting this error:

Traceback:

File "/home/angevaare/software/straxen/straxen/__init__.py", line 6, in <module>
    from utilix import uconfig
  File "/home/angevaare/software/utilix/utilix/__init__.py", line 7, in <module>
    db = DB()
  File "/home/angevaare/software/utilix/utilix/rundb.py", line 208, in __init__
    token = Token(token_path)
  File "/home/angevaare/software/utilix/utilix/rundb.py", line 122, in __init__
    json_in = json.load(f)
  File "/home/angevaare/software/Miniconda3/envs/strax/lib/python3.6/json/__init__.py", line 299, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/home/angevaare/software/Miniconda3/envs/strax/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/home/angevaare/software/Miniconda3/envs/strax/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/angevaare/software/Miniconda3/envs/strax/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Software:

	Python 3.6.9 at /home/angevaare/software/Miniconda3/envs/strax/bin/python
	Strax 0.12.4 at /home/angevaare/software/strax/strax
	Straxen 0.12.2 at /home/angevaare/software/straxen/straxen

Thanks for reporting @jorana. My guess is that it's an issue with too many jobs trying to read the same file at the same time? I've never had this issue myself. I will try debugging sometime this weekend, but I can't promise a solution today. Does resubmitting the jobs work as a temporary fix? Or maybe adding sleep statements to the jobs?

Sure, it was 14/200 jobs; I just did a resubmit to check if that helps. No rush, it can certainly wait till Monday, but we should look into it at some point.

I'm hitting this error again on OSG, so I just want to remind myself: we should add a sleep+retry for this type of error.
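For reference, a minimal sketch of what such a sleep+retry could look like. This is a hypothetical helper (the name `load_token_with_retry` and the parameters are not from utilix): the idea is that if many concurrent jobs read the token file while another process is rewriting it, `json.load` can see a partially written file and raise `JSONDecodeError`; retrying after a short jittered sleep usually succeeds once the writer has finished.

```python
import json
import random
import time


def load_token_with_retry(token_path, retries=5, base_delay=1.0):
    """Load a JSON file, retrying on decode errors.

    Hypothetical sketch: retries with exponential backoff plus jitter,
    so that many jobs hitting the same file do not all retry in lockstep.
    """
    for attempt in range(retries):
        try:
            with open(token_path) as f:
                return json.load(f)
        except json.JSONDecodeError:
            # On the last attempt, give up and re-raise the original error.
            if attempt == retries - 1:
                raise
            # Jitter spreads retries from concurrent jobs over time.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
```

The jitter term matters here: with 200 jobs, a fixed sleep would just synchronize the next wave of reads.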

This is long solved 👍