evidens/json2csv

"ValueError: end is out of bounds" and "ValueError: Unterminated string..." when run in subprocess.call, but NOT subprocess.Popen

Michael-B-G opened this issue · 1 comment

Hi,

So, I have a large Python program I'm writing that calls both gen_outline.py and json2csv.py as initial steps. Following this, I call Rufus Pollock's "csv2sqlite.py" (https://github.com/rgrp/csv2sqlite) to get the data into SQLite format. To invoke these Python modules from within my larger program I use the "subprocess" Python library. Up until now I was using Popen, but soon found I was running into race conditions (rufuspollock/csv2sqlite#20). After doing some research, I decided to either stick to subprocess.call(), or at least invoke Popen.wait(). While these solutions appear to solve the race condition, they unfortunately introduce another error, albeit a non-fatal one.
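To illustrate the difference described above, here is a minimal sketch of running the steps strictly one after another. It relies on the documented behaviour that subprocess.call() blocks until the child exits, whereas a bare Popen() returns immediately. The helper name run_in_order and the file names in the usage comment are hypothetical, not taken from my actual program.

```python
import subprocess
import sys


def run_in_order(commands):
    """Run each command to completion before starting the next one.

    Returns the list of exit codes for the commands that were run.
    """
    return_codes = []
    for cmd in commands:
        # subprocess.call() waits for the child to exit, so a later step
        # never starts while an earlier one is still writing its output.
        code = subprocess.call(cmd)
        return_codes.append(code)
        if code != 0:
            # Stop at the first failing step instead of feeding
            # incomplete output to the next one.
            break
    return return_codes


# Hypothetical usage (file names are placeholders):
# run_in_order([
#     [sys.executable, "gen_outline.py", "data.json"],
#     [sys.executable, "json2csv.py", "data.json", "data.outline.json"],
# ])
```

Passing each command as a list (rather than a single string) avoids shell quoting issues, which matters on Windows paths containing spaces.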

Basically, my Python program calls gen_outline.py and json2csv.py four times each, once per JSON file. The resulting stack traces are below. Again, they only appear when I invoke Popen.wait() (or use call() instead), and did not occur before, when I simply spawned a new process (but risked a race condition). I am not sure what the problem is, but I would very much appreciate help. Thanks in advance!

Traceback (most recent call last):
  File "gen_outline.py", line 86, in <module>
    main()
  File "gen_outline.py", line 75, in main
    outline = make_outline(args.json_file, args.each_line, args.collection)
  File "gen_outline.py", line 50, in make_outline
    key_map = gather_key_map(iterator)
  File "gen_outline.py", line 31, in gather_key_map
    for d in iterator:
  File "gen_outline.py", line 25, in coll_iter
    data = json.load(f)
  File "C:\Python278\lib\json\__init__.py", line 290, in load
    **kw)
  File "C:\Python278\lib\json\__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "C:\Python278\lib\json\decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Python278\lib\json\decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: end is out of bounds
Traceback (most recent call last):
  File "gen_outline.py", line 86, in <module>
    main()
  File "gen_outline.py", line 75, in main
    outline = make_outline(args.json_file, args.each_line, args.collection)
  File "gen_outline.py", line 50, in make_outline
    key_map = gather_key_map(iterator)
  File "gen_outline.py", line 31, in gather_key_map
    for d in iterator:
  File "gen_outline.py", line 25, in coll_iter
    data = json.load(f)
  File "C:\Python278\lib\json\__init__.py", line 290, in load
    **kw)
  File "C:\Python278\lib\json\__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "C:\Python278\lib\json\decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Python278\lib\json\decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Unterminated string starting at: line 1 column 200693 (char 200692)
Traceback (most recent call last):
  File "gen_outline.py", line 86, in <module>
    main()
  File "gen_outline.py", line 75, in main
    outline = make_outline(args.json_file, args.each_line, args.collection)
  File "gen_outline.py", line 50, in make_outline
    key_map = gather_key_map(iterator)
  File "gen_outline.py", line 31, in gather_key_map
    for d in iterator:
  File "gen_outline.py", line 25, in coll_iter
    data = json.load(f)
  File "C:\Python278\lib\json\__init__.py", line 290, in load
    **kw)
  File "C:\Python278\lib\json\__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "C:\Python278\lib\json\decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Python278\lib\json\decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Unterminated string starting at: line 1 column 57339 (char 57338)
Traceback (most recent call last):
  File "gen_outline.py", line 86, in <module>
    main()
  File "gen_outline.py", line 75, in main
    outline = make_outline(args.json_file, args.each_line, args.collection)
  File "gen_outline.py", line 50, in make_outline
    key_map = gather_key_map(iterator)
  File "gen_outline.py", line 31, in gather_key_map
    for d in iterator:
  File "gen_outline.py", line 25, in coll_iter
    data = json.load(f)
  File "C:\Python278\lib\json\__init__.py", line 290, in load
    **kw)
  File "C:\Python278\lib\json\__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "C:\Python278\lib\json\decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Python278\lib\json\decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: end is out of bounds

I think I may have solved this. Not every Popen call in my code had its stdout and stderr parameters configured properly. I have corrected this, and it appears to work well, though I will update this issue if anything goes wrong again.
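For anyone hitting something similar, here is a minimal sketch of one way to set stdout and stderr consistently on a Popen call. It is only an illustration under assumptions, not necessarily the exact change I made: the helper name run_and_capture is hypothetical, and it assumes the child's output is wanted as captured bytes. It uses communicate() rather than wait() because the subprocess documentation warns that wait() can deadlock when a child fills a pipe that nobody is reading.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd, wait for it to finish, and return (exit_code, stdout, stderr)."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,  # capture the child's standard output
        stderr=subprocess.PIPE,  # capture the child's error output separately
    )
    # communicate() drains both pipes until the child closes them and then
    # waits for it to exit, so large outputs cannot stall the child.
    out, err = proc.communicate()
    return proc.returncode, out, err


# Hypothetical usage (file name is a placeholder):
# code, out, err = run_and_capture([sys.executable, "gen_outline.py", "data.json"])
```

If the output is not needed, leaving stdout and stderr at their defaults (inherited from the parent) and calling subprocess.call() is simpler and avoids pipes entirely.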

Thanks!