BUG: record_api.line_counts raises TypeError on unexpected keyword arguments to dumps()
datapythonista opened this issue · 5 comments
I'm getting the following error when calling record_api.line_counts. I'm using version 1.1.1.
/tmp/record_api_results.jsonl contains the output generated by python -m record_api. Here is a sample:
$ head -n 5 record_api_results.jsonl
{"location":"/api_stats/scripts/10002306.py:22","function":{"t":"builtin_function_or_method","v":"getattr"},"params":{"args":[{"t":"module","v":"pandas"},"read_csv"]}}
{"location":"/api_stats/scripts/10002306.py:22","function":{"t":"function","v":{"module":"pandas.io.parsers","name":"_make_parser_function.<locals>.parser_f"}},"bound_params":{"pos_or_kw":[["filepath_or_buffer","../input/train.csv"],["index_col",null]]}}
{"location":"/api_stats/scripts/10002306.py:23","function":{"t":"builtin_function_or_method","v":"getattr"},"params":{"args":[{"t":{"module":"pandas.core.frame","name":"DataFrame"}},"shape"]}}
{"location":"/api_stats/scripts/10002306.py:25","function":{"t":"method","v":{"self":{"t":{"module":"pandas.core.frame","name":"DataFrame"}},"name":"head"}},"bound_params":{}}
{"location":"/api_stats/scripts/10002306.py:27","function":{"t":"builtin_function_or_method","v":"getattr"},"params":{"args":[{"t":"module","v":"pandas"},"read_csv"]}}
Call to line_counts:
export PYTHON_RECORD_API_INPUT=/tmp/record_api_results.jsonl
export PYTHON_RECORD_API_OUTPUT=/tmp/record_api_results_line_counts.jsonl
python -m record_api.line_counts
Result:
Counting lines...
reading /tmp/record_api_results.jsonl: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 478985/478985 [00:02<00:00, 217632.24it/s]
writing: 0%| | 0/13169 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/mgarcia/miniconda3/envs/pydata/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/mgarcia/miniconda3/envs/pydata/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/mgarcia/miniconda3/envs/pydata/lib/python3.7/site-packages/record_api/line_counts.py", line 42, in <module>
    __main__()
  File "/home/mgarcia/miniconda3/envs/pydata/lib/python3.7/site-packages/record_api/line_counts.py", line 38, in __main__
    write(row_)
  File "/home/mgarcia/miniconda3/envs/pydata/lib/python3.7/site-packages/record_api/jsonl.py", line 45, in write_line
    buffer.write(orjson.dumps(o, **kwargs))
TypeError: dumps() got an unexpected keyword argument
Do you have the same orjson version? I could pin the later one if that's the issue:
$ pip freeze | grep orjson
orjson==3.0.1
I had orjson==3.2, but I downgraded to 3.0.1 and still have the same problem.
There is something very weird I don't understand:
Python 3.7.6 (default, Jan 8 2020, 19:59:22)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import orjson
>>> orjson.dumps({})
b'{}'
>>> orjson.dumps({}, *[])
b'{}'
>>> orjson.dumps({}, *[], **{})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: dumps() got an unexpected keyword argument
For now I'm going to remove the kwargs from the call, since it seems that in this case they're not used anyway.
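A less invasive variant of that workaround might be to keep the kwargs but forward them only when there are any, so the `**{}` call form is never used. This is just a minimal sketch, not the actual code in record_api/jsonl.py: the `dumps` parameter and `_stand_in_dumps` are hypothetical, and the stand-in uses the standard library's json so the snippet runs without orjson installed.

```python
import io
import json


def _stand_in_dumps(obj, **kwargs):
    # Hypothetical stand-in for orjson.dumps (assumption): returns bytes,
    # like orjson does, but is built on the standard library.
    return json.dumps(obj, **kwargs).encode()


def write_line(buffer, o, dumps=_stand_in_dumps, **kwargs):
    """Serialize `o` and write the bytes to `buffer`."""
    if kwargs:
        data = dumps(o, **kwargs)
    else:
        # No keyword arguments were given, so call dumps() without
        # unpacking an empty dict.
        data = dumps(o)
    buffer.write(data)


# Usage: no kwargs takes the plain dumps(o) path.
buf = io.BytesIO()
write_line(buf, {"a": 1})
```

In real use, `dumps=orjson.dumps` would be passed in place of the stand-in.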
That looks really weird... We do use it in one place, to pass the default kwarg that changes how some things are serialized, I believe...
That seems like some sort of Python bug? Or an issue with your python install? I don't understand how f() and f(**{}) would be different.
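For reference, this is the behaviour I'd expect, sketched with a plain Python function (`f` here is hypothetical). It does not reproduce the error above, which came from orjson's compiled dumps(); it only shows that the call forms are normally indistinguishable to the callee.

```python
def f(*args, **kwargs):
    """Return the arguments exactly as the callee receives them."""
    return args, kwargs


# All three call forms hand the callee the same (empty) arguments.
no_args = f()
empty_kwargs = f(**{})
empty_both = f(*[], **{})

assert no_args == empty_kwargs == empty_both
```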
I upgraded to the latest Python version, and it's working. It's weird that Python has this bug, but who knows... Closing, it works now.