[BUG] spacy_ann create_index failed
tungwini opened this issue · 1 comment
Describe the bug
I tried to run the tutorial from here, but
spacy_ann create_index en_core_web_md kb_dir/ models_dir/ failed with the following error (full traceback under "To Reproduce"); the relevant frames are:

  File "/opt/conda/lib/python3.7/site-packages/spacy_ann/candidate_generator.py", line 336, in <lambda>
    p.with_suffix(".json"), self.short_aliases
  File "/opt/conda/lib/python3.7/site-packages/srsly/_json_api.py", line 74, in write_json
    json_data = json_dumps(data, indent=indent)
  File "/opt/conda/lib/python3.7/site-packages/srsly/_json_api.py", line 26, in json_dumps
    result = ujson.dumps(data, indent=indent, escape_forward_slashes=False)
TypeError: {'NLP', 'OS', 'ML'} is not JSON serializable

Tracing the code, to_disk passes a set (self.short_aliases) to json_dumps(), and sets are not JSON serializable.
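A minimal sketch of the underlying problem, using the stdlib json module (which rejects sets the same way ujson does here); the name short_aliases is taken from the traceback, and the list conversion is one plausible fix, not necessarily what srsly 2.x actually does:

```python
import json

# A set of aliases, shaped like self.short_aliases in the traceback
short_aliases = {"NLP", "OS", "ML"}

# Sets are not JSON serializable, so dumping one raises TypeError
try:
    json.dumps(short_aliases)
except TypeError as e:
    print(f"TypeError: {e}")

# Converting the set to a sorted list first serializes cleanly
print(json.dumps(sorted(short_aliases)))  # prints ["ML", "NLP", "OS"]
```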
To Reproduce
Follow the steps in the tutorial from here:
- pip install spacy-ann-linker
- spacy_ann example_data ./kb_dir
- spacy download en_core_web_md
- spacy_ann create_index en_core_web_md kb_dir/ models_dir/ (fails with the error below)
It outputs:
================================= Load Model =================================
⠙ Loading model en_core_web_md
============================ Apply EntityEncoder ============================
⠙ Applying EntityEncoder to descriptions
============================== Create ANN Index ==============================
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
Traceback (most recent call last):
  File "/opt/conda/bin/spacy_ann", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.7/site-packages/spacy_ann/cli/__init__.py", line 24, in main
    typer.run(commands[command])
  File "/opt/conda/lib/python3.7/site-packages/typer/main.py", line 855, in run
    app()
  File "/opt/conda/lib/python3.7/site-packages/typer/main.py", line 214, in __call__
    return get_command(self)(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/typer/main.py", line 497, in wrapper
    return callback(**use_params)  # type: ignore
  File "/opt/conda/lib/python3.7/site-packages/spacy_ann/cli/create_index.py", line 113, in create_index
    nlp.to_disk(output_dir)
  File "/opt/conda/lib/python3.7/site-packages/spacy/language.py", line 927, in to_disk
    util.to_disk(path, serializers, exclude)
  File "/opt/conda/lib/python3.7/site-packages/spacy/util.py", line 681, in to_disk
    writer(path / key)
  File "/opt/conda/lib/python3.7/site-packages/spacy/language.py", line 925, in <lambda>
    serializers[name] = lambda p, proc=proc: proc.to_disk(p, exclude=["vocab"])
  File "/opt/conda/lib/python3.7/site-packages/spacy_ann/ann_linker.py", line 199, in to_disk
    self.cg.to_disk(path)
  File "/opt/conda/lib/python3.7/site-packages/spacy_ann/candidate_generator.py", line 347, in to_disk
    to_disk(path, serializers, {})
  File "/opt/conda/lib/python3.7/site-packages/spacy/util.py", line 681, in to_disk
    writer(path / key)
  File "/opt/conda/lib/python3.7/site-packages/spacy_ann/candidate_generator.py", line 336, in <lambda>
    p.with_suffix(".json"), self.short_aliases
  File "/opt/conda/lib/python3.7/site-packages/srsly/_json_api.py", line 74, in write_json
    json_data = json_dumps(data, indent=indent)
  File "/opt/conda/lib/python3.7/site-packages/srsly/_json_api.py", line 26, in json_dumps
    result = ujson.dumps(data, indent=indent, escape_forward_slashes=False)
TypeError: {'NLP', 'OS', 'ML'} is not JSON serializable
But I expected it to output:
...
Fitting ann index took xxx seconds
Environment
- OS: Amazon Linux AMI 2018.03
- Python version: 3.7.10
Additional context
spacy-ann-linker==0.3.3
spacy==2.3.7
nmslib==2.0.5
scikit-learn==0.21.3
srsly==1.0.5
This seems to be resolvable by running pip install srsly==2.0.0, as per this comment in Issue #6.