Pickling error when multiprocessing
When I tried to use fugashi for multiprocessing, I got the following error.
File "stringsource", line 2, in fugashi.fugashi.GenericTagger.__reduce_cython__
self.c_tagger cannot be converted to a Python object for pickling
The Tagger object is a C++ managed object and thus can't be pickled - pickling only works by default for pure Python objects. It's possible for fugashi to add pickling support, but because it's pretty easy to just recreate the Tagger, I haven't done that before. If someone wants to submit a PR, I would be open to it if it's not too complex.
Note that this has nothing to do with multiprocessing, and happens even in vanilla use cases.
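To illustrate the "just recreate the Tagger" idea on the user side, here is a minimal sketch of a wrapper that pickles the constructor arguments rather than the tagger itself. `PicklableTagger` and `StandInTagger` are hypothetical names, not part of fugashi; the stand-in only mimics an object that refuses to be pickled so the sketch runs without fugashi installed. With fugashi, the factory would presumably be `fugashi.Tagger` and the args something like `"-Owakati"`.

```python
import pickle


class StandInTagger:
    """Hypothetical stand-in for fugashi.Tagger: works, but cannot be pickled."""

    def __init__(self, args=""):
        self.args = args

    def __reduce__(self):
        # Mimic the error a C++-backed object raises when pickled.
        raise TypeError("cannot be converted to a Python object for pickling")

    def parse(self, text):
        # Placeholder tokenization: normalize whitespace only.
        return " ".join(text.split())


class PicklableTagger:
    """Hypothetical wrapper that pickles the constructor args, not the tagger."""

    def __init__(self, factory, *args):
        self._factory = factory          # assumption: e.g. fugashi.Tagger
        self._args = args                # assumption: e.g. ("-Owakati",)
        self._tagger = factory(*args)    # the unpicklable object lives here

    def __reduce__(self):
        # On unpickling, call PicklableTagger(factory, *args) again,
        # which builds a fresh tagger in the receiving process.
        return (type(self), (self._factory, *self._args))

    def parse(self, text):
        return self._tagger.parse(text)


if __name__ == "__main__":
    wrapped = PicklableTagger(StandInTagger, "-Owakati")
    clone = pickle.loads(pickle.dumps(wrapped))
    print(clone.parse("a   b"))
```

The wrapper never serializes tagger state, so this only makes sense when a freshly constructed tagger is equivalent to the original.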
If you want a workaround for your specific use case, I recommend saving the Tagger args you're using (if any) and creating a tagger for each process.
Thank you for your explanation and clarification. I've already used that workaround. It's OK to close this issue.