webis-de/small-text

active_learner.save('active_leaner.pkl'), can't pickle _abc_data objects

Closed this issue · 8 comments

Hi,

I've trained an active_learner object and am now trying to save it to a file.

According to the docs (https://small-text.readthedocs.io/en/latest/patterns/serialization.html),
active_learner.save('active_leaner.pkl') should work, but I get the following error:

TypeError                                 Traceback (most recent call last)
<ipython-input-79-3c088eb07e76> in <module>()
      1 
----> 2 active_learner.save(f"{DIR}/results/active_leaner.pkl")

22 frames
/usr/lib/python3.7/pickle.py in save(self, obj, save_persistent_id)
    522             reduce = getattr(obj, "__reduce_ex__", None)
    523             if reduce is not None:
--> 524                 rv = reduce(self.proto)
    525             else:
    526                 reduce = getattr(obj, "__reduce__", None)

TypeError: can't pickle _abc_data objects

As a workaround, I can extract the transformer model and save it with active_learner.classifier.model.save_pretrained(f"{directory}"), but active_learner.save() itself fails.

Hi,

thanks for reporting this issue. Unfortunately, I have not been able to reproduce the problem so far.

Which Python version are you using (which exact 3.7.x release)? And could you check which version of the dill library is installed? (Obtainable via pip freeze or python3 -c "import dill; print(dill.__version__)".)

This could be related to uqfoundation/dill#332.

Hi,

I resolved the issue by updating the dill library.

Thanks!

Thanks for the feedback! Could you tell me which dill version was installed before the update? With that information, I could adapt the requirements to prevent such problems in the future. I tried multiple dill versions on my own but never got the observed TypeError.

Edit: Besides dill, the versions of torch and transformers would probably also help to reproduce this issue.
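
A short script along these lines could collect all three versions in one go. It is only a sketch: it assumes Python 3.8+ (for importlib.metadata; on 3.7 the `__version__` one-liner above is the simpler route), and the helper names installed_version and collect_versions are made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def collect_versions(packages=("dill", "torch", "transformers")):
    """Map each package name to its installed version (None when missing)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    # Print one line per package, in a form that can be pasted into the issue.
    for name, found in collect_versions().items():
        print(f"{name}=={found}" if found else f"{name}: not installed")
```

This reads the package metadata instead of importing the packages, so it stays fast even when torch is installed.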

It was version 0.3.3, now using 0.3.4.

Still no success in getting this exact TypeError, but this example here also fails for me:
https://stackoverflow.com/questions/60583118/serialize-subclass-of-abstract-class-in-python-3-7-1
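
For reference, the object named in the error message can be reproduced with the standard library alone. The sketch below assumes CPython, where ABCMeta attaches an internal `_abc_impl` attribute to each class; the class names BaseClassifier and MyClassifier are hypothetical. It does not use dill at all. It only shows that this internal object cannot be pickled, which is presumably what a serializer runs into when it tries to store an abstract class by value rather than by reference.

```python
import pickle
from abc import ABC, abstractmethod


class BaseClassifier(ABC):  # hypothetical abstract base class
    @abstractmethod
    def predict(self, x):
        ...


class MyClassifier(BaseClassifier):  # hypothetical concrete subclass
    def predict(self, x):
        return x


def pickle_error(obj):
    """Return the TypeError message raised when pickling `obj`, or None on success."""
    try:
        pickle.dumps(obj)
    except TypeError as exc:
        return str(exc)
    return None


# CPython-specific: every ABCMeta class carries an internal `_abc_impl` object.
abc_state = MyClassifier._abc_impl
message = pickle_error(abc_state)
print(type(abc_state).__name__, "->", message)
```

The exact wording of the message differs between Python versions, but it mentions `_abc_data` in each case.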

Also, I could not figure out why neither my test cases nor my active learning script raised any error.

There is a chance that this might be fixed in the dill library soon.
uqfoundation/dill#427

I will leave this open for now, as this will require fixing or updating the dependency soon.

Update:
The example linked above still fails with dill==0.3.5.1.

Probably this is the relevant issue:
uqfoundation/dill#450

Seems to work with dill==0.3.7 (published just 3 days ago; 0.3.6 did not work either). Now, this can be fixed after all that time 😎.
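
Until the requirements are updated, a guard like the following could warn about an older dill before saving. This is a rough sketch: the 0.3.7 threshold is taken only from the observation above about the linked example, and parse_version and meets_minimum are hypothetical helpers, not part of small-text or dill.

```python
def parse_version(text):
    """Turn '0.3.5.1' into (0, 3, 5, 1); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum="0.3.7"):
    """Check whether the `installed` version string is at least `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


# Versions mentioned in this thread, checked against the assumed threshold.
for candidate in ("0.3.3", "0.3.4", "0.3.5.1", "0.3.6", "0.3.7"):
    print(candidate, meets_minimum(candidate))
```

In practice the installed version string would come from `dill.__version__`; pre-release suffixes are simply ignored by this parser.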

This should be resolved now. If anyone encounters this issue again, feel free to reopen.