uqfoundation/dill

pydantic>=2.5 classes can't be serialized

isidentical opened this issue · 4 comments

import dill
from pydantic import BaseModel, Field


dill.settings["recurse"] = True


class Input(BaseModel):
    prompt: str = Field(
        ..., title="Prompt", description="The prompt to use for the completion."
    )
    num_inference_steps: int = Field(
        default=25,
        ge=20,
        le=100,
        title="Number of Inference Steps",
        description="The number of inference steps to take for each prompt.",
    )


Input2 = dill.loads(dill.dumps(Input))
print(Input2(prompt="test", num_inference_steps=25))

The same example works with cloudpickle:

import cloudpickle
from pydantic import BaseModel, Field


class Input(BaseModel):
    prompt: str = Field(
        ..., title="Prompt", description="The prompt to use for the completion."
    )
    num_inference_steps: int = Field(
        default=25,
        ge=20,
        le=100,
        title="Number of Inference Steps",
        description="The number of inference steps to take for each prompt.",
    )


Input2 = cloudpickle.loads(cloudpickle.dumps(Input))
print(Input2(prompt="test", num_inference_steps=25))

I also encountered this issue. I'm not sure whether the problem is on the dill or the pydantic side, although pydantic.BaseModel subclasses can be serialized and deserialized with pickle and cloudpickle.

This is the minimal code required to reproduce the error with dill==0.3.8, pydantic==2.7.0 and pydantic_core==2.18.1:

import dill
from pydantic import BaseModel


class MyModel(BaseModel):
    pass


dill.loads(dill.dumps(MyModel()))

and the error:

/Users/gabrielmbmb/Source/Argilla/distilabel/.venv/lib/python3.11/site-packages/dill/_dill.py:414: PicklingWarning: Cannot locate reference to <class '__main__.MyModel'>.
  StockPickler.save(self, obj, save_persistent_id)
/Users/gabrielmbmb/Source/Argilla/distilabel/.venv/lib/python3.11/site-packages/dill/_dill.py:414: PicklingWarning: Cannot pickle <class '__main__.MyModel'>: __main__.MyModel has recursive self-references that trigger a RecursionError.
  StockPickler.save(self, obj, save_persistent_id)
Traceback (most recent call last):
  File "/Users/gabrielmbmb/Source/Argilla/distilabel/test_step_decorator.py", line 13, in <module>
    dill.loads(dill.dumps(MyModel()))
  File "/Users/gabrielmbmb/Source/Argilla/distilabel/.venv/lib/python3.11/site-packages/dill/_dill.py", line 303, in loads
    return load(file, ignore, **kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gabrielmbmb/Source/Argilla/distilabel/.venv/lib/python3.11/site-packages/dill/_dill.py", line 289, in load
    return Unpickler(file, ignore=ignore, **kwds).load()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gabrielmbmb/Source/Argilla/distilabel/.venv/lib/python3.11/site-packages/dill/_dill.py", line 444, in load
    obj = StockUnpickler.load(self)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gabrielmbmb/Source/Argilla/distilabel/.venv/lib/python3.11/site-packages/dill/_dill.py", line 593, in _create_type
    return typeobj(*args)
           ^^^^^^^^^^^^^^
  File "/Users/gabrielmbmb/Source/Argilla/distilabel/.venv/lib/python3.11/site-packages/pydantic/_internal/_model_construction.py", line 93, in __new__
    private_attributes = inspect_namespace(
                         ^^^^^^^^^^^^^^^^^^
  File "/Users/gabrielmbmb/Source/Argilla/distilabel/.venv/lib/python3.11/site-packages/pydantic/_internal/_model_construction.py", line 406, in inspect_namespace
    raise PydanticUserError(
pydantic.errors.PydanticUserError: A non-annotated attribute was detected: `model_fields = {}`. All model fields require a type annotation; if `model_fields` is not meant to be a field, you may be able to resolve this error by annotating it as a `ClassVar` or updating `model_config['ignored_types']`.

For further information visit https://errors.pydantic.dev/2.7/u/model-field-missing-annotation

This only occurs if the pydantic.BaseModel subclass has been declared in __main__. If it's declared in another module, then everything works.
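Here is a stdlib-only sketch of one plausible reading of the traceback above. The PicklingWarning says dill cannot locate a reference to `__main__.MyModel`, and the traceback shows `_create_type` calling `typeobj(*args)`, which suggests the class is rebuilt by calling its metaclass on the saved class dictionary. If the metaclass had added `model_fields` to the class after creating it, the rebuilt namespace would contain that attribute and the metaclass would reject it. `StrictMeta` and `rebuild_by_value` are hypothetical stand-ins written for illustration; they are not pydantic or dill code.

```python
class StrictMeta(type):
    """Toy validating metaclass (hypothetical stand-in, not pydantic's)."""

    def __new__(mcls, name, bases, namespace):
        # Reject plain data attributes found in the class body.
        for key, value in namespace.items():
            is_dunder = key.startswith("__") and key.endswith("__")
            if not is_dunder and not callable(value):
                raise TypeError(
                    f"non-annotated attribute detected: {key} = {value!r}"
                )
        cls = super().__new__(mcls, name, bases, namespace)
        # Bookkeeping attribute added AFTER validation of the class body.
        cls.model_fields = {}
        return cls


class MyModel(metaclass=StrictMeta):
    pass


def rebuild_by_value(cls):
    """Recreate a class from its name, bases and class dict (assumed
    to approximate what a by-value class pickle does on load)."""
    namespace = {
        k: v
        for k, v in cls.__dict__.items()
        if k not in ("__dict__", "__weakref__")
    }
    return type(cls)(cls.__name__, cls.__bases__, namespace)


try:
    rebuild_by_value(MyModel)
    error = None
except TypeError as exc:
    # The class dict now carries model_fields, which the metaclass rejects.
    error = str(exc)
    print("rebuild failed:", error)
```

A class that lives in an importable module can be pickled by reference (only its qualified name is stored), so no rebuild of this kind is needed, which would be consistent with the observation that the error disappears outside `__main__`.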

dill doesn't explicitly support pickling of pydantic classes, but I can help figure out if there's a patch to be applied in dill (due to something in the standard library), or in pydantic, or elsewhere.

If earlier versions of dill, pydantic, etc. were able to serialize a BaseModel instance, then one easy thing to do is to walk back over commits and see which commit corresponds to the change in behavior. Also, dill provides a serialization trace that follows the recursive pickling process, so it's helpful to debug a serialization failure with dill.detect.trace(True) turned on. I can help decipher what the trace is telling you.
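A minimal sketch of turning the trace on around a single `dumps` call, assuming dill is installed. The dict below is only a placeholder so the snippet runs on its own; substitute the model class or instance that fails to serialize.

```python
import dill

# Placeholder object (an assumption for illustration); replace it with
# the pydantic model class or instance under investigation.
obj = {"prompt": "test", "num_inference_steps": 25}

dill.detect.trace(True)  # log each step of the recursive pickling process
try:
    payload = dill.dumps(obj)
finally:
    dill.detect.trace(False)  # switch the trace off again

restored = dill.loads(payload)
print(restored)
```

The trace is emitted through logging to stderr, so it can be captured separately from the program's normal output and pasted into the issue.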

Thanks for the info @mmckerns. I'll do some more tests and try to figure out what's happening.

Also facing this issue — we use pydantic for our run configs in ML experiments, and pickling is essential since we do distributed training. Any updates would be super helpful! Thank you