Deep dive on using Django REST Framework ModelSerializer to read, create and update model relations
Postman is used for developing the APIs.
In Part I, we are going to set up a few Django models for us to play with, and then set up a couple of REST APIs using ModelSerializers; nothing special going on here if you are already familiar with how to use DRF generic views.
In the center of our ERD, the VehicleModel
model has the following relations:
- One-to-one relation with
Project
- Many-to-one (foreign key) relation with
Manufacturer
- Many-to-one (foreign key) relation with
VehicleModel
(itself) - Many-to-many relation with
Engine
As well as the following Django reverse relations:
- Many-to-one (foreign key) relation from
Vehicle
- Many-to-many relation from
Engineer
See carmaker.models for details on the model set up.
For this part, I've also set up a couple of views using Django REST Framework generic views with ModelSerializers. Because I want to expand on the APIs later, I'm putting the APIs in Part I in their own module called api_1.
With a standard ModelSerializer
for VehicleModel
such as:
class VehicleModelSerializer(ModelSerializer):
class Meta:
model = VehicleModel
fields = "__all__"
and a standard ListCreateAPIView
:
class VehicleModelListCreateView(ListCreateAPIView):
permission_classes = [AllowAny]
queryset = VehicleModel.objects.all()
serializer_class = VehicleModelSerializer
We can start testing these APIs.
We can now call the listing endpoint using:
curl --location --request GET 'http://localhost:8000/api_1/'
you can also use this Postman example .
and get the following response:
[
{
"id": 1,
"model": "Buick D-35 Roadster",
"year": 1917,
"project": 1,
"maker": 1,
"predecessor": null,
"engine_options": [
1,
2
]
}
]
We can make the following observations:
TLDR DRF
ModelSerializer
's basic read behavior:
- Returns Django model attributes as they are defined on ORM model, including auto fields
- Returns value of appropriate type as defined by the ORM model field
- Includes all relations declared on the ORM model
- The related instances are returned as their primary keys
- Does not include any Django reverse relations
Similarly, we can call the same endpoint to create a new instance:
curl --location --request POST 'http://localhost:8000/api_1/' \
--header 'Content-Type: application/json' \
--data-raw '{
"model": "Buick D-35 Roadster 2",
"year": 1919,
"project": null,
"maker": 1,
"predecessor": 1,
"engine_options": [
1,
2
]
}'
you can also use this Postman example .
You can play around with this request data, and try to include some additional related fields, and make the following observations:
TLDR DRF
ModelSerializer
's basic write behavior:
- Relations can be made using existing related instance primary key
- Django reverse relations are ignored
- Unrecognized and read-only attributes are ignored
Part II example code can be found in module api_2.
A common way of including related instance data is through nesting of the serializers:
class VehicleModelSerializer(ModelSerializer):
project = ProjectSerializer()
maker = ManufacturerSerializer()
engine_options = EngineSerializer(many=True)
vehicle_set = VehicleSerializer(many=True) # reverse relation 'vehicle_set'
engineers_responsible = EngineerSerializer(many=True, source="engineer_set") # reverse relation 'engineer_set'
class Meta:
model = VehicleModel
fields = "__all__"
With this serializer, we can hit
the listing endpoint
again. Note that the reverse relations will even work as
long as the reverse attribute name declared on the serializer matches what is on the ORM model, or
initialized with the source
argument pointing to a matching ORM model attribute.
You should see the response data containing a list of objects like the following:
{
"id": 1,
"project": {
"id": 1,
"code_name": "project-d-35-roadster"
},
"maker": {
"id": 1,
"name": "Buick"
},
"engine_options": [
{
"id": 1,
"name": "Model D Inline-4",
"displacement": 2.7
},
{
"id": 2,
"name": "Chevrolet Inline-4",
"displacement": 2.8
}
],
"vehicle_set": [
{
"id": 1,
"VIN": "A123456789",
"model": 1
}
],
"engineers_responsible": [
{
"id": 1,
"name": "Yoshida",
"works_on": [
1
]
}
],
"model": "Buick D-35 Roadster",
"year": 1917,
"predecessor": null
}
TLDR DRF nested serializer read behavior:
- Related instance nested serializer must be initialized with
many=True
if there are more than one instance expected- Django ORM reverse relations also work if matching model attribute name or specified by
source
argument
I think creating and updating related instance is probably not good RESTful design, but sometimes we may be asked to do so because the related model may be very small.
Interestingly, if we try to call the creation endpoint, DRF will complain about not sure what to do with the related model data.
To allow writing to related models, we will need to overwrite the VehicleModelSerializer.to_internal_value()
method
so that the related fields provided by the client is mapped back to related model instances:
class VehicleModelSerializer(ModelSerializer):
project = ProjectSerializer()
maker = ManufacturerSerializer()
engine_options = EngineSerializer(many=True)
vehicle_set = VehicleSerializer(many=True, read_only=True)
engineers_responsible = EngineerSerializer(many=True, source="engineer_set")
def to_internal_value(self, data):
new_data = super().to_internal_value(data)
new_data["maker"] = Manufacturer.objects.get(**new_data["maker"])
engine_options_q = Q()
for engine in new_data["engine_options"]:
engine_options_q |= Q(**engine)
new_data["engine_options"] = Engine.objects.filter(engine_options_q)
engineer_set_q = Q()
for engineer in new_data["engineer_set"]:
engineer_set_q |= Q(**engineer)
new_data["engineer_set"] = Engineer.objects.filter(engineer_set_q)
return new_data
and with this, we can call
the creation API
with the following data to create a new instance of VehicleModel
as
well
as the associated Project
instance, relations to existing Manufacturer
, Engine
and Engineer
instances will be
made:
{
"project": {
"code_name": "project-d-35-roadster-3"
},
"maker": {
"name": "Buick"
},
"engine_options": [
{
"name": "Model D Inline-4",
"displacement": 2.7
},
{
"name": "Chevrolet Inline-4",
"displacement": 2.8
}
],
"vehicle_set": [
{
"VIN": "A123456789",
"model": 1
}
],
"engineers_responsible": [
{
"name": "Yoshida"
}
],
"model": "Buick D-35 Roadster",
"year": 1917,
"predecessor": null
}
Note that in this example, we are only handling the fields where we want the API to retrieve existing related instances,
which are 'maker', 'engine_options', and 'engineers_responsible'. We did not handle any fields where the API may be
required to create new instances of related model such as 'project'. (Project
and VehicleModel
have one-to-one
relation, so we are pretending that one of the API requirements is that when creating a new VehicleModel
, also
create a Project
at the same time).
We could have created a new instance of Project
in .to_internal_value
method, but we elect not to do it here in case
there is validation issue with the data from client later on, and we wouldn't want a Project
instance to be created
before data validation completes. Instead, we will handle related object creation by overwriting the .create()
method:
class VehicleModelSerializer(ModelSerializer):
... # omitted, see above
def create(self, validated_data):
validated_data["project"] = Project.objects.create(**validated_data.pop("project"))
instance = super().create(validated_data)
return instance
Now, you may have noticed that the 'vehicle_set' attribute from the serializer is not being handled, because we have
initialized this field as a read-only field, therefore the data is stripped by the ModelSerializer.to_internal_value()
method. This would be based on the specific API design requirements.
Similar to the handling of creation, we may be asked to handle updating of the related model as well, it can be done in
very similar fashion by overwriting the .update()
method:
class VehicleModelSerializer(ModelSerializer):
... # omitted, see above
def update(self, instance, validated_data):
project_data = validated_data.pop("project", None)
if project_data is not None and instance.project.code_name != project_data["code_name"]:
instance.project.delete()
validated_data["project"] = Project.objects.create(**project_data)
return super().update(instance, validated_data)
Note that we are checking project_data is not None
, this may or may not be the desired behavior based on API design
requirement. You may be asked to delete the related Project
if the client passes null
value or empty string, or not
do anything with it, etc. Some requirements may not be good RESTful API design, but it can happen.
By now you might have noticed that you cannot update a VehicleModel
instance with the same Project
, but you can
update it to new Project
with different 'code name'. This is because Project
model's 'code name' field has unique
constraint, and .to_internal_value()
will validate client data against field level validation which includes ORM
model field arguments.
If for some reason you need to allow client to specify the Project
's 'code name' again in
the body (such as in the case of a 'PUT' request where you are essentially replacing the current instance), you may
consider bypassing some of the data from super().to_internal_value()
:
class VehicleModelSerializer(ModelSerializer):
...
def to_internal_value(self, data):
project_data = data.pop("project", None)
new_data = super().to_internal_value(data)
if project_data:
new_data["project"] = project_data
...
return new_data
Note the project data is popped from data
to circumvent being pushed into super().to_internal_value()
, and then
added back to the new_data
before return.
Another key thing to note is that if you put a debug breakpoint inside of the .to_internal_value()
method, you would
see that the data
passed in has been changed by super().to_internal_value(data)
(aka new_data
in our example).
While data
has key 'engineers_responsible' which is how the client passes in, new_data
change it to key
'engineer_set'. This is one of DRF ModelSerializer.to_internal_value()
method's responsibility: to map raw data to
ORM field data.
Some takeaways:
TLDR DRF nested serializer write behavior:
.to_internal_value()
method is called before.validate()
.to_internval_value()
method validates against field-level constraints set by ORM model fields.to_internal_value()
method maps raw client data to ORM field data, which may change data keys if a field's name is different from itssource
- when nesting a related model serializer as a serializer field, we need to take care of mapping them back to Django ORM model instances, it is a good idea to:
- Retrieve existing relations inside
.to_internal_value()
- Create/update new related instances inside
.create()
/.update()
Part III example code can be found in module api_3.
Often times we are asked to design an endpoint for a particular model that includes some attributes of a related model, and in away that the related data appears to be attributes of this model as far as the client is concerned. This can be implemented in a number of different ways with DRF, but I am only going to focus on achieving two-way data binding - so it works consistently in both read and write operations using a single mechanism.
For this example, let's hoist the related Project
instance's code_name
attribute to our VehicleModel
serializer
so that it would appears as an attribute of the VehicleModel
. We are going to use a custom field:
class ProjectCodeNameField(RelatedField):
queryset = Project.objects.all()
def to_internal_value(self, data):
return data
def to_representation(self, value):
return getattr(value, "code_name")
and now we can use this field on our serializer for VehicleModel
:
class VehicleModelSerializer(ModelSerializer):
project_code_name = ProjectCodeNameField(source="project")
class Meta:
model = VehicleModel
fields = "__all__"
We can hit
the listing endpoint
again, and you should see the response containing list of VehicleModel
objects
such
as this:
{
"id": 1,
"project_code_name": "project-d-35-roadster-x1",
"model": "Buick D-35 Roadster2",
"year": 1917,
"project": 4,
"maker": 1,
"predecessor": null,
"engine_options": [
1,
2
]
}
where the related Project.code_name
has been made available as 'project_code_name'.
Similar to what we did in Part II, we need to handle related model instance creation by overwriting .create()
method:
class VehicleModelSerializer(ModelSerializer):
maker = ManufacturerNameField()
project_code_name = ProjectCodeNameField(source="project")
def to_internal_value(self, data):
new_data = super().to_internal_value(data)
new_data["maker"] = Manufacturer.objects.get(name=new_data["maker"])
return new_data
def create(self, validated_data):
if validated_data.get("project"):
validated_data["project"] = Project.objects.create(code_name=validated_data["project"])
return super().create(validated_data)
class Meta:
model = VehicleModel
fields = "__all__"
I also intentionally added another field maker
here to demonstrate how the it is handled a little bit
different from project_code_name
field. The maker
field is pointing to the related Manufacturer
instance's 'name',
while the project_code_name
is point to the related Project
instance's 'code_name'. Just like what we did earlier in
Part II, we are assuming the API requirement dictates that it only needs to associate the VehicleModel
instance
to be created with an existing Manufacturer
instance, but associated with a new Project
instance (because of
the one-to-one relationship).
For this reason, we are mapping the maker
field to an existing Manufacturer
instance inside .to_internal_value()
,
while creating a new Project
instance inside the .create()
.
Some takeaways regarding accessing related model data directly:
TLDR DRF serializer accessing related model data:
- subclass
restframework.serializers.RelatedField
for accessing an attribute of a related model, this allows for reading and writing using the same mechanism- whenever writing to related model, we need to handle write behavior, it is a good idea to:
- Retrieve existing relations inside
.to_internal_value()
- Create/update new related instances inside
.create()
/.update()