Dataclasses JSON

This library provides a simple API for encoding and decoding dataclasses to and from JSON.

It's recursive (see caveats below), so you can easily work with nested dataclasses. In addition to the types covered by the Python-to-JSON conversion table of the standard json module, this library supports the following:

Any arbitrary Collection type. Mapping types are encoded as JSON objects and str types as JSON strings. Any other Collection type is encoded into a JSON array, but decoded back into the original collection type.
datetime objects. These are encoded to float (JSON number) using timestamp(). As specified in the datetime docs, a naive datetime object is assumed to be in your system local timezone when .timestamp() is called. JSON numbers corresponding to a datetime field in your dataclass are decoded into an aware datetime object, with tzinfo set to your system local timezone. Thus, if you encode a naive datetime, you will decode an aware one. This matters because encoding and decoding are then not strict inverses.
UUID objects. They are encoded as str (JSON string).
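
The datetime caveat above can be reproduced with the standard library alone. The following is a minimal sketch of the round trip through a timestamp; it does not call this library, and the variable names are illustrative only.

```python
from datetime import datetime

# A naive datetime: no tzinfo attached.
naive = datetime(2020, 1, 1, 12, 0, 0)

# .timestamp() interprets a naive datetime as system local time.
ts = naive.timestamp()

# Rebuild an aware datetime in the system local timezone from the number.
decoded = datetime.fromtimestamp(ts).astimezone()

print(naive.tzinfo)                 # None
print(decoded.tzinfo is not None)   # True
print(decoded == naive)             # False: aware and naive never compare equal
print(decoded.timestamp() == ts)    # True: the instant itself is preserved
```

If you need encoding and decoding to be inverses, one option is to store only aware datetime objects in your dataclasses.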

The latest release is compatible with both Python 3.7 and Python 3.6 (with the dataclasses backport).

Quickstart

pip install dataclasses-json

Approach 1: Class decorator

from dataclasses import dataclass
from dataclasses_json import dataclass_json

@dataclass_json
@dataclass
class Person:
    name: str

lidatong = Person('lidatong')

# Encoding to JSON
lidatong.to_json()  # '{"name": "lidatong"}'

# Decoding from JSON
Person.from_json('{"name": "lidatong"}')  # Person(name='lidatong')

Note that the @dataclass_json decorator must be stacked above the @dataclass decorator (order matters!).

Approach 2: Inherit from a mixin

from dataclasses import dataclass
from dataclasses_json import DataClassJsonMixin

@dataclass
class Person(DataClassJsonMixin):
    name: str

lidatong = Person('lidatong')

# A different example from Approach 1 above, but usage is exactly the same
assert Person.from_json(lidatong.to_json()) == lidatong

Pick whichever approach suits your taste. The differences in implementation are invisible in usage.

How do I...

Use my dataclass with JSON arrays or objects?

from dataclasses import dataclass
from dataclasses_json import dataclass_json

@dataclass_json
@dataclass
class Person:
    name: str

Encode into a JSON array containing instances of my Data Class

people = [Person('lidatong')]  # a list of instances, not JSON yet
Person.schema().dumps(people, many=True)  # '[{"name": "lidatong"}]'

Decode a JSON array containing instances of my Data Class

people_json = '[{"name": "lidatong"}]'
Person.schema().loads(people_json, many=True)  # [Person(name='lidatong')]

Encode as part of a larger JSON object containing my Data Class (e.g. an HTTP request/response)

import json

person_dict = Person.schema().dump(Person('lidatong'))

response_dict = {
    'response': {
        'person': person_dict
    }
}

response_json = json.dumps(response_dict)

This takes two steps. First, we encode the dataclass into a Python dictionary rather than a JSON string, using .schema() and .dump(...); a later section covers this in more detail.

Second, we leverage the built-in json.dumps to serialize the enclosing dictionary into a JSON string.

Decode as part of a larger JSON object containing my Data Class (e.g. an HTTP response)

import json

response_dict = json.loads('{"response": {"person": {"name": "lidatong"}}}')

person_dict = response_dict['response']['person']

person = Person.schema().load(person_dict)

In a similar vein to encoding above, we leverage the built-in json module.

First, call json.loads to read the entire JSON object into a dictionary. We then look up the value holding the encoded dict of the Person we want to decode (response_dict['response']['person']).

Second, we load in the dictionary using Person.schema().load.

Encode or decode into Python lists/dictionaries rather than JSON?

This can be done by calling .schema() and then using the corresponding encoder/decoder methods, i.e. .load(...)/.dump(...).

Encode into a single Python dictionary

person = Person('lidatong')
Person.schema().dump(person)  # {"name": "lidatong"}

Encode into a list of Python dictionaries

people = [Person('lidatong')]
Person.schema().dump(people, many=True)  # [{"name": "lidatong"}]

Decode a dictionary into a single dataclass instance

person_dict = {"name": "lidatong"}
Person.schema().load(person_dict)  # Person(name='lidatong')

Decode a list of dictionaries into a list of dataclass instances

people_dicts = [{"name": "lidatong"}]
Person.schema().load(people_dicts, many=True)  # [Person(name='lidatong')]

Handle missing or optional field values when decoding?

By default, any fields in your dataclass that use default or default_factory will have the values filled with the provided default, if the corresponding field is missing from the JSON you're decoding.

Decode JSON with missing field

@dataclass_json
@dataclass
class Student:
    id: int
    name: str = 'student'

# from_json expects a JSON string, not a dict
Student.from_json('{"id": 1}')  # Student(id=1, name='student')

Notice from_json filled the field name with the specified default 'student' when it was missing from the JSON.

Sometimes you have fields that are typed as Optional, but you don't necessarily want to assign a default. In that case, you can use the infer_missing kwarg to make from_json infer the missing field value as None.

Decode optional field without default

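from typing import Optional

@dataclass_json
@dataclass
class Tutor:
    id: int
    student: Optional[Student]

# infer_missing=True fills the absent Optional field with None
Tutor.from_json('{"id": 1}', infer_missing=True)  # Tutor(id=1, student=None)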

Personally I recommend you leverage dataclass defaults rather than using infer_missing, but if for some reason you need to decouple the behavior of JSON decoding from the field's default value, this will allow you to do so.
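
To make the two behaviors concrete, here is a rough standard-library sketch of the decision made for each missing field. decode_with_defaults is a hypothetical helper written for this illustration, not part of this library, and it does not recurse into nested dataclasses.

```python
import json
from dataclasses import MISSING, dataclass, fields
from typing import Optional


def decode_with_defaults(cls, json_str, infer_missing=False):
    """Hypothetical helper: build `cls` from a JSON object string."""
    data = json.loads(json_str)
    kwargs = {}
    for f in fields(cls):
        if f.name in data:
            kwargs[f.name] = data[f.name]
        elif f.default is not MISSING or f.default_factory is not MISSING:
            continue  # let the dataclass apply its own default
        elif infer_missing:
            kwargs[f.name] = None  # no default declared: fall back to None
        # otherwise the field is left out and the constructor raises TypeError
    return cls(**kwargs)


@dataclass
class Student:
    id: int
    name: str = 'student'


@dataclass
class Tutor:
    id: int
    student: Optional[Student]


print(decode_with_defaults(Student, '{"id": 1}'))
# Student(id=1, name='student')
print(decode_with_defaults(Tutor, '{"id": 1}', infer_missing=True))
# Tutor(id=1, student=None)
```

In this sketch a declared default always wins, and None is used only when no default exists and infer_missing is set.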

Explanation

Briefly, here is what's going on under the hood in the above examples: calling .schema() makes this library generate a marshmallow schema for you. It also fills in the corresponding object hook, so that marshmallow creates an instance of your Data Class on load (e.g. Person.schema().load returns a Person) rather than a dict, which is what marshmallow returns by default.
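
As an analogy for the hook idea, the standard json module accepts an object_hook callable that turns each decoded dict into something else. The sketch below uses only the standard library and is not how marshmallow or this library is implemented.

```python
import json
from dataclasses import dataclass


@dataclass
class Person:
    name: str


# Without a hook, decoding yields a plain dict.
as_dict = json.loads('{"name": "lidatong"}')

# With a hook, every decoded JSON object is passed through the callable.
as_person = json.loads('{"name": "lidatong"}',
                       object_hook=lambda d: Person(**d))

print(as_dict)    # {'name': 'lidatong'}
print(as_person)  # Person(name='lidatong')
```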

Performance note

.schema() is not cached (it generates the schema on every call), so if you have a nested Data Class you may want to save the result to a variable to avoid re-generation of the schema on every usage.

person_schema = Person.schema()
person_schema.dump(people, many=True)

# later in the code...

person_schema.dump(person)
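
If passing a variable around is inconvenient, memoizing the call is another option. The sketch below assumes a hypothetical cached_schema helper and uses a stand-in FakeModel class in place of a real decorated dataclass, so it runs with the standard library alone.

```python
from functools import lru_cache


class FakeModel:
    """Stand-in for a class whose .schema() is costly to build."""
    builds = 0

    @classmethod
    def schema(cls):
        FakeModel.builds += 1
        return object()  # placeholder for a generated schema


@lru_cache(maxsize=None)
def cached_schema(cls):
    """Hypothetical helper: build the schema once per class, then reuse it."""
    return cls.schema()


first = cached_schema(FakeModel)
second = cached_schema(FakeModel)

print(first is second)   # True
print(FakeModel.builds)  # 1
```

This helper caches by class only. If you pass arguments such as many=True to .schema(), they would need to become part of the cache key.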

Marshmallow interop

Using the dataclass_json decorator or mixing in DataClassJsonMixin will provide you with an additional method .schema().

.schema() generates a schema exactly equivalent to manually creating a marshmallow schema for your dataclass. You can reference the marshmallow API docs to learn other ways you can use the schema returned by .schema().

You can pass in the exact same arguments to .schema() that you would when constructing a PersonSchema instance, e.g. .schema(many=True), and they will get passed through to the marshmallow schema.

from dataclasses import dataclass
from dataclasses_json import dataclass_json

@dataclass_json
@dataclass
class Person:
    name: str

# You don't need to do this - it's generated for you by `.schema()`!
from marshmallow import Schema, fields

class PersonSchema(Schema):
    name = fields.Str()

A larger example

from dataclasses import dataclass
from dataclasses_json import dataclass_json
from typing import List

@dataclass_json
@dataclass(frozen=True)
class Minion:
    name: str


@dataclass_json
@dataclass(frozen=True)
class Boss:
    minions: List[Minion]

boss = Boss([Minion('evil minion'), Minion('very evil minion')])
boss_json = """
{
    "minions": [
        {
            "name": "evil minion"
        },
        {
            "name": "very evil minion"
        }
    ]
}
""".strip()

assert boss.to_json(indent=4) == boss_json
assert Boss.from_json(boss_json) == boss

Caveats

Data Classes that contain forward references (e.g. recursive dataclasses) are not currently supported.
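
For reference, this is the kind of definition the caveat refers to. The sketch uses only the standard library and shows that the quoted annotation is stored as an unresolved forward reference; Node is an illustrative class, not taken from this library.

```python
from dataclasses import dataclass, fields
from typing import ForwardRef, Optional


@dataclass
class Node:
    value: int
    # 'Node' is quoted because the class is not fully defined yet.
    next: Optional['Node'] = None


next_type = {f.name: f.type for f in fields(Node)}['next']
inner = next_type.__args__[0]

print(isinstance(inner, ForwardRef))  # True
print(inner.__forward_arg__)          # Node
```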
