json2python-models is a Python tool that generates Python model classes (pydantic, dataclasses, attrs) from JSON datasets.
- Full typing module support
- Types merging - if some field contains data of different types, this will be represented as a Union type
- Field and model name generation (unicode support included)
- Similar models generalization
- Handling recursive data structures (e.g. a family tree)
- Detecting string serializable types (e.g. datetime or just stringified numbers)
- Detecting fields containing string constants (Literal['foo', 'bar'])
- Generating models as a list (flat models structure) or a tree (nested models)
- Specifying when dictionaries should be processed as dict type (by default every dict is considered as some model)
- CLI API with a lot of options
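To make the types merging feature more concrete, here is a minimal sketch of how values of different types seen in one field could be folded into a single annotation. The helper name `infer_field_type` is hypothetical and this is not the tool's actual implementation, which is more elaborate.

```python
from typing import Any, List, Optional, Union


def infer_field_type(values: List[Any]) -> Any:
    """Fold the Python types of all observed values into one annotation.

    Hypothetical helper for illustration only; it is not part of
    json2python-models.
    """
    # Collect distinct types in first-seen order (dicts keep insertion order).
    seen = {}
    for value in values:
        seen.setdefault(type(value), None)

    has_none = type(None) in seen
    types = [t for t in seen if t is not type(None)]
    if not types:
        return type(None)

    # One type stays as-is, several types become a Union.
    merged = types[0] if len(types) == 1 else Union[tuple(types)]
    return Optional[merged] if has_none else merged


# A field that holds ints, strings and floats across the dataset
mixed = infer_field_type([1, "two", 3.0])
# A field that is sometimes missing (null)
nullable = infer_field_type(["a", None])
```

Here `mixed` compares equal to `Union[int, str, float]` and `nullable` to `Optional[str]`.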
from pydantic import BaseModel, Field
from typing import List, Optional
from typing_extensions import Literal


class Tab(BaseModel):
    id_: str = Field(..., alias="id")
    public: bool
    stash_type: Literal["CurrencyStash", "NormalStash", "PremiumStash"] = Field(..., alias="stashType")
    items: List['Item']
    account_name: Optional[str] = Field(None, alias="accountName")
    last_character_name: Optional[str] = Field(None, alias="lastCharacterName")
    stash: Optional[str] = None
    league: Optional[Literal["Hardcore", "Standard"]] = None
driver_standings.json
[
    {
        "season": "2019",
        "round": "3",
        "DriverStandings": [
            {
                "position": "1",
                "positionText": "1",
                "points": "68",
                "wins": "2",
                "Driver": {
                    "driverId": "hamilton",
                    "permanentNumber": "44",
                    "code": "HAM",
                    "url": "http://en.wikipedia.org/wiki/Lewis_Hamilton",
                    "givenName": "Lewis",
                    "familyName": "Hamilton",
                    "dateOfBirth": "1985-01-07",
                    "nationality": "British"
                },
                "Constructors": [
                    {
                        "constructorId": "mercedes",
                        "url": "http://en.wikipedia.org/wiki/Mercedes-Benz_in_Formula_One",
                        "name": "Mercedes",
                        "nationality": "German"
                    }
                ]
            },
            ...
        ]
    }
]
json2models -f pydantic -l DriverStandings - driver_standings.json
r"""
generated by json2python-models v0.2.0 at Mon May 4 17:46:30 2020
command: /opt/projects/json2python-models/venv/bin/json2models -f pydantic -s flat -l DriverStandings - driver_standings.json
"""
from pydantic import BaseModel, Field
from typing import List
from typing_extensions import Literal


class DriverStandings(BaseModel):
    season: int
    round_: int = Field(..., alias="round")
    DriverStandings: List['DriverStanding']


class DriverStanding(BaseModel):
    position: int
    position_text: int = Field(..., alias="positionText")
    points: int
    wins: int
    driver: 'Driver' = Field(..., alias="Driver")
    constructors: List['Constructor'] = Field(..., alias="Constructors")


class Driver(BaseModel):
    driver_id: str = Field(..., alias="driverId")
    permanent_number: int = Field(..., alias="permanentNumber")
    code: str
    url: str
    given_name: str = Field(..., alias="givenName")
    family_name: str = Field(..., alias="familyName")
    date_of_birth: str = Field(..., alias="dateOfBirth")
    nationality: str


class Constructor(BaseModel):
    constructor_id: str = Field(..., alias="constructorId")
    url: str
    name: str
    nationality: Literal["Austrian", "German", "American", "British", "Italian", "French"]
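Note that the source JSON stores season, round, points and similar values as strings ("2019", "3", "68"), while the generated fields are typed as int. A minimal sketch of how such string-serializable types could be detected is shown here. The helper `detect_string_type` is hypothetical and only covers int and float; the tool's own detection handles more cases.

```python
from typing import List, Type


def detect_string_type(values: List[str]) -> Type:
    """Return the narrowest of int, float or str that parses every value.

    Hypothetical helper for illustration; not part of json2python-models.
    """
    for candidate in (int, float):
        try:
            for value in values:
                candidate(value)  # raises ValueError if the string does not parse
        except ValueError:
            continue  # at least one value failed, try the next candidate
        return candidate
    return str


season_type = detect_string_type(["2019", "2018"])      # all values parse as int
ratio_type = detect_string_type(["1.5", "2"])           # int fails on "1.5", float works
code_type = detect_string_type(["HAM", "44"])           # "HAM" is not a number
```

With this approach a single non-numeric value is enough to keep the field as a plain str.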
swagger.json from any online API (I tested a file generated by drf-yasg and another one for the Spotify API)
It requires a little bit of tweaking:
- Some fields store routes/models specs as dicts
- There are a lot of optional fields, so we reduce the merging threshold
- Disable string literals
json2models -f dataclasses -m Swagger testing_tools/swagger.json \
--dict-keys-fields securityDefinitions paths responses definitions properties \
--merge percent_50 number --max-strings-literals 0
r"""
generated by json2python-models v0.2.0 at Mon May 4 18:08:09 2020
command: /opt/projects/json2python-models/json_to_models/__main__.py -s flat -f dataclasses -m Swagger testing_tools/swagger.json --max-strings-literals 0 --dict-keys-fields securityDefinitions paths responses definitions properties --merge percent_50 number
"""
from dataclasses import dataclass, field
from json_to_models.dynamic_typing import FloatString
from typing import Any, Dict, List, Optional, Union


@dataclass
class Swagger:
    swagger: FloatString
    info: 'Info'
    host: str
    schemes: List[str]
    base_path: str
    consumes: List[str]
    produces: List[str]
    security_definitions: Dict[str, 'Parameter_SecurityDefinition']
    security: List['Security']
    paths: Dict[str, 'Path']
    definitions: Dict[str, 'Definition_Schema']


@dataclass
class Info:
    title: str
    description: str
    version: str


@dataclass
class Security:
    api_key: Optional[List[Any]] = field(default_factory=list)
    basic: Optional[List[Any]] = field(default_factory=list)


@dataclass
class Path:
    parameters: List['Parameter_SecurityDefinition']
    post: Optional['Delete_Get_Patch_Post_Put'] = None
    get: Optional['Delete_Get_Patch_Post_Put'] = None
    put: Optional['Delete_Get_Patch_Post_Put'] = None
    patch: Optional['Delete_Get_Patch_Post_Put'] = None
    delete: Optional['Delete_Get_Patch_Post_Put'] = None


@dataclass
class Property:
    type_: str
    format_: Optional[str] = None
    xnullable: Optional[bool] = None
    items: Optional['Item_Schema'] = None


@dataclass
class Property_2E:
    type_: str
    title: Optional[str] = None
    read_only: Optional[bool] = None
    max_length: Optional[int] = None
    min_length: Optional[int] = None
    items: Optional['Item'] = None
    enum: Optional[List[str]] = field(default_factory=list)
    maximum: Optional[int] = None
    minimum: Optional[int] = None
    format_: Optional[str] = None


@dataclass
class Item:
    title: Optional[str] = None
    type_: Optional[str] = None
    ref: Optional[str] = None
    max_length: Optional[int] = None
    min_length: Optional[int] = None


@dataclass
class Parameter_SecurityDefinition:
    name: Optional[str] = None
    in_: Optional[str] = None
    required: Optional[bool] = None
    schema: Optional['Item_Schema'] = None
    description: Optional[str] = None
    type_: Optional[str] = None


@dataclass
class Delete_Get_Patch_Post_Put:
    operation_id: str
    description: str
    parameters: List['Parameter_SecurityDefinition']
    responses: Dict[str, 'Response']
    tags: List[str]


@dataclass
class Item_Schema:
    ref: str


@dataclass
class Response:
    description: str
    schema: Optional[Union['Item_Schema', 'Definition_Schema']] = None


@dataclass
class Definition_Schema:
    type_: str
    required: Optional[List[str]] = field(default_factory=list)
    properties: Optional[Dict[str, Union['Property', 'Property_2E']]] = field(default_factory=dict)
    ref: Optional[str] = None
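The --merge percent_50 number options used in the command above lower the threshold at which two similar models (for example the get/post/put operations) are merged into one class such as Delete_Get_Patch_Post_Put. A rough sketch of these merge policies follows. The function names are hypothetical, and it is an assumption here that the "percentage of matched field names" is measured against all distinct field names of both models.

```python
from typing import Set


def percent_match(a: Set[str], b: Set[str], percent: float = 0.7) -> bool:
    """True if the share of common field names reaches the threshold.

    Assumption: the share is computed against the union of both field sets.
    """
    union = a | b
    return bool(union) and len(a & b) / len(union) >= percent


def number_match(a: Set[str], b: Set[str], number: int = 10) -> bool:
    """True if at least `number` field names are shared."""
    return len(a & b) >= number


def exact_match(a: Set[str], b: Set[str]) -> bool:
    """True only if both models have exactly the same field names."""
    return a == b


def should_merge(a: Set[str], b: Set[str], percent: float = 0.7, number: int = 10) -> bool:
    """Several policies are combined with OR, like `--merge percent_70 number_10`."""
    return percent_match(a, b, percent) or number_match(a, b, number)


get_op = {"operation_id", "description", "parameters", "responses", "tags"}
post_op = {"operation_id", "description", "parameters", "responses"}

# 4 shared names out of 5 distinct names is 80%, which passes the 70% default
merged_by_default = should_merge(get_op, post_op)
# A stricter 95% threshold (and fewer than 10 shared names) keeps them apart
merged_when_strict = should_merge(get_op, post_op, percent=0.95)
```

Lowering the percentage, as done for the Swagger file, makes models with many optional fields more likely to collapse into one class.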
GitHub Actions model based on files from starter-workflows
json2models -m Actions "./starter-workflows/ci/*.yml" -s flat -f pydantic -i yaml --dkf env with jobs
r"""
generated by json2python-models v0.2.3 at Tue Jul 13 19:52:43 2021
command: /opt/projects/json2python-models/venv/bin/json2models -m Actions ./starter-workflows/ci/*.yml -s flat -f pydantic -i yaml --dkf env with jobs
"""
from pydantic import BaseModel, Field
from typing import Dict, List, Optional, Union
from typing_extensions import Literal


class Actions(BaseModel):
    on: Union['On', List[Literal["push"]]]
    jobs: Dict[str, 'Job']
    name: Optional[str] = None
    env: Optional[Dict[str, Union[int, str]]] = {}


class On(BaseModel):
    push: Optional['Push'] = None
    pull_request: Optional['PullRequest'] = None
    release: Optional['Release'] = None
    schedule: Optional[List['Schedule']] = []
    workflow_dispatch: Optional[None] = None


class Push(BaseModel):
    branches: List[Literal["$default-branch"]]
    tags: Optional[List[Literal["v*.*.*"]]] = []


class PullRequest(BaseModel):
    branches: List[Literal["$default-branch"]]


class Release(BaseModel):
    types: List[Literal["created", "published"]]


class Schedule(BaseModel):
    cron: Literal["$cron-daily"]


class Job(BaseModel):
    runson: Literal["${{ matrix.os }}", "macOS-latest", "macos-latest", "ubuntu-18.04", "ubuntu-latest", "windows-latest"] = Field(..., alias="runs-on")
    steps: List['Step']
    name: Optional[str] = None
    environment: Optional[Literal["production"]] = None
    outputs: Optional['Output'] = None
    container: Optional['Container'] = None
    needs: Optional[Literal["build"]] = None
    permissions: Optional['Permission'] = None
    strategy: Optional['Strategy'] = None
    defaults: Optional['Default'] = None
    env: Optional[Dict[str, str]] = {}


class Step(BaseModel):
    uses: Optional[str] = None
    name: Optional[str] = None
    with_: Optional[Dict[str, Union[bool, float, str]]] = Field({}, alias="with")
    run: Optional[str] = None
    env: Optional[Dict[str, str]] = {}
    workingdirectory: Optional[str] = Field(None, alias="working-directory")
    id_: Optional[Literal["build-image", "composer-cache", "deploy-and-expose", "image-build", "login-ecr", "meta", "push-to-registry", "task-def"]] = Field(None, alias="id")
    if_: Optional[str] = Field(None, alias="if")
    shell: Optional[Literal["Rscript {0}"]] = None


class Output(BaseModel):
    route: str = Field(..., alias="ROUTE")
    selector: str = Field(..., alias="SELECTOR")


class Container(BaseModel):
    image: Literal["crystallang/crystal", "erlang:22.0.7"]


class Permission(BaseModel):
    contents: Literal["read"]
    packages: Literal["write"]


class Strategy(BaseModel):
    matrix: Optional['Matrix'] = None
    maxparallel: Optional[int] = Field(None, alias="max-parallel")
    failfast: Optional[bool] = Field(None, alias="fail-fast")


class Matrix(BaseModel):
    rversion: Optional[List[float]] = Field([], alias="r-version")
    pythonversion: Optional[List[float]] = Field([], alias="python-version")
    deno: Optional[List[Literal["canary", "v1.x"]]] = []
    os: Optional[List[Literal["macOS-latest", "ubuntu-latest", "windows-latest"]]] = []
    rubyversion: Optional[List[float]] = Field([], alias="ruby-version")
    nodeversion: Optional[List[Literal["12.x", "14.x", "16.x"]]] = Field([], alias="node-version")
    configuration: Optional[List[Literal["Debug", "Release"]]] = []


class Default(BaseModel):
    run: 'Run'


class Run(BaseModel):
    shell: Literal["bash"]
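Fields such as Release.types or Matrix.os above are typed as Literal because only a handful of distinct string values were seen in the input files. A minimal sketch of that decision is given here. The helper `literal_or_str` is hypothetical, and the strict "fewer than" comparison is an assumption taken from the wording of the --max-strings-literals option. It uses typing.Literal (Python 3.8+), whereas the generated code above imports Literal from typing_extensions.

```python
from typing import Any, Iterable, Literal


def literal_or_str(values: Iterable[str], max_literals: int = 10) -> Any:
    """Return Literal[...] for a small set of string constants, else str.

    Hypothetical helper for illustration; not part of json2python-models.
    """
    distinct = sorted(set(values))
    if distinct and len(distinct) < max_literals:
        # Subscripting with a tuple is the same as Literal["a", "b", ...]
        return Literal[tuple(distinct)]
    return str


release_types = literal_or_str(["created", "published", "created"])
# A threshold of 0 never produces a Literal, like `--max-strings-literals 0`
no_literals = literal_or_str(["created", "published"], max_literals=0)
```

Here `release_types` compares equal to `Literal["created", "published"]` and `no_literals` is plain `str`.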
Beware: this project supports only Python 3.7 and higher.
To install it, use pip:
pip install json2python-models
Or you can build it from source:
git clone https://github.com/bogdandm/json2python-models.git
cd json2python-models
python setup.py install
For regular usage, the CLI tool is the best option. After you install this package, you can use it as json2models <arguments> or python -m json_to_models <arguments>. For example:
json2models -m Car car_*.json -f attrs > car.py
Arguments:
- -h, --help - Show help message and exit
- -m, --model - Model name and its JSON data as a path or unix-like path pattern. The *, ** and ? pattern symbols are supported.
  - Format: -m <Model name> [<JSON files> ...]
  - Example: -m Car audi.json reno.json or -m Car audi.json -m Car reno.json (results will be the same)
- -l, --list - Like -m, but the given JSON file should contain a list of model data (dataset). If the file contains a dict with a nested list, you can pass <JSON key> to look it up. Deep lookups are supported via a dot-separated path. If no lookup is needed, pass - as <JSON key>.
  - Format: -l <Model name> <JSON key> <JSON file>
  - Example: -l Car - cars.json -l Person fetch_results.items.persons result.json
  - Note: Model names under these arguments should be unique.
- -i, --input-format - Input file format (parser). The default is the JSON parser. The YAML parser requires PyYaml or ruamel.yaml to be installed. The INI parser uses the builtin configparser. To implement a new one, add a new method to cli.FileLoaders (and create a pull request :) )
  - Format: -i {json, yaml, ini}
  - Example: -i yaml
  - Default: -i json
- -o, --output - Output file
  - Format: -o <FILE>
  - Example: -o car_model.py
- -f, --framework - Model framework for which Python code is generated. base (default) means no framework, so code will be generated without any decorators and additional meta-data.
  - Format: -f {base, pydantic, attrs, dataclasses, custom}
  - Example: -f pydantic
  - Default: -f base
- -s, --structure - Models composition style.
  - Format: -s {flat, nested}
  - Example: -s nested
  - Default: -s flat
- --datetime - Enable parsing of datetime/date/time strings.
  - Default: disabled
  - Warning: This can lead to a 6-7 times slowdown on large datasets. Be sure that you really need this option.
- --disable-unicode-conversion, --no-unidecode - Disable unicode conversion in field labels and class names
  - Default: conversion enabled
- --strings-converters - Enable generation of string type converters (e.g. IsoDatetimeString or BooleanString).
  - Default: disabled
- --max-strings-literals - Generate Literal['foo', 'bar'] when a field has fewer than NUMBER string constants as values.
  - Format: --max-strings-literals <NUMBER>
  - Default: 10 (generator classes could override it)
  - Example: --max-strings-literals 5 - only 5 literals will be saved and used for code generation
  - Note: There cannot be more than 15 literals per field (for performance reasons)
  - Note: The attrs code generator does not use Literals and just generates str fields instead
- --merge - Merge policy settings.
  - Format: --merge MERGE_POLICY [MERGE_POLICY ...]
  - Possible values (MERGE_POLICY):
    - percent[_<percent>] - two models have a certain percentage of matched field names. A custom value could be e.g. percent_95.
    - number[_<number>] - two models have a certain number of matched field names.
    - exact - two models must have exactly the same field names to merge.
  - Example: --merge percent_95 number_20 - merge if 95% of fields are matched or 20 fields are matched
  - Default: --merge percent_70 number_10
- --dict-keys-regex, --dkr - List of regular expressions (Python syntax). If all keys of some dict match one of the patterns, this dict will be marked as a dict field rather than a nested model.
  - Format: --dkr RegEx [RegEx ...]
  - Example: --dkr node_\d+ \d+_\d+_\d+
  - Note: The ^ and $ (string borders) tokens will be added automatically, but you have to escape other special characters manually.
  - Optional
- --dict-keys-fields, --dkf - List of model field names that will be marked as dict fields
  - Format: --dkf FIELD_NAME [FIELD_NAME ...]
  - Example: --dkf "dict_data" "mapping"
  - Optional
- --code-generator - Absolute import path to a GenericModelCodeGenerator subclass.
  - Format: --code-generator CODE_GENERATOR
  - Example: -f custom --code-generator mypackage.mymodule.DjangoModelsGenerator
  - Note: It is ignored without -f custom but is required with it.
- --code-generator-kwargs - List of GenericModelCodeGenerator subclass arguments (for the __init__ method, see the docs of the specific subclass). Each argument should be in the following format: argument_name=value or "argument_name=value with space". Boolean values should be passed in JS style: true or false
  - Format: --code-generator-kwargs [NAME=VALUE [NAME=VALUE ...]]
  - Example: --code-generator-kwargs kwarg1=true kwarg2=10 "kwarg3=It is string with spaces"
  - Optional
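The NAME=VALUE format accepted by --code-generator-kwargs can be illustrated with a small parser. This is only a sketch with a hypothetical function name; in particular, keeping non-boolean values such as 10 as strings is an assumption, since the text above only specifies how booleans are written.

```python
from typing import Dict, List, Union


def parse_generator_kwargs(items: List[str]) -> Dict[str, Union[bool, str]]:
    """Split NAME=VALUE pairs and convert JS-style booleans.

    Hypothetical helper for illustration; values other than true/false
    are left as strings (an assumption).
    """
    result: Dict[str, Union[bool, str]] = {}
    for item in items:
        # Split on the first "=" only, so values may contain "=" or spaces
        name, sep, value = item.partition("=")
        if not sep or not name:
            raise ValueError(f"expected NAME=VALUE, got {item!r}")
        if value in ("true", "false"):
            result[name] = value == "true"
        else:
            result[name] = value
    return result


# The shell has already removed the quotes around the third argument
kwargs = parse_generator_kwargs(
    ["kwarg1=true", "kwarg2=10", "kwarg3=It is string with spaces"]
)
```

The resulting dict could then be passed as keyword arguments to the chosen generator subclass.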
One of the model arguments (-m or -l) is required.
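The <JSON key> lookup of the -l argument (a dot-separated path, or - for no lookup) can be sketched in a few lines. The function name `lookup` and the sample data are hypothetical and only mirror the fetch_results.items.persons example from the argument description.

```python
from typing import Any


def lookup(data: Any, path: str) -> Any:
    """Follow a dot-separated key path; "-" means the data itself is the list.

    Hypothetical helper for illustration; not part of json2python-models.
    """
    if path == "-":
        return data
    for key in path.split("."):
        data = data[key]  # raises KeyError if the path does not exist
    return data


result = {
    "fetch_results": {
        "items": {"persons": [{"name": "Ann"}, {"name": "Bob"}]}
    }
}

persons = lookup(result, "fetch_results.items.persons")
cars = lookup([{"model": "A4"}], "-")
```

Both calls return the list of records that would serve as the dataset for one model.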
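The --dict-keys-regex check can be pictured as follows. This is a sketch under two assumptions: that all keys have to match the same pattern, and that an empty dict is never treated as a dict field. The function name is hypothetical, and re.fullmatch stands in for the automatically added ^ and $ tokens.

```python
import re
from typing import Iterable, Mapping


def is_dict_field(data: Mapping[str, object], patterns: Iterable[str]) -> bool:
    """True if every key of `data` fully matches one of the patterns.

    Hypothetical helper for illustration; not part of json2python-models.
    """
    keys = list(data)
    if not keys:
        return False  # assumption: nothing to match against
    for pattern in patterns:
        regex = re.compile(pattern)
        # fullmatch plays the role of the implicit ^ and $ borders
        if all(regex.fullmatch(key) for key in keys):
            return True
    return False


# Patterns from the example: --dkr node_\d+ \d+_\d+_\d+
patterns = [r"node_\d+", r"\d+_\d+_\d+"]

nodes = is_dict_field({"node_1": {}, "node_22": {}}, patterns)
person = is_dict_field({"node_1": {}, "name": "Ann"}, patterns)
```

In this sketch `nodes` is True, so the dict would be typed as a mapping, while `person` is False and would still become a nested model.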
To run tests, clone the project and run the setup.py script:
git clone https://github.com/bogdandm/json2python-models.git
cd json2python-models
python setup.py test -a '<pytest additional arguments>'
I would also recommend installing pytest-sugar for pretty printing of test results.
You can find some usage examples of this project at testing_tools/real_apis/...
Each file contains functions to download data from some online API (references are included at the top of the file) and a main function that generates and prints code. Some examples may print debug data before the actual code.
Downloaded data will be saved at testing_tools/real_apis/<name of example>/<dataset>.json
- python-dateutil - Datetime parsing
- inflection - String transformations
- Unidecode - Unicode to ASCII conversion
- Jinja2 - Code templates
- ordered-set - Used in the models merging algorithm
Test tools:
- pytest - Test framework
- pytest-xdist - Parallel execution of test suites
- pytest-sugar - Test results pretty printing
- requests - Test data download
Feel free to open pull requests with new features or bug fixes. Just follow a few rules:
- Always use some code formatter (black or PyCharm built-in)
- Keep code coverage above 95-98%
- All existing tests should pass (including test examples from testing_tools/real_apis)
- Use the typing module
- Fix codacy issues from your PR
This project is licensed under the MIT License - see the LICENSE file for details