json-arrays

PyPI version PyPI - Python Version PyPI - Downloads

Maturity badge - level 3 Stage

Code Coverage

CI(check) CI(release) CI(scheduled) CI(test)

Read and write JSON lazy, especially json-arrays.

Handles both the JSON format:

[
  {
    "a": 1
  },
  {
    "a": 2
  }
]

As well as JSON LINES format:

{"a":1}
{"a": 2}

Also supports streaming from gzipped files.

Uses orjson if present, otherwise standard json.

Usage

Installation

# Using standard json
pip install json-arrays

# Using orjson
pip install json-arrays[orjson]

Note

This library prefers files opened in binary mode. Therefore does all dumps-methods return bytes.

All loads methods handles str, bytes and bytesarray arguments.

Examples

Allows you to use json.load and json.dump with both json and json-lines files as well as dumping generators.

import json_arrays

# This command tries to guess format and opens the file
data = json_arrays.load_from_file("data.json") # or data.jsonl

# Write to file, again guessing format
json_arrays.dump_to_file(data, "data.jsonl")
from json_arrays import json_iter, jsonl_iter

# Open and read the file without guessing
data = json_iter.load_from_file("data.json")

# Process file

# Write to file without guessing
jsonl_iter.dump_to_file(data, "data.jsonl")
import json_arrays
def process(data):
    for entry in data:
        # process
        yield entry

def read_process_and_write(filename_in, filename_out):

    json_arrays.dump_to_file(
        process(
            json_arrays.load_from_file(filename_in)
        ),
        filename_out
    )

You can also use json_arrays as a sink, that you can send data to.

import json_arrays

with open("out.json", "bw") as fp:
  # guessing format
  with json_arrays.sink(fp) as sink:
    for data in data_source():
      sink.send(data)

Release Notes

This projects keeps a CHANGELOG.

Development

This project uses pdm. After cloning the repo, just run

make dev
make test

to setup a virtual environment, install dev dependencies and run the unit tests.

Note: If you run the command in a activated virtual environment, that environment is used instead.

Deployment

Push a tag in the format v\d+.\d+.\d+to main-branch, to build & publish package to PyPi.