Overview

No Maintenance Intended

Code generators for immutable structured data, including algebraic data types, and functions to destructure them. Structured Data provides three public modules: structured_data.adt, structured_data.match, and structured_data.data.

The adt module provides base classes and an annotation type for converting classes into algebraic data types.

The match module provides a Pattern class that can be used to build match structures, and a Matchable class that wraps a value and attempts to apply match structures to it. If the match succeeds, the bindings can be extracted and used. The module includes some special support for adt subclasses.

The match architecture allows you to pull values out of a nested structure:

from structured_data import match

# Each match.pat.<name> binds that name; subscripting also matches the inside.
structure = (match.pat.a, match.pat.b[match.pat.c, match.pat.d], 5)
my_value = (('abc', 'xyz'), ('def', 'ghi'), 5)
matchable = match.Matchable(my_value)
if matchable(structure):
    # The format of the matches is not final.
    print(matchable['a'])  # ('abc', 'xyz')
    print(matchable['b'])  # ('def', 'ghi')
    print(matchable['c'])  # 'def'
    print(matchable['d'])  # 'ghi'

The subscript operator allows binding both the outside and the inside of a structure. Indexing a Matchable is forwarded to a matches attribute, which is None if the last match was not successful, and otherwise contains an instance of a custom mapping type, which allows building the matched values back up into simple structures.
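
To illustrate what "binding both the outside and the inside" means, here is a minimal pure-Python sketch. It is not the structured_data implementation: Pat and destructure are hypothetical names invented for this example, only tuples and literals are handled, and the result is a plain dict rather than the library's custom mapping type.

```python
class Pat:
    """Hypothetical named placeholder (not structured_data's Pattern)."""

    def __init__(self, name, inner=None):
        self.name = name
        self.inner = inner

    def __getitem__(self, inner):
        # Pat('b')[...] keeps the name and also matches the inside.
        return Pat(self.name, inner)


def _bind(structure, value, bindings):
    if isinstance(structure, Pat):
        # Match the inside first (if any), then bind the outside.
        if structure.inner is not None and not _bind(structure.inner, value, bindings):
            return False
        bindings[structure.name] = value
        return True
    if isinstance(structure, tuple):
        return (
            isinstance(value, tuple)
            and len(structure) == len(value)
            and all(_bind(s, v, bindings) for s, v in zip(structure, value))
        )
    # Anything else is treated as a literal that must compare equal.
    return structure == value


def destructure(structure, value):
    """Return a dict of bindings, or None if the value does not match."""
    bindings = {}
    return bindings if _bind(structure, value, bindings) else None


structure = (Pat('a'), Pat('b')[Pat('c'), Pat('d')], 5)
result = destructure(structure, (('abc', 'xyz'), ('def', 'ghi'), 5))
no_match = destructure(structure, (('abc', 'xyz'), ('def', 'ghi'), 6))
```

In this sketch, result maps 'b' to the whole second element and 'c' and 'd' to its parts, while no_match is None because the trailing literal differs.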

The Sum base class exists to create classes that do not necessarily have a single fixed format, but do have a fixed set of possible formats. This lowers the maintenance burden of writing functions that operate on values of a Sum class, because the full list of cases to handle is directly in the class definition.

Here are implementations of algebraic data types that are common in other languages:

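import typing

from structured_data import adt

T = typing.TypeVar('T')
E = typing.TypeVar('E')
R = typing.TypeVar('R')


class Maybe(adt.Sum, typing.Generic[T]):

    Just: adt.Ctor[T]
    Nothing: adt.Ctor


class Either(adt.Sum, typing.Generic[E, R]):

    Left: adt.Ctor[E]
    Right: adt.Ctor[R]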

The data module provides classes based on these examples.

  • Free software: MIT license

How Can I Help?

Currently, this project has somewhat high quality metrics, though some of them have been higher. I am highly skeptical of this, because I've repeatedly given in to the temptation to code to the metrics. I can't trust the metrics, and I know the code well enough that I can't trust my own judgment to figure out which bits need to be improved and how. I need someone to review the code and identify problem spots based on what doesn't make sense to them. The issues are open.

Should I Use This?

Until there's a major version out, probably not.

There are several alternatives in the standard library that may be better suited to particular use-cases:

  • The namedtuple factory creates tuple classes with a single structure; the typing.NamedTuple class offers the ability to include type information. The interface is slightly awkward, and the values expose their tuple-nature easily. (NOTE: In Python 3.8, the fast access to namedtuple members means that they bypass user-defined __getitem__ methods, thereby allowing factory consumers to customize indexing without breaking attribute access. It looks like it does still rely on iteration behavior for various convenience methods.)
  • The enum module provides base classes to create finite enumerations. Unlike NamedTuple, the ability to convert values into an underlying type must be opted into in the class definition.
  • The dataclasses module provides a class decorator that converts a class into one with a single structure, similar to a namedtuple, but with more customization: instances are mutable by default, and it's possible to generate implementations of common protocols.
  • The Structured Data adt decorator is inspired by the design of dataclasses. (A previous attempt used metaclasses inspired by the enum module, and was a nightmare.) Unlike enum, it doesn't require all instances to be defined up front; instead each class defines constructors using a sequence of types, which ultimately determines the number of arguments the constructor takes. Unlike namedtuple and dataclasses, it allows instances to have multiple shapes with their own type signatures. Unlike using regular classes, the set of shapes is specified up front.
  • If you want multiple shapes, and don't want to specify them ahead of time, your best bet is probably a normal tree of classes, where the leaf classes are dataclasses.
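
The standard-library options above can be sketched briefly. This example uses only the standard library; Point, Color, Level, Maybe, Just, Nothing, and or_default are hypothetical names chosen for illustration, and the class tree is just one way to model multiple shapes, not a recommendation from this project.

```python
import dataclasses
import enum
import typing


class Point(typing.NamedTuple):
    x: int
    y: int


class Color(enum.Enum):
    RED = 1


class Level(enum.IntEnum):  # conversion to int is opted into here
    LOW = 1


class Maybe:
    """Hypothetical base class for a tree whose leaves are dataclasses."""


@dataclasses.dataclass(frozen=True)  # frozen=True opts out of mutability
class Just(Maybe):
    value: typing.Any


@dataclasses.dataclass(frozen=True)
class Nothing(Maybe):
    pass


def or_default(maybe, default):
    # Nothing in the class tree lists the shapes up front, so an
    # unexpected subclass has to be rejected by hand.
    if isinstance(maybe, Just):
        return maybe.value
    if isinstance(maybe, Nothing):
        return default
    raise TypeError(f"unexpected shape: {maybe!r}")


tuple_nature = Point(1, 2) == (1, 2)   # a NamedTuple compares equal to a plain tuple
plain_enum_equals_int = Color.RED == 1  # a plain Enum member does not equal its value
int_enum_equals_int = Level.LOW == 1    # an IntEnum member does
```

Here or_default(Just(3), 0) returns 3 and or_default(Nothing(), 0) returns 0; the trailing TypeError is the manual check that a fixed set of shapes would otherwise make unnecessary.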

Installation

pip install structured-data

Documentation

https://python-structured-data.readthedocs.io/

Development

To run all the tests, run:

nox