A small library for managing deep learning models, hyperparameters and datasets

Primary language: Python. License: Apache License 2.0.

Zookeeper

A small library for configuring modular applications.

Installation

pip install zookeeper

Components

The fundamental building blocks of Zookeeper are components. The @component decorator turns a class into a component. Component classes can have configurable parameters, which are declared using class-level type annotations (much like Python dataclasses). A parameter can be a plain Python object or a nested sub-component, and it does not need a default value.

For example:

from zookeeper import component

@component
class ChildComponent:
    a: int                  # An `int` parameter, with no default set
    b: str = "foo"          # A `str` parameter, which by default will be `foo`

@component
class ParentComponent:
    a: int                  # The same `int` parameter as the child
    child: ChildComponent   # A nested component parameter, of type `ChildComponent`
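
To make the idea of "parameters declared through class-level type annotations" concrete, here is a minimal standard-library sketch of how a decorator could discover such declarations. It does not use Zookeeper and is not its implementation; `ChildSketch` and `declared_parameters` are hypothetical names introduced only for illustration.

```python
from typing import get_type_hints


class ChildSketch:
    a: int          # annotated, no default
    b: str = "foo"  # annotated, with a default


def declared_parameters(cls):
    """Map each annotated name to (type, has_default, default_or_None)."""
    params = {}
    for name, annotation in get_type_hints(cls).items():
        # A class attribute with the same name means a default was given.
        has_default = hasattr(cls, name)
        params[name] = (annotation, has_default, getattr(cls, name, None))
    return params


params = declared_parameters(ChildSketch)
```

Here `params["a"]` records an `int` parameter without a default, while `params["b"]` records a `str` parameter whose default is `"foo"`.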

After instantiation, components can be 'configured' with a configuration dictionary, containing values for a tree of nested parameters. This process automatically injects the correct values into each parameter.

If a child sub-component declares a parameter which already exists in some containing parent, then it will pick up the value that's set on the parent, unless a 'scoped' value is set on the child.

For example:

from zookeeper import configure

p = ParentComponent()

configure(
    p,
    {
        "a": 5,
        "child.a": 4,
    }
)

>>> 'ChildComponent' is the only concrete component class that satisfies the type
>>> of the annotated parameter 'ParentComponent.child'. Using an instance of this
>>> class by default.

print(p)

>>> ParentComponent(
>>>     a = 5,
>>>     child = ChildComponent(
>>>         a = 4,
>>>         b = "foo"
>>>     )
>>> )
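
The inheritance rule described above (a child uses the parent's value unless a scoped value is set) can be sketched as a lookup that tries the most specific key first and then falls back to enclosing scopes. This is a plain-Python illustration under that assumption, not Zookeeper's actual resolution code; `resolve` is a hypothetical helper.

```python
def resolve(conf, scope, name):
    """Look up `name` as seen from `scope`, a tuple of nested component names.

    The most deeply scoped key wins; otherwise fall back to outer scopes.
    """
    for depth in range(len(scope), -1, -1):
        key = ".".join(scope[:depth] + (name,))
        if key in conf:
            return conf[key]
    full_name = ".".join(scope + (name,))
    raise KeyError("No configuration value found for " + full_name)


conf = {"a": 5, "child.a": 4}
parent_a = resolve(conf, (), "a")                # top-level value
child_a = resolve(conf, ("child",), "a")         # scoped value on the child
inherited = resolve({"a": 5}, ("child",), "a")   # no scoped value: falls back
```

With the configuration from the example, the parent resolves `a` to 5 and the child resolves it to 4; without the `"child.a"` entry, the child would pick up the parent's 5.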

Tasks and the CLI

The @task decorator is used to define Zookeeper tasks and can be applied to any class that implements an argument-less run method. Such tasks can be run through the Zookeeper CLI, with parameter values passed in through CLI arguments (configure is implicitly called).

For example:

from zookeeper import cli, task

@task
class UseChildA:
    parent: ParentComponent
    def run(self):
        print(self.parent.child.a)

@task
class UseParentA(UseChildA):
    def run(self):
        print(self.parent.a)

if __name__ == "__main__":
    cli()

Running the above file (saved here as test.py) gives a command-line interface with one command per task:

python test.py use_child_a
>>> ValueError: No configuration value found for annotated parameter 'UseChildA.parent.a' of type 'int'.

python test.py use_child_a a=5
>>> 5

python test.py use_child_a a=5 child.a=3
>>> 3

python test.py use_parent_a a=5 child.a=3
>>> 5
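
The invocations above suggest two conventions: task classes are exposed under snake_case command names, and parameters are passed as `key=value` tokens. The following standard-library sketch shows one way such arguments could be turned into a configuration dictionary. It is an assumption-laden illustration, not how the Zookeeper CLI is implemented; `task_command_name` and `parse_overrides` are hypothetical helpers.

```python
import ast
import re


def task_command_name(class_name):
    """Convert a CamelCase class name to snake_case, e.g. for a CLI command."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()


def parse_overrides(tokens):
    """Turn ['a=5', 'child.a=3'] into {'a': 5, 'child.a': 3}."""
    conf = {}
    for token in tokens:
        key, sep, raw = token.partition("=")
        if not sep or not key:
            raise ValueError("Expected key=value, got " + repr(token))
        try:
            # Interpret Python literals such as 5, 0.1, True or [1, 2].
            conf[key] = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Anything else is kept as a plain string.
            conf[key] = raw
    return conf


command = task_command_name("UseChildA")
overrides = parse_overrides(["a=5", "child.a=3"])
```

A dictionary of this shape matches the one passed to `configure` in the earlier example, which is why dotted keys such as `child.a` appear on the command line.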

Using Zookeeper to define Larq or Keras experiments

See examples/larq_experiment.py for an example of how to use Zookeeper to define all the necessary components (dataset, preprocessing, and model) of a Larq experiment: training a BinaryNet on MNIST. This example can be easily adapted to other Larq or Keras models and other datasets.