
A simple, flexible library for configuring Python projects using dataclasses.



License: MIT

A modular dataclass configuration system

Haven is a system for configuring applications using dataclasses and YAML. It provides full type safety while being modular enough to scale to large projects.

Key Features

  • Builds plain dataclasses, so you can use all standard dataclass features, such as custom methods, __post_init__, etc.
  • Doesn't take over your CLI or impose a particular structure on your program.
  • Supports parsing a wide variety of types and type hints, including optionals and unions.
  • Scales to projects with many config variations or sub-components using choice and plugin fields.
  • Easily couple code variations with config choices in a type-safe way using Component.


Basic example

from dataclasses import dataclass, field

import haven

@dataclass
class ModelConfig:
    num_layers: int = 5
    embed_dim: int = 512

@dataclass
class TrainConfig:
    workers: int = 5
    steps: list[int] = field(default_factory=lambda: [50, 100, 150])
    model: ModelConfig = field(default_factory=ModelConfig)

# Load from yaml string
cfg = haven.load(TrainConfig, """
steps: [1,2,3]
model:
  num_layers: 16
""")
assert cfg.model.num_layers == 16

# Or load from file
with open("config.yaml") as f:
    cfg = haven.load(TrainConfig, f)

# Update using "dotlist" style overrides (e.g. from CLI args)
cfg = haven.update_from_dotlist(cfg, ["workers=3", "model.num_layers=2"])

# Print yaml
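
For intuition, "dotlist" overrides such as `model.num_layers=2` can be read as a path into the nested dataclasses plus a new value. The following is a minimal standard-library sketch of that idea, not Haven's implementation: `apply_override` and `update_from_pairs` are hypothetical helper names, and the string-to-value coercion shown here is a simplifying assumption that only handles types constructible from a string (such as int, float, and str).

```python
from dataclasses import dataclass, field, replace
from typing import Any


@dataclass
class ModelConfig:
    num_layers: int = 5
    embed_dim: int = 512


@dataclass
class TrainConfig:
    workers: int = 5
    model: ModelConfig = field(default_factory=ModelConfig)


def apply_override(cfg: Any, dotted_key: str, raw_value: str) -> Any:
    """Return a copy of `cfg` with the field at `dotted_key` replaced."""
    head, _, rest = dotted_key.partition(".")
    current = getattr(cfg, head)  # raises AttributeError for unknown fields
    if rest:
        # Recurse into the nested dataclass, then rebuild the parent around it.
        return replace(cfg, **{head: apply_override(current, rest, raw_value)})
    # Leaf field: coerce the string using the type of the existing value.
    return replace(cfg, **{head: type(current)(raw_value)})


def update_from_pairs(cfg: Any, overrides: list[str]) -> Any:
    """Apply each "key=value" override in order (hypothetical helper)."""
    for item in overrides:
        key, _, value = item.partition("=")
        cfg = apply_override(cfg, key, value)
    return cfg


cfg = update_from_pairs(TrainConfig(), ["workers=3", "model.num_layers=2"])
```

Because every step goes through `dataclasses.replace`, the original config object is left unmodified and untouched fields keep their values.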

Choice fields

More complex projects often want to support many variations for each application component. This can be accomplished through subclassing and choice fields.

@dataclass
class ModelConfig:
    name: str

# Two types of models
@dataclass
class GPT2Config(ModelConfig):
    num_layers: int

@dataclass
class Llama2Config(ModelConfig):
    embed_dim: int = 512

@dataclass
class TrainConfig:
    workers: int = 5
    steps: list[int] = field(default_factory=lambda: [50, 100, 150])

    # Choose config class based on value of `ModelConfig.name`.
    model: ModelConfig = haven.choice(
        [GPT2Config, Llama2Config],
        # (remaining arguments not shown)
    )

# Load from yaml string
cfg = haven.load(TrainConfig, """
steps: [1,2,3]
model:
  name: GPT2Config
  num_layers: 16
""")
assert isinstance(cfg.model, GPT2Config)
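
Conceptually, a choice field looks at one key of the incoming data and uses it to decide which dataclass to build. The sketch below illustrates that dispatch with the standard library only. It is an assumption-laden illustration rather than Haven's code: `build_choice` and its `key_field` parameter are hypothetical names, and matching on the class `__name__` is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class ModelConfig:
    name: str


@dataclass
class GPT2Config(ModelConfig):
    num_layers: int = 12


@dataclass
class Llama2Config(ModelConfig):
    embed_dim: int = 512


def build_choice(choices, data: dict, key_field: str = "name"):
    """Construct the class in `choices` whose __name__ equals data[key_field]."""
    by_name = {cls.__name__: cls for cls in choices}
    selected = data[key_field]
    if selected not in by_name:
        raise ValueError(
            f"unknown choice {selected!r}; expected one of {sorted(by_name)}"
        )
    # Unknown keys in `data` raise TypeError, since they are passed
    # straight to the dataclass constructor.
    return by_name[selected](**data)


model = build_choice(
    [GPT2Config, Llama2Config],
    {"name": "GPT2Config", "num_layers": 16},
)
```

Listing the candidate classes explicitly is what keeps the set of valid values visible in the config definition itself.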

Choices can also be module + object paths that are imported lazily:

@dataclass
class TrainConfig:
    model: ModelConfig = haven.choice([
        # module + object paths (not shown)
    ])
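
As a rough illustration of what lazily importing a module + object path involves, the sketch below resolves a dotted path only when it is first needed. The path format (`package.module.Object`) and the names `resolve_path` and `LazyChoice` are assumptions made for this example; Haven's actual path syntax and internals may differ.

```python
import importlib


def resolve_path(path: str):
    """Import the module part of `package.module.Object` and return the object."""
    module_name, _, attr = path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.Object', got {path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


class LazyChoice:
    """Holds a dotted path and imports it on first access (hypothetical helper)."""

    def __init__(self, path: str):
        self.path = path
        self._obj = None

    def get(self):
        if self._obj is None:
            # The import cost is paid here, not when the config is defined.
            self._obj = resolve_path(self.path)
        return self._obj


# A standard-library target is used so the example runs anywhere.
choice = LazyChoice("collections.OrderedDict")
```

Deferring the import means that defining the config does not pull in heavy dependencies of choices that are never selected.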

The benefit of this style of configuration is that all of the available choices are documented directly in the config definition. This works well when there are a small to medium number of variations. For more flexibility, a plugin system is available:

@dataclass
class TrainConfig:
    model: ModelConfig = haven.plugin(
        # (arguments not shown)
    )

Each module under the mypackage.models namespace that contains the attribute MODEL_CONFIG will then be an available choice. The choice name is the same as the name of the module.
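
The discovery rule described above (every module in a namespace that defines a marker attribute becomes a choice named after the module) can be sketched with `pkgutil` and `importlib`. This is only an illustration of the rule under stated assumptions: `discover_plugins` is a hypothetical function, and how Haven performs discovery internally is not shown in this document.

```python
import importlib
import pkgutil


def discover_plugins(namespace: str, attr: str = "MODEL_CONFIG") -> dict:
    """Map module name -> attribute value for modules in `namespace` defining `attr`."""
    package = importlib.import_module(namespace)
    found = {}
    # iter_modules lists the direct children of the package, without importing them.
    for info in pkgutil.iter_modules(package.__path__):
        module = importlib.import_module(f"{namespace}.{info.name}")
        if hasattr(module, attr):
            # The choice name is the module name, as described above.
            found[info.name] = getattr(module, attr)
    return found
```

With a package laid out as `mypackage/models/gpt2.py` and `mypackage/models/llama2.py`, each defining `MODEL_CONFIG`, calling `discover_plugins("mypackage.models")` would return a dict keyed by `"gpt2"` and `"llama2"`. Modules without the attribute are skipped.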


Components

Choice fields alone leave one problem unsolved: you typically want to run different code in your application depending on which variant of the config was selected. haven.Component provides a simple mechanism for linking each variation to a callable.

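from torch import nn  # assumes PyTorch, as implied by nn.Module

# Sample model definitions
class ModelBase(nn.Module):
    ...

class Llama2(ModelBase):
    def __init__(self, cfg: Llama2Config):
        ...

class GPT(ModelBase):
    def __init__(self, cfg: GPT2Config):
        ...

@dataclass
class TrainConfig:
    model: haven.Component[ModelConfig, ModelBase] = haven.choice([
        # component choices (not shown)
    ])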

cfg = haven.load(TrainConfig, "model: Llama2")

# Instantiate the chosen class, passing the appropriate config as the first arg.
model = cfg.model()
assert isinstance(model, Llama2)

The config dataclass to use for each variation is automatically derived from the type hint on the first argument of the callable.
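
Deriving a config class from the type hint on a callable's first argument can be done with `inspect` and `typing.get_type_hints`. The sketch below shows one way this could work. It is not Haven's implementation, and `config_class_for` and `build` are hypothetical helpers introduced for illustration.

```python
import inspect
from dataclasses import dataclass
from typing import get_type_hints


@dataclass
class Llama2Config:
    embed_dim: int = 512


class Llama2:
    def __init__(self, cfg: Llama2Config):
        self.cfg = cfg


def config_class_for(factory):
    """Return the annotated type of the first non-self parameter of `factory`."""
    # For a class, the relevant signature is that of its __init__.
    target = factory.__init__ if inspect.isclass(factory) else factory
    params = [p for p in inspect.signature(target).parameters if p != "self"]
    if not params:
        raise TypeError(f"{factory!r} takes no arguments")
    hints = get_type_hints(target)
    return hints[params[0]]  # KeyError if the first argument is unannotated


def build(factory, **config_values):
    """Build the config from keyword values, then call the factory with it."""
    cfg = config_class_for(factory)(**config_values)
    return factory(cfg)


model = build(Llama2, embed_dim=64)
```

Because the config type is read from the callable's own signature, the callable and its config cannot drift apart without the type hint changing too.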

YAML Includes

Haven supports including other YAML files using the !include directive.


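For example, given a file containing:

person: !include A.yaml

and A.yaml containing:

name: "test"

the loaded result is equivalent to:

person:
  name: "test"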

Use haven.set_include_base_dir to set the base directory used to resolve relative include paths.

More examples

See the examples directory in the source code for more complete examples.


Full documentation here.


This project is inspired by and borrows code from Pyrallis, SimpleParsing, draccus, and Hydra.