dagster_poc

This is a Dagster project scaffolded with the dagster project scaffold command.

It is a minimal proof of concept for a workflow I use often:

  • Workspaces are defined by a parameter
  • Functions load data, perform transformations and write output within the workspace
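As a rough illustration of that pattern, here is a plain-Python sketch that does not depend on Dagster. The file names input.csv and output.csv, the name/value columns, and every function name are hypothetical and chosen only for this example; in the project itself the steps are Dagster assets and the workspace is the parameter entered as config.

```python
import csv
from pathlib import Path


def load_data(workspace: Path) -> list[dict]:
    """Read rows from input.csv (hypothetical file name) inside the workspace."""
    with open(workspace / "input.csv", newline="") as f:
        return list(csv.DictReader(f))


def transform(rows: list[dict]) -> list[dict]:
    """Example transformation: double the numeric `value` column."""
    return [{"name": r["name"], "value": int(r["value"]) * 2} for r in rows]


def write_output(workspace: Path, rows: list[dict]) -> Path:
    """Write the transformed rows to output.csv (hypothetical) in the same workspace."""
    out_path = workspace / "output.csv"
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["name", "value"])
        writer.writeheader()
        writer.writerows(rows)
    return out_path


def run_workflow(workspace: str) -> Path:
    """Run load -> transform -> write, all scoped to one workspace directory."""
    ws = Path(workspace)
    return write_output(ws, transform(load_data(ws)))
```

Calling run_workflow("/path/to/workspace") reads and writes only inside that directory, so the same functions can be pointed at a different workspace by changing the one parameter.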

Note: to materialize assets, hold Ctrl and click Materialize. This opens a prompt where you can scaffold the missing config and enter the workspace parameter.

Getting started

First, install your Dagster repository as a Python package. With the --editable flag (-e for short), pip installs the repository in "editable mode", so local code changes apply automatically as you develop.

pip install -e ".[dev]"

Then, start the Dagit web server:

dagit

Open http://localhost:3000 with your browser to see the project.

You can start writing assets in dagster_poc/assets/. The assets are automatically loaded into the Dagster repository as you define them.

Development

Adding new Python dependencies

You can specify new Python dependencies in setup.py.

Unit testing

Tests are in the dagster_poc_tests directory and you can run tests using pytest:

pytest dagster_poc_tests

Schedules and sensors

If you want to enable Dagster Schedules or Sensors for your jobs, start the Dagster Daemon process in the same folder as your workspace.yaml file, but in a different shell or terminal.

The $DAGSTER_HOME environment variable must be set to a directory for the daemon to work. Note: using directories within /tmp may cause issues. See Dagster Instance default local behavior for more details.

dagster-daemon run
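Before launching the daemon, it can help to check $DAGSTER_HOME. The sketch below is a hypothetical helper, not part of Dagster: it reports whether the variable is unset, does not point to an existing directory, or sits inside /tmp. The absolute-path check is an assumption about what Dagster expects.

```python
import os
from pathlib import Path


def check_dagster_home(env=None) -> list[str]:
    """Return a list of human-readable problems with DAGSTER_HOME (empty if none found)."""
    env = os.environ if env is None else env
    home = env.get("DAGSTER_HOME")
    if not home:
        return ["DAGSTER_HOME is not set"]

    path = Path(home)
    problems = []
    # Assumption: Dagster wants an absolute path here.
    if not path.is_absolute():
        problems.append(f"{home} is not an absolute path")
    if not path.is_dir():
        problems.append(f"{home} is not an existing directory")
    # Directories within /tmp may cause issues.
    tmp = Path("/tmp")
    if path == tmp or tmp in path.parents:
        problems.append(f"{home} is inside /tmp")
    return problems


if __name__ == "__main__":
    for problem in check_dagster_home():
        print("warning:", problem)
```

An empty list means none of these checks found a problem; it does not guarantee the daemon will start.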

Once your Dagster Daemon is running, you can start turning on schedules and sensors for your jobs.

Deploy on Dagster Cloud

The easiest way to deploy your Dagster project is to use Dagster Cloud.

Check out the Dagster Cloud Documentation to learn more.