/sidetrek

Primary language: TypeScript. License: Apache License 2.0 (Apache-2.0).

What is Sidetrek?

Sidetrek is the fastest way to build an OSS modern data stack. It's an open-source CLI that helps you create a data project from scratch.

With Sidetrek, you can set up an end-to-end data pipeline locally in minutes.

Sidetrek is built on top of popular open-source tools like Dagster, Meltano, DBT, Minio, Apache Iceberg, Trino, and Superset. We're continuously adding new tools and use cases - if you'd like to see a specific tool added, please let us know by opening an issue on our GitHub repository.

Our roadmap includes not just data engineering tools, but also machine learning and data science tools for ML and AI use cases.

Why Sidetrek?

Data engineering is complex! There are so many tools out there and the list just keeps growing.

Keeping up with the latest tools and best practices is hard, and connecting the tools to each other can be just as tricky. As a data engineer, you have so many responsibilities already. You shouldn't have to spend time figuring out how to set up and connect these tools.

Sidetrek simplifies the process of creating a data project by providing a curated list of tools that work well together.

Prerequisites

Make sure you have the following installed on your machine:

  • Python version 3.10-3.11
  • Poetry
  • git CLI
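The checklist above can be verified with a small script. This is only a sketch: the helper names (`python_version_supported`, `check_command`, `check_prerequisites`) are hypothetical, and it assumes Python is available as `python3` on your PATH, which may not match your setup.

```shell
#!/bin/sh
# Sketch: check the prerequisites listed above. All helper names are hypothetical.

# Succeeds when a "major.minor[.patch]" version string is 3.10.x or 3.11.x.
python_version_supported() {
  case "$1" in
    3.10|3.10.*|3.11|3.11.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints "ok" if the named command is on PATH, otherwise "missing".
check_command() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok"
  else
    echo "missing"
  fi
}

# Reports on every prerequisite instead of stopping at the first problem.
# Assumption: the Python interpreter is installed under the name python3.
check_prerequisites() {
  for cmd in python3 poetry git; do
    echo "$cmd: $(check_command "$cmd")"
  done
  if command -v python3 >/dev/null 2>&1; then
    version=$(python3 -c 'import platform; print(platform.python_version())' 2>/dev/null) || version="unknown"
    if python_version_supported "$version"; then
      echo "python version $version: ok"
    else
      echo "python version $version: unsupported (need 3.10-3.11)"
    fi
  fi
}

check_prerequisites
```

If you manage Python versions with a tool such as pyenv, run the script in the same shell session you plan to use for Sidetrek so that it sees the same interpreter.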

Supported Platforms

Currently, we only support macOS and Linux.

Quick Start

If you're new to Sidetrek, the best place to start is our Get Started guide.

Download and Install

Download the latest release of Sidetrek CLI.

curl -fsSL https://sidetrek.com/cli | sh

Once you install it, verify the installation by checking the version:

sidetrek --version

Upgrade Sidetrek

To update Sidetrek to the latest version, you simply need to run the install command again:

curl -fsSL https://sidetrek.com/cli | sh

Uninstall Sidetrek

To remove Sidetrek, run:

rm -rf ~/.sidetrek

Initialize a Project

Initialize a new project by running this command:

sidetrek init

It will ask you to select the Python version, project name, and data stack. Currently, only one data stack is available. It is made up of the following open-source tools:

  • Dagster for orchestration
  • Meltano for data ingestion
  • DBT for data transformation
  • Minio and Apache Iceberg for data storage
  • Trino for querying your data
  • Superset for data visualization

After you press Enter, Sidetrek will start scaffolding your project. In a couple of minutes, you should see the success message "You're all set - enjoy building your new data project! 🚀".

Start the Project

Change the working directory to your project directory by running cd <your_project>.

Once you are in the project folder, run the following command:

sidetrek start

If you're running it for the first time, it will take a while to pull all the images, so please be patient!

Once it's up and running, you can see the Dagster UI here: http://localhost:3000.
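Because the first start can take a while, it may help to poll the UI instead of refreshing the browser. The sketch below is not part of the Sidetrek CLI: `retry_until` is a hypothetical helper that reruns any command until it succeeds, and the attempt count and delay in the example are arbitrary choices.

```shell
#!/bin/sh
# Sketch: poll until a command succeeds. retry_until is a hypothetical helper.
# Usage: retry_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Run the probe command quietly; stop as soon as it succeeds.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Sleep between attempts, but not after the final one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (left commented out so that sourcing this file has no side effects):
# wait roughly 10 minutes at most for the Dagster UI to answer on port 3000.
# retry_until 120 5 curl -fsS -o /dev/null http://localhost:3000 && echo "Dagster UI is up"
```

The probe is passed in as an ordinary command, so the same helper can be pointed at any other local URL once that service is expected to be running.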

Explore the Example Project

If you opted in to include an example project (recommended), you have a fully functional example data pipeline set up now.

You still have to complete a few steps, though, before you can see the example data visualized in Superset (http://localhost:8088):

  1. Run the Meltano ingestion job in Dagster to load the example data into Iceberg tables.
  2. Run the DBT transformations in Dagster.
  3. Add Trino as a database connection in Superset.
  4. Add an example dashboard in Superset.

Once you've completed the above steps, you should be able to see the Superset dashboard with charts!
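For step 3, Superset asks for a SQLAlchemy URI, and Trino URIs generally take the form trino://user@host:port/catalog. The sketch below only assembles such a string. `trino_uri` is a hypothetical helper, and every value passed to it is a placeholder: the actual user, host, port, and catalog for your scaffolded project are not given here, so take them from the Explore Example Guide or your project's configuration.

```shell
#!/bin/sh
# Sketch: assemble a Trino SQLAlchemy URI for Superset.
# trino_uri is a hypothetical helper; all argument values below are placeholders.
# Usage: trino_uri USER HOST PORT CATALOG
trino_uri() {
  if [ "$#" -ne 4 ]; then
    echo "usage: trino_uri USER HOST PORT CATALOG" >&2
    return 1
  fi
  printf 'trino://%s@%s:%s/%s\n' "$1" "$2" "$3" "$4"
}

# Placeholder values only; replace them with the ones from your project.
trino_uri example_user trino-host 8080 example_catalog
```

Paste the resulting string into the SQLAlchemy URI field when adding the database connection in Superset.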

For more information, please check out the Explore Example Guide.

For a full guide from installation to example data visualization, check out our Get Started guide.

Data Stack

Your data project will include the following open-source tools:

  • Dagster for orchestration
  • Meltano for data ingestion
  • DBT for data transformation
  • Minio and Apache Iceberg for data storage
  • Trino for querying your data
  • Superset for data visualization

We're working on adding more tools and use cases. If you have a suggestion, please feel free to reach out to us.

We'd love to learn more about your use case!

Connect with Us

Have questions? Talk to us!

If you have any questions, feel free to reach out to us on Slack or GitHub, or by email. We're here to help!

We're continuously improving our documentation so if you have a use case that we don't cover, please let us know and we'll do our best to create a good tutorial for it.

We'd love to learn more about your use case and help you get it working. So don't hesitate to reach out!

Project Maturity

Sidetrek is new and we will likely have breaking changes in the future.

If there are any breaking changes, we will make it clear in the release notes.

Right now Sidetrek is mostly a setup tool, so future changes should not affect an existing project scaffolded by Sidetrek. This may change as we add more advanced features. If it does, we will say so in the release notes.

Contributions

Contributions are always welcome!

We're also building a team here at Sidetrek and are always on the lookout for great contributors to join us.

Report an Issue

To report an issue, please open a new issue in Issues.

License

Sidetrek is Apache 2.0 licensed.