Sidetrek is the fastest way to build an OSS modern data stack. It's an open-source CLI that helps you create a data project from scratch.
With Sidetrek, you can set up an end-to-end data pipeline locally in minutes.
Sidetrek is built on top of popular open-source tools like Dagster, Meltano, DBT, Minio, Apache Iceberg, Trino, and Superset. We're continuously adding new tools and use cases - if you'd like to see a specific tool added, please let us know by opening an issue on our GitHub repository.
Our roadmap includes not just data engineering tools, but also machine learning and data science tools for ML and AI use cases.
Data engineering is complex! There are so many tools out there and the list just keeps growing.
Keeping up with the latest tools and best practices is hard, and connecting the tools to each other can be tricky. As a data engineer, you already have plenty of responsibilities. You shouldn't have to spend time figuring out how to set up and connect these tools.
Sidetrek simplifies the process of creating a data project by providing a curated list of tools that work well together.
Make sure you have the following installed on your machine:
- Python version 3.10-3.11
- Poetry
- git CLI
Currently we only support macOS and Linux.
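Before installing, it can help to confirm the prerequisites in one pass. The script below is a minimal sketch, not part of Sidetrek: the helper names (`have_cmd`, `python_version_ok`, `os_ok`) are hypothetical, and it assumes your Python interpreter is available as `python3` and that `uname -s` reports `Darwin` on macOS and `Linux` on Linux.

```shell
#!/bin/sh
# Pre-flight check for the prerequisites listed above.
# All helper names here are illustrative; they are not Sidetrek commands.

# Succeeds if the given command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Succeeds if a version string such as "3.10.14" falls in the 3.10-3.11 range.
python_version_ok() {
  case "$1" in
    3.10|3.10.*|3.11|3.11.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds if the kernel name from `uname -s` is macOS (Darwin) or Linux.
os_ok() {
  case "$1" in
    Darwin|Linux) return 0 ;;
    *) return 1 ;;
  esac
}

# Report which of the required tools are installed.
for tool in python3 poetry git; do
  if have_cmd "$tool"; then
    echo "found:   $tool"
  else
    echo "missing: $tool"
  fi
done

# Assumption: the interpreter you plan to use is the one named python3.
if have_cmd python3; then
  py_version=$(python3 -c 'import platform; print(platform.python_version())')
  if python_version_ok "$py_version"; then
    echo "Python $py_version is in the supported range"
  else
    echo "Python $py_version is outside 3.10-3.11"
  fi
fi

if os_ok "$(uname -s)"; then
  echo "OS is supported"
else
  echo "OS is not supported"
fi
```

The script only reports what it finds and never exits with an error, so it is safe to run repeatedly while you install the missing pieces.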
If you're new to Sidetrek, the best place to start is our Get Started guide.
Download the latest release of Sidetrek CLI.
curl -fsSL https://sidetrek.com/cli | sh
Once you install it, verify the installation by checking the version:
sidetrek --version
To update Sidetrek to the latest version, run the install command again:
curl -fsSL https://sidetrek.com/cli | sh
To remove Sidetrek, run:
rm -rf ~/.sidetrek
Initialize a new project by running this command:
sidetrek init
It will ask you to select the Python version, project name, and data stack. Currently, only one data stack is available, made up of the following open-source tools:
- Dagster for orchestration
- Meltano for data ingestion
- DBT for data transformation
- Minio and Apache Iceberg for data storage
- Trino for querying your data
- Superset for data visualization
After pressing Enter, Sidetrek will start scaffolding your project, and in a couple of minutes you should see the success message "You're all set - enjoy building your new data project! 🚀".
Change the working directory to your project directory by running `cd <your_project>`.
Once you are in the project folder, run the following command:
sidetrek start
If you're running it for the first time, it will take a while to pull all the images, so please be patient!
Once it's up and running, you can see the Dagster UI here: http://localhost:3000.
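Because the first start can take a while, a small polling loop can tell you when the UI is actually reachable. This is a minimal sketch, not a Sidetrek feature: `wait_for_url` is a hypothetical helper, the retry count and delay are arbitrary defaults, and it assumes `curl` is installed and that the service answers plain HTTP requests on the port shown above.

```shell
#!/bin/sh
# Poll a URL until it responds or the attempts run out.
# wait_for_url is an illustrative helper, not part of the Sidetrek CLI.
#   $1 = URL to poll
#   $2 = maximum number of attempts (default 30)
#   $3 = seconds to wait between attempts (default 10)
wait_for_url() {
  url=$1
  attempts=${2:-30}
  delay=${3:-10}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f treats HTTP errors (4xx/5xx) as failures; --max-time caps each request.
    if curl -fsS -o /dev/null --max-time 5 "$url" 2>/dev/null; then
      echo "up: $url"
      return 0
    fi
    # Wait before retrying, but not after the final attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "not reachable: $url"
  return 1
}
```

After defining the function, `wait_for_url http://localhost:3000` waits for the Dagster UI. The same call should work for any other service URL in the stack, such as Superset at http://localhost:8088.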
If you opted to include an example project (recommended), you now have a fully functional example data pipeline set up.
To see the example data visualized in Superset (http://localhost:8088), you still need to complete a few steps:
- Run the Meltano ingestion job in Dagster to load the example data into Iceberg tables.
- Run the DBT transformations in Dagster.
- Add Trino as a database connection in Superset.
- Add an example dashboard in Superset.
Once you've completed the above steps, you should be able to see the Superset dashboard with charts!
For more information, please check out the Explore Example Guide.
For a full guide from installation to example data visualization, check out our Get Started guide.
Your data project will include the following open-source tools:
- Dagster for orchestration
- Meltano for data ingestion
- DBT for data transformation
- Minio and Apache Iceberg for data storage
- Trino for querying your data
- Superset for data visualization
We're working on adding more tools and use cases. If you have a suggestion, please feel free to reach out to us.
We'd love to learn more about your use case!
- 🌟 Star us on GitHub
- 📚 Read our documentation
- 🔭 Follow us on LinkedIn
- 👋 Join us on Slack
- ✏️ Start a GitHub Discussion
- ✉️ Contact us via email
If you have any questions, feel free to reach out to us on Slack, GitHub, or email. We're here to help!
We're continuously improving our documentation, so if you have a use case that we don't cover, please let us know and we'll do our best to create a good tutorial for it.
We'd love to learn more about your use case and help you get it working. So don't hesitate to reach out!
Sidetrek is new and we will likely have breaking changes in the future.
If there are any breaking changes, we will make it clear in the release notes.
Right now Sidetrek is mostly a setup tool, so future changes should not affect existing projects it has scaffolded. This may change as we add more advanced features. If it does, we will let you know in the release notes.
Contributions are always welcome!
We're also building a team here at Sidetrek and are always on the lookout for great contributors to join us.
To report an issue, please open a new issue in Issues.
Sidetrek is Apache 2.0 licensed.