lo's Stars
microsoft/autogen
A programming framework for agentic AI 🤖 | PyPI: autogen-agentchat | Discord: https://aka.ms/autogen-discord | Office Hour: https://aka.ms/autogen-officehour
crewAIInc/crewAI
Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
DataExpert-io/data-engineer-handbook
This is a repo with links to everything you'd ever want to learn about data engineering
exo-explore/exo
Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
kestra-io/kestra
⚡ Workflow Automation Platform. Orchestrate & Schedule code in any language, run anywhere, 500+ plugins. Alternative to Zapier, Rundeck, Camunda, Airflow...
Melkeydev/go-blueprint
Go-blueprint allows users to spin up a quick Go project using a popular framework
flyteorg/flyte
Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
ibis-project/ibis
The portable Python dataframe library
treeverse/lakeFS
lakeFS - Data version control for your data lake | Git for data
pyinfra-dev/pyinfra
pyinfra turns Python code into shell commands and runs them on your servers. Execute ad-hoc commands and write declarative operations. Target SSH servers, local machine and Docker containers. Fast and scales from one server to thousands.
nalgeon/redka
Redis re-implemented with SQLite
datafold/data-diff
Compare tables within or across databases
duckdb/pg_duckdb
DuckDB-powered Postgres for high performance apps & analytics.
rilldata/rill
Rill is a tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL. BI-as-code.
skyplane-project/skyplane
🔥 Blazing fast bulk data transfers between any cloud 🔥
latitude-dev/latitude
Developer-first embedded analytics
unytics/bigfunctions
Supercharge BigQuery with BigFunctions
slingdata-io/sling-cli
Sling is a CLI tool that extracts data from a source storage/database and loads it in a target storage/database.
alajmo/mani
🤖 CLI tool to help you manage repositories
elementary-data/dbt-data-reliability
dbt package that is part of Elementary, the dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
databrickslabs/dbldatagen
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for tests, POCs, and other uses in Databricks environments, including in Delta Live Tables pipelines.
Bl3f/yato
The smallest DuckDB SQL orchestrator on Earth.
delta-io/delta-examples
Delta Lake examples
silverton-io/buz
Serverless multi-protocol + multi-destination event collection system.
mattf96s/QuackDB
Open-source in-browser DuckDB SQL editor
borjavb/dbt-iceberg-poc
alfredodeza/rust-etl
Practice ETL with Rust and Polars
treeverse/lakeview
lakeview is a visibility tool for S3 based data lakes
Simplifi-ED/cloudcost
Go CLI to check cloud service prices (Azure for now) on the fly
hellomikelo/databricks-n8n
n8n hosted as Databricks Apps