yogyang's Stars
apache/airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
pola-rs/polars
Dataframes powered by a multithreaded, vectorized query engine, written in Rust
fastapi/sqlmodel
SQL databases in Python, designed for simplicity, compatibility, and robustness.
apache/dolphinscheduler
Apache DolphinScheduler is a modern data orchestration platform for quickly building high-performance workflows with low code.
datahub-project/datahub
The Metadata Platform for your Data Stack
ritz078/transform
A polyglot web converter.
yhatt/marp
The site of the classic Markdown presentation writer app
joelittlejohn/jsonschema2pojo
Generate Java types from JSON or JSON Schema and annotate those types for data-binding with Jackson, Gson, etc.
amundsen-io/amundsen
Amundsen is a metadata-driven application for improving the productivity of data analysts, data scientists, and engineers when interacting with data.
lancedb/lance
Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.
AutoMQ/automq
AutoMQ is a cloud-first alternative to Kafka that offloads durability to S3 and EBS. 10x more cost-effective. No cross-AZ traffic cost. Autoscales in seconds. Single-digit ms latency.
pyeve/cerberus
Lightweight, extensible data validation library for Python
koxudaxi/datamodel-code-generator
Pydantic model and dataclasses.dataclass generator for easy conversion of JSON, OpenAPI, JSON Schema, and YAML data sources.
BlazingDB/blazingsql
BlazingSQL is a lightweight, GPU-accelerated SQL engine for Python. Built on RAPIDS cuDF.
lampepfl/progfun-wiki
MarquezProject/marquez
Collect, aggregate, and visualize a data ecosystem's metadata
teamclairvoyant/airflow-maintenance-dags
A series of DAGs/Workflows to help maintain the operation of Airflow
astronomer/dag-factory
Dynamically generate Apache Airflow DAGs from YAML configuration files
scalapy/scalapy
Use the world of Python from the comfort of Scala!
cwacek/python-jsonschema-objects
Automatic Python binding generation from JSON Schemas
etsy/boundary-layer
Builds Airflow DAGs from configuration files. Powers all DAGs on the Etsy Data Platform
pipeline-tools/gusty
Making DAG construction easier
takidau/streamingbook
Code snippets from the Streaming Systems book (streamingbook.net).
AutoMQ/automq-for-rocketmq
A cloud-native implementation of Apache RocketMQ 5.0
MIT-DB-Class/course-info-2018
Course info for 6.814/6.830 Fall 2018
cornerpocket407/MIT-6.830-SimpleDB
FebruaryBreeze/json-schema-to-class
Convert JSON Schema into Python Class
pipeline-tools/gusty-demo
A containerized demo of Airflow using gusty
cchandurkar/json-schema-to-case-class
A library that converts JSON Schema to Scala Case Classes
pipeline-tools/gusty-demo-lite
The smallest containerized gusty demo possible