branky's Stars
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
commaai/openpilot
openpilot is an operating system for robotics. Currently, it upgrades the driver assistance system on 275+ supported cars.
ray-project/ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
fastai/fastai
The fastai deep learning library
cube-js/cube
📊 Cube — Universal semantic layer platform for AI, BI, spreadsheets, and embedded analytics
joke2k/faker
Faker is a Python package that generates fake data for you.
pallets/jinja
A very fast and expressive template engine.
sqlfluff/sqlfluff
A modular SQL linter and auto-formatter with support for multiple dialects and templated code.
graphql-kit/graphql-voyager
🛰️ Represent any GraphQL API as an interactive graph
tobymao/sqlglot
Python SQL Parser and Transpiler
pachyderm/pachyderm
Data-Centric Pipelines and Data Versioning
apache/hudi
Upserts, Deletes And Incremental Processing on Big Data.
pmd/pmd
An extensible multilanguage static code analyzer.
amundsen-io/amundsen
Amundsen is a metadata-driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.
awslabs/deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
camelot-dev/excalibur
A web interface to extract tabular data from PDFs
antvis/AVA
🤖 A framework for automated visual analytics.
ververica/flink-sql-cookbook
The Apache Flink SQL Cookbook is a curated collection of examples, patterns, and use cases of Apache Flink SQL. Many of the recipes are completely self-contained and can be run in Ververica Platform as is.
lw-lin/streaming-readings
Papers and reading material on streaming systems
LucaCanali/sparkMeasure
This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spark jobs. It focuses on easing the collection and examination of Spark metrics, making it a practical choice for both developers and data engineers.
mrparkers/terraform-provider-keycloak
Terraform provider for Keycloak
yahoo/streaming-benchmarks
Benchmarks for Low Latency (Streaming) solutions including Apache Storm, Apache Spark, Apache Flink, ...
knime/knime-core
KNIME Analytics Platform
dbt-labs/dbt-spark
dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks
logpai/Log3C
Log-based impactful problem identification using machine learning [FSE'18]
EvanZhouDev/TheDonutProject
Making donut.c in every language.
snowplow/scala-maxmind-iplookups
Scala client for MaxMind Geo-IP
dataApps/chlorine-finder
A Java Library to detect and mask sensitive data
anthonny/kit-keycloak-theme
A simple kit to edit your Keycloak theme
AgeOfLearning/dbt-unit-test
A tiny framework for testing reusable code inside of dbt models