brightwon's Stars
donnemartin/system-design-primer
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
kubernetes/kubernetes
Production-Grade Container Scheduling and Management
ray-project/ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
eugeneyan/applied-ml
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
apache/flink
Apache Flink
duckdb/duckdb
DuckDB is an analytical in-process SQL database management system
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
paperless-ngx/paperless-ngx
A community-supported supercharged version of paperless: scan, index and archive all your physical documents
onnx/onnx
Open standard for machine learning interoperability
prestodb/presto
The official home of the Presto distributed SQL query engine for big data
microsoft/onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
scala/scala
Scala 2 compiler and standard library. Scala 2 bugs at https://github.com/scala/bug; Scala 3 at https://github.com/scala/scala3
scylladb/scylladb
NoSQL data store using the seastar framework, compatible with Apache Cassandra
visenger/awesome-mlops
A curated list of references for MLOps
datastacktv/data-engineer-roadmap
Roadmap to becoming a data engineer in 2021
provectus/kafka-ui
Open-Source Web UI for Apache Kafka Management
triton-inference-server/server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
apache/beam
Apache Beam is a unified programming model for Batch and Streaming data processing.
apache/zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
apache/flink-cdc
Flink CDC is a streaming data integration tool
feast-dev/feast
The Open Source Feature Store for Machine Learning
apache/hudi
Upserts, Deletes And Incremental Processing on Big Data.
kubeflow/spark-operator
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
mercari/ml-system-design-pattern
System design patterns for machine learning
japila-books/apache-spark-internals
The Internals of Apache Spark
alanleedev/KoreaSecurityApps
(Unofficial) Korean translation of Wladimir Palant's series of writings on vulnerabilities and issues around Korean security apps.
approximatelabs/datadm
DataDM is your private data assistant. Slide into your data's DMs
apache/hbase-connectors
Apache HBase Connectors
AbsaOSS/ABRiS
Avro SerDe for Apache Spark structured APIs.
cherrypy/cheroot
Cheroot is the high-performance, pure-Python HTTP server used by CherryPy.