brightwon

brightwon's Stars

donnemartin/system-design-primer
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
Language: Python 273k 6.5k 32046k
kubernetes/kubernetes
Production-Grade Container Scheduling and Management
Language: Go 111k 3.2k 46k39.5k
ray-project/ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Language: Python 33.5k 473 18.6k5.7k
eugeneyan/applied-ml
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
27.3k 949 243.7k
apache/flink
Apache Flink
Language: Java 24k 946 013.3k
duckdb/duckdb
DuckDB is an analytical in-process SQL database management system
Language: C++ 23.6k 205 5.1k1.9k
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
Language: Python 20.8k 202 3812.1k
paperless-ngx/paperless-ngx
A community-supported supercharged version of paperless: scan, index and archive all your physical documents
Language: Python 19.9k 109 1.6k1.1k
onnx/onnx
Open standard for machine learning interoperability
Language: Python 17.8k 438 2.8k3.7k
prestodb/presto
The official home of the Presto distributed SQL query engine for big data
Language: Java 16k 856 6.6k5.4k
microsoft/onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
Language: C++ 14.5k 245 6.6k2.9k
scala/scala
Scala 2 compiler and standard library. Scala 2 bugs at https://github.com/scala/bug; Scala 3 at https://github.com/scala/scala3
Language: Scala 14.3k 718 03.1k
scylladb/scylladb
NoSQL data store using the seastar framework, compatible with Apache Cassandra
Language: C++ 13.5k 337 12.8k1.3k
visenger/awesome-mlops
A curated list of references for MLOps
12.6k 396 151.9k
datastacktv/data-engineer-roadmap
Roadmap to becoming a data engineer in 2021
12.4k 543 631.3k
provectus/kafka-ui
Open-Source Web UI for Apache Kafka Management
Language: Java 9.7k 67 1.7k1.2k
triton-inference-server/server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
Language: Python 8.2k 144 3.8k1.5k
apache/beam
Apache Beam is a unified programming model for Batch and Streaming data processing.
Language: Java 7.8k 258 7.1k4.2k
apache/zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Language: Java 6.4k 312 02.8k
apache/flink-cdc
Flink CDC is a streaming data integration tool
Language: Java 5.7k 135 1.7k1.9k
feast-dev/feast
The Open Source Feature Store for Machine Learning
Language: Python 5.6k 76 1.4k993
apache/hudi
Upserts, Deletes And Incremental Processing on Big Data.
Language: Java 5.4k 1.2k 3.2k2.4k
kubeflow/spark-operator
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Language: Go 2.8k 83 1.2k1.4k
mercari/ml-system-design-pattern
System design patterns for machine learning
2.3k 73 14239
japila-books/apache-spark-internals
The Internals of Apache Spark
1.5k 134 17453
alanleedev/KoreaSecurityApps
(Unofficial) Korean translation of Wladimir Palant's series of writings on vulnerabilities and issues around Korean security apps.
450 17 2941
approximatelabs/datadm
DataDM is your private data assistant. Slide into your data's DMs
Language: Python 379 8 828
apache/hbase-connectors
Apache HBase Connectors
Language: Scala 235 51 0177
AbsaOSS/ABRiS
Avro SerDe for Apache Spark structured APIs.
Language: Scala 229 17 18475
cherrypy/cheroot
Cheroot is the high-performance, pure-Python HTTP server used by CherryPy.
Language: Python 185 16 20790