TongWei1105's Stars
donnemartin/system-design-primer
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
supabase/supabase
The open source Firebase alternative. Supabase gives you a dedicated Postgres database to build your web, mobile, and AI applications.
nomic-ai/gpt4all
GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
dair-ai/Prompt-Engineering-Guide
🐙 Guides, papers, lectures, notebooks, and resources for prompt engineering
skylot/jadx
Dex to Java decompiler
QuivrHQ/quivr
Open-source RAG framework for building GenAI Second Brains 🧠 Build a productivity assistant (RAG) ⚡️🤖 Chat with your docs (PDF, CSV, ...) and apps using Langchain, GPT 3.5 / 4 turbo, Private, Anthropic, VertexAI, Ollama, LLMs, Groq that you can share with users! Efficient retrieval augmented generation framework
ben-manes/caffeine
A high performance caching library for Java
redpanda-data/redpanda
Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
easychen/lean-side-bussiness
Lean Side Business: how programmers can run a side business gracefully
open-metadata/OpenMetadata
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.
apache/fury
A blazingly fast multi-language serialization framework powered by JIT and zero-copy.
unitycatalog/unitycatalog
Open, Multi-modal Catalog for Data & AI
awesome-spark/awesome-spark
A curated list of awesome Apache Spark packages and resources.
timeplus-io/proton
A streaming SQL engine, a fast and lightweight alternative to ksqlDB and Apache Flink, 🚀 powered by ClickHouse.
apache/carbondata
High performance data store solution
apache/incubator-gluten
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
kwai/blaze
Blazing-fast query execution engine that speaks the Apache Spark language and has Arrow-DataFusion at its core.
apache/gravitino
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
NVIDIA/spark-rapids
Spark RAPIDS plugin - accelerate Apache Spark with GPUs
japila-books/spark-sql-internals
The Internals of Spark SQL
not-poma/lazyshell
GPT powered Zsh completion script
linkedin/spark-tfrecord
Read and write Tensorflow TFRecord data from Apache Spark.
melin/superior-sql-parser
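An antlr4-based SQL parser for multiple databases that extracts metadata from SQL and can be used in many data platform scenarios: extracting metadata from DDL statements, SQL permission checks, table-level lineage, SQL syntax validation, and more. Supports Spark, Flink, Gauss, StarRocks, Oracle, MySQL, PostgreSQL, SQL Server, DB2, and others.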
simdjson/simdjson-java
A Java version of simdjson, a high-performance JSON parser utilizing SIMD instructions
adoptium/temurin17-binaries
Temurin 17 binaries
trinodb/trino-gateway
streamnative/pulsar-spark
Spark Connector to read and write with Pulsar
target/data-validator
A tool to validate data, built around Apache Spark.
CoxAutomotiveDataSolutions/spark-distcp
A re-implementation of Hadoop DistCP in Apache Spark
Carleslc/Gantt
Online Gantt chart for better planning.