tcodehuber

The PPMC member of @apache Amoro, focusing on big data.

JD.COM, Chengdu, China

Pinned Repositories

amoro
Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
Language: Java
amoro-site
Documentation site for project Amoro
Language: SCSS
Chat2DB
🔥🔥🔥 AI-driven data management platform. Over 1 million developers are using Chat2DB.
Language: Java
ChatTTS_colab
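🚀 One-click deployment (including an offline all-in-one package)! Built on ChatTTS, with support for random voice-timbre draws, long-audio generation, and multi-role narration. Simple to use, with no complicated installation required.
Language: Python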
cube-studio
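cube studio is an open-source, cloud-native, one-stop machine learning / deep learning / large-model AI platform. It supports SSO login, multi-tenancy, integration with big data platforms, online notebook development, drag-and-drop pipeline orchestration of task flows, multi-node multi-GPU distributed training, hyperparameter search, VGPU inference serving, edge computing, serverless, an annotation platform, automated annotation, dataset management, large-model fine-tuning, vllm large-model inference, llmops, private knowledge bases, and an AI model app store. It also supports one-click model development, inference, and fine-tuning, domestic (Chinese) cpu/gpu/npu chips, RDMA, and distributed computing with pytorch/tf/mxnet/deepspeed/paddle/colossalai/horovod/spark/ray/volcano.
Language: Jupyter Notebook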
iceberg
Apache Iceberg
Language: Java
paimon
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
Language: Java
seatunnel
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
Language: Java
spark
Apache Spark - A unified analytics engine for large-scale data processing
Language: Scala
starrocks
StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. InfoWorld’s 2023 BOSSIE Award for best open source software.
Language: Java

tcodehuber's Repositories

tcodehuber/amoro
Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
Language: Java
tcodehuber/amoro-site
Documentation site for project Amoro
Language: SCSS
tcodehuber/Chat2DB
🔥🔥🔥 AI-driven data management platform. Over 1 million developers are using Chat2DB.
Language: Java
tcodehuber/ChatTTS_colab
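🚀 One-click deployment (including an offline all-in-one package)! Built on ChatTTS, with support for random voice-timbre draws, long-audio generation, and multi-role narration. Simple to use, with no complicated installation required.
Language: Python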
tcodehuber/cube-studio
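cube studio is an open-source, cloud-native, one-stop machine learning / deep learning / large-model AI platform. It supports SSO login, multi-tenancy, integration with big data platforms, online notebook development, drag-and-drop pipeline orchestration of task flows, multi-node multi-GPU distributed training, hyperparameter search, VGPU inference serving, edge computing, serverless, an annotation platform, automated annotation, dataset management, large-model fine-tuning, vllm large-model inference, llmops, private knowledge bases, and an AI model app store. It also supports one-click model development, inference, and fine-tuning, domestic (Chinese) cpu/gpu/npu chips, RDMA, and distributed computing with pytorch/tf/mxnet/deepspeed/paddle/colossalai/horovod/spark/ray/volcano.
Language: Jupyter Notebook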
tcodehuber/iceberg
Apache Iceberg
Language: Java
tcodehuber/paimon
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
Language: Java
tcodehuber/seatunnel
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
Language: Java
tcodehuber/spark
Apache Spark - A unified analytics engine for large-scale data processing
Language: Scala
tcodehuber/starrocks
StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. InfoWorld’s 2023 BOSSIE Award for best open source software.
Language: Java
tcodehuber/debezium
Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.
Language: Java
tcodehuber/dolphinscheduler
Apache DolphinScheduler is the modern data orchestration platform. Agile to create high performance workflow with low-code
Language: Java
tcodehuber/flink
Apache Flink
Language: Java
tcodehuber/flink-cdc-connectors
CDC Connectors for Apache Flink®
Language: Java
tcodehuber/flink-on-k8s-operator
Kubernetes operator for managing the lifecycle of Apache Flink and Beam applications.
tcodehuber/gravitino
The world's most powerful data catalog service, providing a high-performance, geo-distributed, and federated metadata lake.
Language: Java
tcodehuber/hadoop
Apache Hadoop
Language: Java
tcodehuber/homebrew-thrift
Public Homebrew for Thrift 0.13
Language: Ruby
tcodehuber/incubator-celeborn
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
Language: Java
tcodehuber/juicefs
JuiceFS is a distributed POSIX file system built on top of Redis and S3.
Language: Go
tcodehuber/kyuubi
Apache Kyuubi is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark
Language: Scala
tcodehuber/open-r1
Fully open reproduction of DeepSeek-R1
Language: Python
tcodehuber/OpenMetadata
OpenMetadata is a unified platform for discovery, observability, and governance powered by a central metadata repository, in-depth lineage, and seamless team collaboration.
Language: TypeScript
tcodehuber/ranger
Mirror of Apache Ranger
Language: Java
tcodehuber/risingwave
Scalable Postgres for stream processing, analytics, and management. KsqlDB and Apache Flink alternative. 🚀 10x more productive. 🚀 10x more cost-efficient.
Language: Rust
tcodehuber/spark-on-k8s-operator
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Language: Go
tcodehuber/supersonic
SuperSonic is the next-generation BI platform that integrates Chat BI (powered by LLM) and Headless BI (powered by semantic layer) paradigms.
Language: Java
tcodehuber/unitycatalog
Open, Multi-modal Catalog for Data & AI
Language: Java
tcodehuber/volcano
A Cloud Native Batch System (Project under CNCF)
Language: Go
tcodehuber/WrenAI
WrenAI makes your database RAG-ready. Implement Text-to-SQL more accurately and securely.
Language: TypeScript