Pinned Repositories
amoro
Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
amoro-site
Documentation site for project Amoro
Chat2DB
🔥🔥🔥 AI-driven data management platform. Over 1 million developers are using Chat2DB.
ChatTTS_colab
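🚀 One-click deployment (offline all-in-one package included)! Built on ChatTTS, with support for random voice-timbre draws, long-form audio generation, and multi-role read-aloud. Simple to use, with no complicated installation required.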
cube-studio
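cube studio is an open-source, cloud-native, one-stop machine learning / deep learning / large-model AI platform. It supports SSO login, multi-tenancy, big data platform integration, online notebook development, drag-and-drop task-flow pipeline orchestration, multi-node multi-GPU distributed training, hyperparameter search, VGPU inference serving, edge computing, serverless, an annotation platform, automated annotation, dataset management, large-model fine-tuning, vllm large-model inference, llmops, private knowledge bases, and an AI model app store. It also supports one-click model development, inference, and fine-tuning; Chinese domestic cpu/gpu/npu chips; RDMA; and distributed pytorch/tf/mxnet/deepspeed/paddle/colossalai/horovod/spark/ray/volcano.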
iceberg
Apache Iceberg
paimon
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
seatunnel
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
spark
Apache Spark - A unified analytics engine for large-scale data processing
starrocks
StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. Winner of InfoWorld’s 2023 BOSSIE Award for best open source software.
tcodehuber's Repositories
tcodehuber/amoro
Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
tcodehuber/amoro-site
Documentation site for project Amoro
tcodehuber/Chat2DB
🔥🔥🔥 AI-driven data management platform. Over 1 million developers are using Chat2DB.
tcodehuber/ChatTTS_colab
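🚀 One-click deployment (offline all-in-one package included)! Built on ChatTTS, with support for random voice-timbre draws, long-form audio generation, and multi-role read-aloud. Simple to use, with no complicated installation required.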
tcodehuber/cube-studio
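cube studio is an open-source, cloud-native, one-stop machine learning / deep learning / large-model AI platform. It supports SSO login, multi-tenancy, big data platform integration, online notebook development, drag-and-drop task-flow pipeline orchestration, multi-node multi-GPU distributed training, hyperparameter search, VGPU inference serving, edge computing, serverless, an annotation platform, automated annotation, dataset management, large-model fine-tuning, vllm large-model inference, llmops, private knowledge bases, and an AI model app store. It also supports one-click model development, inference, and fine-tuning; Chinese domestic cpu/gpu/npu chips; RDMA; and distributed pytorch/tf/mxnet/deepspeed/paddle/colossalai/horovod/spark/ray/volcano.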
tcodehuber/iceberg
Apache Iceberg
tcodehuber/paimon
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
tcodehuber/seatunnel
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
tcodehuber/spark
Apache Spark - A unified analytics engine for large-scale data processing
tcodehuber/starrocks
StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. Winner of InfoWorld’s 2023 BOSSIE Award for best open source software.
tcodehuber/debezium
Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.
tcodehuber/dolphinscheduler
Apache DolphinScheduler is a modern data orchestration platform for agile, low-code creation of high-performance workflows.
tcodehuber/flink
Apache Flink
tcodehuber/flink-cdc-connectors
CDC Connectors for Apache Flink®
tcodehuber/flink-kubernetes-operator
Apache Flink Kubernetes Operator
tcodehuber/gravitino
World's most powerful data catalog service, providing a high-performance, geo-distributed, and federated metadata lake.
tcodehuber/hadoop
Apache Hadoop
tcodehuber/hive
Apache Hive
tcodehuber/homebrew-thrift
Public Homebrew for Thrift 0.13
tcodehuber/incubator-celeborn
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
tcodehuber/juicefs
JuiceFS is a distributed POSIX file system built on top of Redis and S3.
tcodehuber/kyuubi
Apache Kyuubi is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark
tcodehuber/OpenMetadata
OpenMetadata is a unified platform for discovery, observability, and governance powered by a central metadata repository, in-depth lineage, and seamless team collaboration.
tcodehuber/ranger
Mirror of Apache Ranger
tcodehuber/risingwave
Scalable Postgres for stream processing, analytics, and management. KsqlDB and Apache Flink alternative. 🚀 10x more productive. 🚀 10x more cost-efficient.
tcodehuber/spark-on-k8s-operator
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
tcodehuber/supersonic
SuperSonic is the next-generation BI platform that integrates Chat BI (powered by LLM) and Headless BI (powered by semantic layer) paradigms.
tcodehuber/unitycatalog
Open, Multi-modal Catalog for Data & AI
tcodehuber/volcano
A Cloud Native Batch System (Project under CNCF)
tcodehuber/WrenAI
WrenAI makes your database RAG-ready. Implement Text-to-SQL more accurately and securely.