Pinned Repositories
airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
alluxio
Alluxio, formerly Tachyon: Unify Data at Memory Speed
ApacheQuiz
A multiple choice quiz of Apache Software Foundation policy
Chronicle-Map
Replicate your Key Value Store across your network, with consistency, persistence and performance.
ckman
A tool for managing and monitoring ClickHouse databases
ClickHouse
ClickHouse is a free analytics DBMS for big data
datafuse
An elastic and scalable cloud warehouse that offers blazing-fast queries and combines the elasticity, simplicity, and low cost of the cloud, built to make the Data Cloud easy
emr-bootstrap-alluxio
alluxio - emr bootstrap action scripts
hdinsight-scriptaction-alluxio
kylin
Apache Kylin
shaofengshi's Repositories
shaofengshi/emr-bootstrap-alluxio
alluxio - emr bootstrap action scripts
shaofengshi/hdinsight-scriptaction-alluxio
shaofengshi/kylin
Apache Kylin
shaofengshi/airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
shaofengshi/ApacheQuiz
A multiple choice quiz of Apache Software Foundation policy
shaofengshi/Chronicle-Map
Replicate your Key Value Store across your network, with consistency, persistence and performance.
shaofengshi/ckman
A tool for managing and monitoring ClickHouse databases
shaofengshi/ClickHouse
ClickHouse is a free analytics DBMS for big data
shaofengshi/datafuse
An elastic and scalable cloud warehouse that offers blazing-fast queries and combines the elasticity, simplicity, and low cost of the cloud, built to make the Data Cloud easy
shaofengshi/delta
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
shaofengshi/druid
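A database connection pool built for monitoring, produced by Alibaba's Database Division. The connection pools of Alibaba Cloud Data Lake Analytics (https://www.aliyun.com/product/datalakeanalytics), DRDS, and TDDL are powered by Druid.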
shaofengshi/incubator
Apache Incubator Website
shaofengshi/incubator-hudi
Upserts And Incremental Processing on Big Data
shaofengshi/moonbox
Moonbox is a DVtaaS (Data Virtualization as a Service) Platform
shaofengshi/parquet-format
Apache Parquet
shaofengshi/parquet-mr
Apache Parquet
shaofengshi/presto
Official home of the community managed version of Presto, the distributed SQL query engine for big data, under the auspices of the Presto Software Foundation.
shaofengshi/Quicksql
Simpler, Safer, Faster Unified SQL Analytics Engine for Multi-Datasources
shaofengshi/RemoteShuffleService
Remote shuffle service for Apache Spark to store shuffle data on remote servers.
shaofengshi/SparkCube
SparkCube is an open-source project for extremely fast OLAP data analysis. SparkCube is an extension of Apache Spark.
shaofengshi/tsunami-security-scanner-plugins
This project aims to provide a central repository for many useful Tsunami Security Scanner plugins.
shaofengshi/apachecon-acasia
Draft page for acah2021 conference
shaofengshi/Chat2DB
🔥 🔥 🔥 An intelligent and versatile general-purpose SQL client and reporting tool for databases which integrates ChatGPT capabilities.
shaofengshi/designing-data-intensive-applications
Designing Data-Intensive Applications by Martin Kleppmann
shaofengshi/gravitino
World's most powerful data catalog service, providing a high-performance, geo-distributed and federated metadata lake.
shaofengshi/gravitino-playground
A playground to experience Gravitino
shaofengshi/gravitino-site
Apache Gravitino
shaofengshi/hsd-cipher-sm
Chinese national cryptographic algorithms SM2, SM3, and SM4
shaofengshi/Polycat
Polycat is a cutting-edge cloud-native metastore system, purpose-built to cater to the demands of modern data management in lakehouse deployments. It offers a comprehensive solution for organizations that need to manage metadata from multiple data sources across different clouds, all in one unified platform.
shaofengshi/unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.