Pinned Repositories
airflow
Apache Airflow
architect-awesome
Backend architect technology map
AthenaX
SQL-based streaming analytics platform at scale
atlas
Apache Atlas
autolabel
Label, clean and enrich text datasets with LLMs.
beam
Apache Beam
canal
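Alibaba's incremental subscription and consumption component for MySQL binlog. Alibaba Cloud DRDS ( https://www.aliyun.com/product/drds ), Alibaba TDDL secondary indexes, and small-table replication are powered by canal.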
flinkStreamSQL
Extends the real-time SQL of open-source Flink; mainly implements joins between streams and dimension tables, and supports all native Flink SQL syntax
hbase
Mirror of Apache HBase
librec
LibRec: A Leading Java Library for Recommender Systems, see
BobbySun's Repositories
BobbySun/autolabel
Label, clean and enrich text datasets with LLMs.
BobbySun/flinkStreamSQL
Extends the real-time SQL of open-source Flink; mainly implements joins between streams and dimension tables, and supports all native Flink SQL syntax
BobbySun/airflow
Apache Airflow
BobbySun/atlas
Apache Atlas
BobbySun/beam
Apache Beam
BobbySun/ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model
BobbySun/DataLink
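DataLink is a distributed, scalable data exchange platform that supports real-time incremental synchronization and offline full synchronization between heterogeneous data sources.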
BobbySun/DataSphereStudio
DataSphereStudio is a one-stop data application development and management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
BobbySun/delta-architecture
Streaming data changes to a Data Lake with Debezium and Delta Lake pipeline
BobbySun/dr-elephant
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
BobbySun/FATE
An Industrial Grade Federated Learning Framework
BobbySun/fes.js
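Fes.js is a front-end solution for admin and back-office applications. It provides command-line tools for project scaffolding, development and debugging, API mocking, and build and packaging. It has built-in modules for layout, permissions, data dictionaries, state management, storage, APIs, and more. With a convention-based, configurable, component-based design, users only need to focus on building page content with components. It is based on Vue.js and easy to pick up. Refined across multiple projects, it is approaching stability.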
BobbySun/flink-cdc-connectors
Change Data Capture (CDC) Connectors for Apache Flink
BobbySun/flinkx
A distributed data synchronization tool based on Flink
BobbySun/free-programming-books-zh_CN
:books: Free Chinese-language computer programming books; contributions welcome
BobbySun/GitDataV
A GitHub data visualization platform built with the Vue framework
BobbySun/God-Of-BigData
Focused on big data learning and interviews; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...
BobbySun/hudi
Upserts, Deletes And Incremental Processing on Big Data.
BobbySun/iceberg
Apache Iceberg
BobbySun/incubator-inlong
Apache InLong
BobbySun/incubator-superset
Apache Superset (incubating) is a modern, enterprise-ready business intelligence web application
BobbySun/Linkis
Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java ...), with multi-tenancy, high performance, and resource control.
BobbySun/NNAnalytics
NameNodeAnalytics is a self-help utility for scouting and maintaining the namespace of an HDFS instance.
BobbySun/Qualitis
Qualitis is a one-stop data quality management platform that supports quality verification, notification, and management for various data sources. It is used to solve various data quality problems caused by data processing. https://github.com/WeBankFinTech/Qualitis
BobbySun/Quicksql
A Flexible, Fast, Federated (3F) SQL Analysis Middleware for Multiple Data Sources
BobbySun/scio
A Scala API for Apache Beam and Google Cloud Dataflow.
BobbySun/Scriptis
Scriptis is for interactive data analysis with script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDF, function, resource management and intelligent diagnosis.
BobbySun/shuzeCloud
A leading data middle-platform development platform in China
BobbySun/snowplow
Cloud-native web, mobile and event analytics, running on AWS and GCP
BobbySun/wormhole
Wormhole is a SPaaS (Stream Processing as a Service) Platform