shellyhh's Stars
simbafl/DataWarehouse
From data warehouse to user profiling, from data construction to data application
apache/apisix
The Cloud-Native API Gateway
debezium/debezium
Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.
apache/hudi
Upserts, Deletes And Incremental Processing on Big Data.
GradleUp/shadow
Gradle plugin to create fat/uber JARs, apply file transforms, and relocate packages for applications and libraries. Gradle version of Maven's Shade plugin.
linkease/ddnsto-openwrt
ddnsto for OpenWrt
alievk/avatarify-python
Avatars for Zoom, Skype and other video-conferencing apps.
plasma-umass/scalene
Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals
mingrammer/diagrams
:art: Diagram as Code for prototyping cloud system architectures
jupyter/jupyter
Jupyter metapackage for installation, docs and chat
shijinkui/spark_study
Spark source code study
MoRan1607/BigDataGuide
Big data learning: learn big data from scratch, with study videos and interview materials for each stage of learning
stupidloud/nanopi-openwrt
OpenWrt for NanoPi R1S, R2S, R4S, R5S and Orange Pi R1 Plus: firmware builds, in a clean edition and an all-in-one edition
endymecy/spark-config-and-tuning
A summary of Spark performance tuning: Spark config and tuning
PrefectHQ/prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
apache/seatunnel
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
MarquezProject/marquez-airflow
Airflow support for Marquez
MarquezProject/marquez
Collect, aggregate, and visualize a data ecosystem's metadata
microsoft/Bringing-Old-Photos-Back-to-Life
Bringing Old Photos Back to Life (CVPR 2020 oral)
CorentinJ/Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
soimort/you-get
:arrow_double_down: Dumb downloader that scrapes the web
geekxh/hello-algorithm
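🌍 Algorithm training for beginners | Four parts: ① interview write-ups from major tech companies ② illustrated LeetCode solutions ③ a thousand open-source e-books ④ a hundred technical mind maps (the project took hundreds of hours; a star would be appreciated, 🌹 thanks~). Recommended free ChatGPT sites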
Qihoo360/XSQL
Unified SQL Analytics Engine Based on SparkSQL
zhisheng17/flink-learning
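Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, principles, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and more, plus case studies of large-scale Flink production projects (PVUV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). Support for my column "Big Data Real-Time Computing Engine Flink: Practice and Performance Optimization" is welcome.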
hashicorp/vagrant
Vagrant is a tool for building and distributing development environments.
byzer-org/byzer-lang
Byzer (formerly MLSQL): A low-code open-source programming language for data pipeline, analytics and AI.
ray-project/ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
allwefantasy/spark-binlog
A library for querying Binlog with Apache Spark Structured Streaming, for Spark SQL, DataFrames and [MLSQL](https://www.mlsql.tech).
spark-jobserver/spark-jobserver
REST job server for Apache Spark
apache/doris
Apache Doris is an easy-to-use, high-performance and unified analytics database.