Pinned Repositories
Agile_Data_Code_2
Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Android
The most popular Android open-source projects on GitHub; every project comes with detailed documentation and companion videos
architect-awesome
Technology map for back-end architects
architecture.taobao-alibaba
Internet company architectures: Taobao technical architecture, Alibaba technical architecture
dataop
NB Operation
DeepLearning-500-questions
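Deep Learning 500 Questions: explains, in question-and-answer form, frequently raised topics in probability, linear algebra, machine learning, deep learning, computer vision, and related areas, to help the author and any readers who need it. The book has 18 chapters and more than 500,000 characters. Because the author's expertise is limited, readers are kindly asked to point out any errors. To be continued... For collaboration, contact scutjy2015@163.com. All rights reserved; infringement will be pursued. Tan 2018.06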
hudi
Upserts, Deletes And Incremental Processing on Big Data.
iceberg
Apache Iceberg
incubator-uniffle
Uniffle is a high performance, general purpose Remote Shuffle Service.
spark
Apache Spark
Run-Lin's Repositories
Run-Lin/dataop
NB Operation
Run-Lin/Firestorm
Firestorm is a Remote Shuffle Service, and provides the capability for Apache Spark and Apache Hadoop MapReduce applications to store shuffle data on remote servers
Run-Lin/incubator-uniffle-website
Apache Uniffle
Run-Lin/submarine
Submarine is a cloud-native machine learning platform.
Run-Lin/hudi
Upserts, Deletes And Incremental Processing on Big Data.
Run-Lin/iceberg
Apache Iceberg
Run-Lin/arrow-datafusion
Apache Arrow DataFusion SQL Query Engine
Run-Lin/Awesome-ChatGPT
A collection of ChatGPT learning resources, continuously updated...
Run-Lin/bagpy
Python package for reading and extracting data from rosbag files and performing analysis on them.
Run-Lin/clusterInfo
Run-Lin/databend
An elastic and reliable serverless data warehouse that offers blazing-fast queries and combines the elasticity, simplicity, and low cost of the cloud, built to make the Data Cloud easy
Run-Lin/docusaurus_run
Run-Lin/ds-cheatsheets
List of Data Science Cheatsheets to rule the world
Run-Lin/feature_engine
Feature engineering package with sklearn like functionality
Run-Lin/hudi-resources
A collection of Apache Hudi resources
Run-Lin/iceberg-rs
Run-Lin/incubator-datalab
Apache DataLab (incubating)
Run-Lin/incubator-livy
Mirror of Apache livy (Incubating)
Run-Lin/kuberay
A toolkit to run Ray applications on Kubernetes
Run-Lin/kyuubi
Kyuubi is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark
Run-Lin/learning_ray
Code for the upcoming book "Learning Ray" with O'Reilly
Run-Lin/lsql
lsql
Run-Lin/mcap
MCAP is a modular, performant, and serialization-agnostic container file format for pub/sub messages, primarily intended for use in robotics applications.
Run-Lin/oceanbase
A distributed, banking-suitable, open-source relational database featuring high scalability and high compatibility.
Run-Lin/ray-educational-materials
Ray educational materials
Run-Lin/run-lin.github.io
Run-Lin/spark-on-k8s-operator
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Run-Lin/starrocks
StarRocks is a next-gen sub-second MPP database for full analysis scenarios, including multi-dimensional analytics, real-time analytics and ad-hoc query, formerly known as DorisDB.
Run-Lin/the-algorithm
Source code for Twitter's Recommendation Algorithm
Run-Lin/trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)