Pinned Repositories
analytics-zoo
Distributed TensorFlow, Keras and PyTorch on Apache Spark/Flink & Ray
BigDL
BigDL: Distributed Deep Learning Library for Apache Spark
BigDL-Tutorials
Step-by-step Deep Learning Tutorials on Apache Spark using BigDL
drizzle-spark
Drizzle integration with Apache Spark
dynolog
Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from different components in the system, such as the Linux kernel, CPU, disks, Intel PT, and GPUs. Dynolog also integrates with PyTorch and can trigger traces for distributed training applications.
HiBench
HiBench is a big data benchmark suite.
horovod
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
llm-on-ray
OAP
Optimized Analytics Package for Spark Platform
oap-raydp
RayDP: Distributed data processing library on Ray that runs popular big data frameworks like Apache Spark on Ray. RayDP seamlessly integrates with other Ray libraries to make it simple to build E2E data analytics and AI pipelines.
carsonwang's Repositories
carsonwang/analytics-zoo
Distributed TensorFlow, Keras and PyTorch on Apache Spark/Flink & Ray
carsonwang/BigDL
BigDL: Distributed Deep Learning Library for Apache Spark
carsonwang/BigDL-Tutorials
Step-by-step Deep Learning Tutorials on Apache Spark using BigDL
carsonwang/drizzle-spark
Drizzle integration with Apache Spark
carsonwang/dynolog
Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from different components in the system, such as the Linux kernel, CPU, disks, Intel PT, and GPUs. Dynolog also integrates with PyTorch and can trigger traces for distributed training applications.
carsonwang/HiBench
HiBench is a big data benchmark suite.
carsonwang/horovod
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
carsonwang/llm-on-ray
carsonwang/OAP
Optimized Analytics Package for Spark Platform
carsonwang/oap-raydp
RayDP: Distributed data processing library on Ray that runs popular big data frameworks like Apache Spark on Ray. RayDP seamlessly integrates with other Ray libraries to make it simple to build E2E data analytics and AI pipelines.
carsonwang/PAT
Performance Analysis Tool
carsonwang/spark
Mirror of Apache Spark
carsonwang/spark-adaptive
carsonwang/spark-mpi
MPI-oriented extension of the Spark computational model
carsonwang/sparkMeasure
SparkMeasure is a tool for performance troubleshooting of Apache Spark workloads. It simplifies the collection and analysis of Spark task metrics data.
carsonwang/sparrow
Sparrow scheduling platform (U.C. Berkeley).
carsonwang/tachyon
A Reliable Memory Centric Distributed Storage System
carsonwang/tpcds-kit
TPC-DS benchmark kit with some modifications/fixes