Ray makes it easy to parallelize single-machine code: go from a single CPU to multi-core, multi-GPU, or multi-node with minimal code changes.
This is a curated list of awesome Ray libraries, projects, and other resources. Contributions are welcome!
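As a minimal sketch of what that looks like with Ray's core task API (the `square` function here is purely illustrative):

```python
import ray

ray.init()  # start or connect to a Ray cluster; on a laptop this just uses the local cores

@ray.remote
def square(x):
    # an ordinary Python function, turned into a distributed task by the decorator
    return x * x

# .remote() returns futures immediately; ray.get() blocks and gathers the results
futures = [square.remote(i) for i in range(10)]
print(ray.get(futures))  # [0, 1, 4, 9, ..., 81]
```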
This section contains libraries that are well-made and useful, but have not necessarily been battle-tested by a large userbase yet.
- FastChat Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90% ChatGPT Quality
- LangChain-Ray Examples of how to use LangChain with Ray
- Aviary (Ray Aviary) - evaluate multiple LLMs easily
- LLM-distributed-finetune Fine-tuning large language models efficiently on a distributed cluster. Uses Ray AIR to orchestrate training across multiple AWS GPU instances.
- muzero-general - A commented and documented implementation of MuZero based on the Google DeepMind paper (Schrittwieser et al., Nov 2019) and the associated pseudocode.
- rllib-torch-maddpg - PyTorch implementation of MADDPG (Lowe et al.) in RLlib
- MARLlib - a comprehensive Multi-Agent Reinforcement Learning algorithm library
- Swarm-jax - Swarm training framework using Haiku + JAX + Ray for layer-parallel transformer language models on unreliable, heterogeneous nodes
- Alpa - Auto-parallelization for large-scale neural networks using JAX, XLA, and Ray
- Balsa A learned SQL query optimizer. It tailors query optimization to your hardware and engine to find the best execution plans for your queries.
- RaySQL Distributed SQL Query Engine in Python using Ray
- Quokka Open source SQL engine in Python
- prefect-ray Prefect integrations with Ray
- xgboost_ray Distributed XGBoost on Ray
- Ray-DeepSpeed-Inference Run DeepSpeed inference on Ray Serve
- SkyPilot a framework for easily running machine learning workloads on any cloud through a unified interface
- Exoshuffle-CloudSort the winning entry of the 2022 CloudSort Benchmark in the Indy category.
- Fugue a unified interface for distributed computing that lets users execute Python, pandas, and SQL code on Ray without rewrites.
- Daft is a fast, Pythonic and scalable open-source dataframe library built for Python and Machine Learning workloads.
- Flower (flwr) is a framework for building federated learning systems. Uses Ray to scale experiments from a desktop to a single GPU rack or a multi-node GPU cluster.
- Modin: Scale your pandas workflows by changing one line of code (see the sketch after this list). Uses Ray to transparently scale out to multiple nodes.
- Volcano is a batch system built on Kubernetes. It provides a suite of mechanisms that are commonly required by many classes of batch & elastic workloads.
- Ray on Azure ML Turn Azure ML compute into a Ray cluster
- AutoGluon AutoML for Image, Text, and Tabular Data
- AWS samples for running Ray on Amazon SageMaker/EC2/EKS/EMR
- KubeRay A toolkit to run Ray applications on Kubernetes
- ray-educational-materials A suite of hands-on training materials showing how to scale CV, NLP, and time-series forecasting workloads with Ray
- Metaflow-Ray An extension for Metaflow that enables seamless integration with Ray
- Deep reinforcement learning at Riot Games by Ben Kasper - reinforcement learning for game development in production
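As referenced in the Modin entry above, the one-line change is the pandas import. A minimal sketch, assuming Ray is installed so Modin selects it as its execution engine (the CSV path is a placeholder):

```python
import ray
import modin.pandas as pd  # drop-in replacement for "import pandas as pd"

ray.init()  # optional: Modin can start Ray itself if no cluster is already running

# the familiar pandas API, with the dataframe partitioned and processed by Ray workers
df = pd.read_csv("example.csv")  # placeholder path
print(df.describe())
```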
This section contains papers focused on Ray (e.g. Ray-based library whitepapers, research on Ray). Papers implemented in Ray are listed in the Models/Projects section.
- Programming in Ray: Tips for first-time users
- Reddit post
- Load PyTorch Models 340 Times Faster with Ray
- Learning Ray - Flexible Distributed Python for Machine Learning
- RL course Applied Reinforcement Learning with RLlib
- MLOps course
Contributions welcome! Read the contribution guidelines first.