/awesome-ray

Ray - A curated list of resources: https://github.com/ray-project/ray

MIT License

Awesome Ray


Ray makes it effortless to parallelize single-machine code: go from a single CPU to multi-core, multi-GPU, or multi-node with minimal code changes.

This is a curated list of awesome Ray libraries, projects, and other resources. Contributions are welcome!

Contents

This section contains libraries that are well-made and useful, but have not necessarily been battle-tested by a large userbase yet.

Ray + LLM

  • FastChat - Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90% ChatGPT Quality
  • LangChain-Ray - Examples of how to use LangChain with Ray
  • Aviary - Ray Aviary: evaluate multiple LLMs easily
  • LLM-distributed-finetune - Fine-tuning large language models efficiently on a distributed cluster. Uses Ray AIR to orchestrate training on multiple AWS GPU instances.

Reinforcement Learning

  • muzero-general - A commented and documented implementation of MuZero based on the Google DeepMind paper (Schrittwieser et al., Nov 2019) and the associated pseudocode.
  • rllib-torch-maddpg - PyTorch implementation of MADDPG (Lowe et al.) in RLlib
  • MARLlib - a comprehensive Multi-Agent Reinforcement Learning algorithm library

Ray + JAX / TPU

  • Swarm-jax - Swarm training framework using Haiku + JAX + Ray for layer parallel transformer language models on unreliable, heterogeneous nodes
  • Alpa - Auto parallelization for large-scale neural networks using JAX, XLA, and Ray

Ray + Database

  • Balsa - A learned SQL query optimizer. It tailors query optimization to find the best execution plans for your hardware and engine.
  • RaySQL - Distributed SQL query engine in Python using Ray
  • Quokka - Open-source SQL engine in Python

Ray + X (integration)

Ray-Project

distributed computing

  • Fugue - A unified interface for distributed computing that lets users execute Python, pandas, and SQL code on Ray without rewrites.
  • Daft - A fast, Pythonic, and scalable open-source dataframe library built for Python and machine learning workloads.
  • Flower (flwr) - A framework for building federated learning systems. Uses Ray to scale experiments from a desktop to a single GPU rack or a multi-node GPU cluster.
  • Modin - Scale your pandas workflows by changing one line of code. Uses Ray to scale out transparently to multiple nodes.
  • Volcano - A batch system built on Kubernetes. It provides a suite of mechanisms commonly required by many classes of batch and elastic workloads.

Ray AIR

Misc

  • AutoGluon - AutoML for image, text, and tabular data
  • Aws-samples - Ray on Amazon SageMaker/EC2/EKS/EMR
  • KubeRay - A toolkit to run Ray applications on Kubernetes
  • ray-educational-materials - A suite of hands-on training materials that show how to scale CV, NLP, and time-series forecasting workloads with Ray.
  • Metaflow-Ray - An extension for Metaflow that enables seamless integration with Ray

Papers

This section contains papers focused on Ray (e.g., Ray-based library whitepapers, research on Ray, etc.). Papers implemented in Ray are listed in the Models/Projects section.

books

  • Learning Ray - Flexible Distributed Python for Machine Learning

course

cheatsheet

Contributing

Contributions welcome! Read the contribution guidelines first.