Pinned Repositories
Bandit-algorithms
LinRel, SupLinRel, LinUCB, SupLinUCB, UCB-GLM, SupCB-GLM, LinTS, E-greedy, Giro, LinPhe, Phased Elimination, LSE, SpectralEliminator
comPy
Communication system and signal processing library
differentiable-bandit-algorithm
graph-based-bandit
Algorithms for graph-based bandits
graph_signal_processing
Graph signal processing
GraphUCB-and-GraphUCB-Local-Algorithms
Algorithms of my AISTATS 2020 paper: Laplacian-regularized graph bandits: Algorithms and theoretical analysis
ray-rllib-hardcore
This repo extends Ray RLlib (2.9.0) with additional algorithms and support required by complex real-world RL scenarios
RL-Algorithms-Implementation
RL-Research-Large-Scale
This repo shows how to set up a large-scale RL training platform on AWS Cloud to run RL research
Time-Series-Library-extention
A Library for Advanced Deep Time Series Models.
yang0110's Repositories
yang0110/ml-agents-on-aws
The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
yang0110/ray-rllib-hardcore
This repo extends Ray RLlib (2.9.0) with additional algorithms and support required by complex real-world RL scenarios
yang0110/RL-Research-Large-Scale
This repo shows how to set up a large-scale RL training platform on AWS Cloud to run RL research
yang0110/Time-Series-Library-extention
A Library for Advanced Deep Time Series Models.
yang0110/AIGames
Use AI to play some games.
yang0110/allRank
allRank is a framework for training learning-to-rank neural models based on PyTorch.
yang0110/alphageometry
yang0110/alphazero-general
A fast, generalized, and modified implementation of DeepMind's distinguished AlphaZero in PyTorch.
yang0110/awesome-game-ai
Awesome Game AI materials of Multi-Agent Reinforcement Learning
yang0110/carla
Open-source simulator for autonomous driving research.
yang0110/ddpo
Code for the paper "Training Diffusion Models with Reinforcement Learning"
yang0110/Deep-Reinforcement-Learning-Hands-On-Third-Edition
Deep Reinforcement Learning Hands-On, Third Edition, published by Packt
yang0110/Dense-Deep-Reinforcement-Learning
This repo contains the code for the paper "Dense reinforcement learning for safety validation of autonomous vehicles"
yang0110/dreamerv3
Mastering Diverse Domains through World Models
yang0110/efficientalphazero
AlphaZero for singleplayer environments implemented efficiently using Ray
yang0110/EfficientZero
Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021.
yang0110/flappy-bird-env
Flappy Bird as a Farama Gymnasium environment.
yang0110/Group-robust-preference-optimization
yang0110/how-to-autorl
Plug-and-play hydra sweepers for the EA-based multifidelity method DEHB and several population-based training variations, all proven to efficiently tune RL hyperparameters.
yang0110/imitation-learning-in-action
imitation-learning-in-action
yang0110/LightZero
LightZero: A lightweight and efficient MCTS/AlphaZero/MuZero algorithm toolkit.
yang0110/Open-Sora
Building your own video generation model like OpenAI's Sora
yang0110/Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model), but we have only limited resources. We sincerely hope the whole open-source community can contribute to this project.
yang0110/peaceful-pie
Control Unity from Python! Use for reinforcement learning.
yang0110/ray-rllib-experiments
Advanced usage of Ray RLlib
yang0110/shortcut-models
yang0110/spacetimeformer
Multivariate Time Series Forecasting with efficient Transformers. Code for the paper "Long-Range Transformers for Dynamic Spatiotemporal Forecasting."
yang0110/streaming-drl
Deep reinforcement learning without experience replay, target networks, or batch updates.
yang0110/verl
veRL: Volcano Engine Reinforcement Learning for LLM
yang0110/VideoAgent
Official implementation of "Self-Improving Video Generation as Agent"