waterhorse1
Ph.D. student at University College London, interested in Large Language Models, Meta Learning, Reinforcement Learning and Multi-agent Learning.
University College London
Pinned Repositories
torchopt
TorchOpt is an efficient library for differentiable optimization built upon PyTorch.
apollo_learning
Baidu Apollo Learning
ChessGPT
(NeurIPS 2023) ChessGPT - Bridging Policy Learning and Language Modeling
CMML_pytorch
ha_ma_ppo
LLM_Tree_Search
(ICML 2024) AlphaZero-like Tree-Search can guide large language model decoding and training
MELU_pytorch
An unofficial PyTorch implementation of MeLU
MRI_RL
NAC
(NeurIPS 2021) Neural Auto-Curricula in Two-Player Zero-Sum Games.
waterhorse1.github.io
waterhorse1's Repositories
waterhorse1/LLM_Tree_Search
(ICML 2024) AlphaZero-like Tree-Search can guide large language model decoding and training
waterhorse1/ChessGPT
(NeurIPS 2023) ChessGPT - Bridging Policy Learning and Language Modeling
waterhorse1/MELU_pytorch
An unofficial PyTorch implementation of MeLU
waterhorse1/NAC
(NeurIPS 2021) Neural Auto-Curricula in Two-Player Zero-Sum Games.
waterhorse1/CMML_pytorch
waterhorse1/ha_ma_ppo
waterhorse1/waterhorse1.github.io
waterhorse1/apollo_learning
Baidu Apollo Learning
waterhorse1/classification
waterhorse1/decision-transformer
Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.
waterhorse1/Deep-RL-Keras
Keras Implementation of popular Deep RL Algorithms (A3C, DDQN, DDPG, Dueling DDQN)
waterhorse1/MRI_RL
waterhorse1/chess_template
waterhorse1/deepdrive
End-to-end simulation for self-driving cars
waterhorse1/DeepLearningFlappyBird
Flappy Bird hack using Deep Reinforcement Learning (Deep Q-learning).
waterhorse1/DRL-implementation
waterhorse1/haddpg
waterhorse1/meta_classification
waterhorse1/Meta_Gradient
waterhorse1/Meta_Regression
waterhorse1/metaworld
An open source robotics benchmark for meta- and multi-task reinforcement learning
waterhorse1/models
Models and examples built with TensorFlow
waterhorse1/MRI_DDPG
waterhorse1/pearl_lstm
waterhorse1/Pearl_relabel
waterhorse1/Promp_test
waterhorse1/Regression
waterhorse1/reinforcement-learning
Minimal and Clean Reinforcement Learning Examples
waterhorse1/torchopt
TorchOpt is a high-performance optimizer library built upon PyTorch for easy implementation of functional optimization and gradient-based meta-learning.