Pinned Repositories
Algorithm_Interview_Notes-Chinese
2018/2019/Campus Recruitment/Spring Recruitment/Autumn Recruitment/Algorithms/Machine Learning/Deep Learning/Natural Language Processing (NLP)/C/C++/Python/Interview Notes
baselines
OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
BEER-ICLR2024
The present anonymous repository serves as a guide for reproducing the results of the "BEER" method proposed in our ICLR submission "Adaptive Regularization of Representation Rank as an Implicit Constraint of Bellman Equation".
Deep-reinforcement-learning-with-pytorch
PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3, and more.
Distributional-Soft-Actor-Critic
learning-to-communicate-pytorch
Learning to Communicate with Deep Multi-Agent Reinforcement Learning in PyTorch
PEER-CVPR23
Authors' implementation of PEER
PyTorch-GAN
PyTorch implementations of Generative Adversarial Networks.
reinforcement-learning-algorithms
This repository contains most of the classic deep reinforcement learning algorithms, including DQN, DDQN, Dueling Network, DDPG, A3C, PPO, and TRPO. (More algorithms are still in progress.)
RL-Adventure
PyTorch implementation of DQN / DDQN / prioritized replay / noisy networks / distributional values / Rainbow / hierarchical RL
sweetice's Repositories
sweetice/Deep-reinforcement-learning-with-pytorch
PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3, and more.
sweetice/PEER-CVPR23
Authors' implementation of PEER
sweetice/BEER-ICLR2024
The present anonymous repository serves as a guide for reproducing the results of the "BEER" method proposed in our ICLR submission "Adaptive Regularization of Representation Rank as an Implicit Constraint of Bellman Equation".
sweetice/ERC-ECML-23
Anonymous code for ICML submission 45
sweetice/sweetice.github.io
A beautiful, simple, clean, and responsive Jekyll theme for academics
sweetice/sweetice.github.io_old
sweetice/ColossalAI
Making large AI models cheaper, faster and more accessible
sweetice/dalai_llama
The simplest way to run LLaMA on your local machine
sweetice/deep-successor-features-for-transfer
A reusable framework for successor features for transfer in deep reinforcement learning using Keras.
sweetice/dice_rl
sweetice/drqv2
DrQ-v2: Improved Data-Augmented Reinforcement Learning
sweetice/ffn_geyang
Public Repo for the paper "Overcoming The Spectral-Bias of Neural Value Approximation"
sweetice/learned-fourier-features
Code for the paper "Functional Regularization for Reinforcement Learning via Learned Fourier Features"
sweetice/LibMTL
A PyTorch Library for Multi-Task Learning
sweetice/llama
Inference code for LLaMA models
sweetice/LLM4Arxiv
sweetice/MEPE
Official implementation of MEPE
sweetice/mpo
PyTorch Implementation of the Maximum a Posteriori Policy Optimisation
sweetice/neural-approx-ss-lfi
Code for the ICLR 2021 paper: Neural Approximate Sufficient Statistics for Implicit Models
sweetice/Online-RLHF
A recipe for online RLHF.
sweetice/pderl
Code for "Proximal Distilled Evolutionary Reinforcement Learning", accepted at AAAI 2020
sweetice/reward-surfaces
sweetice/RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). It thus combines the best of RNNs and transformers: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embedding.
sweetice/snrl
sweetice/stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
sweetice/sweetice.github.io_abondon
sweetice/TD3_BC
Author's PyTorch implementation of TD3+BC, a simple variant of TD3 for offline RL
sweetice/tqc_pytorch_1epo
Implementation of Truncated Quantile Critics method for continuous reinforcement learning. https://bayesgroup.github.io/tqc/
sweetice/trl
Train transformer language models with reinforcement learning.
sweetice/voltron-robotics
Voltron: Language-Driven Representation Learning for Robotics