sweetice

PhD @ Ruhr University Bochum

Tuebingen, Germany

Pinned Repositories

Algorithm_Interview_Notes-Chinese
2018/2019/campus recruitment/spring recruitment/autumn recruitment/algorithms/Machine Learning/Deep Learning/Natural Language Processing (NLP)/C/C++/Python/interview notes
Language: Python | 1 2 00
baselines
OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
Language: Python | 1 2 00
BEER-ICLR2024
The present anonymous repository serves as a guide for reproducing the results of the "BEER" method proposed in our ICLR submission "Adaptive Regularization of Representation Rank as an Implicit Constraint of Bellman Equation".
Language: Python | 1 1 00
Deep-reinforcement-learning-with-pytorch
PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3, and more.
Language: Python | 4k 36 34858
Distributional-Soft-Actor-Critic
Language: Python | 1 2 01
learning-to-communicate-pytorch
Learning to Communicate with Deep Multi-Agent Reinforcement Learning in PyTorch
Language: Python | 3 2 02
PEER-CVPR23
Authors' implementation of PEER
Language: Python | 9 1 01
PyTorch-GAN
PyTorch implementations of Generative Adversarial Networks.
Language: Python | 2 3 01
reinforcement-learning-algorithms
This repository contains most of the classic deep reinforcement learning algorithms, including DQN, DDQN, Dueling Network, DDPG, A3C, PPO, and TRPO. (More algorithms are still in progress.)
Language: Python | 2 3 02
RL-Adventure
PyTorch implementation of DQN, DDQN, prioritized replay, noisy networks, distributional values, Rainbow, and hierarchical RL
Language: Jupyter Notebook | 3 3 02

sweetice's Repositories

sweetice/Deep-reinforcement-learning-with-pytorch
PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3, and more.
Language: Python | 4k 36 34858
sweetice/PEER-CVPR23
Authors' implementation of PEER
Language: Python | 9 1 01
sweetice/BEER-ICLR2024
The present anonymous repository serves as a guide for reproducing the results of the "BEER" method proposed in our ICLR submission "Adaptive Regularization of Representation Rank as an Implicit Constraint of Bellman Equation".
Language: Python | 1 1 00
sweetice/ERC-ECML-23
Anonymous code for ICML submission 45
Language: Python | 1 2 00
sweetice/sweetice.github.io
A beautiful, simple, clean, and responsive Jekyll theme for academics
Language: HTML | 1 1 0
sweetice/sweetice.github.io_old
Language: HTML | 1 2 0
sweetice/ColossalAI
Making large AI models cheaper, faster and more accessible
Language: Python | 0 0
sweetice/dalai_llama
The simplest way to run LLaMA on your local machine
Language: CSS | 0 0
sweetice/deep-successor-features-for-transfer
A reusable framework for successor features for transfer in deep reinforcement learning, using Keras.
Language: Python | 0 0
sweetice/dice_rl
Language: Python | 1 0
sweetice/drqv2
DrQ-v2: Improved Data-Augmented Reinforcement Learning
Language: Python | 1 0
sweetice/ffn_geyang
Public Repo for the paper "Overcoming The Spectral-Bias of Neural Value Approximation"
Language: Python | 0 0
sweetice/learned-fourier-features
Code for the paper "Functional Regularization for Reinforcement Learning via Learned Fourier Features"
Language: Python | 0 0
sweetice/LibMTL
A PyTorch Library for Multi-Task Learning
Language: Python | 0 0
sweetice/llama
Inference code for LLaMA models
Language: Python | 0 0
sweetice/LLM4Arxiv
Language: Python | 0 0
sweetice/MEPE
Official implementation of MEPE
Language: Python | 1 0
sweetice/mpo
PyTorch Implementation of the Maximum a Posteriori Policy Optimisation
Language: Python | 1 0
sweetice/neural-approx-ss-lfi
Code for the ICLR 2021 paper "Neural Approximate Sufficient Statistics for Implicit Models"
Language: Jupyter Notebook | 1 0
sweetice/Online-RLHF
A recipe for online RLHF.
Language: Python | 0 0
sweetice/pderl
Code for "Proximal Distilled Evolutionary Reinforcement Learning", accepted at AAAI 2020
Language: Python | 1 0
sweetice/reward-surfaces
Language: Python | 0 0
sweetice/RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), so it combines the best of RNNs and transformers: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embedding.
Language: Python | 0 0
sweetice/snrl
Language: Python | 1 0
sweetice/stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
Language: Python | 0 0
sweetice/sweetice.github.io_abondon
Language: JavaScript | 1 0
sweetice/TD3_BC
Author's PyTorch implementation of TD3+BC, a simple variant of TD3 for offline RL
Language: Python | 1 0
sweetice/tqc_pytorch_1epo
Implementation of Truncated Quantile Critics method for continuous reinforcement learning. https://bayesgroup.github.io/tqc/
Language: Python | 0 0
sweetice/trl
Train transformer language models with reinforcement learning.
Language: Python | 0 0
sweetice/voltron-robotics
Voltron: Language-Driven Representation Learning for Robotics
Language: Python | 0 0