liziniu
Ph.D. student at The Chinese University of Hong Kong, Shenzhen.
Pinned Repositories
CVPR-XJTU-2018
Spring 2018 course on Computer Vision and Pattern Recognition at XJTU
Face-Recognition-TX2
Face Recognition on NVIDIA TX2
GEM
Code for Paper (Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity)
HyperDQN
Code for ICLR 2022 Paper (HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning)
ILwSD
ISWBC
Code for NeurIPS 2023 Paper (Imitation Learning from Imperfection: Theoretical Justifications and Algorithms)
policy_optimization
Code for Paper (Policy Optimization in RLHF: The Impact of Out-of-preference Data)
ReMax
Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models)
RL-PPO-Keras
A Keras implementation of Proximal Policy Optimization (PPO)
RLX
RLX is an easy-to-use reinforcement learning codebase built on TensorFlow, implementing algorithms such as SAC, ACER, GAIL, and TRPO.
liziniu's Repositories
liziniu/ReMax
Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models)
liziniu/policy_optimization
Code for Paper (Policy Optimization in RLHF: The Impact of Out-of-preference Data)
liziniu/RL-PPO-Keras
A Keras implementation of Proximal Policy Optimization (PPO)
liziniu/HyperDQN
Code for ICLR 2022 Paper (HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning)
liziniu/GEM
Code for Paper (Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity)
liziniu/ISWBC
Code for NeurIPS 2023 Paper (Imitation Learning from Imperfection: Theoretical Justifications and Algorithms)
liziniu/ILwSD
liziniu/RLX
RLX is an easy-to-use reinforcement learning codebase built on TensorFlow, implementing algorithms such as SAC, ACER, GAIL, and TRPO.
liziniu/liziniu.github.io
liziniu/bib-merge
liziniu/alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
liziniu/baby-llama2-chinese
A repository for pretraining a small-parameter Chinese LLaMA2 from scratch and then applying SFT; a single 24 GB GPU is enough to obtain a chat-llama2 with basic Chinese question-answering ability.
liziniu/baselines
liziniu/cgmm
liziniu/Chinese-LLaMA-Alpaca-2
中文 LLaMA-2 & Alpaca-2 大模型二期项目 + 本地CPU/GPU训练部署 (Chinese LLaMA-2 & Alpaca-2 LLMs)
liziniu/clash-for-linux
Using Clash as a proxy tool on Linux
liziniu/CVAE
liziniu/dagger
liziniu/deep-learning-notes
Experiments with Deep Learning
liziniu/go-explore
Code for Go-Explore: a New Approach for Hard-Exploration Problems
liziniu/gym-minigrid
Minimalistic gridworld package for OpenAI Gym
liziniu/iclr-blog-track.github.io
liziniu/Maze
liziniu/Model-Uncertainty-in-Neural-Networks
TensorFlow implementation of model uncertainty in neural networks
liziniu/random-network-distillation
Code for the paper "Exploration by Random Network Distillation"
liziniu/sample-efficient-bayesian-rl
Source for the sample-efficient tabular RL submission to the NeurIPS 2019 workshop on Biological and Artificial RL
liziniu/stable-baselines
liziniu/SuperMario
liziniu/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
liziniu/webpage-template
Adapted from the widely used project webpage template made by the colorful folks.