liziniu
Ph.D. student at The Chinese University of Hong Kong, Shenzhen.
Pinned Repositories
CVPR-XJTU-2018
Spring 2018 course on Computer Vision and Pattern Recognition at XJTU
Face-Recognition-TX2
Face Recognition on NVIDIA TX2
GEM
Code for Paper (Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity)
HyperDQN
Code for ICLR 2022 Paper (HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning)
ILwSD
ISWBC
Code for NeurIPS 2023 Paper (Imitation Learning from Imperfection: Theoretical Justifications and Algorithms)
policy_optimization
Code for Paper (Policy Optimization in RLHF: The Impact of Out-of-preference Data)
ReMax
Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models)
RL-PPO-Keras
A Keras implementation of Proximal Policy Optimization (PPO)
RLX
RLX is an easy-to-use reinforcement learning codebase built on TensorFlow, implementing algorithms such as SAC, ACER, GAIL, and TRPO.
liziniu's Repositories
liziniu/ReMax
Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models)
liziniu/policy_optimization
Code for Paper (Policy Optimization in RLHF: The Impact of Out-of-preference Data)
liziniu/RL-PPO-Keras
A Keras implementation of Proximal Policy Optimization (PPO)
liziniu/HyperDQN
Code for ICLR 2022 Paper (HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning)
liziniu/GEM
Code for Paper (Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity)
liziniu/ISWBC
Code for NeurIPS 2023 Paper (Imitation Learning from Imperfection: Theoretical Justifications and Algorithms)
liziniu/ILwSD
liziniu/RLX
RLX is an easy-to-use reinforcement learning codebase built on TensorFlow, implementing algorithms such as SAC, ACER, GAIL, and TRPO.
liziniu/liziniu.github.io
liziniu/bib-merge
liziniu/alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
liziniu/baby-llama2-chinese
A repository for pretraining a small-parameter Chinese LLaMA2 from scratch and then applying SFT; a single 24 GB GPU is enough to obtain a chat-llama2 with basic Chinese question-answering ability.
liziniu/baselines
liziniu/cgmm
liziniu/Chinese-LLaMA-Alpaca-2
中文 LLaMA-2 & Alpaca-2 大模型二期项目 + 本地CPU/GPU训练部署 (Chinese LLaMA-2 & Alpaca-2 LLMs)
liziniu/clash-for-linux
Using Clash as a proxy tool on Linux
liziniu/CVAE
liziniu/dagger
liziniu/deep-learning-notes
Experiments with Deep Learning
liziniu/go-explore
Code for Go-Explore: a New Approach for Hard-Exploration Problems
liziniu/gym-minigrid
Minimalistic gridworld package for OpenAI Gym
liziniu/iclr-blog-track.github.io
liziniu/Maze
liziniu/Model-Uncertainty-in-Neural-Networks
TensorFlow implementation of model uncertainty in neural networks
liziniu/random-network-distillation
Code for the paper "Exploration by Random Network Distillation"
liziniu/sample-efficient-bayesian-rl
Source for the sample-efficient tabular RL submission to the NeurIPS 2019 workshop on Biological and Artificial RL
liziniu/stable-baselines
liziniu/SuperMario
liziniu/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
liziniu/webpage-template
Adapted from the widely used project webpage template made by the colorful folks.