chrisliu298
PhD student @UCSC CSE | Research Intern at @SkyworkAI
University of California, Santa Cruz · Santa Cruz, California
Pinned Repositories
awesome-llm-unlearning
A resource repository for machine unlearning in large language models
awesome-representation-engineering
A resource repository for representation engineering in large language models
awesome-sparse-autoencoders
A resource repository of sparse autoencoders for large language models
gpt2-arxiv
Fine-tuning GPT-2 to generate research paper abstracts
llm-unlearn-eco
[NeurIPS 2024] Large Language Model Unlearning via Embedding-Corrupted Prompts
min_double_descent
A minimal example of double descent
resnet-tinyimagenet
Training ResNet models on the Tiny ImageNet dataset
roberta-imdb
IMDb sentiment analysis with RoBERTa
Skywork-Reward
Rank-1 and rank-3 reward models on RewardBench
tapt
Data augmentation by generating new samples
chrisliu298's Repositories
chrisliu298/awesome-llm-unlearning
A resource repository for machine unlearning in large language models
chrisliu298/awesome-representation-engineering
A resource repository for representation engineering in large language models
chrisliu298/llm-unlearn-eco
[NeurIPS 2024] Large Language Model Unlearning via Embedding-Corrupted Prompts
chrisliu298/tapt
Data augmentation by generating new samples
chrisliu298/awesome-sparse-autoencoders
A resource repository of sparse autoencoders for large language models
chrisliu298/minimal-lm-finetune
A minimal example of fine-tuning autoregressive language models with multiple GPUs and DeepSpeed
chrisliu298/nanoGCG
A fast + lightweight implementation of the GCG algorithm in PyTorch
chrisliu298/min_double_descent
A minimal example of double descent
chrisliu298/Skywork-Reward
Rank-1 and rank-3 reward models on RewardBench
chrisliu298/alignment-handbook
Robust recipes to align language models with human and AI preferences
chrisliu298/Awesome-GenAI-Unlearning
chrisliu298/circuit-breakers
Improving Alignment and Robustness with Circuit Breakers
chrisliu298/direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
chrisliu298/firefoxCSS
One-line, minimal, keyboard-centered Firefox CSS theme.
chrisliu298/halu_clf
chrisliu298/hugo-website
Minimalist Hugo template for academic websites
chrisliu298/kickstart.nvim
A launch point for your personal nvim configuration
chrisliu298/muse_bench
chrisliu298/Online-RLHF
A recipe for online RLHF.
chrisliu298/openr
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
chrisliu298/Qwen2.5-Math
A series of math-specific large language models built on Qwen2.
chrisliu298/reward-bench
RewardBench: the first evaluation tool for reward models.
chrisliu298/RLHF-Reward-Modeling
Recipes to train reward model for RLHF.
chrisliu298/rm-score
chrisliu298/SOUL
Official repo for paper "SOUL: Unlocking the Power of Second-Order Optimization for LLM Unlearning"
chrisliu298/SWE-bench
[ICLR 2024] SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
chrisliu298/tofu
Landing Page for TOFU
chrisliu298/Token-level-Direct-Preference-Optimization
Reference implementation for Token-level Direct Preference Optimization (TDPO)
chrisliu298/trl
Train transformer language models with reinforcement learning.
chrisliu298/wmdp
WMDP is an LLM proxy benchmark for hazardous knowledge in bio, cyber, and chemical security. We also release code for RMU, an unlearning method that reduces LLM performance on WMDP while retaining general capabilities.