Harry-mic

A freshman in RL

Tsinghua

Pinned Repositories

GODA
Language: Jupyter Notebook | 0 1 00
Harry-mic.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
Language: JavaScript | 00
la-mbda
LAMBDA is a model-based reinforcement learning agent that uses Bayesian world models for safe policy optimization
Language: Python | 0 0 00
RL-ViGen
This is the repo for RL-ViGen
Language: Python | 0 0 00
TREvaL
Reasonable Reward Evaluation of Large Language Models
Language: Python | 7 1 11
alignment-handbook
Robust recipes to align language models with human and AI preferences
Language: Python | 4.7k 111 137412
Dromedary
Dromedary: towards helpful, ethical and reliable LLMs.
Language: Python | 1.1k 24 1287
Value-Augmented-Sampling
Language: Python | 16 2 32
RE-Control
Language: Python | 122
Visual-Adversarial-Examples-Jailbreak-Large-Language-Models
Repository for the Paper (AAAI 2024, Oral) --- Visual Adversarial Examples Jailbreak Large Language Models
Language: Python | 183 3 3116

Harry-mic's Repositories

Harry-mic/TREvaL
Reasonable Reward Evaluation of Large Language Models
Language: Python | 7 1 11
Harry-mic/GODA
Language: Jupyter Notebook | 0 1 00
Harry-mic/Harry-mic.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
Language: JavaScript | 00
Harry-mic/la-mbda
LAMBDA is a model-based reinforcement learning agent that uses Bayesian world models for safe policy optimization
Language: Python | 0 0 00
Harry-mic/RL-ViGen
This is the repo for RL-ViGen
Language: Python | 0 0 00