YifeiZhou02's Stars
gimme1dollar/b-moca
Benchmarking Mobile Device Control Agents across Diverse Configurations (ICLR 2024 workshop GenAI4DM spotlight presentation)
ServiceNow/BrowserGym
BrowserGym, a gym environment for web task automation in the Chromium browser.
YifeiZhou02/ArCHer
Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"
snu-mllab/Achievement-Distillation
Official PyTorch implementation of "Discovering Hierarchical Achievements in Reinforcement Learning via Contrastive Learning" (NeurIPS 2023)
OSU-NLP-Group/TravelPlanner
[ICML'24 Spotlight] "TravelPlanner: A Benchmark for Real-World Planning with Language Agents"
princeton-nlp/WebShop
[NeurIPS 2022] 🛒WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
princeton-nlp/intercode
[NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898
THUDM/AgentBench
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
PKU-Alignment/safe-rlhf
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Jiayi-Pan/GPT-V-on-Web
👀🧠 GPT-4 Vision x 💪⌨️ Vimium = Autonomous Web Agent
microsoft/TextWorld
TextWorld is a sandbox learning environment for the training and evaluation of reinforcement learning (RL) agents on text-based games.
ikostrikov/jaxrl
JAX (Flax) implementation of algorithms for Deep Reinforcement Learning with continuous action spaces.
mingkaid/rl-prompt
Accompanying repo for the RLPrompt paper
young-geng/EasyLM
Large language models (LLMs) made easy: EasyLM is a one-stop solution for pre-training, fine-tuning, evaluating, and serving LLMs in JAX/Flax.
Sea-Snell/JAXSeq
Train very large language models in JAX.
Sea-Snell/Implicit-Language-Q-Learning
Official code from the paper "Offline RL for Natural Language Generation with Implicit Language Q Learning"
fengyuli-dev/distribution-normalization
Test-Time Distribution Normalization For Contrastively Learned Vision-language Models
ikostrikov/rlpd
ikostrikov/pytorch-trpo
PyTorch implementation of Trust Region Policy Optimization
kairproject/kair_algorithms_draft
Reinforcement learning algorithms for robot control tasks
YifeiZhou02/generalized_paraphrase_identification
Research code for "GAPX: Generalized Autoregressive Paraphrase-identification X", NeurIPS 2022
jungokasai/THumB
yudasong/HyQ
Official code repo for the paper "Hybrid RL: Using both offline and online data can make RL efficient"
LexiFi/csml
High-level bindings between .NET and OCaml
SimplifyJobs/Summer2025-Internships
Collection of Summer 2025 tech internships!
northwesternfintech/2025QuantInternships
Public quant internship repository, maintained by NUFT but available for everyone.
aviralkumar2907/CQL
Code for conservative Q-learning
YifeiZhou02/Improve-Discourse-Dependency-Parsing-with-Contextualized-Representations
Implementation of the paper 'Improve Discourse Dependency Parsing with Contextualized Representations', Findings of NAACL 2022
liaopeiyuan/artbench
Benchmarking Generative Models with Artworks
dangkhoasdc/awesome-ai-residency
List of AI Residency Programs