Pinned Repositories
a3c_continuous
A continuous action space version of A3C LSTM in PyTorch plus A3G design
ac-teach
Code for the CoRL 2019 paper AC-Teach: A Bayesian Actor-Critic Method for Policy Learning with an Ensemble of Suboptimal Teachers
arena-hard-auto
Arena-Hard-Auto: An automatic LLM benchmark.
atari-rl
Atari - Deep Reinforcement Learning algorithms in TensorFlow
CUP
IDAQ_Public
learningtolearn
MetaCURE-Public
robosuite
sc2_teacher
NagisaZj's Repositories
NagisaZj/IDAQ_Public
NagisaZj/CUP
NagisaZj/arena-hard-auto
Arena-Hard-Auto: An automatic LLM benchmark.
NagisaZj/bigcode-evaluation-harness
A framework for the evaluation of autoregressive code generation language models.
NagisaZj/ContextWM
Code release for "Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning" (NeurIPS 2023), https://arxiv.org/abs/2305.18499
NagisaZj/decision-transformer
Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.
NagisaZj/diffusion_policy
[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
NagisaZj/diffusion_reward
[arXiv'23] Official implementation of the paper "Diffusion Reward: Learning Rewards via Conditional Video Diffusion"
NagisaZj/dreamerv3
Mastering Diverse Domains through World Models
NagisaZj/DrM
DrM, a visual RL algorithm, minimizes the dormant ratio to guide exploration-exploitation trade-offs, achieving significant improvements in sample efficiency and asymptotic performance across diverse domains.
NagisaZj/DUP
NagisaZj/evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
NagisaZj/Graphormer
Graphormer is a general-purpose deep learning backbone for molecular modeling.
NagisaZj/HIQL
HIQL: Offline Goal-Conditioned RL with Latent States as Actions (NeurIPS 2023)
NagisaZj/hypnettorch
Package for working with hypernetworks in PyTorch.
NagisaZj/icl-alignment
Is In-Context Learning Sufficient for Instruction Following in LLMs?
NagisaZj/implicit_q_learning
NagisaZj/LAPO-offlienRL
NagisaZj/lightATAC
NagisaZj/llama3
The official Meta Llama 3 GitHub site
NagisaZj/metaworld-cup
NagisaZj/MiniGPT-4
Open-source code for MiniGPT-4 and MiniGPT-v2
NagisaZj/mtenv
NagisaZj/octo
Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
NagisaZj/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
NagisaZj/OPPO
NagisaZj/opro
Official code for "Large Language Models as Optimizers"
NagisaZj/simple-evals
NagisaZj/universal_manipulation_interface
Universal Manipulation Interface: In-The-Wild Robot Teaching Without In-The-Wild Robots
NagisaZj/viper_rl
Using advances in generative modeling to learn reward functions from unlabeled videos.