NagisaZj

Pinned Repositories

a3c_continuous
A continuous action space version of A3C LSTM in pytorch plus A3G design
Language: Python 0 1 00
ac-teach
Code for the CoRL 2019 paper AC-Teach: A Bayesian Actor-Critic Method for Policy Learning with an Ensemble of Suboptimal Teachers
Language: Python 0 1 00
arena-hard-auto
Arena-Hard-Auto: An automatic LLM benchmark.
Language: Jupyter Notebook 0 0 00
atari-rl
Atari - Deep Reinforcement Learning algorithms in TensorFlow
Language: Python 0 1 00
CUP
Language: Python 5 1 01
IDAQ_Public
Language: Python 6 1 20
learningtolearn
Language: Python 1 1 00
MetaCURE-Public
Language: Python 14 2 24
robosuite
Language: Python 1 2 00
sc2_teacher
Language: Python 1 1 00

NagisaZj's Repositories

NagisaZj/IDAQ_Public
Language: Python 6 1 20
NagisaZj/CUP
Language: Python 5 1 01
NagisaZj/arena-hard-auto
Arena-Hard-Auto: An automatic LLM benchmark.
Language: Jupyter Notebook 0 0 00
NagisaZj/bigcode-evaluation-harness
A framework for the evaluation of autoregressive code generation language models.
NagisaZj/ContextWM
Code release for "Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning" (NeurIPS 2023), https://arxiv.org/abs/2305.18499
Language: Python 0 0
NagisaZj/decision-transformer
Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.
NagisaZj/diffusion_policy
[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
Language: Python 0 0
NagisaZj/diffusion_reward
[arXiv'23] Official implementation of the paper "Diffusion Reward: Learning Rewards via Conditional Video Diffusion"
Language: Python 0 0
NagisaZj/dreamerv3
Mastering Diverse Domains through World Models
Language: Python 0 0
NagisaZj/DrM
DrM, a visual RL algorithm, minimizes the dormant ratio to guide exploration-exploitation trade-offs, achieving significant improvements in sample efficiency and asymptotic performance across diverse domains.
Language: Python 0 0
NagisaZj/DUP
Language: Python 0 0
NagisaZj/evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Language: Python 0 0
NagisaZj/Graphormer
Graphormer is a general-purpose deep learning backbone for molecular modeling.
Language: Python 0 0
NagisaZj/HIQL
HIQL: Offline Goal-Conditioned RL with Latent States as Actions (NeurIPS 2023)
Language: Python 0 0
NagisaZj/hypnettorch
Package for working with hypernetworks in PyTorch.
Language: Python 0 0
NagisaZj/icl-alignment
Is In-Context Learning Sufficient for Instruction Following in LLMs?
Language: Python 0 0
NagisaZj/implicit_q_learning
Language: Python 0 0
NagisaZj/LAPO-offlienRL
Language: Python 0 0
NagisaZj/lightATAC
Language: Python 0 0
NagisaZj/llama3
The official Meta Llama 3 GitHub site
Language: Python 0 0
NagisaZj/metaworld-cup
Language: Python 1 0
NagisaZj/MiniGPT-4
Open-sourced codes for MiniGPT-4 and MiniGPT-v2
Language: Python 0 0
NagisaZj/mtenv
Language: Python 1 0
NagisaZj/octo
Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
Language: Python 0 0
NagisaZj/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
Language: Python 0 0
NagisaZj/OPPO
Language: Python 0 0
NagisaZj/opro
Official code for "Large Language Models as Optimizers"
Language: Python 0 0
NagisaZj/simple-evals
Language: Python 0 0
NagisaZj/universal_manipulation_interface
Universal Manipulation Interface: In-The-Wild Robot Teaching Without In-The-Wild Robots
Language: Python 0 0
NagisaZj/viper_rl
Using advances in generative modeling to learn reward functions from unlabeled videos.
Language: Jupyter Notebook 0 0