Pinned Repositories
chain-of-hindsight
Chain-of-Hindsight, A Scalable RLHF Method
hybrid-discriminative-generative
Hybrid Discriminative-Generative Training via Contrastive Learning
instructrl
Instruction Following Agents with Multimodal Transformers
language-quantized-autoencoders
Language Quantized AutoEncoders
mini_apt
ringattention
Transformers with Arbitrarily Large Context
taming-maml
Taming MAML: efficient unbiased meta-reinforcement learning
tux
Tools and Utils for Experiments (TUX)
LWM
Large World Model With 1M Context
open_llama
OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
forhaoliu's Repositories
forhaoliu/ringattention
Transformers with Arbitrarily Large Context
forhaoliu/chain-of-hindsight
Chain-of-Hindsight, A Scalable RLHF Method
forhaoliu/language-quantized-autoencoders
Language Quantized AutoEncoders
forhaoliu/hybrid-discriminative-generative
Hybrid Discriminative-Generative Training via Contrastive Learning
forhaoliu/instructrl
Instruction Following Agents with Multimodal Transformers
forhaoliu/taming-maml
Taming MAML: efficient unbiased meta-reinforcement learning
forhaoliu/tux
Tools and Utils for Experiments (TUX)
forhaoliu/mini_apt
forhaoliu/jax_sac