twidddj's Stars
lucidrains/PaLM-rlhf-pytorch
Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
heejkoo/Awesome-Diffusion-Models
A collection of resources and papers on Diffusion Models
deepmind/alphatensor
uber-research/go-explore
Code for Go-Explore: a New Approach for Hard-Exploration Problems
utilForever/rl-paper-study
Reinforcement Learning paper review study
AmenRa/retriv
A Python Search Engine for Humans 🥸
ezelikman/STaR
Code for STaR: Bootstrapping Reasoning With Reasoning (NeurIPS 2022)
facebookresearch/NPM
The original implementation of Min et al. "Nonparametric Masked Language Modeling" (paper: https://arxiv.org/abs/2212.01349)
utilForever/baba-is-auto
Baba Is You simulator using C++ with some reinforcement learning
kakaobrain/brain_agent
Brain Agent for Large-Scale and Multi-Task Agent Learning
pokaxpoka/B_Pref
kakaobrain/stg
Official implementation of Selective Token Generation (COLING'22)