twidddj

@kakaobrain · Seoul, Korea

twidddj's Stars

lucidrains/PaLM-rlhf-pytorch
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
Language: Python · 7.7k 143 47668
heejkoo/Awesome-Diffusion-Models
A collection of resources and papers on Diffusion Models
Language: HTML · 7.6k 244 26646
deepmind/alphatensor
Language: Python · 2.5k 53 9212
uber-research/go-explore
Code for Go-Explore: a New Approach for Hard-Exploration Problems
Language: Python · 558 19 1399
utilForever/rl-paper-study
Reinforcement Learning paper review study
218 22 342
AmenRa/retriv
A Python Search Engine for Humans 🥸
Language: Python · 183 8 3722
ezelikman/STaR
Code for STaR: Bootstrapping Reasoning With Reasoning (NeurIPS 2022)
Language: Python · 158 3 119
facebookresearch/NPM
The original implementation of Min et al. "Nonparametric Masked Language Modeling" (paper: https://arxiv.org/abs/2212.01349)
Language: Python · 156 7 414
utilForever/baba-is-auto
Baba Is You simulator using C++ with some reinforcement learning
Language: Python · 152 8 2819
kakaobrain/brain_agent
Brain Agent for Large-Scale and Multi-Task Agent Learning
Language: Python · 80 3 212
pokaxpoka/B_Pref
Language: Jupyter Notebook · 44 2 210
kakaobrain/stg
Official implementation of Selective Token Generation (COLING'22)
Language: Jupyter Notebook · 8 4 00