koulanurag

Applied Scientist 2 at Amazon | LLM for Code | Deep Reinforcement Learning

Amazon
New York, New York

koulanurag's Stars

openai/openai-cookbook
Examples and guides for using the OpenAI API
Language: MDX 60.8k 894 4879.7k
rasbt/LLMs-from-scratch
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Language: Jupyter Notebook 36.3k 394 1094.6k
astral-sh/ruff
An extremely fast Python linter and code formatter, written in Rust.
Language: Rust 34.1k 84 5.8k 1.1k
Aider-AI/aider
aider is AI pair programming in your terminal
Language: Python 23.8k 157 2.3k 2.2k
ml-explore/mlx
MLX: An array framework for Apple silicon
Language: C++ 18k 148 5891k
KindXiaoming/pykan
Kolmogorov Arnold Networks
Language: Jupyter Notebook 15.3k 112 4171.4k
huggingface/trl
Train transformer language models with reinforcement learning.
Language: Python 10.4k 77 1.3k 1.3k
karpathy/minbpe
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
Language: Python 9.3k 85 38875
kyutai-labs/moshi
Language: Python 7.1k 80 92551
pytorch-labs/gpt-fast
Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
Language: Python 5.7k 60 106521
google/gemma_pytorch
The official PyTorch implementation of Google's Gemma models
Language: Python 5.3k 39 41514
opendilab/awesome-RLHF
A curated list of reinforcement learning with human feedback resources (continually updated)
3.6k 62 4218
cohere-ai/cohere-toolkit
Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications.
Language: TypeScript 2.9k 42 55371
pytorch/torchtitan
A PyTorch native library for large model training
Language: Python 2.8k 44 198228
facebookresearch/Pearl
A Production-ready Reinforcement Learning AI Agent Library brought by the Applied Reinforcement Learning team at Meta.
Language: Jupyter Notebook 2.7k 35 59170
eric-mitchell/direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
Language: Python 2.3k 19 84189
maitrix-org/llm-reasoners
A library for advanced large language model reasoning
Language: Python 1.6k 19 48138
JonasGeiping/cramming
Cramming the training of a (BERT-type) language model into limited compute.
Language: Python 1.3k 23 3499
google-deepmind/rlax
Language: Python 1.3k 34 2688
openai/lm-human-preferences
Code for the paper Fine-Tuning Language Models from Human Preferences
Language: Python 1.3k 23 15164
sarthakrastogi/quality-prompts
Language: Python 716 8 447
alexmolas/microsearch
Language: Python 397 3 233
facebookresearch/searchformer
Official codebase for the paper "Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping".
Language: Jupyter Notebook 343 5 118
VIRL-Platform/VIRL
(ECCV 2024) Code for V-IRL: Grounding Virtual Intelligence in Real Life
Language: Python 319 12 513
haoliuhl/chain-of-hindsight
Simple next-token-prediction for RLHF
Language: Python 219 4 1717
gautierdag/bpeasy
Fast bare-bones BPE for modern tokenizer training
Language: Python 142 2 33
microsoft/LLF-Bench
A benchmark for evaluating learning agents based on just language feedback
Language: Python 61 6 613
vwxyzjn/gym-microrts-paper
The source code for the gym-microrts paper.
Language: Python 42 4 63
understanding-search/maze-dataset
maze datasets for investigating OOD behavior of ML systems
Language: Jupyter Notebook 18 4 224
RAIVNLab/SuperposedDecoding
Code repository for the paper - "Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass"
Language: Python 17 7 04
