koulanurag
Applied Scientist 2 at Amazon | LLM for Code | Deep Reinforcement Learning
Amazon, New York, New York
koulanurag's Stars
openai/openai-cookbook
Examples and guides for using the OpenAI API
rasbt/LLMs-from-scratch
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
astral-sh/ruff
An extremely fast Python linter and code formatter, written in Rust.
Aider-AI/aider
aider is AI pair programming in your terminal
ml-explore/mlx
MLX: An array framework for Apple silicon
KindXiaoming/pykan
Kolmogorov Arnold Networks
huggingface/trl
Train transformer language models with reinforcement learning.
karpathy/minbpe
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
kyutai-labs/moshi
pytorch-labs/gpt-fast
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
google/gemma_pytorch
The official PyTorch implementation of Google's Gemma models
opendilab/awesome-RLHF
A curated list of reinforcement learning with human feedback resources (continually updated)
cohere-ai/cohere-toolkit
Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications.
pytorch/torchtitan
A PyTorch native library for large model training
facebookresearch/Pearl
A Production-ready Reinforcement Learning AI Agent Library brought by the Applied Reinforcement Learning team at Meta.
eric-mitchell/direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
maitrix-org/llm-reasoners
A library for advanced large language model reasoning
JonasGeiping/cramming
Cramming the training of a (BERT-type) language model into limited compute.
google-deepmind/rlax
openai/lm-human-preferences
Code for the paper Fine-Tuning Language Models from Human Preferences
sarthakrastogi/quality-prompts
alexmolas/microsearch
facebookresearch/searchformer
Official codebase for the paper "Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping".
VIRL-Platform/VIRL
(ECCV 2024) Code for V-IRL: Grounding Virtual Intelligence in Real Life
haoliuhl/chain-of-hindsight
Simple next-token-prediction for RLHF
gautierdag/bpeasy
Fast bare-bones BPE for modern tokenizer training
microsoft/LLF-Bench
A benchmark for evaluating learning agents based on just language feedback
vwxyzjn/gym-microrts-paper
The source code for the gym-microrts paper.
understanding-search/maze-dataset
maze datasets for investigating OOD behavior of ML systems
RAIVNLab/SuperposedDecoding
Code repository for the paper - "Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass"