shehper's Stars
pytorch/examples
A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc.
Doriandarko/claude-engineer
Claude Engineer is an interactive command-line interface (CLI) that leverages the power of Anthropic's Claude-3.5-Sonnet model to assist with software development tasks. This tool combines the capabilities of a large language model with practical file system operations and web search functionality.
probml/pyprobml
Python code for the "Probabilistic Machine Learning" book by Kevin Murphy
hijkzzz/Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
probml/pml-book
"Probabilistic Machine Learning" - a book series by Kevin Murphy
google-deepmind/open_spiel
OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games.
roboflow/sports
computer vision and sports
GAIR-NLP/O1-Journey
O1 Replication Journey: A Strategic Progress Report – Part I
google-deepmind/bsuite
bsuite is a collection of carefully-designed experiments that investigate core capabilities of a reinforcement learning (RL) agent
opendilab/LightZero
[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)
openai/mlsh
Code for the paper "Meta-Learning Shared Hierarchies"
greentfrapp/lucent
Lucid library adapted for PyTorch
BobMcDear/attorch
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
corl-team/CORL
High-quality single-file implementations of SOTA Offline and Offline-to-Online RL algorithms: AWAC, BC, CQL, DT, EDAC, IQL, SAC-N, TD3+BC, LB-SAC, SPOT, Cal-QL, ReBRAC
THUYimingLi/BackdoorBox
The open-sourced Python toolbox for backdoor attacks and defenses.
pytorch-labs/LeanRL
LeanRL is a fork of CleanRL, where selected PyTorch scripts are optimized for performance using compile and cudagraphs.
EleutherAI/sae
Sparse autoencoders
openai/sparse_autoencoder
project-numina/aimo-progress-prize
camlab-ethz/AI_Science_Engineering
This repository is the official project page of the course AI in the Sciences and Engineering, ETH Zurich.
cmu-l3/llmlean
LLMs + Lean, on your laptop or in the cloud
MichalBortkiewicz/JaxGCRL
Goal-Conditioned Reinforcement Learning with JAX
UT-Austin-RPL/amago
a simple and scalable agent for training adaptive policies with sequence-based RL
hughbzhang/o1_inference_scaling_laws
Replicating O1 inference-time scaling laws
openai/understanding-rl-vision
Code for the paper "Understanding RL Vision"
cassidylaidlaw/effective-horizon
Code and data for the paper "Bridging RL Theory and Practice with the Effective Horizon"
Jaykef/Triton-nanoGPT
ZiyueWang25/RLHF-Shakespeare
Fine-tune an LLM with RLHF to generate positive-tone messages from the Shakespeare corpus.
kslav/roc_analysis_gui
kslav/sat2nu
Sat2Nu, a U-Net architecture, for synthesizing nonfat-suppressed breast MRIs from fat-suppressed inputs.