shehper

shehper's Stars

pytorch/examples
A set of examples around PyTorch in Vision, Text, Reinforcement Learning, etc.
Language: Python 22.5k 399 6409.6k
Doriandarko/claude-engineer
Claude Engineer is an interactive command-line interface (CLI) that leverages the power of Anthropic's Claude-3.5-Sonnet model to assist with software development tasks. This tool combines the capabilities of a large language model with practical file system operations and web search functionality.
Language: Python 9.7k 136 1291k
probml/pyprobml
Python code for the book "Probabilistic Machine Learning" by Kevin Murphy
Language: Jupyter Notebook 6.6k 192 4561.5k
hijkzzz/Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
5.4k 86 9297
probml/pml-book
"Probabilistic Machine Learning" - a book series by Kevin Murphy
Language: Jupyter Notebook 5k 88 656597
google-deepmind/open_spiel
OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games.
Language: C++ 4.3k 107 561937
roboflow/sports
Computer vision and sports
Language: Python 2.6k 53 24287
GAIR-NLP/O1-Journey
O1 Replication Journey: A Strategic Progress Report – Part I
1.6k 25 1145
google-deepmind/bsuite
bsuite is a collection of carefully-designed experiments that investigate core capabilities of a reinforcement learning (RL) agent
Language: Python 1.5k 60 31182
opendilab/LightZero
[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)
Language: Python 1.2k 13 106122
openai/mlsh
Code for the paper "Meta-Learning Shared Hierarchies"
Language: Python 614 46 14164
greentfrapp/lucent
Lucid library adapted for PyTorch
Language: Python 609 16 3088
BobMcDear/attorch
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
Language: Python 486 11 622
corl-team/CORL
High-quality single-file implementations of SOTA Offline and Offline-to-Online RL algorithms: AWAC, BC, CQL, DT, EDAC, IQL, SAC-N, TD3+BC, LB-SAC, SPOT, Cal-QL, ReBRAC
Language: Python 486 3 1021
THUYimingLi/BackdoorBox
The open-sourced Python toolbox for backdoor attacks and defenses.
Language: Python 470 7 1873
pytorch-labs/LeanRL
LeanRL is a fork of CleanRL in which selected PyTorch scripts are optimized for performance using compile and cudagraphs.
Language: Python 466 8 917
EleutherAI/sae
Sparse autoencoders
Language: Python 353 7 1449
openai/sparse_autoencoder
Language: Python 349 11 1435
project-numina/aimo-progress-prize
Language: Jupyter Notebook 329 7 1025
camlab-ethz/AI_Science_Engineering
This repository is the official project page of the course AI in the Sciences and Engineering, ETH Zurich.
Language: Jupyter Notebook 137 4 225
cmu-l3/llmlean
LLMs + Lean, on your laptop or in the cloud
Language: Lean 125 5 116
MichalBortkiewicz/JaxGCRL
Goal-Conditioned Reinforcement Learning with JAX
Language: Jupyter Notebook 96 1 013
UT-Austin-RPL/amago
A simple and scalable agent for training adaptive policies with sequence-based RL
Language: Python 93 2 104
hughbzhang/o1_inference_scaling_laws
Replicating O1 inference-time scaling laws
Language: Python 52 1 01
openai/understanding-rl-vision
Code for the paper "Understanding RL Vision"
Language: Python 43 6 021
cassidylaidlaw/effective-horizon
Code and data for the paper "Bridging RL Theory and Practice with the Effective Horizon"
Language: Python 42 3 36
Jaykef/Triton-nanoGPT
Language: Python 13 1 00
ZiyueWang25/RLHF-Shakespeare
Fine-tune an LLM with RLHF to generate positive-tone messages from the Shakespeare corpus.
Language: Python 4 1 00
kslav/roc_analysis_gui
Language: Python 2 1 00
kslav/sat2nu
Sat2Nu, a U-Net architecture for synthesizing non-fat-suppressed breast MRIs from fat-suppressed inputs.
Language: Python 1 1 00