Pinned Repositories
--headful
Make a web browser multimodal, give it eyes and ears.
chat-your-code
Ask questions about your codebase using GPT
cheatcode
your cheatcode for productivity
distilabel
Distilabel is a framework for synthetic data and AI feedback for AI engineers who require high-quality outputs, full data ownership, and overall efficiency.
dotfiles-v1
germanrag
GermanRAG - a German dataset for finetuning Retrieval Augmented Generation
inference-is-all-you-need
mp-transformer
Learn latent primitives of human movement.
smolR1
reproducing DeepSeek R1 Zero with Qwen2.5-0.5B on two 4090 GPUs
rasdani's Repositories
rasdani/smolR1
reproducing DeepSeek R1 Zero with Qwen2.5-0.5B on two 4090 GPUs
rasdani/atropos
Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse environments
rasdani/build-smolGRPO
rasdani/dotfiles
managed by chezmoi
rasdani/evalchemy
Automatic evals for LLMs
rasdani/genesys
rasdani/gpt-oss
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
rasdani/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
rasdani/j1-micro
j1-micro (1.7B) & j1-nano (600M) are absurdly tiny but mighty reward models.
rasdani/lerobot
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
rasdani/modal-examples
Examples of programs built using Modal
rasdani/nano-vllm
Nano vLLM
rasdani/open-instruct
AllenAI's post-training codebase
rasdani/OpenHands
🙌 OpenHands: Code Less, Make More
rasdani/prime
prime is a framework for efficient, globally distributed training of AI models over the internet.
rasdani/prime-cli
The Prime Intellect CLI provides a powerful command-line interface for managing GPU resources across various providers
rasdani/prime-rl
rasdani/R2E-Gym
Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents
rasdani/ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
rasdani/reasoning-gym
procedural reasoning datasets
rasdani/rllm
Democratizing Reinforcement Learning for LLMs
rasdani/simpleRL-reason
This is a replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data
rasdani/SkyRL
SkyRL: A Modular Full-stack RL Library for LLMs
rasdani/SWE-smith
Scaling Data for SWE-agents
rasdani/triton
Development repository for the Triton language and compiler
rasdani/unsloth
Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥
rasdani/verifiers
Verifiers for LLM Reinforcement Learning
rasdani/verl
verl: Volcano Engine Reinforcement Learning for LLMs
rasdani/verl-internvl
rasdani/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs