r1

There are 51 repositories under the r1 topic.

  • zzli2022/Awesome-System2-Reasoning-LLM

    Latest Advances on System-2 Reasoning

    Language:Python1.2k11869
  • turningpoint-ai/VisualThinker-R1-Zero

    Explore the Multimodal “Aha Moment” on 2B Model

    Language:Python608151022
  • jingyi0000/R1-VL

    R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization

    Language:Python4230
  • modelscope/awesome-deep-reasoning

    Collect every awesome work about r1!

    Language:Python4166015
  • XiaoYee/Awesome_Efficient_LRM_Reasoning

    😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, Agent, and Beyond

  • DMontgomery40/deepseek-mcp-server

    Model Context Protocol server for DeepSeek's advanced language models

    Language:JavaScript2711217
  • RyanLiu112/compute-optimal-tts

    Official codebase for "Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling".

    Language:Python27181421
  • SmallDoges/small-doge

    Doge Family of Small Language Models

    Language:Python1733613
  • sun-hailong/TVC

    [ACL 2025] The code repository for "Mitigating Visual Forgetting via Take-along Visual Conditioning for Multi-modal Long CoT Reasoning" in PyTorch.

    Language:Python143111
  • CJReinforce/PURE

    Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"

    Language:Python136223
  • RyanLiu112/Awesome-Process-Reward-Models

    A comprehensive collection of process reward models.

  • RyanLiu112/GenPRM

    Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".

    Language:Python810
  • HJYao00/Awesome-Reasoning-MLLM

    Awesome Reasoning in MLLMs: Papers and Projects about learning to reason with MLLMs, including Chain-of-Thought (CoT), OpenAI o1, and DeepSeek-R1

  • The-Martyr/Awesome-Multimodal-Reasoning

    Latest Advances on (RL based) Multimodal Reasoning and Generation in Multimodal Large Language Models

  • LazaUK/AIFoundry-DeepSeek-SDK

    Notebooks to demo the use of Azure AI Python SDK / LangChain with DeepSeek R1 reasoning model in Azure AI Foundry.

    Language:Jupyter Notebook31206
  • glide-the/InterpretationoDreams

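    Agent tasks designed with LangChain, including planning of conversation-scenario resources, subtask construction, and a task executor that includes MCTS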

    Language:Jupyter Notebook29102
  • sylvain-wei/24-Game-Reasoning

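    A super-simple reproduction of Deepseek-R1-Zero and Deepseek-R1, using the 24 game as an example. Uses zero-RL, SFT, and SFT+RL to elicit the LLM's capacity for self-verification and reflection. Clean, minimal, accessible reproduction of DeepSeek R1-Zero, DeepSeek R1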

    Language:Python26122
  • lachlancresswell/AutoR1

    Auto-generate fallback and meter display from existing group info in d&b audiotechnik's R1 and ArrayCalc software.

    Language:Python21772
  • sdiehl/tiny-r1

    Recreating the minimal training methods of DeepSeek-R1 for small language models.

    Language:Python21103
  • The-Swarm-Corporation/AgentGym

    A framework that makes it effortless to convert any LLM into a reasoning agent like o1 or DeepSeek's R1

    Language:Python2010
  • BY571/DistRL-LLM

    Distributed Reinforcement Learning for LLM Fine-Tuning with multi-GPU utilization

    Language:Python19101
  • IoTDevice/phicomm-r1-controler

    Control program for the Phicomm R1 speaker

    Language:Go19102
  • tyler-romero/microR1

    Simple repository for training small reasoning models

    Language:Python121
  • ericsson-iap/python-sample-app

    Python Sample App for SMO Systems like Ericsson Intelligent Automation Platform. We aim to be ORAN aligned. Use this to kickstart your own app!

    Language:Python11510
  • nschlaepfer/ChainForge-R1-SuperCoT

    A multi-stage pipeline that enhances Qwen2.5 language models with DeepSeek Reasoner's chain-of-thought capabilities. Implements the DeepSeek-R1 methodology through cold-start SFT, reasoning-oriented RL, rejection sampling, and optional model distillation.

    Language:Python10203
  • lechmazur/goods

    LLM public goods game

  • OnerootProject/r1

    R1 Protocol

    Language:JavaScript7505
  • Xuchen-Li/OvO-R1

    Exploring the influence of using end-to-end reinforcement learning and various reward functions on the reasoning capabilities of different 1.5B base models.

    Language:Python50
  • ericsson-iap/go-sample-app

    Go Sample App for SMO Systems like Ericsson Intelligent Automation Platform. We aim to be ORAN aligned. Use this to kickstart your own app!

    Language:Go4301
  • PINT-NMR/PINT

    NMR spectroscopy software for line shape fitting and downstream analysis

    Language:HTML4101
  • Trae1ounG/Chinese-Logic-RL

    Exploring R1 on Logic Puzzle in Chinese

    Language:Python3
  • Berstarhunter/deepseek-start

    deepseek-start is a powerful tool designed for deep searching and analysis of large datasets, allowing users to efficiently navigate through complex data structures with ease. With its intuitive interface and advanced algorithms, deepseek-start provides researchers and analysts with the means to uncover valuable insights and patterns hidden within

  • Kuberwastaken/free-deep-research

    My free implementation of @dzhng's implementation of OpenAI's new Deep Research agent. Get (almost) the same capability for free. You can even tweak the behavior of the agent with adjustable breadth and depth. Run it for 5 min or 5 hours, it'll auto adjust :)

    Language:TypeScript2100
  • Kuberwastaken/TREAT-R1

    A DeepSeek R1 version of TREAT: An Open-Source AI Web App to Detect Triggering Content in Movies and Shows

    Language:Python2101
  • NEBYTE/deepseek-rs

    DeepSeek-RS is a personal project implementing DeepSeek's architecture in Rust for learning and experimentation. This is not an official DeepSeek project.

    Language:Rust20
  • SYSTEMS-OPERATOR/SUPER-POLE-POSITION

    HYPERPOLE GYM

    Language:Python1100