This repo collects resources for DeepSeek R1.
More resources will be added over time.
- Base model: DeepSeek-V3 Technical Report
- GRPO algorithm: DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
- R1 paper: DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
- huggingface/open-r1: Fully open reproduction of DeepSeek-R1
- Unakar/Logic-RL: Reproduce R1-Zero on logic puzzles
- hkust-nlp/simpleRL-reason: A replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data
- agentica-org/DeepScaleR-1.5B-Preview · Hugging Face
- Jiayi-Pan/TinyZero: Clean, minimal, accessible reproduction of DeepSeek R1-Zero
- dhcode-cpp/X-R1: Minimal-cost training of a 0.5B R1-Zero
  - X-R1 aims to build an easy-to-use, low-cost training framework based on reinforcement learning to accelerate the development of scaling post-training
- lsdefine/simple_GRPO: A very simple GRPO implementation for reproducing R1-like LLM thinking
  - A simple open-source implementation whose core loss calculation follows Hugging Face's TRL; a minimal sketch of that loss appears below.
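Under simplifying assumptions, the loss such implementations compute looks roughly like this: the rewards of the G sampled completions for each prompt are normalized within their group to form advantages, which then enter a PPO-style clipped surrogate. The function name, tensor shapes, and the omission of the KL penalty against a reference model are assumptions for illustration, not TRL's or simple_GRPO's exact code.

```python
# Minimal GRPO loss sketch (assumptions: log-probs already summed per
# completion; the KL penalty against a reference model is omitted).
import torch

def grpo_loss(logp_new, logp_old, rewards, eps=0.2):
    """logp_new/logp_old: (n_prompts * G,) log-probs per completion;
    rewards: (n_prompts, G) with G sampled completions per prompt."""
    # Group-relative advantage: normalize each reward within its own group.
    adv = (rewards - rewards.mean(dim=1, keepdim=True)) / (
        rewards.std(dim=1, keepdim=True) + 1e-8)
    adv = adv.flatten()                      # one scalar advantage per completion
    ratio = torch.exp(logp_new - logp_old)   # policy likelihood ratio
    clipped = torch.clamp(ratio, 1 - eps, 1 + eps)
    # PPO-style pessimistic (clipped) surrogate, averaged over completions.
    return -torch.min(ratio * adv, clipped * adv).mean()
```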
- The Illustrated DeepSeek-R1 - by Jay Alammar
- DeepSeek Debates: Chinese Leadership On Cost, True Training Cost, Closed Model Margin Impacts – SemiAnalysis
- DeepSeek Night Talks: Technical Principles and Future Directions - Thoughts and Inspirations on DeepSeek R1
- DeepSeek Night Talks: Technical Principles and Future Directions - Technical Principles of Large-Scale Reinforcement Learning and an Assessment of LLM Technology Trends
- DeepSeek Night Talks: Technical Principles and Future Directions - A Summary of DeepSeek's System Software Optimizations
- DeepSeek Night Talks: Technical Principles and Future Directions - LLM Software and Hardware Optimization Through the Lens of DeepSeek
- Insights from the Emergence of DeepSeek
- Tips for Using DeepSeek
- Understanding Reasoning LLMs - by Sebastian Raschka, PhD
- R1-Zero and R1 Results and Analysis - ARC Prize
- 86 Key Reflections on DeepSeek
- DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters | Lex Fridman Podcast
- DeepSeek FAQ
- PKU-Alignment/Align-DS-V · Hugging Face
- Deep-Agent/R1-V: Witness the aha moment of VLM with less than $3.
- om-ai-lab/VLM-R1: Solve Visual Understanding with Reinforced VLMs
  - Since the introduction of DeepSeek-R1, numerous works have emerged that reproduce and improve upon it. This project proposes VLM-R1, a stable and generalizable R1-style large vision-language model.
- Deploying and invoking DeepSeek-R1-Distill-Qwen-7B with vLLM (see the sketch below)
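As a minimal usage sketch (not the tutorial's exact commands): the model id is the public Hugging Face checkpoint, the sampling settings follow the values commonly recommended for R1-distill models, and the prompt is illustrative.

```python
# Offline inference with vLLM for the distilled 7B model.
from vllm import LLM, SamplingParams

llm = LLM(model="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B")
params = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=4096)
outputs = llm.generate(["Prove that the square root of 2 is irrational."], params)
print(outputs[0].outputs[0].text)  # the reasoning trace precedes the answer
```

An OpenAI-compatible server can likewise be started with `vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-7B`.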
- ktransformers/doc/en/DeepseekR1_V3_tutorial.md at main · kvcache-ai/ktransformers
  - Supports DeepSeek-R1 and V3 on a single GPU (24 GB VRAM) or multiple GPUs plus 382 GB of DRAM, with up to a 3-28x speedup.
- A Detailed Introduction to Multi-Head Latent Attention (MLA); a concrete sketch follows
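To make the idea concrete, here is an illustrative sketch of MLA's central trick: keys and values are jointly compressed into a small latent vector, which is all the inference-time cache needs. Dimensions and class/attribute names are placeholders, and the decoupled RoPE path and causal mask of the real design are omitted for brevity.

```python
# Sketch of low-rank KV compression in Multi-Head Latent Attention.
import torch
import torch.nn as nn

class MLASketch(nn.Module):
    def __init__(self, d_model=4096, d_latent=512, n_heads=32):
        super().__init__()
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.w_dkv = nn.Linear(d_model, d_latent, bias=False)  # down-project to shared KV latent
        self.w_uk = nn.Linear(d_latent, d_model, bias=False)   # up-project latent to keys
        self.w_uv = nn.Linear(d_latent, d_model, bias=False)   # up-project latent to values
        self.w_q = nn.Linear(d_model, d_model, bias=False)
        self.w_o = nn.Linear(d_model, d_model, bias=False)

    def forward(self, x):                     # x: (batch, seq, d_model)
        b, t, _ = x.shape
        c_kv = self.w_dkv(x)                  # only this small latent is cached at inference
        split = lambda z: z.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        q, k, v = split(self.w_q(x)), split(self.w_uk(c_kv)), split(self.w_uv(c_kv))
        attn = torch.softmax(q @ k.transpose(-2, -1) / self.d_head**0.5, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, t, -1)
        return self.w_o(out)
```

Caching only `c_kv` (512 dimensions here) instead of full per-head keys and values is what shrinks the KV cache.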
- Understanding DeepSeek's Technical Evolution in One Article: Large Language Models, Vision-Language Understanding, and Unified Multimodal Models - Zhihu
- getAsterisk/deepclaude: A high-performance LLM inference API and chat UI that integrates DeepSeek R1's CoT reasoning traces with Anthropic Claude models
  - DeepClaude is a high-performance LLM inference API that combines DeepSeek R1's chain-of-thought (CoT) reasoning with Anthropic Claude's creative and code-generation prowess. It provides a unified interface for leveraging the strengths of both models while keeping complete control over your API keys and data; the pipeline is sketched below.
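A hedged sketch of that two-stage pipeline, assuming DeepSeek's OpenAI-compatible API (where `deepseek-reasoner` exposes the trace via `reasoning_content`) and the official Anthropic SDK; the Claude model name and the prompt format are placeholders, not DeepClaude's actual implementation.

```python
# Two-stage pipeline: R1 produces the reasoning trace, Claude writes the answer.
from openai import OpenAI
import anthropic

question = "Why is the sky blue?"

deepseek = OpenAI(api_key="...", base_url="https://api.deepseek.com")
r1 = deepseek.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": question}],
)
cot = r1.choices[0].message.reasoning_content  # R1's chain-of-thought trace

claude = anthropic.Anthropic(api_key="...")
final = claude.messages.create(
    model="claude-3-5-sonnet-latest",          # placeholder model name
    max_tokens=1024,
    messages=[{"role": "user", "content":
               f"Question: {question}\n\nReasoning trace:\n{cot}\n\nWrite the final answer."}],
)
print(final.content[0].text)
```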