DeepSeek-R1-0528-Qwen3-8B is obtained by distilling the chain-of-thought of DeepSeek-R1-0528 into Qwen3 8B Base, and it achieves state-of-the-art performance among open-source models on AIME 2024. For more information, visit the DeepSeek homepage and the Hugging Face repository.
You can chat with DeepSeek-R1 on the official website chat.deepseek.com (enable "DeepThink" mode).
API access is available at platform.deepseek.com.
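The DeepSeek API is OpenAI-compatible, so an existing OpenAI SDK client can be pointed at it by changing the base URL. Below is a minimal sketch, assuming an API key obtained from platform.deepseek.com and that the reasoning model is exposed under the `deepseek-reasoner` name:

```python
# Minimal sketch: calling the DeepSeek API through its OpenAI-compatible endpoint.
# Assumes DEEPSEEK_API_KEY is set and that "deepseek-reasoner" serves the R1 model.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",
)

response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
)
print(response.choices[0].message.content)
```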
To run the model locally:
```bash
pip install transformers torch
```
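After installing the dependencies, the model can be loaded through the standard Transformers API. Below is a minimal sketch, assuming the Hugging Face repository id `deepseek-ai/DeepSeek-R1-0528-Qwen3-8B` and enough GPU memory for the 8B weights:

```python
# Minimal sketch: load the distilled 8B model locally and generate a response.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "How many prime numbers are there below 100?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Sampling settings follow DeepSeek's recommended temperature of 0.6 and top_p of 0.95.
outputs = model.generate(inputs, max_new_tokens=4096, do_sample=True, temperature=0.6, top_p=0.95)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```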
| Benchmark | DeepSeek-R1-0528-Qwen3-8B | Qwen3-8B | Improvement (pts) |
|---|---|---|---|
| AIME 2024 | 86.0% | 76.0% | +10.0 |
| AIME 2025 | 76.3% | 67.3% | +9.0 |
| HMMT Feb 25 | 61.5% | - | - |
| GPQA Diamond | 61.1% | 62.0% | -0.9 |
| LiveCodeBench | 60.5% | - | - |
```python
# Prompt template for file uploads: wrap the file name and content, then append the user's question.
file_template = """[file name]: {file_name}
[file content begin]
{file_content}
[file content end]
{question}"""
```
```python
# Prompt template for web search: inject the retrieved results and the current date before the user's question.
search_template = '''# The following contents are the search results:
{search_results}
Today is {cur_date}
# The user's message is:
{question}'''
```
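Similarly, the search template expects the retrieved documents and the current date; the search result snippet and question below are hypothetical:

```python
# Hypothetical example of filling the web-search prompt template.
from datetime import date

prompt = search_template.format(
    search_results="[webpage 1 begin] DeepSeek released R1-0528 in May 2025. [webpage 1 end]",
    cur_date=date.today().strftime("%Y-%m-%d"),
    question="When was DeepSeek-R1-0528 released?",
)
print(prompt)
```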
The model is released under the MIT License and supports commercial use.
```bibtex
@misc{deepseekai2025deepseekr1incentivizingreasoningcapability,
      title={DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning},
      author={DeepSeek-AI},
      year={2025},
      eprint={2501.12948},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```
For more details, see the research paper.