Update README.md #11
by brvv · opened
README.md CHANGED

@@ -5,7 +5,7 @@ base_model:
 - deepseek-ai/DeepSeek-V3.2-Exp-Base
 base_model_relation: finetune
 ---
-# DeepSeek-V3.2: Efficient Reasoning & Agentic AI
+# DeepSeek-V3.2-Speciale: Efficient Reasoning & Agentic AI

 <!-- markdownlint-disable first-line-h1 -->
 <!-- markdownlint-disable html -->
@@ -49,7 +49,7 @@ base_model_relation: finetune

 ## Introduction

-We introduce **DeepSeek-V3.2**, a model that harmonizes high computational efficiency with superior reasoning and agent performance. Our approach is built upon three key technical breakthroughs:
+We introduce **DeepSeek-V3.2-Speciale**, a model that harmonizes high computational efficiency with superior reasoning and agent performance. Our approach is built upon three key technical breakthroughs:

 1. **DeepSeek Sparse Attention (DSA):** We introduce DSA, an efficient attention mechanism that substantially reduces computational complexity while preserving model performance, specifically optimized for long-context scenarios.
 2. **Scalable Reinforcement Learning Framework:** By implementing a robust RL protocol and scaling post-training compute, *DeepSeek-V3.2* performs comparably to GPT-5. Notably, our high-compute variant, **DeepSeek-V3.2-Speciale**, **surpasses GPT-5** and exhibits reasoning proficiency on par with Gemini-3.0-Pro.
@@ -64,7 +64,7 @@ We have also released the final submissions for IOI 2025, ICPC World Finals, IMO

 ## Chat Template

-DeepSeek-V3.2 introduces significant updates to its chat template compared to prior versions. The primary changes involve a revised format for tool calling and the introduction of a "thinking with tools" capability.
+DeepSeek-V3.2-Speciale introduces significant updates to its chat template compared to prior versions. The primary changes involve a revised format for tool calling and the introduction of a "thinking with tools" capability.

 To assist the community in understanding and adapting to this new template, we have provided a dedicated `encoding` folder, which contains Python scripts and test cases demonstrating how to encode messages in OpenAI-compatible format into input strings for the model and how to parse the model's text output.
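The encode/parse workflow that the Chat Template section refers to can be sketched as follows. This is a minimal illustration only: the special tokens, function names, and tool-call delimiters below are assumptions for the sketch, not the actual template or the scripts in the `encoding` folder, which should be consulted for the real format.

```python
import json

# Hypothetical special tokens -- the real chat-template tokens are defined
# by the scripts in the repository's `encoding` folder and may differ.
BOS = "<|begin|>"
TOOL_CALL_START, TOOL_CALL_END = "<|tool_call|>", "<|/tool_call|>"

def encode_messages(messages):
    """Render an OpenAI-compatible message list into one prompt string.

    Each message is a dict with "role" and "content" keys, as in the
    OpenAI chat format; roles are wrapped in illustrative markers.
    """
    parts = [f"<|{msg['role']}|>{msg['content']}" for msg in messages]
    parts.append("<|assistant|>")  # cue the model to produce a reply
    return BOS + "".join(parts)

def parse_tool_call(text):
    """Extract a JSON tool call from model output, if one is present.

    Assumes the (hypothetical) convention that a tool call is a JSON
    object wrapped in TOOL_CALL_START / TOOL_CALL_END markers.
    """
    if TOOL_CALL_START in text and TOOL_CALL_END in text:
        payload = text.split(TOOL_CALL_START, 1)[1].split(TOOL_CALL_END, 1)[0]
        return json.loads(payload)
    return None  # plain text reply, no tool call
```

A round trip under these assumptions: `encode_messages([{"role": "user", "content": "hi"}])` yields a single prompt string, and `parse_tool_call` recovers the structured call from a marked-up reply while returning `None` for plain text.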