Update README.md #11
by brvv · opened
README.md CHANGED

@@ -5,7 +5,7 @@ base_model:
 - deepseek-ai/DeepSeek-V3.2-Exp-Base
 base_model_relation: finetune
 ---
-# DeepSeek-V3.2: Efficient Reasoning & Agentic AI
+# DeepSeek-V3.2-Speciale: Efficient Reasoning & Agentic AI

 <!-- markdownlint-disable first-line-h1 -->
 <!-- markdownlint-disable html -->
@@ -49,7 +49,7 @@ base_model_relation: finetune

 ## Introduction

-We introduce **DeepSeek-V3.2**, a model that harmonizes high computational efficiency with superior reasoning and agent performance. Our approach is built upon three key technical breakthroughs:
+We introduce **DeepSeek-V3.2-Speciale**, a model that harmonizes high computational efficiency with superior reasoning and agent performance. Our approach is built upon three key technical breakthroughs:

 1. **DeepSeek Sparse Attention (DSA):** We introduce DSA, an efficient attention mechanism that substantially reduces computational complexity while preserving model performance, specifically optimized for long-context scenarios.
 2. **Scalable Reinforcement Learning Framework:** By implementing a robust RL protocol and scaling post-training compute, *DeepSeek-V3.2* performs comparably to GPT-5. Notably, our high-compute variant, **DeepSeek-V3.2-Speciale**, **surpasses GPT-5** and exhibits reasoning proficiency on par with Gemini-3.0-Pro.
@@ -64,7 +64,7 @@ We have also released the final submissions for IOI 2025, ICPC World Finals, IMO

 ## Chat Template

-DeepSeek-V3.2 introduces significant updates to its chat template compared to prior versions. The primary changes involve a revised format for tool calling and the introduction of a "thinking with tools" capability.
+DeepSeek-V3.2-Speciale introduces significant updates to its chat template compared to prior versions. The primary changes involve a revised format for tool calling and the introduction of a "thinking with tools" capability.

 To assist the community in understanding and adapting to this new template, we have provided a dedicated `encoding` folder, which contains Python scripts and test cases demonstrating how to encode messages in OpenAI-compatible format into input strings for the model and how to parse the model's text output.
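The encode/parse workflow that the Chat Template section refers to can be sketched as follows. This is a minimal illustration only: the special tokens, function names, and tool-call delimiters below are assumptions for the sketch, not the actual template or the scripts in the `encoding` folder, which should be consulted for the real format.

```python
import json

# Hypothetical special tokens -- the real chat-template tokens are defined
# by the scripts in the repository's `encoding` folder and may differ.
BOS = "<|begin|>"
TOOL_CALL_START, TOOL_CALL_END = "<|tool_call|>", "<|/tool_call|>"

def encode_messages(messages):
    """Render an OpenAI-compatible message list into one prompt string.

    Each message is a dict with "role" and "content" keys, as in the
    OpenAI chat format; roles are wrapped in illustrative markers.
    """
    parts = [f"<|{msg['role']}|>{msg['content']}" for msg in messages]
    parts.append("<|assistant|>")  # cue the model to produce a reply
    return BOS + "".join(parts)

def parse_tool_call(text):
    """Extract a JSON tool call from model output, if one is present.

    Assumes the (hypothetical) convention that a tool call is a JSON
    object wrapped in TOOL_CALL_START / TOOL_CALL_END markers.
    """
    if TOOL_CALL_START in text and TOOL_CALL_END in text:
        payload = text.split(TOOL_CALL_START, 1)[1].split(TOOL_CALL_END, 1)[0]
        return json.loads(payload)
    return None  # plain text reply, no tool call
```

A round trip under these assumptions: `encode_messages([{"role": "user", "content": "hi"}])` yields a single prompt string, and `parse_tool_call` recovers the structured call from a marked-up reply while returning `None` for plain text.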