---
license: other
license_name: deepseek
license_link: LICENSE
tags:
- heretic
- uncensored
- decensored
- abliterated
---
# This is a decensored version of [deepseek-ai/deepseek-coder-33b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-33b-instruct), made using [Heretic](https://github.com/p-e-w/heretic) v1.0.1

## Abliteration parameters

| Parameter | Value |
| :-------- | :---: |
| **direction_index** | 25.61 |
| **attn.o_proj.max_weight** | 1.49 |
| **attn.o_proj.max_weight_position** | 37.32 |
| **attn.o_proj.min_weight** | 1.45 |
| **attn.o_proj.min_weight_distance** | 31.10 |
| **mlp.down_proj.max_weight** | 0.81 |
| **mlp.down_proj.max_weight_position** | 56.20 |
| **mlp.down_proj.min_weight** | 0.62 |
| **mlp.down_proj.min_weight_distance** | 5.57 |

## Performance

| Metric | This model | Original model ([deepseek-ai/deepseek-coder-33b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-33b-instruct)) |
| :----- | :--------: | :---------------------------: |
| **KL divergence** | 0.02 | 0 *(by definition)* |
| **Refusals** | 70/100 | 97/100 |
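
The KL divergence above quantifies how far this model's next-token distributions have drifted from the original model's (0 means identical behavior on harmless prompts). As a minimal illustrative sketch (toy distributions in plain Python, not the actual per-token measurement Heretic performs):

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy next-token distributions over a 3-token vocabulary (illustrative only)
p = [0.7, 0.2, 0.1]    # original model
q = [0.6, 0.25, 0.15]  # modified model
print(f"KL(p || q) = {kl_divergence(p, q):.4f} nats")
```

A small value such as the 0.02 reported here indicates the modified model's output distribution stays very close to the original on ordinary inputs.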

-----



<p align="center">
<img width="1000px" alt="DeepSeek Coder" src="https://github.com/deepseek-ai/DeepSeek-Coder/blob/main/pictures/logo.png?raw=true">
</p>
<p align="center"><a href="https://www.deepseek.com/">[🏠Homepage]</a>  |  <a href="https://coder.deepseek.com/">[🤖 Chat with DeepSeek Coder]</a>  |  <a href="https://discord.gg/Tc7c45Zzu5">[Discord]</a>  |  <a href="https://github.com/guoday/assert/blob/main/QR.png?raw=true">[Wechat(微信)]</a> </p>
<hr>



### 1. Introduction of Deepseek Coder

Deepseek Coder is composed of a series of code language models, each trained from scratch on 2T tokens with a composition of 87% code and 13% natural language in both English and Chinese. We provide various sizes of the code model, ranging from 1B to 33B versions. Each model is pre-trained on a project-level code corpus with a window size of 16K and an extra fill-in-the-blank task, to support project-level code completion and infilling. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various benchmarks.

- **Massive Training Data**: Trained from scratch on 2T tokens, including 87% code and 13% natural language data in both English and Chinese.
  
- **Highly Flexible & Scalable**: Offered in model sizes of 1.3B, 5.7B, 6.7B, and 33B, enabling users to choose the setup most suitable for their requirements.
  
- **Superior Model Performance**: State-of-the-art performance among publicly available code models on HumanEval, MultiPL-E, MBPP, DS-1000, and APPS benchmarks.
  
- **Advanced Code Completion Capabilities**: A window size of 16K and a fill-in-the-blank task, supporting project-level code completion and infilling tasks.
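
The fill-in-the-blank training mentioned above enables infilling via sentinel tokens in the prompt. As a hedged sketch: the sentinel strings below are assumptions based on DeepSeek's published examples, so verify them against the tokenizer's special tokens before relying on them.

```python
# NOTE: these sentinel strings are assumptions; confirm against
# tokenizer.special_tokens_map for the actual model before use.
FIM_BEGIN = "<｜fim▁begin｜>"
FIM_HOLE = "<｜fim▁hole｜>"
FIM_END = "<｜fim▁end｜>"

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble an infilling prompt: the model generates code for the hole."""
    return f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}"
```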

 
  
### 2. Model Summary
deepseek-coder-33b-instruct is a 33B parameter model initialized from deepseek-coder-33b-base and fine-tuned on 2B tokens of instruction data.
- **Home Page:** [DeepSeek](https://deepseek.com/)
- **Repository:** [deepseek-ai/deepseek-coder](https://github.com/deepseek-ai/deepseek-coder)
- **Chat With DeepSeek Coder:** [DeepSeek-Coder](https://coder.deepseek.com/)


### 3. How to Use
Here are some examples of how to use our model.
#### Chat Model Inference
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-coder-33b-instruct", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/deepseek-coder-33b-instruct", trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()
messages = [
    {'role': 'user', 'content': "write a quick sort algorithm in python."}
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
# tokenizer.eos_token_id is the id of the <|EOT|> token
outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, top_k=50, top_p=0.95, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True))
```
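
The chat model typically wraps its answer in a fenced Markdown block. A small, hypothetical post-processing helper (not part of any DeepSeek API) for pulling the code out of such a reply:

```python
import re

def extract_code_block(reply: str, lang: str = "python"):
    """Return the body of the first ```lang fenced block in reply, or None."""
    match = re.search(rf"```{lang}\n(.*?)```", reply, re.DOTALL)
    return match.group(1).rstrip() if match else None

reply = "Sure!\n```python\nprint('hello')\n```\nHope that helps."
code = extract_code_block(reply)
```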

### 4. License
This code repository is licensed under the MIT License. The use of DeepSeek Coder models is subject to the Model License. DeepSeek Coder supports commercial use.

See the [LICENSE-MODEL](https://github.com/deepseek-ai/deepseek-coder/blob/main/LICENSE-MODEL) for more details.

### 5. Contact

If you have any questions, please raise an issue or contact us at [agi_code@deepseek.com](mailto:agi_code@deepseek.com).



---

## Important Disclaimer

**This model has been modified to remove safety guardrails and refusal behaviors.**

### Intended Use
- Research and educational purposes
- Understanding model behavior and limitations
- Creative writing and roleplay with consenting adults
- Red-teaming and safety research

### Not Intended For
- Generating harmful, illegal, or unethical content
- Harassment, abuse, or malicious activities
- Misinformation or deception
- Any use that violates applicable laws

### User Responsibility
By using this model, you acknowledge that:
1. **You are solely responsible** for how you use this model and any content it generates
2. The model creator **accepts no liability** for misuse or harmful outputs
3. You will comply with all applicable laws and ethical guidelines
4. You understand this model may produce inaccurate, biased, or inappropriate content

### Technical Note
This model was created using abliteration techniques that suppress the "refusal direction" in the model's activation space. This does not add new capabilities—it only removes trained refusal behaviors from the base model.
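
As an illustrative sketch of the idea (not Heretic's actual implementation), directional ablation removes the component of a hidden state along a unit "refusal direction":

```python
import numpy as np

def ablate_direction(h: np.ndarray, r: np.ndarray) -> np.ndarray:
    """Return h with its component along direction r projected out:
    h' = h - (h . r_hat) * r_hat, so h' is orthogonal to r."""
    r_hat = r / np.linalg.norm(r)
    return h - np.dot(h, r_hat) * r_hat
```

Applied at the layers and strengths given in the abliteration parameters table above, this leaves all other directions in activation space, and hence the model's other capabilities, untouched.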

**Use responsibly. You have been warned.**

---