Commit 95e718b · tfrere (HF Staff) · 1 Parent(s): 4581f65

update gitignore

Files changed (18)
  1. .gitignore +1 -0
  2. app/scripts/notion-importer/.notion-to-md/media/2421384e-bcac-80fb-aa7c-f939fc39269d_media.json +0 -3
  3. app/scripts/notion-importer/.notion-to-md/media/27877f1c-9c9d-804d-9c82-f7b3905578ff_media.json +0 -3
  4. app/scripts/notion-importer/.notion-to-md/media/29177f1c-9c9d-8079-aebf-cfe3ee40f7c5_media.json +0 -3
  5. app/scripts/notion-importer/.notion-to-md/media/29177f1c-9c9d-80d6-91bc-cec1904f628f_media.json +0 -3
  6. app/scripts/notion-importer/.notion-to-md/media/2921384e-bcac-8006-9cac-f8b9876a3daa_media.json +0 -3
  7. app/scripts/notion-importer/.notion-to-md/media/2921384e-bcac-8033-84a0-f498edd20d5d_media.json +0 -3
  8. app/scripts/notion-importer/.notion-to-md/media/2921384e-bcac-8070-84b7-d2c55eec7b31_media.json +0 -3
  9. app/scripts/notion-importer/.notion-to-md/media/2921384e-bcac-807b-b544-e308e74095eb_media.json +0 -3
  10. app/scripts/notion-importer/.notion-to-md/media/2921384e-bcac-80b9-ac9e-e2ed81f6f335_media.json +0 -3
  11. app/scripts/notion-importer/.notion-to-md/media/2921384e-bcac-80df-81c0-fc7920a269f8_media.json +0 -3
  12. app/scripts/notion-importer/.notion-to-md/media/2921384e-bcac-80fc-abcc-ed31b76eb37d_media.json +0 -3
  13. app/scripts/notion-importer/.notion-to-md/media/2951384e-bcac-8087-898b-f7fff54fb54b_media.json +0 -3
  14. app/scripts/notion-importer/.notion-to-md/media/2951384e-bcac-809b-9bc4-c0f7647080f3_media.json +0 -3
  15. app/scripts/notion-importer/static/frontmatter.mdx +31 -4
  16. app/src/content/article.mdx +43 -6
  17. app/{scripts/notion-importer/.notion-to-md/media/2421384e-bcac-800c-b22c-df0bb34c69f7_media.json → src/content/assets/image/Capture_decran_2025-10-22_a_09_46_36_2941384e-bcac-80cc-af9d-d2ace91e56e8.png} +2 -2
  18. app/{scripts/notion-importer/.notion-to-md/media/29177f1c-9c9d-80c7-aec6-c6ab90d7912a_media.json → src/content/assets/image/Screenshot_2025-10-24_at_12_26_55_2961384e-bcac-80c9-9266-c37d8d5f4a39.png} +2 -2
.gitignore CHANGED
@@ -20,6 +20,7 @@ node_modules/
 *.log
 *.env
 *.cache
+.notion-to-md
 
 app/scripts/latex-to-mdx/output/
 app/scripts/notion-importer/output/**/*
app/scripts/notion-importer/.notion-to-md/media/2421384e-bcac-80fb-aa7c-f939fc39269d_media.json DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:caa89ec1b62b1b78b4592c41d3b9997ed79e276d7169064572f88303a400c4e8
-size 61500
app/scripts/notion-importer/.notion-to-md/media/27877f1c-9c9d-804d-9c82-f7b3905578ff_media.json DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:573e517dd54f2ec7a8caf2badeea4b54ca90d3eff45b88a47877f38fc39203f0
-size 39793
app/scripts/notion-importer/.notion-to-md/media/29177f1c-9c9d-8079-aebf-cfe3ee40f7c5_media.json DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:da3d0e8b86bab2816814fa8d103e5f10310b2f401fc33ab5e3730d231989c5ff
-size 35370
app/scripts/notion-importer/.notion-to-md/media/29177f1c-9c9d-80d6-91bc-cec1904f628f_media.json DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:2898a98abe6fce5c59caeee0c09b440ec4613bfb5eaa3e527b14056a175e76ae
-size 2428
app/scripts/notion-importer/.notion-to-md/media/2921384e-bcac-8006-9cac-f8b9876a3daa_media.json DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:e95ed06c4df8d63c7b1aa78d022c10195605c6118ea4b7a035d59fc52f104a81
-size 4891
app/scripts/notion-importer/.notion-to-md/media/2921384e-bcac-8033-84a0-f498edd20d5d_media.json DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:2c056bf9e20d2f88d8d96ffecc848f3b8447e036277741dcf49e798d35bc7f08
-size 2436
app/scripts/notion-importer/.notion-to-md/media/2921384e-bcac-8070-84b7-d2c55eec7b31_media.json DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:71db30f156ec18a2bc6b1431d7dee2a2a16951a4a5d86887af3854ab024534b4
-size 2535
app/scripts/notion-importer/.notion-to-md/media/2921384e-bcac-807b-b544-e308e74095eb_media.json DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:1acc48f0812f0cef1132328d1890b98b20037f806ea73a4d639381473841024f
-size 64334
app/scripts/notion-importer/.notion-to-md/media/2921384e-bcac-80b9-ac9e-e2ed81f6f335_media.json DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:b3a20c6fb37ce6e4fed6f3def722ffe7947d65c8390b8244c02d276707a566ce
-size 16424
app/scripts/notion-importer/.notion-to-md/media/2921384e-bcac-80df-81c0-fc7920a269f8_media.json DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:f7283e798efc4d1f187fa8629264045d032f7066fd58eb0e87d81048eb3e88d3
-size 21352
app/scripts/notion-importer/.notion-to-md/media/2921384e-bcac-80fc-abcc-ed31b76eb37d_media.json DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:51c243c47d5731d2589360c467b3122e366297646be8a618e33fbacf85d79c98
-size 64206
app/scripts/notion-importer/.notion-to-md/media/2951384e-bcac-8087-898b-f7fff54fb54b_media.json DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:68288a0aec237b15610681a864f5c802e22ef10322d8251fa4bfa773ec248621
-size 2428
app/scripts/notion-importer/.notion-to-md/media/2951384e-bcac-809b-9bc4-c0f7647080f3_media.json DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:c3842402ba6d83e6ab44fac53432190f95b07ed2ac096031608b1d29a178e858
-size 21458
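
Note: each deleted `_media.json` entry above is shown as a Git LFS pointer: a `version` line, an `oid sha256:` line, and a `size` line in bytes. As a minimal sketch, assuming a pointer holds only these `key value` lines, one could read it as follows. `parse_lfs_pointer` is a hypothetical helper written for illustration; it is not part of this repository or of the Git LFS tooling.

```python
from typing import Dict, Union

# Version URL used by the pointer files in this commit.
LFS_SPEC_URL = "https://git-lfs.github.com/spec/v1"


def parse_lfs_pointer(text: str) -> Dict[str, Union[str, int]]:
    """Parse the text of a Git LFS pointer file (hypothetical helper)."""
    fields: Dict[str, str] = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>"; split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value

    if fields.get("version") != LFS_SPEC_URL:
        raise ValueError("not a Git LFS v1 pointer")
    if "oid" not in fields or "size" not in fields:
        raise ValueError("pointer is missing 'oid' or 'size'")

    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


# Pointer text copied from the first deleted file in this commit.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:caa89ec1b62b1b78b4592c41d3b9997ed79e276d7169064572f88303a400c4e8
size 61500
"""

info = parse_lfs_pointer(pointer)
print(info["algo"], info["size"])
```

The `size` field is the byte size of the stored object, so the pointers deleted here refer to objects of roughly 2 KB to 64 KB.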
 
 
 
 
app/scripts/notion-importer/static/frontmatter.mdx CHANGED
@@ -6,18 +6,39 @@ authors:
   - name: "Loubna Ben Allal"
     url: "https://huggingface.co/loubnabnl"
     affiliations: [1]
-  - name: "Leandro von Werra"
-    url: "https://huggingface.co/lvwerra"
-    affiliations: [1]
   - name: "Lewis Tunstall"
     url: "https://huggingface.co/lewtun"
     affiliations: [1]
+  - name: "Nouamane Tazi"
+    url: "https://huggingface.co/nouamanetazi"
+    affiliations: [1]
+  - name: "Elie Bak"
+    url: "https://huggingface.co/eliebak"
+    affiliations: [1]
+  - name: "Ed Beeching"
+    url: "https://huggingface.co/edbeeching"
+    affiliations: [1]
+  - name: "Carlos Muñoz Ferrandis"
+    url: "https://huggingface.co/CarlosMF"
+    affiliations: [1]
   - name: "Clémentine Fourrier"
     url: "https://huggingface.co/clefourrier"
     affiliations: [1]
   - name: "Thibaud Frere"
     url: "https://huggingface.co/tfrere"
     affiliations: [1]
+  - name: "Anton Lozhkov"
+    url: "https://huggingface.co/anton-l"
+    affiliations: [1]
+  - name: "Colin Raffel"
+    url: "https://huggingface.co/craffel"
+    affiliations: [1]
+  - name: "Leandro von Werra"
+    url: "https://huggingface.co/lvwerra"
+    affiliations: [1]
+  - name: "Thomas Wolf"
+    url: "https://huggingface.co/thomwolf"
+    affiliations: [1]
 affiliations:
   - name: "Hugging Face"
     url: "https://huggingface.co"
@@ -31,4 +52,10 @@ tags:
   - template
 tableOfContentsAutoCollapse: true
 pdfProOnly: true
----
+---
+
+
+
+
+
+
app/src/content/article.mdx CHANGED
@@ -9,14 +9,26 @@ authors:
     url: 'https://huggingface.co/loubnabnl'
     affiliations:
       - 1
-  - name: Leandro von Werra
-    url: 'https://huggingface.co/lvwerra'
-    affiliations:
-      - 1
   - name: Lewis Tunstall
     url: 'https://huggingface.co/lewtun'
     affiliations:
       - 1
+  - name: Nouamane Tazi
+    url: 'https://huggingface.co/nouamanetazi'
+    affiliations:
+      - 1
+  - name: Elie Bak
+    url: 'https://huggingface.co/eliebak'
+    affiliations:
+      - 1
+  - name: Ed Beeching
+    url: 'https://huggingface.co/edbeeching'
+    affiliations:
+      - 1
+  - name: Carlos Muñoz Ferrandis
+    url: 'https://huggingface.co/CarlosMF'
+    affiliations:
+      - 1
   - name: Clémentine Fourrier
     url: 'https://huggingface.co/clefourrier'
     affiliations:
@@ -25,6 +37,22 @@ authors:
     url: 'https://huggingface.co/tfrere'
     affiliations:
       - 1
+  - name: Anton Lozhkov
+    url: 'https://huggingface.co/anton-l'
+    affiliations:
+      - 1
+  - name: Colin Raffel
+    url: 'https://huggingface.co/craffel'
+    affiliations:
+      - 1
+  - name: Leandro von Werra
+    url: 'https://huggingface.co/lvwerra'
+    affiliations:
+      - 1
+  - name: Thomas Wolf
+    url: 'https://huggingface.co/thomwolf'
+    affiliations:
+      - 1
 affiliations:
   - name: Hugging Face
     url: 'https://huggingface.co'
@@ -79,6 +107,7 @@ import Screenshot_2025_09_26_at_22_36_40_27a1384e_bcac_8063_94e0_f1c689e7d9b9 fr
 import Screenshot_2025_10_16_at_21_20_31_28e1384e_bcac_8059_83b0_c6e19a90f49c from './assets/image/Screenshot_2025-10-16_at_21_20_31_28e1384e-bcac-8059-83b0-c6e19a90f49c.png';
 import Screenshot_2025_10_16_at_21_39_34_28e1384e_bcac_8004_9f13_c53e50cd416e from './assets/image/Screenshot_2025-10-16_at_21_39_34_28e1384e-bcac-8004-9f13-c53e50cd416e.png';
 import Screenshot_2025_10_17_at_15_32_35_28f1384e_bcac_8038_aa05_f429bd1bbf0d from './assets/image/Screenshot_2025-10-17_at_15_32_35_28f1384e-bcac-8038-aa05-f429bd1bbf0d.png';
+import Screenshot_2025_10_24_at_12_26_55_2961384e_bcac_80c9_9266_c37d8d5f4a39 from './assets/image/Screenshot_2025-10-24_at_12_26_55_2961384e-bcac-80c9-9266-c37d8d5f4a39.png';
 import Screenshot_2025_10_17_at_14_07_07_28f1384e_bcac_8000_b77f_d6b0627e5f54 from './assets/image/Screenshot_2025-10-17_at_14_07_07_28f1384e-bcac-8000-b77f-d6b0627e5f54.png';
 import Screenshot_2025_10_17_at_15_15_26_28f1384e_bcac_8006_a7de_ec57a503d412 from './assets/image/Screenshot_2025-10-17_at_15_15_26_28f1384e-bcac-8006-a7de-ec57a503d412.png';
 import Screenshot_2025_10_17_at_16_00_20_28f1384e_bcac_80f7_ab1c_e448a7e126b4 from './assets/image/Screenshot_2025-10-17_at_16_00_20_28f1384e-bcac-80f7-ab1c-e448a7e126b4.png';
@@ -4866,6 +4895,9 @@ Packing solves this by concatenating multiple sequences together until a desired
 
 To get a sense of how efficient packing is for training, below we compare the runtimes between packing and no-packing over one epoch of our baseline dataset:
 
+<Image src={Screenshot_2025_10_24_at_12_26_55_2961384e_bcac_80c9_9266_c37d8d5f4a39} alt="Image" />
+
+
 <Image src={Screenshot_2025_10_17_at_14_07_07_28f1384e_bcac_8000_b77f_d6b0627e5f54} alt="Image" />
 
 
@@ -4949,7 +4981,7 @@ Although you can keep scaling SFT with more data, at some point you'll observe d
 
 This is where preference optimisation comes in. Instead of just copying demonstrations, we give the model comparative feedback like "response A is better than response B". These preferences provide a more direct training signal for quality and enable to model performance to scale beyond the limits of SFT alone.
 
-Another benefit of preference optimisation is that you typically need far less data than SFT, since the starting point is already a pretty good model that can follow instructions and has knowledge <Sidenote>As we'll see below, there are some algorithms like [ORPO](https://arxiv.org/abs/2403.07691) which can be applied directly to base models.</Sidenote> Let's take a look at how these datasets are created.
+Another benefit of preference optimisation is that you typically need far less data than SFT, since the starting point is already a pretty good model that can follow instructions and has knowledge from previous training stages.<Sidenote>As we'll see below, there are some algorithms like [ORPO](https://arxiv.org/abs/2403.07691) which can be applied directly to base models.</Sidenote> Let's take a look at how these datasets are created.
 
 #### Creating preference datasets
 
@@ -5017,7 +5049,7 @@ We also ran experiments to determine how dataset size influences results, testin
 <Image src={image_2941384e_bcac_807e_8b6e_fc9020752eb0} alt="Image" />
 
 
-The experiments we ran for the ß parameter ranged from 0.01 to 0.99 to explore values that encourage different degrees of alignment to the reference model. As a reminder, lower values of beta encourage staying close to the reference model while higher values allow the PO model to match the preference data more closely. The model performance for ß=0.1 is the highest for both reasoning modes and improves compared to the metrics from the SFT checkpoint. Using a low beta value hurts model performance and results in a worse model than the SFT checkpoint, while performance remains stable without extended thinking for multiple ß values.
+The experiments we ran for the ß parameter ranged from 0.01 to 0.99 to explore values that encourage different degrees of alignment to the reference model. As a reminder, lower values of beta encourage staying close to the reference model while higher values allow the PO model to match the preference data more closely. The model performance for ß=0.1 is the highest for both reasoning modes and improves compared to the metrics from the SFT checkpoint. Using a low beta value hurts model performance and results in a worse model than the SFT checkpoint, while performance remains stable across multiple ß values without extended thinking.
 
 These results suggest that values greater than 0.1 are preferable for PO, and that aligning the model with the preference data is more beneficial than staying close to the reference model. However, we suggest exploring ß values in the range 0.01 and 0.5. Higher values may erase capabilities from the SFT checkpoint that we might not be capturing in the evals shown on the plot.
 
@@ -5267,6 +5299,11 @@ We hope this blog helps you approach your next training project with clarity and
 
 Now go train something. And when your loss spikes mysteriously at 2am, remember: every great model has debugging stories behind it. May the force of open source and open science always be with you!
 
+#### **Acknowledgments**
+
+
+We thank [Guilherme](https://huggingface.co/guipenedo) and [Hugo](https://huggingface.co/hlarcher) for their valuable feedback, and [Abubakar](https://huggingface.co/abidlabs) for his help with Trackio features.
+
 ## References
 
 
app/{scripts/notion-importer/.notion-to-md/media/2421384e-bcac-800c-b22c-df0bb34c69f7_media.json → src/content/assets/image/Capture_decran_2025-10-22_a_09_46_36_2941384e-bcac-80cc-af9d-d2ace91e56e8.png} RENAMED
File without changes
app/{scripts/notion-importer/.notion-to-md/media/29177f1c-9c9d-80c7-aec6-c6ab90d7912a_media.json → src/content/assets/image/Screenshot_2025-10-24_at_12_26_55_2961384e-bcac-80c9-9266-c37d8d5f4a39.png} RENAMED
File without changes