RLHFlow/Llama-3.2-3B-Instruct-Reinforce-Ada-balance-hard


baohao committed on Oct 10

Commit f99fefd (verified) · 1 parent: 88b4752

Update README.md

Files changed (1)

README.md +4 -3


@@ -1,3 +1,4 @@
----
-license: llama3.2
----

+---
+license: llama3.2
+---
+Checkpoint from step=400 and trained on the [hard prompt set](https://huggingface.co/datasets/RLHFlow/reinforce_ada_hard_prompt_llama).