Hugging Face
Models
Datasets
Spaces
Community
Docs
Enterprise
Pricing
Log In
Sign Up
RLHFlow
/
Llama-3.2-3B-Instruct-Reinforce-Ada-balance-hard
like
0
Follow
RLHFlow
144
Safetensors
llama
License:
llama3.2
Model card
Files
Files and versions
xet
Community
1
main
Llama-3.2-3B-Instruct-Reinforce-Ada-balance-hard
/
README.md
baohao
Update README.md
f99fefd
verified
2 months ago
preview
code
|
raw
Copy download link
history
blame
contribute
delete
161 Bytes
metadata
license:
llama3.2
Checkpoint from step=400 and trained on the
hard prompt set
.