RLHFlow
/

Llama-3.2-3B-Instruct-Reinforce-Ada-balance-hard

Model card Files Files and versions

Llama-3.2-3B-Instruct-Reinforce-Ada-balance-hard / README.md

baohao's picture

Update README.md

f99fefd verified 2 months ago

|

history blame contribute delete

161 Bytes

metadata

license: llama3.2

Checkpoint from step=400 and trained on the hard prompt set.