---
base_model:
- stabilityai/stable-diffusion-3.5-medium
library_name: diffusers
pipeline_tag: text-to-image
---

# Model Card

## Model Details

### Model Description

This model is a reproduced LoRA for SD3.5-Medium, post-trained with DiffusionNFT on multiple reward models, as presented in the paper [Diffusion Negative-aware FineTuning (DiffusionNFT)](https://huggingface.co/papers/2509.16117).

### Paper Abstract

Online reinforcement learning (RL) has been central to post-training language models, but its extension to diffusion models remains challenging due to intractable likelihoods. Recent works discretize the reverse sampling process to enable GRPO-style training, yet they inherit fundamental drawbacks, including solver restrictions, forward-reverse inconsistency, and complicated integration with classifier-free guidance (CFG). We introduce Diffusion Negative-aware FineTuning (DiffusionNFT), a new online RL paradigm that optimizes diffusion models directly on the forward process via flow matching. DiffusionNFT contrasts positive and negative generations to define an implicit policy improvement direction, naturally incorporating reinforcement signals into the supervised learning objective. This formulation enables training with arbitrary black-box solvers, eliminates the need for likelihood estimation, and requires only clean images rather than sampling trajectories for policy optimization. DiffusionNFT is up to 25× more efficient than FlowGRPO in head-to-head comparisons, while being CFG-free. For instance, DiffusionNFT improves the GenEval score from 0.24 to 0.98 within 1k steps, while FlowGRPO achieves 0.95 with over 5k steps and additional CFG employment. By leveraging multiple reward models, DiffusionNFT significantly boosts the performance of SD3.5-Medium in every benchmark tested.

### Model Sources

- **Repository:** https://github.com/NVlabs/DiffusionNFT
- **Paper:** https://huggingface.co/papers/2509.16117
- **Project Page:** https://research.nvidia.com/labs/dir/DiffusionNFT

## Uses

For usage, please refer to the evaluation script in the [GitHub repository](https://github.com/NVlabs/DiffusionNFT).
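
For a quick local test outside the evaluation script, the following is a minimal sketch, not the authors' official inference code. It assumes the LoRA weights are stored in PEFT format and target the pipeline's transformer; `LORA_PATH` is a hypothetical placeholder, and the step count, seed, and prompt are arbitrary choices. Because the abstract describes the method as CFG-free, the sketch sets `guidance_scale=1.0`, which turns off classifier-free guidance in diffusers.

```python
BASE_MODEL = "stabilityai/stable-diffusion-3.5-medium"  # base_model from this card's metadata
LORA_PATH = "path/to/downloaded/lora"  # hypothetical placeholder; point this at the actual checkpoint


def load_pipeline(lora_path: str = LORA_PATH, device: str = "cuda"):
    """Load SD3.5-Medium and merge the LoRA into its transformer.

    Assumption: the adapter was saved with PEFT and targets the transformer.
    """
    # Heavy dependencies are imported lazily, only when a pipeline is built.
    import torch
    from diffusers import StableDiffusion3Pipeline
    from peft import PeftModel

    pipe = StableDiffusion3Pipeline.from_pretrained(BASE_MODEL, torch_dtype=torch.bfloat16)
    # Wrap the transformer with the adapter, then fold the LoRA weights into the
    # base weights so the pipeline again holds a plain transformer module.
    pipe.transformer = PeftModel.from_pretrained(pipe.transformer, lora_path).merge_and_unload()
    return pipe.to(device)


def generate(pipe, prompt: str, steps: int = 40, seed: int = 0):
    """Sample one image without classifier-free guidance."""
    import torch

    generator = torch.Generator(device="cpu").manual_seed(seed)
    # guidance_scale=1.0 disables CFG in the diffusers SD3 pipeline.
    result = pipe(prompt, num_inference_steps=steps, guidance_scale=1.0, generator=generator)
    return result.images[0]


# Example usage (requires a GPU, access to the base model, and the LoRA files):
#   pipe = load_pipeline()
#   generate(pipe, "a photo of a red bench and a green bicycle").save("sample.png")
```

Merging the adapter with `merge_and_unload()` is one possible design choice; it keeps inference free of adapter overhead, at the cost of no longer being able to toggle the LoRA on and off.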