State-of-the-art text-to-image models produce visually impressive results but often struggle to align precisely with text prompts, omitting critical elements or unintentionally blending distinct concepts. We propose a novel approach that learns a high-success-rate distribution conditioned on a target prompt, ensuring that generated images faithfully reflect it. Our method explicitly models the signal component during the denoising process, offering fine-grained control that mitigates over-optimization and out-of-distribution artifacts. The framework is training-free and integrates seamlessly with both existing diffusion and flow matching architectures, and it supports additional conditioning modalities -- such as bounding boxes -- for enhanced spatial alignment. Extensive experiments demonstrate that our approach outperforms current state-of-the-art methods. The code will be available soon.
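For readers unfamiliar with the term, "modeling the signal component" during denoising usually refers to recovering an estimate of the clean image x̂₀ from the noisy latent at each step. The minimal sketch below shows the standard DDPM-style version of this estimate; it illustrates the general concept only, not the SAGA algorithm itself, and every name in it (`estimate_signal_component`, `eps_pred`, `alpha_bar`) is a hypothetical placeholder.

```python
import torch

def estimate_signal_component(x_t, t, eps_pred, alpha_bar):
    """Generic DDPM-style estimate of the clean signal x0_hat (not SAGA itself).

    x_t:       noisy latent at timestep t
    eps_pred:  noise predicted by the denoiser (e.g., a UNet) for x_t
    alpha_bar: cumulative product of (1 - beta_s) over timesteps, shape [T]
    """
    a = alpha_bar[t]
    # Invert the forward process x_t = sqrt(a) * x0 + sqrt(1 - a) * eps
    x0_hat = (x_t - torch.sqrt(1.0 - a) * eps_pred) / torch.sqrt(a)
    return x0_hat
```

A method operating on such an estimate can, in principle, apply guidance to the predicted signal rather than to the raw noisy latent, which is consistent with the fine-grained control the abstract describes.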
@misc{grimal2025sagalearningsignalaligneddistributions,
  title={SAGA: Learning Signal-Aligned Distributions for Improved Text-to-Image Generation},
  author={Paul Grimal and Michaël Soumm and Hervé Le Borgne and Olivier Ferret and Akihiro Sugimoto},
  year={2025},
  eprint={2508.13866},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2508.13866},
}