Light-X: Generative 4D Video Rendering with Camera and Illumination Control Paper • 2512.05115 • Published 4 days ago • 10
RELIC: Interactive Video World Model with Long-Horizon Memory Paper • 2512.04040 • Published 5 days ago • 21
OneThinker: All-in-one Reasoning Model for Image and Video Paper • 2512.03043 • Published 6 days ago • 29
UltraImage: Rethinking Resolution Extrapolation in Image Diffusion Transformers Paper • 2512.04504 • Published 5 days ago • 15
Deep Forcing: Training-Free Long Video Generation with Deep Sink and Participative Compression Paper • 2512.05081 • Published 4 days ago • 27
EditThinker: Unlocking Iterative Reasoning for Any Image Editor Paper • 2512.05965 • Published 3 days ago • 30
Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield Paper • 2511.22677 • Published 11 days ago • 22
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published 11 days ago • 163
AnyTalker: Scaling Multi-Person Talking Video Generation with Interactivity Refinement Paper • 2511.23475 • Published 10 days ago • 41
WiseEdit: Benchmarking Cognition- and Creativity-Informed Image Editing Paper • 2512.00387 • Published 10 days ago • 2
REASONEDIT: Towards Reasoning-Enhanced Image Editing Models Paper • 2511.22625 • Published 11 days ago • 45
MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation Paper • 2511.09611 • Published 26 days ago • 68
Kimi-VL-A3B Collection Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 7 items • Updated Oct 30 • 77
A Style is Worth One Code: Unlocking Code-to-Style Image Generation with Discrete Style Space Paper • 2511.10555 • Published 25 days ago • 60
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation Paper • 2511.14993 • Published 20 days ago • 222