TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models Paper • 2512.02014 • Published 7 days ago • 57
LaViDa: A Large Diffusion Language Model for Multimodal Understanding Paper • 2505.16839 • Published May 22 • 12
Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding Paper • 2505.16990 • Published May 22 • 22
LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning Paper • 2505.16933 • Published May 22 • 34
Cobra: Efficient Line Art COlorization with BRoAder References Paper • 2504.12240 • Published Apr 16 • 27
Synthetic Video Enhances Physical Fidelity in Video Synthesis Paper • 2503.20822 • Published Mar 26 • 16