Architecture Decoupling Is Not All You Need For Unified Multimodal Model Paper • 2511.22663 • Published 14 days ago • 28
From Pixels to Words -- Towards Native Vision-Language Primitives at Scale Paper • 2510.14979 • Published Oct 16 • 65
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation Paper • 2510.08673 • Published Oct 9 • 125