RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards Paper • 2512.00473 • Published 11 days ago • 19
Yo'City: Personalized and Boundless 3D Realistic City Scene Generation via Self-Critic Expansion Paper • 2511.18734 • Published 17 days ago • 6
MajutsuCity: Language-driven Aesthetic-adaptive City Generation with Controllable 3D Assets and Layouts Paper • 2511.20415 • Published 15 days ago • 8
WorldGen: From Text to Traversable and Interactive 3D Worlds Paper • 2511.16825 • Published 20 days ago • 21
OmniLayout: Enabling Coarse-to-Fine Learning with LLMs for Universal Document Layout Generation Paper • 2510.26213 • Published Oct 30 • 9
Hunyuan3D-Omni: A Unified Framework for Controllable Generation of 3D Assets Paper • 2509.21245 • Published Sep 25 • 38
X-Part: high fidelity and structure coherent shape decomposition Paper • 2509.08643 • Published Sep 10 • 26
Can Understanding and Generation Truly Benefit Together -- or Just Coexist? Paper • 2509.09666 • Published Sep 11 • 34
SceneGen: Single-Image 3D Scene Generation in One Feedforward Pass Paper • 2508.15769 • Published Aug 21 • 19
Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation Paper • 2508.09987 • Published Aug 13 • 25
Sel3DCraft: Interactive Visual Prompts for User-Friendly Text-to-3D Generation Paper • 2508.00428 • Published Aug 1 • 3
Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation Paper • 2505.02836 • Published May 5 • 8
OmniSVG: A Unified Scalable Vector Graphics Generation Model Paper • 2504.06263 • Published Apr 8 • 182
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation Paper • 2504.02782 • Published Apr 3 • 57
Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation Paper • 2503.14905 • Published Mar 19 • 20
LEGION: Learning to Ground and Explain for Synthetic Image Detection Paper • 2503.15264 • Published Mar 19 • 21
MangaNinja: Line Art Colorization with Precise Reference Following Paper • 2501.08332 • Published Jan 14 • 61
Imagine360: Immersive 360 Video Generation from Perspective Anchor Paper • 2412.03552 • Published Dec 4, 2024 • 29
SOLAMI: Social Vision-Language-Action Modeling for Immersive Interaction with 3D Autonomous Characters Paper • 2412.00174 • Published Nov 29, 2024 • 23
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models Paper • 2410.09732 • Published Oct 13, 2024 • 54