Collections
Collections including paper arxiv:2507.21809
-
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 627 -
MiniMax-01: Scaling Foundation Models with Lightning Attention
Paper • 2501.08313 • Published • 301 -
Group Sequence Policy Optimization
Paper • 2507.18071 • Published • 315 -
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Paper • 2509.03867 • Published • 210
-
SpatialLM: Training Large Language Models for Structured Indoor Modeling
Paper • 2506.07491 • Published • 50 -
Story2Board: A Training-Free Approach for Expressive Storyboard Generation
Paper • 2508.09983 • Published • 68 -
Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens
Paper • 2503.01710 • Published • 6 -
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels
Paper • 2507.21809 • Published • 135
-
From One to More: Contextual Part Latents for 3D Generation
Paper • 2507.08772 • Published • 25 -
OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion
Paper • 2507.06165 • Published • 58 -
SeqTex: Generate Mesh Textures in Video Sequence
Paper • 2507.04285 • Published • 9 -
Ultra3D: Efficient and High-Fidelity 3D Generation with Part Attention
Paper • 2507.17745 • Published • 35
-
4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models
Paper • 2503.10437 • Published • 33 -
Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k
Paper • 2503.09642 • Published • 19 -
VGGT: Visual Geometry Grounded Transformer
Paper • 2503.11651 • Published • 34 -
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering
Paper • 2503.16422 • Published • 14
-
Can Large Language Models Understand Context?
Paper • 2402.00858 • Published • 23 -
OLMo: Accelerating the Science of Language Models
Paper • 2402.00838 • Published • 85 -
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 151 -
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
Paper • 2401.17072 • Published • 25
-
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels
Paper • 2507.21809 • Published • 135 -
OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion
Paper • 2507.06165 • Published • 58 -
DINOv3
Paper • 2508.10104 • Published • 285 -
Qwen-Image Technical Report
Paper • 2508.02324 • Published • 264
-
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels
Paper • 2507.21809 • Published • 135 -
Yume: An Interactive World Generation Model
Paper • 2507.17744 • Published • 87 -
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models
Paper • 2507.13344 • Published • 57 -
ComfyUI-Copilot: An Intelligent Assistant for Automated Workflow Development
Paper • 2506.05010 • Published • 79
-
HoloTime: Taming Video Diffusion Models for Panoramic 4D Scene Generation
Paper • 2504.21650 • Published • 16 -
Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation
Paper • 2505.02836 • Published • 8 -
ImmerseGen: Agent-Guided Immersive World Generation with Alpha-Textured Proxies
Paper • 2506.14315 • Published • 10 -
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels
Paper • 2507.21809 • Published • 135
-
RoCoTex: A Robust Method for Consistent Texture Synthesis with Diffusion Models
Paper • 2409.19989 • Published • 18 -
3D Scene Generation: A Survey
Paper • 2505.05474 • Published • 21 -
What Makes for Text to 360-degree Panorama Generation with Stable Diffusion?
Paper • 2505.22129 • Published • 15 -
Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment
Paper • 2505.18600 • Published • 48