Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2507.21809

tencent/HunyuanWorld-Mirror

Image-to-3D • Updated Oct 25 • 12.3k • 431
tencent/Hunyuan3D-Part

Updated Oct 17 • 1.64k • 510
tencent/Hunyuan3D-2mini

Image-to-3D • Updated Oct 17 • 4.78k • 103
tencent/HunyuanWorld-Voyager

Image-to-Video • Updated Oct 17 • 155 • 601

This collection is a list of papers I find to be very interesting.

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 627
MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14 • 301
Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 315
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth

Paper • 2509.03867 • Published Sep 4 • 210

SpatialLM: Training Large Language Models for Structured Indoor Modeling

Paper • 2506.07491 • Published Jun 9 • 50
Story2Board: A Training-Free Approach for Expressive Storyboard Generation

Paper • 2508.09983 • Published Aug 13 • 68
Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens

Paper • 2503.01710 • Published Mar 3 • 6
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels

Paper • 2507.21809 • Published Jul 29 • 135

From One to More: Contextual Part Latents for 3D Generation

Paper • 2507.08772 • Published Jul 11 • 25
OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion

Paper • 2507.06165 • Published Jul 8 • 58
SeqTex: Generate Mesh Textures in Video Sequence

Paper • 2507.04285 • Published Jul 6 • 9
Ultra3D: Efficient and High-Fidelity 3D Generation with Part Attention

Paper • 2507.17745 • Published Jul 23 • 35

4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models

Paper • 2503.10437 • Published Mar 13 • 33
Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k

Paper • 2503.09642 • Published Mar 12 • 19
VGGT: Visual Geometry Grounded Transformer

Paper • 2503.11651 • Published Mar 14 • 34
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering

Paper • 2503.16422 • Published Mar 20 • 14

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

about 4 hours ago

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1, 2024 • 23
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1, 2024 • 85
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 151
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30, 2024 • 25

HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels

Paper • 2507.21809 • Published Jul 29 • 135
OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion

Paper • 2507.06165 • Published Jul 8 • 58
DINOv3

Paper • 2508.10104 • Published Aug 13 • 285
Qwen-Image Technical Report

Paper • 2508.02324 • Published Aug 4 • 264

HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels

Paper • 2507.21809 • Published Jul 29 • 135
Yume: An Interactive World Generation Model

Paper • 2507.17744 • Published Jul 23 • 87
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models

Paper • 2507.13344 • Published Jul 17 • 57
ComfyUI-Copilot: An Intelligent Assistant for Automated Workflow Development

Paper • 2506.05010 • Published Jun 5 • 79

Scene Generation

HoloTime: Taming Video Diffusion Models for Panoramic 4D Scene Generation

Paper • 2504.21650 • Published Apr 30 • 16
Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation

Paper • 2505.02836 • Published May 5 • 8
ImmerseGen: Agent-Guided Immersive World Generation with Alpha-Textured Proxies

Paper • 2506.14315 • Published Jun 17 • 10
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels

Paper • 2507.21809 • Published Jul 29 • 135

RoCoTex: A Robust Method for Consistent Texture Synthesis with Diffusion Models

Paper • 2409.19989 • Published Sep 30, 2024 • 18
3D Scene Generation: A Survey

Paper • 2505.05474 • Published May 8 • 21
What Makes for Text to 360-degree Panorama Generation with Stable Diffusion?

Paper • 2505.22129 • Published May 28 • 15
Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment

Paper • 2505.18600 • Published May 24 • 48

tencent/HunyuanWorld-Mirror

Image-to-3D • Updated Oct 25 • 12.3k • 431
tencent/Hunyuan3D-Part

Updated Oct 17 • 1.64k • 510
tencent/Hunyuan3D-2mini

Image-to-3D • Updated Oct 17 • 4.78k • 103
tencent/HunyuanWorld-Voyager

Image-to-Video • Updated Oct 17 • 155 • 601

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

about 4 hours ago

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1, 2024 • 23
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1, 2024 • 85
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 151
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30, 2024 • 25

This collection is a list of papers I find to be very interesting.

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 627
MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14 • 301
Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 315
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth

Paper • 2509.03867 • Published Sep 4 • 210

HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels

Paper • 2507.21809 • Published Jul 29 • 135
OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion

Paper • 2507.06165 • Published Jul 8 • 58
DINOv3

Paper • 2508.10104 • Published Aug 13 • 285
Qwen-Image Technical Report

Paper • 2508.02324 • Published Aug 4 • 264

SpatialLM: Training Large Language Models for Structured Indoor Modeling

Paper • 2506.07491 • Published Jun 9 • 50
Story2Board: A Training-Free Approach for Expressive Storyboard Generation

Paper • 2508.09983 • Published Aug 13 • 68
Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens

Paper • 2503.01710 • Published Mar 3 • 6
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels

Paper • 2507.21809 • Published Jul 29 • 135

HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels

Paper • 2507.21809 • Published Jul 29 • 135
Yume: An Interactive World Generation Model

Paper • 2507.17744 • Published Jul 23 • 87
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models

Paper • 2507.13344 • Published Jul 17 • 57
ComfyUI-Copilot: An Intelligent Assistant for Automated Workflow Development

Paper • 2506.05010 • Published Jun 5 • 79

From One to More: Contextual Part Latents for 3D Generation

Paper • 2507.08772 • Published Jul 11 • 25
OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion

Paper • 2507.06165 • Published Jul 8 • 58
SeqTex: Generate Mesh Textures in Video Sequence

Paper • 2507.04285 • Published Jul 6 • 9
Ultra3D: Efficient and High-Fidelity 3D Generation with Part Attention

Paper • 2507.17745 • Published Jul 23 • 35

Scene Generation

HoloTime: Taming Video Diffusion Models for Panoramic 4D Scene Generation

Paper • 2504.21650 • Published Apr 30 • 16
Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation

Paper • 2505.02836 • Published May 5 • 8
ImmerseGen: Agent-Guided Immersive World Generation with Alpha-Textured Proxies

Paper • 2506.14315 • Published Jun 17 • 10
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels

Paper • 2507.21809 • Published Jul 29 • 135

4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models

Paper • 2503.10437 • Published Mar 13 • 33
Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k

Paper • 2503.09642 • Published Mar 12 • 19
VGGT: Visual Geometry Grounded Transformer

Paper • 2503.11651 • Published Mar 14 • 34
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering

Paper • 2503.16422 • Published Mar 20 • 14

RoCoTex: A Robust Method for Consistent Texture Synthesis with Diffusion Models

Paper • 2409.19989 • Published Sep 30, 2024 • 18
3D Scene Generation: A Survey

Paper • 2505.05474 • Published May 8 • 21
What Makes for Text to 360-degree Panorama Generation with Stable Diffusion?

Paper • 2505.22129 • Published May 28 • 15
Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment

Paper • 2505.18600 • Published May 24 • 48

Previous
1
2
Next

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs