TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models Paper • 2512.02014 • Published Dec 2025 • 57
From Denoising to Refining: A Corrective Framework for Vision-Language Diffusion Model Paper • 2510.19871 • Published Oct 22 • 29
INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats Paper • 2510.25602 • Published Oct 29 • 77
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model Paper • 2502.10248 • Published Feb 14 • 55
Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation Paper • 2501.12202 • Published Jan 21 • 48
Video Depth Anything: Consistent Depth Estimation for Super-Long Videos Paper • 2501.12375 • Published Jan 21 • 23
MangaNinja: Line Art Colorization with Precise Reference Following Paper • 2501.08332 • Published Jan 14 • 61
LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis Paper • 2412.15214 • Published Dec 19, 2024 • 15
MagicQuill: An Intelligent Interactive Image Editing System Paper • 2411.09703 • Published Nov 14, 2024 • 80
PrefixQuant: Static Quantization Beats Dynamic through Prefixed Outliers in LLMs Paper • 2410.05265 • Published Oct 7, 2024 • 33
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model Paper • 2408.11039 • Published Aug 20, 2024 • 63
InFusion: Inpainting 3D Gaussians via Learning Depth Completion from Diffusion Prior Paper • 2404.11613 • Published Apr 17, 2024 • 11
Instruct-Imagen: Image Generation with Multi-modal Instruction Paper • 2401.01952 • Published Jan 3, 2024 • 32
Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions Paper • 2401.01827 • Published Jan 3, 2024 • 18