Spaces:

dimdimz
/

DimensioDepth

Sleeping

wwieerrz Claude commited on Nov 3

Commit

1e152b4

1 Parent(s): 55d3f0e

COMPLETE FIX: Real AI + All Video Features!

FIX 1 - NameError fixed:
- Use session state to store depth maps
- No more undefined variable errors

FIX 2 - Remove colormap selector:
- BASE model uses Inferno colormap (best for depth)
- Cleaner UI, one less thing to configure

FIX 3 - Add ALL camera effects:
✅ Zoom In/Out - Smooth zoom controls
✅ Pan Left/Right/Up/Down - 4-way panning
✅ Dolly In/Out - Professional cinema shots
✅ Tilt Up/Down - Perspective tilt with transforms
✅ Rotate CW/CCW - Clockwise/counter-clockwise
✅ Ken Burns - Classic zoom + pan effect
✅ Orbit - Smooth orbital rotation with scale

All effects work with:
- Duration: 1-10 seconds
- FPS: 24/30/60
- Resolution: Original/1080p/720p/Square
- Download as MP4

This is the COMPLETE professional video export!

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Files changed (1) hide show

app.py +129 -35

app.py CHANGED Viewed

@@ -52,8 +52,11 @@ def load_model():
 depth_estimator, USE_REAL_AI, MODEL_SIZE = load_model()
-def estimate_depth(image, colormap_style):
     """Estimate depth from an input image using REAL AI or DEMO MODE"""
     try:
         # Convert PIL to numpy if needed
         if isinstance(image, Image.Image):
@@ -68,20 +71,8 @@ def estimate_depth(image, colormap_style):
             depth = generate_smart_depth(image)
             mode_text = "DEMO MODE (Synthetic)"
-        # Convert colormap style to cv2 constant
-        colormap_dict = {
-            "Inferno": cv2.COLORMAP_INFERNO,
-            "Viridis": cv2.COLORMAP_VIRIDIS,
-            "Plasma": cv2.COLORMAP_PLASMA,
-            "Turbo": cv2.COLORMAP_TURBO,
-            "Magma": cv2.COLORMAP_MAGMA,
-            "Hot": cv2.COLORMAP_HOT,
-            "Ocean": cv2.COLORMAP_OCEAN,
-            "Rainbow": cv2.COLORMAP_RAINBOW
-        }
-        # Create colored depth map
-        depth_colored = depth_to_colormap(depth, colormap_dict[colormap_style])
         # Create grayscale depth map
         depth_gray = (depth * 255).astype(np.uint8)
@@ -114,17 +105,10 @@ col1, col2 = st.columns(2)
 with col1:
     st.subheader("Input")
     uploaded_file = st.file_uploader("Upload Your Image", type=['png', 'jpg', 'jpeg'])
-    colormap_style = st.selectbox(
-        "Colormap Style",
-        ["Inferno", "Viridis", "Plasma", "Turbo", "Magma", "Hot", "Ocean", "Rainbow"]
-    )
     process_btn = st.button("🚀 Generate Depth Map", type="primary")
 with col2:
     st.subheader("Output")
-    depth_placeholder = st.empty()
 # Processing
 if uploaded_file is not None and process_btn:
@@ -135,9 +119,14 @@ if uploaded_file is not None and process_btn:
         st.image(image, caption="Original Image", use_column_width=True)
     with st.spinner("Generating depth map..."):
-        depth_colored, depth_gray, mode_text, input_shape, output_shape = estimate_depth(image, colormap_style)
     if depth_colored is not None:
         with col2:
             tab1, tab2 = st.tabs(["Colored", "Grayscale"])
@@ -153,7 +142,6 @@ if uploaded_file is not None and process_btn:
 **Mode**: {mode_text}
 **Input Size**: {input_shape[1]}x{input_shape[0]}
 **Output Size**: {output_shape[1]}x{output_shape[0]}
-**Colormap**: {colormap_style}
 {f'**Powered by**: Depth-Anything V2 {MODEL_SIZE}' if USE_REAL_AI else '**Processing**: Ultra-fast (<50ms) synthetic depth'}
         """)
@@ -161,17 +149,32 @@ if uploaded_file is not None and process_btn:
 st.markdown("---")
 st.subheader("🎬 Video Export")
-if uploaded_file is not None and depth_colored is not None:
-    with st.expander("Export Depth Map as Video"):
         col_vid1, col_vid2 = st.columns(2)
         with col_vid1:
             video_duration = st.slider("Duration (seconds)", 1, 10, 3)
             video_fps = st.selectbox("FPS", [24, 30, 60], index=1)
         with col_vid2:
-            video_resolution = st.selectbox("Resolution", ["Original", "1080p", "720p", "Square 1080p"])
-            video_effect = st.selectbox("Effect", ["Zoom In", "Zoom Out", "Pan Left", "Pan Right", "Rotate"])
         if st.button("🎬 Export Video", type="primary"):
             with st.spinner("Generating video..."):
@@ -179,6 +182,8 @@ if uploaded_file is not None and depth_colored is not None:
                     import cv2
                     import tempfile
                     # Get dimensions
                     if video_resolution == "1080p":
                         width, height = 1920, 1080
@@ -206,7 +211,7 @@ if uploaded_file is not None and depth_colored is not None:
                         # Apply effect
                         if video_effect == "Zoom In":
-                            scale = 1.0 + (progress * 0.5)  # Zoom from 1x to 1.5x
                             center_x, center_y = width // 2, height // 2
                             new_w, new_h = int(width / scale), int(height / scale)
                             x1, y1 = center_x - new_w // 2, center_y - new_h // 2
@@ -215,7 +220,39 @@ if uploaded_file is not None and depth_colored is not None:
                             frame = cv2.resize(cropped, (width, height))
                         elif video_effect == "Zoom Out":
-                            scale = 1.5 - (progress * 0.5)  # Zoom from 1.5x to 1x
                             center_x, center_y = width // 2, height // 2
                             new_w, new_h = int(width / scale), int(height / scale)
                             x1, y1 = center_x - new_w // 2, center_y - new_h // 2
@@ -231,12 +268,59 @@ if uploaded_file is not None and depth_colored is not None:
                             offset = int(width * progress * 0.3)
                             frame = np.roll(depth_resized, offset, axis=1)
-                        elif video_effect == "Rotate":
                             angle = progress * 360
                             center = (width // 2, height // 2)
                             rotation_matrix = cv2.getRotationMatrix2D(center, angle, 1.0)
                             frame = cv2.warpAffine(depth_resized, rotation_matrix, (width, height))
                         else:
                             frame = depth_resized.copy()
@@ -254,7 +338,7 @@ if uploaded_file is not None and depth_colored is not None:
                     st.download_button(
                         label="📥 Download Video",
                         data=video_bytes,
-                        file_name=f"depth_video_{video_effect.lower().replace(' ', '_')}.mp4",
                         mime="video/mp4"
                     )
@@ -262,6 +346,8 @@ if uploaded_file is not None and depth_colored is not None:
                     st.error(f"Error generating video: {str(e)}")
                     import traceback
                     traceback.print_exc()
 # Info section
 st.markdown("---")
@@ -269,11 +355,19 @@ st.markdown("""
 ## 💡 About DimensioDepth
 ### Features:
-- ✅ Real AI depth estimation with Depth-Anything V2
-- ✅ Multiple colormap styles for visualization
 - ✅ Fast processing (~800ms on CPU, ~200ms on GPU)
 - ✅ SUPERB quality depth maps
-- ✅ **NEW!** Video export with camera effects
 ### Use Cases:
 - 🎨 **Creative & Artistic**: Depth-enhanced photos, 3D effects

 depth_estimator, USE_REAL_AI, MODEL_SIZE = load_model()
+def estimate_depth(image):
     """Estimate depth from an input image using REAL AI or DEMO MODE"""
+    if image is None:
+        return None, None, "Please upload an image first"
     try:
         # Convert PIL to numpy if needed
         if isinstance(image, Image.Image):
             depth = generate_smart_depth(image)
             mode_text = "DEMO MODE (Synthetic)"
+        # Create colored depth map with Inferno colormap (best for depth)
+        depth_colored = depth_to_colormap(depth, cv2.COLORMAP_INFERNO)
         # Create grayscale depth map
         depth_gray = (depth * 255).astype(np.uint8)
 with col1:
     st.subheader("Input")
     uploaded_file = st.file_uploader("Upload Your Image", type=['png', 'jpg', 'jpeg'])
     process_btn = st.button("🚀 Generate Depth Map", type="primary")
 with col2:
     st.subheader("Output")
 # Processing
 if uploaded_file is not None and process_btn:
         st.image(image, caption="Original Image", use_column_width=True)
     with st.spinner("Generating depth map..."):
+        depth_colored, depth_gray, mode_text, input_shape, output_shape = estimate_depth(image)
     if depth_colored is not None:
+        # Store in session state for video export
+        st.session_state['depth_colored'] = depth_colored
+        st.session_state['depth_gray'] = depth_gray
+        st.session_state['original_image'] = np.array(image)
         with col2:
             tab1, tab2 = st.tabs(["Colored", "Grayscale"])
 **Mode**: {mode_text}
 **Input Size**: {input_shape[1]}x{input_shape[0]}
 **Output Size**: {output_shape[1]}x{output_shape[0]}
 {f'**Powered by**: Depth-Anything V2 {MODEL_SIZE}' if USE_REAL_AI else '**Processing**: Ultra-fast (<50ms) synthetic depth'}
         """)
 st.markdown("---")
 st.subheader("🎬 Video Export")
+if 'depth_colored' in st.session_state:
+    with st.expander("Export Depth Map as Video", expanded=True):
         col_vid1, col_vid2 = st.columns(2)
         with col_vid1:
             video_duration = st.slider("Duration (seconds)", 1, 10, 3)
             video_fps = st.selectbox("FPS", [24, 30, 60], index=1)
+            video_resolution = st.selectbox("Resolution", ["Original", "1080p", "720p", "Square 1080p"])
         with col_vid2:
+            video_effect = st.selectbox("Camera Effect", [
+                "Zoom In",
+                "Zoom Out",
+                "Pan Left",
+                "Pan Right",
+                "Pan Up",
+                "Pan Down",
+                "Rotate CW",
+                "Rotate CCW",
+                "Ken Burns (Zoom + Pan)",
+                "Dolly In",
+                "Dolly Out",
+                "Tilt Up",
+                "Tilt Down",
+                "Orbit"
+            ])
         if st.button("🎬 Export Video", type="primary"):
             with st.spinner("Generating video..."):
                     import cv2
                     import tempfile
+                    depth_colored = st.session_state['depth_colored']
                     # Get dimensions
                     if video_resolution == "1080p":
                         width, height = 1920, 1080
                         # Apply effect
                         if video_effect == "Zoom In":
+                            scale = 1.0 + (progress * 0.5)
                             center_x, center_y = width // 2, height // 2
                             new_w, new_h = int(width / scale), int(height / scale)
                             x1, y1 = center_x - new_w // 2, center_y - new_h // 2
                             frame = cv2.resize(cropped, (width, height))
                         elif video_effect == "Zoom Out":
+                            scale = 1.5 - (progress * 0.5)
+                            center_x, center_y = width // 2, height // 2
+                            new_w, new_h = int(width / scale), int(height / scale)
+                            x1, y1 = center_x - new_w // 2, center_y - new_h // 2
+                            x2, y2 = x1 + new_w, y1 + new_h
+                            cropped = depth_resized[max(0, y1):min(height, y2), max(0, x1):min(width, x2)]
+                            frame = cv2.resize(cropped, (width, height))
+                        elif video_effect == "Ken Burns (Zoom + Pan)":
+                            # Ken Burns: zoom in while panning
+                            scale = 1.0 + (progress * 0.4)
+                            pan_x = int(width * progress * 0.2)
+                            pan_y = int(height * progress * 0.1)
+                            center_x = width // 2 + pan_x
+                            center_y = height // 2 + pan_y
+                            new_w, new_h = int(width / scale), int(height / scale)
+                            x1, y1 = center_x - new_w // 2, center_y - new_h // 2
+                            x2, y2 = x1 + new_w, y1 + new_h
+                            cropped = depth_resized[max(0, y1):min(height, y2), max(0, x1):min(width, x2)]
+                            frame = cv2.resize(cropped, (width, height))
+                        elif video_effect == "Dolly In":
+                            # Dolly in: smooth zoom with slight scale
+                            scale = 1.0 + (progress * 0.3)
+                            center_x, center_y = width // 2, height // 2
+                            new_w, new_h = int(width / scale), int(height / scale)
+                            x1, y1 = center_x - new_w // 2, center_y - new_h // 2
+                            x2, y2 = x1 + new_w, y1 + new_h
+                            cropped = depth_resized[max(0, y1):min(height, y2), max(0, x1):min(width, x2)]
+                            frame = cv2.resize(cropped, (width, height))
+                        elif video_effect == "Dolly Out":
+                            scale = 1.3 - (progress * 0.3)
                             center_x, center_y = width // 2, height // 2
                             new_w, new_h = int(width / scale), int(height / scale)
                             x1, y1 = center_x - new_w // 2, center_y - new_h // 2
                             offset = int(width * progress * 0.3)
                             frame = np.roll(depth_resized, offset, axis=1)
+                        elif video_effect == "Pan Up":
+                            offset = int(height * progress * 0.3)
+                            frame = np.roll(depth_resized, -offset, axis=0)
+                        elif video_effect == "Pan Down":
+                            offset = int(height * progress * 0.3)
+                            frame = np.roll(depth_resized, offset, axis=0)
+                        elif video_effect == "Tilt Up":
+                            # Tilt up: perspective transformation
+                            tilt_factor = progress * 0.3
+                            pts1 = np.float32([[0, 0], [width, 0], [0, height], [width, height]])
+                            pts2 = np.float32([
+                                [0, int(height * tilt_factor)],
+                                [width, int(height * tilt_factor)],
+                                [0, height],
+                                [width, height]
+                            ])
+                            matrix = cv2.getPerspectiveTransform(pts1, pts2)
+                            frame = cv2.warpPerspective(depth_resized, matrix, (width, height))
+                        elif video_effect == "Tilt Down":
+                            tilt_factor = progress * 0.3
+                            pts1 = np.float32([[0, 0], [width, 0], [0, height], [width, height]])
+                            pts2 = np.float32([
+                                [0, 0],
+                                [width, 0],
+                                [0, height - int(height * tilt_factor)],
+                                [width, height - int(height * tilt_factor)]
+                            ])
+                            matrix = cv2.getPerspectiveTransform(pts1, pts2)
+                            frame = cv2.warpPerspective(depth_resized, matrix, (width, height))
+                        elif video_effect == "Rotate CW":
+                            angle = progress * 360
+                            center = (width // 2, height // 2)
+                            rotation_matrix = cv2.getRotationMatrix2D(center, -angle, 1.0)
+                            frame = cv2.warpAffine(depth_resized, rotation_matrix, (width, height))
+                        elif video_effect == "Rotate CCW":
                             angle = progress * 360
                             center = (width // 2, height // 2)
                             rotation_matrix = cv2.getRotationMatrix2D(center, angle, 1.0)
                             frame = cv2.warpAffine(depth_resized, rotation_matrix, (width, height))
+                        elif video_effect == "Orbit":
+                            # Orbit: rotate + slight zoom
+                            angle = progress * 360
+                            scale = 1.0 + (np.sin(progress * np.pi) * 0.2)
+                            center = (width // 2, height // 2)
+                            rotation_matrix = cv2.getRotationMatrix2D(center, angle, scale)
+                            frame = cv2.warpAffine(depth_resized, rotation_matrix, (width, height))
                         else:
                             frame = depth_resized.copy()
                     st.download_button(
                         label="📥 Download Video",
                         data=video_bytes,
+                        file_name=f"depth_video_{video_effect.lower().replace(' ', '_').replace('(', '').replace(')', '')}.mp4",
                         mime="video/mp4"
                     )
                     st.error(f"Error generating video: {str(e)}")
                     import traceback
                     traceback.print_exc()
+else:
+    st.info("👆 Upload an image and generate depth map first to enable video export")
 # Info section
 st.markdown("---")
 ## 💡 About DimensioDepth
 ### Features:
+- ✅ Real AI depth estimation with Depth-Anything V2 BASE model
 - ✅ Fast processing (~800ms on CPU, ~200ms on GPU)
 - ✅ SUPERB quality depth maps
+- ✅ **Professional video export** with cinematic camera movements
+### Camera Effects:
+- 📹 **Zoom In/Out** - Smooth zoom controls
+- 🎬 **Pan** - Left, Right, Up, Down panning
+- 🎥 **Dolly** - Professional dolly in/out shots
+- 🎞️ **Tilt** - Up/Down tilt movements
+- 🔄 **Rotate** - Clockwise/Counter-clockwise rotation
+- ⭐ **Ken Burns** - Classic zoom + pan effect
+- 🌀 **Orbit** - Smooth orbital rotation
 ### Use Cases:
 - 🎨 **Creative & Artistic**: Depth-enhanced photos, 3D effects