---
title: ComputeAgent - Hivenet AI Deployment
emoji: 🚀
colorFrom: yellow
colorTo: red
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
short_description: AI-powered deployment on Compute with Hivenet via MCP
tags:
- mcp-in-action-track-enterprise
- mcp-in-action-track-consumer
- mcp-in-action-track-creative
---
# ComputeAgent - Autonomous AI Deployment via MCP

*An Intelligent Multi-Agent System for Zero-Friction Model Deployment on HiveCompute*

**Hackathon Entry:** Agents & MCP Hackathon, Winter 2025 (Track 2: Agentic Applications)

## Overview
ComputeAgent transforms the complex process of deploying large-scale AI models into a single natural-language command. Built for the MCP 1st Birthday Hackathon, this autonomous system leverages the Model Context Protocol (MCP) to deploy any Hugging Face model onto HiveCompute infrastructure with zero manual configuration.
What once required hours of DevOps work now takes seconds.
Simply say "Deploy meta-llama/Llama-3.1-70B" and ComputeAgent handles everything: capacity estimation, infrastructure provisioning, vLLM configuration, and deployment execution.
## Live Demo

Try the chatbot: [MCP-1st-Birthday/Hivenet_ComputeAgent](https://huggingface.co/spaces/MCP-1st-Birthday/Hivenet_ComputeAgent)

LinkedIn post: https://www.linkedin.com/feed/update/urn:li:activity:7400886621627285505/
## The Problem

Deploying AI models at scale remains frustratingly manual and error-prone:

- **Manual capacity planning** - calculating GPU memory requirements for each model
- **Complex infrastructure setup** - SSH keys, networking, environment dependencies
- **Inference server configuration** - vLLM and TensorRT-LLM parameter tuning
- **Trial-and-error debugging** - hours spent troubleshooting deployment issues
- **High barrier to entry** - requires DevOps expertise that many researchers lack

This friction slows innovation and makes large-model deployment inaccessible to many teams.
## Our Solution

ComputeAgent introduces autonomous compute orchestration through a multi-agent MCP architecture that thinks, plans, and acts on your behalf.

### The Workflow

1. **Natural Language Interface** - chat with the agent to deploy models
2. **Intelligent Analysis** - automatically estimates GPU requirements from the model architecture
3. **Automated Provisioning** - spins up HiveCompute instances via MCP
4. **Smart Configuration** - generates optimized vLLM commands
5. **Human-in-the-Loop** - review and approve each step, with the ability to modify it
6. **One-Click Deployment** - from request to running endpoint in minutes

Powered entirely by open-source models (a GPT-OSS-20B orchestrator) running on HiveCompute infrastructure. A minimal sketch of the MCP call behind step 3 follows.
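The provisioning step boils down to a standard MCP client calling a tool on a server. The sketch below uses the official `mcp` Python SDK; the `hivecompute-mcp-server` command and the `create_instance` tool name are illustrative assumptions, not the project's actual interface.

```python
# Hypothetical sketch of the orchestrator calling a HiveCompute MCP server.
# Only the MCP client API is real; the server command and tool name are assumed.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def provision(model_id: str) -> None:
    # Assumed: a HiveCompute MCP server available as a local executable.
    server = StdioServerParameters(command="hivecompute-mcp-server")
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Assumed tool name and arguments, mirroring this README's examples.
            result = await session.call_tool(
                "create_instance",
                {
                    "name": model_id.lower().replace("/", "-").replace(".", "-"),
                    "location": "france",
                    "config": "1x RTX4090",
                },
            )
            print(result.content)


asyncio.run(provision("meta-llama/Llama-3.1-8B"))
```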
## Key Features

### Conversational Deployment

Deploy any Hugging Face model through natural language:

- "Deploy meta-llama/Llama-3.1-70B on RTX 5090 in France"
- "I need Mistral-7B with low latency"
- "Deploy GPT-OSS-20B for production"
### Tool Approval System

Complete control with human-in-the-loop oversight:

- **Approve All** - execute every proposed tool
- **Reject All** - skip tool execution and get alternative responses
- **Selective Approval** - choose specific tools (e.g., "1,3,5"); see the sketch after this list
- **Modify Arguments** - edit parameters before execution
- **Re-Reasoning** - provide feedback so the agent reconsiders its plan
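One way the "1,3,5"-style selective approval could work, as a minimal sketch (the data structures are assumptions, not the project's actual code):

```python
# Hypothetical approval gate: filter proposed tool calls by the user's
# comma-separated, 1-indexed selection, e.g. "1,3,5".
from dataclasses import dataclass


@dataclass
class ToolCall:
    name: str
    arguments: dict


def select_tools(proposed: list[ToolCall], selection: str) -> list[ToolCall]:
    """Return only the tool calls whose 1-based index appears in `selection`."""
    indices = {int(tok) for tok in selection.split(",") if tok.strip()}
    return [call for i, call in enumerate(proposed, start=1) if i in indices]


proposed = [
    ToolCall("estimate_capacity", {"model": "meta-llama/Llama-3.1-8B"}),
    ToolCall("create_instance", {"location": "france"}),
    ToolCall("generate_vllm_command", {"tensor_parallel_size": 1}),
]
approved = select_tools(proposed, "1,3")
print([call.name for call in approved])  # ['estimate_capacity', 'generate_vllm_command']
```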
### Automatic Capacity Estimation

Intelligent resource planning:

- Calculates GPU memory from the model architecture
- Recommends optimal GPU types and quantities
- Considers tensor parallelism and quantization
- Accounts for KV cache and activation memory (a back-of-the-envelope version is sketched below)
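As a rough illustration of the arithmetic involved (the 1.2x overhead factor is an assumption; real estimates also depend on context length, batch size, and quantization):

```python
# Back-of-the-envelope VRAM estimate: weights = params * bytes-per-param,
# plus headroom for KV cache and activations. The 1.2x overhead factor is
# an illustrative assumption, not the agent's actual formula.
def estimate_vram_gb(params_billion: float, bytes_per_param: int = 2,
                     overhead: float = 1.2) -> float:
    weights_gb = params_billion * bytes_per_param  # 1e9 params * bytes ~= GB
    return weights_gb * overhead


# Llama-3.1-8B in fp16: ~16 GB of weights, ~19 GB with headroom.
print(f"{estimate_vram_gb(8):.1f} GB")   # 19.2 GB
# Llama-3.1-70B in fp16: ~140 GB of weights -> multi-GPU territory.
print(f"{estimate_vram_gb(70):.1f} GB")  # 168.0 GB
```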
### Multi-Location Support

Deploy across global regions:

- France
- UAE
- Texas (USA)
### GPU Selection

Support for the latest hardware:

- NVIDIA RTX 4090 (24 GB VRAM)
- NVIDIA RTX 5090 (32 GB VRAM)
- Multi-GPU configurations
- Automatic tensor parallelism setup (see the sketch after this list)
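How a GPU configuration might be derived from the VRAM estimate, as a sketch; the catalog values match the cards listed above, but the selection logic itself is an illustrative assumption:

```python
# Hypothetical GPU picker: find the smallest configuration whose total VRAM
# covers the estimated requirement.
import math

GPU_VRAM_GB = {"RTX4090": 24, "RTX5090": 32}


def pick_config(required_gb: float, gpu: str = "RTX5090") -> tuple[int, str]:
    count = math.ceil(required_gb / GPU_VRAM_GB[gpu])
    return count, f"{count}x {gpu}"


# vLLM requires tensor_parallel_size to divide the attention head count,
# so real logic would round up to a valid value (1, 2, 4, 8, ...).
print(pick_config(19.2))   # (1, '1x RTX5090')
print(pick_config(168.0))  # (6, '6x RTX5090')
```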
### Custom Capacity Configuration

Override the automatic estimates with your own capacity settings.
### Tool Modification

Edit tool arguments before execution:

```json
{
  "name": "meta-llama-llama-3-1-8b",
  "location": "uae",
  "config": "1x RTX4090"
}
```
### Interactive Gradio UI

A beautiful, responsive interface:

- Real-time chat interaction
- Tool approval panels
- Capacity configuration editor
- Session management
### Real-Time Processing

Fast and responsive:

- Async API built with FastAPI (a minimal endpoint sketch follows)
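A minimal sketch of what an async FastAPI chat endpoint could look like; the route, payload shape, and the placeholder agent call are assumptions, not the project's actual API:

```python
# Hypothetical async chat endpoint. FastAPI handles requests concurrently,
# which keeps the UI responsive while the agent plans a deployment.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class ChatRequest(BaseModel):
    message: str
    session_id: str | None = None


@app.post("/chat")
async def chat(req: ChatRequest) -> dict:
    # Placeholder for the real agent call, e.g. `await run_agent(req.message)`.
    reply = f"Planning deployment for: {req.message}"
    return {"reply": reply, "session_id": req.session_id}

# Run with: uvicorn app:app --port 7860
```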
## Quick Start

### Deploy Your First Model

Start with a simple request:

> Deploy meta-llama/Llama-3.1-8B

The agent will:

1. Analyze the model (8B parameters, ~16 GB VRAM needed in fp16)
2. Recommend 1x RTX 4090
3. Generate a vLLM configuration
4. Provision the infrastructure
5. Provide the deployment commands (an illustrative example follows)
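For a sense of what the generated vLLM configuration might look like, here is a sketch that builds a launch command. The flags are real vLLM options; the exact values the agent would choose are assumptions:

```python
# Illustrative vLLM launch command for the request above. The flags are
# standard vLLM serve options; the chosen values are assumptions.
def vllm_command(model: str, tp_size: int = 1, max_len: int = 8192) -> str:
    return (
        f"vllm serve {model}"
        f" --tensor-parallel-size {tp_size}"
        f" --max-model-len {max_len}"
        f" --gpu-memory-utilization 0.90"
        f" --port 8000"
    )


print(vllm_command("meta-llama/Llama-3.1-8B"))
# vllm serve meta-llama/Llama-3.1-8B --tensor-parallel-size 1
#   --max-model-len 8192 --gpu-memory-utilization 0.90 --port 8000
```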
## Learning Resources

- Understanding MCP: https://modelcontextprotocol.io
- LangGraph & Agents: https://langchain-ai.github.io/langgraph/
- vLLM Deployment: https://docs.vllm.ai
## Team

**Team Name:** Hivenet AI Team

**Team Members:**

- Igor Carrara - @carraraig - AI Scientist
- Mamoutou Diarra - @mdiarra - AI Scientist
## Hackathon Context

Created for the MCP 1st Birthday Hackathon, celebrating the first anniversary of the Model Context Protocol with innovative AI applications that demonstrate the power of standardized tool use and agent orchestration.

### Why This Matters

ComputeAgent showcases:

- MCP's power for building production-grade agents
- Human-in-the-loop design for responsible AI
- Real-world utility that solves actual deployment pain points
- An open-source-first approach built on accessible technology
## Contributing

We welcome contributions! Here's how you can help:

### Areas for Contribution

- **Bug fixes** - report and fix issues
- **New features** - add support for more models or GPUs
- **Documentation** - improve guides and examples
- **Testing** - add test coverage
- **UI/UX** - enhance the interface
## License

Apache 2.0
## About Hivenet & HiveCompute

### What is Hivenet?

Hivenet provides secure, sustainable cloud storage and computing through a distributed network, drawing on unused computing power from devices worldwide rather than relying on massive data centers. This approach makes cloud computing more efficient, affordable, and environmentally friendly.

### HiveCompute: Distributed GPU Cloud

Compute with Hivenet is a GPU cloud computing platform that democratizes access to high-performance computing resources.
### Key Features

#### High-Performance GPUs

- Instant access to dedicated GPU nodes powered by RTX 4090 and RTX 5090
- Performance that matches or exceeds traditional data center GPUs
- Perfect for AI inference, training, rendering, and scientific computing

#### Transparent & Affordable Pricing

- Per-second billing with up to 58% savings compared to GCP, AWS, and Azure
- No hidden egress fees or long-term commitments
- Pay only for what you use with prepaid credits

#### Global Infrastructure

- GPU clusters run locally in the UAE, France, and the USA for lower latency and tighter compliance
- Built-in GDPR compliance
- Data stays local for faster AI model responses

#### Sustainable Computing

- Uses unused computing power from devices worldwide instead of power-hungry data centers
- Reduces carbon footprint by up to 77% compared to traditional cloud services
- Community-driven distributed infrastructure that puts existing, underutilized hardware to work
- Reduces the need for new data center construction

#### Instant Setup

- Launch GPU instances in seconds
- Pre-configured templates for popular frameworks
- Jupyter notebooks and SSH access included
- Pause/resume instances without losing your setup

#### Enterprise-Grade Reliability

- Workloads automatically replicate across trusted nodes, keeping downtime near zero
- Hive-Certified providers with a 99.9% uptime SLA
- Tier-3 data center equivalent quality

Learn more at [compute.hivenet.com](https://compute.hivenet.com).
## Support

### Need Help?

- **Documentation** - check this README and the inline code comments
- **Email** - contact the Hivenet team

Built with ❤️ by the Hivenet Team

*Making large-scale AI deployment accessible to everyone.*