---
title: ComputeAgent - Hivenet AI Deployment
emoji: 🚀
colorFrom: yellow
colorTo: red
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
short_description: AI-powered deployment on Compute with Hivenet via MCP
tags:
- mcp-in-action-track-enterprise
- mcp-in-action-track-consumer
- mcp-in-action-track-creative
---
# ComputeAgent - Autonomous AI Deployment via MCP

*An Intelligent Multi-Agent System for Zero-Friction Model Deployment on HiveCompute*

**Hackathon Entry:** Agents & MCP Hackathon, Winter 2025 (Track 2: Agentic Applications)

## Overview
ComputeAgent transforms the complex process of deploying large-scale AI models into a single natural-language command. Built for the MCP 1st Birthday Hackathon, this autonomous system leverages the Model Context Protocol (MCP) to deploy any Hugging Face model onto HiveCompute infrastructure with zero manual configuration.
What once required hours of DevOps work now takes seconds.
Simply say "Deploy meta-llama/Llama-3.1-70B" and ComputeAgent handles everything: capacity estimation, infrastructure provisioning, vLLM configuration, and deployment execution.
## Live Demo

Try the chatbot: [MCP-1st-Birthday/Hivenet_ComputeAgent](https://huggingface.co/spaces/MCP-1st-Birthday/Hivenet_ComputeAgent)

LinkedIn post: https://www.linkedin.com/feed/update/urn:li:activity:7400886621627285505/
## The Problem

Deploying AI models at scale remains frustratingly manual and error-prone:

- **Manual capacity planning** - calculating GPU memory requirements for each model
- **Complex infrastructure setup** - SSH keys, networking, environment dependencies
- **Inference server configuration** - vLLM and TensorRT-LLM parameter tuning
- **Trial-and-error debugging** - hours spent troubleshooting deployment issues
- **High barrier to entry** - requires DevOps expertise that many researchers lack

This friction slows innovation and makes large-model deployment inaccessible to many teams.
## Our Solution

ComputeAgent introduces autonomous compute orchestration through a multi-agent MCP architecture that thinks, plans, and acts on your behalf.

### The Workflow

1. **Natural Language Interface** - chat with the agent to deploy models
2. **Intelligent Analysis** - automatically estimates GPU requirements from the model architecture
3. **Automated Provisioning** - spins up HiveCompute instances via MCP
4. **Smart Configuration** - generates optimized vLLM commands
5. **Human-in-the-Loop** - review and approve each step, with the ability to modify it
6. **One-Click Deployment** - from request to running endpoint in minutes

Powered entirely by open-source models (a GPT-OSS-20B orchestrator) running on HiveCompute infrastructure. A minimal sketch of the MCP call behind step 3 follows.
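The provisioning step boils down to a standard MCP client calling a tool on a server. The sketch below uses the official `mcp` Python SDK; the `hivecompute-mcp-server` command and the `create_instance` tool name are illustrative assumptions, not the project's actual interface.

```python
# Hypothetical sketch of the orchestrator calling a HiveCompute MCP server.
# Only the MCP client API is real; the server command and tool name are assumed.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def provision(model_id: str) -> None:
    # Assumed: a HiveCompute MCP server available as a local executable.
    server = StdioServerParameters(command="hivecompute-mcp-server")
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Assumed tool name and arguments, mirroring this README's examples.
            result = await session.call_tool(
                "create_instance",
                {
                    "name": model_id.lower().replace("/", "-").replace(".", "-"),
                    "location": "france",
                    "config": "1x RTX4090",
                },
            )
            print(result.content)


asyncio.run(provision("meta-llama/Llama-3.1-8B"))
```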
## Key Features

### Conversational Deployment

Deploy any Hugging Face model through natural language:

- "Deploy meta-llama/Llama-3.1-70B on RTX 5090 in France"
- "I need Mistral-7B with low latency"
- "Deploy GPT-OSS-20B for production"
### Tool Approval System

Complete control with human-in-the-loop oversight:

- **Approve All** - execute every proposed tool
- **Reject All** - skip tool execution and get alternative responses
- **Selective Approval** - choose specific tools (e.g., "1,3,5"); see the sketch after this list
- **Modify Arguments** - edit parameters before execution
- **Re-Reasoning** - provide feedback so the agent reconsiders its plan
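One way the "1,3,5"-style selective approval could work, as a minimal sketch (the data structures are assumptions, not the project's actual code):

```python
# Hypothetical approval gate: filter proposed tool calls by the user's
# comma-separated, 1-indexed selection, e.g. "1,3,5".
from dataclasses import dataclass


@dataclass
class ToolCall:
    name: str
    arguments: dict


def select_tools(proposed: list[ToolCall], selection: str) -> list[ToolCall]:
    """Return only the tool calls whose 1-based index appears in `selection`."""
    indices = {int(tok) for tok in selection.split(",") if tok.strip()}
    return [call for i, call in enumerate(proposed, start=1) if i in indices]


proposed = [
    ToolCall("estimate_capacity", {"model": "meta-llama/Llama-3.1-8B"}),
    ToolCall("create_instance", {"location": "france"}),
    ToolCall("generate_vllm_command", {"tensor_parallel_size": 1}),
]
approved = select_tools(proposed, "1,3")
print([call.name for call in approved])  # ['estimate_capacity', 'generate_vllm_command']
```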
### Automatic Capacity Estimation

Intelligent resource planning:

- Calculates GPU memory from the model architecture
- Recommends optimal GPU types and quantities
- Considers tensor parallelism and quantization
- Accounts for KV cache and activation memory (a back-of-the-envelope version is sketched below)
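As a rough illustration of the arithmetic involved (the 1.2x overhead factor is an assumption; real estimates also depend on context length, batch size, and quantization):

```python
# Back-of-the-envelope VRAM estimate: weights = params * bytes-per-param,
# plus headroom for KV cache and activations. The 1.2x overhead factor is
# an illustrative assumption, not the agent's actual formula.
def estimate_vram_gb(params_billion: float, bytes_per_param: int = 2,
                     overhead: float = 1.2) -> float:
    weights_gb = params_billion * bytes_per_param  # 1e9 params * bytes ~= GB
    return weights_gb * overhead


# Llama-3.1-8B in fp16: ~16 GB of weights, ~19 GB with headroom.
print(f"{estimate_vram_gb(8):.1f} GB")   # 19.2 GB
# Llama-3.1-70B in fp16: ~140 GB of weights -> multi-GPU territory.
print(f"{estimate_vram_gb(70):.1f} GB")  # 168.0 GB
```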
### Multi-Location Support

Deploy across global regions:

- France
- UAE
- Texas (USA)
### GPU Selection

Support for the latest hardware:

- NVIDIA RTX 4090 (24 GB VRAM)
- NVIDIA RTX 5090 (32 GB VRAM)
- Multi-GPU configurations
- Automatic tensor parallelism setup (see the sketch after this list)
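How a GPU configuration might be derived from the VRAM estimate, as a sketch; the catalog values match the cards listed above, but the selection logic itself is an illustrative assumption:

```python
# Hypothetical GPU picker: find the smallest configuration whose total VRAM
# covers the estimated requirement.
import math

GPU_VRAM_GB = {"RTX4090": 24, "RTX5090": 32}


def pick_config(required_gb: float, gpu: str = "RTX5090") -> tuple[int, str]:
    count = math.ceil(required_gb / GPU_VRAM_GB[gpu])
    return count, f"{count}x {gpu}"


# vLLM requires tensor_parallel_size to divide the attention head count,
# so real logic would round up to a valid value (1, 2, 4, 8, ...).
print(pick_config(19.2))   # (1, '1x RTX5090')
print(pick_config(168.0))  # (6, '6x RTX5090')
```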
### Custom Capacity Configuration

Override the automatic estimates with your own capacity settings.
### Tool Modification

Edit tool arguments before execution:

```json
{
  "name": "meta-llama-llama-3-1-8b",
  "location": "uae",
  "config": "1x RTX4090"
}
```
### Interactive Gradio UI

A beautiful, responsive interface:

- Real-time chat interaction
- Tool approval panels
- Capacity configuration editor
- Session management
### Real-Time Processing

Fast and responsive:

- Async API built with FastAPI (a minimal endpoint sketch follows)
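A minimal sketch of what an async FastAPI chat endpoint could look like; the route, payload shape, and the placeholder agent call are assumptions, not the project's actual API:

```python
# Hypothetical async chat endpoint. FastAPI handles requests concurrently,
# which keeps the UI responsive while the agent plans a deployment.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class ChatRequest(BaseModel):
    message: str
    session_id: str | None = None


@app.post("/chat")
async def chat(req: ChatRequest) -> dict:
    # Placeholder for the real agent call, e.g. `await run_agent(req.message)`.
    reply = f"Planning deployment for: {req.message}"
    return {"reply": reply, "session_id": req.session_id}

# Run with: uvicorn app:app --port 7860
```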
## Quick Start

### Deploy Your First Model

Start with a simple request:

> Deploy meta-llama/Llama-3.1-8B

The agent will:

1. Analyze the model (8B parameters, ~16 GB VRAM needed in fp16)
2. Recommend 1x RTX 4090
3. Generate a vLLM configuration
4. Provision the infrastructure
5. Provide the deployment commands (an illustrative example follows)
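For a sense of what the generated vLLM configuration might look like, here is a sketch that builds a launch command. The flags are real vLLM options; the exact values the agent would choose are assumptions:

```python
# Illustrative vLLM launch command for the request above. The flags are
# standard vLLM serve options; the chosen values are assumptions.
def vllm_command(model: str, tp_size: int = 1, max_len: int = 8192) -> str:
    return (
        f"vllm serve {model}"
        f" --tensor-parallel-size {tp_size}"
        f" --max-model-len {max_len}"
        f" --gpu-memory-utilization 0.90"
        f" --port 8000"
    )


print(vllm_command("meta-llama/Llama-3.1-8B"))
# vllm serve meta-llama/Llama-3.1-8B --tensor-parallel-size 1
#   --max-model-len 8192 --gpu-memory-utilization 0.90 --port 8000
```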
## Learning Resources

- Understanding MCP: https://modelcontextprotocol.io
- LangGraph & Agents: https://langchain-ai.github.io/langgraph/
- vLLM Deployment: https://docs.vllm.ai
## Team

**Team Name:** Hivenet AI Team

**Team Members:**

- Igor Carrara - @carraraig - AI Scientist
- Mamoutou Diarra - @mdiarra - AI Scientist
## Hackathon Context

Created for the MCP 1st Birthday Hackathon, celebrating the first anniversary of the Model Context Protocol with innovative AI applications that demonstrate the power of standardized tool use and agent orchestration.

### Why This Matters

ComputeAgent showcases:

- MCP's power for building production-grade agents
- Human-in-the-loop design for responsible AI
- Real-world utility that solves actual deployment pain points
- An open-source-first approach built on accessible technology
## Contributing

We welcome contributions! Here's how you can help:

### Areas for Contribution

- **Bug fixes** - report and fix issues
- **New features** - add support for more models or GPUs
- **Documentation** - improve guides and examples
- **Testing** - add test coverage
- **UI/UX** - enhance the interface
## License

Apache 2.0
## About Hivenet & HiveCompute

### What is Hivenet?

Hivenet provides secure, sustainable cloud storage and computing through a distributed network, drawing on unused computing power from devices worldwide rather than relying on massive data centers. This approach makes cloud computing more efficient, affordable, and environmentally friendly.

### HiveCompute: Distributed GPU Cloud

Compute with Hivenet is a GPU cloud computing platform that democratizes access to high-performance computing resources.
### Key Features

#### High-Performance GPUs

- Instant access to dedicated GPU nodes powered by RTX 4090 and RTX 5090
- Performance that matches or exceeds traditional data center GPUs
- Perfect for AI inference, training, rendering, and scientific computing

#### Transparent & Affordable Pricing

- Per-second billing with up to 58% savings compared to GCP, AWS, and Azure
- No hidden egress fees or long-term commitments
- Pay only for what you use with prepaid credits

#### Global Infrastructure

- GPU clusters run locally in the UAE, France, and the USA for lower latency and tighter compliance
- Built-in GDPR compliance
- Data stays local for faster AI model responses

#### Sustainable Computing

- Uses unused computing power from devices worldwide instead of power-hungry data centers
- Reduces carbon footprint by up to 77% compared to traditional cloud services
- Community-driven distributed infrastructure that puts existing, underutilized hardware to work
- Reduces the need for new data center construction

#### Instant Setup

- Launch GPU instances in seconds
- Pre-configured templates for popular frameworks
- Jupyter notebooks and SSH access included
- Pause/resume instances without losing your setup

#### Enterprise-Grade Reliability

- Workloads automatically replicate across trusted nodes, keeping downtime near zero
- Hive-Certified providers with a 99.9% uptime SLA
- Tier-3 data center equivalent quality

Learn more at [compute.hivenet.com](https://compute.hivenet.com).
## Support

### Need Help?

- **Documentation** - check this README and the inline code comments
- **Email** - contact the Hivenet team

Built with ❤️ by the Hivenet Team

*Making large-scale AI deployment accessible to everyone.*