---
title: ComputeAgent - Hivenet AI Deployment
emoji: 🚀
colorFrom: yellow
colorTo: red
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
short_description: AI-Powered Deployment using MCP of Compute by Hivenet
tags:
  - mcp-in-action-track-enterprise
  - mcp-in-action-track-consumer
  - mcp-in-action-track-creative
---

# 🚀 ComputeAgent - Autonomous AI Deployment via MCP

**An Intelligent Multi-Agent System for Zero-Friction Model Deployment on HiveCompute**

🔗 **Hackathon Entry:** [Agents & MCP Hackathon – Winter 2025 (Track 2: Agentic Applications)](https://huggingface.co/Agents-MCP-Hackathon-Winter25#-track-2-agentic-applications)

---

## 🎯 Overview

ComputeAgent transforms the complex process of deploying large-scale AI models into a single natural-language command. Built for the **MCP 1st Birthday Hackathon**, this autonomous system leverages the Model Context Protocol (MCP) to deploy any Hugging Face model onto HiveCompute infrastructure with zero manual configuration.

**What once required hours of DevOps work now takes seconds.**

Simply say: *"Deploy meta-llama/Llama-3.1-70B"*, and ComputeAgent handles everything: capacity estimation, infrastructure provisioning, vLLM configuration, and deployment execution.

---

## 🔮 Live Demo

Try the chatbot: **[https://huggingface.co/spaces/MCP-1st-Birthday/Hivenet_ComputeAgent](https://huggingface.co/spaces/MCP-1st-Birthday/Hivenet_ComputeAgent)**

---

## 📹 Preview

<video controls autoplay src="https://huggingface.co/spaces/MCP-1st-Birthday/Hivenet_ComputeAgent/resolve/main/Demo_Final.mp4"></video>

---

## 📹 LinkedIn Post

LinkedIn post: **[https://www.linkedin.com/feed/update/urn:li:activity:7400886621627285505/](https://www.linkedin.com/feed/update/urn:li:activity:7400886621627285505/)**

---

## 💡 The Problem

Deploying AI models at scale remains frustratingly manual and error-prone:

- ❌ **Manual capacity planning** - Calculating GPU memory requirements for each model
- ❌ **Complex infrastructure setup** - SSH keys, networking, environment dependencies
- ❌ **Inference server configuration** - vLLM, TensorRT-LLM parameter tuning
- ❌ **Trial-and-error debugging** - Hours spent troubleshooting deployment issues
- ❌ **High barrier to entry** - Requires DevOps expertise that many researchers lack

This friction slows innovation and makes large-model deployment inaccessible to many teams.

---

## ✨ Our Solution

ComputeAgent introduces **autonomous compute orchestration** through a multi-agent MCP architecture that thinks, plans, and acts on your behalf:

### The Workflow

1. **🤖 Natural Language Interface** - Chat with the agent to deploy models
2. **🧠 Intelligent Analysis** - Automatically estimates GPU requirements from model architecture
3. **⚡ Automated Provisioning** - Spins up HiveCompute instances via MCP
4. **🔧 Smart Configuration** - Generates optimized vLLM commands
5. **✅ Human-in-the-Loop** - Review and approve each step with modification capabilities
6. **🎯 One-Click Deployment** - From request to running endpoint in minutes

**Powered entirely by open-source models** (GPT-OSS-20B orchestrator) running on HiveCompute infrastructure.
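
The plan-then-approve loop described above can be sketched in a few lines. This is an illustrative simplification, not the project's actual code: the step names, `build_plan`, and `run_plan` are hypothetical, and the real agent derives its plan from the conversation rather than a fixed list.

```python
from dataclasses import dataclass

@dataclass
class Step:
    """One proposed action in the deployment plan (hypothetical structure)."""
    name: str
    approved: bool = False

def build_plan(model_id: str) -> list[Step]:
    # Turn a deployment request into an ordered list of proposed actions.
    return [
        Step(f"estimate capacity for {model_id}"),
        Step("provision HiveCompute instance"),
        Step(f"generate vLLM config for {model_id}"),
        Step("execute deployment"),
    ]

def run_plan(plan: list[Step], approve) -> list[str]:
    # Execute each step only after the human-in-the-loop callback approves it.
    executed = []
    for step in plan:
        step.approved = approve(step)
        if step.approved:
            executed.append(step.name)
    return executed

# Example: approve everything, as an "Approve All" action would.
plan = build_plan("meta-llama/Llama-3.1-70B")
done = run_plan(plan, approve=lambda s: True)
```

Rejecting a step is just `approve` returning `False` for it, which keeps the approval logic in one place.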

---

## 🎮 Key Features

### 🤖 Conversational Deployment
Deploy any Hugging Face model through natural language:
```
"Deploy meta-llama/Llama-3.1-70B on RTX 5090 in France"
"I need Mistral-7B with low latency"
"Deploy GPT-OSS-20B for production"
```

### 🔧 Tool Approval System
Complete control with human-in-the-loop oversight:
- **✅ Approve All** - Execute all proposed tools
- **❌ Reject All** - Skip tool execution and get alternative responses
- **🔧 Selective Approval** - Choose specific tools (e.g., "1,3,5")
- **📝 Modify Arguments** - Edit parameters before execution
- **🔄 Re-Reasoning** - Provide feedback for agent reconsideration
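
A minimal sketch of how a selective-approval reply like `"1,3,5"` could be mapped onto the proposed tool list (1-indexed, as in the UI). `select_tools` and the tool names are hypothetical, not the app's real parser:

```python
def select_tools(reply: str, proposed: list[str]) -> list[str]:
    """Return the proposed tools whose 1-based index appears in the reply."""
    indices = {int(part) for part in reply.split(",") if part.strip().isdigit()}
    return [tool for i, tool in enumerate(proposed, start=1) if i in indices]

# Illustrative tool names for a deployment run.
proposed = ["estimate_capacity", "create_instance", "generate_vllm_command",
            "open_firewall_port", "start_deployment"]
print(select_tools("1,3,5", proposed))
# → ['estimate_capacity', 'generate_vllm_command', 'start_deployment']
```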

### 📊 Automatic Capacity Estimation
Intelligent resource planning:
- Calculates GPU memory from model architecture
- Recommends optimal GPU types and quantities
- Considers tensor parallelism and quantization
- Accounts for KV cache and activation memory
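
A back-of-the-envelope version of this estimate, under common rule-of-thumb assumptions: weights take `params × bytes_per_param`, with a flat overhead factor standing in for KV cache and activations. The real agent derives these numbers from the model's architecture; the function names and the 1.2 factor here are illustrative.

```python
import math

def estimate_vram_gb(params_b: float, bytes_per_param: float = 2.0,
                     overhead: float = 1.2) -> float:
    """Rough VRAM need in GB: FP16 weights (2 bytes/param) plus ~20% overhead."""
    return params_b * bytes_per_param * overhead

def gpus_needed(vram_gb: float, gpu_vram_gb: float = 24.0) -> int:
    """Smallest GPU count covering the estimate (default: RTX 4090, 24 GB)."""
    return max(1, math.ceil(vram_gb / gpu_vram_gb))

# Llama-3.1-8B in FP16: ~19.2 GB → fits on one 24 GB RTX 4090.
print(gpus_needed(estimate_vram_gb(8)))   # → 1
# Llama-3.1-70B in FP16: ~168 GB → needs multi-GPU tensor parallelism.
print(gpus_needed(estimate_vram_gb(70)))  # → 7
```

Quantization enters the same formula by lowering `bytes_per_param` (e.g. ~1.0 for INT8, ~0.5 for 4-bit).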

### 🌍 Multi-Location Support
Deploy across global regions:
- 🇫🇷 **France**
- 🇦🇪 **UAE**
- 🇺🇸 **Texas**

### 🎯 GPU Selection
Support for latest hardware:
- NVIDIA RTX 4090 (24GB VRAM)
- NVIDIA RTX 5090 (32GB VRAM)
- Multi-GPU configurations
- Automatic tensor parallelism setup

### Custom Capacity Configuration

Override the automatic capacity estimates with your own values.

### Tool Modification

Edit tool arguments before execution:
```json
{
  "name": "meta-llama-llama-3-1-8b",
  "location": "uae",
  "config": "1x RTX4090"
}
```

### 💬 Interactive Gradio UI
Beautiful, responsive interface:
- Real-time chat interaction
- Tool approval panels
- Capacity configuration editor
- Session management

### ⚡ Real-time Processing
Fast and responsive:
- Async API with FastAPI

---

## 🚀 Quick Start

### Deploy Your First Model

#### 1. **Simple Deployment**
```
Deploy meta-llama/Llama-3.1-8B
```

The agent will:
- Analyze the model (8B parameters, ~16GB VRAM needed)
- Recommend 1x RTX 4090
- Generate vLLM configuration
- Provision infrastructure
- Provide deployment commands
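
The "generate vLLM configuration" step could reduce to assembling a serve command from the capacity decision. `--tensor-parallel-size` and `--port` are standard vLLM CLI flags; the builder function and defaults below are illustrative, not the agent's actual output:

```python
def vllm_command(model_id: str, tensor_parallel: int = 1, port: int = 8000) -> str:
    """Build a vLLM serve command for the chosen model and GPU layout."""
    parts = [
        "vllm", "serve", model_id,
        "--tensor-parallel-size", str(tensor_parallel),  # GPUs to shard across
        "--port", str(port),                             # OpenAI-compatible API port
    ]
    return " ".join(parts)

print(vllm_command("meta-llama/Llama-3.1-8B"))
# → vllm serve meta-llama/Llama-3.1-8B --tensor-parallel-size 1 --port 8000
```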

---

## 🎓 Learning Resources

### Understanding MCP
- [Model Context Protocol Specification](https://modelcontextprotocol.io/)
- [MCP Documentation](https://github.com/modelcontextprotocol)

### LangGraph & Agents
- [LangGraph Documentation](https://langchain-ai.github.io/langgraph/)
- [Building Agentic Systems](https://python.langchain.com/docs/modules/agents/)

### vLLM Deployment
- [vLLM Documentation](https://docs.vllm.ai/)
- [Optimizing Inference](https://docs.vllm.ai/en/latest/serving/performance.html)

---

## 👥 Team

**Team Name:** Hivenet AI Team

**Team Members:**
- **Igor Carrara** - [@carraraig](https://huggingface.co/carraraig) - AI Scientist
- **Mamoutou Diarra** - [@mdiarra](https://huggingface.co/mdiarra) - AI Scientist

---

## 🎉 Hackathon Context

Created for the **MCP 1st Birthday Hackathon**, celebrating the first anniversary of the Model Context Protocol with innovative AI applications that demonstrate the power of standardized tool use and agent orchestration.

### Why This Matters

ComputeAgent showcases:
- ✅ **MCP's power** for building production-grade agents
- ✅ **Human-in-the-loop** design for responsible AI
- ✅ **Real-world utility** solving actual deployment pain points
- ✅ **Open-source first** approach with accessible technology

---

## 🤝 Contributing

We welcome contributions! Here's how you can help:

### Areas for Contribution
- 🐛 **Bug fixes** - Report and fix issues
- ✨ **New features** - Add support for more models or GPUs
- 📚 **Documentation** - Improve guides and examples
- 🧪 **Testing** - Add test coverage
- 🎨 **UI/UX** - Enhance the interface

---

## 📄 License

Apache 2.0

---

## ๐ŸŒ About Hivenet & HiveCompute

### What is Hivenet?

Hivenet provides secure, sustainable cloud storage and computing through a distributed network, utilizing unused computing power from devices worldwide rather than relying on massive data centers. This approach makes cloud computing more efficient, affordable, and environmentally friendly.

### HiveCompute: Distributed GPU Cloud

**Compute with Hivenet** is a revolutionary GPU cloud computing platform that democratizes access to high-performance computing resources.

#### 🎯 Key Features

**🚀 High-Performance GPUs**
- Instant access to dedicated GPU nodes powered by RTX 4090 and RTX 5090
- Performance that matches or exceeds traditional data center GPUs
- Perfect for AI inference, training, rendering, and scientific computing

**💰 Transparent & Affordable Pricing**
- Per-second billing with up to 58% savings compared to GCP, AWS, and Azure
- No hidden egress fees or long-term commitments
- Pay only for what you use with prepaid credits

**🌍 Global Infrastructure**
- GPU clusters run locally in the UAE, France, and the USA for lower latency and tighter compliance
- Built-in GDPR compliance
- Data stays local for faster AI model responses

**♻️ Sustainable Computing**
- Uses unused computing power from devices worldwide instead of power-hungry data centers
- Reduces carbon footprint by up to 77% compared to traditional cloud services
- Community-driven distributed infrastructure
- Utilizes existing, underutilized hardware
- Reduces the need for new data center construction

**⚡ Instant Setup**
- Launch GPU instances in seconds
- Pre-configured templates for popular frameworks
- Jupyter notebooks and SSH access included
- Pause/resume instances without losing your setup

**🔒 Enterprise-Grade Reliability**
- Workloads automatically replicate across trusted nodes, keeping downtime near-zero
- Hive-Certified providers with 99.9% uptime SLA
- Tier-3 data center equivalent quality

Learn more at [compute.hivenet.com](https://compute.hivenet.com/)

---

## 💬 Support

### Need Help?

- 📖 **Documentation** - Check this README and inline code comments
- 📧 **Email** - Contact the Hivenet team

---

<div align="center">

**Built with ❤️ by the Hivenet Team**

*Making large-scale AI deployment accessible to everyone.*

</div>