---
title: ComputeAgent - Hivenet AI Deployment
emoji: 🚀
colorFrom: yellow
colorTo: red
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
short_description: AI-powered deployment on Compute with Hivenet via MCP
tags:
- mcp-in-action-track-enterprise
- mcp-in-action-track-consumer
- mcp-in-action-track-creative
---
# 🚀 ComputeAgent - Autonomous AI Deployment via MCP
**An Intelligent Multi-Agent System for Zero-Friction Model Deployment on HiveCompute**
**Hackathon Entry:** [Agents & MCP Hackathon – Winter 2025 (Track 2: Agentic Applications)](https://huggingface.co/Agents-MCP-Hackathon-Winter25#-track-2-agentic-applications)
---
## 🎯 Overview
ComputeAgent transforms the complex process of deploying large-scale AI models into a single natural-language command. Built for the **MCP 1st Birthday Hackathon**, this autonomous system leverages the Model Context Protocol (MCP) to deploy any Hugging Face model onto HiveCompute infrastructure with zero manual configuration.
**What once required hours of DevOps work now takes seconds.**
Simply say *"Deploy meta-llama/Llama-3.1-70B"*, and ComputeAgent handles everything: capacity estimation, infrastructure provisioning, vLLM configuration, and deployment execution.
---
## 🎮 Live Demo
Try the chatbot: **[https://huggingface.co/spaces/MCP-1st-Birthday/Hivenet_ComputeAgent](https://huggingface.co/spaces/MCP-1st-Birthday/Hivenet_ComputeAgent)**
---
## 📹 Preview
<video controls autoplay src="https://huggingface.co/spaces/MCP-1st-Birthday/Hivenet_ComputeAgent/resolve/main/Demo_Final.mp4"></video>
---
## 📹 LinkedIn Post
LinkedIn post: **[https://www.linkedin.com/feed/update/urn:li:activity:7400886621627285505/](https://www.linkedin.com/feed/update/urn:li:activity:7400886621627285505/)**
---
## 💡 The Problem
Deploying AI models at scale remains frustratingly manual and error-prone:
- ❌ **Manual capacity planning** - Calculating GPU memory requirements for each model
- ❌ **Complex infrastructure setup** - SSH keys, networking, environment dependencies
- ❌ **Inference server configuration** - vLLM, TensorRT-LLM parameter tuning
- ❌ **Trial-and-error debugging** - Hours spent troubleshooting deployment issues
- ❌ **High barrier to entry** - Requires DevOps expertise that many researchers lack
This friction slows innovation and makes large-model deployment inaccessible to many teams.
---
## ✨ Our Solution
ComputeAgent introduces **autonomous compute orchestration** through a multi-agent MCP architecture that thinks, plans, and acts on your behalf:
### The Workflow
1. **🤖 Natural Language Interface** - Chat with the agent to deploy models
2. **🧠 Intelligent Analysis** - Automatically estimates GPU requirements from model architecture
3. **⚡ Automated Provisioning** - Spins up HiveCompute instances via MCP
4. **🔧 Smart Configuration** - Generates optimized vLLM commands
5. **✅ Human-in-the-Loop** - Review and approve each step with modification capabilities
6. **🎯 One-Click Deployment** - From request to running endpoint in minutes
**Powered entirely by open-source models** (GPT-OSS-20B orchestrator) running on HiveCompute infrastructure.
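The plan-approve-execute loop behind these steps can be sketched as follows (a minimal sketch: the static plan, tool names, and callbacks are hypothetical stand-ins for the GPT-OSS-20B orchestrator and its MCP tools):

```python
from dataclasses import dataclass

@dataclass
class ToolCall:
    """One proposed MCP tool invocation."""
    name: str
    args: dict

def run_workflow(model_id, approve, execute):
    """Plan tool calls for a deployment request, gate each through a
    human approval callback, and execute only the approved ones."""
    # Hypothetical static plan; the real orchestrator derives this
    # from the chat request and the model's architecture.
    plan = [
        ToolCall("estimate_capacity", {"model": model_id}),
        ToolCall("provision_instance", {"gpu": "RTX 4090", "location": "france"}),
        ToolCall("generate_vllm_config", {"model": model_id}),
    ]
    results = []
    for call in plan:
        if approve(call):                  # human-in-the-loop gate
            results.append(execute(call))  # would be an MCP call in practice
    return results

# Usage: approve everything and execute with a stub.
executed = run_workflow(
    "meta-llama/Llama-3.1-8B",
    approve=lambda call: True,
    execute=lambda call: call.name,
)
```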
---
## 🎮 Key Features
### 🤖 Conversational Deployment
Deploy any Hugging Face model through natural language:
```
"Deploy meta-llama/Llama-3.1-70B on RTX 5090 in France"
"I need Mistral-7B with low latency"
"Deploy GPT-OSS-20B for production"
```
### 🔧 Tool Approval System
Complete control with human-in-the-loop oversight:
- **✅ Approve All** - Execute all proposed tools
- **❌ Reject All** - Skip tool execution and get alternative responses
- **🔧 Selective Approval** - Choose specific tools (e.g., "1,3,5")
- **📝 Modify Arguments** - Edit parameters before execution
- **🔄 Re-Reasoning** - Provide feedback for agent reconsideration
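As a sketch, the approval modes above reduce to interpreting one decision string per turn (the helper name and decision keywords are assumptions, not the Space's actual parser):

```python
def select_tools(decision: str, n_tools: int) -> set[int]:
    """Map an approval decision to the set of 1-based tool indices to run.

    "all"    -> approve every proposed tool
    "none"   -> reject every proposed tool
    "1,3,5"  -> approve only the listed tools
    """
    decision = decision.strip().lower()
    if decision == "all":
        return set(range(1, n_tools + 1))
    if decision == "none":
        return set()
    # Selective approval: parse indices, dropping out-of-range entries.
    picked = {int(tok) for tok in decision.split(",") if tok.strip().isdigit()}
    return {i for i in picked if 1 <= i <= n_tools}
```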
### 📊 Automatic Capacity Estimation
Intelligent resource planning:
- Calculates GPU memory from model architecture
- Recommends optimal GPU types and quantities
- Considers tensor parallelism and quantization
- Accounts for KV cache and activation memory
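A back-of-envelope version of this estimate (a simplification for illustration, not ComputeAgent's exact estimator): weight memory is parameter count times bytes per parameter, plus a fractional margin for KV cache and activations.

```python
import math

def estimate_vram_gb(params_billion: float, dtype_bytes: int = 2,
                     overhead: float = 0.2) -> float:
    """Estimate serving VRAM in GB: weights plus a 20% default margin
    for KV cache and activations (assumed figure, workload-dependent)."""
    weights_gb = params_billion * dtype_bytes  # 1B params * 2 B = 2 GB in fp16
    return round(weights_gb * (1 + overhead), 1)

# Llama-3.1-8B in fp16: ~16 GB of weights plus margin fits one 24 GB RTX 4090.
need_8b = estimate_vram_gb(8)    # 19.2
# Llama-3.1-70B in fp16 needs several cards.
need_70b = estimate_vram_gb(70)  # 168.0
gpus_70b = math.ceil(need_70b / 32)  # spread across 32 GB RTX 5090s
```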
### 🌍 Multi-Location Support
Deploy across global regions:
- 🇫🇷 **France**
- 🇦🇪 **UAE**
- 🇺🇸 **Texas**
### 🎯 GPU Selection
Support for latest hardware:
- NVIDIA RTX 4090 (24GB VRAM)
- NVIDIA RTX 5090 (32GB VRAM)
- Multi-GPU configurations
- Automatic tensor parallelism setup
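The selection logic can be sketched as picking, from a small catalog (VRAM figures taken from the list above; the function itself is illustrative, not the agent's real code), the configuration that needs the fewest cards; the resulting count then serves as vLLM's tensor-parallel size.

```python
import math

# Catalog mirroring the supported hardware listed above.
GPU_VRAM_GB = {"RTX 4090": 24, "RTX 5090": 32}

def pick_gpu_config(required_vram_gb: float) -> tuple[str, int]:
    """Return (gpu_type, count) with the smallest count whose pooled
    VRAM covers the requirement; ties keep the first catalog entry.
    `count` maps directly to vLLM's --tensor-parallel-size."""
    best = None
    for gpu, vram in GPU_VRAM_GB.items():
        count = math.ceil(required_vram_gb / vram)
        if best is None or count < best[1]:
            best = (gpu, count)
    return best

config_8b = pick_gpu_config(19.2)    # single RTX 4090 suffices
config_70b = pick_gpu_config(168.0)  # multi-GPU RTX 5090 setup
```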
### Custom Capacity Configuration
Override the agent's automatic capacity estimates with your own values before deployment.
### Tool Modification
Edit tool arguments before execution:
```json
{
  "name": "meta-llama-llama-3-1-8b",
  "location": "uae",
  "config": "1x RTX4090"
}
```
### 💬 Interactive Gradio UI
Beautiful, responsive interface:
- Real-time chat interaction
- Tool approval panels
- Capacity configuration editor
- Session management
### ⚡ Real-time Processing
Fast and responsive:
- Async API with FastAPI
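The concurrency benefit of the async API can be illustrated with plain asyncio (a sketch: the stub coroutine stands in for the FastAPI endpoint handlers, and its name is ours):

```python
import asyncio

async def handle_deploy(model_id: str) -> str:
    """Stand-in for an async endpoint handler; in the real app this
    awaits MCP tool calls and LLM inference instead of sleeping."""
    await asyncio.sleep(0)  # yield so other sessions make progress
    return f"deployed:{model_id}"

async def main() -> list[str]:
    # Multiple chat sessions are served concurrently, not queued.
    return await asyncio.gather(
        handle_deploy("meta-llama/Llama-3.1-8B"),
        handle_deploy("mistralai/Mistral-7B-v0.3"),
    )

results = asyncio.run(main())
```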
---
## 🚀 Quick Start
### Deploy Your First Model
#### 1. **Simple Deployment**
```
Deploy meta-llama/Llama-3.1-8B
```
The agent will:
- Analyze the model (8B parameters, ~16GB VRAM needed)
- Recommend 1x RTX 4090
- Generate vLLM configuration
- Provision infrastructure
- Provide deployment commands
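The final step emits a launch command for the instance. A minimal builder might look like this (the flags shown are standard vLLM CLI options; ComputeAgent's generated command may carry additional tuning flags):

```python
def build_vllm_command(model_id: str, tensor_parallel: int = 1,
                       port: int = 8000) -> str:
    """Assemble a `vllm serve` command for the provisioned node."""
    return " ".join([
        "vllm serve", model_id,
        f"--tensor-parallel-size {tensor_parallel}",
        f"--port {port}",
    ])

cmd = build_vllm_command("meta-llama/Llama-3.1-8B")
```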
---
## 📚 Learning Resources
### Understanding MCP
- [Model Context Protocol Specification](https://modelcontextprotocol.io/)
- [MCP Documentation](https://github.com/modelcontextprotocol)
### LangGraph & Agents
- [LangGraph Documentation](https://langchain-ai.github.io/langgraph/)
- [Building Agentic Systems](https://python.langchain.com/docs/modules/agents/)
### vLLM Deployment
- [vLLM Documentation](https://docs.vllm.ai/)
- [Optimizing Inference](https://docs.vllm.ai/en/latest/serving/performance.html)
---
## 👥 Team
**Team Name:** Hivenet AI Team
**Team Members:**
- **Igor Carrara** - [@carraraig](https://huggingface.co/carraraig) - AI Scientist
- **Mamoutou Diarra** - [@mdiarra](https://huggingface.co/mdiarra) - AI Scientist
---
## 🏆 Hackathon Context
Created for the **MCP 1st Birthday Hackathon**, celebrating the first anniversary of the Model Context Protocol with innovative AI applications that demonstrate the power of standardized tool-use and agent orchestration.
### Why This Matters
ComputeAgent showcases:
- ✅ **MCP's power** for building production-grade agents
- ✅ **Human-in-the-loop** design for responsible AI
- ✅ **Real-world utility** solving actual deployment pain points
- ✅ **Open-source first** approach with accessible technology
---
## 🤝 Contributing
We welcome contributions! Here's how you can help:
### Areas for Contribution
- 🐛 **Bug fixes** - Report and fix issues
- ✨ **New features** - Add support for more models or GPUs
- 📝 **Documentation** - Improve guides and examples
- 🧪 **Testing** - Add test coverage
- 🎨 **UI/UX** - Enhance the interface
---
## 📄 License
Apache 2.0
---
## 🌐 About Hivenet & HiveCompute
### What is Hivenet?
Hivenet provides secure, sustainable cloud storage and computing through a distributed network, utilizing unused computing power from devices worldwide rather than relying on massive data centers. This approach makes cloud computing more efficient, affordable, and environmentally friendly.
### HiveCompute: Distributed GPU Cloud
**Compute with Hivenet** is a revolutionary GPU cloud computing platform that democratizes access to high-performance computing resources.
#### 🎯 Key Features
**🚀 High-Performance GPUs**
- Instant access to dedicated GPU nodes powered by RTX 4090 and RTX 5090
- Performance that matches or exceeds traditional data center GPUs
- Perfect for AI inference, training, rendering, and scientific computing
**💰 Transparent & Affordable Pricing**
- Per-second billing with up to 58% savings compared to GCP, AWS, and Azure
- No hidden egress fees or long-term commitments
- Pay only for what you use with prepaid credits
**🌍 Global Infrastructure**
- GPU clusters run locally in the UAE, France, and the USA for lower latency and tighter compliance
- Built-in GDPR compliance
- Data stays local for faster AI model responses
**♻️ Sustainable Computing**
- Uses unused computing power from devices worldwide instead of power-hungry data centers
- Reduces carbon footprint by up to 77% compared to traditional cloud services
- Community-driven distributed infrastructure
- Utilizes existing, underutilized hardware
- Reduces the need for new data center construction
**⚡ Instant Setup**
- Launch GPU instances in seconds
- Pre-configured templates for popular frameworks
- Jupyter notebooks and SSH access included
- Pause/resume instances without losing your setup
**🔒 Enterprise-Grade Reliability**
- Workloads automatically replicate across trusted nodes, keeping downtime near-zero
- Hive-Certified providers with 99.9% uptime SLA
- Tier-3 data center equivalent quality
Learn more at [compute.hivenet.com](https://compute.hivenet.com/)
---
## 💬 Support
### Need Help?
- 📚 **Documentation** - Check this README and inline code comments
- 📧 **Email** - Contact the Hivenet team
---
<div align="center">
**Built with ❤️ by the Hivenet Team**
*Making large-scale AI deployment accessible to everyone.*
</div>