---
title: ComputeAgent - Hivenet AI Deployment
emoji: 🚀
colorFrom: yellow
colorTo: red
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
short_description: AI-Powered Deployment using MCP of Compute by Hivenet
tags:
  - mcp-in-action-track-enterprise
  - mcp-in-action-track-consumer
  - mcp-in-action-track-creative
---

# 🚀 ComputeAgent - Autonomous AI Deployment via MCP

**An Intelligent Multi-Agent System for Zero-Friction Model Deployment on HiveCompute**

🔗 **Hackathon Entry:** [Agents & MCP Hackathon – Winter 2025 (Track 2: Agentic Applications)](https://huggingface.co/Agents-MCP-Hackathon-Winter25#-track-2-agentic-applications)

---

## 🎯 Overview

ComputeAgent transforms the complex process of deploying large-scale AI models into a single natural-language command. Built for the **MCP 1st Birthday Hackathon**, this autonomous system leverages the Model Context Protocol (MCP) to deploy any Hugging Face model onto HiveCompute infrastructure with zero manual configuration.

**What once required hours of DevOps work now takes seconds.**

Simply say: *"Deploy meta-llama/Llama-3.1-70B"*, and ComputeAgent handles everything: capacity estimation, infrastructure provisioning, vLLM configuration, and deployment execution.

---

## 🔮 Live Demo

Try the chatbot: **[https://huggingface.co/spaces/MCP-1st-Birthday/Hivenet_ComputeAgent](https://huggingface.co/spaces/MCP-1st-Birthday/Hivenet_ComputeAgent)**

---

## 📹 Preview

<video controls autoplay src="https://huggingface.co/spaces/MCP-1st-Birthday/Hivenet_ComputeAgent/resolve/main/Demo_Final.mp4"></video>

---

## 📹 LinkedIn Post

LinkedIn post: **[https://www.linkedin.com/feed/update/urn:li:activity:7400886621627285505/](https://www.linkedin.com/feed/update/urn:li:activity:7400886621627285505/)**

---

## 💡 The Problem

Deploying AI models at scale remains frustratingly manual and error-prone:

- ❌ **Manual capacity planning** - Calculating GPU memory requirements for each model
- ❌ **Complex infrastructure setup** - SSH keys, networking, environment dependencies
- ❌ **Inference server configuration** - vLLM, TensorRT-LLM parameter tuning
- ❌ **Trial-and-error debugging** - Hours spent troubleshooting deployment issues
- ❌ **High barrier to entry** - Requires DevOps expertise that many researchers lack

This friction slows innovation and makes large-model deployment inaccessible to many teams.

---

## ✨ Our Solution

ComputeAgent introduces **autonomous compute orchestration** through a multi-agent MCP architecture that thinks, plans, and acts on your behalf:

### The Workflow

1. **🤖 Natural Language Interface** - Chat with the agent to deploy models
2. **🧠 Intelligent Analysis** - Automatically estimates GPU requirements from model architecture
3. **⚡ Automated Provisioning** - Spins up HiveCompute instances via MCP
4. **🔧 Smart Configuration** - Generates optimized vLLM commands
5. **✅ Human-in-the-Loop** - Review and approve each step with modification capabilities
6. **🎯 One-Click Deployment** - From request to running endpoint in minutes

**Powered entirely by open-source models** (GPT-OSS-20B orchestrator) running on HiveCompute infrastructure.
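
The plan-then-approve loop described above can be sketched in a few lines. This is an illustrative simplification, not the project's actual code: the step names, `build_plan`, and `run_plan` are hypothetical, and the real agent derives its plan from the conversation rather than a fixed list.

```python
from dataclasses import dataclass

@dataclass
class Step:
    """One proposed action in the deployment plan (hypothetical structure)."""
    name: str
    approved: bool = False

def build_plan(model_id: str) -> list[Step]:
    # Turn a deployment request into an ordered list of proposed actions.
    return [
        Step(f"estimate capacity for {model_id}"),
        Step("provision HiveCompute instance"),
        Step(f"generate vLLM config for {model_id}"),
        Step("execute deployment"),
    ]

def run_plan(plan: list[Step], approve) -> list[str]:
    # Execute each step only after the human-in-the-loop callback approves it.
    executed = []
    for step in plan:
        step.approved = approve(step)
        if step.approved:
            executed.append(step.name)
    return executed

# Example: approve everything, as an "Approve All" action would.
plan = build_plan("meta-llama/Llama-3.1-70B")
done = run_plan(plan, approve=lambda s: True)
```

Rejecting a step is just `approve` returning `False` for it, which keeps the approval logic in one place.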

---

## 🎮 Key Features

### 🤖 Conversational Deployment
Deploy any Hugging Face model through natural language:
```
"Deploy meta-llama/Llama-3.1-70B on RTX 5090 in France"
"I need Mistral-7B with low latency"
"Deploy GPT-OSS-20B for production"
```

### 🔧 Tool Approval System
Complete control with human-in-the-loop oversight:
- **✅ Approve All** - Execute all proposed tools
- **❌ Reject All** - Skip tool execution and get alternative responses
- **🔧 Selective Approval** - Choose specific tools (e.g., "1,3,5")
- **📝 Modify Arguments** - Edit parameters before execution
- **🔄 Re-Reasoning** - Provide feedback for agent reconsideration
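
A minimal sketch of how a selective-approval reply like `"1,3,5"` could be mapped onto the proposed tool list (1-indexed, as in the UI). `select_tools` and the tool names are hypothetical, not the app's real parser:

```python
def select_tools(reply: str, proposed: list[str]) -> list[str]:
    """Return the proposed tools whose 1-based index appears in the reply."""
    indices = {int(part) for part in reply.split(",") if part.strip().isdigit()}
    return [tool for i, tool in enumerate(proposed, start=1) if i in indices]

# Illustrative tool names for a deployment run.
proposed = ["estimate_capacity", "create_instance", "generate_vllm_command",
            "open_firewall_port", "start_deployment"]
print(select_tools("1,3,5", proposed))
# → ['estimate_capacity', 'generate_vllm_command', 'start_deployment']
```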

### 📊 Automatic Capacity Estimation
Intelligent resource planning:
- Calculates GPU memory from model architecture
- Recommends optimal GPU types and quantities
- Considers tensor parallelism and quantization
- Accounts for KV cache and activation memory
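
A back-of-the-envelope version of this estimate, under common rule-of-thumb assumptions: weights take `params × bytes_per_param`, with a flat overhead factor standing in for KV cache and activations. The real agent derives these numbers from the model's architecture; the function names and the 1.2 factor here are illustrative.

```python
import math

def estimate_vram_gb(params_b: float, bytes_per_param: float = 2.0,
                     overhead: float = 1.2) -> float:
    """Rough VRAM need in GB: FP16 weights (2 bytes/param) plus ~20% overhead."""
    return params_b * bytes_per_param * overhead

def gpus_needed(vram_gb: float, gpu_vram_gb: float = 24.0) -> int:
    """Smallest GPU count covering the estimate (default: RTX 4090, 24 GB)."""
    return max(1, math.ceil(vram_gb / gpu_vram_gb))

# Llama-3.1-8B in FP16: ~19.2 GB → fits on one 24 GB RTX 4090.
print(gpus_needed(estimate_vram_gb(8)))   # → 1
# Llama-3.1-70B in FP16: ~168 GB → needs multi-GPU tensor parallelism.
print(gpus_needed(estimate_vram_gb(70)))  # → 7
```

Quantization enters the same formula by lowering `bytes_per_param` (e.g. ~1.0 for INT8, ~0.5 for 4-bit).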

### 🌍 Multi-Location Support
Deploy across global regions:
- 🇫🇷 **France**
- 🇦🇪 **UAE**
- 🇺🇸 **Texas**

### 🎯 GPU Selection
Support for latest hardware:
- NVIDIA RTX 4090 (24GB VRAM)
- NVIDIA RTX 5090 (32GB VRAM)
- Multi-GPU configurations
- Automatic tensor parallelism setup

### Custom Capacity Configuration

Override the automatic capacity estimates with your own values.

### Tool Modification

Edit tool arguments before execution:
```json
{
  "name": "meta-llama-llama-3-1-8b",
  "location": "uae",
  "config": "1x RTX4090"
}
```

### 💬 Interactive Gradio UI
Beautiful, responsive interface:
- Real-time chat interaction
- Tool approval panels
- Capacity configuration editor
- Session management

### ⚡ Real-time Processing
Fast and responsive:
- Async API with FastAPI

---

## 🚀 Quick Start

### Deploy Your First Model

#### 1. **Simple Deployment**
```
Deploy meta-llama/Llama-3.1-8B
```

The agent will:
- Analyze the model (8B parameters, ~16GB VRAM needed)
- Recommend 1x RTX 4090
- Generate vLLM configuration
- Provision infrastructure
- Provide deployment commands
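
The "generate vLLM configuration" step could reduce to assembling a serve command from the capacity decision. `--tensor-parallel-size` and `--port` are standard vLLM CLI flags; the builder function and defaults below are illustrative, not the agent's actual output:

```python
def vllm_command(model_id: str, tensor_parallel: int = 1, port: int = 8000) -> str:
    """Build a vLLM serve command for the chosen model and GPU layout."""
    parts = [
        "vllm", "serve", model_id,
        "--tensor-parallel-size", str(tensor_parallel),  # GPUs to shard across
        "--port", str(port),                             # OpenAI-compatible API port
    ]
    return " ".join(parts)

print(vllm_command("meta-llama/Llama-3.1-8B"))
# → vllm serve meta-llama/Llama-3.1-8B --tensor-parallel-size 1 --port 8000
```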

---

## 🎓 Learning Resources

### Understanding MCP
- [Model Context Protocol Specification](https://modelcontextprotocol.io/)
- [MCP Documentation](https://github.com/modelcontextprotocol)

### LangGraph & Agents
- [LangGraph Documentation](https://langchain-ai.github.io/langgraph/)
- [Building Agentic Systems](https://python.langchain.com/docs/modules/agents/)

### vLLM Deployment
- [vLLM Documentation](https://docs.vllm.ai/)
- [Optimizing Inference](https://docs.vllm.ai/en/latest/serving/performance.html)

---

## 👥 Team

**Team Name:** Hivenet AI Team

**Team Members:**
- **Igor Carrara** - [@carraraig](https://huggingface.co/carraraig) - AI Scientist
- **Mamoutou Diarra** - [@mdiarra](https://huggingface.co/mdiarra) - AI Scientist

---

## 🎉 Hackathon Context

Created for the **MCP 1st Birthday Hackathon**, celebrating the first anniversary of the Model Context Protocol with innovative AI applications that demonstrate the power of standardized tool use and agent orchestration.

### Why This Matters

ComputeAgent showcases:
- ✅ **MCP's power** for building production-grade agents
- ✅ **Human-in-the-loop** design for responsible AI
- ✅ **Real-world utility** solving actual deployment pain points
- ✅ **Open-source first** approach with accessible technology

---

## 🤝 Contributing

We welcome contributions! Here's how you can help:

### Areas for Contribution
- 🐛 **Bug fixes** - Report and fix issues
- ✨ **New features** - Add support for more models or GPUs
- 📚 **Documentation** - Improve guides and examples
- 🧪 **Testing** - Add test coverage
- 🎨 **UI/UX** - Enhance the interface

---

## 📄 License

Apache 2.0

---

## ๐ŸŒ About Hivenet & HiveCompute

### What is Hivenet?

Hivenet provides secure, sustainable cloud storage and computing through a distributed network, utilizing unused computing power from devices worldwide rather than relying on massive data centers. This approach makes cloud computing more efficient, affordable, and environmentally friendly.

### HiveCompute: Distributed GPU Cloud

**Compute with Hivenet** is a revolutionary GPU cloud computing platform that democratizes access to high-performance computing resources.

#### 🎯 Key Features

**🚀 High-Performance GPUs**
- Instant access to dedicated GPU nodes powered by RTX 4090 and RTX 5090
- Performance that matches or exceeds traditional data center GPUs
- Perfect for AI inference, training, rendering, and scientific computing

**💰 Transparent & Affordable Pricing**
- Per-second billing with up to 58% savings compared to GCP, AWS, and Azure
- No hidden egress fees or long-term commitments
- Pay only for what you use with prepaid credits

**🌍 Global Infrastructure**
- GPU clusters run locally in the UAE, France, and the USA for lower latency and tighter compliance
- Built-in GDPR compliance
- Data stays local for faster AI model responses

**♻️ Sustainable Computing**
- Uses unused computing power from devices worldwide instead of power-hungry data centers
- Reduces carbon footprint by up to 77% compared to traditional cloud services
- Community-driven distributed infrastructure
- Utilizes existing, underutilized hardware
- Reduces the need for new data center construction

**⚡ Instant Setup**
- Launch GPU instances in seconds
- Pre-configured templates for popular frameworks
- Jupyter notebooks and SSH access included
- Pause/resume instances without losing your setup

**🔒 Enterprise-Grade Reliability**
- Workloads automatically replicate across trusted nodes, keeping downtime near-zero
- Hive-Certified providers with 99.9% uptime SLA
- Tier-3 data center equivalent quality

Learn more at [compute.hivenet.com](https://compute.hivenet.com/)

---

## 💬 Support

### Need Help?

- 📖 **Documentation** - Check this README and inline code comments
- 📧 **Email** - Contact the Hivenet team

---

<div align="center">

**Built with ❤️ by the Hivenet Team**

*Making large-scale AI deployment accessible to everyone.*

</div>