---
title: ComputeAgent - Hivenet AI Deployment
emoji: 🚀
colorFrom: yellow
colorTo: red
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
short_description: AI-powered deployment on Compute with Hivenet via MCP
tags:
  - mcp-in-action-track-enterprise
  - mcp-in-action-track-consumer
  - mcp-in-action-track-creative
---

🚀 ComputeAgent - Autonomous AI Deployment via MCP

An Intelligent Multi-Agent System for Zero-Friction Model Deployment on HiveCompute

🔗 Hackathon Entry: Agents & MCP Hackathon – Winter 2025 (Track 2: Agentic Applications)


🎯 Overview

ComputeAgent transforms the complex process of deploying large-scale AI models into a single natural-language command. Built for the MCP 1st Birthday Hackathon, this autonomous system leverages the Model Context Protocol (MCP) to deploy any Hugging Face model onto HiveCompute infrastructure with zero manual configuration.

What once required hours of DevOps work now takes seconds.

Simply say "Deploy meta-llama/Llama-3.1-70B" and ComputeAgent handles everything: capacity estimation, infrastructure provisioning, vLLM configuration, and deployment execution.


🔮 Live Demo

Try the chatbot: https://huggingface.co/spaces/MCP-1st-Birthday/Hivenet_ComputeAgent


📹 LinkedIn Post

LinkedIn post: https://www.linkedin.com/feed/update/urn:li:activity:7400886621627285505/


💡 The Problem

Deploying AI models at scale remains frustratingly manual and error-prone:

  • โŒ Manual capacity planning - Calculating GPU memory requirements for each model
  • โŒ Complex infrastructure setup - SSH keys, networking, environment dependencies
  • โŒ Inference server configuration - vLLM, TensorRT-LLM parameter tuning
  • โŒ Trial-and-error debugging - Hours spent troubleshooting deployment issues
  • โŒ High barrier to entry - Requires DevOps expertise that many researchers lack

This friction slows innovation and makes large-model deployment inaccessible to many teams.


✨ Our Solution

ComputeAgent introduces autonomous compute orchestration through a multi-agent MCP architecture that thinks, plans, and acts on your behalf:

The Workflow

  1. 🤖 Natural Language Interface - Chat with the agent to deploy models
  2. 🧠 Intelligent Analysis - Automatically estimates GPU requirements from model architecture
  3. ⚡ Automated Provisioning - Spins up HiveCompute instances via MCP
  4. 🔧 Smart Configuration - Generates optimized vLLM commands
  5. ✅ Human-in-the-Loop - Review and approve each step with modification capabilities
  6. 🎯 One-Click Deployment - From request to running endpoint in minutes

Powered entirely by open-source models (GPT-OSS-20B orchestrator) running on HiveCompute infrastructure.
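
To make the workflow concrete, here is a minimal sketch of how an agent can drive a deployment over MCP with the official `mcp` Python SDK. The server script name and the tool names (`estimate_capacity`, `deploy_model`) are hypothetical placeholders for illustration, not ComputeAgent's actual interface.

```python
# Illustrative only: assumes a HiveCompute MCP server script and hypothetical tool names.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

SERVER = StdioServerParameters(command="python", args=["hivecompute_mcp_server.py"])

async def deploy(model_id: str, location: str) -> None:
    async with stdio_client(SERVER) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Discover the tools the MCP server exposes.
            tools = await session.list_tools()
            print("Available tools:", [t.name for t in tools.tools])
            # Estimate capacity for the requested model (hypothetical tool name).
            estimate = await session.call_tool("estimate_capacity", {"model_id": model_id})
            # Human-in-the-loop: show the plan and wait for explicit approval.
            if input(f"Provision {estimate.content}? [y/N] ").strip().lower() != "y":
                return
            # Provision the instance and deploy (hypothetical tool name).
            result = await session.call_tool("deploy_model", {"model_id": model_id, "location": location})
            print(result.content)

if __name__ == "__main__":
    asyncio.run(deploy("meta-llama/Llama-3.1-8B", "france"))
```

In the real system the GPT-OSS-20B orchestrator decides which tools to call; the approval prompt above stands in for the richer approval panel described under Key Features.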


🎮 Key Features

🤖 Conversational Deployment

Deploy any Hugging Face model through natural language:

"Deploy meta-llama/Llama-3.1-70B on RTX 5090 in France"
"I need Mistral-7B with low latency"
"Deploy GPT-OSS-20B for production"

🔧 Tool Approval System

Complete control with human-in-the-loop oversight:

  • ✅ Approve All - Execute all proposed tools
  • ❌ Reject All - Skip tool execution and get alternative responses
  • 🔧 Selective Approval - Choose specific tools (e.g., "1,3,5"; see the parsing sketch below)
  • 📝 Modify Arguments - Edit parameters before execution
  • 🔄 Re-Reasoning - Provide feedback for agent reconsideration
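
As a rough illustration of how a reply such as "1,3,5" could be interpreted, the snippet below filters the proposed tool calls by 1-based index. It is a sketch of the idea, not the project's actual parser; the helper name and tool names are assumptions.

```python
# Hypothetical helper: map an approval reply onto the proposed tool calls.
def select_approved_tools(reply: str, proposed: list[dict]) -> list[dict]:
    reply = reply.strip().lower()
    if reply in ("approve all", "all", "y"):
        return proposed                      # Approve All
    if reply in ("reject all", "none", "n"):
        return []                            # Reject All
    # Selective approval: comma-separated 1-based indices, e.g. "1,3,5".
    chosen = {int(tok) for tok in reply.split(",") if tok.strip().isdigit()}
    return [tool for i, tool in enumerate(proposed, start=1) if i in chosen]

proposed = [
    {"name": "estimate_capacity"},
    {"name": "create_instance"},
    {"name": "deploy_model"},
]
print([t["name"] for t in select_approved_tools("1,3", proposed)])
# -> ['estimate_capacity', 'deploy_model']
```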

📊 Automatic Capacity Estimation

Intelligent resource planning (a simplified version of the calculation is sketched below):

  • Calculates GPU memory from model architecture
  • Recommends optimal GPU types and quantities
  • Considers tensor parallelism and quantization
  • Accounts for KV cache and activation memory
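
For intuition, the back-of-the-envelope calculation below shows the kind of estimate involved. The constants (KV-cache budget, overhead fraction) are illustrative assumptions; the agent's real estimator accounts for the actual architecture, context length, and quantization.

```python
# Simplified VRAM estimate: weights + KV cache, plus a safety margin (assumed values).
def estimate_vram_gb(n_params_billion: float, bytes_per_param: float = 2.0,
                     kv_cache_gb: float = 2.0, overhead_fraction: float = 0.2) -> float:
    weights_gb = n_params_billion * bytes_per_param   # bf16/fp16 = 2 bytes per parameter
    return (weights_gb + kv_cache_gb) * (1 + overhead_fraction)

print(estimate_vram_gb(8))                    # Llama-3.1-8B in bf16  -> ~21.6 GB (fits 1x RTX 4090)
print(estimate_vram_gb(70, kv_cache_gb=8.0))  # Llama-3.1-70B in bf16 -> ~177.6 GB (needs multi-GPU)
```

Quantizing to int8 or 4-bit lowers `bytes_per_param` and can bring a model into range for fewer or smaller GPUs.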

๐ŸŒ Multi-Location Support

Deploy across global regions:

  • 🇫🇷 France
  • 🇦🇪 UAE
  • 🇺🇸 Texas

🎯 GPU Selection

Support for the latest hardware (a selection heuristic is sketched below):

  • NVIDIA RTX 4090 (24GB VRAM)
  • NVIDIA RTX 5090 (32GB VRAM)
  • Multi-GPU configurations
  • Automatic tensor parallelism setup
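
The heuristic below illustrates how a VRAM requirement can be mapped to a GPU count and a matching tensor-parallel degree. It is an assumed simplification, not the agent's exact logic.

```python
# Assumed heuristic: cover the VRAM requirement with a power-of-two GPU count,
# since vLLM's tensor parallelism shards the model evenly across GPUs.
import math

GPU_VRAM_GB = {"RTX 4090": 24, "RTX 5090": 32}

def pick_gpu_config(required_vram_gb: float, gpu: str = "RTX 5090") -> dict:
    count = max(1, math.ceil(required_vram_gb / GPU_VRAM_GB[gpu]))
    count = 2 ** math.ceil(math.log2(count))          # round up to a power of two
    return {"gpu": gpu, "count": count, "tensor_parallel_size": count}

print(pick_gpu_config(22))   # -> {'gpu': 'RTX 5090', 'count': 1, 'tensor_parallel_size': 1}
print(pick_gpu_config(178))  # -> {'gpu': 'RTX 5090', 'count': 8, 'tensor_parallel_size': 8}
```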

Custom Capacity Configuration

Override the agent's automatic estimates with your own GPU configuration when needed.

Tool Modification

Edit tool arguments before execution:

{
  "name": "meta-llama-llama-3-1-8b",
  "location": "uae",
  "config": "1x RTX4090"
}

💬 Interactive Gradio UI

Beautiful, responsive interface (see the minimal sketch below):

  • Real-time chat interaction
  • Tool approval panels
  • Capacity configuration editor
  • Session management
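
A stripped-down version of such a chat UI can be expressed in a few lines of Gradio; the real Space layers the tool-approval panels, capacity editor, and session state on top of this.

```python
# Minimal Gradio chat sketch (illustrative; respond() is a placeholder for the agent call).
import gradio as gr

def respond(message: str, history: list) -> str:
    # The real app forwards the message to the orchestrator and returns tool proposals/results.
    return f"Planning deployment for: {message}"

demo = gr.ChatInterface(respond, title="ComputeAgent")

if __name__ == "__main__":
    demo.launch(server_name="0.0.0.0", server_port=7860)  # matches app_port in the metadata
```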

⚡ Real-time Processing

Fast and responsive:

  • Async API with FastAPI (a minimal endpoint sketch follows)
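
A minimal async endpoint in that style might look like the sketch below; the route name and payload shape are assumptions for illustration, not the project's actual API.

```python
# Illustrative async FastAPI endpoint; not the project's actual routes.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="ComputeAgent API")

class ChatRequest(BaseModel):
    session_id: str
    message: str

@app.post("/chat")
async def chat(req: ChatRequest) -> dict:
    # The real handler awaits the agent here, so one slow deployment request
    # does not block other sessions.
    return {"session_id": req.session_id, "reply": f"Received: {req.message}"}

# Run with: uvicorn app:app --host 0.0.0.0 --port 7860
```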

🚀 Quick Start

Deploy Your First Model

1. Simple Deployment

Deploy meta-llama/Llama-3.1-8B

The agent will:

  • Analyze the model (8B parameters, ~16GB VRAM needed)
  • Recommend 1x RTX 4090
  • Generate vLLM configuration (an example command is shown below)
  • Provision infrastructure
  • Provide deployment commands
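
As an example of the final step, the agent might emit a vLLM launch command along these lines; the flag values shown are illustrative defaults, not the exact configuration ComputeAgent generates.

```python
# Illustrative example of a generated vLLM serve command (values are assumptions).
import shlex

vllm_cmd = [
    "vllm", "serve", "meta-llama/Llama-3.1-8B",
    "--dtype", "bfloat16",
    "--tensor-parallel-size", "1",      # a single RTX 4090 covers an 8B model
    "--gpu-memory-utilization", "0.90",
    "--max-model-len", "8192",
    "--host", "0.0.0.0",
    "--port", "8000",
]
print("Generated command:", shlex.join(vllm_cmd))
# The agent would then run this command on the provisioned HiveCompute instance.
```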

🎓 Learning Resources

  • Understanding MCP
  • LangGraph & Agents
  • vLLM Deployment


👥 Team

Team Name: Hivenet AI Team

Team Members:


🎉 Hackathon Context

Created for the MCP 1st Birthday Hackathon, celebrating the first anniversary of the Model Context Protocol with innovative AI applications that demonstrate the power of standardized tool-use and agent orchestration.

Why This Matters

ComputeAgent showcases:

  • ✅ MCP's power for building production-grade agents
  • ✅ Human-in-the-loop design for responsible AI
  • ✅ Real-world utility solving actual deployment pain points
  • ✅ Open-source first approach with accessible technology

๐Ÿค Contributing

We welcome contributions! Here's how you can help:

Areas for Contribution

  • ๐Ÿ› Bug fixes - Report and fix issues
  • โœจ New features - Add support for more models or GPUs
  • ๐Ÿ“š Documentation - Improve guides and examples
  • ๐Ÿงช Testing - Add test coverage
  • ๐ŸŽจ UI/UX - Enhance the interface

📄 License

Apache 2.0



๐ŸŒ About Hivenet & HiveCompute

What is Hivenet?

Hivenet provides secure, sustainable cloud storage and computing through a distributed network, utilizing unused computing power from devices worldwide rather than relying on massive data centers. This approach makes cloud computing more efficient, affordable, and environmentally friendly.

HiveCompute: Distributed GPU Cloud

Compute with Hivenet is a distributed GPU cloud computing platform that broadens access to high-performance computing resources.

🎯 Key Features

🚀 High-Performance GPUs

  • Instant access to dedicated GPU nodes powered by RTX 4090 and RTX 5090
  • Performance that matches or exceeds traditional data center GPUs
  • Perfect for AI inference, training, rendering, and scientific computing

💰 Transparent & Affordable Pricing

  • Per-second billing with up to 58% savings compared to GCP, AWS, and Azure
  • No hidden egress fees or long-term commitments
  • Pay only for what you use with prepaid credits

๐ŸŒ Global Infrastructure

  • GPU clusters run locally in the UAE, France, and the USA for lower latency and tighter compliance
  • Built-in GDPR compliance
  • Data stays local for faster AI model responses

โ™ป๏ธ Sustainable Computing

  • Uses unused computing power from devices worldwide instead of power-hungry data centers
  • Reduces carbon footprint by up to 77% compared to traditional cloud services
  • Community-driven distributed infrastructure
  • Utilizes existing, underutilized hardware
  • Reduces the need for new data center construction

⚡ Instant Setup

  • Launch GPU instances in seconds
  • Pre-configured templates for popular frameworks
  • Jupyter notebooks and SSH access included
  • Pause/resume instances without losing your setup

🔒 Enterprise-Grade Reliability

  • Workloads automatically replicate across trusted nodes, keeping downtime near-zero
  • Hive-Certified providers with 99.9% uptime SLA
  • Tier-3 data center equivalent quality

Learn more at compute.hivenet.com


💬 Support

Need Help?

  • 📖 Documentation - Check this README and inline code comments
  • 📧 Email - Contact the Hivenet team

Built with ❤️ by the Hivenet Team

Making large-scale AI deployment accessible to everyone.