# MarioGPT System Architecture

## 🏗️ Architecture Overview

```
┌─────────────────────────────────────────────────────────────────┐
│                        MarioGPT System                           │
└─────────────────────────────────────────────────────────────────┘

┌──────────────────┐         ┌──────────────────┐
│  User Interfaces │         │   AI Assistants  │
├──────────────────┤         ├──────────────────┤
│ • Web Browser    │         │ • HuggingChat    │
│ • Mobile Browser │         │ • Claude Desktop │
│ • Desktop App    │         │ • Other MCP      │
└────────┬─────────┘         └────────┬─────────┘
         │                            │
         │ HTTP/WebSocket             │ MCP Protocol (stdio)
         │                            │
         ▼                            ▼
┌─────────────────┐         ┌──────────────────┐
│  Gradio App     │         │   MCP Server     │
│  (app.py)       │         │  (mcp_server.py) │
├─────────────────┤         ├──────────────────┤
│ • Web UI        │         │ • Tool Registry  │
│ • FastAPI       │         │ • Async Handler  │
│ • File Serving  │         │ • Validation     │
└────────┬────────┘         └────────┬─────────┘
         │                           │
         │                           │
         └───────────┬───────────────┘
                     │
                     ▼
         ┌───────────────────────┐
         │   MarioGPT Core       │
         │   (supermariogpt/)    │
         ├───────────────────────┤
         │ • MarioLM             │
         │ • Dataset             │
         │ • Prompter            │
         │ • Utils               │
         └───────────┬───────────┘
                     │
         ┌───────────┴───────────┐
         │                       │
         ▼                       ▼
┌─────────────────┐   ┌──────────────────┐
│  GPT-2 Model    │   │  Level Renderer  │
│  (transformers) │   │  (PIL/Pillow)    │
├─────────────────┤   ├──────────────────┤
│ • Text Gen      │   │ • PNG Generation │
│ • Tokenization  │   │ • Tile Mapping   │
│ • Sampling      │   │ • Visualization  │
└────────┬────────┘   └────────┬─────────┘
         │                     │
         │                     │
         ▼                     ▼
┌─────────────────────────────────┐
│        Output Artifacts         │
├─────────────────────────────────┤
│ • Level Text (ASCII)            │
│ • Level Image (PNG)             │
│ • Playable HTML (CheerpJ)       │
└─────────────────────────────────┘
```

## 🔄 Data Flow Diagrams

### Gradio Web Interface Flow

```
User Input
    │
    ├─ Compose Prompt (Radio buttons)
    │  └─ Pipes + Enemies + Blocks + Elevation
    │
    ├─ Type Prompt (Text field)
    │  └─ Custom text description
    │
    └─ Advanced Settings
       ├─ Temperature (0.1-2.0)
       └─ Level Size (100-2799)
    │
    ▼
Validation & Processing
    │
    ├─ Validate temperature range
    ├─ Validate level size range
    └─ Format prompt string
    │
    ▼
MarioLM.sample()
    │
    ├─ Tokenize prompt
    ├─ Generate tokens (GPT-2)
    └─ Decode to level format
    │
    ▼
Post-Processing
    │
    ├─ convert_level_to_png()
    │  └─ Create visual representation
    │
    └─ make_html_file()
       └─ Create playable demo
    │
    ▼
Output to User
    │
    ├─ Display PNG image
    └─ Embed playable iframe
```
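The "Compose Prompt" and "Format prompt string" steps can be sketched as below. This is a minimal illustration, not the code in `app.py`: the function names, the `"<amount> <feature>"` phrasing, and the rule that a typed prompt takes precedence over the composed one are all assumptions.

```python
def compose_prompt(pipes: str, enemies: str, blocks: str, elevation: str) -> str:
    """Join the four radio-button selections into one prompt string.

    The "<amount> <feature>" wording is an assumed format for illustration.
    """
    return f"{pipes} pipes, {enemies} enemies, {blocks} blocks, {elevation} elevation"


def choose_prompt(typed: str, composed: str) -> str:
    """Use the typed prompt when it is non-empty, else the composed one.

    This precedence rule is an assumption; the flow above does not state it.
    """
    typed = typed.strip()
    return typed if typed else composed


composed = compose_prompt("many", "some", "little", "high")
prompt = choose_prompt("", composed)
```

With an empty text field, `prompt` is the composed string built from the radio buttons.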

### MCP Server Flow

```
HuggingChat Request
    │
    ▼
MCP Protocol (JSON-RPC)
    │
    ├─ tools/list
    │  └─ Return available tools
    │
    └─ tools/call
       ├─ Tool: generate_mario_level
       │  └─ Parameters: prompt, temperature, level_size
       │
       └─ Tool: get_level_suggestions
          └─ No parameters
    │
    ▼
Parameter Validation (Pydantic)
    │
    ├─ Check types
    ├─ Validate ranges
    └─ Apply defaults
    │
    ▼
Lazy Model Initialization
    │
    ├─ Check if model loaded
    └─ Load if needed (first use)
    │
    ▼
Generate Level
    │
    ├─ MarioLM.sample()
    ├─ view_level() → Text
    └─ convert_level_to_png() → Image
    │
    ▼
Encode Response
    │
    ├─ Text as TextContent
    └─ Image as base64 + ImageContent
    │
    ▼
Return to HuggingChat
    │
    ├─ Display text description
    └─ Render image inline
```
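The "Lazy Model Initialization" step can be sketched with a small wrapper that defers loading until the first request. The `LazyModel` class and its `loader` argument are hypothetical names; the real `mcp_server.py` may organize this differently.

```python
import threading
from typing import Any, Callable, Optional


class LazyModel:
    """Load a model on first use and reuse it afterwards (illustrative only)."""

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader          # e.g. a function that builds MarioLM
        self._model: Optional[Any] = None
        self._lock = threading.Lock()  # guards against two concurrent first calls

    def get(self) -> Any:
        # Check, lock, then check again so the loader runs at most once.
        if self._model is None:
            with self._lock:
                if self._model is None:
                    self._model = self._loader()
        return self._model


load_calls = []
lazy = LazyModel(lambda: load_calls.append("load") or "fake-model")
first = lazy.get()   # triggers the loader
second = lazy.get()  # reuses the cached object
```

Starting the server stays cheap this way, and only the first tool call pays the model-loading cost.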

## 📦 Component Details

### Core Components

#### 1. MarioLM (Language Model)
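```python
class MarioLM:
    """Outline of the language-model wrapper (method bodies omitted)."""

    # Attributes
    #   model:     GPT-2 transformer
    #   tokenizer: custom Mario tokenizer
    #   device:    CUDA or CPU

    def sample(self, prompts, num_steps, temperature): ...
    def load_pretrained(self): ...
    def to(self, device): ...
```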

#### 2. Gradio Interface
```
Components:
- Radio buttons (pipes, enemies, blocks, elevation)
- Text input (custom prompts)
- Sliders (temperature, level_size)
- Button (generate)
- Image output (PNG preview)
- HTML output (playable demo)
```

#### 3. MCP Server
```
Tools:
- generate_mario_level(prompt, temp, size)
  → Returns: TextContent + ImageContent
  
- get_level_suggestions()
  → Returns: TextContent (examples)

Protocol: JSON-RPC 2.0
Transport: stdio (asyncio streams)
```
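The `TextContent + ImageContent` return value can be approximated with plain dictionaries. This sketch uses only the standard library rather than the `mcp` package's own types; the `encode_level_response` name is hypothetical, and the dictionary keys follow the MCP content shapes as far as I know them.

```python
import base64
from typing import Dict, List


def encode_level_response(level_text: str, png_bytes: bytes) -> List[Dict[str, str]]:
    """Build a text item and a base64-encoded image item for one level."""
    return [
        {"type": "text", "text": level_text},
        {
            "type": "image",
            # Binary PNG data has to travel as text inside a JSON message.
            "data": base64.b64encode(png_bytes).decode("ascii"),
            "mimeType": "image/png",
        },
    ]


# Placeholder bytes stand in for a real rendered PNG.
items = encode_level_response("----\nXXXX", b"\x89PNG placeholder")
```

The client decodes `data` back to bytes to render the image inline.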

#### 4. FastAPI Backend
```
Routes:
- / → Gradio app
- /static → Static file serving
- /gradio_api → Gradio API

Features:
- HTML file generation
- Static file hosting
- CORS handling
```

## 🔧 Technology Stack

### Backend
```
Python 3.8+
├── torch (Deep Learning)
├── transformers (GPT-2)
├── gradio (Web UI)
├── fastapi (API)
├── uvicorn (ASGI Server)
└── mcp (Model Context Protocol)
```

### Model
```
GPT-2 Architecture
├── Input: Text prompt
├── Process: Token generation
├── Output: Mario level tokens
└── Decode: ASCII level format
```

### Frontend
```
Gradio Components
├── HTML5
├── JavaScript
├── CheerpJ (Java → JavaScript)
└── WebSocket (real-time updates)
```

### Protocols
```
HTTP/HTTPS
├── REST API
└── WebSocket

MCP (Model Context Protocol)
├── JSON-RPC 2.0
├── stdio transport
└── Tool-based interface
```
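To make the tool-based JSON-RPC interface concrete, here is a minimal dispatcher for the two methods named in this document, `tools/list` and `tools/call`. It is a stand-alone sketch: `handle_request` and the tool descriptions are hypothetical, and a real server would delegate this work to the `mcp` library.

```python
import json

# Tool names come from this document; the descriptions are placeholders.
TOOLS = {
    "generate_mario_level": "Generate a level from a prompt",
    "get_level_suggestions": "Return example prompts",
}


def handle_request(raw: str) -> str:
    """Dispatch one JSON-RPC 2.0 request and return the serialized response."""
    request = json.loads(raw)
    response = {"jsonrpc": "2.0", "id": request.get("id")}
    method = request.get("method")
    params = request.get("params") or {}

    if method == "tools/list":
        response["result"] = {
            "tools": [{"name": n, "description": d} for n, d in TOOLS.items()]
        }
    elif method == "tools/call":
        name = params.get("name")
        if name in TOOLS:
            # A real server would run the tool here; this stub only echoes the name.
            response["result"] = {"content": [{"type": "text", "text": f"called {name}"}]}
        else:
            response["error"] = {"code": -32602, "message": f"Unknown tool: {name}"}
    else:
        response["error"] = {"code": -32601, "message": "Method not found"}
    return json.dumps(response)


listing = json.loads(handle_request('{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}'))
```

Over the stdio transport, each request and response is one such JSON message written to the process's standard streams.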

## 🚀 Deployment Architectures

### Architecture 1: HuggingFace Spaces

```
┌─────────────────────────────────┐
│     HuggingFace Spaces          │
├─────────────────────────────────┤
│                                 │
│  ┌──────────────────────────┐  │
│  │   Docker Container       │  │
│  │  ┌────────────────────┐  │  │
│  │  │   Gradio App       │  │  │
│  │  │   (app.py)         │  │  │
│  │  └──────┬─────────────┘  │  │
│  │         │                │  │
│  │  ┌──────▼─────────────┐  │  │
│  │  │   MarioGPT Model   │  │  │
│  │  │   (GPU: T4/A10G)   │  │  │
│  │  └────────────────────┘  │  │
│  └──────────────────────────┘  │
│                                 │
│  Storage: Persistent /static    │
└─────────────────────────────────┘
          │
          ▼
    Internet Users
```

### Architecture 2: MCP Server Integration

```
┌──────────────────────┐
│   HuggingChat UI     │
└──────────┬───────────┘
           │
           │ MCP Protocol
           ▼
┌──────────────────────┐
│   MCP Router         │
│   (HuggingChat)      │
└──────────┬───────────┘
           │
           │ stdio
           ▼
┌──────────────────────┐
│   mcp_server.py      │
│   (Your Machine)     │
└──────────┬───────────┘
           │
           │ Python API
           ▼
┌──────────────────────┐
│   MarioGPT Model     │
│   (Local GPU/CPU)    │
└──────────────────────┘
```

### Architecture 3: Hybrid Setup

```
┌─────────────────────────────────┐
│          Load Balancer          │
└───────────┬─────────────────────┘
            │
     ┌──────┴──────┐
     │             │
     ▼             ▼
┌─────────┐   ┌─────────┐
│ Gradio  │   │  MCP    │
│ Server  │   │ Server  │
│ (Web)   │   │ (API)   │
└────┬────┘   └────┬────┘
     │             │
     └──────┬──────┘
            ▼
    ┌──────────────┐
    │ Shared Model │
    │   Storage    │
    └──────────────┘
```

## 📊 Performance Characteristics

### Latency Breakdown

```
Total Generation Time: 5-10s (GPU) / 30-60s (CPU)

├── Model Loading: 2-3s (first time only)
├── Prompt Processing: <0.1s
├── Token Generation: 3-7s (GPU) / 25-55s (CPU)
├── Post-Processing: 0.5-1s
│   ├── Level rendering: 0.3s
│   └── PNG generation: 0.2s
└── File Writing: <0.1s
```
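The GPU figures can be cross-checked by summing the per-stage ranges. The sketch below treats each "<0.1s" entry as the range 0.0-0.1 s, which is an assumption, and keeps model loading separate because it applies to the first request only.

```python
# Per-stage (low, high) latencies in seconds, copied from the GPU figures above.
GPU_STAGES = {
    "prompt_processing": (0.0, 0.1),  # "<0.1s" read as 0.0-0.1 (assumption)
    "token_generation": (3.0, 7.0),
    "post_processing": (0.5, 1.0),
    "file_writing": (0.0, 0.1),       # "<0.1s" read as 0.0-0.1 (assumption)
}
MODEL_LOADING = (2.0, 3.0)  # paid on the first request only


def total_range(stages, extra=(0.0, 0.0)):
    """Sum the low and high bounds of every stage, plus an optional extra cost."""
    low = sum(lo for lo, _ in stages.values()) + extra[0]
    high = sum(hi for _, hi in stages.values()) + extra[1]
    return round(low, 1), round(high, 1)


warm = total_range(GPU_STAGES)                 # model already loaded
cold = total_range(GPU_STAGES, MODEL_LOADING)  # first request
```

Under these assumptions the stages add up to about 3.5-8.2 s with a warm model and 5.5-11.2 s on the first request, so the quoted 5-10 s total is best read as an approximate figure.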

### Resource Usage

```
GPU Mode (T4):
├── VRAM: 4-6 GB
├── System RAM: 2-4 GB
└── CPU: 1-2 cores (minimal)

CPU Mode:
├── System RAM: 8-12 GB
├── CPU: 4-8 cores (recommended)
└── Generation: ~5-10x slower than GPU mode
```

## 🔐 Security Architecture

### Input Validation

```
User Input
    │
    ├─ Temperature
    │  └─ Clamp: max(0.1, min(2.0, value))
    │
    ├─ Level Size
    │  └─ Clamp: max(100, min(2799, value))
    │
    └─ Prompt Text
       └─ Sanitize: remove special chars
    │
    ▼
Safe Processing
```
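The clamping rules above translate directly into code. In this sketch the helper names are hypothetical, and the set of characters treated as "special" (anything other than letters, digits, spaces, commas, periods, and hyphens) is an assumption.

```python
import re


def clamp(value, low, high):
    """Force value into the closed interval [low, high]."""
    return max(low, min(high, value))


def sanitize_prompt(text: str) -> str:
    """Drop every character outside an assumed allow-list, then trim."""
    return re.sub(r"[^A-Za-z0-9 ,.\-]", "", text).strip()


def validate(prompt: str, temperature: float, level_size: int):
    """Apply all three rules and return the cleaned values."""
    return (
        sanitize_prompt(prompt),
        clamp(temperature, 0.1, 2.0),
        clamp(level_size, 100, 2799),
    )


cleaned = validate("many pipes; <script>", 5.0, 50)
```

Clamping silently corrects out-of-range values instead of rejecting the request, which suits slider-driven input.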

### File Handling

```
Generated Files
    │
    ├─ UUID v4 naming (random, hard-to-guess file names)
    ├─ Restricted directory (./static only)
    ├─ Validated extensions (.html, .png)
    └─ Size limits enforced
```
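A minimal sketch of these four file-handling rules follows. The helper names and the 5 MiB limit are assumptions for illustration; the document does not state the actual size limit.

```python
import uuid
from pathlib import Path

STATIC_DIR = Path("static")
ALLOWED_EXTENSIONS = {".html", ".png"}
MAX_BYTES = 5 * 1024 * 1024  # assumed limit; the real value is not documented


def new_artifact_path(extension: str) -> Path:
    """Return a fresh UUID v4 file name under ./static for an allowed extension."""
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Extension not allowed: {extension}")
    return STATIC_DIR / f"{uuid.uuid4()}{extension}"


def is_inside_static(candidate: Path) -> bool:
    """Reject paths that escape ./static, for example through '..' segments."""
    root = STATIC_DIR.resolve()
    return root in candidate.resolve().parents


def check_size(data: bytes) -> None:
    """Raise if the payload exceeds the assumed size limit."""
    if len(data) > MAX_BYTES:
        raise ValueError("Generated file is too large")


path = new_artifact_path(".png")
```

Resolving the path before comparing it with the static root is what catches traversal attempts such as `static/../secret.html`.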

## 🎯 Integration Points

### 1. HuggingFace Spaces
- Direct deployment
- GPU auto-allocation
- Persistent storage
- Built-in CDN

### 2. HuggingChat (MCP)
- Tool registration
- JSON-RPC protocol
- Async execution
- Rich responses (text + image)

### 3. Custom Applications
- FastAPI endpoints
- Python SDK import
- Docker deployment
- API integration

## 📈 Scalability Considerations

### Horizontal Scaling
```
Load Balancer
    │
    ├─ Instance 1 (GPU)
    ├─ Instance 2 (GPU)
    └─ Instance 3 (CPU fallback)
```
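One way to read the diagram is "rotate across healthy GPU instances and use the CPU instance only when no GPU instance is available". The sketch below implements that policy; the `Instance` and `Balancer` classes and the round-robin choice are assumptions, not a description of an existing deployment.

```python
from dataclasses import dataclass
from itertools import count
from typing import List


@dataclass
class Instance:
    name: str
    has_gpu: bool
    healthy: bool = True


class Balancer:
    """Round-robin over healthy GPU instances, with CPU instances as fallback."""

    def __init__(self, instances: List[Instance]) -> None:
        self._instances = list(instances)
        self._counter = count()

    def pick(self) -> Instance:
        healthy = [i for i in self._instances if i.healthy]
        gpu = [i for i in healthy if i.has_gpu]
        pool = gpu or healthy  # fall back to any healthy instance
        if not pool:
            raise RuntimeError("No healthy instances available")
        return pool[next(self._counter) % len(pool)]


instances = [
    Instance("instance-1", has_gpu=True),
    Instance("instance-2", has_gpu=True),
    Instance("instance-3", has_gpu=False),
]
balancer = Balancer(instances)
```

Repeated calls to `balancer.pick()` alternate between the two GPU instances until both are marked unhealthy.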

### Vertical Scaling
```
Single Instance
├── More GPU memory → Larger batches
├── Faster GPU → Quicker generation
└── More CPU cores → Better preprocessing
```

### Caching Strategy
```
Cache Layer
├── Generated levels (by prompt hash)
├── Model weights (persistent)
└── Static assets (CDN)
```
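Caching generated levels "by prompt hash" could look like the sketch below. Including temperature and level size in the key is an assumption (two requests with the same prompt but different settings should not share a result), and the in-memory dictionary stands in for whatever store a deployment would actually use.

```python
import hashlib
import json
from typing import Callable, Dict


def cache_key(prompt: str, temperature: float, level_size: int) -> str:
    """Hash the normalized request parameters into a stable hex key."""
    payload = json.dumps(
        {
            "prompt": prompt.strip().lower(),
            "temperature": round(temperature, 2),
            "level_size": level_size,
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


_cache: Dict[str, str] = {}  # illustrative in-memory store


def cached_generate(prompt: str, temperature: float, level_size: int,
                    generate: Callable[[str, float, int], str]) -> str:
    """Return a cached level if present; otherwise generate and store it."""
    key = cache_key(prompt, temperature, level_size)
    if key not in _cache:
        _cache[key] = generate(prompt, temperature, level_size)
    return _cache[key]
```

Because sampling is stochastic, a cache hit replays one earlier sample rather than producing a new level, so this trade-off suits repeated demo prompts more than exploratory use.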

---

**Version:** 1.0.0
**Last Updated:** December 6, 2024
**Architecture:** Modular, Scalable, MCP-Compatible