Deployment Guide

This guide covers deploying the Universal AI Agent Platform in production environments, from single-server setups to enterprise-scale deployments.

Quick Deployment Options

1. Docker Deployment (Recommended)

The easiest way to deploy the platform is using Docker containers.

Single Container Setup

# Download the NexusAI platform
wget https://github.com/bits-innovate/nexusai-platform/releases/latest/download/nexusai-platform.tar.gz
tar -xzf nexusai-platform.tar.gz
cd nexusai-platform

# Build the Docker image
docker build -t nexusai-platform .

# Run with environment variables
docker run -d \
  --name nexusai \
  -p 8000:8000 \
  -e OPENAI_API_KEY=your_key \
  -e DEEPGRAM_API_KEY=your_key \
  nexusai-platform

Docker Compose Setup

# docker-compose.yml
version: '3.8'

services:
  universal-ai-platform:
    build: .
    ports:
      - "8000:8000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - DEEPGRAM_API_KEY=${DEEPGRAM_API_KEY}
      - CARTESIA_API_KEY=${CARTESIA_API_KEY}
      - LIVEKIT_URL=${LIVEKIT_URL}
      - LIVEKIT_API_KEY=${LIVEKIT_API_KEY}
      - LIVEKIT_API_SECRET=${LIVEKIT_API_SECRET}
      - DATABASE_URL=postgresql://user:pass@db:5432/universal_ai
      - REDIS_URL=redis://redis:6379
    depends_on:
      - db
      - redis
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  db:
    image: postgres:15
    environment:
      - POSTGRES_DB=universal_ai
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=pass
    volumes:
      - postgres_data:/var/lib/postgresql/data
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    restart: unless-stopped
    volumes:
      - redis_data:/data

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf
      - ./nginx/ssl:/etc/nginx/ssl
    depends_on:
      - universal-ai-platform
    restart: unless-stopped

volumes:
  postgres_data:
  redis_data:

# Deploy with Docker Compose
docker compose up -d
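
Once the stack is up, it can help to confirm that the platform actually answers on its health endpoint before sending traffic to it. The following is a minimal sketch, assuming the container is reachable at http://localhost:8000 and that /health returns a JSON body with a "status" field, as in the Health Checks section of this guide. The function names are illustrative and not part of the platform.

```python
import json
import time
import urllib.error
import urllib.request


def fetch_health(url, timeout=5.0):
    """Return the parsed JSON body of the health endpoint, or None on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Covers refused connections, timeouts, non-2xx responses and invalid JSON
        return None


def wait_until_healthy(url, attempts=10, delay=3.0, fetch=fetch_health, sleep=time.sleep):
    """Poll `url` until it reports status == 'healthy' or the attempts run out."""
    for attempt in range(1, attempts + 1):
        body = fetch(url)
        if isinstance(body, dict) and body.get("status") == "healthy":
            return True
        if attempt < attempts:
            sleep(delay)
    return False


# Example call (needs the stack to be running):
# wait_until_healthy("http://localhost:8000/health")
```

The `fetch` and `sleep` parameters exist only so the polling logic can be exercised without a running server.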

2. Kubernetes Deployment

For enterprise-scale deployments, use Kubernetes.

Namespace and ConfigMap

# k8s/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: universal-ai
---
# k8s/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: universal-ai-config
  namespace: universal-ai
data:
  DATABASE_URL: "postgresql://user:pass@postgres:5432/universal_ai"
  REDIS_URL: "redis://redis:6379"
  LOG_LEVEL: "INFO"

Secrets

# k8s/secrets.yaml
apiVersion: v1
kind: Secret
metadata:
  name: universal-ai-secrets
  namespace: universal-ai
type: Opaque
stringData:
  OPENAI_API_KEY: "your_openai_key"
  DEEPGRAM_API_KEY: "your_deepgram_key"
  CARTESIA_API_KEY: "your_cartesia_key"
  LIVEKIT_API_KEY: "your_livekit_key"
  LIVEKIT_API_SECRET: "your_livekit_secret"

Deployment

# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: universal-ai-platform
  namespace: universal-ai
spec:
  replicas: 3
  selector:
    matchLabels:
      app: universal-ai-platform
  template:
    metadata:
      labels:
        app: universal-ai-platform
    spec:
      containers:
      - name: universal-ai
        image: universal-ai-platform:latest
        ports:
        - containerPort: 8000
        env:
        - name: PORT
          value: "8000"
        envFrom:
        - configMapRef:
            name: universal-ai-config
        - secretRef:
            name: universal-ai-secrets
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /health
            port: 8000
          initialDelaySeconds: 30
          periodSeconds: 30
        readinessProbe:
          httpGet:
            path: /ready
            port: 8000
          initialDelaySeconds: 15
          periodSeconds: 15
---
# k8s/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: universal-ai-service
  namespace: universal-ai
spec:
  selector:
    app: universal-ai-platform
  ports:
  - port: 8000
    targetPort: 8000
  type: ClusterIP
---
# k8s/ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: universal-ai-ingress
  namespace: universal-ai
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - api.yourdomain.com
    secretName: universal-ai-tls
  rules:
  - host: api.yourdomain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: universal-ai-service
            port:
              number: 8000

Deploy to Kubernetes

# Apply all configurations
kubectl apply -f k8s/

# Check deployment status
kubectl get pods -n universal-ai
kubectl get services -n universal-ai
kubectl get ingress -n universal-ai

3. Cloud Provider Deployments

AWS ECS with Fargate

{
  "family": "universal-ai-platform",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "512",
  "memory": "1024",
  "executionRoleArn": "arn:aws:iam::account:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::account:role/ecsTaskRole",
  "containerDefinitions": [
    {
      "name": "universal-ai",
      "image": "your-account.dkr.ecr.region.amazonaws.com/universal-ai-platform:latest",
      "portMappings": [
        {
          "containerPort": 8000,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {
          "name": "PORT",
          "value": "8000"
        }
      ],
      "secrets": [
        {
          "name": "OPENAI_API_KEY",
          "valueFrom": "arn:aws:secretsmanager:region:account:secret:universal-ai/openai-key"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/universal-ai-platform",
          "awslogs-region": "us-west-2",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "healthCheck": {
        "command": [
          "CMD-SHELL",
          "curl -f http://localhost:8000/health || exit 1"
        ],
        "interval": 30,
        "timeout": 5,
        "retries": 3
      }
    }
  ]
}

Google Cloud Run

# cloudrun.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: universal-ai-platform
  annotations:
    run.googleapis.com/ingress: all
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/maxScale: "10"
        run.googleapis.com/cpu-throttling: "false"
    spec:
      containerConcurrency: 100
      containers:
      - image: gcr.io/your-project/universal-ai-platform:latest
        ports:
        - containerPort: 8000
        env:
        - name: PORT
          value: "8000"
        - name: OPENAI_API_KEY
          valueFrom:
            secretKeyRef:
              name: universal-ai-secrets
              key: openai-api-key
        resources:
          limits:
            cpu: "2"
            memory: "2Gi"
          requests:
            cpu: "1"
            memory: "1Gi"

# Deploy to Cloud Run
gcloud run services replace cloudrun.yaml --region=us-central1

Azure Container Instances

# azure-container.yaml
apiVersion: '2019-12-01'
location: eastus
properties:
  containers:
  - name: universal-ai-platform
    properties:
      image: youracr.azurecr.io/universal-ai-platform:latest
      ports:
      - port: 8000
        protocol: TCP
      environmentVariables:
      - name: PORT
        value: "8000"
      - name: OPENAI_API_KEY
        secureValue: "your_openai_key"
      resources:
        requests:
          cpu: 1
          memoryInGB: 2
  osType: Linux
  ipAddress:
    type: Public
    ports:
    - port: 8000
      protocol: TCP
  restartPolicy: Always
tags:
  environment: production
  service: universal-ai-platform

Production Configuration

Environment Variables

Create a comprehensive .env file for production:

# Production .env file

# Service Configuration
NODE_ENV=production
PORT=8000
HOST=0.0.0.0

# AI Service API Keys
OPENAI_API_KEY=your_openai_api_key
DEEPGRAM_API_KEY=your_deepgram_api_key
CARTESIA_API_KEY=your_cartesia_api_key

# LiveKit Configuration
LIVEKIT_URL=wss://your-livekit-server.com
LIVEKIT_API_KEY=your_livekit_api_key
LIVEKIT_API_SECRET=your_livekit_api_secret

# Database Configuration
DATABASE_URL=postgresql://user:password@localhost:5432/universal_ai
DATABASE_POOL_SIZE=20
DATABASE_MAX_CONNECTIONS=100

# Redis Configuration
REDIS_URL=redis://localhost:6379
REDIS_PASSWORD=your_redis_password
REDIS_DB=0

# Security
JWT_SECRET=your_jwt_secret_key
API_KEY_ENCRYPTION_KEY=your_encryption_key
CORS_ORIGINS=https://yourdomain.com,https://app.yourdomain.com

# Rate Limiting (window is in milliseconds: 3600000 ms = 1 hour)
RATE_LIMIT_WINDOW_MS=3600000
RATE_LIMIT_MAX_REQUESTS=1000

# Monitoring
LOG_LEVEL=info
METRICS_ENABLED=true
HEALTH_CHECK_INTERVAL=30000

# Feature Flags
VOICE_ENABLED=true
VISION_ENABLED=true
CUSTOM_ADAPTERS_ENABLED=true

# Billing
STRIPE_SECRET_KEY=your_stripe_secret_key
STRIPE_WEBHOOK_SECRET=your_stripe_webhook_secret

# External Services
SENDGRID_API_KEY=your_sendgrid_key
SLACK_WEBHOOK_URL=your_slack_webhook
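
A missing or placeholder value in this file usually surfaces only when the first request fails, so a startup check can save time. The sketch below is one possible approach, not part of the platform. The variable names come from the .env file above, but the choice of which ones are mandatory is an assumption, as is the `find_config_problems` helper itself.

```python
import os

# Names taken from the .env file above; treating exactly these as mandatory is an assumption.
REQUIRED_VARS = (
    "OPENAI_API_KEY",
    "DEEPGRAM_API_KEY",
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "DATABASE_URL",
    "REDIS_URL",
    "JWT_SECRET",
)


def find_config_problems(env):
    """Return a list of human-readable problems found in an environment mapping."""
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name, "").strip()
        if not value:
            problems.append(f"{name} is not set")
        elif value.startswith("your_"):
            # The template uses values such as your_openai_api_key
            problems.append(f"{name} still holds a placeholder value")
    port = env.get("PORT", "8000")
    if not port.isdecimal() or not 1 <= int(port) <= 65535:
        problems.append(f"PORT must be an integer between 1 and 65535, got {port!r}")
    return problems


# Example call at application startup:
# problems = find_config_problems(os.environ)
```

Returning a list instead of raising on the first problem lets the operator fix every issue in one pass.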

Nginx Configuration

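# nginx/nginx.conf
# When mounted as the main /etc/nginx/nginx.conf, the blocks below must be
# wrapped in an http { } block, with an events { } block alongside it.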
upstream universal_ai_backend {
    server universal-ai-platform:8000;
    # Add more servers for load balancing
    # server universal-ai-platform-2:8000;
    # server universal-ai-platform-3:8000;
}

server {
    listen 80;
    server_name api.yourdomain.com;
    return 301 https://$server_name$request_uri;
}

# Rate-limit zone: limit_req_zone is only valid in the http context,
# so it is declared outside the server blocks
limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;

server {
    listen 443 ssl http2;
    server_name api.yourdomain.com;

    # SSL Configuration
    ssl_certificate /etc/nginx/ssl/cert.pem;
    ssl_certificate_key /etc/nginx/ssl/key.pem;
    ssl_session_timeout 1d;
    ssl_session_cache shared:SSL:50m;
    ssl_stapling on;
    ssl_stapling_verify on;

    # Security Headers
    add_header X-Frame-Options DENY;
    add_header X-Content-Type-Options nosniff;
    add_header X-XSS-Protection "1; mode=block";
    add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload";

    # Rate Limiting (applies the "api" zone declared in the http context)
    limit_req zone=api burst=20 nodelay;

    # Gzip Compression
    gzip on;
    gzip_vary on;
    gzip_min_length 1024;
    gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;

    # Proxy Configuration
    location / {
        proxy_pass http://universal_ai_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        
        # Timeouts
        proxy_connect_timeout 60s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
        
        # Buffer settings
        proxy_buffering on;
        proxy_buffer_size 8k;
        proxy_buffers 32 8k;
    }

    # WebSocket support
    location /ws/ {
        proxy_pass http://universal_ai_backend;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # Health check endpoint
    location /health {
        access_log off;
        proxy_pass http://universal_ai_backend/health;
    }

    # Static files (if any)
    location /static/ {
        alias /var/www/static/;
        expires 1y;
        add_header Cache-Control "public, immutable";
    }
}

Database Setup

PostgreSQL Configuration

-- Create database and user
CREATE DATABASE universal_ai;
CREATE USER universal_ai_user WITH PASSWORD 'your_secure_password';
GRANT ALL PRIVILEGES ON DATABASE universal_ai TO universal_ai_user;

-- Connect to the database
\c universal_ai;

-- Create tables
CREATE TABLE IF NOT EXISTS usage_metrics (
    id SERIAL PRIMARY KEY,
    client_id VARCHAR(255) NOT NULL,
    session_id VARCHAR(255),
    metric_type VARCHAR(50) NOT NULL,
    metric_value INTEGER DEFAULT 0,
    timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    metadata JSONB
);

CREATE TABLE IF NOT EXISTS billing_plans (
    id SERIAL PRIMARY KEY,
    plan_name VARCHAR(100) NOT NULL,
    base_price DECIMAL(10,2) NOT NULL,
    price_per_message DECIMAL(8,4) DEFAULT 0,
    price_per_image DECIMAL(8,4) DEFAULT 0,
    price_per_voice_minute DECIMAL(8,4) DEFAULT 0,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE IF NOT EXISTS sessions (
    id SERIAL PRIMARY KEY,
    session_id VARCHAR(255) UNIQUE NOT NULL,
    client_id VARCHAR(255),
    agent_config JSONB,
    status VARCHAR(50) DEFAULT 'active',
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Create indexes (sessions.session_id is already indexed by its UNIQUE constraint)
CREATE INDEX IF NOT EXISTS idx_usage_metrics_client_id ON usage_metrics(client_id);
CREATE INDEX IF NOT EXISTS idx_usage_metrics_timestamp ON usage_metrics(timestamp);
CREATE INDEX IF NOT EXISTS idx_sessions_client_id ON sessions(client_id);

-- Insert default billing plans
INSERT INTO billing_plans (plan_name, base_price, price_per_message, price_per_image, price_per_voice_minute) VALUES
('starter', 49.00, 0.01, 0.05, 0.10),
('professional', 199.00, 0.008, 0.04, 0.08),
('enterprise', 999.00, 0.005, 0.03, 0.06);
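
To make the plan prices concrete, here is a small worked calculation. It assumes a monthly charge is the base price plus each usage count multiplied by its per-unit price. The guide does not state the actual billing formula, so treat this as an illustration of the numbers only. The `monthly_charge` helper and the `PLANS` mapping are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# Prices copied from the INSERT statement above:
# (base_price, price_per_message, price_per_image, price_per_voice_minute)
PLANS = {
    "starter": ("49.00", "0.01", "0.05", "0.10"),
    "professional": ("199.00", "0.008", "0.04", "0.08"),
    "enterprise": ("999.00", "0.005", "0.03", "0.06"),
}


def monthly_charge(plan_name, messages, images, voice_minutes):
    """Return base price plus usage charges, rounded to cents (assumed formula)."""
    base, per_msg, per_img, per_min = (Decimal(p) for p in PLANS[plan_name])
    total = base + per_msg * messages + per_img * images + per_min * voice_minutes
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Starter plan with 1000 messages, 100 images and 50 voice minutes:
# 49.00 + 10.00 + 5.00 + 5.00 = 69.00
print(monthly_charge("starter", 1000, 100, 50))
```

`Decimal` is used instead of floats so that prices such as 0.008 do not pick up binary rounding errors.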

Redis Configuration

# redis/redis.conf
# Basic configuration
bind 127.0.0.1
port 6379
timeout 300

# Memory management
maxmemory 2gb
maxmemory-policy allkeys-lru

# Persistence
save 900 1
save 300 10
save 60 10000

# Security
requirepass your_redis_password

# Logging
loglevel notice
logfile /var/log/redis/redis-server.log

# Performance
tcp-backlog 511
tcp-keepalive 300

Monitoring and Observability

Prometheus Metrics

# monitoring/metrics.py
from flask import Response
from prometheus_client import CONTENT_TYPE_LATEST, Counter, Histogram, Gauge, generate_latest

# Define metrics
REQUEST_COUNT = Counter('universal_ai_requests_total', 'Total requests', ['method', 'endpoint', 'status'])
REQUEST_DURATION = Histogram('universal_ai_request_duration_seconds', 'Request duration')
ACTIVE_SESSIONS = Gauge('universal_ai_active_sessions', 'Number of active sessions')
AI_API_CALLS = Counter('universal_ai_api_calls_total', 'AI API calls', ['service', 'status'])

def setup_metrics_endpoint(app):
    """Add a /metrics endpoint to a Flask app for Prometheus to scrape"""
    @app.route('/metrics')
    def metrics():
        # Serve the exposition format with the content type Prometheus expects
        return Response(generate_latest(), content_type=CONTENT_TYPE_LATEST)

Health Checks

# health/checks.py
import time
import redis
import psycopg2
from flask import jsonify

class HealthChecker:
    def __init__(self, redis_url, db_url):
        self.redis_url = redis_url
        self.db_url = db_url
    
    def check_redis(self):
        """Check Redis connectivity"""
        try:
            start_time = time.time()
            r = redis.from_url(self.redis_url)
            r.ping()
            response_time = time.time() - start_time
            return {'status': 'healthy', 'response_time': response_time}
        except Exception as e:
            return {'status': 'unhealthy', 'error': str(e)}
    
    def check_database(self):
        """Check PostgreSQL connectivity"""
        try:
            start_time = time.time()
            conn = psycopg2.connect(self.db_url)
            conn.close()
            response_time = time.time() - start_time
            return {'status': 'healthy', 'response_time': response_time}
        except Exception as e:
            return {'status': 'unhealthy', 'error': str(e)}
    
    def check_ai_services(self):
        """Check AI service connectivity"""
        # Implement checks for OpenAI, Deepgram, etc.
        return {'status': 'healthy'}
    
    def get_health_status(self):
        """Get overall health status"""
        checks = {
            'redis': self.check_redis(),
            'database': self.check_database(),
            'ai_services': self.check_ai_services()
        }
        
        overall_status = 'healthy' if all(
            check['status'] == 'healthy' for check in checks.values()
        ) else 'unhealthy'
        
        return {
            'status': overall_status,
            'timestamp': time.time(),
            'checks': checks
        }

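# Add to your Flask app (`app`, `redis_url`, and `db_url` come from your application setup)
@app.route('/health')
def health_check():
    health_checker = HealthChecker(redis_url, db_url)
    status = health_checker.get_health_status()
    # Return 503 when any check fails so that `curl -f` and HTTP probes register the failure
    http_status = 200 if status['status'] == 'healthy' else 503
    return jsonify(status), http_status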
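
The check_ai_services method above is left as a placeholder. One way to fill it in without repeating the try/except and timing code for every provider is a small wrapper that runs any probe and reports its status and latency in the same dictionary shape as the other checks. This is a sketch under that assumption. `timed_check` is a hypothetical helper, and the probe you pass in (for example, a cheap authenticated request to a provider) is up to you.

```python
import time


def timed_check(probe, clock=time.perf_counter):
    """Run `probe` (any zero-argument callable) and report status and latency."""
    start = clock()
    try:
        probe()
    except Exception as exc:
        # Any exception from the probe marks the dependency as unhealthy
        return {"status": "unhealthy", "error": str(exc)}
    return {"status": "healthy", "response_time": round(clock() - start, 4)}


# Example: wrap an existing connectivity test
# timed_check(lambda: redis.from_url(redis_url).ping())
```

Because the result has the same keys as check_redis and check_database, it can be dropped into the `checks` dictionary in get_health_status without further changes.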

Logging Configuration

# logging_config.py
import logging
import os
import sys
from pythonjsonlogger import jsonlogger

def setup_logging():
    """Configure structured logging"""
    
    # Create logger
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)
    
    # Create JSON formatter
    formatter = jsonlogger.JsonFormatter(
        '%(asctime)s %(name)s %(levelname)s %(message)s'
    )
    
    # Console handler
    console_handler = logging.StreamHandler(sys.stdout)
    console_handler.setFormatter(formatter)
    logger.addHandler(console_handler)
    
    # File handler for production
    if os.environ.get('NODE_ENV') == 'production':
        file_handler = logging.FileHandler('/var/log/universal-ai/app.log')
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)
    
    return logger

Security Best Practices

1. API Security

# security/auth.py
import jwt
from functools import wraps
from flask import request, jsonify

def require_api_key(f):
    """Decorator to require API key authentication"""
    @wraps(f)
    def decorated_function(*args, **kwargs):
        api_key = request.headers.get('Authorization')
        
        if not api_key:
            return jsonify({'error': 'API key required'}), 401
        
        if not api_key.startswith('Bearer '):
            return jsonify({'error': 'Invalid API key format'}), 401
        
        key = api_key.split('Bearer ')[1]
        if not validate_api_key(key):
            return jsonify({'error': 'Invalid API key'}), 401
        
        return f(*args, **kwargs)
    return decorated_function

def validate_api_key(key):
    """Validate API key against database"""
    # Placeholder: this accepts every key. Replace it with a real lookup
    # before exposing the API.
    return True
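
As a starting point for that validation logic, the sketch below compares a SHA-256 digest of the presented key against stored digests, so plaintext keys never need to be kept. It is an illustration, not the platform's implementation. The in-memory `KNOWN_KEY_DIGESTS` set and the demo key are hypothetical stand-ins for a database lookup.

```python
import hashlib
import hmac

# Hypothetical in-memory store of SHA-256 digests of issued keys; a real
# deployment would load these from the database instead.
KNOWN_KEY_DIGESTS = {
    hashlib.sha256(b"demo-key-for-illustration").hexdigest(),
}


def digest_key(key):
    """Hash an API key so the plaintext never needs to be stored."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def validate_api_key(key, known_digests=KNOWN_KEY_DIGESTS):
    """Return True only if the key's digest matches a stored digest."""
    if not key:
        return False
    candidate = digest_key(key)
    # compare_digest avoids leaking how many leading characters matched
    return any(hmac.compare_digest(candidate, known) for known in known_digests)
```

Storing digests instead of keys means a leaked table does not directly expose usable credentials.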

2. Input Validation

# security/validation.py
from functools import wraps

from flask import request, jsonify
from marshmallow import Schema, fields, validate, ValidationError

# Adapter names accepted by the API; fill this in with the adapters your deployment registers
ALLOWED_ADAPTERS = []

class AgentCreateSchema(Schema):
    instructions = fields.Str(required=True, validate=validate.Length(max=1000))
    capabilities = fields.List(
        fields.Str(validate=validate.OneOf(['text', 'voice', 'vision'])),
        required=True
    )
    business_logic_adapter = fields.Str(validate=validate.OneOf(ALLOWED_ADAPTERS))
    custom_settings = fields.Dict()

def validate_request(schema_class):
    """Decorator for request validation"""
    def decorator(f):
        @wraps(f)
        def decorated_function(*args, **kwargs):
            schema = schema_class()
            try:
                data = schema.load(request.json)
                request.validated_data = data
                return f(*args, **kwargs)
            except ValidationError as err:
                return jsonify({'error': 'Validation error', 'details': err.messages}), 400
        return decorated_function
    return decorator

3. Rate Limiting

# security/rate_limiting.py
import redis
import time
from functools import wraps
from flask import request, jsonify

class RateLimiter:
    def __init__(self, redis_client):
        self.redis = redis_client
    
    def is_allowed(self, key, limit, window):
        """Check if request is within rate limit"""
        current_time = time.time()
        pipeline = self.redis.pipeline()
        
        # Remove expired entries
        pipeline.zremrangebyscore(key, 0, current_time - window)
        
        # Count current requests
        pipeline.zcard(key)
        
        # Add current request
        pipeline.zadd(key, {str(current_time): current_time})
        
        # Set expiration
        pipeline.expire(key, int(window))
        
        results = pipeline.execute()
        request_count = results[1]
        
        return request_count < limit

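# Shared limiter instance. redis.Redis() targets localhost:6379 by default;
# build the client from your REDIS_URL in production.
rate_limiter = RateLimiter(redis.Redis())

def rate_limit(limit=100, window=3600):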
    """Rate limiting decorator"""
    def decorator(f):
        @wraps(f)
        def decorated_function(*args, **kwargs):
            client_ip = request.remote_addr
            api_key = request.headers.get('Authorization', '').split('Bearer ')[-1]
            
            key = f"rate_limit:{api_key or client_ip}"
            
            if not rate_limiter.is_allowed(key, limit, window):
                return jsonify({'error': 'Rate limit exceeded'}), 429
            
            return f(*args, **kwargs)
        return decorated_function
    return decorator
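
The sliding-window logic in RateLimiter.is_allowed can be hard to follow through the Redis pipeline calls. The sketch below mirrors the same steps with plain Python data structures, which is also handy for unit tests that should not depend on Redis. `InMemoryRateLimiter` is a hypothetical single-process stand-in and is not suitable for a multi-instance deployment.

```python
import time
from collections import defaultdict, deque


class InMemoryRateLimiter:
    """Single-process sliding-window limiter with the same is_allowed() contract."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.hits = defaultdict(deque)  # key -> timestamps of recent requests

    def is_allowed(self, key, limit, window):
        now = self.clock()
        hits = self.hits[key]
        # Drop timestamps that have left the window (mirrors zremrangebyscore)
        while hits and hits[0] <= now - window:
            hits.popleft()
        count_before = len(hits)  # mirrors zcard, taken before recording this request
        hits.append(now)          # mirrors zadd; rejected requests are recorded too
        return count_before < limit
```

As in the Redis version, a rejected request is still recorded, so a client that keeps retrying while over the limit stays blocked until it backs off for a full window.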

Scaling Strategies

Horizontal Scaling

Load Balancing: Use nginx, HAProxy, or cloud load balancers
Session Affinity: Implement sticky sessions or session storage
Database Scaling: Read replicas, connection pooling
Caching: Redis cluster for distributed caching

Performance Optimization

Connection Pooling: Optimize database connections
Async Processing: Use Celery for background tasks
CDN: Use CloudFlare or AWS CloudFront for static assets
Compression: Enable gzip/brotli compression

Monitoring at Scale

Distributed Tracing: Use Jaeger or Zipkin
Centralized Logging: ELK stack or Fluentd
Metrics Collection: Prometheus + Grafana
Alerting: PagerDuty, OpsGenie integration

Next: Learn about Business Logic Adapters or check out the API Reference for detailed endpoint documentation.
