Deployment Guide
This guide covers deploying the Universal AI Agent Platform in production environments, from single-server setups to enterprise-scale deployments.
Quick Deployment Options
1. Docker Deployment (Recommended)
The easiest way to deploy the platform is using Docker containers.
Single Container Setup
# Download the NexusAI platform
wget https://github.com/bits-innovate/nexusai-platform/releases/latest/download/nexusai-platform.tar.gz
tar -xzf nexusai-platform.tar.gz
cd nexusai-platform
# Build the Docker image
docker build -t nexusai-platform .
# Run with environment variables
docker run -d \
  --name nexusai \
  -p 8000:8000 \
  -e OPENAI_API_KEY=your_key \
  -e DEEPGRAM_API_KEY=your_key \
  nexusai-platform
Docker Compose Setup
# docker-compose.yml
version: '3.8'
services:
  universal-ai-platform:
    build: .
    ports:
      - "8000:8000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - DEEPGRAM_API_KEY=${DEEPGRAM_API_KEY}
      - CARTESIA_API_KEY=${CARTESIA_API_KEY}
      - LIVEKIT_URL=${LIVEKIT_URL}
      - LIVEKIT_API_KEY=${LIVEKIT_API_KEY}
      - LIVEKIT_API_SECRET=${LIVEKIT_API_SECRET}
      - DATABASE_URL=postgresql://user:pass@db:5432/universal_ai
      - REDIS_URL=redis://redis:6379
    depends_on:
      - db
      - redis
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  db:
    image: postgres:15
    environment:
      - POSTGRES_DB=universal_ai
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=pass
    volumes:
      - postgres_data:/var/lib/postgresql/data
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    restart: unless-stopped
    volumes:
      - redis_data:/data

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      # Mount into conf.d so the upstream/server blocks below land inside the http context
      - ./nginx/nginx.conf:/etc/nginx/conf.d/default.conf
      - ./nginx/ssl:/etc/nginx/ssl
    depends_on:
      - universal-ai-platform
    restart: unless-stopped

volumes:
  postgres_data:
  redis_data:
# Deploy with Docker Compose
docker-compose up -d
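Once the containers are up, a quick smoke test against the /health endpoint (the same one the compose healthcheck polls) confirms the API is reachable. A minimal sketch using the requests library:

# smoke_test.py -- post-deploy sanity check
import sys

import requests

try:
    resp = requests.get('http://localhost:8000/health', timeout=5)
    resp.raise_for_status()
    print('Platform healthy:', resp.json())
except Exception as exc:
    sys.exit(f'Health check failed: {exc}')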
2. Kubernetes Deployment
For enterprise-scale deployments, use Kubernetes.
Namespace and ConfigMap
# k8s/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: universal-ai
---
# k8s/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: universal-ai-config
  namespace: universal-ai
data:
  DATABASE_URL: "postgresql://user:pass@postgres:5432/universal_ai"
  REDIS_URL: "redis://redis:6379"
  LOG_LEVEL: "INFO"
Secrets
# k8s/secrets.yaml
apiVersion: v1
kind: Secret
metadata:
  name: universal-ai-secrets
  namespace: universal-ai
type: Opaque
stringData:
  OPENAI_API_KEY: "your_openai_key"
  DEEPGRAM_API_KEY: "your_deepgram_key"
  CARTESIA_API_KEY: "your_cartesia_key"
  LIVEKIT_API_KEY: "your_livekit_key"
  LIVEKIT_API_SECRET: "your_livekit_secret"
Deployment
# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: universal-ai-platform
  namespace: universal-ai
spec:
  replicas: 3
  selector:
    matchLabels:
      app: universal-ai-platform
  template:
    metadata:
      labels:
        app: universal-ai-platform
    spec:
      containers:
        - name: universal-ai
          image: universal-ai-platform:latest
          ports:
            - containerPort: 8000
          env:
            - name: PORT
              value: "8000"
          envFrom:
            - configMapRef:
                name: universal-ai-config
            - secretRef:
                name: universal-ai-secrets
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 30
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /ready
              port: 8000
            initialDelaySeconds: 15
            periodSeconds: 15
---
# k8s/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: universal-ai-service
  namespace: universal-ai
spec:
  selector:
    app: universal-ai-platform
  ports:
    - port: 8000
      targetPort: 8000
  type: ClusterIP
---
# k8s/ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: universal-ai-ingress
  namespace: universal-ai
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx  # replaces the deprecated kubernetes.io/ingress.class annotation
  tls:
    - hosts:
        - api.yourdomain.com
      secretName: universal-ai-tls
  rules:
    - host: api.yourdomain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: universal-ai-service
                port:
                  number: 8000
Deploy to Kubernetes
# Apply all configurations
kubectl apply -f k8s/
# Check deployment status
kubectl get pods -n universal-ai
kubectl get services -n universal-ai
kubectl get ingress -n universal-ai
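For CI gates or dashboards, the same status can be read programmatically with the official kubernetes Python client (pip install kubernetes). A sketch, assuming kubectl credentials are available locally:

# k8s_status.py -- print pod phases in the universal-ai namespace
from kubernetes import client, config

config.load_kube_config()  # use config.load_incluster_config() when running inside the cluster
v1 = client.CoreV1Api()
for pod in v1.list_namespaced_pod(namespace='universal-ai').items:
    print(pod.metadata.name, pod.status.phase)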
3. Cloud Provider Deployments
AWS ECS with Fargate
{
  "family": "universal-ai-platform",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "512",
  "memory": "1024",
  "executionRoleArn": "arn:aws:iam::account:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::account:role/ecsTaskRole",
  "containerDefinitions": [
    {
      "name": "universal-ai",
      "image": "your-account.dkr.ecr.region.amazonaws.com/universal-ai-platform:latest",
      "portMappings": [
        {
          "containerPort": 8000,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {
          "name": "PORT",
          "value": "8000"
        }
      ],
      "secrets": [
        {
          "name": "OPENAI_API_KEY",
          "valueFrom": "arn:aws:secretsmanager:region:account:secret:universal-ai/openai-key"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/universal-ai-platform",
          "awslogs-region": "us-west-2",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "healthCheck": {
        "command": [
          "CMD-SHELL",
          "curl -f http://localhost:8000/health || exit 1"
        ],
        "interval": 30,
        "timeout": 5,
        "retries": 3
      }
    }
  ]
}
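The task definition can be registered with the AWS CLI or programmatically. A sketch using boto3, assuming the JSON above is saved as task-definition.json and AWS credentials are configured:

# register_task.py -- register the ECS task definition
import json

import boto3

ecs = boto3.client('ecs', region_name='us-west-2')
with open('task-definition.json') as f:
    task_def = json.load(f)

response = ecs.register_task_definition(**task_def)
print('Registered:', response['taskDefinition']['taskDefinitionArn'])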
Google Cloud Run
# cloudrun.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: universal-ai-platform
  annotations:
    run.googleapis.com/ingress: all
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/maxScale: "10"
        run.googleapis.com/cpu-throttling: "false"
    spec:
      containerConcurrency: 100
      containers:
        - image: gcr.io/your-project/universal-ai-platform:latest
          ports:
            - containerPort: 8000
          env:
            # PORT is reserved on Cloud Run; it is injected automatically from containerPort
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: openai-api-key  # Secret Manager secret ID
                  key: latest           # secret version ("latest" or a version number)
          resources:
            limits:
              cpu: "2"
              memory: "2Gi"
            requests:
              cpu: "1"
              memory: "1Gi"
# Deploy to Cloud Run
gcloud run services replace cloudrun.yaml --region=us-central1
Azure Container Instances
# azure-container.yaml
apiVersion: '2019-12-01'
location: eastus
name: universal-ai-platform
properties:
  containers:
    - name: universal-ai-platform
      properties:
        image: youracr.azurecr.io/universal-ai-platform:latest
        ports:
          - port: 8000
            protocol: TCP
        environmentVariables:
          - name: PORT
            value: "8000"
          - name: OPENAI_API_KEY
            secureValue: "your_openai_key"
        resources:
          requests:
            cpu: 1
            memoryInGB: 2
  osType: Linux
  ipAddress:
    type: Public
    ports:
      - port: 8000
        protocol: TCP
  restartPolicy: Always
tags:
  environment: production
  service: universal-ai-platform
type: Microsoft.ContainerInstance/containerGroups
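Apply the configuration with the Azure CLI:

# Deploy to Azure Container Instances
az container create --resource-group your-resource-group --file azure-container.yaml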
Production Configuration
Environment Variables
Create a comprehensive .env file for production:
# Production .env file
# Service Configuration
NODE_ENV=production
PORT=8000
HOST=0.0.0.0
# AI Service API Keys
OPENAI_API_KEY=your_openai_api_key
DEEPGRAM_API_KEY=your_deepgram_api_key
CARTESIA_API_KEY=your_cartesia_api_key
# LiveKit Configuration
LIVEKIT_URL=wss://your-livekit-server.com
LIVEKIT_API_KEY=your_livekit_api_key
LIVEKIT_API_SECRET=your_livekit_api_secret
# Database Configuration
DATABASE_URL=postgresql://user:password@localhost:5432/universal_ai
DATABASE_POOL_SIZE=20
DATABASE_MAX_CONNECTIONS=100
# Redis Configuration
REDIS_URL=redis://localhost:6379
REDIS_PASSWORD=your_redis_password
REDIS_DB=0
# Security
JWT_SECRET=your_jwt_secret_key
API_KEY_ENCRYPTION_KEY=your_encryption_key
CORS_ORIGINS=https://yourdomain.com,https://app.yourdomain.com
# Rate Limiting
# Window in milliseconds (3600000 = 1 hour)
RATE_LIMIT_WINDOW_MS=3600000
RATE_LIMIT_MAX_REQUESTS=1000
# Monitoring
LOG_LEVEL=info
METRICS_ENABLED=true
HEALTH_CHECK_INTERVAL=30000
# Feature Flags
VOICE_ENABLED=true
VISION_ENABLED=true
CUSTOM_ADAPTERS_ENABLED=true
# Billing
STRIPE_SECRET_KEY=your_stripe_secret_key
STRIPE_WEBHOOK_SECRET=your_stripe_webhook_secret
# External Services
SENDGRID_API_KEY=your_sendgrid_key
SLACK_WEBHOOK_URL=your_slack_webhook
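A missing key usually only surfaces when the first request touches that service, so it pays to fail fast at startup. A minimal sketch (the REQUIRED list is an assumption; trim it to the capabilities you enable):

# config/validate_env.py -- fail fast when required configuration is absent
import os
import sys

REQUIRED = [
    'OPENAI_API_KEY', 'DEEPGRAM_API_KEY', 'CARTESIA_API_KEY',
    'LIVEKIT_URL', 'LIVEKIT_API_KEY', 'LIVEKIT_API_SECRET',
    'DATABASE_URL', 'REDIS_URL', 'JWT_SECRET',
]

missing = [name for name in REQUIRED if not os.environ.get(name)]
if missing:
    sys.exit(f"Missing required environment variables: {', '.join(missing)}")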
Nginx Configuration
# nginx/nginx.conf
# Included inside the http context (mounted at /etc/nginx/conf.d/default.conf)

upstream universal_ai_backend {
    server universal-ai-platform:8000;
    # Add more servers for load balancing
    # server universal-ai-platform-2:8000;
    # server universal-ai-platform-3:8000;
}

# Rate limiting zone (must be declared at http level, not inside a server block)
limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;

server {
    listen 80;
    server_name api.yourdomain.com;
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl http2;
    server_name api.yourdomain.com;

    # SSL Configuration
    ssl_certificate /etc/nginx/ssl/cert.pem;
    ssl_certificate_key /etc/nginx/ssl/key.pem;
    ssl_session_timeout 1d;
    ssl_session_cache shared:SSL:50m;
    ssl_stapling on;
    ssl_stapling_verify on;

    # Security Headers
    add_header X-Frame-Options DENY;
    add_header X-Content-Type-Options nosniff;
    add_header X-XSS-Protection "1; mode=block";
    add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload";

    # Rate Limiting
    limit_req zone=api burst=20 nodelay;

    # Gzip Compression
    gzip on;
    gzip_vary on;
    gzip_min_length 1024;
    gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;

    # Proxy Configuration
    location / {
        proxy_pass http://universal_ai_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Timeouts
        proxy_connect_timeout 60s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;

        # Buffer settings
        proxy_buffering on;
        proxy_buffer_size 8k;
        proxy_buffers 32 8k;
    }

    # WebSocket support
    location /ws/ {
        proxy_pass http://universal_ai_backend;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # Health check endpoint
    location /health {
        access_log off;
        proxy_pass http://universal_ai_backend/health;
    }

    # Static files (if any)
    location /static/ {
        alias /var/www/static/;
        expires 1y;
        add_header Cache-Control "public, immutable";
    }
}
Database Setup
PostgreSQL Configuration
-- Create database and user
CREATE DATABASE universal_ai;
CREATE USER universal_ai_user WITH PASSWORD 'your_secure_password';
GRANT ALL PRIVILEGES ON DATABASE universal_ai TO universal_ai_user;

-- Connect to the database
\c universal_ai;

-- On PostgreSQL 15+, CREATE on the public schema must be granted explicitly
GRANT ALL ON SCHEMA public TO universal_ai_user;
-- Create tables
CREATE TABLE IF NOT EXISTS usage_metrics (
    id SERIAL PRIMARY KEY,
    client_id VARCHAR(255) NOT NULL,
    session_id VARCHAR(255),
    metric_type VARCHAR(50) NOT NULL,
    metric_value INTEGER DEFAULT 0,
    timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    metadata JSONB
);

CREATE TABLE IF NOT EXISTS billing_plans (
    id SERIAL PRIMARY KEY,
    plan_name VARCHAR(100) NOT NULL,
    base_price DECIMAL(10,2) NOT NULL,
    price_per_message DECIMAL(8,4) DEFAULT 0,
    price_per_image DECIMAL(8,4) DEFAULT 0,
    price_per_voice_minute DECIMAL(8,4) DEFAULT 0,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE IF NOT EXISTS sessions (
    id SERIAL PRIMARY KEY,
    session_id VARCHAR(255) UNIQUE NOT NULL,
    client_id VARCHAR(255),
    agent_config JSONB,
    status VARCHAR(50) DEFAULT 'active',
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Create indexes
CREATE INDEX idx_usage_metrics_client_id ON usage_metrics(client_id);
CREATE INDEX idx_usage_metrics_timestamp ON usage_metrics(timestamp);
CREATE INDEX idx_sessions_session_id ON sessions(session_id);
CREATE INDEX idx_sessions_client_id ON sessions(client_id);
-- Insert default billing plans
INSERT INTO billing_plans (plan_name, base_price, price_per_message, price_per_image, price_per_voice_minute) VALUES
('starter', 49.00, 0.01, 0.05, 0.10),
('professional', 199.00, 0.008, 0.04, 0.08),
('enterprise', 999.00, 0.005, 0.03, 0.06);
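To confirm the schema end to end, here is a sketch that records a single usage event (the client and session IDs are placeholders):

# record_metric.py -- insert one usage event into usage_metrics
import json
import os

import psycopg2

conn = psycopg2.connect(os.environ['DATABASE_URL'])
with conn, conn.cursor() as cur:
    cur.execute(
        """INSERT INTO usage_metrics (client_id, session_id, metric_type, metric_value, metadata)
           VALUES (%s, %s, %s, %s, %s)""",
        ('client-123', 'sess-abc', 'messages', 1, json.dumps({'capability': 'text'})),
    )
conn.close()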
Redis Configuration
# redis/redis.conf
# Basic configuration
bind 127.0.0.1
port 6379
timeout 300
# Memory management
maxmemory 2gb
maxmemory-policy allkeys-lru
# Persistence
save 900 1
save 300 10
save 60 10000
# Security
requirepass your_redis_password
# Logging
loglevel notice
logfile /var/log/redis/redis-server.log
# Performance
tcp-backlog 511
tcp-keepalive 300
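A quick connectivity check against this configuration (the password must match requirepass above):

# redis_check.py -- verify Redis is reachable and authenticated
import redis

r = redis.Redis(host='localhost', port=6379, password='your_redis_password')
assert r.ping(), 'Redis did not respond to PING'
print('maxmemory policy:', r.config_get('maxmemory-policy'))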
Monitoring and Observability
Prometheus Metrics
# monitoring/metrics.py
from prometheus_client import Counter, Histogram, Gauge, generate_latest

# Define metrics
REQUEST_COUNT = Counter('universal_ai_requests_total', 'Total requests', ['method', 'endpoint', 'status'])
REQUEST_DURATION = Histogram('universal_ai_request_duration_seconds', 'Request duration')
ACTIVE_SESSIONS = Gauge('universal_ai_active_sessions', 'Number of active sessions')
AI_API_CALLS = Counter('universal_ai_api_calls_total', 'AI API calls', ['service', 'status'])

def setup_metrics_endpoint(app):
    """Add metrics endpoint to Flask app"""
    @app.route('/metrics')
    def metrics():
        return generate_latest()
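These counters only move when something records them; Flask request hooks are one way to do that. A sketch, assuming the metrics module above lives at monitoring/metrics.py:

# monitoring/hooks.py -- record request count and latency on every request
import time

from flask import g, request

from monitoring.metrics import REQUEST_COUNT, REQUEST_DURATION

def register_metrics_hooks(app):
    @app.before_request
    def start_timer():
        g.request_start_time = time.time()

    @app.after_request
    def record_metrics(response):
        started = getattr(g, 'request_start_time', None)
        if started is not None:
            REQUEST_DURATION.observe(time.time() - started)
        REQUEST_COUNT.labels(request.method, request.path, str(response.status_code)).inc()
        return response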
Health Checks
# health/checks.py
import time

import psycopg2
import redis
from flask import jsonify

class HealthChecker:
    def __init__(self, redis_url, db_url):
        self.redis_url = redis_url
        self.db_url = db_url

    def check_redis(self):
        """Check Redis connectivity"""
        try:
            start_time = time.time()
            r = redis.from_url(self.redis_url)
            r.ping()
            return {'status': 'healthy', 'response_time': time.time() - start_time}
        except Exception as e:
            return {'status': 'unhealthy', 'error': str(e)}

    def check_database(self):
        """Check PostgreSQL connectivity"""
        try:
            start_time = time.time()
            conn = psycopg2.connect(self.db_url)
            conn.close()
            return {'status': 'healthy', 'response_time': time.time() - start_time}
        except Exception as e:
            return {'status': 'unhealthy', 'error': str(e)}

    def check_ai_services(self):
        """Check AI service connectivity"""
        # Implement checks for OpenAI, Deepgram, etc.
        return {'status': 'healthy'}

    def get_health_status(self):
        """Get overall health status"""
        checks = {
            'redis': self.check_redis(),
            'database': self.check_database(),
            'ai_services': self.check_ai_services()
        }
        overall_status = 'healthy' if all(
            check['status'] == 'healthy' for check in checks.values()
        ) else 'unhealthy'
        return {
            'status': overall_status,
            'timestamp': time.time(),
            'checks': checks
        }

# Add to your Flask app
@app.route('/health')
def health_check():
    health_checker = HealthChecker(redis_url, db_url)
    status = health_checker.get_health_status()
    # Return 503 when unhealthy so `curl -f` and the probes detect failure
    return jsonify(status), 200 if status['status'] == 'healthy' else 503
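The Kubernetes manifests above also probe /ready. A minimal readiness endpoint can reuse the same checker; this is a sketch, assuming the same wiring as the /health route:

@app.route('/ready')
def readiness_check():
    # Report ready only once Redis and PostgreSQL are reachable
    status = HealthChecker(redis_url, db_url).get_health_status()
    return jsonify(status), 200 if status['status'] == 'healthy' else 503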
Logging Configuration
# logging_config.py
import logging
import os
import sys

from pythonjsonlogger import jsonlogger

def setup_logging():
    """Configure structured logging"""
    # Create logger
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)

    # Create JSON formatter
    formatter = jsonlogger.JsonFormatter(
        '%(asctime)s %(name)s %(levelname)s %(message)s'
    )

    # Console handler
    console_handler = logging.StreamHandler(sys.stdout)
    console_handler.setFormatter(formatter)
    logger.addHandler(console_handler)

    # File handler for production
    if os.environ.get('NODE_ENV') == 'production':
        file_handler = logging.FileHandler('/var/log/universal-ai/app.log')
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)

    return logger
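Call it once at startup; fields passed via extra become keys in the JSON output:

# Usage sketch
logger = setup_logging()
logger.info('platform started', extra={'component': 'api', 'port': 8000})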
Security Best Practices
1. API Security
# security/auth.py
from functools import wraps

from flask import request, jsonify

def require_api_key(f):
    """Decorator to require API key authentication"""
    @wraps(f)
    def decorated_function(*args, **kwargs):
        auth_header = request.headers.get('Authorization')
        if not auth_header:
            return jsonify({'error': 'API key required'}), 401
        if not auth_header.startswith('Bearer '):
            return jsonify({'error': 'Invalid API key format'}), 401
        key = auth_header.split(' ', 1)[1]
        if not validate_api_key(key):
            return jsonify({'error': 'Invalid API key'}), 401
        return f(*args, **kwargs)
    return decorated_function

def validate_api_key(key):
    """Validate API key against database"""
    # Implement your key validation logic
    return True
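Applied to a route, the decorator rejects unauthenticated requests before the handler runs; a sketch with a hypothetical sessions endpoint:

@app.route('/api/v1/sessions', methods=['POST'])
@require_api_key
def create_session():
    return jsonify({'status': 'created'}), 201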
2. Input Validation
# security/validation.py
from functools import wraps

from flask import request, jsonify
from marshmallow import Schema, fields, ValidationError

# Adapters clients are allowed to request (example values; replace with your own)
ALLOWED_ADAPTERS = {'default', 'ecommerce', 'support'}

class AgentCreateSchema(Schema):
    instructions = fields.Str(required=True, validate=lambda x: len(x) <= 1000)
    capabilities = fields.List(
        fields.Str(validate=lambda x: x in ['text', 'voice', 'vision']),
        required=True
    )
    business_logic_adapter = fields.Str(validate=lambda x: x in ALLOWED_ADAPTERS)
    custom_settings = fields.Dict()

def validate_request(schema_class):
    """Decorator for request validation"""
    def decorator(f):
        @wraps(f)
        def decorated_function(*args, **kwargs):
            schema = schema_class()
            try:
                data = schema.load(request.json)
                request.validated_data = data
                return f(*args, **kwargs)
            except ValidationError as err:
                return jsonify({'error': 'Validation error', 'details': err.messages}), 400
        return decorated_function
    return decorator
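The decorators compose, with authentication first and validation second; a sketch with a hypothetical agents endpoint:

@app.route('/api/v1/agents', methods=['POST'])
@require_api_key
@validate_request(AgentCreateSchema)
def create_agent():
    config = request.validated_data  # deserialized and validated payload
    return jsonify({'status': 'created', 'config': config}), 201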
3. Rate Limiting
# security/rate_limiting.py
import time
from functools import wraps

import redis
from flask import request, jsonify

class RateLimiter:
    def __init__(self, redis_client):
        self.redis = redis_client

    def is_allowed(self, key, limit, window):
        """Check if request is within rate limit (sliding window over a sorted set)"""
        current_time = time.time()
        pipeline = self.redis.pipeline()
        # Remove expired entries
        pipeline.zremrangebyscore(key, 0, current_time - window)
        # Count current requests
        pipeline.zcard(key)
        # Add current request
        pipeline.zadd(key, {str(current_time): current_time})
        # Set expiration
        pipeline.expire(key, int(window))
        results = pipeline.execute()
        request_count = results[1]
        return request_count < limit

def rate_limit(limit=100, window=3600):
    """Rate limiting decorator"""
    def decorator(f):
        @wraps(f)
        def decorated_function(*args, **kwargs):
            client_ip = request.remote_addr
            api_key = request.headers.get('Authorization', '').split('Bearer ')[-1]
            key = f"rate_limit:{api_key or client_ip}"
            if not rate_limiter.is_allowed(key, limit, window):
                return jsonify({'error': 'Rate limit exceeded'}), 429
            return f(*args, **kwargs)
        return decorated_function
    return decorator
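The decorator above references a module-level rate_limiter. One way to wire it up and apply it, as a sketch (the endpoint is hypothetical; the limits mirror the production .env values):

# Shared limiter backed by the same Redis instance as the rest of the platform
import os
rate_limiter = RateLimiter(redis.from_url(os.environ.get('REDIS_URL', 'redis://localhost:6379')))

@app.route('/api/v1/messages', methods=['POST'])
@rate_limit(limit=1000, window=3600)  # 1000 requests per hour
def send_message():
    return jsonify({'status': 'queued'}), 202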
Scaling Strategies
Horizontal Scaling
- Load Balancing: Use nginx, HAProxy, or cloud load balancers
- Session Affinity: Implement sticky sessions or session storage
- Database Scaling: Read replicas, connection pooling
- Caching: Redis cluster for distributed caching
Performance Optimization
- Connection Pooling: Optimize database connections (see the pooling sketch after this list)
- Async Processing: Use Celery for background tasks
- CDN: Use CloudFlare or AWS CloudFront for static assets
- Compression: Enable gzip/brotli compression
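For connection pooling specifically, a small wrapper around psycopg2's ThreadedConnectionPool is often enough. A sketch, with the pool sized against the DATABASE_POOL_SIZE value from the production .env:

# db/pool.py -- shared PostgreSQL connection pool
import os
from contextlib import contextmanager

from psycopg2.pool import ThreadedConnectionPool

pool = ThreadedConnectionPool(minconn=2, maxconn=20, dsn=os.environ['DATABASE_URL'])

@contextmanager
def get_db_connection():
    """Borrow a pooled connection, committing on success and rolling back on error."""
    conn = pool.getconn()
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        pool.putconn(conn)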
Monitoring at Scale
- Distributed Tracing: Use Jaeger or Zipkin
- Centralized Logging: ELK stack or Fluentd
- Metrics Collection: Prometheus + Grafana
- Alerting: PagerDuty, OpsGenie integration
Next: Learn about Business Logic Adapters or check out the API Reference for detailed endpoint documentation.