You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
proxysql/doc/MCP/Architecture.md

461 lines
19 KiB

# MCP Architecture
This document describes the architecture of the MCP (Model Context Protocol) module in ProxySQL, including endpoint design and tool handler implementation.
## Overview
The MCP module implements JSON-RPC 2.0 over HTTPS for LLM (Large Language Model) integration with ProxySQL. It provides multiple endpoints, each designed to serve specific purposes while sharing a single HTTPS server.
### Key Concepts
- **MCP Endpoint**: A distinct HTTPS endpoint (e.g., `/mcp/config`, `/mcp/query`) that implements MCP protocol
- **Tool Handler**: A C++ class that implements specific tools available to LLMs
- **Tool Discovery**: Dynamic discovery via `tools/list` method (MCP protocol standard)
- **Endpoint Authentication**: Per-endpoint Bearer token authentication
- **Connection Pooling**: MySQL connection pooling for efficient database access
## Implemented Architecture
### Component Diagram
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ ProxySQL Process │
│ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ MCP_Threads_Handler │ │
│ │ - Configuration variables (mcp-*) │ │
│ │ - Status variables │ │
│ │ - mcp_server (ProxySQL_MCP_Server) │ │
│ │ - config_tool_handler (NEW) │ │
│ │ - query_tool_handler (NEW) │ │
│ │ - admin_tool_handler (NEW) │ │
│ │ - cache_tool_handler (NEW) │ │
│ │ - observe_tool_handler (NEW) │ │
│ │ - ai_tool_handler (NEW) │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ ProxySQL_MCP_Server │ │
│ │ (Single HTTPS Server) │ │
│ │ │ │
│ │ Port: mcp-port (default 6071) │ │
│ │ SSL: Uses ProxySQL's certificates │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌──────────────┬──────────────┼──────────────┬──────────────┬─────────┐ │
│ ▼ ▼ ▼ ▼ ▼ ▼ │
│ ┌────┐ ┌────┐ ┌────┐ ┌────┐ ┌────┐ ┌───┐│
│ │conf│ │obs │ │qry │ │adm │ │cach│ │ai ││
│ │TH │ │TH │ │TH │ │TH │ │TH │ │TH ││
│ └─┬──┘ └─┬──┘ └─┬──┘ └─┬──┘ └─┬──┘ └─┬─┘│
│ │ │ │ │ │ │ │
│ │ │ │ │ │ │ │
│ Tools: Tools: Tools: Tools: Tools: │ │
│ - get_config - list_ - list_ - admin_ - get_ │ │
│ - set_config stats schemas - set_ cache │ │
│ - reload - show_ - list_ - reload - set_ │ │
│ metrics tables - invalidate │ │
│ - query │ │
│ │ │
│ ┌────────────────────────────────────────────┐ │
│ │ MySQL Backend │ │
│ │ (Connection Pool) │ │
│ └────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
```
Where:
- `TH` = Tool Handler
### File Structure
```
include/
├── MCP_Thread.h # MCP_Threads_Handler class definition
├── MCP_Endpoint.h # MCP_JSONRPC_Resource class definition
├── MCP_Tool_Handler.h # Base class for all tool handlers
├── Config_Tool_Handler.h # Configuration endpoint tool handler
├── Query_Tool_Handler.h # Query endpoint tool handler (includes discovery tools)
├── Admin_Tool_Handler.h # Administration endpoint tool handler
├── Cache_Tool_Handler.h # Cache endpoint tool handler
├── Observe_Tool_Handler.h # Observability endpoint tool handler
├── AI_Tool_Handler.h # AI endpoint tool handler
├── Discovery_Schema.h # Discovery catalog implementation
├── Static_Harvester.h # Static database harvester for discovery
└── ProxySQL_MCP_Server.hpp # ProxySQL_MCP_Server class definition
lib/
├── MCP_Thread.cpp # MCP_Threads_Handler implementation
├── MCP_Endpoint.cpp # MCP_JSONRPC_Resource implementation
├── MCP_Tool_Handler.cpp # Base class implementation
├── Config_Tool_Handler.cpp # Configuration endpoint implementation
├── Query_Tool_Handler.cpp # Query endpoint implementation
├── Admin_Tool_Handler.cpp # Administration endpoint implementation
├── Cache_Tool_Handler.cpp # Cache endpoint implementation
├── Observe_Tool_Handler.cpp # Observability endpoint implementation
├── AI_Tool_Handler.cpp # AI endpoint implementation
├── Discovery_Schema.cpp # Discovery catalog implementation
├── Static_Harvester.cpp # Static database harvester implementation
└── ProxySQL_MCP_Server.cpp # HTTPS server implementation
```
### Request Flow (Implemented)
```
1. LLM Client → POST /mcp/{endpoint} → HTTPS Server (port 6071)
2. HTTPS Server → MCP_JSONRPC_Resource::render_POST()
3. MCP_JSONRPC_Resource → handle_jsonrpc_request()
4. Route based on JSON-RPC method:
- initialize/ping → Handled directly
- tools/list → handle_tools_list()
- tools/describe → handle_tools_describe()
- tools/call → handle_tools_call() → Dedicated Tool Handler
5. Dedicated Tool Handler → MySQL Backend (via connection pool)
6. Return JSON-RPC response
```
## Implemented Endpoint Specifications
### Overview
Each MCP endpoint has its own dedicated tool handler with specific tools designed for that endpoint's purpose. This allows for:
- **Specialized tools** - Different tools for different purposes
- **Isolated resources** - Separate connection pools per endpoint
- **Independent authentication** - Per-endpoint credentials
- **Clear separation of concerns** - Each endpoint has a well-defined purpose
### Endpoint Specifications
#### `/mcp/config` - Configuration Endpoint
**Purpose**: Runtime configuration and management of ProxySQL
**Tools**:
- `get_config` - Get current configuration values
- `set_config` - Modify configuration values
- `reload_config` - Reload configuration from disk/memory
- `list_variables` - List all available variables
- `get_status` - Get server status information
**Use Cases**:
- LLM assistants that need to configure ProxySQL
- Automated configuration management
- Dynamic tuning based on workload
**Authentication**: `mcp-config_endpoint_auth` (Bearer token)
---
#### `/mcp/observe` - Observability Endpoint
**Purpose**: Real-time metrics, statistics, and monitoring data
**Tools**:
- `list_stats` - List available statistics
- `get_stats` - Get specific statistics
- `show_connections` - Show active connections
- `show_queries` - Show query statistics
- `get_health` - Get health check information
- `show_metrics` - Show performance metrics
**Use Cases**:
- LLM assistants for monitoring and observability
- Automated alerting and health checks
- Performance analysis
**Authentication**: `mcp-observe_endpoint_auth` (Bearer token)
---
#### `/mcp/query` - Query Endpoint
**Purpose**: Safe database exploration and query execution
**Tools**:
- `list_schemas` - List databases
- `list_tables` - List tables in schema
- `describe_table` - Get table structure
- `get_constraints` - Get foreign keys and constraints
- `sample_rows` - Get sample data
- `run_sql_readonly` - Execute read-only SQL
- `explain_sql` - Explain query execution plan
- `suggest_joins` - Suggest join paths between tables
- `find_reference_candidates` - Find potential foreign key relationships
- `table_profile` - Get table statistics and data distribution
- `column_profile` - Get column statistics and data distribution
- `sample_distinct` - Get distinct values from a column
- `catalog_get` - Get entry from discovery catalog
- `catalog_upsert` - Insert or update entry in discovery catalog
- `catalog_delete` - Delete entry from discovery catalog
- `catalog_search` - Search entries in discovery catalog
- `catalog_list` - List all entries in discovery catalog
- `catalog_clear` - Clear all entries from discovery catalog
- `discovery.run_static` - Run static database discovery (Phase 1)
- `agent.*` - Agent coordination tools for discovery
- `llm.*` - LLM interaction tools for discovery
**Use Cases**:
- LLM assistants for database exploration
- Data analysis and discovery
- Query optimization assistance
- Two-phase discovery (static harvest + LLM analysis)
**Authentication**: `mcp-query_endpoint_auth` (Bearer token)
---
#### `/mcp/admin` - Administration Endpoint
**Purpose**: Administrative operations
**Tools**:
- `admin_list_users` - List MySQL users
- `admin_create_user` - Create MySQL user
- `admin_grant_permissions` - Grant permissions
- `admin_show_processes` - Show running processes
- `admin_kill_query` - Kill a running query
- `admin_flush_cache` - Flush various caches
- `admin_reload` - Reload users/servers
**Use Cases**:
- LLM assistants for administration tasks
- Automated user management
- Emergency operations
**Authentication**: `mcp-admin_endpoint_auth` (Bearer token, most restrictive)
---
#### `/mcp/cache` - Cache Endpoint
**Purpose**: Query cache management
**Tools**:
- `get_cache_stats` - Get cache statistics
- `invalidate_cache` - Invalidate cache entries
- `set_cache_ttl` - Set cache TTL
- `clear_cache` - Clear all cache
- `warm_cache` - Warm up cache with queries
- `get_cache_entries` - List cached queries
**Use Cases**:
- LLM assistants for cache optimization
- Automated cache management
- Performance tuning
**Authentication**: `mcp-cache_endpoint_auth` (Bearer token)
---
#### `/mcp/ai` - AI Endpoint
**Purpose**: AI and LLM features
**Tools**:
- `llm.query` - Query LLM with database context
- `llm.analyze` - Analyze data with LLM
- `llm.generate` - Generate content with LLM
- `anomaly.detect` - Detect anomalies in data
- `anomaly.list` - List detected anomalies
- `recommendation.get` - Get AI recommendations
**Use Cases**:
- LLM-powered data analysis
- Anomaly detection
- AI-driven recommendations
**Authentication**: `mcp-ai_endpoint_auth` (Bearer token)
### Tool Discovery Flow
MCP clients should discover available tools dynamically:
```
1. Client → POST /mcp/config → {"method": "tools/list", ...}
2. Server → {"result": {"tools": [
{"name": "get_config", "description": "..."},
{"name": "set_config", "description": "..."},
...
]}}
3. Client → POST /mcp/query → {"method": "tools/list", ...}
4. Server → {"result": {"tools": [
{"name": "list_schemas", "description": "..."},
{"name": "list_tables", "description": "..."},
...
]}}
```
**Example Discovery**:
```bash
# Discover tools on /mcp/query endpoint
curl -k -X POST https://127.0.0.1:6071/mcp/query \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_TOKEN" \
-d '{"jsonrpc": "2.0", "method": "tools/list", "id": 1}'
```
### Tool Handler Base Class
All tool handlers will inherit from a common base class:
```cpp
class MCP_Tool_Handler {
public:
virtual ~MCP_Tool_Handler() = default;
// Tool discovery
virtual json get_tool_list() = 0;
virtual json get_tool_description(const std::string& tool_name) = 0;
virtual json execute_tool(const std::string& tool_name, const json& arguments) = 0;
// Lifecycle
virtual int init() = 0;
virtual void close() = 0;
};
```
### Per-Endpoint Authentication
Each endpoint validates its own Bearer token. The implementation is complete and supports:
- **Bearer token** from `Authorization` header
- **Query parameter fallback** (`?token=xxx`) for simple testing
- **No authentication** when token is not configured (backward compatible)
```cpp
bool MCP_JSONRPC_Resource::authenticate_request(const http_request& req) {
// Get the expected auth token for this endpoint
char* expected_token = nullptr;
if (endpoint_name == "config") {
expected_token = handler->variables.mcp_config_endpoint_auth;
} else if (endpoint_name == "observe") {
expected_token = handler->variables.mcp_observe_endpoint_auth;
} else if (endpoint_name == "query") {
expected_token = handler->variables.mcp_query_endpoint_auth;
} else if (endpoint_name == "admin") {
expected_token = handler->variables.mcp_admin_endpoint_auth;
} else if (endpoint_name == "cache") {
expected_token = handler->variables.mcp_cache_endpoint_auth;
}
// If no auth token is configured, allow the request
if (!expected_token || strlen(expected_token) == 0) {
return true; // No authentication required
}
// Try to get Bearer token from Authorization header
std::string auth_header = req.get_header("Authorization");
if (auth_header.empty()) {
// Fallback: try getting from query parameter
const std::map<std::string, std::string, http::arg_comparator>& args = req.get_args();
auto it = args.find("token");
if (it != args.end()) {
auth_header = "Bearer " + it->second;
}
}
if (auth_header.empty()) {
return false; // No authentication provided
}
// Check if it's a Bearer token
const std::string bearer_prefix = "Bearer ";
if (auth_header.length() <= bearer_prefix.length() ||
auth_header.compare(0, bearer_prefix.length(), bearer_prefix) != 0) {
return false; // Invalid format
}
// Extract and validate token
std::string provided_token = auth_header.substr(bearer_prefix.length());
// Trim whitespace
size_t start = provided_token.find_first_not_of(" \t\n\r");
size_t end = provided_token.find_last_not_of(" \t\n\r");
if (start != std::string::npos && end != std::string::npos) {
provided_token = provided_token.substr(start, end - start + 1);
}
return (provided_token == expected_token);
}
```
**Status:****Implemented** (lib/MCP_Endpoint.cpp)
### Connection Pooling Strategy
Each tool handler manages its own connection pool:
```cpp
class Config_Tool_Handler : public MCP_Tool_Handler {
private:
std::vector<MYSQL*> config_connection_pool; // For ProxySQL admin
pthread_mutex_t pool_lock;
};
```
## Implementation Status
### Phase 1: Base Infrastructure ✅ COMPLETED
1. ✅ Create `MCP_Tool_Handler` base class
2. ✅ Create implementations for all 6 tool handlers (config, query, admin, cache, observe, ai)
3. ✅ Update `MCP_Threads_Handler` to manage all handlers
4. ✅ Update `ProxySQL_MCP_Server` to pass handlers to endpoints
### Phase 2: Tool Implementation ✅ COMPLETED
1. ✅ Implement Config_Tool_Handler tools
2. ✅ Implement Query_Tool_Handler tools (includes MySQL tools and discovery tools)
3. ✅ Implement Admin_Tool_Handler tools
4. ✅ Implement Cache_Tool_Handler tools
5. ✅ Implement Observe_Tool_Handler tools
6. ✅ Implement AI_Tool_Handler tools
### Phase 3: Authentication & Testing ✅ MOSTLY COMPLETED
1. ✅ Implement per-endpoint authentication
2. ⚠️ Update test scripts to use dynamic tool discovery
3. ⚠️ Add integration tests for each endpoint
4. ✅ Documentation updates (this document)
## Migration Status ✅ COMPLETED
### Backward Compatibility Maintained
The migration to multiple tool handlers has been completed while maintaining backward compatibility:
1. ✅ The existing `mysql_tool_handler` has been replaced by `query_tool_handler`
2. ✅ Existing tools continue to work on `/mcp/query`
3. ✅ New endpoints have been added incrementally
4. ✅ Deprecation warnings are provided for accessing tools on wrong endpoints
### Migration Steps Completed
```
✅ Step 1: Add new base class and stub handlers (no behavior change)
✅ Step 2: Implement /mcp/config endpoint (new functionality)
✅ Step 3: Move MySQL tools to /mcp/query (existing tools migrate)
✅ Step 4: Implement /mcp/admin (new functionality)
✅ Step 5: Implement /mcp/cache (new functionality)
✅ Step 6: Implement /mcp/observe (new functionality)
✅ Step 7: Enable per-endpoint auth
✅ Step 8: Add /mcp/ai endpoint (new AI functionality)
```
## Related Documentation
- [VARIABLES.md](VARIABLES.md) - Configuration variables reference
- [README.md](README.md) - Module overview and setup
## Version
- **MCP Thread Version:** 0.1.0
- **Architecture Version:** 1.0 (design document)
- **Last Updated:** 2026-01-19