LLM Bridge - Generic LLM Access for ProxySQL
Overview
LLM Bridge is a ProxySQL feature that provides generic access to Large Language Models (LLMs) through the MySQL protocol. It allows you to send any prompt to an LLM and receive the response as a MySQL resultset.
Note: This feature was previously called "NL2SQL" (Natural Language to SQL) but has been converted to a generic LLM bridge. Future NL2SQL functionality will be implemented as a Web UI using external agents (Claude Code + MCP server).
Features
- Generic Provider Support: Works with any OpenAI-compatible or Anthropic-compatible endpoint
- Semantic Caching: Vector-based cache for similar prompts using sqlite-vec
- Multi-Provider: Switch between LLM providers seamlessly
- Versatile: Use LLMs for summarization, code generation, translation, analysis, etc.
Supported Endpoints:
- Ollama (via the OpenAI-compatible /v1/chat/completions endpoint)
- OpenAI
- Anthropic
- vLLM
- LM Studio
- Z.ai
- Any other OpenAI-compatible or Anthropic-compatible endpoint
Quick Start
1. Enable LLM Bridge
-- Via admin interface
SET genai-llm_enabled='true';
LOAD GENAI VARIABLES TO RUNTIME;
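To double-check that the flag took effect, you can query the runtime variables from the Admin interface. This assumes the GenAI variables are exposed through the standard runtime_global_variables table, like other ProxySQL variables:
SELECT variable_name, variable_value
FROM runtime_global_variables
WHERE variable_name = 'genai-llm_enabled';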
2. Configure LLM Provider
ProxySQL uses a generic provider configuration that supports any OpenAI-compatible or Anthropic-compatible endpoint.
Using Ollama (default):
Ollama is used via its OpenAI-compatible endpoint:
SET genai-llm_provider='openai';
SET genai-llm_provider_url='http://localhost:11434/v1/chat/completions';
SET genai-llm_provider_model='llama3.2';
SET genai-llm_provider_key=''; -- Empty for local Ollama
LOAD GENAI VARIABLES TO RUNTIME;
Using OpenAI:
SET genai-llm_provider='openai';
SET genai-llm_provider_url='https://api.openai.com/v1/chat/completions';
SET genai-llm_provider_model='gpt-4';
SET genai-llm_provider_key='sk-...'; -- Your OpenAI API key
LOAD GENAI VARIABLES TO RUNTIME;
Using Anthropic:
SET genai-llm_provider='anthropic';
SET genai-llm_provider_url='https://api.anthropic.com/v1/messages';
SET genai-llm_provider_model='claude-3-opus-20240229';
SET genai-llm_provider_key='sk-ant-...'; -- Your Anthropic API key
LOAD GENAI VARIABLES TO RUNTIME;
Using any OpenAI-compatible endpoint:
This works with any OpenAI-compatible API (vLLM, LM Studio, Z.ai, etc.):
SET genai-llm_provider='openai';
SET genai-llm_provider_url='https://your-endpoint.com/v1/chat/completions';
SET genai-llm_provider_model='your-model-name';
SET genai-llm_provider_key='your-api-key'; -- Or leave empty for local endpoints
LOAD GENAI VARIABLES TO RUNTIME;
3. Use the LLM Bridge
Once configured, you can send prompts using the /* LLM: */ prefix:
-- Summarize text
mysql> /* LLM: */ Summarize the customer feedback from last week
-- Explain SQL queries
mysql> /* LLM: */ Explain this query: SELECT COUNT(*) FROM users WHERE active = 1
-- Generate code
mysql> /* LLM: */ Generate a Python function to validate email addresses
-- Translate text
mysql> /* LLM: */ Translate "Hello world" to Spanish
-- Analyze data
mysql> /* LLM: */ Analyze the following sales data and provide insights
Important: LLM queries are executed in the MySQL module (your regular SQL client), not in the ProxySQL Admin interface. The Admin interface is only for configuration.
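As a quick illustration of that split (assuming ProxySQL's default listener ports), configuration statements go to the Admin interface, while prompts are sent through the regular MySQL interface:
-- Admin interface (default port 6032): configuration only
SET genai-llm_enabled='true';
LOAD GENAI VARIABLES TO RUNTIME;
-- MySQL interface (default port 6033): prompts from a regular SQL client
/* LLM: */ Explain the difference between INNER JOIN and LEFT JOIN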
Response Format
The LLM Bridge returns a resultset with the following columns:
| Column | Description |
|---|---|
| text_response | The LLM's text response |
| explanation | Which model/provider generated the response |
| cached | Whether the response was from cache (true/false) |
| provider | The provider used (openai/anthropic) |
Configuration Variables
| Variable | Default | Description |
|---|---|---|
| genai-llm_enabled | false | Master enable for LLM bridge |
| genai-llm_provider | openai | Provider type (openai/anthropic) |
| genai-llm_provider_url | http://localhost:11434/v1/chat/completions | LLM endpoint URL |
| genai-llm_provider_model | llama3.2 | Model name |
| genai-llm_provider_key | (empty) | API key (optional for local) |
| genai-llm_cache_enabled | true | Enable semantic cache |
| genai-llm_cache_similarity_threshold | 85 | Cache similarity threshold (0-100) |
| genai-llm_timeout_ms | 30000 | Request timeout in milliseconds |
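A change made with SET is not active until it is loaded to runtime. To spot unloaded changes, you can compare the configured and runtime values; this sketch assumes the GenAI variables appear in the standard global_variables and runtime_global_variables admin tables:
SELECT g.variable_name,
       g.variable_value AS configured,
       r.variable_value AS runtime
FROM global_variables g
JOIN runtime_global_variables r USING (variable_name)
WHERE g.variable_name LIKE 'genai-llm_%';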
Request Configuration (Advanced)
When using LLM bridge programmatically, you can configure retry behavior:
| Parameter | Default | Description |
|---|---|---|
| max_retries | 3 | Maximum retry attempts for transient failures |
| retry_backoff_ms | 1000 | Initial backoff in milliseconds |
| retry_multiplier | 2.0 | Backoff multiplier for exponential backoff |
| retry_max_backoff_ms | 30000 | Maximum backoff in milliseconds |
| allow_cache | true | Enable semantic cache lookup |
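For example, assuming the delay before retry n is retry_backoff_ms * retry_multiplier^(n-1), capped at retry_max_backoff_ms, the defaults above yield waits of roughly 1000 ms, 2000 ms, and 4000 ms across the three retry attempts before the request is given up.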
Error Handling
LLM Bridge provides structured error information to help diagnose issues:
| Error Code | Description | HTTP Status |
|---|---|---|
| ERR_API_KEY_MISSING | API key not configured | N/A |
| ERR_API_KEY_INVALID | API key format is invalid | N/A |
| ERR_TIMEOUT | Request timed out | N/A |
| ERR_CONNECTION_FAILED | Network connection failed | 0 |
| ERR_RATE_LIMITED | Rate limited by provider | 429 |
| ERR_SERVER_ERROR | Server error | 500-599 |
| ERR_EMPTY_RESPONSE | Empty response from LLM | N/A |
| ERR_INVALID_RESPONSE | Malformed response from LLM | N/A |
| ERR_VALIDATION_FAILED | Input validation failed | N/A |
| ERR_UNKNOWN_PROVIDER | Invalid provider name | N/A |
| ERR_REQUEST_TOO_LARGE | Request exceeds size limit | 413 |
Result Fields:
- error_code: Structured error code (e.g., "ERR_API_KEY_MISSING")
- error_details: Detailed error context with query, provider, and URL
- http_status_code: HTTP status code if applicable
- provider_used: Which provider was attempted
Request Correlation
Each LLM request generates a unique request ID for log correlation:
LLM [a1b2c3d4-e5f6-7890-abcd-ef1234567890]: REQUEST url=http://... model=llama3.2
LLM [a1b2c3d4-e5f6-7890-abcd-ef1234567890]: RESPONSE status=200 duration_ms=1234
This allows tracing a single request through all log lines for debugging.
Use Cases
1. Text Summarization
/* LLM: */ Summarize this text: [long text...]
2. Code Generation
/* LLM: */ Write a Python function to check if a number is prime
/* LLM: */ Generate a SQL query to find duplicate users
3. Query Explanation
/* LLM: */ Explain what this query does: SELECT * FROM orders WHERE status = 'pending'
/* LLM: */ Why is this query slow: SELECT * FROM users JOIN orders ON...
4. Data Analysis
/* LLM: */ Analyze this CSV data and identify trends: [data...]
/* LLM: */ What insights can you derive from these sales figures?
5. Translation
/* LLM: */ Translate "Good morning" to French, German, and Spanish
/* LLM: */ Convert this SQL query to PostgreSQL dialect
6. Documentation
/* LLM: */ Write documentation for this function: [code...]
/* LLM: */ Generate API documentation for the users endpoint
7. Code Review
/* LLM: */ Review this code for security issues: [code...]
/* LLM: */ Suggest optimizations for this query
Examples
Basic Usage
-- Get a summary
mysql> /* LLM: */ What is machine learning?
-- Generate code
mysql> /* LLM: */ Write a function to calculate fibonacci numbers in JavaScript
-- Explain concepts
mysql> /* LLM: */ Explain the difference between INNER JOIN and LEFT JOIN
Complex Prompts
-- Multi-step reasoning
mysql> /* LLM: */ Analyze the performance implications of using VARCHAR(255) vs TEXT in MySQL
-- Code with specific requirements
mysql> /* LLM: */ Write a Python script that reads a CSV file, filters rows where amount > 100, and outputs to JSON
-- Technical documentation
mysql> /* LLM: */ Create API documentation for a user registration endpoint with validation rules
Results
LLM Bridge returns a resultset with:
| Column | Type | Description |
|---|---|---|
| text_response | TEXT | LLM's text response |
| explanation | TEXT | Which model was used |
| cached | BOOLEAN | Whether from semantic cache |
| error_code | TEXT | Structured error code (if error) |
| error_details | TEXT | Detailed error context (if error) |
| http_status_code | INT | HTTP status code (if applicable) |
| provider | TEXT | Which provider was used |
Example successful response:
+----------------------------------------------------------+-----------------------+--------+----------+
| text_response                                            | explanation           | cached | provider |
+----------------------------------------------------------+-----------------------+--------+----------+
| Machine learning is a subset of artificial intelligence  | Generated by llama3.2 | 0      | openai   |
| that enables systems to learn from data...               |                       |        |          |
+----------------------------------------------------------+-----------------------+--------+----------+
Example error response:
+-----------------------------------------------------------------------+
| text_response |
+-----------------------------------------------------------------------+
| -- LLM processing failed |
| |
| error_code: ERR_API_KEY_MISSING |
| error_details: LLM processing failed: |
| Query: What is machine learning? |
| Provider: openai |
| URL: https://api.openai.com/v1/chat/completions |
| Error: API key not configured |
| |
| http_status_code: 0 |
| provider_used: openai |
+-----------------------------------------------------------------------+
Troubleshooting
LLM Bridge returns empty result
1. Check that the GenAI module is enabled and the variables are loaded to runtime (this assumes the genai-* variables are exposed through the standard runtime_global_variables admin table):
   SELECT * FROM runtime_global_variables WHERE variable_name LIKE 'genai-%';
2. Verify the LLM is accessible:
   # For Ollama
   curl http://localhost:11434/api/tags
   # For cloud APIs, check your API keys
3. Check the logs using the request ID:
   # Find all log lines for a specific request
   tail -f proxysql.log | grep "LLM \[a1b2c3d4"
4. Check error details:
   - Review error_code for the structured error type
   - Review error_details for full context, including query, provider, and URL
   - Review http_status_code for HTTP-level errors (429 = rate limit, 500+ = server error)
Retry Behavior
LLM Bridge automatically retries on transient failures:
- Rate limiting (HTTP 429): Retries with exponential backoff
- Server errors (500-504): Retries with exponential backoff
- Network errors: Retries with exponential backoff
Default retry behavior:
- Maximum retries: 3
- Initial backoff: 1000ms
- Multiplier: 2.0x
- Maximum backoff: 30000ms
Log output during retry:
LLM [request-id]: ERROR phase=llm error=Empty response status=0
LLM [request-id]: Retryable error (status=0), retrying in 1000ms (attempt 1/4)
LLM [request-id]: Request succeeded after 1 retries
Slow Responses
1. Try a different model:
   SET genai-llm_provider_model='llama3.2'; -- Faster than GPT-4
   LOAD GENAI VARIABLES TO RUNTIME;
2. Use local Ollama for faster responses:
   SET genai-llm_provider_url='http://localhost:11434/v1/chat/completions';
   LOAD GENAI VARIABLES TO RUNTIME;
3. Increase the timeout for complex prompts:
   SET genai-llm_timeout_ms=60000;
   LOAD GENAI VARIABLES TO RUNTIME;
Cache Issues
-- Check cache stats
SHOW STATUS LIKE 'llm_%';
-- Cache is automatically managed based on semantic similarity
-- Adjust similarity threshold if needed
SET genai-llm_cache_similarity_threshold=80; -- Lower = more matches
LOAD GENAI VARIABLES TO RUNTIME;
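If stale answers are a concern, a minimal option is to disable the cache entirely via the genai-llm_cache_enabled variable listed in the configuration table above (value format assumed to match the other boolean variables):
SET genai-llm_cache_enabled='false';
LOAD GENAI VARIABLES TO RUNTIME;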
Status Variables
Monitor LLM bridge usage:
SELECT * FROM stats_mysql_global WHERE variable_name LIKE 'llm_%';
Available status variables:
- llm_total_requests - Total number of LLM requests
- llm_cache_hits - Number of cache hits
- llm_cache_misses - Number of cache misses
- llm_local_model_calls - Calls to local models
- llm_cloud_model_calls - Calls to cloud APIs
- llm_total_response_time_ms - Total response time
- llm_cache_total_lookup_time_ms - Total cache lookup time
- llm_cache_total_store_time_ms - Total cache store time
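For example, the counters needed to estimate the cache hit rate can be pulled with a filtered version of the query above:
SELECT variable_name, variable_value
FROM stats_mysql_global
WHERE variable_name IN ('llm_total_requests', 'llm_cache_hits', 'llm_cache_misses');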
Performance
| Operation | Typical Latency |
|---|---|
| Local Ollama | ~1-2 seconds |
| Cloud API | ~2-5 seconds |
| Cache hit | < 50ms |
Tips for better performance:
- Use local Ollama for faster responses
- Enable caching for repeated prompts
- Use genai-llm_timeout_ms to limit wait time
- Consider pre-warming the cache with common prompts (see the sketch below)
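A minimal pre-warming sketch: issue the prompts you expect to be common once, so that later, semantically similar prompts can be served from the cache (the prompt wording here is illustrative only):
/* LLM: */ Explain the difference between INNER JOIN and LEFT JOIN
/* LLM: */ Summarize yesterday's error log entries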
Migration from NL2SQL
If you were using the old /* NL2SQL: */ prefix:
- Update your queries from /* NL2SQL: */ to /* LLM: */
- Update configuration variables from genai-nl2sql_* to genai-llm_*
- Note that the response format has changed:
  - Removed: sql_query, confidence columns
  - Added: text_response, provider columns
- The ai_nl2sql_convert MCP tool is deprecated and will return an error
Old NL2SQL Usage:
/* NL2SQL: */ Show top 10 customers by revenue
-- Returns: sql_query, confidence, explanation, cached
New LLM Bridge Usage:
/* LLM: */ Show top 10 customers by revenue
-- Returns: text_response, explanation, cached, provider
For true NL2SQL functionality (schema-aware SQL generation with iteration), consider using external agents that can:
- Analyze your database schema
- Iterate on query refinement
- Validate generated queries
- Execute and review results
Security
Important Notes
- LLM responses are NOT executed automatically
- Text responses are returned for review
- Always validate generated code before execution
- Keep API keys secure (use environment variables)
Best Practices
- Review generated code: Always check output before running
- Use read-only accounts: Test with limited permissions first
- Keep API keys secure: Don't commit them to version control
- Use caching wisely: Balance speed vs. data freshness
- Monitor usage: Check status variables regularly
API Reference
For complete API documentation, see API.md.
Architecture
For system architecture details, see ARCHITECTURE.md.
Testing
For testing information, see TESTING.md.
License
This feature is part of ProxySQL and follows the same license.