# NL2SQL - Natural Language to SQL for ProxySQL

## Overview

NL2SQL (Natural Language to SQL) is a ProxySQL feature that converts natural language questions into SQL queries using Large Language Models (LLMs).

## Features

- **Hybrid Deployment**: Local Ollama + Cloud APIs (OpenAI, Anthropic)
- **Semantic Caching**: Vector-based cache for similar queries using sqlite-vec
- **Schema Awareness**: Understands your database schema for better conversions
- **Multi-Provider**: Switch between LLM providers seamlessly
- **Security**: Generated SQL is returned for review before execution

## Quick Start

### 1. Enable NL2SQL

```sql
-- Via admin interface
SET ai_nl2sql_enabled='true';
LOAD MYSQL VARIABLES TO RUNTIME;
```

### 2. Configure LLM Provider

**Using local Ollama (default):**

```sql
SET ai_nl2sql_model_provider='ollama';
SET ai_nl2sql_ollama_model='llama3.2';
LOAD MYSQL VARIABLES TO RUNTIME;
```

**Using OpenAI:**

```sql
SET ai_nl2sql_model_provider='openai';
SET ai_nl2sql_openai_model='gpt-4o-mini';
SET ai_nl2sql_openai_key='sk-...';
LOAD MYSQL VARIABLES TO RUNTIME;
```

**Using Anthropic:**

```sql
SET ai_nl2sql_model_provider='anthropic';
SET ai_nl2sql_anthropic_model='claude-3-haiku';
SET ai_nl2sql_anthropic_key='sk-ant-...';
LOAD MYSQL VARIABLES TO RUNTIME;
```

### 3. Use NL2SQL

```sql
-- On the admin interface, confirm that NL2SQL is enabled
SELECT * FROM runtime_global_variables WHERE variable_name='ai_nl2sql_enabled';

-- In your SQL client, prefix your question with "NL2SQL:" to have it converted to SQL
NL2SQL: Show top 10 customers by revenue;
```

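Client code that sends questions through ProxySQL only has to prepend the configured prefix. The helper below is a minimal sketch of that step. `build_nl2sql_statement` is a hypothetical name, not part of ProxySQL, and the default prefix is taken from the documented default of `ai_nl2sql_query_prefix`.

```python
DEFAULT_PREFIX = "NL2SQL:"  # documented default of ai_nl2sql_query_prefix


def build_nl2sql_statement(question: str, prefix: str = DEFAULT_PREFIX) -> str:
    """Return the statement to send for a plain-English question (hypothetical helper)."""
    # Collapse newlines and repeated spaces, then drop any trailing semicolons.
    text = " ".join(question.split()).rstrip(";").rstrip()
    if not text:
        raise ValueError("question must not be empty")
    # Emit exactly one prefix and one terminating semicolon.
    return f"{prefix} {text};"


if __name__ == "__main__":
    print(build_nl2sql_statement("Show top 10 customers\n  by revenue;"))
```

The returned string can be passed to any MySQL client library as an ordinary query.
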
## Configuration

### Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `ai_nl2sql_enabled` | true | Enable/disable NL2SQL |
| `ai_nl2sql_query_prefix` | NL2SQL: | Prefix for NL2SQL queries |
| `ai_nl2sql_model_provider` | ollama | LLM provider (ollama/openai/anthropic) |
| `ai_nl2sql_ollama_model` | llama3.2 | Ollama model name |
| `ai_nl2sql_openai_model` | gpt-4o-mini | OpenAI model name |
| `ai_nl2sql_anthropic_model` | claude-3-haiku | Anthropic model name |
| `ai_nl2sql_cache_similarity_threshold` | 85 | Semantic similarity threshold (0-100) |
| `ai_nl2sql_timeout_ms` | 30000 | LLM request timeout in milliseconds |
| `ai_nl2sql_prefer_local` | true | Prefer local models when possible |

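To build intuition for `ai_nl2sql_cache_similarity_threshold`, the sketch below treats the score as cosine similarity between two embedding vectors, scaled to 0-100. That mapping is an assumption made for illustration; this document does not state how ProxySQL computes the score, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_cache_hit(query_vec, cached_vec, threshold=85):
    """Assumed rule: reuse the cached SQL when similarity * 100 reaches the threshold."""
    return cosine_similarity(query_vec, cached_vec) * 100 >= threshold


if __name__ == "__main__":
    print(is_cache_hit([1.0, 0.0], [0.9, 0.1]))  # nearly parallel vectors
    print(is_cache_hit([1.0, 1.0], [1.0, 0.0]))  # 45 degrees apart
```

Under this reading, raising the threshold makes the cache stricter (fewer hits, less risk of reusing SQL for a question that only looks similar), and lowering it does the opposite.
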
### Model Selection

The system automatically selects the best model based on:

1. **Latency requirements**: Local Ollama for fast queries (< 500ms)
2. **API key availability**: Falls back to Ollama if the API key for the chosen cloud provider is missing
3. **User preference**: Respects the `ai_nl2sql_model_provider` setting

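The three criteria above can be pictured as a small decision function. This is a sketch only: the order in which the checks run, the `latency_budget_ms` parameter, and the function name are assumptions for illustration, not ProxySQL's actual selection code.

```python
CLOUD_PROVIDERS = {"openai", "anthropic"}
KNOWN_PROVIDERS = CLOUD_PROVIDERS | {"ollama"}


def choose_provider(preferred, api_keys, latency_budget_ms=None):
    """Pick a provider name following the three documented criteria (assumed order)."""
    if preferred not in KNOWN_PROVIDERS:
        raise ValueError(f"unknown provider: {preferred!r}")
    # 1. Latency requirements: a tight budget (< 500 ms) favors local Ollama.
    if latency_budget_ms is not None and latency_budget_ms < 500:
        return "ollama"
    # 2. API key availability: a cloud provider without a key falls back to Ollama.
    if preferred in CLOUD_PROVIDERS and not api_keys.get(preferred):
        return "ollama"
    # 3. User preference: otherwise honor the configured provider.
    return preferred


if __name__ == "__main__":
    print(choose_provider("openai", {"openai": "sk-..."}))
    print(choose_provider("anthropic", {}))
```
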
## Examples

### Basic Queries

```
NL2SQL: Show all users
NL2SQL: Find orders with amount > 100
NL2SQL: Count customers by country
```

### Complex Queries

```
NL2SQL: Show top 5 customers by total order amount
NL2SQL: Find customers who placed orders in the last 30 days
NL2SQL: What is the average order value per month?
```

### Schema-Aware Queries

```
-- Switch to your schema first
USE my_database;
NL2SQL: List all products in the Electronics category
NL2SQL: Find orders that contain specific products
```

### Results

NL2SQL returns a result set with the following columns:
- `sql_query`: Generated SQL
- `confidence`: Score from 0.0 to 1.0
- `explanation`: Which model was used
- `cached`: Whether the result came from the semantic cache

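On the client side, it can help to turn a result row into a typed object before acting on it. The sketch below assumes the row arrives as a dictionary of strings keyed by the column names listed above and that `cached` is encoded as `1`/`0` or `true`/`false`. Both points are assumptions, and `NL2SQLResult` and `parse_result_row` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NL2SQLResult:
    sql_query: str
    confidence: float
    explanation: str
    cached: bool


def parse_result_row(row):
    """Convert one result row (dict of column name to value) into an NL2SQLResult."""
    confidence = float(row["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    # Assumed encodings for the boolean column; anything else counts as False.
    cached = str(row["cached"]).strip().lower() in {"1", "true", "yes"}
    return NL2SQLResult(
        sql_query=str(row["sql_query"]),
        confidence=confidence,
        explanation=str(row["explanation"]),
        cached=cached,
    )


if __name__ == "__main__":
    row = {"sql_query": "SELECT 1", "confidence": "0.82",
           "explanation": "ollama", "cached": "0"}
    print(parse_result_row(row))
```
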
## Troubleshooting

### NL2SQL returns empty result

1. Check that the AI module is initialized:
```sql
SELECT * FROM runtime_global_variables WHERE variable_name LIKE 'ai_%';
```

2. Verify that the LLM is accessible:
```bash
# For Ollama
curl http://localhost:11434/api/tags

# For cloud APIs, check your API keys
```

3. Check the logs:
```bash
tail -f proxysql.log | grep NL2SQL
```

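For step 2, a reachable Ollama server is not enough if the configured model has not been pulled. The sketch below checks the JSON body returned by `/api/tags` for a model name. It assumes the response has a top-level `models` list whose entries carry a `name` such as `llama3.2:latest`; verify that shape against your Ollama version. `has_model` is a hypothetical helper.

```python
import json


def has_model(tags_json, wanted):
    """Return True if the /api/tags response body lists the wanted model."""
    payload = json.loads(tags_json)
    names = [m.get("name", "") for m in payload.get("models", [])]
    # Names are assumed to look like "model:tag"; accept a match with or without the tag.
    return any(n == wanted or n.split(":", 1)[0] == wanted for n in names)


if __name__ == "__main__":
    sample = '{"models": [{"name": "llama3.2:latest"}]}'  # illustrative payload
    print(has_model(sample, "llama3.2"))
```

The response body can be captured with the `curl` command from step 2 and passed to the function as a string.
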
### Poor quality SQL

1. **Try a different model:**
```sql
SET ai_nl2sql_ollama_model='llama3.3';
LOAD MYSQL VARIABLES TO RUNTIME;
```

2. **Increase timeout for complex queries:**
```sql
SET ai_nl2sql_timeout_ms=60000;
LOAD MYSQL VARIABLES TO RUNTIME;
```

3. **Check confidence score:**
- High confidence (> 0.7): Generally reliable
- Medium confidence (0.4-0.7): Review before using
- Low confidence (< 0.4): May need manual correction

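The confidence bands above translate directly into a triage function that a client could apply to each result. This is a minimal sketch: the function name and the returned labels are hypothetical, and the boundary values 0.4 and 0.7 are placed in the medium band, which the text leaves open.

```python
def triage(confidence):
    """Map a 0.0-1.0 confidence score to the handling suggested by the bands above."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    if confidence > 0.7:
        return "reliable"           # generally reliable
    if confidence >= 0.4:
        return "review"             # review before using
    return "manual-correction"      # may need manual correction


if __name__ == "__main__":
    for score in (0.9, 0.55, 0.2):
        print(score, triage(score))
```
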
### Cache Issues

```sql
-- Clear cache (Phase 3 feature)
-- TODO: Add cache clearing command

-- Check cache stats
SELECT * FROM stats_ai_nl2sql_cache;
```

## Performance

| Operation | Typical Latency |
|-----------|-----------------|
| Local Ollama | ~1-2 seconds |
| Cloud API | ~2-5 seconds |
| Cache hit | < 50ms |

**Tips for better performance:**
- Use local Ollama for faster responses
- Enable caching for repeated queries
- Use `ai_nl2sql_timeout_ms` to limit wait time
- Consider pre-warming cache with common queries

## Security

### Important Notes

- NL2SQL queries are **NOT executed automatically**
- Generated SQL is returned for **review first**
- Always validate generated SQL before execution
- Keep API keys secure (use environment variables)

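One way to follow the last note is to read the key from the environment at provisioning time and generate the admin statement from it, so the key never appears in a script or repository. The sketch below assumes the key lives in an environment variable named `OPENAI_API_KEY` (a common convention, not something ProxySQL requires) and that single quotes are escaped by doubling. `openai_key_statement` is a hypothetical helper.

```python
import os


def openai_key_statement(env=os.environ, var="OPENAI_API_KEY"):
    """Build the admin SET statement for the OpenAI key from an environment mapping."""
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(f"environment variable {var} is not set")
    escaped = key.replace("'", "''")  # standard SQL escaping for single quotes
    return f"SET ai_nl2sql_openai_key='{escaped}';"


if __name__ == "__main__":
    # A fake key is passed explicitly here so the example runs anywhere.
    print(openai_key_statement({"OPENAI_API_KEY": "sk-example"}))
```

Send the resulting statement over the admin connection, followed by `LOAD MYSQL VARIABLES TO RUNTIME;`, and avoid logging it.
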
### Best Practices

1. **Review generated SQL**: Always check the output before running
2. **Use read-only accounts**: Test with limited permissions first
3. **Monitor confidence scores**: Low confidence may indicate errors
4. **Keep API keys secure**: Don't commit them to version control
5. **Use caching wisely**: Balance speed vs. data freshness

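Manual review can be backed by an automated pre-check that rejects anything that is obviously not a single read-only statement. The function below is a deliberately conservative heuristic written for illustration. It is not a SQL parser, it is not part of ProxySQL, and it does not replace a read-only database account.

```python
import re

# Keywords that indicate a write or DDL statement. Matching is on whole words,
# so a harmless string literal containing "update" is also rejected (by design).
_FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|truncate|create|grant|replace)\b",
    re.IGNORECASE,
)


def looks_read_only(sql):
    """Return True only for a single statement that starts with SELECT or WITH."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        return False  # more than one statement
    if not re.match(r"(?i)(select|with)\b", statement):
        return False
    return _FORBIDDEN.search(statement) is None


if __name__ == "__main__":
    print(looks_read_only("SELECT name FROM customers LIMIT 10;"))
    print(looks_read_only("SELECT 1; DROP TABLE customers"))
```

False rejections are the intended failure mode here: a query that trips the check goes back to a human instead of being executed.
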
## API Reference

For complete API documentation, see [API.md](API.md).

## Architecture

For system architecture details, see [ARCHITECTURE.md](ARCHITECTURE.md).

## Testing

For testing information, see [TESTING.md](TESTING.md).

## Version History

- **0.1.0** (2025-01-16): Initial release with Ollama, OpenAI, Anthropic support

## License

This feature is part of ProxySQL and follows the same license.