diff --git a/doc/NL2SQL/API.md b/doc/NL2SQL/API.md new file mode 100644 index 000000000..394baec5d --- /dev/null +++ b/doc/NL2SQL/API.md @@ -0,0 +1,438 @@ +# NL2SQL API Reference + +## Complete API Documentation + +This document provides a comprehensive reference for all NL2SQL APIs, including configuration variables, data structures, and methods. + +## Table of Contents + +- [Configuration Variables](#configuration-variables) +- [Data Structures](#data-structures) +- [NL2SQL_Converter Class](#nl2sql_converter-class) +- [AI_Features_Manager Class](#ai_features_manager-class) +- [MySQL Protocol Integration](#mysql-protocol-integration) + +## Configuration Variables + +All NL2SQL variables use the `ai_nl2sql_` prefix and are accessible via the ProxySQL admin interface. + +### Master Switch + +#### `ai_nl2sql_enabled` + +- **Type**: Boolean +- **Default**: `true` +- **Description**: Enable/disable NL2SQL feature +- **Runtime**: Yes +- **Example**: + ```sql + SET ai_nl2sql_enabled='true'; + LOAD MYSQL VARIABLES TO RUNTIME; + ``` + +### Query Detection + +#### `ai_nl2sql_query_prefix` + +- **Type**: String +- **Default**: `NL2SQL:` +- **Description**: Prefix that identifies NL2SQL queries +- **Runtime**: Yes +- **Example**: + ```sql + SET ai_nl2sql_query_prefix='SQL:'; + -- Now use: SQL: Show customers + ``` + +### Model Selection + +#### `ai_nl2sql_model_provider` + +- **Type**: Enum (`ollama`, `openai`, `anthropic`) +- **Default**: `ollama` +- **Description**: Preferred LLM provider +- **Runtime**: Yes +- **Example**: + ```sql + SET ai_nl2sql_model_provider='openai'; + LOAD MYSQL VARIABLES TO RUNTIME; + ``` + +#### `ai_nl2sql_ollama_model` + +- **Type**: String +- **Default**: `llama3.2` +- **Description**: Ollama model name +- **Runtime**: Yes +- **Example**: + ```sql + SET ai_nl2sql_ollama_model='llama3.3'; + ``` + +#### `ai_nl2sql_openai_model` + +- **Type**: String +- **Default**: `gpt-4o-mini` +- **Description**: OpenAI model name +- **Runtime**: Yes +- 
**Example**:
+  ```sql
+  SET ai_nl2sql_openai_model='gpt-4o';
+  ```
+
+#### `ai_nl2sql_anthropic_model`
+
+- **Type**: String
+- **Default**: `claude-3-haiku`
+- **Description**: Anthropic model name
+- **Runtime**: Yes
+- **Example**:
+  ```sql
+  SET ai_nl2sql_anthropic_model='claude-3-5-sonnet-20241022';
+  ```
+
+### API Keys
+
+#### `ai_nl2sql_openai_key`
+
+- **Type**: String (sensitive)
+- **Default**: NULL
+- **Description**: OpenAI API key
+- **Runtime**: Yes
+- **Example**:
+  ```sql
+  SET ai_nl2sql_openai_key='sk-proj-...';
+  ```
+
+#### `ai_nl2sql_anthropic_key`
+
+- **Type**: String (sensitive)
+- **Default**: NULL
+- **Description**: Anthropic API key
+- **Runtime**: Yes
+- **Example**:
+  ```sql
+  SET ai_nl2sql_anthropic_key='sk-ant-...';
+  ```
+
+### Cache Configuration
+
+#### `ai_nl2sql_cache_similarity_threshold`
+
+- **Type**: Integer (0-100)
+- **Default**: `85`
+- **Description**: Minimum similarity score for cache hit
+- **Runtime**: Yes
+- **Example**:
+  ```sql
+  SET ai_nl2sql_cache_similarity_threshold='90';
+  ```
+
+### Performance
+
+#### `ai_nl2sql_timeout_ms`
+
+- **Type**: Integer
+- **Default**: `30000` (30 seconds)
+- **Description**: Maximum time to wait for LLM response
+- **Runtime**: Yes
+- **Example**:
+  ```sql
+  SET ai_nl2sql_timeout_ms='60000';
+  ```
+
+### Routing
+
+#### `ai_nl2sql_prefer_local`
+
+- **Type**: Boolean
+- **Default**: `true`
+- **Description**: Prefer local Ollama over cloud APIs
+- **Runtime**: Yes
+- **Example**:
+  ```sql
+  SET ai_nl2sql_prefer_local='false';
+  ```
+
+## Data Structures
+
+### NL2SQLRequest
+
+```cpp
+struct NL2SQLRequest {
+    std::string natural_language;               // Natural language query text
+    std::string schema_name;                    // Current database/schema name
+    int max_latency_ms;                         // Max acceptable latency (ms)
+    bool allow_cache;                           // Enable semantic cache lookup
+    std::vector<std::string> context_tables;    // Optional table hints for schema
+
+    NL2SQLRequest() : max_latency_ms(0), allow_cache(true) {}
+};
+```
+
+#### Fields
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `natural_language` | string | "" | The user's query in natural language |
+| `schema_name` | string | "" | Current database/schema name |
+| `max_latency_ms` | int | 0 | Max acceptable latency (0 = no constraint) |
+| `allow_cache` | bool | true | Whether to check semantic cache |
+| `context_tables` | vector<string> | {} | Optional table hints for schema context |
+
+### NL2SQLResult
+
+```cpp
+struct NL2SQLResult {
+    std::string sql_query;                   // Generated SQL query
+    float confidence;                        // Confidence score 0.0-1.0
+    std::string explanation;                 // Which model generated this
+    std::vector<std::string> tables_used;    // Tables referenced in SQL
+    bool cached;                             // True if from semantic cache
+    int64_t cache_id;                        // Cache entry ID for tracking
+
+    NL2SQLResult() : confidence(0.0f), cached(false), cache_id(0) {}
+};
+```
+
+#### Fields
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `sql_query` | string | "" | Generated SQL query |
+| `confidence` | float | 0.0 | Confidence score (0.0-1.0) |
+| `explanation` | string | "" | Model/provider info |
+| `tables_used` | vector<string> | {} | Tables referenced in SQL |
+| `cached` | bool | false | Whether result came from cache |
+| `cache_id` | int64 | 0 | Cache entry ID |
+
+### ModelProvider Enum
+
+```cpp
+enum class ModelProvider {
+    LOCAL_OLLAMA,       // Local models via Ollama
+    CLOUD_OPENAI,       // OpenAI API
+    CLOUD_ANTHROPIC,    // Anthropic API
+    FALLBACK_ERROR      // No model available
+};
+```
+
+## NL2SQL_Converter Class
+
+### Constructor
+
+```cpp
+NL2SQL_Converter::NL2SQL_Converter();
+```
+
+Initializes with default configuration values.
+
+### Destructor
+
+```cpp
+NL2SQL_Converter::~NL2SQL_Converter();
+```
+
+Frees allocated resources.
+
+### Methods
+
+#### `init()`
+
+```cpp
+int NL2SQL_Converter::init();
+```
+
+Initialize the NL2SQL converter.
+ +**Returns**: `0` on success, non-zero on failure + +#### `close()` + +```cpp +void NL2SQL_Converter::close(); +``` + +Shutdown and cleanup resources. + +#### `convert()` + +```cpp +NL2SQLResult NL2SQL_Converter::convert(const NL2SQLRequest& req); +``` + +Convert natural language to SQL. + +**Parameters**: +- `req`: NL2SQL request with natural language query and context + +**Returns**: NL2SQLResult with generated SQL and metadata + +**Example**: +```cpp +NL2SQLRequest req; +req.natural_language = "Show top 10 customers"; +req.allow_cache = true; +NL2SQLResult result = converter->convert(req); +if (result.confidence > 0.7f) { + execute_sql(result.sql_query); +} +``` + +#### `clear_cache()` + +```cpp +void NL2SQL_Converter::clear_cache(); +``` + +Clear all cached NL2SQL conversions. + +#### `get_cache_stats()` + +```cpp +std::string NL2SQL_Converter::get_cache_stats(); +``` + +Get cache statistics as JSON. + +**Returns**: JSON string with cache metrics + +**Example**: +```json +{ + "entries": 150, + "hits": 1200, + "misses": 300 +} +``` + +## AI_Features_Manager Class + +### Methods + +#### `get_nl2sql()` + +```cpp +NL2SQL_Converter* AI_Features_Manager::get_nl2sql(); +``` + +Get the NL2SQL converter instance. + +**Returns**: Pointer to NL2SQL_Converter or NULL + +**Example**: +```cpp +NL2SQL_Converter* nl2sql = GloAI->get_nl2sql(); +if (nl2sql) { + NL2SQLResult result = nl2sql->convert(req); +} +``` + +#### `get_variable()` + +```cpp +char* AI_Features_Manager::get_variable(const char* name); +``` + +Get configuration variable value. + +**Parameters**: +- `name`: Variable name (without `ai_nl2sql_` prefix) + +**Returns**: Variable value or NULL + +**Example**: +```cpp +char* model = GloAI->get_variable("ollama_model"); +``` + +#### `set_variable()` + +```cpp +bool AI_Features_Manager::set_variable(const char* name, const char* value); +``` + +Set configuration variable value. 
+
+**Parameters**:
+- `name`: Variable name (without `ai_nl2sql_` prefix)
+- `value`: New value
+
+**Returns**: true on success, false on failure
+
+**Example**:
+```cpp
+GloAI->set_variable("ollama_model", "llama3.3");
+```
+
+## MySQL Protocol Integration
+
+### Query Format
+
+NL2SQL queries use a special prefix:
+
+```sql
+NL2SQL: <natural language question>
+```
+
+### Result Format
+
+Results are returned as a standard MySQL resultset with columns:
+
+| Column | Type | Description |
+|--------|------|-------------|
+| `sql_query` | TEXT | Generated SQL query |
+| `confidence` | FLOAT | Confidence score |
+| `explanation` | TEXT | Model info |
+| `cached` | BOOLEAN | From cache |
+| `cache_id` | BIGINT | Cache entry ID |
+
+### Example Session
+
+```sql
+mysql> USE my_database;
+mysql> NL2SQL: Show top 10 customers by revenue;
++---------------------------------------------+------------+-------------------------+--------+----------+
+| sql_query                                   | confidence | explanation             | cached | cache_id |
++---------------------------------------------+------------+-------------------------+--------+----------+
+| SELECT * FROM customers ORDER BY revenue    | 0.850      | Generated by Ollama     | 0      | 0        |
+| DESC LIMIT 10                               |            | llama3.2                |        |          |
++---------------------------------------------+------------+-------------------------+--------+----------+
+1 row in set (1.23 sec)
+```
+
+## Error Codes
+
+| Code | Description | Action |
+|------|-------------|--------|
+| `ER_NL2SQL_DISABLED` | NL2SQL feature is disabled | Enable via `ai_nl2sql_enabled` |
+| `ER_NL2SQL_TIMEOUT` | LLM request timed out | Increase `ai_nl2sql_timeout_ms` |
+| `ER_NL2SQL_NO_MODEL` | No LLM model available | Configure API key or Ollama |
+| `ER_NL2SQL_API_ERROR` | LLM API returned error | Check logs and API key |
+| `ER_NL2SQL_INVALID_QUERY` | Query doesn't start with prefix | Use correct prefix format |
+
+## Status Variables
+
+Monitor NL2SQL performance via status variables:
+
+```sql
+-- View all AI status variables
+SELECT * FROM 
stats_mysql_global
+WHERE variable_name LIKE '%nl2sql%';
+
+-- Key metrics
+SELECT * FROM stats_ai_nl2sql;
+```
+
+| Variable | Description |
+|----------|-------------|
+| `nl2sql_total_requests` | Total NL2SQL conversions |
+| `nl2sql_cache_hits` | Cache hit count |
+| `nl2sql_local_model_calls` | Ollama API calls |
+| `nl2sql_cloud_model_calls` | Cloud API calls |
+
+## See Also
+
+- [README.md](README.md) - User documentation
+- [ARCHITECTURE.md](ARCHITECTURE.md) - System architecture
+- [TESTING.md](TESTING.md) - Testing guide
diff --git a/doc/NL2SQL/ARCHITECTURE.md b/doc/NL2SQL/ARCHITECTURE.md
new file mode 100644
index 000000000..29b3fab99
--- /dev/null
+++ b/doc/NL2SQL/ARCHITECTURE.md
@@ -0,0 +1,434 @@
+# NL2SQL Architecture
+
+## System Overview
+
+```
+Client Query (NL2SQL: ...)
+        ↓
+MySQL_Session (detects prefix)
+        ↓
+AI_Features_Manager::get_nl2sql()
+        ↓
+NL2SQL_Converter::convert()
+  ├─ check_vector_cache()   ← sqlite-vec similarity search
+  ├─ build_prompt()         ← Schema context via MySQL_Tool_Handler
+  ├─ select_model()         ← Ollama/OpenAI/Anthropic selection
+  ├─ call_llm_api()         ← libcurl HTTP request
+  └─ validate_sql()         ← Keyword validation
+        ↓
+Return Resultset (sql_query, confidence, ...)
+```
+
+## Components
+
+### 1. NL2SQL_Converter
+
+**Location**: `include/NL2SQL_Converter.h`, `lib/NL2SQL_Converter.cpp`
+
+Main class coordinating the NL2SQL conversion pipeline.
+ +**Key Methods:** +- `convert()`: Main entry point for conversion +- `check_vector_cache()`: Semantic similarity search +- `build_prompt()`: Construct LLM prompt with schema context +- `select_model()`: Choose best LLM provider +- `call_ollama()`, `call_openai()`, `call_anthropic()`: LLM API calls + +**Configuration:** +```cpp +struct { + bool enabled; + char* query_prefix; // Default: "NL2SQL:" + char* model_provider; // Default: "ollama" + char* ollama_model; // Default: "llama3.2" + char* openai_model; // Default: "gpt-4o-mini" + char* anthropic_model; // Default: "claude-3-haiku" + int cache_similarity_threshold; // Default: 85 + int timeout_ms; // Default: 30000 + char* openai_key; + char* anthropic_key; + bool prefer_local; +} config; +``` + +### 2. LLM_Clients + +**Location**: `lib/LLM_Clients.cpp` + +HTTP clients for each LLM provider using libcurl. + +#### Ollama (Local) + +**Endpoint**: `POST http://localhost:11434/api/generate` + +**Request Format:** +```json +{ + "model": "llama3.2", + "prompt": "Convert to SQL: Show top customers", + "stream": false, + "options": { + "temperature": 0.1, + "num_predict": 500 + } +} +``` + +**Response Format:** +```json +{ + "response": "SELECT * FROM customers ORDER BY revenue DESC LIMIT 10", + "model": "llama3.2", + "total_duration": 123456789 +} +``` + +#### OpenAI (Cloud) + +**Endpoint**: `POST https://api.openai.com/v1/chat/completions` + +**Headers:** +- `Content-Type: application/json` +- `Authorization: Bearer sk-...` + +**Request Format:** +```json +{ + "model": "gpt-4o-mini", + "messages": [ + {"role": "system", "content": "You are a SQL expert..."}, + {"role": "user", "content": "Convert to SQL: Show top customers"} + ], + "temperature": 0.1, + "max_tokens": 500 +} +``` + +**Response Format:** +```json +{ + "choices": [{ + "message": { + "content": "SELECT * FROM customers ORDER BY revenue DESC LIMIT 10", + "role": "assistant" + }, + "finish_reason": "stop" + }], + "usage": {"total_tokens": 123} +} +``` + 
+#### Anthropic (Cloud) + +**Endpoint**: `POST https://api.anthropic.com/v1/messages` + +**Headers:** +- `Content-Type: application/json` +- `x-api-key: sk-ant-...` +- `anthropic-version: 2023-06-01` + +**Request Format:** +```json +{ + "model": "claude-3-haiku-20240307", + "max_tokens": 500, + "messages": [ + {"role": "user", "content": "Convert to SQL: Show top customers"} + ], + "system": "You are a SQL expert...", + "temperature": 0.1 +} +``` + +**Response Format:** +```json +{ + "content": [{"type": "text", "text": "SELECT * FROM customers..."}], + "model": "claude-3-haiku-20240307", + "usage": {"input_tokens": 10, "output_tokens": 20} +} +``` + +### 3. Vector Cache + +**Location**: Uses `SQLite3DB` with sqlite-vec extension + +**Tables:** + +```sql +-- Cache entries +CREATE TABLE nl2sql_cache ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + natural_language TEXT NOT NULL, + sql_query TEXT NOT NULL, + model_provider TEXT, + confidence REAL, + created_at DATETIME DEFAULT CURRENT_TIMESTAMP +); + +-- Virtual table for similarity search +CREATE VIRTUAL TABLE nl2sql_cache_vec USING vec0( + embedding FLOAT[1536], -- Dimension depends on embedding model + id INTEGER PRIMARY KEY +); +``` + +**Similarity Search:** +```sql +SELECT nc.sql_query, nc.confidence, distance +FROM nl2sql_cache_vec +JOIN nl2sql_cache nc ON nl2sql_cache_vec.id = nc.id +WHERE embedding MATCH ? +AND k = 10 -- Return top 10 matches +ORDER BY distance +LIMIT 1; +``` + +### 4. MySQL_Session Integration + +**Location**: `lib/MySQL_Session.cpp` (around line ~6867) + +Query interception flow: + +1. Detect `NL2SQL:` prefix in query +2. Extract natural language text +3. Call `GloAI->get_nl2sql()->convert()` +4. Return generated SQL as resultset +5. User can review and execute + +### 5. AI_Features_Manager + +**Location**: `include/AI_Features_Manager.h`, `lib/AI_Features_Manager.cpp` + +Coordinates all AI features including NL2SQL. 
+ +**Responsibilities:** +- Initialize vector database +- Create and manage NL2SQL_Converter instance +- Handle configuration variables with `ai_nl2sql_` prefix +- Provide thread-safe access to components + +## Flow Diagrams + +### Conversion Flow + +``` +┌─────────────────┐ +│ NL2SQL Request │ +└────────┬────────┘ + │ + ▼ +┌─────────────────────────┐ +│ Check Vector Cache │ +│ - Generate embedding │ +│ - Similarity search │ +└────────┬────────────────┘ + │ + ┌────┴────┐ + │ Cache │ No ───────────────┐ + │ Hit? │ │ + └────┬────┘ │ + │ Yes │ + ▼ │ + Return Cached ▼ +┌──────────────────┐ ┌─────────────────┐ +│ Build Prompt │ │ Select Model │ +│ - System role │ │ - Latency │ +│ - Schema context │ │ - Preference │ +│ - User query │ │ - API keys │ +└────────┬─────────┘ └────────┬────────┘ + │ │ + └─────────┬───────────────┘ + ▼ + ┌──────────────────┐ + │ Call LLM API │ + │ - libcurl HTTP │ + │ - JSON parse │ + └────────┬─────────┘ + │ + ▼ + ┌──────────────────┐ + │ Validate SQL │ + │ - Keyword check │ + │ - Clean output │ + └────────┬─────────┘ + │ + ▼ + ┌──────────────────┐ + │ Store in Cache │ + │ - Embed query │ + │ - Save result │ + └────────┬─────────┘ + │ + ▼ + ┌──────────────────┐ + │ Return Result │ + │ - sql_query │ + │ - confidence │ + │ - explanation │ + └──────────────────┘ +``` + +### Model Selection Logic + +``` +┌─────────────────────────────────┐ +│ Start: Select Model │ +└────────────┬────────────────────┘ + │ + ▼ + ┌─────────────────────┐ + │ max_latency_ms < │──── Yes ────┐ + │ 500ms? │ │ + └────────┬────────────┘ │ + │ No │ + ▼ │ + ┌─────────────────────┐ │ + │ Check provider │ │ + │ preference │ │ + └────────┬────────────┘ │ + │ │ + ┌──────┴──────┐ │ + │ │ │ + ▼ ▼ │ + OpenAI Anthropic Ollama + │ │ │ + ▼ ▼ │ + ┌─────────┐ ┌─────────┐ ┌─────────┐ + │ API key │ │ API key │ │ Return │ + │ set? │ │ set? 
│ │ OLLAMA │ + └────┬────┘ └────┬────┘ └─────────┘ + │ │ + Yes Yes + │ │ + └──────┬─────┘ + │ + ▼ + ┌──────────────┐ + │ Return cloud │ + │ provider │ + └──────────────┘ +``` + +## Data Structures + +### NL2SQLRequest + +```cpp +struct NL2SQLRequest { + std::string natural_language; // Input query + std::string schema_name; // Current schema + int max_latency_ms; // Latency requirement + bool allow_cache; // Enable cache lookup + std::vector context_tables; // Optional table hints +}; +``` + +### NL2SQLResult + +```cpp +struct NL2SQLResult { + std::string sql_query; // Generated SQL + float confidence; // 0.0-1.0 score + std::string explanation; // Model info + std::vector tables_used; // Referenced tables + bool cached; // From cache + int64_t cache_id; // Cache entry ID +}; +``` + +## Configuration Management + +### Variable Namespacing + +All NL2SQL variables use `ai_nl2sql_` prefix: + +``` +ai_nl2sql_enabled +ai_nl2sql_query_prefix +ai_nl2sql_model_provider +ai_nl2sql_ollama_model +ai_nl2sql_openai_model +ai_nl2sql_anthropic_model +ai_nl2sql_cache_similarity_threshold +ai_nl2sql_timeout_ms +ai_nl2sql_openai_key +ai_nl2sql_anthropic_key +ai_nl2sql_prefer_local +``` + +### Variable Persistence + +``` +Runtime (memory) + ↑ + | LOAD MYSQL VARIABLES TO RUNTIME + | + | SET ai_nl2sql_... = 'value' + | + | SAVE MYSQL VARIABLES TO DISK + ↓ +Disk (config file) +``` + +## Thread Safety + +- **NL2SQL_Converter**: NOT thread-safe by itself +- **AI_Features_Manager**: Provides thread-safe access via `wrlock()`/`wrunlock()` +- **Vector Cache**: Thread-safe via SQLite mutex + +## Error Handling + +### Error Categories + +1. **LLM API Errors**: Timeout, connection failure, auth failure + - Fallback: Try next available provider + - Return: Empty SQL with error in explanation + +2. **SQL Validation Failures**: Doesn't look like SQL + - Return: SQL with warning comment + - Confidence: Low (0.3) + +3. 
**Cache Errors**: Database failures + - Fallback: Continue without cache + - Log: Warning in ProxySQL log + +### Logging + +All NL2SQL operations log to `proxysql.log`: + +``` +NL2SQL: Converting query: Show top customers +NL2SQL: Selecting local Ollama due to latency constraint +NL2SQL: Calling Ollama with model: llama3.2 +NL2SQL: Conversion complete. Confidence: 0.85 +``` + +## Performance Considerations + +### Optimization Strategies + +1. **Caching**: Enable for repeated queries +2. **Local First**: Prefer Ollama for lower latency +3. **Timeout**: Set appropriate `ai_nl2sql_timeout_ms` +4. **Batch Requests**: Not yet implemented (planned) + +### Resource Usage + +- **Memory**: Vector cache grows with usage +- **Network**: HTTP requests for each cache miss +- **CPU**: Embedding generation for cache entries + +## Future Enhancements + +- **Phase 3**: Full vector cache implementation +- **Phase 3**: Schema context retrieval via MySQL_Tool_Handler +- **Phase 4**: Async conversion API +- **Phase 5**: Batch query conversion +- **Phase 6**: Custom fine-tuned models + +## See Also + +- [README.md](README.md) - User documentation +- [API.md](API.md) - Complete API reference +- [TESTING.md](TESTING.md) - Testing guide diff --git a/doc/NL2SQL/README.md b/doc/NL2SQL/README.md new file mode 100644 index 000000000..86b16e9f5 --- /dev/null +++ b/doc/NL2SQL/README.md @@ -0,0 +1,220 @@ +# NL2SQL - Natural Language to SQL for ProxySQL + +## Overview + +NL2SQL (Natural Language to SQL) is a ProxySQL feature that converts natural language questions into SQL queries using Large Language Models (LLMs). 
+ +## Features + +- **Hybrid Deployment**: Local Ollama + Cloud APIs (OpenAI, Anthropic) +- **Semantic Caching**: Vector-based cache for similar queries using sqlite-vec +- **Schema Awareness**: Understands your database schema for better conversions +- **Multi-Provider**: Switch between LLM providers seamlessly +- **Security**: Generated SQL is returned for review before execution + +## Quick Start + +### 1. Enable NL2SQL + +```sql +-- Via admin interface +SET ai_nl2sql_enabled='true'; +LOAD MYSQL VARIABLES TO RUNTIME; +``` + +### 2. Configure LLM Provider + +**Using local Ollama (default):** + +```sql +SET ai_nl2sql_model_provider='ollama'; +SET ai_nl2sql_ollama_model='llama3.2'; +LOAD MYSQL VARIABLES TO RUNTIME; +``` + +**Using OpenAI:** + +```sql +SET ai_nl2sql_model_provider='openai'; +SET ai_nl2sql_openai_model='gpt-4o-mini'; +SET ai_nl2sql_openai_key='sk-...'; +LOAD MYSQL VARIABLES TO RUNTIME; +``` + +**Using Anthropic:** + +```sql +SET ai_nl2sql_model_provider='anthropic'; +SET ai_nl2sql_anthropic_model='claude-3-haiku'; +SET ai_nl2sql_anthropic_key='sk-ant-...'; +LOAD MYSQL VARIABLES TO RUNTIME; +``` + +### 3. 
Use NL2SQL
+
+```sql
+-- Optional: confirm the feature is enabled (admin interface)
+mysql> SELECT * FROM runtime_global_variables WHERE variable_name='ai_nl2sql_enabled';
+
+-- In your SQL client, prefix your query with "NL2SQL:"
+mysql> NL2SQL: Show top 10 customers by revenue;
+```
+
+## Configuration
+
+### Variables
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `ai_nl2sql_enabled` | true | Enable/disable NL2SQL |
+| `ai_nl2sql_query_prefix` | NL2SQL: | Prefix for NL2SQL queries |
+| `ai_nl2sql_model_provider` | ollama | LLM provider (ollama/openai/anthropic) |
+| `ai_nl2sql_ollama_model` | llama3.2 | Ollama model name |
+| `ai_nl2sql_openai_model` | gpt-4o-mini | OpenAI model name |
+| `ai_nl2sql_anthropic_model` | claude-3-haiku | Anthropic model name |
+| `ai_nl2sql_cache_similarity_threshold` | 85 | Semantic similarity threshold (0-100) |
+| `ai_nl2sql_timeout_ms` | 30000 | LLM request timeout in milliseconds |
+| `ai_nl2sql_prefer_local` | true | Prefer local models when possible |
+
+### Model Selection
+
+The system automatically selects the best model based on:
+
+1. **Latency requirements**: Local Ollama for fast queries (< 500ms)
+2. **API key availability**: Falls back to Ollama if keys are missing
+3. **User preference**: Respects the `ai_nl2sql_model_provider` setting
+
+## Examples
+
+### Basic Queries
+
+```
+NL2SQL: Show all users
+NL2SQL: Find orders with amount > 100
+NL2SQL: Count customers by country
+```
+
+### Complex Queries
+
+```
+NL2SQL: Show top 5 customers by total order amount
+NL2SQL: Find customers who placed orders in the last 30 days
+NL2SQL: What is the average order value per month?
+```
+
+### Schema-Aware Queries
+
+```
+-- Switch to your schema first
+USE my_database;
+NL2SQL: List all products in the Electronics category
+NL2SQL: Find orders that contain specific products
+```
+
+### Results
+
+NL2SQL returns a resultset with:
+- `sql_query`: Generated SQL
+- `confidence`: 0.0-1.0 score
+- `explanation`: Which model was used
+- `cached`: Whether from semantic cache
+
+## Troubleshooting
+
+### NL2SQL returns empty result
+
+1. Check AI module is initialized:
+   ```sql
+   SELECT * FROM runtime_global_variables WHERE variable_name LIKE 'ai_%';
+   ```
+
+2. Verify LLM is accessible:
+   ```bash
+   # For Ollama
+   curl http://localhost:11434/api/tags
+
+   # For cloud APIs, check your API keys
+   ```
+
+3. Check logs:
+   ```bash
+   tail -f proxysql.log | grep NL2SQL
+   ```
+
+### Poor quality SQL
+
+1. **Try a different model:**
+   ```sql
+   SET ai_nl2sql_ollama_model='llama3.3';
+   ```
+
+2. **Increase timeout for complex queries:**
+   ```sql
+   SET ai_nl2sql_timeout_ms=60000;
+   ```
+
+3. 
**Check confidence score:** + - High confidence (> 0.7): Generally reliable + - Medium confidence (0.4-0.7): Review before using + - Low confidence (< 0.4): May need manual correction + +### Cache Issues + +```sql +-- Clear cache (Phase 3 feature) +-- TODO: Add cache clearing command + +-- Check cache stats +SELECT * FROM stats_ai_nl2sql_cache; +``` + +## Performance + +| Operation | Typical Latency | +|-----------|-----------------| +| Local Ollama | ~1-2 seconds | +| Cloud API | ~2-5 seconds | +| Cache hit | < 50ms | + +**Tips for better performance:** +- Use local Ollama for faster responses +- Enable caching for repeated queries +- Use `ai_nl2sql_timeout_ms` to limit wait time +- Consider pre-warming cache with common queries + +## Security + +### Important Notes + +- NL2SQL queries are **NOT executed automatically** +- Generated SQL is returned for **review first** +- Always validate generated SQL before execution +- Keep API keys secure (use environment variables) + +### Best Practices + +1. **Review generated SQL**: Always check the output before running +2. **Use read-only accounts**: Test with limited permissions first +3. **Monitor confidence scores**: Low confidence may indicate errors +4. **Keep API keys secure**: Don't commit them to version control +5. **Use caching wisely**: Balance speed vs. data freshness + +## API Reference + +For complete API documentation, see [API.md](API.md). + +## Architecture + +For system architecture details, see [ARCHITECTURE.md](ARCHITECTURE.md). + +## Testing + +For testing information, see [TESTING.md](TESTING.md). + +## Version History + +- **0.1.0** (2025-01-16): Initial release with Ollama, OpenAI, Anthropic support + +## License + +This feature is part of ProxySQL and follows the same license. 
diff --git a/doc/NL2SQL/TESTING.md b/doc/NL2SQL/TESTING.md new file mode 100644 index 000000000..2b5d1a865 --- /dev/null +++ b/doc/NL2SQL/TESTING.md @@ -0,0 +1,411 @@ +# NL2SQL Testing Guide + +## Test Suite Overview + +| Test Type | Location | Purpose | LLM Required | +|-----------|----------|---------|--------------| +| Unit Tests | `test/tap/tests/nl2sql_*.cpp` | Test individual components | Mocked | +| Integration | `test/tap/tests/nl2sql_integration-t.cpp` | Test with real database | Mocked/Live | +| E2E | `scripts/mcp/test_nl2sql_e2e.sh` | Complete workflow | Live | +| MCP Tools | `scripts/mcp/test_nl2sql_tools.sh` | MCP protocol | Live | + +## Test Infrastructure + +### TAP Framework + +ProxySQL uses the Test Anything Protocol (TAP) for C++ tests. + +**Key Functions:** +```cpp +plan(number_of_tests); // Declare how many tests +ok(condition, description); // Test with description +diag(message); // Print diagnostic message +skip(count, reason); // Skip tests +exit_status(); // Return proper exit code +``` + +**Example:** +```cpp +#include "tap.h" + +int main() { + plan(3); + ok(1 + 1 == 2, "Basic math works"); + ok(true, "Always true"); + diag("This is a diagnostic message"); + return exit_status(); +} +``` + +### CommandLine Helper + +Gets test connection parameters from environment: + +```cpp +CommandLine cl; +if (cl.getEnv()) { + diag("Failed to get environment"); + return -1; +} + +// cl.host, cl.admin_username, cl.admin_password, cl.admin_port +``` + +## Running Tests + +### Unit Tests + +```bash +cd test/tap + +# Build specific test +make nl2sql_unit_base-t + +# Run the test +./nl2sql_unit_base + +# Build all NL2SQL tests +make nl2sql_* +``` + +### Integration Tests + +```bash +cd test/tap +make nl2sql_integration-t +./nl2sql_integration +``` + +### E2E Tests + +```bash +# With mocked LLM (faster) +./scripts/mcp/test_nl2sql_e2e.sh --mock + +# With live LLM +./scripts/mcp/test_nl2sql_e2e.sh --live +``` + +### All Tests + +```bash +# Run all NL2SQL tests 
+make test_nl2sql
+
+# Run with verbose output
+PROXYSQL_VERBOSE=1 make test_nl2sql
+```
+
+## Test Coverage
+
+### Unit Tests (`nl2sql_unit_base-t.cpp`)
+
+- [x] Initialization
+- [x] Basic conversion (mocked)
+- [x] Configuration management
+- [x] Variable persistence
+- [x] Error handling
+
+### Prompt Builder Tests (`nl2sql_prompt_builder-t.cpp`)
+
+- [x] Basic prompt construction
+- [x] Schema context inclusion
+- [x] System instruction formatting
+- [x] Edge cases (empty, special characters)
+- [x] Prompt structure validation
+
+### Model Selection Tests (`nl2sql_model_selection-t.cpp`)
+
+- [x] Latency-based selection
+- [x] Provider preference handling
+- [x] API key fallback logic
+- [x] Default selection
+- [x] Configuration integration
+
+### Integration Tests (`nl2sql_integration-t.cpp`)
+
+- [ ] Schema-aware conversion
+- [ ] Multi-table queries
+- [ ] Complex SQL patterns
+- [ ] Error recovery
+
+### E2E Tests (`test_nl2sql_e2e.sh`)
+
+- [x] Simple SELECT
+- [x] WHERE conditions
+- [x] JOIN queries
+- [x] Aggregations
+- [x] Date handling
+
+## Writing New Tests
+
+### Test File Template
+
+```cpp
+/**
+ * @file nl2sql_your_feature-t.cpp
+ * @brief TAP tests for your feature
+ *
+ * @date 2025-01-16
+ */
+
+#include <string>
+#include <vector>
+#include <cstring>
+#include <cstdio>
+#include <cstdlib>
+#include <unistd.h>
+
+#include "mysql.h"
+#include "mysqld_error.h"
+
+#include "tap.h"
+#include "command_line.h"
+#include "utils.h"
+
+using std::string;
+
+MYSQL* g_admin = NULL;
+
+// ============================================================================
+// Helper Functions
+// ============================================================================
+
+string get_variable(const char* name) {
+    // Implementation
+}
+
+bool set_variable(const char* name, const char* value) {
+    // Implementation
+}
+
+// ============================================================================
+// Test: Your Test Category
+// ============================================================================
+
+void 
test_your_category() { + diag("=== Your Test Category ==="); + + // Test 1 + ok(condition, "Test description"); + + // Test 2 + ok(condition, "Another test"); +} + +// ============================================================================ +// Main +// ============================================================================ + +int main(int argc, char** argv) { + CommandLine cl; + if (cl.getEnv()) { + diag("Error getting environment"); + return exit_status(); + } + + g_admin = mysql_init(NULL); + if (!mysql_real_connect(g_admin, cl.host, cl.admin_username, + cl.admin_password, NULL, cl.admin_port, NULL, 0)) { + diag("Failed to connect to admin"); + return exit_status(); + } + + plan(number_of_tests); + + test_your_category(); + + mysql_close(g_admin); + return exit_status(); +} +``` + +### Test Naming Conventions + +- **Files**: `nl2sql_feature_name-t.cpp` +- **Functions**: `test_feature_category()` +- **Descriptions**: "Feature does something" + +### Test Organization + +```cpp +// Section dividers +// ============================================================================ +// Section Name +// ============================================================================ + +// Test function with docstring +/** + * @test Test name + * @description What it tests + * @expected What should happen + */ +void test_something() { + diag("=== Test Category ==="); + // Tests... +} +``` + +### Best Practices + +1. **Use diag() for section headers**: + ```cpp + diag("=== Configuration Tests ==="); + ``` + +2. **Provide meaningful test descriptions**: + ```cpp + ok(result == expected, "Variable set to 'value' reflects in runtime"); + ``` + +3. **Clean up after tests**: + ```cpp + // Restore original values + set_variable("model", orig_value.c_str()); + ``` + +4. 
**Handle both stub and real implementations**:
+   ```cpp
+   ok(value == expected || value.empty(),
+      "Value matches expected or is empty (stub)");
+   ```
+
+## Mocking LLM Responses
+
+For fast unit tests, mock LLM responses:
+
+```cpp
+string mock_llm_response(const string& query) {
+    if (query.find("SELECT") != string::npos) {
+        return "SELECT * FROM table";
+    }
+    // Other patterns...
+    return "";  // fallback so every path returns a value
+}
+```
+
+## Debugging Tests
+
+### Enable Verbose Output
+
+```bash
+# Verbose TAP output
+./nl2sql_unit_base -v
+
+# ProxySQL debug output
+PROXYSQL_VERBOSE=1 ./nl2sql_unit_base
+```
+
+### GDB Debugging
+
+```bash
+gdb ./nl2sql_unit_base
+(gdb) break main
+(gdb) run
+(gdb) backtrace
+```
+
+### SQL Debugging
+
+```cpp
+// Print generated SQL
+diag("Generated SQL: %s", sql.c_str());
+
+// Check MySQL errors
+if (mysql_query(admin, query)) {
+    diag("MySQL error: %s", mysql_error(admin));
+}
+```
+
+## Continuous Integration
+
+### GitHub Actions (Planned)
+
+```yaml
+name: NL2SQL Tests
+on: [push, pull_request]
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v2
+      - name: Build ProxySQL
+        run: make
+      - name: Run NL2SQL Tests
+        run: make test_nl2sql
+```
+
+## Test Data
+
+### Sample Schema
+
+Tests use a standard test schema:
+
+```sql
+CREATE TABLE customers (
+    id INT PRIMARY KEY AUTO_INCREMENT,
+    name VARCHAR(100),
+    country VARCHAR(50),
+    created_at DATE
+);
+
+CREATE TABLE orders (
+    id INT PRIMARY KEY AUTO_INCREMENT,
+    customer_id INT,
+    total DECIMAL(10,2),
+    status VARCHAR(20),
+    FOREIGN KEY (customer_id) REFERENCES customers(id)
+);
+```
+
+### Sample Queries
+
+```sql
+-- Simple
+NL2SQL: Show all customers
+
+-- With conditions
+NL2SQL: Find customers from USA
+
+-- JOIN
+NL2SQL: Show orders with customer names
+
+-- Aggregation
+NL2SQL: Count customers by country
+```
+
+## Performance Testing
+
+### Benchmark Script
+
+```bash
+#!/bin/bash
+# benchmark_nl2sql.sh
+
+for i in {1..100}; do
+    start=$(date +%s%N)
+    mysql -h 127.0.0.1 -P 6033 -e 
"NL2SQL: Show top customers" + end=$(date +%s%N) + echo $((end - start)) +done | awk '{sum+=$1} END {print sum/NR " ns average"}' +``` + +## Known Issues + +1. **Stub Implementation**: Many features return empty/placeholder values +2. **Live LLM Required**: Some tests need Ollama running +3. **Timing Dependent**: Cache tests may fail on slow systems + +## Contributing Tests + +When contributing new tests: + +1. Follow the template above +2. Add to Makefile if needed +3. Update this documentation +4. Ensure tests pass with `make test_nl2sql` + +## See Also + +- [README.md](README.md) - User documentation +- [ARCHITECTURE.md](ARCHITECTURE.md) - System architecture +- [API.md](API.md) - API reference