diff --git a/doc/GENAI.md b/doc/GENAI.md
new file mode 100644
index 000000000..66d5218a4
--- /dev/null
+++ b/doc/GENAI.md
@@ -0,0 +1,490 @@
+# GenAI Module Documentation
+
+## Overview
+
+The **GenAI (Generative AI) Module** in ProxySQL provides asynchronous, non-blocking access to embedding generation and document reranking services. It enables ProxySQL to interact with LLM services (like llama-server) for vector embeddings and semantic search operations without blocking MySQL threads.
+
+## Version
+
+- **Module Version**: 0.1.0
+- **Last Updated**: 2025-01-10
+- **Branch**: v3.1-vec_genAI_module
+
+## Architecture
+
+### Async Design
+
+The GenAI module uses a **non-blocking async architecture** based on socketpair IPC and epoll event notification:
+
+```
+┌─────────────────┐         socketpair         ┌─────────────────┐
+│  MySQL_Session  │◄──────────────────────────►│  GenAI Module   │
+│  (MySQL Thread) │  fds[0]            fds[1]  │  Listener Loop  │
+└────────┬────────┘                            └────────┬────────┘
+         │                                              │
+         │ epoll                                        │ queue
+         │                                              │
+         └── epoll_wait() ──────────────────────────────┘
+                      (GenAI Response Ready)
+```
+
+### Key Components
+
+1. **MySQL_Session** - Client-facing interface that receives `GENAI:` queries
+2. **GenAI Listener Thread** - Monitors socketpair fds via epoll for incoming requests
+3. **GenAI Worker Threads** - Thread pool that processes requests (blocking HTTP calls)
+4. **Socketpair Communication** - Bidirectional IPC between MySQL and GenAI modules
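+
+The interaction between these components can be sketched as a small, Linux-only program. This is an illustration under stated assumptions, not the module's code: the function name `roundtrip_via_socketpair`, the `echo:` reply, and the single stand-in worker thread are all hypothetical, and the real module routes requests through a listener thread and a worker pool.
+
+```cpp
+#include <sys/epoll.h>
+#include <sys/socket.h>
+#include <unistd.h>
+#include <cassert>
+#include <string>
+#include <thread>
+
+// Sends `request` over a fresh socketpair to a stand-in worker thread and
+// waits for the reply with epoll. Returns the reply, or "" on failure.
+// Simplification: messages are assumed to fit in one 256-byte read.
+std::string roundtrip_via_socketpair(const std::string& request) {
+    int fds[2];
+    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) return "";
+
+    // Stand-in for the GenAI side: read from fds[1], reply on the same fd.
+    std::thread worker([fd = fds[1]] {
+        char buf[256];
+        ssize_t n = read(fd, buf, sizeof(buf));
+        if (n > 0) {
+            std::string reply = "echo:" + std::string(buf, static_cast<size_t>(n));
+            ssize_t w = write(fd, reply.data(), reply.size());
+            (void)w;  // a real worker would handle short or failed writes
+        }
+    });
+
+    std::string reply;
+    int ep = epoll_create1(0);
+    epoll_event ev{};
+    ev.events = EPOLLIN;
+    ev.data.fd = fds[0];
+    bool sent = ep >= 0 && epoll_ctl(ep, EPOLL_CTL_ADD, fds[0], &ev) == 0 &&
+                write(fds[0], request.data(), request.size()) ==
+                    static_cast<ssize_t>(request.size());
+    if (sent) {
+        epoll_event out{};
+        // A real session would keep serving other events instead of waiting here.
+        if (epoll_wait(ep, &out, 1, 5000) == 1) {
+            char buf[256];
+            ssize_t n = read(fds[0], buf, sizeof(buf));
+            if (n > 0) reply.assign(buf, static_cast<size_t>(n));
+        }
+    }
+
+    if (ep >= 0) close(ep);
+    close(fds[0]);  // also unblocks the worker's read() if nothing was sent
+    worker.join();
+    close(fds[1]);
+    return reply;
+}
+```
+
+Calling `roundtrip_via_socketpair("ping")` exercises the same primitives the module relies on: one end of the socketpair is written by the session, the other end is served by a separate thread, and readiness of the reply is detected through epoll rather than a blocking read.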
+
+### Communication Protocol
+
+#### Request Format (MySQL → GenAI)
+
+```c
+struct GenAI_RequestHeader {
+    uint64_t request_id;      // Client's correlation ID
+    uint32_t operation;       // GENAI_OP_EMBEDDING, GENAI_OP_RERANK, or GENAI_OP_JSON
+    uint32_t query_len;       // Length of JSON query that follows
+    uint32_t flags;           // Reserved (must be 0)
+    uint32_t top_n;           // For rerank: max results (0 = all)
+};
+// Followed by: JSON query (query_len bytes)
+```
+
+#### Response Format (GenAI → MySQL)
+
+```c
+struct GenAI_ResponseHeader {
+    uint64_t request_id;        // Echo of client's request ID
+    uint32_t status_code;       // 0 = success, >0 = error
+    uint32_t result_len;        // Length of JSON result that follows
+    uint32_t processing_time_ms;// Time taken by GenAI worker
+    uint64_t result_ptr;        // Reserved (must be 0)
+    uint32_t result_count;      // Number of results
+    uint32_t reserved;          // Reserved (must be 0)
+};
+// Followed by: JSON result (result_len bytes)
+```
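+
+A minimal sketch of how a request could be framed and unframed under the layout above. The helper names `frame_request` and `parse_request` are hypothetical, as are the numeric values chosen for the operation codes; the actual implementation in `lib/GenAI_Thread.cpp` may differ. The sketch assumes both ends live in the same process, so the header is copied in host byte order.
+
+```cpp
+#include <cassert>
+#include <cstdint>
+#include <cstring>
+#include <string>
+#include <vector>
+
+// Mirrors the request header documented above.
+struct GenAI_RequestHeader {
+    uint64_t request_id;
+    uint32_t operation;
+    uint32_t query_len;
+    uint32_t flags;
+    uint32_t top_n;
+};
+static_assert(sizeof(GenAI_RequestHeader) == 24, "request header has no padding");
+
+// Hypothetical operation codes; the real enum values are not documented here.
+constexpr uint32_t kOpEmbeddingAssumed = 0;
+constexpr uint32_t kOpRerankAssumed = 1;
+
+// Builds one wire frame: fixed-size header followed by query_len bytes of JSON.
+std::vector<uint8_t> frame_request(uint64_t request_id, uint32_t operation,
+                                   const std::string& json, uint32_t top_n) {
+    GenAI_RequestHeader hdr{};
+    hdr.request_id = request_id;
+    hdr.operation = operation;
+    hdr.query_len = static_cast<uint32_t>(json.size());
+    hdr.flags = 0;  // reserved, must be 0
+    hdr.top_n = top_n;
+
+    std::vector<uint8_t> buf(sizeof(hdr) + json.size());
+    std::memcpy(buf.data(), &hdr, sizeof(hdr));
+    std::memcpy(buf.data() + sizeof(hdr), json.data(), json.size());
+    return buf;
+}
+
+// Splits a frame back into header and JSON; returns false if it is truncated.
+bool parse_request(const std::vector<uint8_t>& buf, GenAI_RequestHeader& hdr,
+                   std::string& json) {
+    if (buf.size() < sizeof(hdr)) return false;
+    std::memcpy(&hdr, buf.data(), sizeof(hdr));
+    if (buf.size() - sizeof(hdr) < hdr.query_len) return false;
+    json.assign(reinterpret_cast<const char*>(buf.data()) + sizeof(hdr), hdr.query_len);
+    return true;
+}
+```
+
+A frame produced by `frame_request(42, kOpRerankAssumed, json, 2)` is `sizeof(GenAI_RequestHeader) + json.size()` bytes long. One detail worth keeping in mind when reading the response side: with natural alignment on x86-64, `GenAI_ResponseHeader` typically has 4 bytes of padding before `result_ptr`, so its size is usually 40 bytes rather than the 36 bytes its fields add up to.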
+
+## Configuration Variables
+
+### Thread Configuration
+
+| Variable | Type | Default | Description |
+|----------|------|---------|-------------|
+| `genai-threads` | int | 4 | Number of GenAI worker threads (1-256) |
+
+### Service Endpoints
+
+| Variable | Type | Default | Description |
+|----------|------|---------|-------------|
+| `genai-embedding_uri` | string | `http://127.0.0.1:8013/embedding` | Embedding service endpoint |
+| `genai-rerank_uri` | string | `http://127.0.0.1:8012/rerank` | Reranking service endpoint |
+
+### Timeouts
+
+| Variable | Type | Default | Description |
+|----------|------|---------|-------------|
+| `genai-embedding_timeout_ms` | int | 30000 | Embedding request timeout (100-300000ms) |
+| `genai-rerank_timeout_ms` | int | 30000 | Reranking request timeout (100-300000ms) |
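+
+The documented ranges can be expressed as a small check. This is only a sketch: the function name `genai_value_in_range` is hypothetical, and how ProxySQL itself treats an out-of-range value (rejecting it, clamping it, or something else) is not specified here.
+
+```cpp
+#include <cassert>
+#include <string>
+
+// Returns true if `value` lies within the documented range for the named
+// integer variable. Unknown names and string variables return false.
+bool genai_value_in_range(const std::string& name, long value) {
+    if (name == "genai-threads") {
+        return value >= 1 && value <= 256;
+    }
+    if (name == "genai-embedding_timeout_ms" || name == "genai-rerank_timeout_ms") {
+        return value >= 100 && value <= 300000;
+    }
+    return false;
+}
+```
+
+For instance, `genai_value_in_range("genai-threads", 8)` holds, while a timeout of 50 ms falls below the documented minimum of 100 ms.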
+
+### Admin Commands
+
+```sql
+-- Load/Save GenAI variables
+LOAD GENAI VARIABLES TO RUNTIME;
+SAVE GENAI VARIABLES FROM RUNTIME;
+LOAD GENAI VARIABLES FROM DISK;
+SAVE GENAI VARIABLES TO DISK;
+
+-- Set variables
+SET genai-threads = 8;
+SET genai-embedding_uri = 'http://localhost:8080/embed';
+SET genai-rerank_uri = 'http://localhost:8081/rerank';
+
+-- View variables
+SELECT @@genai-threads;
+SHOW VARIABLES LIKE 'genai-%';
+
+-- Checksum
+CHECKSUM GENAI VARIABLES;
+```
+
+## Query Syntax
+
+### GENAI: Query Format
+
+GenAI queries use the special `GENAI:` prefix followed by JSON:
+
+```sql
+GENAI: {"type": "embed", "documents": ["text1", "text2"]}
+GENAI: {"type": "rerank", "query": "search text", "documents": ["doc1", "doc2"]}
+```
+
+### Supported Operations
+
+#### 1. Embedding
+
+Generate vector embeddings for documents:
+
+```sql
+GENAI: {
+    "type": "embed",
+    "documents": [
+        "Machine learning is a subset of AI.",
+        "Deep learning uses neural networks."
+    ]
+}
+```
+
+**Response:**
+```
++------------------------------------------+
+| embedding                                |
++------------------------------------------+
+| 0.123, -0.456, 0.789, ...                |
+| 0.234, -0.567, 0.890, ...                |
++------------------------------------------+
+```
+
+#### 2. Reranking
+
+Rerank documents by relevance to a query:
+
+```sql
+GENAI: {
+    "type": "rerank",
+    "query": "What is machine learning?",
+    "documents": [
+        "Machine learning is a subset of artificial intelligence.",
+        "The capital of France is Paris.",
+        "Deep learning uses neural networks."
+    ],
+    "top_n": 2,
+    "columns": 3
+}
+```
+
+**Parameters:**
+- `query` (required): Search query text
+- `documents` (required): Array of documents to rerank
+- `top_n` (optional): Maximum results to return (0 = all, default: all)
+- `columns` (optional): 2 = {index, score}, 3 = {index, score, document} (default: 3)
+
+**Response:**
+```
++-------+-------+----------------------------------------------------------+
+| index | score | document                                                 |
++-------+-------+----------------------------------------------------------+
+| 0     | 0.95  | Machine learning is a subset of artificial intelligence. |
+| 2     | 0.82  | Deep learning uses neural networks.                      |
++-------+-------+----------------------------------------------------------+
+```
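+
+The effect of `top_n` and `columns` can be sketched as follows. The names `RerankHit`, `select_top_n`, and `make_row` are hypothetical, and ordering by descending score is an assumption based on the sample response above rather than a documented guarantee.
+
+```cpp
+#include <algorithm>
+#include <cassert>
+#include <cstddef>
+#include <string>
+#include <vector>
+
+// One scored document: its position in the input array and its relevance.
+struct RerankHit {
+    size_t index;
+    double score;
+};
+
+// Orders hits by descending score and keeps at most top_n (0 = keep all).
+std::vector<RerankHit> select_top_n(std::vector<RerankHit> hits, size_t top_n) {
+    std::stable_sort(hits.begin(), hits.end(),
+                     [](const RerankHit& a, const RerankHit& b) { return a.score > b.score; });
+    if (top_n != 0 && top_n < hits.size()) hits.resize(top_n);
+    return hits;
+}
+
+// Builds one result row: columns == 2 gives {index, score};
+// any other value gives {index, score, document}, matching the default of 3.
+std::vector<std::string> make_row(const RerankHit& hit,
+                                  const std::vector<std::string>& documents, int columns) {
+    std::vector<std::string> row{std::to_string(hit.index), std::to_string(hit.score)};
+    if (columns != 2) row.push_back(documents.at(hit.index));
+    return row;
+}
+```
+
+With the three documents from the example and `top_n = 2`, `select_top_n` keeps indexes 0 and 2, and `make_row` with `columns = 3` appends the original document text to each row.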
+
+### Response Format
+
+All GenAI queries are returned to the client as a MySQL resultset. The JSON result that follows the response header describes that resultset with two fields:
+- `columns`: Array of column names
+- `rows`: Array of row data
+
+**Success:**
+```json
+{
+    "columns": ["index", "score", "document"],
+    "rows": [
+        [0, 0.95, "Most relevant document"],
+        [2, 0.82, "Second most relevant"]
+    ]
+}
+```
+
+**Error:**
+```json
+{
+    "error": "Error message describing what went wrong"
+}
+```
+
+## Usage Examples
+
+### Basic Embedding
+
+```sql
+-- Generate embedding for a single document
+GENAI: {"type": "embed", "documents": ["Hello, world!"]};
+
+-- Batch embedding for multiple documents
+GENAI: {
+    "type": "embed",
+    "documents": ["doc1", "doc2", "doc3"]
+};
+```
+
+### Basic Reranking
+
+```sql
+-- Find most relevant documents
+GENAI: {
+    "type": "rerank",
+    "query": "database optimization techniques",
+    "documents": [
+        "How to bake a cake",
+        "Indexing strategies for MySQL",
+        "Python programming basics",
+        "Query optimization in ProxySQL"
+    ]
+};
+```
+
+### Top N Results
+
+```sql
+-- Get only top 3 most relevant documents
+GENAI: {
+    "type": "rerank",
+    "query": "best practices for SQL",
+    "documents": ["doc1", "doc2", "doc3", "doc4", "doc5"],
+    "top_n": 3
+};
+```
+
+### Index and Score Only
+
+```sql
+-- Get only relevance scores (no document text)
+GENAI: {
+    "type": "rerank",
+    "query": "test query",
+    "documents": ["doc1", "doc2"],
+    "columns": 2
+};
+```
+
+## Integration with ProxySQL
+
+### Session Lifecycle
+
+1. **Session Start**: MySQL session creates `genai_epoll_fd_` for monitoring GenAI responses
+2. **Query Received**: `GENAI:` query detected in `handler___status_WAITING_CLIENT_DATA___STATE_SLEEP()`
+3. **Async Send**: A socketpair is created and the request is sent; the handler returns immediately without waiting for the result
+4. **Main Loop**: `check_genai_events()` called on each iteration
+5. **Response Ready**: `handle_genai_response()` processes response
+6. **Result Sent**: MySQL result packet sent to client
+7. **Cleanup**: Socketpair closed, resources freed
+
+### Main Loop Integration
+
+The GenAI event checking is integrated into the main MySQL handler loop:
+
+```cpp
+handler_again:
+    switch (status) {
+        case WAITING_CLIENT_DATA:
+            handler___status_WAITING_CLIENT_DATA();
+#ifdef epoll_create1
+            // Check for GenAI responses before processing new client data
+            if (check_genai_events()) {
+                goto handler_again;  // Process more responses
+            }
+#endif
+            break;
+    }
+```
+
+## Backend Services
+
+### llama-server Integration
+
+The GenAI module is designed to work with [llama-server](https://github.com/ggerganov/llama.cpp), a high-performance C++ inference server for LLaMA models.
+
+#### Starting llama-server
+
+```bash
+# Start embedding server
+./llama-server \
+    --model /path/to/nomic-embed-text-v1.5.gguf \
+    --port 8013 \
+    --embedding \
+    --ctx-size 512
+
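+# Start reranking server (using same model)
+# --reranking enables the /rerank endpoint; llama-server generally expects a
+# dedicated reranker model here rather than a plain embedding model
+./llama-server \
+    --model /path/to/nomic-embed-text-v1.5.gguf \
+    --port 8012 \
+    --reranking \
+    --ctx-size 512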
+```
+
+#### API Compatibility
+
+The GenAI module expects:
+- **Embedding endpoint**: `POST /embedding` with JSON request
+- **Rerank endpoint**: `POST /rerank` with JSON request
+
+Compatible with:
+- llama-server
+- OpenAI-compatible embedding APIs
+- Custom services with matching request/response format
+
+## Testing
+
+### TAP Test Suite
+
+Comprehensive TAP tests are available in `test/tap/tests/genai_async-t.cpp`:
+
+```bash
+cd test/tap/tests
+make genai_async-t
+./genai_async-t
+```
+
+**Test Coverage:**
+- Single async requests
+- Sequential requests (embedding and rerank)
+- Batch requests (10+ documents)
+- Mixed embedding and rerank
+- Request/response matching
+- Error handling (invalid JSON, missing fields)
+- Special characters (quotes, unicode, etc.)
+- Large documents (5KB+)
+- `top_n` and `columns` parameters
+- Concurrent connections
+
+### Manual Testing
+
+```sql
+-- Test embedding
+mysql> GENAI: {"type": "embed", "documents": ["test document"]};
+
+-- Test reranking
+mysql> GENAI: {
+    ->   "type": "rerank",
+    ->   "query": "test query",
+    ->   "documents": ["doc1", "doc2", "doc3"]
+    -> };
+```
+
+## Performance Characteristics
+
+### Non-Blocking Behavior
+
+- **MySQL threads**: Return immediately after sending request (~1ms)
+- **GenAI workers**: Handle blocking HTTP calls (10-100ms typical)
+- **Throughput**: Limited by GenAI service capacity and worker thread count
+
+### Resource Usage
+
+- **Per request**: 1 socketpair (2 file descriptors)
+- **Memory**: Request metadata + pending response storage
+- **Worker threads**: Configurable via `genai-threads` (default: 4)
+
+### Scalability
+
+- **Concurrent requests**: Limited by `genai-threads` and GenAI service capacity
+- **Request queue**: Unlimited (pending requests stored in session map)
+- **Recommended**: Set `genai-threads` to match expected concurrency
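+
+The relationship between worker count and concurrency can be illustrated with a generic fixed-size pool. This is a sketch of the general pattern only: the class name `WorkerPool` is hypothetical, and the real worker loop lives in `lib/GenAI_Thread.cpp` and may be structured differently. At most `n` jobs run at once; the rest wait in an unbounded queue.
+
+```cpp
+#include <cassert>
+#include <condition_variable>
+#include <functional>
+#include <mutex>
+#include <queue>
+#include <thread>
+#include <utility>
+#include <vector>
+
+class WorkerPool {
+public:
+    explicit WorkerPool(unsigned n) {
+        for (unsigned i = 0; i < n; ++i) workers_.emplace_back([this] { run(); });
+    }
+
+    // Lets workers drain the remaining queue, then joins them.
+    ~WorkerPool() {
+        {
+            std::lock_guard<std::mutex> lk(mu_);
+            stopping_ = true;
+        }
+        cv_.notify_all();
+        for (auto& t : workers_) t.join();
+    }
+
+    // Enqueues a job; returns immediately, like the session's async send.
+    void submit(std::function<void()> job) {
+        {
+            std::lock_guard<std::mutex> lk(mu_);
+            jobs_.push(std::move(job));
+        }
+        cv_.notify_one();
+    }
+
+private:
+    void run() {
+        for (;;) {
+            std::function<void()> job;
+            {
+                std::unique_lock<std::mutex> lk(mu_);
+                cv_.wait(lk, [this] { return stopping_ || !jobs_.empty(); });
+                if (jobs_.empty()) return;  // stopping and nothing left to do
+                job = std::move(jobs_.front());
+                jobs_.pop();
+            }
+            job();  // the blocking work (an HTTP call in the real module)
+        }
+    }
+
+    std::mutex mu_;
+    std::condition_variable cv_;
+    std::queue<std::function<void()>> jobs_;
+    bool stopping_ = false;
+    std::vector<std::thread> workers_;
+};
+```
+
+With a pool of 4 workers and 100 submitted jobs, only 4 blocking calls are in flight at any time, which is why a `genai-threads` value below the expected concurrency shows up as queueing delay rather than as errors.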
+
+## Error Handling
+
+### Common Errors
+
+| Error | Cause | Solution |
+|-------|-------|----------|
+| `Failed to create GenAI communication channel` | Socketpair creation failed | Check system limits (ulimit -n) |
+| `Failed to register with GenAI module` | GenAI module not initialized | Run `LOAD GENAI VARIABLES TO RUNTIME` |
+| `Failed to send request to GenAI module` | Write error on socketpair | Check connection stability |
+| `GenAI module not initialized` | GenAI threads not started | Set `genai-threads > 0` and reload |
+
+### Timeout Handling
+
+Requests that exceed `genai-embedding_timeout_ms` or `genai-rerank_timeout_ms` fail. When that happens:
+- The response header carries a status code > 0
+- The JSON result contains an error message
+- The socketpair is closed and its resources are freed
+
+## Monitoring
+
+### Status Variables
+
+```sql
+-- Check GenAI module status (not yet implemented, planned)
+SHOW STATUS LIKE 'genai_%';
+```
+
+**Planned status variables:**
+- `genai_threads_initialized`: Number of worker threads running
+- `genai_active_requests`: Currently processing requests
+- `genai_completed_requests`: Total successful requests
+- `genai_failed_requests`: Total failed requests
+
+### Logging
+
+GenAI operations log at debug level:
+
+```bash
+# Enable GenAI debug logging (run this statement in the ProxySQL admin interface, not in the shell)
+SET mysql-debug = 1;
+
+# Check logs
+tail -f proxysql.log | grep GenAI
+```
+
+## Limitations
+
+### Current Limitations
+
+1. **document_from_sql**: Not yet implemented (requires MySQL connection handling in workers)
+2. **Shared memory**: Result pointer field reserved for future optimization
+3. **Request size**: Limited by socket buffer size (typically 64KB-256KB)
+
+### Platform Requirements
+
+- **Epoll support**: Linux systems (kernel 2.6+)
+- **Socketpair**: Unix domain sockets
+- **Threading**: POSIX threads (pthread)
+
+## Future Enhancements
+
+### Planned Features
+
+1. **document_from_sql**: Execute SQL to retrieve documents for reranking
+2. **Shared memory**: Zero-copy result transfer for large responses
+3. **Connection pooling**: Reuse HTTP connections to GenAI services
+4. **Metrics**: Enhanced monitoring and statistics
+5. **Batch optimization**: Better support for large document batches
+6. **Streaming**: Progressive result delivery for large operations
+
+## Related Documentation
+
+- [Posts Table Embeddings Setup](./posts-embeddings-setup.md) - Using sqlite-rembed with GenAI
+- [SQLite3 Server Documentation](./SQLite3-Server.md) - SQLite3 backend integration
+- [sqlite-rembed Integration](./sqlite-rembed-integration.md) - Embedding generation
+
+## Source Files
+
+### Core Implementation
+
+- `include/GenAI_Thread.h` - GenAI module interface and structures
+- `lib/GenAI_Thread.cpp` - Implementation of listener and worker loops
+- `include/MySQL_Session.h` - Session integration (GenAI async state)
+- `lib/MySQL_Session.cpp` - Async handlers and main loop integration
+- `include/Base_Session.h` - Base session GenAI members
+
+### Tests
+
+- `test/tap/tests/genai_module-t.cpp` - Admin commands and variables
+- `test/tap/tests/genai_embedding_rerank-t.cpp` - Basic embedding/reranking
+- `test/tap/tests/genai_async-t.cpp` - Async architecture tests
+
+## License
+
+Same as ProxySQL - See LICENSE file for details.
+
+## Contributing
+
+For contributions and issues:
+- GitHub: https://github.com/sysown/proxysql
+- Branch: `v3.1-vec_genAI_module`