RAG (Retrieval-Augmented Generation) in ProxySQL

Overview

ProxySQL's RAG subsystem provides retrieval capabilities for LLM-powered applications. It allows you to:

  • Store documents and their embeddings in a SQLite-based vector database
  • Perform keyword search (FTS), semantic search (vector), and hybrid search
  • Fetch document and chunk content
  • Refetch authoritative data from source databases
  • Monitor RAG system statistics

Configuration

To use RAG, enable the GenAI module first, then the RAG features:

-- Enable GenAI module
SET genai.enabled = true;

-- Enable RAG features
SET genai.rag_enabled = true;

-- Configure RAG parameters (optional)
SET genai.rag_k_max = 50;
SET genai.rag_candidates_max = 500;
SET genai.rag_timeout_ms = 2000;

Available MCP Tools

The RAG subsystem provides the following MCP tools via the /mcp/rag endpoint:

Search Tools

  1. rag.search_fts - Keyword search using FTS5

    {
      "query": "search terms",
      "k": 10
    }
    
  2. rag.search_vector - Semantic search using vector embeddings

    {
      "query_text": "semantic search query",
      "k": 10
    }
    
  3. rag.search_hybrid - Hybrid search combining FTS and vectors; "mode" is either "fuse" or "fts_then_vec"

    {
      "query": "search query",
      "mode": "fuse",
      "k": 10
    }
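
The document does not say how the "fuse" mode combines the keyword and vector rankings. As an illustration only, the sketch below uses reciprocal rank fusion (RRF), one common way to merge two ranked lists; it is not taken from ProxySQL's implementation. The function name `rrf_fuse` and the constant `rrf_k=60` are assumptions made for this example.

```python
def rrf_fuse(fts_ranked, vec_ranked, k=10, rrf_k=60):
    """Merge two ranked lists of chunk ids with reciprocal rank fusion.

    Illustrative only: ProxySQL's actual "fuse" scoring is not documented here.
    """
    scores = {}
    for ranked in (fts_ranked, vec_ranked):
        for rank, chunk_id in enumerate(ranked, start=1):
            # Each list contributes 1 / (rrf_k + rank) for every id it contains
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (rrf_k + rank)
    # Highest fused score first; ties broken by chunk id for determinism
    ordered = sorted(scores, key=lambda cid: (-scores[cid], cid))
    return ordered[:k]


if __name__ == "__main__":
    fts_hits = ["a", "b", "c"]   # hypothetical FTS ranking
    vec_hits = ["b", "d", "a"]   # hypothetical vector ranking
    print(rrf_fuse(fts_hits, vec_hits, k=3))
```

Chunks that appear in both lists accumulate score from each, so they tend to rise above chunks found by only one search method.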
    

Fetch Tools

  1. rag.get_chunks - Fetch chunk content by chunk_id

    {
      "chunk_ids": ["chunk1", "chunk2"],
      "return": {
        "include_title": true,
        "include_doc_metadata": true,
        "include_chunk_metadata": true
      }
    }
    
  2. rag.get_docs - Fetch document content by doc_id

    {
      "doc_ids": ["doc1", "doc2"],
      "return": {
        "include_body": true,
        "include_metadata": true
      }
    }
    
  3. rag.fetch_from_source - Refetch authoritative data from source database

    {
      "doc_ids": ["doc1"],
      "columns": ["Id", "Title", "Body"],
      "limits": {
        "max_rows": 10,
        "max_bytes": 200000
      }
    }
    

Admin Tools

  1. rag.admin.stats - Get operational statistics for the RAG system

    {}
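
The JSON objects shown for each tool are the tool arguments. A minimal sketch of wrapping them in a request follows, assuming the /mcp/rag endpoint accepts standard MCP JSON-RPC 2.0 `tools/call` messages over HTTP POST. The host and port in `MCP_RAG_URL` are placeholders, and the helper names are hypothetical.

```python
import json
import urllib.request

# Placeholder address: replace with your ProxySQL MCP listener (assumption)
MCP_RAG_URL = "http://127.0.0.1:6071/mcp/rag"


def build_tool_call(tool_name, arguments, request_id=1):
    """Return a JSON-RPC 2.0 'tools/call' payload for one MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def build_http_request(payload, url=MCP_RAG_URL):
    """Wrap the payload in a POST request; nothing is sent until urlopen() is called."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    payload = build_tool_call("rag.search_fts", {"query": "search terms", "k": 10})
    print(json.dumps(payload, indent=2))
    # Sending requires a running ProxySQL with RAG enabled:
    # with urllib.request.urlopen(build_http_request(payload), timeout=2) as resp:
    #     print(json.load(resp))
```

Keeping payload construction separate from the network call makes it easy to log or inspect a request before sending it.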
    

Database Schema

The RAG subsystem uses the following tables in the vector database (/var/lib/proxysql/ai_features.db):

  • rag_sources - Control plane for ingestion configuration
  • rag_documents - Canonical documents
  • rag_chunks - Retrieval units (chunked content)
  • rag_fts_chunks - FTS5 index for keyword search
  • rag_vec_chunks - Vector index for semantic search
  • rag_sync_state - Sync state for incremental ingestion
  • rag_chunk_view - Convenience view for debugging
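
To confirm that the objects listed above exist, one option is to read SQLite's `sqlite_master` catalog. This is a sketch under the assumption that all seven names appear there as tables or views (virtual tables are reported with type `table`). The helper name `missing_rag_objects` is hypothetical.

```python
import sqlite3

# Names taken from the schema list above
RAG_OBJECTS = [
    "rag_sources",
    "rag_documents",
    "rag_chunks",
    "rag_fts_chunks",
    "rag_vec_chunks",
    "rag_sync_state",
    "rag_chunk_view",
]


def missing_rag_objects(conn):
    """Return the expected RAG tables/views that are absent from the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type IN ('table', 'view')"
    ).fetchall()
    present = {name for (name,) in rows}
    return [name for name in RAG_OBJECTS if name not in present]


def open_readonly(path="/var/lib/proxysql/ai_features.db"):
    """Open the database read-only so the check cannot modify it."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)
```

For example, `missing_rag_objects(open_readonly())` returns an empty list when every expected object is present. Reading the catalog does not require loading any vector or FTS extension.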

Testing

You can test the RAG functionality using the provided test scripts:

# Test RAG functionality via MCP endpoint
./scripts/mcp/test_rag.sh

# Test RAG database schema
cd test/rag
make test_rag_schema
./test_rag_schema

Security

The RAG subsystem includes several security features:

  • Input validation and sanitization
  • Query length limits
  • Result size limits
  • Timeouts for all operations
  • Column whitelisting for refetch operations
  • Row and byte limits for all operations
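
Column whitelisting and row/byte limits can also be mirrored on the client before calling rag.fetch_from_source. The sketch below is a hypothetical client-side guard, not ProxySQL's server-side enforcement. The whitelist and caps are assumptions that reuse the values from the example request earlier in this document.

```python
# Hypothetical client-side policy, mirroring the example request values
ALLOWED_COLUMNS = {"Id", "Title", "Body"}
MAX_ROWS = 10
MAX_BYTES = 200_000


def build_fetch_args(doc_ids, columns, max_rows=MAX_ROWS, max_bytes=MAX_BYTES):
    """Build rag.fetch_from_source arguments, rejecting unknown columns
    and clamping the limits to the configured caps."""
    if not doc_ids:
        raise ValueError("doc_ids must not be empty")
    rejected = [c for c in columns if c not in ALLOWED_COLUMNS]
    if rejected:
        raise ValueError(f"columns not whitelisted: {rejected}")
    return {
        "doc_ids": list(doc_ids),
        "columns": list(columns),
        "limits": {
            # Clamp into the range [1, cap]
            "max_rows": max(1, min(int(max_rows), MAX_ROWS)),
            "max_bytes": max(1, min(int(max_bytes), MAX_BYTES)),
        },
    }
```

Rejecting unknown columns outright, rather than silently dropping them, makes a misconfigured caller fail early instead of receiving partial data.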

Performance

Recommended performance settings:

  • Set genai.rag_timeout_ms to fit your latency budget (250-2000 ms)
  • Limit result sizes (genai.rag_k_max = 50, genai.rag_candidates_max = 500)
  • Use connection pooling for source database connections
  • Monitor resource usage and adjust limits accordingly