ProxySQL RAG Subsystem Implementation Summary

Overview

This implementation adds a Retrieval-Augmented Generation (RAG) subsystem to ProxySQL so that it can act as the retrieval engine for RAG workloads. It follows the blueprint documents and integrates with ProxySQL's existing architecture.

Components Implemented

1. RAG Tool Handler

  • Files: include/RAG_Tool_Handler.h and lib/RAG_Tool_Handler.cpp
  • Class: RAG_Tool_Handler inheriting from MCP_Tool_Handler
  • Functionality: Implements all required MCP tools for RAG operations

2. MCP Integration

  • Files: include/MCP_Thread.h and lib/MCP_Thread.cpp
  • Changes: Added RAG_Tool_Handler member and initialization
  • Endpoint: /mcp/rag registered in ProxySQL_MCP_Server

3. Database Schema

  • File: lib/AI_Features_Manager.cpp
  • Tables Created:
    • rag_sources: Control plane for ingestion configuration
    • rag_documents: Canonical documents
    • rag_chunks: Retrieval units (chunked content)
    • rag_fts_chunks: FTS5 index for keyword search
    • rag_vec_chunks: Vector index for semantic search
    • rag_sync_state: Sync state for incremental ingestion
    • rag_chunk_view: Convenience view for debugging

4. Configuration Variables

  • Files: include/GenAI_Thread.h and lib/GenAI_Thread.cpp
  • Variables Added:
    • genai_rag_enabled: Enable RAG features
    • genai_rag_k_max: Maximum k for search results
    • genai_rag_candidates_max: Maximum candidates for hybrid search
    • genai_rag_query_max_bytes: Maximum query length
    • genai_rag_response_max_bytes: Maximum response size
    • genai_rag_timeout_ms: RAG operation timeout
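As a rough illustration of how limits like these might be applied to an incoming search request, the sketch below checks the query size in bytes and caps k. The RagLimits class, the validate_search_request helper, and the numeric defaults are all hypothetical; whether the real handler clamps an oversized k or rejects the request is not stated in this summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RagLimits:
    # Field names mirror the variables listed above; the values are
    # made-up placeholders, not ProxySQL defaults.
    k_max: int = 50
    query_max_bytes: int = 8192


def validate_search_request(query: str, k: int, limits: RagLimits) -> int:
    """Return the effective k for a request, or raise ValueError.

    Assumption: an oversized k is clamped to k_max, while an oversized
    query is rejected outright.
    """
    # Measure the encoded size, since the limit is expressed in bytes.
    if len(query.encode("utf-8")) > limits.query_max_bytes:
        raise ValueError("query exceeds query_max_bytes")
    if k < 1:
        raise ValueError("k must be a positive integer")
    return min(k, limits.k_max)


effective_k = validate_search_request("connection pooling", 1000, RagLimits())
```

Clamping keeps a client with an overly large k working, while rejecting oversized queries avoids silently truncating user input.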

MCP Tools Implemented

Search Tools

  1. rag.search_fts - Keyword search using FTS5
  2. rag.search_vector - Semantic search using vector embeddings
  3. rag.search_hybrid - Hybrid search with two modes:
    • "fuse": Parallel FTS + vector with Reciprocal Rank Fusion
    • "fts_then_vec": Candidate generation + rerank
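The two hybrid modes can be sketched in a few lines of Python. This is a minimal illustration of the general techniques, not the handler's actual code: the function names, the constant k=60 (a value commonly used with Reciprocal Rank Fusion), and the sample chunk ids and distances are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk ids into one ranking.

    Each id scores the sum of 1 / (k + rank) over the lists that
    contain it, where rank is 1-based.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by id so the output is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def fts_then_vec(fts_candidates, vector_distance, top_k):
    """Rerank keyword-search candidates by vector distance (smaller is closer)."""
    return sorted(fts_candidates, key=lambda cid: vector_distance[cid])[:top_k]


# Hypothetical results from the two searches, best match first.
fts_hits = ["c3", "c1", "c7"]
vec_hits = ["c1", "c9", "c3"]

# "fuse" mode: both result lists contribute to the final order.
fused = reciprocal_rank_fusion([fts_hits, vec_hits])
print(fused)  # ['c1', 'c3', 'c9', 'c7']

# "fts_then_vec" mode: only FTS candidates are considered, then reranked.
distances = {"c3": 0.42, "c1": 0.10, "c7": 0.77}
reranked = fts_then_vec(fts_hits, distances, top_k=2)
print(reranked)  # ['c1', 'c3']
```

In "fuse" mode a chunk found by only one search (c9 here) can still appear in the output, whereas "fts_then_vec" never returns a chunk that the keyword search missed.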

Fetch Tools

  1. rag.get_chunks - Fetch chunk content by chunk_id
  2. rag.get_docs - Fetch document content by doc_id
  3. rag.fetch_from_source - Refetch authoritative data from source

Admin Tools

  1. rag.admin.stats - Operational statistics for RAG system

Key Features

Security

  • Input validation and sanitization
  • Query length limits (genai_rag_query_max_bytes)
  • Result size limits (genai_rag_response_max_bytes)
  • Timeouts for all operations (genai_rag_timeout_ms)
  • Column whitelisting for refetch operations
  • Row and byte limits for all operations
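Column whitelisting for refetch operations can be pictured roughly as follows. This is a hypothetical sketch: build_refetch_query, its parameters, and the sample table are illustrative, and it assumes the table name, primary-key column, and allowed-column set come from trusted configuration rather than from the caller.

```python
def build_refetch_query(table, requested_columns, allowed_columns,
                        pk_column, row_limit):
    """Build a parameterized SELECT restricted to whitelisted columns.

    Only the column list is caller-controlled; every requested column
    must appear in allowed_columns or the request is rejected.
    """
    if not requested_columns:
        raise ValueError("at least one column must be requested")
    rejected = [c for c in requested_columns if c not in allowed_columns]
    if rejected:
        raise ValueError(f"columns not allowed: {rejected}")
    column_list = ", ".join(f"`{c}`" for c in requested_columns)
    # The primary-key value is bound as a parameter, never interpolated.
    return (f"SELECT {column_list} FROM `{table}` "
            f"WHERE `{pk_column}` = ? LIMIT {int(row_limit)}")


sql = build_refetch_query(
    table="posts",
    requested_columns=["id", "title"],
    allowed_columns={"id", "title", "body"},
    pk_column="id",
    row_limit=1,
)
print(sql)  # SELECT `id`, `title` FROM `posts` WHERE `id` = ? LIMIT 1
```

Rejecting unknown columns, instead of dropping them silently, makes a misconfigured or malicious request visible to the caller.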

Performance

  • Proper use of prepared statements
  • Connection management
  • sqlite-vec integration for vector operations
  • FTS5 integration for keyword search
  • Proper indexing strategies

Integration

  • Shares vector database with existing AI features
  • Uses existing LLM_Bridge for embedding generation
  • Integrates with existing MCP infrastructure
  • Follows ProxySQL coding conventions

Testing

Test Scripts

  • scripts/mcp/test_rag.sh: Tests RAG functionality via MCP endpoint
  • test/test_rag_schema.cpp: Tests RAG database schema creation
  • test/build_rag_test.sh: Simple build script for RAG test

Documentation

  • doc/rag-documentation.md: Comprehensive RAG documentation
  • doc/rag-examples.md: Examples of using RAG tools

Usage

To enable RAG functionality:

-- Enable GenAI module
SET genai.enabled = true;

-- Enable RAG features
SET genai.rag_enabled = true;

-- Load configuration
LOAD genai VARIABLES TO RUNTIME;

Then use the MCP tools via the /mcp/rag endpoint.
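MCP tool invocations are JSON-RPC 2.0 requests with the method tools/call, so a call to one of the tools above might be built as in the sketch below. The make_tool_call helper and the argument names query and k are assumptions; this summary does not list the tools' parameters, nor the host and port to send the request to, so the example stops at building the request body.

```python
import json


def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 'tools/call' request for an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Argument names here are hypothetical; check the tool schema for the real ones.
payload = make_tool_call(
    1, "rag.search_fts", {"query": "connection pooling", "k": 5}
)
body = json.dumps(payload)  # request body to POST to the /mcp/rag endpoint
```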