mirror of https://github.com/sysown/proxysql
RAG Subsystem Doxygen Documentation
Overview
The RAG (Retrieval-Augmented Generation) subsystem provides a set of tools for keyword search, semantic search, and document retrieval, exposed through the Model Context Protocol (MCP). This documentation describes the Doxygen-style comments added to the RAG implementation.
Main Classes
RAG_Tool_Handler
The primary class, which implements all RAG functionality exposed through MCP.
Class Definition
class RAG_Tool_Handler : public MCP_Tool_Handler
Constructor
/**
* @brief Constructor
* @param ai_mgr Pointer to AI_Features_Manager for database access and configuration
*
* Initializes the RAG tool handler with configuration parameters from GenAI_Thread
* if available, otherwise uses default values.
*
* Configuration parameters:
* - k_max: Maximum number of search results (default: 50)
* - candidates_max: Maximum number of candidates for hybrid search (default: 500)
* - query_max_bytes: Maximum query length in bytes (default: 8192)
* - response_max_bytes: Maximum response size in bytes (default: 5000000)
* - timeout_ms: Operation timeout in milliseconds (default: 2000)
*/
RAG_Tool_Handler(AI_Features_Manager* ai_mgr);
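The fallback behaviour described in the constructor comment can be pictured with a minimal sketch. The `RagLimits` struct and `resolve_limits` function are hypothetical names used only for illustration; the real handler reads its values from GenAI_Thread and may store them differently.

```cpp
#include <cstddef>

// Hypothetical container for the limits listed in the constructor comment.
// Default values are taken from the documentation above.
struct RagLimits {
    int k_max = 50;                            // maximum number of search results
    int candidates_max = 500;                  // maximum candidates for hybrid search
    std::size_t query_max_bytes = 8192;        // maximum query length in bytes
    std::size_t response_max_bytes = 5000000;  // maximum response size in bytes
    int timeout_ms = 2000;                     // operation timeout in milliseconds
};

// Assumption: a null pointer stands in for "GenAI_Thread not available".
// In that case the defaults above are returned unchanged.
inline RagLimits resolve_limits(const RagLimits* configured) {
    return configured != nullptr ? *configured : RagLimits{};
}
```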
Public Methods
get_tool_list()
/**
* @brief Get list of available RAG tools
* @return JSON object containing tool definitions and schemas
*
* Returns a comprehensive list of all available RAG tools with their
* input schemas and descriptions. Tools include:
* - rag.search_fts: Keyword search using FTS5
* - rag.search_vector: Semantic search using vector embeddings
* - rag.search_hybrid: Hybrid search combining FTS and vectors
* - rag.get_chunks: Fetch chunk content by chunk_id
* - rag.get_docs: Fetch document content by doc_id
* - rag.fetch_from_source: Refetch authoritative data from source
* - rag.admin.stats: Operational statistics
*/
json get_tool_list() override;
execute_tool()
/**
* @brief Execute a RAG tool with arguments
* @param tool_name Name of the tool to execute
* @param arguments JSON object containing tool arguments
* @return JSON response with results or error information
*
* Executes the specified RAG tool with the provided arguments. Handles
* input validation, parameter processing, database queries, and result
* formatting according to MCP specifications.
*
* Supported tools:
* - rag.search_fts: Full-text search over documents
* - rag.search_vector: Vector similarity search
* - rag.search_hybrid: Hybrid search with two modes (fuse, fts_then_vec)
* - rag.get_chunks: Retrieve chunk content by ID
* - rag.get_docs: Retrieve document content by ID
* - rag.fetch_from_source: Refetch data from authoritative source
* - rag.admin.stats: Get operational statistics
*/
json execute_tool(const std::string& tool_name, const json& arguments) override;
Private Helper Methods
Database and Query Helpers
/**
* @brief Execute database query and return results
* @param query SQL query string to execute
* @return SQLite3_result pointer or NULL on error
*
* Executes a SQL query against the vector database and returns the results.
* Handles error checking and logging. The caller is responsible for freeing
* the returned SQLite3_result.
*/
SQLite3_result* execute_query(const char* query);
/**
* @brief Validate and limit k parameter
* @param k Requested number of results
* @return Validated k value within configured limits
*
* Ensures the k parameter is within acceptable bounds (1 to k_max).
* Returns default value of 10 if k is invalid.
*/
int validate_k(int k);
/**
* @brief Validate and limit candidates parameter
* @param candidates Requested number of candidates
* @return Validated candidates value within configured limits
*
* Ensures the candidates parameter is within acceptable bounds (1 to candidates_max).
* Returns default value of 50 if candidates is invalid.
*/
int validate_candidates(int candidates);
/**
* @brief Validate query length
* @param query Query string to validate
* @return true if query is within length limits, false otherwise
*
* Checks if the query string length is within the configured query_max_bytes limit.
*/
bool validate_query_length(const std::string& query);
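A minimal sketch of how the three validation helpers might behave follows. It assumes that values below 1 are treated as invalid and replaced by the default, and that values above the maximum are clamped to it; the comments above do not spell out the over-limit case, so that part is a guess. The `_sketch` function names and the free-function form are illustrative only.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Shared rule (assumed): invalid (< 1) -> default, too large -> clamp to max.
inline int clamp_or_default(int value, int max_value, int default_value) {
    if (value < 1) {
        // Never hand back a default that exceeds the configured maximum.
        return std::min(default_value, max_value);
    }
    return std::min(value, max_value);
}

// Default of 10 and k_max of 50 come from the documentation above.
inline int validate_k_sketch(int k, int k_max = 50) {
    return clamp_or_default(k, k_max, 10);
}

// Default of 50 and candidates_max of 500 come from the documentation above.
inline int validate_candidates_sketch(int candidates, int candidates_max = 500) {
    return clamp_or_default(candidates, candidates_max, 50);
}

// std::string::size() counts bytes, which matches a limit expressed in bytes.
inline bool validate_query_length_sketch(const std::string& query,
                                         std::size_t query_max_bytes = 8192) {
    return query.size() <= query_max_bytes;
}
```

Under these assumptions, a request for `k = 1000` yields 50, while `k = 0` yields the default of 10.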
JSON Parameter Extraction
/**
* @brief Extract string parameter from JSON
* @param j JSON object to extract from
* @param key Parameter key to extract
* @param default_val Default value if key not found
* @return Extracted string value or default
*
* Safely extracts a string parameter from a JSON object, handling type
* conversion if necessary. Returns the default value if the key is not
* found or cannot be converted to a string.
*/
static std::string get_json_string(const json& j, const std::string& key,
const std::string& default_val = "");
/**
* @brief Extract int parameter from JSON
* @param j JSON object to extract from
* @param key Parameter key to extract
* @param default_val Default value if key not found
* @return Extracted int value or default
*
* Safely extracts an integer parameter from a JSON object, handling type
* conversion from string if necessary. Returns the default value if the
* key is not found or cannot be converted to an integer.
*/
static int get_json_int(const json& j, const std::string& key, int default_val = 0);
/**
* @brief Extract bool parameter from JSON
* @param j JSON object to extract from
* @param key Parameter key to extract
* @param default_val Default value if key not found
* @return Extracted bool value or default
*
* Safely extracts a boolean parameter from a JSON object, handling type
* conversion from string or integer if necessary. Returns the default
* value if the key is not found or cannot be converted to a boolean.
*/
static bool get_json_bool(const json& j, const std::string& key, bool default_val = false);
/**
* @brief Extract string array from JSON
* @param j JSON object to extract from
* @param key Parameter key to extract
* @return Vector of extracted strings
*
* Safely extracts a string array parameter from a JSON object, filtering
* out non-string elements. Returns an empty vector if the key is not
* found or is not an array.
*/
static std::vector<std::string> get_json_string_array(const json& j, const std::string& key);
/**
* @brief Extract int array from JSON
* @param j JSON object to extract from
* @param key Parameter key to extract
* @return Vector of extracted integers
*
* Safely extracts an integer array parameter from a JSON object, handling
* type conversion from string if necessary. Returns an empty vector if
* the key is not found or is not an array.
*/
static std::vector<int> get_json_int_array(const json& j, const std::string& key);
Scoring and Normalization
/**
* @brief Compute Reciprocal Rank Fusion score
* @param rank Rank position (1-based)
* @param k0 Smoothing parameter
* @param weight Weight factor for this ranking
* @return RRF score
*
* Computes the Reciprocal Rank Fusion score for hybrid search ranking.
* Formula: weight / (k0 + rank)
*/
double compute_rrf_score(int rank, int k0, double weight);
/**
* @brief Normalize scores to 0-1 range (higher is better)
* @param score Raw score to normalize
* @param score_type Type of score being normalized
* @return Normalized score in 0-1 range
*
* Normalizes various types of scores to a consistent 0-1 range where
* higher values indicate better matches. Different score types may
* require different normalization approaches.
*/
double normalize_score(double score, const std::string& score_type);
Tool Specifications
rag.search_fts
Keyword search over documents using FTS5.
Parameters
- query (string, required): Search query string
- k (integer): Number of results to return (default: 10, max: 50)
- offset (integer): Offset for pagination (default: 0)
- filters (object): Filter criteria for results
- return (object): Return options for result fields
Filters
- source_ids (array of integers): Filter by source IDs
- source_names (array of strings): Filter by source names
- doc_ids (array of strings): Filter by document IDs
- min_score (number): Minimum score threshold
- post_type_ids (array of integers): Filter by post type IDs
- tags_any (array of strings): Filter by any of these tags
- tags_all (array of strings): Filter by all of these tags
- created_after (string): Filter by creation date (after)
- created_before (string): Filter by creation date (before)
Return Options
- include_title (boolean): Include title in results (default: true)
- include_metadata (boolean): Include metadata in results (default: true)
- include_snippets (boolean): Include snippets in results (default: false)
rag.search_vector
Semantic search over documents using vector embeddings.
Parameters
- query_text (string, required): Text to search semantically
- k (integer): Number of results to return (default: 10, max: 50)
- filters (object): Filter criteria for results
- embedding (object): Embedding model specification
- query_embedding (object): Precomputed query embedding
- return (object): Return options for result fields
rag.search_hybrid
Hybrid search combining FTS and vector search.
Parameters
- query (string, required): Search query for both FTS and vector
- k (integer): Number of results to return (default: 10, max: 50)
- mode (string): Search mode: 'fuse' or 'fts_then_vec'
- filters (object): Filter criteria for results
- fuse (object): Parameters for fuse mode
- fts_then_vec (object): Parameters for fts_then_vec mode
Fuse Mode Parameters
- fts_k (integer): Number of FTS results for fusion (default: 50)
- vec_k (integer): Number of vector results for fusion (default: 50)
- rrf_k0 (integer): RRF smoothing parameter (default: 60)
- w_fts (number): Weight for FTS scores (default: 1.0)
- w_vec (number): Weight for vector scores (default: 1.0)
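To show how the fuse parameters could interact, here is a minimal sketch that merges an FTS ranking and a vector ranking with Reciprocal Rank Fusion. `FuseParams` and `fuse_rrf` are hypothetical names, the inputs are assumed to be chunk IDs already ordered best-first (and already limited to fts_k and vec_k entries), and the tie-break by ID is an illustrative choice, not documented behaviour.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Defaults mirror the documented fuse-mode parameters.
struct FuseParams {
    int rrf_k0 = 60;
    double w_fts = 1.0;
    double w_vec = 1.0;
};

typedef std::pair<std::string, double> ScoredId;

// Sums weight / (k0 + rank) over both rankings and returns the top k IDs.
inline std::vector<ScoredId> fuse_rrf(const std::vector<std::string>& fts_ranked,
                                      const std::vector<std::string>& vec_ranked,
                                      const FuseParams& p,
                                      std::size_t k) {
    std::unordered_map<std::string, double> scores;
    for (std::size_t i = 0; i < fts_ranked.size(); ++i) {
        scores[fts_ranked[i]] += p.w_fts / (p.rrf_k0 + static_cast<double>(i + 1));
    }
    for (std::size_t i = 0; i < vec_ranked.size(); ++i) {
        scores[vec_ranked[i]] += p.w_vec / (p.rrf_k0 + static_cast<double>(i + 1));
    }

    std::vector<ScoredId> fused(scores.begin(), scores.end());
    std::sort(fused.begin(), fused.end(),
              [](const ScoredId& a, const ScoredId& b) {
                  if (a.second != b.second) return a.second > b.second;
                  return a.first < b.first;  // deterministic order for equal scores
              });
    if (fused.size() > k) {
        fused.resize(k);
    }
    return fused;
}
```

A chunk that appears in both rankings collects two contributions, so it tends to outrank chunks found by only one of the searches.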
FTS Then Vector Mode Parameters
- candidates_k (integer): FTS candidates to generate (default: 200)
- rerank_k (integer): Candidates to rerank with vector search (default: 50)
- vec_metric (string): Vector similarity metric (default: 'cosine')
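One plausible reading of the fts_then_vec parameters is a two-stage pipeline: FTS produces up to candidates_k chunks, the first rerank_k of them are rescored by vector similarity, and the best k are returned. The sketch below illustrates only the rerank stage under that assumption, using cosine similarity. `Candidate`, `cosine_similarity`, and `rerank_by_vector` are hypothetical names, and embeddings are passed in memory here rather than read from the vector table.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct Candidate {
    std::string chunk_id;
    std::vector<float> embedding;
};

// Cosine similarity; returns 0.0 for mismatched, empty, or zero-length vectors.
inline double cosine_similarity(const std::vector<float>& a, const std::vector<float>& b) {
    if (a.size() != b.size() || a.empty()) return 0.0;
    double dot = 0.0, norm_a = 0.0, norm_b = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += static_cast<double>(a[i]) * b[i];
        norm_a += static_cast<double>(a[i]) * a[i];
        norm_b += static_cast<double>(b[i]) * b[i];
    }
    if (norm_a == 0.0 || norm_b == 0.0) return 0.0;
    return dot / (std::sqrt(norm_a) * std::sqrt(norm_b));
}

// fts_candidates is assumed to be in FTS rank order, already limited to candidates_k.
inline std::vector<std::string> rerank_by_vector(const std::vector<Candidate>& fts_candidates,
                                                 const std::vector<float>& query_embedding,
                                                 std::size_t rerank_k = 50,
                                                 std::size_t k = 10) {
    const std::size_t n = std::min(rerank_k, fts_candidates.size());
    std::vector<std::pair<double, std::string> > scored;
    scored.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        scored.emplace_back(cosine_similarity(fts_candidates[i].embedding, query_embedding),
                            fts_candidates[i].chunk_id);
    }
    // stable_sort keeps the FTS order among candidates with equal similarity.
    std::stable_sort(scored.begin(), scored.end(),
                     [](const std::pair<double, std::string>& a,
                        const std::pair<double, std::string>& b) { return a.first > b.first; });
    std::vector<std::string> result;
    for (std::size_t i = 0; i < scored.size() && i < k; ++i) {
        result.push_back(scored[i].second);
    }
    return result;
}
```

Keeping rerank_k smaller than candidates_k bounds the number of similarity computations, at the cost of never promoting chunks that FTS ranked below the cut-off.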
rag.get_chunks
Fetch chunk content by chunk_id.
Parameters
- chunk_ids (array of strings, required): List of chunk IDs to fetch
- return (object): Return options for result fields
rag.get_docs
Fetch document content by doc_id.
Parameters
- doc_ids (array of strings, required): List of document IDs to fetch
- return (object): Return options for result fields
rag.fetch_from_source
Refetch authoritative data from source database.
Parameters
- doc_ids (array of strings, required): List of document IDs to refetch
- columns (array of strings): List of columns to fetch
- limits (object): Limits for the fetch operation
rag.admin.stats
Get operational statistics for RAG system.
Parameters
None
Database Schema
The RAG subsystem uses the following tables in the vector database:
- rag_sources: Ingestion configuration and source metadata
- rag_documents: Canonical documents with stable IDs
- rag_chunks: Chunked content for retrieval
- rag_fts_chunks: FTS5 contentless index for keyword search
- rag_vec_chunks: sqlite3-vec virtual table for vector similarity search
- rag_sync_state: Sync state tracking for incremental ingestion
- rag_chunk_view: Convenience view for debugging
Security Features
- Input Validation: Strict validation of all parameters and filters
- Query Limits: Maximum limits on query length, result count, and candidates
- Timeouts: Configurable operation timeouts to prevent resource exhaustion
- Column Whitelisting: Strict column filtering for refetch operations
- Row and Byte Limits: Maximum limits on returned data size
- Parameter Binding: Safe parameter binding to prevent SQL injection
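Column whitelisting for refetch operations can be illustrated with a minimal sketch. It assumes that requested columns not present in the allowed set are silently dropped; the actual handler might instead reject the whole request. The function name `filter_columns` and the exact-match, case-sensitive comparison are illustrative choices.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Keeps only requested columns that appear verbatim in the allowed set,
// preserving the requested order. Anything else never reaches the SQL text.
inline std::vector<std::string> filter_columns(
        const std::vector<std::string>& requested,
        const std::unordered_set<std::string>& allowed) {
    std::vector<std::string> kept;
    for (const std::string& column : requested) {
        if (allowed.count(column) != 0) {
            kept.push_back(column);
        }
    }
    return kept;
}
```

Because only exact matches survive, an input such as `"body; DROP TABLE x"` is discarded rather than escaped, which complements parameter binding for values.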
Performance Features
- Prepared Statements: Efficient query execution with prepared statements
- Connection Management: Proper database connection handling
- SQLite3-vec Integration: Optimized vector operations
- FTS5 Integration: Efficient full-text search capabilities
- Indexing Strategies: Proper database indexing for performance
- Result Caching: Efficient result processing and formatting
Configuration Variables
- genai_rag_enabled: Enable RAG features
- genai_rag_k_max: Maximum k for search results (default: 50)
- genai_rag_candidates_max: Maximum candidates for hybrid search (default: 500)
- genai_rag_query_max_bytes: Maximum query length in bytes (default: 8192)
- genai_rag_response_max_bytes: Maximum response size in bytes (default: 5000000)
- genai_rag_timeout_ms: RAG operation timeout in ms (default: 2000)