This commit fixes a bug where connection cleanup loops were removing
the wrong connection from the pool. The loops checked each connection
by index (i), but when an expired connection was found, they removed
index 0 instead of index i.
This caused:
- Fresh connections at index 0 to be incorrectly deleted
- Expired connections to remain in the pool
Fixed files:
- lib/Base_HostGroups_Manager.cpp:2603 (MySQL)
- lib/MySQL_HostGroups_Manager.cpp:2939 (MySQL)
- lib/PgSQL_HostGroups_Manager.cpp:2782 (PgSQL)
The fix changes `remove(0)` to `remove(i)` to remove the correct
connection.
Related: #5094 (fixes similar issue in drop_all_idle_connections)
This commit addresses critical issues identified in the PR review:
1. Build system redundancy (src/Makefile):
- Remove redundant PROXYSQLGENAI conditionals in linking commands
- The flag doesn't affect linking, so simplify to PROXYSQLCLICKHOUSE only
2. Missing PROXYSQLGENAI guard (lib/Admin_Handler.cpp):
- Add #ifdef PROXYSQLGENAI around MCP VARIABLES DISK commands
- Ensures MCP commands are only available when GenAI is enabled
3. Broken retry logic (lib/LLM_Clients.cpp):
- Remove misleading is_retryable_error() checks that used stale values
- last_curl_code and last_http_code were never updated
- Simplify to retry on empty responses only (documented limitation)
- Fix thread-safety: use thread_local std::mt19937 instead of rand()
4. Resource leak (lib/MySQL_Tool_Handler.cpp):
- Clean up previously created connections on init failure
- Both mysql_init and mysql_real_connect error paths now clean up
5. Race condition (lib/Query_Tool_Handler.cpp):
- Add missing pool_lock in find_connection()
- Prevents race condition when accessing connection_pool
Fixes identified by automated PR review agents.
This commit fixes compilation errors when building with PROXYSQLGENAI=1:
Header file fixes:
- Add missing #endif /* PROXYSQLGENAI */ before header guards in multiple headers
- Remove #include cpp.h from GenAI headers to avoid circular dependencies
Source file fixes:
- Add #include proxysql.h to GenAI .cpp files that were missing it
- Add #include Static_Harvester.h to Query_Tool_Handler.cpp for forward decl
Build system fixes:
- Remove vec.o from libproxysql.a (it's linked separately in src/Makefile)
- Prevents duplicate symbol errors during linking
Test files:
- Rename MCP test files with .sh suffix to prevent make clean deletion
Both build modes now work:
- make build_lib_debug (without GenAI)
- PROXYSQLGENAI=1 make build_lib_debug (with GenAI)
This commit merges the experimental v4.0 GenAI/MCP features into the stable
v3.0 branch using conditional compilation. All v4.0 features are disabled by
default and only enabled when PROXYSQLGENAI=1 is set at compile time.
Changes:
Build System:
- Modified main Makefile to pass PROXYSQLGENAI flag to sub-makefiles
- Modified deps/Makefile to conditionally build sqlite-vec and sqlite-rembed
- Modified lib/Makefile to add PSQLGA flag and include GenAI object files
- Modified src/Makefile to add PSQLGA flag and conditional linking
Headers (wrapped with #ifdef PROXYSQLGENAI):
- All 20 new GenAI header files in include/
- Modified cpp.h, proxysql_glovars.hpp, proxysql_admin.h
- Modified ProxySQL_Admin_Tables_Definitions.h for GenAI/MCP tables
Source Files:
- All 22 new GenAI source files in lib/ wrapped with #ifdef PROXYSQLGENAI
- Modified src/main.cpp for conditional global variables and init/shutdown
- Modified Admin_Handler.cpp for conditional command handlers
- Modified Admin_Bootstrap.cpp for conditional table registration
- Modified Admin_FlushVariables.cpp for conditional variable flushing
- Modified ProxySQL_Admin.cpp for conditional admin methods
- Modified ProxySQL_Admin_Stats.cpp for conditional MCP stats functions
- Modified proxy_sqlite3_symbols.cpp to always compile (needed by core)
- Modified MySQL_Session.cpp for conditional GenAI function calls
Test Files:
- Renamed test_mcp_query_rules-t to test_mcp_query_rules-t.sh
- Renamed test_mcp_rag_metrics-t to test_mcp_rag_metrics-t.sh
- Modified anomaly_detection-t.cpp for conditional test execution
Usage:
# Build without GenAI (v3.0 mode - default)
make clean && make build_deps -j$(nproc) && make build_lib -j$(nproc) && make build_src -j$(nproc)
# Build with GenAI (v4.0 mode)
make clean && PROXYSQLGENAI=1 make build_deps -j$(nproc) && PROXYSQLGENAI=1 make build_lib -j$(nproc) && PROXYSQLGENAI=1 make build_src -j$(nproc)
- `sqlite-vec` requires that `knn` queries (using the `MATCH` operator for
  vector similarity search) have a `LIMIT` clause at the same query
  level as the `MATCH` clause.
- Execute `knn` queries as a subquery and then do `JOIN`s with
`rag_chunks` and `rag_documents`.
Signed-off-by: Wazir Ahmed <wazir@proxysql.com>
Merge changes from PR #5318 (RAG ingestion feature) into the new
development branch v4.0_rag_ingest_2.
Changes include:
- RAG ingestion tool (rag_ingest) with chunking and embeddings
- MySQL_Catalog fixes for NULL pointer handling
- MCP server updates for RAG tools
- Comprehensive documentation and test scripts
- Fix NULL pointer dereference in rag_ingest.cpp: use str_or_empty() helper
for all sqlite3_column_text results assigned to std::string
- Fix NULL tags/links crash in MySQL_Catalog.cpp: add null guards before
assigning sqlite3_column_text results to std::string
- Fix missing curl_global_cleanup on error path in rag_ingest.cpp
- Fix std::out_of_range exception in rag_ingest.cpp: wrap std::stoll calls
in try-catch blocks, fall back to string comparison on overflow
- Fix Makefile: Use $(CXXFLAGS) directly for consistency with build philosophy
- Fix MySQL_Catalog: Return proper error JSON instead of empty array on missing query
This commit addresses concerns raised by AI code reviewers (gemini-code-assist,
Copilot, coderabbitai) on the initial security fixes.
Critical fixes:
- Fix lock.release() → lock.unlock() in GenAI_Thread.cpp worker_loop
(lock.release() detaches without unlocking, causing deadlock)
- Add missing early return after schema validation failure in Query_Tool_Handler.cpp
Code quality improvements:
- Improve escape_string() memory management in MySQL_Tool_Handler.cpp:
- Use std::string instead of new[]/delete[] for buffer management
- Check return value of mysql_real_escape_string() for errors
- Remove redundant validation checks in validate_sql_identifier functions
(character class loop already rejects unsafe characters)
- Add backslash escaping to escape_string_literal() for defense-in-depth
- Improve column list validation in MySQL_Tool_Handler sample_rows():
- Replace blacklist approach with proper column identifier parsing
- Allow qualified identifiers (table.column)
- Allow AS aliases (column AS alias)
- No longer rejects legitimate column names containing "JOIN"
These changes improve robustness while maintaining the security posture
of the original SQL injection fixes.
This commit addresses critical and important security vulnerabilities found
during comprehensive code review of the Gen AI features merge.
Critical fixes:
- SQL injection vulnerabilities in MySQL_Tool_Handler.cpp:
- Added validate_sql_identifier() for schema/table validation
- Added escape_string() for MySQL string escaping using mysql_real_escape_string
- Fixed list_tables(), describe_table(), sample_rows(), sample_distinct()
- SQL injection vulnerabilities in Query_Tool_Handler.cpp:
- Added validate_sql_identifier_sqlite() for identifier validation
- Added escape_string_literal() for SQLite string escaping
- Fixed list_tables tool and catalog.get_relationships function
- Use-after-free race condition in GenAI_Thread:
- Changed shutdown_ from int to std::atomic<int> for proper memory ordering
- Added additional shutdown check in worker_loop after popping request
Important fixes:
- Buffer overflow risks from sprintf usage:
- Converted all sprintf() calls to snprintf() in GenAI_Thread.cpp
- Converted sprintf() to snprintf() in MySQL_Session.cpp
- Worker loop shutdown race condition:
- Added shutdown check after popping request from queue
- Properly clean up client_fd when shutdown is detected
These fixes ensure:
1. All user input is properly validated before use in SQL queries
2. String values are properly escaped using database-specific escaping
3. Thread-safe shutdown with proper memory ordering guarantees
4. Bounds-safe string formatting to prevent buffer overflows
This addresses an issue from PR #22 where LoadPlugin() was completely
disabled. The function performs necessary initialization even when no
plugin is loaded (initializes built-in sqlite3 function pointers).
Changes:
- Added `const bool allow_load_plugin = false` flag in LoadPlugin()
- Modified `if (plugin_name)` to `if (plugin_name && allow_load_plugin == true)`
- Re-enabled the LoadPlugin() call in LoadPlugins()
The plugin loading code remains disabled (allow_load_plugin=false) while
the function pointer initialization from built-in SQLite3 now works correctly.
TODO: Revisit plugin loading safety mechanism to allow actual plugin loading.
- Fix comment mismatch: Changed _2 suffix to match actual function name
(mysql_query_digest_and_first_comment, not _2)
- Make get_def_mysql_opts() static to avoid symbol pollution
- Fix NULL first_comment parameter to prevent potential segfault
- Pass valid char* pointer instead of NULL
- Free first_comment if allocated by the function
Address coderabbitai review - implement full JSON escaping for SQL digest:
- Handle backslash (\) and double quote (")
- Handle control characters: newline (\n), carriage return (\r), tab (\t)
- Handle other control characters (U+0000 through U+001F) with \uXXXX escapes
This ensures digest_text in stats_mcp_query_digest is always valid JSON,
preventing parsing errors for consumers of this data.
Fix issues identified by coderabbitai review:
1. Admin_Handler.cpp: Fix typo in strncasecmp for LOAD MCP QUERY RULES
- Line 2365 had "LOAD MCP RULES FROM DISK" instead of "LOAD MCP QUERY RULES FROM DISK"
2. Admin_Handler.cpp: Fix use-after-free and missing client response
- Removed l_free(*ql,*q) which freed *q before caller used it
- Added send_error_msg_to_client calls on all error paths
- Added send_ok_msg_to_client call on success path
- Changed return value from true to false (to match handler pattern)
- Applied to both LOAD MCP QUERY RULES and SAVE MCP QUERY RULES handlers
3. ProxySQL_Admin_Stats.cpp: Fix sprintf SQL injection in stats___mcp_query_rules
- Replaced sprintf with prepared statement using positional parameters
- Changed from char* query with malloc/free to sqlite3_stmt with prepare_v2
- Both columns now bound as parameters (?1, ?2)
Fix compilation errors in the SQL injection fixes:
1. ProxySQL_Admin_Stats.cpp: Use public statsdb->prepare_v2() API
- Changed from direct proxy_sqlite3_prepare_v2() calls with statsdb->db
- statsdb->db is private, must use public prepare_v2(query, &stmt) method
2. Admin_Handler.cpp: Add SPA cast for template function access
- Added ProxySQL_Admin *SPA=(ProxySQL_Admin *)pa; declaration
- Changed all admindb->execute to SPA->admindb->execute
- Removed unused 'error' and 'success' variables
The build now completes successfully.
Fix two SQL injection vulnerabilities identified by coderabbitai in
ProxySQL_Admin_Stats.cpp by converting sprintf/snprintf interpolation
to SQLite prepared statements.
1. stats___mcp_query_digest (lines 2581-2636):
- Changed from sprintf with format string to prepared statement
- digest_text field now properly bound as parameter (was unsafe)
- All 10 columns now use positional parameters (?1-?10)
2. stats___mcp_query_tools_counters (lines 1587-1635):
- Changed from snprintf with unescaped fields to prepared statement
- Fixed incorrect table name logic (was appending _reset incorrectly)
- All 8 columns now use positional parameters (?1-?8)
These changes prevent SQL injection when resultset fields contain
quotes, backslashes, or other SQL metacharacters.
Fix multi-statement execution in Admin_Handler.cpp for MCP query rules.
The previous code built a single SQL string with "DELETE ...; INSERT ..." but
SQLite only executed the first statement.
Changed to execute statements as an explicit transaction:
1. BEGIN
2. DELETE FROM target_table
3. INSERT OR REPLACE INTO target_table SELECT * FROM source_table
4. COMMIT (or ROLLBACK on error)
Applied to both:
- LOAD MCP QUERY RULES FROM DISK/TO MEMORY
- SAVE MCP QUERY RULES TO DISK
Addresses coderabbitai review comment.
Change free(resultset) to delete resultset in
ProxySQL_Admin::load_mcp_query_rules_to_runtime.
SQLite3_result is a C++ class allocated with new, so it must be
deallocated with delete, not free(). Using free() causes undefined
behavior and memory leaks.
Addresses coderabbitai review comment.
Fix two issues in Query_Tool_Handler's execute_query functions:
1. mysql_query() failure path now returns immediately after
return_connection() instead of continuing to process on bad state.
2. Capture affected_rows BEFORE return_connection() to avoid race
condition. Previously, mysql_affected_rows() was called after
return_connection(), potentially accessing a stale connection.
Apply fixes to both execute_query() and execute_query_with_schema().
Addresses coderabbitai review comments.
Add escape_sql_string() helper function that doubles single quotes
to prevent SQL injection when strings are used in SQL string
concatenation. Update harvest_view_definitions to use this function
for view_def, schema_name, and view_name.
This prevents SQL injection in the UPDATE statement that stores
view definitions in the catalog.
Addresses coderabbitai review comment.
The execute_parameterized_query function in RAG_Tool_Handler was creating
a prepared statement and binding parameters, but then executing the raw query
string instead of the prepared statement. This completely bypassed the
parameter binding, making the function useless for preventing SQL injection.
Changed to use vector_db->execute_prepared(stmt, ...) to execute the bound
statement instead of the raw query, so the bound parameters are actually used.
Addresses coderabbitai review comment.
Change free(resultset) to delete resultset in Query_Tool_Handler
(extract_schema_name function). SQLite3_result is a C++ class allocated
with new, so it must be deallocated with delete, not free(). Using free()
causes mixed allocator UB.
Addresses coderabbitai review comment.
Add catalog_au AFTER UPDATE trigger in MySQL_Catalog that mirrors the
delete+insert pattern used in catalog_ad/catalog_ai. This keeps the FTS
index current when upserts occur (INSERT OR REPLACE ... ON CONFLICT ...
DO UPDATE), since the UPDATE doesn't trigger INSERT/DELETE triggers.
The trigger first deletes the old entry from catalog_fts then inserts
the new entry, ensuring the full-text search index stays synchronized
with the catalog table.
Addresses coderabbitai review comment.
Add assert(proxy_sqlite3_bind_blob) to the assertion block in
SQLite3DB::LoadPlugin to ensure the symbol is provided by plugins.
Without this, if a plugin fails to provide the symbol, the code will
crash at runtime with no safety check.
proxy_sqlite3_bind_blob is actively used in Anomaly_Detector.cpp:765
to bind embeddings.
Addresses coderabbitai review comment.
- lib/MySQL_Catalog.cpp: Convert search/list/remove to use SQLite prepared
statements instead of string concatenation for user parameters
- lib/RAG_Tool_Handler.cpp: Add escape_fts_query() function to properly
escape single quotes in FTS5 MATCH clauses; update all FTS and vector
MATCH queries to use escaped values
- lib/Static_Harvester.cpp: Add is_valid_schema_name() validation function
to ensure schema names only contain safe characters (alphanumeric,
underscore, dollar sign) before using in INFORMATION_SCHEMA queries
- lib/Query_Tool_Handler.cpp: Add clarifying comments to validate_readonly_query
explaining the blacklist (quick exit) + whitelist (allowed query types) approach
- Remove backup file lib/Anomaly_Detector.cpp.bak
Addresses gemini-code-assist review comments from PR #26.
- Fixed invalid memory accesses for tables:
+ mcp_query_rules
+ stats_mcp_query_rules
+ stats_mcp_query_digests
- Fixed inactive 'mcp_query_rules' being loaded to runtime.
- Fixed hash computation in 'compute_mcp_digest'.
- Fixed invalid escaping during 'stats_mcp_query_digests' generation.
- Fixed digest generation for MCP arguments:
+ SQL queries are now preserved using
'mysql_query_digest_and_first_comment'.
+ TODO: Options for the tokenizer are right now hardcoded.
- Added initial testing and testing plan for MCP query_(rules/digests).
+ TODO: Testing currently stops at phase 8. Timeouts destroy the MCP
connection, leaving it unusable for subsequent queries; this should
be fixed before testing can continue.
- TODO: There are several limitations to fix in
'validate_readonly_query'. These are reflected in some query hacks in
the testing.
+ 'SELECT' statements starting with comments (--) get flagged as non-read-only.
+ 'SELECT' must follow a 'SELECT .* FROM' structure. While common,
simple test queries often lack this form.
This commit addresses valid concerns raised by coding agents (Gemini, Copilot, CoderabbitAI):
1. Fix stats_mcp_query_digest naming conflict (ProxySQL_Admin.cpp):
- Made reset and non-reset paths mutually exclusive using else block
- Prevents both flags from being true, matching MySQL pattern
- Ensures reset takes precedence over non-reset
2. Fix INSERT OR REPLACE sync issue (Admin_Handler.cpp):
- Added DELETE before INSERT OR REPLACE in LOAD/SAVE MCP QUERY RULES
- Prevents stale rules from persisting when syncing disk <-> memory
- Ensures deleted source rows are also removed from target
3. Fix integer division truncation for timeout (Query_Tool_Handler.cpp):
- Changed timeout_ms/1000 to (timeout_ms+999)/1000 for ceiling division
- Ensures sub-second timeouts (e.g., 500ms) become at least 1 second
- Prevents zero-second timeouts from causing unexpected behavior
4. Remove confusing comment (Discovery_Schema.cpp):
- Simplified column count comment to be clear and accurate
Note: The re_modifiers parsing code already correctly handles VARCHAR "CASELESS"
to int conversion (lines 2414-2425), so that review comment was already addressed.
- Address SQL injection vulnerabilities by adding input validation and escaping
- Fix configuration variable handling in get_variable and set_variable methods for RAG variables
- Make embedding dimension configurable for rag_vec_chunks table
- Remove code duplication in SQL filter building logic by creating consolidated build_sql_filters function
- Update all search tools (FTS, vector, hybrid) to use consolidated filter building
Issue
-----
- ProxySQL only supports the `tools` feature of the MCP protocol
and does not support features such as `prompts` and `resources`.
- Although ProxySQL expresses this in its `initialize` response
(the server capabilities list contains only the `tools` object),
clients such as Warp Terminal ignore it and continue to send
requests for methods such as `prompts/list` and `resources/list`.
- Any response other than `HTTP 200 OK` is treated as an error,
and the client fails to initialize.
Fix
---
- Handle prompt and resource list requests by returning an empty array.
Signed-off-by: Wazir Ahmed <wazir@proxysql.com>
- Fix incorrect method name for notification
- Handle all notification messages in a generic way
- Respond with HTTP 202 Accepted (no response body)
Signed-off-by: Wazir Ahmed <wazir@proxysql.com>
- Version 2024-11-05 only supports HTTP_SSE as transport.
- ProxySQL's MCP implementation aligns more closely with the StreamableHTTP
transport specified in version 2025-06-18.
- Support for SSE in StreamableHTTP transport is optional.
Signed-off-by: Wazir Ahmed <wazir@proxysql.com>
- Enhanced inline Doxygen comments in RAG_Tool_Handler.h and RAG_Tool_Handler.cpp
- Added detailed parameter descriptions, return values, and cross-references
- Created Doxyfile for documentation generation
- Added documentation summary and guidelines
- Documented all RAG tools with their schemas and usage patterns
- Added security and performance considerations documentation
The RAG subsystem is now fully documented with comprehensive Doxygen comments
that provide clear guidance for developers working with the codebase.
- Fully implemented rag.search_hybrid tool with both fuse and fts_then_vec modes
- Added complete filter support across all search tools (source_ids, source_names, doc_ids, post_type_ids, tags_any, tags_all, created_after, created_before, min_score)
- Implemented proper score normalization (higher is better) for all search modes
- Updated all tool schemas to match blueprint specifications exactly
- Added metadata inclusion in search results
- Implemented Reciprocal Rank Fusion (RRF) scoring for hybrid search
- Enhanced error handling and input validation
- Added debug information for hybrid search ranking
- Updated documentation and created completion summary
This completes the v0 RAG implementation according to the blueprint requirements.
Change function signature from stats___mcp_query_rules(bool reset) to
stats___mcp_query_rules() to match MySQL query rules pattern.
The reset parameter was never used in the function body and MySQL's
stats___mysql_query_rules() has no reset parameter.
- Add LOAD MCP QUERY RULES FROM DISK command
- Add LOAD MCP QUERY RULES TO MEMORY command
- Both commands copy rules from disk.mcp_query_rules to main.mcp_query_rules
This completes the full set of MCP query rules LOAD/SAVE commands,
matching the MySQL query rules pattern.
- Add separate MCP QUERY RULES command block in Admin_Handler
- Fix string length comparison (21 chars for "SAVE/LOAD MCP QUERY RULES ")
- Add handlers for:
- LOAD MCP QUERY RULES TO RUNTIME
- SAVE MCP QUERY RULES TO DISK
- SAVE MCP QUERY RULES TO MEMORY / FROM RUNTIME
- Register mcp_query_rules in disk database (tables_defs_config)
Previously MCP commands were incorrectly nested inside MYSQL/PGSQL
block and could not be reached. Now they have their own conditional
block.
- Add ADMIN_SQLITE_TABLE_RUNTIME_MCP_QUERY_RULES schema (17 columns, same as mcp_query_rules)
- Fix STATS_SQLITE_TABLE_MCP_QUERY_RULES to only have rule_id and hits columns
- Add runtime_mcp_query_rules detection and refresh in ProxySQL_Admin
- Implement save_mcp_query_rules_from_runtime(bool _runtime) for both config and runtime tables
- Update get_mcp_query_rules() to return 17 columns (no hits)
- get_stats_mcp_query_rules() returns 2 columns (rule_id, hits)
Mirrors the MySQL query rules pattern:
- mcp_query_rules: config table (17 cols)
- runtime_mcp_query_rules: runtime state (17 cols)
- stats_mcp_query_rules: hit counters (2 cols)
The catalog_fts FTS5 virtual table was being created but the search() function was using slow LIKE queries instead of FTS5 MATCH operator.
Changes to lib/MySQL_Catalog.cpp:
- Use FTS5 MATCH with INNER JOIN to catalog_fts when query provided
- Add BM25 relevance ranking (ORDER BY bm25(f) ASC)
- Significant performance improvement: O(log n) vs O(n)
Changes to scripts/mcp/test_catalog.sh:
- Add 8 new FTS5-specific tests (CAT013-CAT020):
- Multi-term search (AND logic)
- Phrase search with quotes
- Boolean operators (OR, NOT)
- Prefix search with wildcards
- Kind and tags filter combinations
- Relevance ranking verification
- Add SSL/HTTP support with auto-detection
- New options: --ssl, --no-ssl, MCP_USE_SSL env var
- Fix endpoint path: /query -> /mcp/query
Two bug fixes for the question learning feature:
1. **Fallback to most recent agent_run across all schemas**
- get_last_agent_run_id() now falls back to the most recent agent_run_id
across ALL runs if none exists for the specific run_id
- This allows adding questions even when the current schema's discovery
didn't include an agent run
- Adds logging to show when fallback is used
2. **Fix error message extraction for query_tool_calls logging**
- Fixed bug where error messages weren't being extracted correctly
- The old code checked for result["error"]["message"] but create_error_response
only has result["error"] (no nested "message" field)
- Now correctly extracts result["error"] as a string when present
- This ensures failed tool calls are properly logged with error messages
This fixes the issue where llm.question_template_add would fail with
"No agent run found" even when agent runs exist for other schemas.
Add ability for the demo agent to learn new questions and add them
to the catalog, making it smarter over time.
Changes:
- Added get_last_agent_run_id() function to Discovery_Schema:
- Queries agent_runs table for the most recent agent_run_id for a run_id
- Returns 0 if no agent runs exist for the schema
- Updated llm.question_template_add handler:
- Made agent_run_id optional (defaults to 0 when not provided)
- When agent_run_id <= 0, auto-fetches last agent_run_id for the schema
- Returns helpful error if no agent run exists for the schema
- Returns agent_run_id in response for visibility
- Updated llm.question_template_add tool schema:
- Moved agent_run_id from required to optional parameters
- Updated description to explain auto-fetch behavior
- Updated demo_agent_claude.sh prompt:
- Added llm.question_template_add to available tools
- Added Step 4: "Learn from Success" to workflow
- Added explicit instruction to ALWAYS LEARN new questions
- Added example showing learning workflow
- Expanded from 4 steps to 5 steps to include learning
Now the demo agent can:
1. Search for existing questions
2. Reuse SQL if a good match exists
3. Generate new SQL if no good match
4. LEARN new questions by adding them to the catalog
5. Present results
This enables continuous learning - the more users interact with it,
the smarter it becomes.
When llm.search is called with an empty query (list mode) to retrieve all
available questions, include_objects=true was returning full object schemas
for all related objects, resulting in massive responses that could fill the
LLM's context and cause rejections.
Fix: include_objects now only works when query is non-empty (search mode).
When query is empty (list mode), only question templates are returned
without object details, regardless of include_objects setting.
This makes semantic sense:
- Empty query = "list all questions" → just titles/bodies (compact)
- Non-empty query = "search for specific questions" → full details including
object schemas (for answering the question)
Changes:
- Modified fts_search_llm() to check !query.empty() before fetching objects
- Updated tool schema description to clarify this behavior
Add optional schema parameter to run_sql_readonly tool that allows queries
to be executed against a specific schema, independent of the default schema
configured in mcp-mysql_schema.
Changes:
- Added current_schema field to MySQLConnection structure to track the
currently selected schema for each connection in the pool
- Added find_connection() helper to find connection wrapper by mysql pointer
- Added execute_query_with_schema() function that:
- Uses mysql_select_db() instead of 'USE schema' SQL statement
- Only calls mysql_select_db() if the requested schema differs from the
current schema (optimization to avoid unnecessary switches)
- Updates current_schema after successful schema switch
- Updated run_sql_readonly handler:
- Extracts optional 'schema' parameter
- Calls execute_query_with_schema() instead of execute_query()
- Returns error response when query fails (instead of success)
- Updated tool schema to document the new 'schema' parameter
This fixes the issue where queries would run against the default schema
(configured in mcp-mysql_schema) instead of the schema being queried,
causing "Table doesn't exist" errors when the default schema differs
from the discovered schema.
Enhance the llm_search MCP tool to return complete question template data
and optionally include full object schemas, reducing the need for additional
MCP calls when answering questions.
Changes:
- Added related_objects column to llm_question_templates table
- Updated add_question_template() to accept and store related_objects JSON array
- Enhanced fts_search_llm() with include_objects parameter:
- LEFT JOIN with llm_question_templates to return example_sql,
related_objects, template_json, and confidence
- When include_objects=true, fetches full object schemas (columns, indexes)
for all related objects in a single batch operation
- Added error checking for SQL execution failures
- Fixed fts_search_llm() get_object() call to pass schema_name and object_name
separately instead of combined object_key
- Updated Query_Tool_Handler:
- Added is_boolean() handling to json_int() helper to properly convert
JSON boolean true/false to int 1/0
- Updated llm.search handler to extract and pass include_objects parameter
- Updated llm.question_template_add to extract and pass related_objects
- Updated tool schemas to document new parameters
This change allows agents to get all necessary schema information in a single
llm_search call instead of making multiple catalog_get_object calls, significantly
reducing MCP call overhead.
Changes:
- fts_search_llm(): Empty query now returns all artifacts (list mode)
- Update llm.search tool: query parameter is now optional
- Tool description mentions empty query lists all artifacts
- Add body field to llm_search results
- Update demo script: Add special case for "What questions can I ask?"
This enables agents to retrieve all pre-defined question templates
when users ask what questions are available, instead of inferring
questions from schema.
- Rename llm_search_log column from "limit" to "lmt" to avoid SQL reserved keyword
- Add FTS inserts to all LLM artifact upsert functions:
- add_question_template(): index question templates for search
- add_llm_note(): index notes for search
- upsert_llm_summary(): index object summaries for search
- upsert_llm_domain(): index domains for search
- upsert_llm_metric(): index metrics for search
- Remove content='' from fts_llm table to store content directly
- Add <functional> header for std::hash usage
This fixes the bug where llm_search always returned empty results
because the FTS index was never populated.
Add query_tool_calls table to Discovery Schema to track all MCP tool
invocations via the /mcp/query/ endpoint. Logs:
- tool_name: Name of the tool that was called
- schema: Schema name (nullable, empty if not applicable)
- run_id: Run ID from discovery (nullable, 0 if not applicable)
- start_time: Start monotonic time in microseconds
- execution_time: Execution duration in microseconds
- error: Error message (null if success)
Modified files:
- Discovery_Schema.cpp: Added table creation and log_query_tool_call function
- Discovery_Schema.h: Added function declaration
- Query_Tool_Handler.cpp: Added logging after each tool execution
Format column definitions in CREATE TABLE IF NOT EXISTS statements
to have a space before and after each comma (e.g., " , "). This allows
ProxySQL Admin to properly display multi-line table schemas.
Modified files:
- Discovery_Schema.cpp
- MySQL_Catalog.cpp
- AI_Features_Manager.cpp
Extend the stats_mcp_query_tools_counters table with timing statistics
(first_seen, last_seen, sum_time, min_time, max_time) following the
same pattern as stats_mysql_query_digest.
All timing values are in microseconds using monotonic_time().
New schema:
- tool VARCHAR
- schema VARCHAR
- count INT
- first_seen INTEGER (microseconds)
- last_seen INTEGER (microseconds)
- sum_time INTEGER (microseconds - total execution time)
- min_time INTEGER (microseconds - minimum execution time)
- max_time INTEGER (microseconds - maximum execution time)
The MCP catalog database is now accessible as the 'mcp_catalog' schema
from the ProxySQL Admin interface, enabling direct SQL queries against
discovered schemas and LLM memories.
Remove the mcp-catalog_path configuration variable and hardcode the catalog
database path to datadir/mcp_catalog.db for stability.
Rationale: The catalog database is session state, not user configuration.
Runtime swapping of the catalog could cause tables to be missed and the
catalog to fail even if it was succeeding a second earlier.
Changes:
- Removed catalog_path from mcp_thread_variables_names array
- Removed mcp_catalog_path from MCP_Thread variables struct
- Removed getter/setter logic for catalog_path
- Hardcoded catalog path to GloVars.datadir/mcp_catalog.db in:
- ProxySQL_MCP_Server.cpp (Query_Tool_Handler initialization)
- Admin_FlushVariables.cpp (MySQL_Tool_Handler reinitialization)
- Updated VARIABLES.md to document the hardcoded path
- Updated configure_mcp.sh to remove catalog_path configuration
- Updated MCP README to remove catalog_path references
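The path derivation is trivial but worth pinning down; a minimal sketch (with `datadir` standing in for GloVars.datadir):

```cpp
#include <string>

// Sketch: derive the catalog path from the data directory instead of a
// configurable variable. Handles a datadir with or without trailing slash.
std::string mcp_catalog_path(const std::string& datadir) {
    std::string path = datadir;
    if (path.empty() || path.back() != '/') path += '/';
    return path + "mcp_catalog.db";
}
```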
Add stats_mcp_query_tools_counters and stats_mcp_query_tools_counters_reset
tables to track MCP query tool usage statistics.
- Added get_tool_usage_stats_resultset() method to Query_Tool_Handler
- Defined table schemas in ProxySQL_Admin_Tables_Definitions.h
- Registered tables in Admin_Bootstrap.cpp
- Added pattern matching in ProxySQL_Admin.cpp
- Added stats___mcp_query_tools_counters() in ProxySQL_Admin_Stats.cpp
- Fixed friend declaration for track_tool_invocation()
- Fixed Discovery_Schema.cpp log_llm_search() to use prepare_v2/finalize
1. Fix error logging to catch ALL tool failures, not just those with
both success and result fields. Previously, error responses like
{"success": false, "error": "..."} without a result field were
silently ignored.
2. Fix llm.domain_set_members to accept both array and JSON string
formats for the members parameter. Some clients send it as a
JSON string, others as a native array.
3. Add detailed error logging for llm.domain_set_members failures,
including what was actually received.
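A simplified sketch of the normalization in item 2: the real code would use the JSON library's is_array()/is_string() on the parsed value, whereas this stand-in only peeks at the raw text's first character:

```cpp
#include <string>

// Simplified sketch: some clients send `members` as a native JSON array,
// others as a JSON string containing the array text. Normalize both to
// raw array text. (Hypothetical helper; only handles escaped quotes.)
std::string normalize_members(const std::string& raw) {
    if (!raw.empty() && raw.front() == '[') return raw;  // native array
    if (raw.size() >= 2 && raw.front() == '"' && raw.back() == '"') {
        std::string inner = raw.substr(1, raw.size() - 2);
        std::string out;
        for (size_t i = 0; i < inner.size(); i++) {
            // drop the backslash of \" so the inner text parses as JSON
            if (inner[i] == '\\' && i + 1 < inner.size() && inner[i + 1] == '"')
                continue;
            out += inner[i];
        }
        return out;
    }
    return raw;  // anything else: pass through for the caller to reject
}
```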
* Add full support for both HTTP and HTTPS modes in MCP server via the mcp_use_ssl configuration variable, enabling plain HTTP for development and HTTPS for production with proper certificate validation
* Server now automatically restarts when SSL mode or port configuration changes, fixing silent configuration failures where changes appeared to succeed but didn't take effect until manual restart.
Features:
- Explicit support for HTTP mode (mcp_use_ssl=false) without SSL certificates
- Explicit support for HTTPS mode (mcp_use_ssl=true) with certificate validation
- Configurable via configure_mcp.sh with --no-ssl or --use-ssl flags
- Settable via admin interface: SET mcp-use_ssl=true/false
- Automatic restart detection for SSL mode changes (HTTP ↔ HTTPS)
- Automatic restart detection for port changes (mcp_port)
Exception handlers now log the full request payload that caused the error,
making debugging much easier.
Changes:
- Move req_body/req_path declarations outside try block so catch handlers can access them
- Log request payload in all exception handlers (parse errors, std::exception, and catch-all)
- Log tool arguments when tool execution fails
Previously, exceptions would only log the error message without context,
making it impossible to reproduce the issue. Now the full payload is logged.
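The scoping change can be sketched like this (hypothetical handler name; the point is that `req_body` outlives the try block):

```cpp
#include <stdexcept>
#include <string>
#include <cstdio>

// Pattern used in the fix: declare the request buffer before the try
// block so catch handlers can still log the payload that caused the error.
std::string handle_request(const std::string& input) {
    std::string req_body;            // visible to the catch handler below
    try {
        req_body = input;            // ... parse and dispatch ...
        if (req_body.empty()) throw std::runtime_error("empty request");
        return "ok";
    } catch (const std::exception& e) {
        // The payload is in scope here, so it can go into the log line.
        fprintf(stderr, "error '%s' while processing payload: %s\n",
                e.what(), req_body.c_str());
        return "error";
    }
}
```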
- Add try-catch around handle_jsonrpc_request to catch unexpected exceptions
- Add detailed logging for tool execution success/failure
- Add proper SQLite error checking in create_agent_run with error messages
- Fix json_int/json_double to handle both numbers and numeric strings
The json_int function was throwing exceptions when receiving numeric
strings (e.g., "14" instead of 14) from clients, causing 500 errors.
Now it handles both formats gracefully.
Also added logging so tool failures are visible in logs instead of
being silent 500 errors.
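The tolerant conversion can be sketched as below (the real json_int operates on a parsed JSON value; this hypothetical stand-in takes the raw text):

```cpp
#include <string>
#include <stdexcept>

// Sketch: accept 14 or "14". std::stoi throws on non-numeric input, so
// wrap it and fall back to a caller-supplied default instead of letting
// the exception bubble up as a 500 error.
int json_int_lenient(const std::string& text, int def) {
    try {
        size_t pos = 0;
        int v = std::stoi(text, &pos);
        return pos == text.size() ? v : def;  // reject trailing garbage
    } catch (const std::exception&) {
        return def;  // std::invalid_argument or std::out_of_range
    }
}
```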
- Add mcp_config.example.json for Claude Code MCP configuration
- Fix MCP bridge path in example config (../../proxysql_mcp_stdio_bridge.py)
- Update Two_Phase_Discovery_Implementation.md with correct Phase 1/Phase 2 usage
- Fix Two_Phase_Discovery_Implementation.md DELETE FROM fts_objects to scope to run_id
- Update README.md with two-phase discovery section and multi-agent legacy note
- Create static_harvest.sh bash wrapper for Phase 1
- Create two_phase_discovery.py orchestration script with prompts
- Add --run-id parameter to skip auto-fetch
- Fix RUN_ID placeholder mismatch (<USE_THE_PROVIDED_RUN_ID>)
- Fix catalog path default to mcp_catalog.db
- Add test_catalog.sh to verify catalog tools work
- Fix Discovery_Schema.cpp FTS5 syntax (missing space)
- Remove invalid CREATE INDEX on FTS virtual tables
- Add MCP tool call logging to track tool usage
- Fix Static_Harvester::get_harvest_stats() to accept run_id parameter
- Fix DELETE FROM fts_objects to only delete for specific run_id
- Update system prompts to say DO NOT call discovery.run_static
- Update user prompts to say Phase 1 is already complete
- Add --mcp-only flag to restrict Claude Code to MCP tools only
- Make FTS table failures non-fatal (check if table exists first)
- Add comprehensive documentation for both discovery approaches
- Rename NL2SQL_Converter to LLM_Bridge for generic prompt processing
- Update MySQL protocol handler from /* NL2SQL: */ to /* LLM: */
- Remove SQL-specific fields (sql_query, confidence, tables_used)
- Add GENAI_OP_LLM operation type to GenAI module
- Rename all genai_nl2sql_* variables to genai_llm_*
- Update AI_Features_Manager to use LLM_Bridge
- Deprecate ai_nl2sql_convert MCP tool with error message
- LLM bridge now handles any prompt type via MySQL protocol
This enables generic LLM access (summarization, code generation,
translation, analysis) while preserving infrastructure for future
NL2SQL implementation via Web UI + external agents.
- Add has_variable() method to GenAI_Threads_Handler for variable validation
- Add genai- prefix check in is_valid_global_variable()
- Auto-initialize NL2SQL converter when genai-nl2sql_enabled is set to true at runtime
- Make init_nl2sql() public to allow runtime initialization
- Mask API keys in logs (show only first 2 chars, rest as 'x')
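The masking rule is simple enough to show directly (hypothetical function name):

```cpp
#include <string>

// Sketch of the log masking: keep the first two characters and replace
// the rest with 'x', so keys stay recognizable in logs without leaking.
std::string mask_api_key(const std::string& key) {
    if (key.size() <= 2) return key;
    return key.substr(0, 2) + std::string(key.size() - 2, 'x');
}
```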
Add flush_genai_variables___runtime_to_database() call to the central
location where all modules populate runtime_global_variables table.
This was missing, causing genai-* variables to not appear in
runtime_global_variables.
The flush_genai_variables___database_to_runtime() function was using
hardcoded 'admindb' instead of the 'db' parameter passed to the function.
This caused the function to always query from admindb, ignoring the
actual database parameter.
This fixes the issue where runtime_global_variables table was not being
populated on startup because the query was always hitting the same database
regardless of the parameter.
Update flush_genai_variables___database_to_runtime() to match the MCP
pattern exactly:
- Add 'lock' parameter (default true) for flexibility
- Use ProxySQL_Admin's wrlock()/wrunlock() instead of GloGATH's
- Use consistent variable naming (var_name = name + 6 for 'genai-' prefix)
- Follow exact same locking pattern as MCP variables
This fixes the issue where runtime_global_variables table was not being
populated on startup because the locking pattern was incorrect.
The flush_genai_variables___database_to_runtime() function was only
setting internal state in GloGATH but not populating the
runtime_global_variables table. This caused variables to appear in
global_variables but not in runtime_global_variables after startup.
Fix: Add call to flush_genai_variables___runtime_to_database() with
runtime=true to populate the runtime table, following the same pattern
used by MCP variables.
Added missing #include for GenAI_Thread.h in AI_Features_Manager.cpp
to resolve the compilation error in debug mode.
Also fixed the remaining reference to variables.ai_features_enabled
which should now use GloGATH->variables.genai_enabled.
This fixes the "make debug" build failure.
This is a critical architectural fix - NL2SQL was making blocking calls
to LLMs which would block the entire MySQL thread. Now NL2SQL uses the
same async socketpair pattern as the GENAI embed/rerank operations.
Changes:
- Added nl2sql operation type to process_json_query() in GenAI module
- Updated NL2SQL handler to construct JSON query and use async GENAI path
- Added extern declaration for GloAI in GenAI_Thread.cpp
- Falls back to synchronous path only on systems without epoll
Architecture:
- Before: NL2SQL: query → blocking nl2sql->convert() → blocks MySQL thread
- After: NL2SQL: query → JSON GENAI request → async socketpair → non-blocking
JSON protocol for NL2SQL:
GENAI: {"type": "nl2sql", "query": "Show customers", "schema": "mydb"}
The NL2SQL result is delivered asynchronously through the existing
GENAI response handler, making the system fully non-blocking.
Related to: https://github.com/ProxySQL/proxysql-vec/pull/13
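The socketpair hand-off can be illustrated with a minimal round-trip. The real code drives these fds with epoll so the MySQL thread never blocks; this sketch is synchronous purely to show the plumbing, and the framing is hypothetical:

```cpp
#include <sys/socket.h>
#include <unistd.h>
#include <string>

// One end plays the MySQL thread (submits a JSON GENAI request), the
// other plays the GenAI module (reads it and writes the result back).
std::string socketpair_roundtrip(const std::string& request) {
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) return "";
    char buf[256];
    // MySQL-thread side: submit the request
    if (write(fds[0], request.data(), request.size()) < 0) return "";
    // GenAI side: consume the request, do the LLM work, answer
    ssize_t n = read(fds[1], buf, sizeof(buf));
    (void)n;
    const std::string reply = "{\"result\":\"ok\"}";
    if (write(fds[1], reply.data(), reply.size()) < 0) return "";
    // MySQL-thread side: collect the response
    n = read(fds[0], buf, sizeof(buf));
    close(fds[0]);
    close(fds[1]);
    return std::string(buf, n > 0 ? (size_t)n : 0);
}
```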
This commit fixes a serious design flaw where AI configuration variables
were not integrated with the ProxySQL admin interface. All ai_*
variables have been migrated to the GenAI module as genai-* variables.
Changes:
- Added 21 new genai_* variables to GenAI_Thread.h structure
- Implemented get/set functions for all new variables in GenAI_Thread.cpp
- Removed internal variables struct from AI_Features_Manager
- AI_Features_Manager now reads from GloGATH instead of internal state
- Updated documentation to reference genai-* variables
- Fixed debug.cpp assertion for PROXY_DEBUG_NL2SQL and PROXY_DEBUG_ANOMALY
Variable mapping:
- ai_nl2sql_enabled → genai-nl2sql_enabled
- ai_anomaly_detection_enabled → genai-anomaly_enabled
- ai_features_enabled → genai-enabled
- All other ai_* variables follow the same pattern
The flush functions automatically handle all variables in the
genai_thread_variables_names array, so database persistence
works correctly without additional changes.
Related to: https://github.com/ProxySQL/proxysql-vec/pull/13
- Fix retry logic to use is_retryable_error function for proper HTTP error handling
- Add exception handling to get_json_int function with try-catch around std::stoi
- Improve validate_numeric_range to use strtol instead of atoi for better error reporting
- Fix Chinese characters in documentation (replaced with the English "non-zero")
- Replace placeholder tests with actual comprehensive tests for anomaly detection functionality
- Create new standalone unit test anomaly_detector_unit-t.cpp with 29 tests covering:
* SQL injection pattern detection (12 tests)
* Query normalization (8 tests)
* Risk scoring calculations (5 tests)
* Configuration validation (4 tests)
- All tests pass successfully, providing meaningful validation of core anomaly detection logic
Thanks to gemini-code-assist for the thorough code review and recommendations.
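The strtol-based validation can be sketched as below: atoi returns 0 on error with no way to distinguish "0" from garbage, while strtol reports overflow via errno and trailing characters via the end pointer (hypothetical signature; bounds assumed inclusive):

```cpp
#include <cstdlib>
#include <cerrno>

// Sketch of validate_numeric_range using strtol instead of atoi.
bool validate_numeric_range(const char* value, long min_v, long max_v) {
    if (value == nullptr || *value == '\0') return false;
    errno = 0;
    char* end = nullptr;
    long v = strtol(value, &end, 10);
    if (errno == ERANGE || *end != '\0') return false;  // overflow or garbage
    return v >= min_v && v <= max_v;
}
```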
- Rename validate_provider_name to validate_provider_format for clarity
- Add null checks and error handling for all strdup() operations
- Enhance error messages with more context and HTTP status codes
- Implement performance monitoring with timing metrics for LLM calls and cache operations
- Add comprehensive test coverage for edge cases, retry scenarios, and performance
- Extend status variables to track performance metrics
- Update MySQL session to report timing information to AI manager
This fixes a double-free that occurred when reinitializing MCP handlers while the
MCP server was already running. The crash was caused by improper cleanup of
handler objects during reinitialization.
Root cause:
- ProxySQL_MCP_Server destructor deletes mysql_tool_handler
- The old code tried to delete handlers again after deleting the server,
causing double-free corruption
The fix properly handles handler lifecycle during reinitialization:
1. Delete Query_Tool_Handler first (server destructor doesn't clean this)
2. Delete the server (which also deletes MySQL_Tool_Handler via destructor)
3. Delete other handlers (config/admin/cache/observe) created by old server
4. Create new MySQL_Tool_Handler with updated configuration
5. Create new Query_Tool_Handler
6. Create new server (recreates all handlers with new endpoints)
This ensures proper cleanup and prevents double-free issues while allowing
runtime reconfiguration of MySQL connection parameters.
This commit adds comprehensive unit tests for the AI configuration
validation functions used in AI_Features_Manager.
Changes:
- Add test/tap/tests/ai_validation-t.cpp with 61 unit tests
- Test URL format validation (validate_url_format)
- Test API key format validation (validate_api_key_format)
- Test numeric range validation (validate_numeric_range)
- Test provider name validation (validate_provider_name)
- Test edge cases and boundary conditions
The test file is self-contained with its own copies of the validation
functions to avoid complex linking dependencies on libproxysql.
Test Categories:
- URL validation: 15 tests (http://, https:// protocols)
- API key validation: 14 tests (OpenAI, Anthropic formats)
- Numeric range: 13 tests (min/max boundaries)
- Provider name: 8 tests (openai, anthropic)
- Edge cases: 11 tests (NULL handling, long values)
All 61 tests pass successfully.
Part of: Phase 4 of NL2SQL improvement plan
Add comprehensive structured logging for NL2SQL LLM API calls with
request correlation, timing metrics, and detailed error context.
Changes:
- Add request_id field to NL2SQLRequest with UUID-like auto-generation
- Add structured logging macros:
* LOG_LLM_REQUEST: Logs URL, model, prompt length with request ID
* LOG_LLM_RESPONSE: Logs HTTP status, duration_ms, response preview
* LOG_LLM_ERROR: Logs error phase, message, and status code
- Update call_generic_openai() signature to accept req_id parameter
- Update call_generic_anthropic() signature to accept req_id parameter
- Add timing metrics to both LLM call functions using clock_gettime()
- Replace existing debug logging with structured logging macros
- Update convert() to pass request_id to LLM calls
Request IDs are generated as UUID-like strings (e.g., "12345678-9abc-def0-1234-567890abcdef")
and are included in all log messages for correlation. This allows tracking
a single NL2SQL request through all log lines from request to response.
Timing is measured using CLOCK_MONOTONIC for accurate duration tracking
of LLM API calls, reported in milliseconds.
This provides much better debugging capability when troubleshooting
NL2SQL issues, as administrators can now:
- Correlate all log lines for a single request
- See exact timing of LLM API calls
- Identify which phase of processing failed
- Track request/response metrics
Fixes #2 - Add Structured Logging
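The ID generation and timing can be sketched as follows; helper names are illustrative, and the real code may format the UUID-like string differently:

```cpp
#include <cstdio>
#include <ctime>
#include <random>
#include <string>

// Sketch: generate a UUID-like request ID (8-4-4-4-12 hex groups) for
// log correlation. Not a real RFC 4122 UUID, just a random identifier.
std::string gen_request_id() {
    static thread_local std::mt19937_64 rng{std::random_device{}()};
    char buf[37];
    snprintf(buf, sizeof(buf), "%08lx-%04lx-%04lx-%04lx-%012lx",
             (unsigned long)(rng() & 0xffffffffUL),
             (unsigned long)(rng() & 0xffffUL),
             (unsigned long)(rng() & 0xffffUL),
             (unsigned long)(rng() & 0xffffUL),
             (unsigned long)(rng() & 0xffffffffffffUL));
    return buf;
}

// Duration in milliseconds between two CLOCK_MONOTONIC samples, as
// reported by clock_gettime(CLOCK_MONOTONIC, ...).
double elapsed_ms(const timespec& start, const timespec& end) {
    return (end.tv_sec - start.tv_sec) * 1000.0 +
           (end.tv_nsec - start.tv_nsec) / 1e6;
}
```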
Add comprehensive validation for AI features configuration variables
to prevent invalid states and improve error messages.
Changes:
- Add validate_url_format(): Checks for http:// or https:// prefix and host part
- Add validate_api_key_format(): Validates API key format, checks for whitespace,
minimum length, and incomplete key patterns (sk- with <20 chars, sk-ant- with <25 chars)
- Add validate_numeric_range(): Validates numeric values are within min/max range
- Add validate_provider_name(): Ensures provider is 'openai' or 'anthropic'
- Update set_variable() to call validation functions before setting values
Validated variables:
- ai_nl2sql_provider: Must be 'openai' or 'anthropic'
- ai_nl2sql_provider_url: Must have http:// or https:// prefix
- ai_nl2sql_provider_key: No whitespace, minimum 10 chars
- ai_nl2sql_cache_similarity_threshold: Range [0, 100]
- ai_nl2sql_timeout_ms: Range [1000, 300000] (1 second to 5 minutes)
- ai_nl2sql_max_cloud_requests_per_hour: Range [1, 10000]
- ai_anomaly_similarity_threshold: Range [0, 100]
- ai_anomaly_risk_threshold: Range [0, 100]
- ai_anomaly_rate_limit: Range [1, 10000]
- ai_vector_dimension: Range [128, 4096]
This prevents misconfigurations and provides clear error messages to users
when invalid values are provided.
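One of the checks, validate_url_format, can be sketched directly from the rules above (http:// or https:// prefix plus a non-empty host part); this is an illustration, not the exact implementation:

```cpp
#include <cstring>

// Sketch of validate_url_format: require an http:// or https:// scheme
// and at least one character of host after it.
bool validate_url_format(const char* url) {
    if (url == nullptr) return false;
    const char* rest = nullptr;
    if (strncmp(url, "http://", 7) == 0)       rest = url + 7;
    else if (strncmp(url, "https://", 8) == 0) rest = url + 8;
    else return false;
    return *rest != '\0';  // must have a host after the scheme
}
```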
Fixes compilation issue by moving validation helper functions before
set_variable() to resolve forward declaration errors.
Add comprehensive SQL validation with confidence scoring based on:
- SQL keyword detection (17 keywords covering DDL/DML/transactions)
- Structural validation (balanced parentheses and quotes)
- SQL injection pattern detection
- Length and quality checks
Confidence scoring:
- Base 0.4 for valid SQL keyword
- +0.15 for balanced parentheses
- +0.15 for balanced quotes
- +0.1 for minimum length
- +0.1 for FROM clause in SELECT statements
- +0.1 for no injection patterns
- -0.3 penalty for injection patterns detected
Low confidence (< 0.5) results are logged with detailed info.
Cache storage threshold updated to 0.5 confidence (from implicit valid_sql).
This improves detection of malformed or potentially malicious SQL
while providing granular confidence scores for downstream use.
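The scoring above can be sketched as a pure function of the individual check results; the clamping to [0, 1] is an assumption, not confirmed by the commit:

```cpp
// Sketch of the confidence computation with the weights listed above.
double sql_confidence(bool has_keyword, bool balanced_parens,
                      bool balanced_quotes, bool min_length,
                      bool select_has_from, bool injection_detected) {
    if (!has_keyword) return 0.0;      // no valid SQL keyword: no base score
    double c = 0.4;                    // base for a valid SQL keyword
    if (balanced_parens)  c += 0.15;
    if (balanced_quotes)  c += 0.15;
    if (min_length)       c += 0.1;
    if (select_has_from)  c += 0.1;    // FROM clause in a SELECT
    if (!injection_detected) c += 0.1;
    else                     c -= 0.3; // injection pattern penalty
    if (c > 1.0) c = 1.0;              // assumed clamp
    if (c < 0.0) c = 0.0;
    return c;
}
```
For example, a clean SELECT with everything balanced scores ~1.0, while the same query with an injection pattern drops to ~0.6, below nothing else failing, and under the 0.5 cache threshold once other checks also fail.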
Remove Ollama-specific provider code and use only generic OpenAI-compatible
and Anthropic-compatible providers. Ollama is now used via its
OpenAI-compatible endpoint at /v1/chat/completions.
Changes:
- Remove LOCAL_OLLAMA from ModelProvider enum
- Remove ai_nl2sql_ollama_model and ai_nl2sql_ollama_url variables
- Remove call_ollama() function from LLM_Clients.cpp
- Update default configuration to use OpenAI provider with Ollama URL
- Update all documentation to reflect generic-only approach
Configuration:
- ai_nl2sql_provider: 'openai' or 'anthropic' (default: 'openai')
- ai_nl2sql_provider_url: endpoint URL (default: Ollama OpenAI-compatible)
- ai_nl2sql_provider_model: model name
- ai_nl2sql_provider_key: API key (optional for local endpoints)
This simplifies the codebase by removing a separate code path for Ollama
and aligns with the goal of avoiding provider-specific variables.
NL2SQL_Converter improvements:
- Implement get_query_embedding() using GenAI module
- Implement check_vector_cache() with KNN search via sqlite-vec
- Implement store_in_vector_cache() with embedding storage
- All stub methods now fully functional
Anomaly_Detector improvements:
- Implement add_threat_pattern() with embedding generation
- Stores patterns in both main table and virtual vec table
- Returns pattern ID on success, -1 on error
Documentation:
- Add comprehensive VECTOR_FEATURES documentation
- README.md (471 lines): User guide and quick start
- API.md (736 lines): Complete API reference
- ARCHITECTURE.md (358 lines): System architecture
- TESTING.md (767 lines): Testing guide and procedures
This completes the vector features implementation, enabling:
- Semantic similarity caching for NL2SQL queries
- Embedding-based threat pattern detection
- Full CRUD operations for threat patterns
Improve Anomaly_Detector with full threat pattern CRUD operations:
Changes to lib/Anomaly_Detector.cpp:
- Implement list_threat_patterns():
* Returns JSON array of all threat patterns
* Shows pattern_name, pattern_type, query_example, severity, created_at
* Ordered by severity DESC (highest risk first)
- Implement remove_threat_pattern():
* Deletes from both anomaly_patterns and anomaly_patterns_vec tables
* Proper error handling with error messages
* Returns true on success, false on failure
- Improve get_statistics():
* Add threat_patterns_count to statistics
* Add threat_patterns_by_type breakdown
* Shows patterns grouped by type (sql_injection, dos, etc.)
- Add count_by_pattern_type query for categorization
Features:
- Full CRUD operations for threat patterns
- JSON-formatted output for API integration
- Statistics include both counts and categorization
- Proper cleanup of both main and virtual tables
Implemented embedding-based threat pattern detection using GenAI and sqlite-vec:
Changes to lib/Anomaly_Detector.cpp:
- Add GenAI_Thread.h include and GloGATH extern
- Implement get_query_embedding():
* Calls GloGATH->embed_documents() via llama-server
* Normalizes query before embedding for better quality
* Returns std::vector<float> with embedding
- Implement check_embedding_similarity():
* Generates embedding for query if not provided
* Performs sqlite-vec KNN search against anomaly_patterns table
* Uses cosine distance (vec_distance_cosine) for similarity
* Calculates risk score based on severity and distance
* Returns AnomalyResult with pattern details and blocking decision
- Implement add_threat_pattern():
* Generates embedding for threat pattern example
* Stores pattern with embedding in anomaly_patterns table
* Updates anomaly_patterns_vec virtual table for KNN search
* Returns pattern ID on success
Features:
- Semantic similarity detection against known threat patterns
- Configurable similarity threshold (ai_anomaly_similarity_threshold)
- Risk scoring based on pattern severity (1-10) and similarity
- Automatic threat pattern management with vector indexing
- Add NL2SQL_Converter with prompt building and model selection
- Add LLM clients for Ollama, OpenAI, Anthropic APIs
- Update Makefile for new source files
- Add AI_Features_Manager coordinator class
- Add AI_Vector_Storage interface (stub)
- Add Anomaly_Detector class (stub for Phase 3)
- Update includes and main initialization
Bug Description:
ProxySQL would deadlock when processing extended query frames where:
1. Many Close Statement messages accumulate responses in PSarrayOUT
2. Total response size exceeds pgsql-threshold_resultset_size
3. A backend operation (Describe/Execute) follows in the same frame
Root Cause:
- Close Statement operations are handled locally by ProxySQL (no backend routing)
- Their CloseComplete responses accumulate in PSarrayOUT
- When threshold_resultset_size is exceeded, ProxySQL stops reading from backend
- Subsequent backend operations (Describe/Execute) need backend responses to complete
- This creates a deadlock: ProxySQL won't read, backend operation can't complete
- Extended query frame never finishes, query times out
The Fix:
When PSarrayOUT exceeds threshold_resultset_size and a backend operation is pending,
ProxySQL now flushes all accumulated data in PSarrayOUT to the client first, then
continues processing backend operations. This breaks the deadlock by clearing the
buffer before attempting to read more data from the backend.
Use the Bind message to obtain parameter information rather than inferring from the query text itself whether the query is parameterized. Multiple parameters are not a concern here: PostgreSQL itself rejects multi-parameter pg_cancel_backend() and pg_terminate_backend() calls and accepts only a single argument for these functions.