proxysql/doc/VECTOR_FEATURES/TESTING.md
Vector Features Testing Guide

Overview

This document describes testing strategies and procedures for Vector Features in ProxySQL, including unit tests, integration tests, and manual testing procedures.

Test Suite Overview

| Test Type | Location | Purpose | External Dependencies |
| --- | --- | --- | --- |
| Unit Tests | test/tap/tests/vector_features-t.cpp | Test vector feature configuration and initialization | None |
| Integration Tests | test/tap/tests/nl2sql_integration-t.cpp | Test NL2SQL with real database | Test database |
| E2E Tests | scripts/mcp/test_nl2sql_e2e.sh | Complete workflow testing | Ollama/llama-server |
| Manual Tests | This document | Interactive testing | All components |

Prerequisites

1. Enable AI Features

-- Connect to ProxySQL admin
mysql -h 127.0.0.1 -P 6032 -u admin -padmin

-- Enable AI features
SET ai_features_enabled='true';
SET ai_nl2sql_enabled='true';
SET ai_anomaly_detection_enabled='true';
LOAD MYSQL VARIABLES TO RUNTIME;

2. Start llama-server

# Start embedding service
ollama run nomic-embed-text-v1.5

# Or via llama-server directly
llama-server --model nomic-embed-text-v1.5 --port 8013 --embedding

3. Verify GenAI Connection

# Test embedding endpoint
curl -X POST http://127.0.0.1:8013/embedding \
  -H "Content-Type: application/json" \
  -d '{"content": "test embedding"}'

# Should return JSON with embedding array

Unit Tests

Running Unit Tests

# From the ProxySQL source tree
cd test/tap

# Build vector features test
make vector_features

# Run the test
./vector_features

Test Categories

1. Virtual Table Creation Tests

Purpose: Verify sqlite-vec virtual tables are created correctly

void test_virtual_tables_created() {
    // Checks:
    // - AI features initialized
    // - Vector DB path configured
    // - Vector dimension is 1536
}

Expected Output:

=== Virtual vec0 Table Creation Tests ===
ok 1 - AI features initialized
ok 2 - Vector DB path configured (or default used)
ok 3 - Vector dimension is 1536 or default

2. NL2SQL Cache Configuration Tests

Purpose: Verify NL2SQL cache variables are accessible and configurable

void test_nl2sql_cache_config() {
    // Checks:
    // - Cache enabled by default
    // - Similarity threshold is 85
    // - Threshold can be changed
}

Expected Output:

=== NL2SQL Vector Cache Configuration Tests ===
ok 4 - NL2SQL enabled by default
ok 5 - Cache similarity threshold is 85 or default
ok 6 - Cache threshold changed to 90
ok 7 - Cache threshold changed to 90

3. Anomaly Embedding Configuration Tests

Purpose: Verify anomaly detection variables are accessible

void test_anomaly_embedding_config() {
    // Checks:
    // - Anomaly detection enabled
    // - Similarity threshold is 85
    // - Risk threshold is 70
}

4. Status Variables Tests

Purpose: Verify Prometheus-style status variables exist

void test_status_variables() {
    // Checks:
    // - ai_detected_anomalies exists
    // - ai_blocked_queries exists
}

Expected Output:

=== Status Variables Tests ===
ok 12 - ai_detected_anomalies status variable exists
ok 13 - ai_blocked_queries status variable exists

Integration Tests

NL2SQL Semantic Cache Test

Test Case: Semantic Cache Hit

Purpose: Verify that semantically similar queries hit the cache

-- Step 1: Clear cache
DELETE FROM nl2sql_cache;

-- Step 2: Verify NL2SQL is enabled
SELECT * FROM runtime_global_variables
WHERE variable_name = 'ai_nl2sql_enabled';

-- Step 3: First query (cache miss)
-- This will call the LLM and cache the result
NL2SQL: Show all customers from USA;

-- Step 4: Similar query (should hit cache)
NL2SQL: Display USA customers;

-- Step 5: Another similar query
NL2SQL: List customers in United States;

Expected Result:

  • First query: Calls LLM (takes 1-5 seconds)
  • Subsequent queries: Return cached result (takes < 100ms)

Verify Cache Hit

// Check cache statistics
std::string stats = converter->get_cache_stats();
// Should show increased hit count

// Or via SQL
SELECT COUNT(*) as cache_entries,
       SUM(hit_count) as total_hits
FROM nl2sql_cache;

Anomaly Detection Tests

Test Case 1: Known Threat Pattern

Purpose: Verify detection of known SQL injection

-- Add threat pattern
-- (Via C++ API)
detector->add_threat_pattern(
    "OR 1=1 Tautology",
    "SELECT * FROM users WHERE id=1 OR 1=1--",
    "sql_injection",
    9
);

-- Test detection
SELECT * FROM users WHERE id=5 OR 2=2--';

-- Should be BLOCKED (high similarity to OR 1=1 pattern)

Expected Result:

  • Query blocked
  • Risk score > 0.7 (70%)
  • Threat type: sql_injection

Test Case 2: Threat Variation

Purpose: Detect variations of attack patterns

-- Known pattern: "SELECT ... WHERE id=1 AND sleep(10)"
-- Test variation:
SELECT * FROM users WHERE id=5 AND SLEEP(5)--';

-- Should be FLAGGED (similar but lower severity)

Expected Result:

  • Query flagged
  • Risk score: 0.4-0.6 (medium)
  • Action: Flagged but allowed

Test Case 3: Legitimate Query

Purpose: Ensure false positives are minimal

-- Normal query
SELECT * FROM users WHERE id=5;

-- Should be ALLOWED

Expected Result:

  • No detection
  • Query allowed through

Manual Testing Procedures

Test 1: NL2SQL Vector Cache

Setup

-- Enable NL2SQL
SET ai_nl2sql_enabled='true';
SET ai_nl2sql_cache_similarity_threshold='85';
LOAD MYSQL VARIABLES TO RUNTIME;

-- Clear cache
DELETE FROM nl2sql_cache;
DELETE FROM nl2sql_cache_vec;

Procedure

  1. First Query (Cold Cache)

    NL2SQL: Show all customers from USA;
    
    • Record response time
    • Should take 1-5 seconds (LLM call)
  2. Check Cache Entry

    SELECT id, natural_language, generated_sql, hit_count
    FROM nl2sql_cache;
    
    • Should have 1 entry
    • hit_count should be 0 or 1
  3. Similar Query (Warm Cache)

    NL2SQL: Display USA customers;
    
    • Record response time
    • Should take < 100ms (cache hit)
  4. Verify Cache Hit

    SELECT id, natural_language, hit_count
    FROM nl2sql_cache;
    
    • hit_count should be increased
  5. Different Query (Cache Miss)

    NL2SQL: Show orders from last month;
    
    • Should take 1-5 seconds (new LLM call)

Expected Results

| Query | Expected Time | Source |
| --- | --- | --- |
| First unique query | 1-5s | LLM |
| Similar query | < 100ms | Cache |
| Different query | 1-5s | LLM |

Troubleshooting

If cache doesn't work:

  1. Check ai_nl2sql_enabled='true'
  2. Check llama-server is running
  3. Check vector DB exists: ls -la /var/lib/proxysql/ai_features.db
  4. Check logs: tail -f proxysql.log | grep NL2SQL

Test 2: Anomaly Detection Embedding Similarity

Setup

-- Enable anomaly detection
SET ai_anomaly_detection_enabled='true';
SET ai_anomaly_similarity_threshold='85';
SET ai_anomaly_risk_threshold='70';
SET ai_anomaly_auto_block='true';
LOAD MYSQL VARIABLES TO RUNTIME;

-- Add test threat patterns (via C++ API or script)
-- See scripts/add_threat_patterns.sh

Procedure

  1. Test SQL Injection Detection

    -- Known threat: OR 1=1
    SELECT * FROM users WHERE id=1 OR 1=1--';
    
    • Expected: BLOCKED
    • Risk: > 70%
    • Type: sql_injection
  2. Test Injection Variation

    -- Variation: OR 2=2
    SELECT * FROM users WHERE id=5 OR 2=2--';
    
    • Expected: BLOCKED or FLAGGED
    • Risk: 60-90%
  3. Test DoS Detection

    -- Known threat: Sleep-based DoS
    SELECT * FROM users WHERE id=1 AND SLEEP(10);
    
    • Expected: BLOCKED or FLAGGED
    • Type: dos
  4. Test Legitimate Query

    -- Normal query
    SELECT * FROM users WHERE id=5;
    
    • Expected: ALLOWED
    • No detection
  5. Check Statistics

    SHOW STATUS LIKE 'ai_anomaly_%';
    -- ai_detected_anomalies
    -- ai_blocked_queries
    -- ai_flagged_queries
    

Expected Results

| Query | Expected Action | Risk Score |
| --- | --- | --- |
| OR 1=1 injection | BLOCKED | > 70% |
| OR 2=2 variation | BLOCKED/FLAGGED | 60-90% |
| Sleep DoS | BLOCKED/FLAGGED | > 50% |
| Normal query | ALLOWED | < 30% |

Troubleshooting

If detection doesn't work:

  1. Check threat patterns exist: SELECT COUNT(*) FROM anomaly_patterns;
  2. Check similarity threshold: Lower to 80 for more sensitivity
  3. Check embeddings are being generated: tail -f proxysql.log | grep GenAI
  4. Verify query normalization: Check log for normalized query

Test 3: Threat Pattern Management

Add Threat Pattern

// Via C++ API
Anomaly_Detector* detector = GloAI->get_anomaly();

bool success = detector->add_threat_pattern(
    "Test Pattern",
    "SELECT * FROM test WHERE id=1",
    "test",
    5
);

if (success) {
    std::cout << "Pattern added successfully\n";
}

List Threat Patterns

std::string patterns_json = detector->list_threat_patterns();
std::cout << "Patterns:\n" << patterns_json << "\n";

Or via SQL:

SELECT id, pattern_name, pattern_type, severity
FROM anomaly_patterns
ORDER BY severity DESC;

Remove Threat Pattern

bool success = detector->remove_threat_pattern(1);

Or via SQL:

-- Note: This is for testing only, use C++ API in production
DELETE FROM anomaly_patterns WHERE id=1;
DELETE FROM anomaly_patterns_vec WHERE rowid=1;

Performance Testing

Baseline Metrics

Record baseline performance for your environment:

# Create test script
cat > test_performance.sh <<'EOF'
#!/bin/bash

echo "=== NL2SQL Performance Test ==="

# Test 1: Cold cache (no similar queries)
time mysql -h 127.0.0.1 -P 6033 -u test -ptest \
  -e "NL2SQL: Show all products from electronics category;"

sleep 1

# Test 2: Warm cache (similar query)
time mysql -h 127.0.0.1 -P 6033 -u test -ptest \
  -e "NL2SQL: Display electronics products;"

echo ""
echo "=== Anomaly Detection Performance Test ==="

# Test 3: Anomaly check
time mysql -h 127.0.0.1 -P 6033 -u test -ptest \
  -e "SELECT * FROM users WHERE id=1 OR 1=1--';"

EOF

chmod +x test_performance.sh
./test_performance.sh

Expected Performance

| Operation | Target Time | Max Time |
| --- | --- | --- |
| Embedding generation | < 200ms | 500ms |
| Cache search | < 50ms | 100ms |
| Similarity check | < 50ms | 100ms |
| LLM call (Ollama) | 1-2s | 5s |
| Cached query | < 100ms | 200ms |

Load Testing

# Test concurrent queries
for i in {1..100}; do
  mysql -h 127.0.0.1 -P 6033 -u test -ptest \
    -e "NL2SQL: Show customer $i;" &
done
wait

# Check statistics via the admin interface
mysql -h 127.0.0.1 -P 6032 -u admin -padmin -e "SHOW STATUS LIKE 'ai_%';"

Debugging Tests

Enable Debug Logging

# In the ProxySQL configuration file
proxysql-debug-level 3

Key Debug Commands

# NL2SQL logs
tail -f proxysql.log | grep NL2SQL

# Anomaly logs
tail -f proxysql.log | grep Anomaly

# GenAI/Embedding logs
tail -f proxysql.log | grep GenAI

# Vector DB logs
tail -f proxysql.log | grep "vec"

# All AI logs
tail -f proxysql.log | grep -E "(NL2SQL|Anomaly|GenAI|AI:)"

Direct Database Inspection

# Open the vector database
sqlite3 /var/lib/proxysql/ai_features.db

-- Check schema
.schema

-- View cache entries
SELECT id, natural_language, hit_count, created_at FROM nl2sql_cache;

-- View threat patterns
SELECT id, pattern_name, pattern_type, severity FROM anomaly_patterns;

-- Check virtual tables
SELECT rowid FROM nl2sql_cache_vec LIMIT 10;

-- Count embeddings
SELECT COUNT(*) FROM nl2sql_cache WHERE embedding IS NOT NULL;

Test Checklist

Unit Tests

  • Virtual tables created
  • NL2SQL cache configuration
  • Anomaly embedding configuration
  • Vector DB file exists
  • Status variables exist
  • GenAI module accessible

Integration Tests

  • NL2SQL semantic cache hit
  • NL2SQL cache miss
  • Anomaly detection of known threats
  • Anomaly detection of variations
  • False positive check
  • Threat pattern CRUD operations

Manual Tests

  • NL2SQL end-to-end flow
  • Anomaly blocking
  • Anomaly flagging
  • Performance within targets
  • Concurrent load handling
  • Memory usage acceptable

Continuous Testing

Automated Test Script

#!/bin/bash
# run_vector_tests.sh

set -e

echo "=== Vector Features Test Suite ==="

# 1. Unit tests
echo "Running unit tests..."
cd test/tap
make vector_features
./vector_features

# 2. Integration tests
echo "Running integration tests..."
# Add integration test commands here

# 3. Performance tests
echo "Running performance tests..."
# Add performance test commands here

# 4. Cleanup
echo "Cleaning up..."
# Clear test data

echo "=== All tests passed ==="

CI/CD Integration

# Example GitHub Actions workflow
name: Vector Features Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Start llama-server
        run: ollama run nomic-embed-text-v1.5 &
      - name: Build ProxySQL
        run: make
      - name: Run unit tests
        run: cd test/tap && make vector_features && ./vector_features
      - name: Run integration tests
        run: ./scripts/mcp/test_nl2sql_e2e.sh --mock

Common Issues and Solutions

Issue: "No such table: nl2sql_cache_vec"

Cause: Virtual tables not created

Solution:

-- Recreate virtual tables
-- (Requires restarting ProxySQL)

Issue: "Failed to generate embedding"

Cause: GenAI module not connected to llama-server

Solution:

# Check llama-server is running
curl -X POST http://127.0.0.1:8013/embedding \
  -H "Content-Type: application/json" \
  -d '{"content": "test"}'

# Check ProxySQL logs
tail -f proxysql.log | grep GenAI

Issue: "Poor similarity detection"

Cause: Threshold too high or embeddings not generated

Solution:

-- Lower threshold for testing
SET ai_anomaly_similarity_threshold='75';

Issue: "Cache not hitting"

Cause: Similarity threshold too high

Solution:

-- Lower cache threshold
SET ai_nl2sql_cache_similarity_threshold='75';

Test Data

Sample NL2SQL Queries

-- Simple queries
NL2SQL: Show all customers;
NL2SQL: Display all users;
NL2SQL: List all customers;  -- Should hit cache

-- Conditional queries
NL2SQL: Find customers from USA;
NL2SQL: Display USA customers;  -- Should hit cache
NL2SQL: Show users in United States;  -- Should hit cache

-- Aggregation
NL2SQL: Count customers by country;
NL2SQL: How many customers per country?;  -- Should hit cache

Sample Threat Patterns

See scripts/add_threat_patterns.sh for 10 example patterns covering:

  • SQL Injection (OR 1=1, UNION, comments, etc.)
  • DoS attacks (sleep, benchmark)
  • Data exfiltration (INTO OUTFILE)
  • Privilege escalation (DROP TABLE)
  • Reconnaissance (schema probing)

Reporting Test Results

Test Result Template

## Vector Features Test Results - [Date]

### Environment
- ProxySQL version: [version]
- Vector dimension: 1536
- Similarity threshold: 85
- llama-server status: [running/not running]

### Unit Tests
- Total: 20
- Passed: XX
- Failed: XX
- Skipped: XX

### Integration Tests
- NL2SQL cache: [PASS/FAIL]
- Anomaly detection: [PASS/FAIL]

### Performance
- Embedding generation: XXXms
- Cache search: XXms
- Similarity check: XXms
- Cold cache query: X.Xs
- Warm cache query: XXms

### Issues Found
1. [Description]
2. [Description]

### Notes
[Additional observations]