sqlite-rembed Integration Test Suite
Overview
This test suite validates the integration of sqlite-rembed (a Rust SQLite extension for generating text embeddings) into ProxySQL. The tests cover the complete AI pipeline, from client registration through embedding generation to vector similarity search.
Prerequisites
System Requirements
- ProxySQL compiled with sqlite-rembed and sqlite-vec extensions
- MySQL client (mysql command line tool)
- Bash shell environment
- Network access to embedding API endpoint (or local Ollama/OpenAI API)
ProxySQL Configuration
Ensure ProxySQL is running with the SQLite3 server enabled:
cd /home/rene/proxysql-vec/src
./proxysql --sqlite3-server
Test Configuration
The test script uses default connection parameters:
- Host: 127.0.0.1
- Port: 6030 (default SQLite3 server port)
- User: root
- Password: root
Modify these in the script if your configuration differs.
Test Suite Structure
The test suite is organized into 9 phases, each testing specific components:
Phase 1: Basic Connectivity and Function Verification
- ✅ ProxySQL connection
- ✅ Database listing
- ✅ sqlite-vec function availability
- ✅ sqlite-rembed function registration
- ✅ temp.rembed_clients virtual table existence
Phase 2: Client Configuration
- ✅ Create embedding API client with rembed_client_options()
- ✅ Verify client registration in temp.rembed_clients
- ✅ Test rembed_client_options function
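A minimal sketch of what the registration step might look like, assuming the sqlite-rembed convention of inserting a row into temp.rembed_clients whose options are built by rembed_client_options(). The helper name client_registration_sql is hypothetical, and the 'model' option key is an assumption based on this suite's API_MODEL variable. The function only prints SQL, so the statement can be inspected before it is sent through the mysql client.

```shell
# Hypothetical helper: print the SQL that registers an embedding client.
# Arguments: client name, format, URL, API key, model.
# Values are interpolated as-is, so they must not contain single quotes.
client_registration_sql() {
    printf "INSERT INTO temp.rembed_clients(name, options) VALUES ('%s', rembed_client_options('format', '%s', 'url', '%s', 'key', '%s', 'model', '%s'));\n" \
        "$1" "$2" "$3" "$4" "$5"
}

client_registration_sql "test-client" "openai" \
    "https://api.synthetic.new/openai/v1/embeddings" "YOUR_API_KEY" \
    "hf:nomic-ai/nomic-embed-text-v1.5"
```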
Phase 3: Embedding Generation Tests
- ✅ Generate embeddings for short and long text
- ✅ Verify embedding data type (BLOB) and size (768 dimensions × 4 bytes)
- ✅ Error handling for non-existent clients
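The size check follows from simple arithmetic: each dimension is stored as a 4-byte float, so a 768-dimension embedding should occupy 3072 bytes. A small sketch of that calculation, where the helper name expected_embedding_bytes is hypothetical:

```shell
# Expected BLOB size in bytes for a float32 embedding of the given dimension.
expected_embedding_bytes() {
    echo $(( $1 * 4 ))
}

VECTOR_DIMENSIONS=768
expected_embedding_bytes "$VECTOR_DIMENSIONS"   # prints 3072
```

The result can then be compared with what a query such as SELECT length(rembed('client-name', 'some text')); returns, where 'client-name' stands for whichever client was registered.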
Phase 4: Table Creation and Data Storage
- ✅ Create regular table for document storage
- ✅ Create virtual vector table using vec0
- ✅ Insert test documents with diverse content
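As a rough illustration of this phase, the sketch below prints DDL for a plain document table and a matching vec0 virtual table. The table and column names (test_documents, test_embeddings, embedding) and the helper name schema_sql are hypothetical, not taken from the test script; the float[N] column declaration follows the usual sqlite-vec form.

```shell
# Hypothetical helper: print DDL for a document table and a matching vec0 table.
# $1 = embedding dimension (defaults to 768, the suite's VECTOR_DIMENSIONS).
schema_sql() {
    dims="${1:-768}"
    cat <<EOF
CREATE TABLE test_documents (id INTEGER PRIMARY KEY, title TEXT, content TEXT);
CREATE VIRTUAL TABLE test_embeddings USING vec0(embedding float[${dims}]);
EOF
}

schema_sql 768
```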
Phase 5: Embedding Generation and Storage
- ✅ Generate embeddings for all documents
- ✅ Store embeddings in vector table
- ✅ Verify embedding count matches document count
- ✅ Check embedding storage format
Phase 6: Similarity Search Tests
- ✅ Exact self-match (document with itself, distance = 0.0)
- ✅ Similarity search with query text
- ✅ Verify result ordering by ascending distance
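One way the ordering check could be expressed is sketched below, assuming the distances from a search have already been extracted one per line. The helper name is_ascending is hypothetical and is not necessarily how the script performs the check.

```shell
# Succeeds (exit 0) if the numbers on stdin, one per line, never decrease.
is_ascending() {
    awk 'NR > 1 && $1 + 0 < prev { bad = 1 } { prev = $1 + 0 } END { exit bad + 0 }'
}

printf '0.0\n0.42\n0.87\n' | is_ascending && echo "ordered"
```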
Phase 7: Edge Cases and Error Handling
- ✅ Empty text input
- ✅ Very long text input
- ✅ SQL injection attempt safety
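For the injection case, a sketch of standard SQL literal quoting as SQLite understands it: embedded single quotes are doubled before the text is wrapped in quotes. The helper name sql_quote and the table name in the sample input are hypothetical; the test script may handle this differently.

```shell
# Wrap a string in single quotes for use as a SQL literal, doubling embedded quotes.
sql_quote() {
    printf "'%s'\n" "$(printf '%s' "$1" | sed "s/'/''/g")"
}

sql_quote "'; DROP TABLE documents; --"
```

The quoted value can then be placed directly into a call such as rembed('client-name', <quoted text>) without the input terminating the string early.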
Phase 8: Performance and Concurrency
- ✅ Sequential embedding generation timing
- ✅ Basic performance validation (< 10 seconds for 3 embeddings)
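The timing check can be sketched with whole-second timestamps from date, as below. The helper name run_within_budget is hypothetical; it ignores the command's own exit status and only reports whether the time budget was met.

```shell
# Run a command and succeed only if it finishes in fewer than $1 whole seconds.
# Usage: run_within_budget SECONDS COMMAND [ARGS...]
# The command's own exit status is ignored; only elapsed time is checked.
run_within_budget() {
    budget="$1"
    shift
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    [ $(( end - start )) -lt "$budget" ]
}

run_within_budget 10 sleep 1 && echo "within budget"
```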
Phase 9: Cleanup and Final Verification
- ✅ Clean up test tables
- ✅ Verify no test artifacts remain
Usage
Running the Full Test Suite
cd /home/rene/proxysql-vec/doc
./sqlite-rembed-test.sh
Expected Output
The script provides color-coded output:
- 🟢 Green: Test passed
- 🔴 Red: Test failed
- 🔵 Blue: Information and headers
- 🟡 Yellow: Test being executed
Exit Codes
- 0: All tests passed
- 1: One or more tests failed
- 2: Connection issues or missing dependencies
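A wrapper script might translate these codes into messages along the following lines. The helper name describe_exit_code is hypothetical.

```shell
# Map the documented exit codes to their meanings.
describe_exit_code() {
    case "$1" in
        0) echo "All tests passed" ;;
        1) echo "One or more tests failed" ;;
        2) echo "Connection issues or missing dependencies" ;;
        *) echo "Unexpected exit code: $1" ;;
    esac
}

# Example: run the suite, then report what its exit status means.
#   ./sqlite-rembed-test.sh; describe_exit_code "$?"
describe_exit_code 2
```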
Configuration
Modifying Connection Parameters
Edit the following variables in sqlite-rembed-test.sh:
PROXYSQL_HOST="127.0.0.1"
PROXYSQL_PORT="6030"
MYSQL_USER="root"
MYSQL_PASS="root"
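To see how these variables would plausibly map onto a mysql client invocation, the sketch below prints the resulting command line. The helper name mysql_command is hypothetical, and the script's actual invocation may use additional flags.

```shell
PROXYSQL_HOST="127.0.0.1"
PROXYSQL_PORT="6030"
MYSQL_USER="root"
MYSQL_PASS="root"

# Hypothetical helper: print the mysql command line implied by the variables above.
mysql_command() {
    echo "mysql -h $PROXYSQL_HOST -P $PROXYSQL_PORT -u $MYSQL_USER -p$MYSQL_PASS"
}

mysql_command
# To run a query for real (requires a running ProxySQL):
#   $(mysql_command) -e "SELECT 1"
```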
API Configuration
The test uses a synthetic OpenAI endpoint by default. To use your own API, set the API_KEY environment variable or edit the variables below:
API_CLIENT_NAME="test-client-$(date +%s)"
API_FORMAT="openai"
API_URL="https://api.synthetic.new/openai/v1/embeddings"
API_KEY="${API_KEY:-YOUR_API_KEY}" # Uses environment variable or placeholder
API_MODEL="hf:nomic-ai/nomic-embed-text-v1.5"
VECTOR_DIMENSIONS=768
For other providers (Ollama, Cohere, Nomic), adjust the format and URL accordingly.
Test Data
Sample Documents
The test creates 4 sample documents:
- Machine Learning - "Machine learning algorithms improve with more training data..."
- Database Systems - "Database management systems efficiently store, retrieve..."
- Artificial Intelligence - "AI enables computers to perform tasks typically..."
- Vector Databases - "Vector databases enable similarity search for embeddings..."
Query Texts
Test searches use:
- Self-match: Document 1 with itself
- Query: "data science and algorithms"
Troubleshooting
Common Issues
1. Connection Failed
Error: Cannot connect to ProxySQL at 127.0.0.1:6030
Solution: Ensure ProxySQL is running with the --sqlite3-server flag.
2. Missing Functions
ERROR 1045 (28000): no such function: rembed
Solution: Verify that sqlite-rembed was compiled and linked into the ProxySQL binary.
3. API Errors
Error from embedding API
Solution: Check network connectivity and API credentials.
4. Vector Table Errors
ERROR 1045 (28000): A LIMIT or 'k = ?' constraint is required on vec0 knn queries.
Solution: Every sqlite-vec KNN query needs a LIMIT clause or a k = ? constraint, as the error message states.
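A sketch of a query that satisfies this requirement, assuming a vec0 table with an embedding column and a previously registered client. The helper name knn_query_sql and the table and client names in the example call are hypothetical.

```shell
# Hypothetical helper: print a KNN query against a vec0 table.
# $1 = vec0 table, $2 = registered client name, $3 = query text, $4 = max results.
# The query text is interpolated as-is and must not contain single quotes.
knn_query_sql() {
    printf "SELECT rowid, distance FROM %s WHERE embedding MATCH rembed('%s', '%s') ORDER BY distance LIMIT %d;\n" \
        "$1" "$2" "$3" "$4"
}

knn_query_sql test_embeddings test-client "data science and algorithms" 3
```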
Debug Mode
For detailed debugging, run with trace:
bash -x ./sqlite-rembed-test.sh
Integration with CI/CD
The test script can be integrated into CI/CD pipelines:
# Example GitHub Actions workflow
name: sqlite-rembed Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Build ProxySQL with sqlite-rembed
        run: |
          cd deps && make cleanpart && make sqlite3
          cd ../lib && make
          cd ../src && make
      - name: Start ProxySQL
        run: |
          cd src && ./proxysql --sqlite3-server &
          sleep 5
      - name: Run Integration Tests
        run: |
          cd doc && ./sqlite-rembed-test.sh
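A fixed sleep can be fragile on slow CI runners. One possible alternative, sketched here with a hypothetical helper named wait_until, is to poll until a probe command succeeds; this is a suggestion, not part of the existing script.

```shell
# Retry a command once per second until it succeeds or the attempts run out.
# Usage: wait_until ATTEMPTS COMMAND [ARGS...]
wait_until() {
    attempts="$1"
    shift
    while [ "$attempts" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempts=$(( attempts - 1 ))
        sleep 1
    done
    return 1
}

# Example (requires the mysql client and a running ProxySQL):
#   wait_until 30 mysql -h 127.0.0.1 -P 6030 -u root -proot -e "SELECT 1"
wait_until 3 true && echo "ready"
```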
Extending the Test Suite
Adding New Tests
- Add new test function following existing pattern
- Update phase header and test count
- Add to appropriate phase section
Testing Different Providers
Modify the API configuration block to test:
- Ollama: Use format='ollama' and local URL
- Cohere: Use format='cohere' and appropriate model
- Nomic: Use format='nomic' and Nomic API endpoint
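As an illustration, a configuration block for a local Ollama instance might look like the following. The URL, the empty key, and the model name are all assumptions about a typical local setup, not values taken from the test script; check them against your own installation.

```shell
# Assumed values for a local Ollama setup; adjust to your environment.
API_CLIENT_NAME="test-client-$(date +%s)"
API_FORMAT="ollama"
API_URL="http://localhost:11434/api/embeddings"   # assumption: default local Ollama endpoint
API_KEY=""                                        # assumption: a local Ollama needs no key
API_MODEL="nomic-embed-text"                      # assumption: a 768-dimension model pulled into Ollama
VECTOR_DIMENSIONS=768
```

If the chosen model produces a different vector size, VECTOR_DIMENSIONS needs to change with it so that the size checks in Phase 3 still hold.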
Performance Testing
Extend Phase 8 for:
- Concurrent embedding generation
- Batch processing tests
- Memory usage monitoring
Results Interpretation
Success Criteria
- All connectivity tests pass
- Embeddings generated with correct dimensions
- Vector search returns ordered results
- No test artifacts remain after cleanup
Performance Benchmarks
- Embedding generation: < 3 seconds per request (network-dependent)
- Similarity search: < 100ms for small datasets
- Memory: Stable during sequential operations
License
This test suite is part of the ProxySQL project and follows the same licensing terms.
Last Updated: $(date)
Test Suite Version: 1.0