MCP Tooling for ProxySQL RAG Engine (v0 Blueprint)

This document defines the MCP tool surface for querying ProxySQL’s embedded RAG index. It is intended as a stable interface for AI agents. Internally, these tools query the SQLite schema described in schema.sql and the retrieval logic described in architecture-runtime-retrieval.md.

Design goals

Stable tool contracts (do not break agents when internals change)
Strict bounds (prevent unbounded scans / large outputs)
Deterministic schemas (agents can reliably parse outputs)
Separation of concerns:
- Retrieval returns identifiers and scores
- Fetch returns content
- Optional refetch returns authoritative source rows

1. Conventions

1.1 Identifiers

doc_id: stable document identifier (e.g. posts:12345)
chunk_id: stable chunk identifier (e.g. posts:12345#0)
source_id / source_name: corresponds to rag_sources

1.2 Scores

FTS score: score_fts (bm25; lower is better in SQLite’s bm25 by default)
Vector score: score_vec (distance or similarity, depending on implementation)
Hybrid score: score (normalized fused score; higher is better)

Recommendation: normalize scores in the MCP layer so that:

higher is always better for agent ranking
raw internal ranking can still be returned as score_fts_raw, distance_raw, etc. if helpful
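
One way to apply this recommendation is sketched below. It assumes score_fts arrives as SQLite's raw bm25 value (lower is better, typically negative) and that the vector backend reports a non-negative distance. The function names and the squashing formulas are illustrative assumptions, not part of this blueprint.

```python
def normalize_bm25(raw: float) -> float:
    """Map a raw SQLite bm25 value (lower is better) into [0, 1), higher is better."""
    strength = max(0.0, -raw)           # flip the sign so stronger matches are larger
    return strength / (1.0 + strength)  # squash into [0, 1)


def normalize_distance(distance: float) -> float:
    """Map a non-negative vector distance into (0, 1], higher is better."""
    return 1.0 / (1.0 + max(0.0, distance))
```

The raw values can still be passed through unchanged as score_fts_raw and distance_raw next to the normalized ones.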

1.3 Limits and budgets (recommended defaults)

All tools should enforce caps, regardless of caller input:

k_max = 50
candidates_max = 500
query_max_bytes = 8192
response_max_bytes = 5_000_000
timeout_ms (per tool): 250–2000ms depending on tool type

Tools must return a truncated boolean if limits reduce output.
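
A minimal sketch of cap enforcement, using the defaults listed above. The function name apply_caps, the choice to reject oversized queries rather than cut them, and the returned dictionary shape are assumptions made for illustration.

```python
K_MAX = 50
CANDIDATES_MAX = 500
QUERY_MAX_BYTES = 8192


def apply_caps(query: str, k: int, candidates=None) -> dict:
    """Clamp caller-supplied sizes and report whether anything was reduced."""
    if len(query.encode("utf-8")) > QUERY_MAX_BYTES:
        # Assumption: an oversized query is rejected instead of silently shortened.
        raise ValueError("query exceeds query_max_bytes")
    capped_k = min(max(k, 1), K_MAX)
    capped_candidates = None
    if candidates is not None:
        capped_candidates = min(max(candidates, capped_k), CANDIDATES_MAX)
    truncated = capped_k < k or (
        candidates is not None and capped_candidates < candidates
    )
    return {"k": capped_k, "candidates": capped_candidates, "truncated": truncated}
```

Running the same helper at the start of every tool keeps the caps consistent regardless of what the caller sends.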

2. Shared filter model

Many tools accept the same filter structure. This is intentionally simple in v0.

2.1 Filter object

{
  "source_ids": [1,2],
  "source_names": ["stack_posts"],
  "doc_ids": ["posts:12345"],
  "min_score": 5,
  "post_type_ids": [1],
  "tags_any": ["mysql","json"],
  "tags_all": ["mysql","json"],
  "created_after": "2022-01-01T00:00:00Z",
  "created_before": "2025-01-01T00:00:00Z"
}

Notes

In v0, most filters map to metadata_json values. Implementation can:
- filter in SQLite if JSON functions are available, or
- filter in MCP layer after initial retrieval (acceptable for small k/candidates)
For production, denormalize hot filters into dedicated columns for speed.
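
The second option (filtering in the MCP layer after retrieval) could look like the sketch below. It assumes tags are stored in metadata as a single string such as "<mysql><json>", as in the example responses in this document. The helper names are hypothetical.

```python
import re


def parse_tags(raw: str) -> set[str]:
    """Split a tag string like '<mysql><json>' into {'mysql', 'json'}."""
    return set(re.findall(r"<([^<>]+)>", raw or ""))


def matches_tags(metadata: dict, tags_any=None, tags_all=None) -> bool:
    """Apply tags_any (at least one present) and tags_all (every one present)."""
    tags = parse_tags(metadata.get("Tags", ""))
    if tags_any and not tags & set(tags_any):
        return False
    if tags_all and not set(tags_all) <= tags:
        return False
    return True
```

Because this runs after retrieval, a restrictive filter can leave fewer than k results; over-fetching candidates within candidates_max is one way to compensate.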

2.2 Filter behavior

If both source_ids and source_names are provided, treat as intersection.
If no source filter is provided, default to all enabled sources but enforce a strict global budget.
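
The two rules above can be sketched as a small resolver. The enabled mapping (source_id to source_name for enabled rows of rag_sources) and the function name are assumptions for illustration.

```python
def resolve_sources(enabled: dict[int, str], source_ids=None, source_names=None) -> set[int]:
    """Return the source_ids to search, intersecting both filters when given."""
    selected = set(enabled)  # default: all enabled sources
    if source_ids is not None:
        selected &= set(source_ids)
    if source_names is not None:
        wanted = set(source_names)
        selected &= {sid for sid, name in enabled.items() if name in wanted}
    return selected
```

An empty result means the filters contradict each other, and the tool can return an empty result list without querying the index.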

3. Tool: `rag.search_fts`

Keyword search over rag_fts_chunks.

3.1 Request schema

{
  "query": "json_extract mysql",
  "k": 10,
  "offset": 0,
  "filters": { },
  "return": {
    "include_title": true,
    "include_metadata": true,
    "include_snippets": false
  }
}

3.2 Semantics

Executes FTS query (MATCH) over indexed content.
Returns top-k chunk matches with scores and identifiers.
Does not return full chunk bodies unless include_snippets is requested (still bounded).
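
A minimal sketch of the query behind these semantics, using Python's sqlite3 module with an FTS5 table. The column layout of rag_fts_chunks used here (chunk_id, title, body) is an assumption for the demo; the real layout is whatever schema.sql defines. The sketch also assumes the SQLite build includes FTS5.

```python
import sqlite3


def search_fts(conn: sqlite3.Connection, query: str, k: int = 10, offset: int = 0) -> list[dict]:
    """Return the top-k chunk ids with raw bm25 scores (lower is better)."""
    k = min(max(k, 1), 50)  # k_max from section 1.3
    rows = conn.execute(
        """
        SELECT chunk_id, bm25(rag_fts_chunks) AS score_fts_raw
        FROM rag_fts_chunks
        WHERE rag_fts_chunks MATCH ?
        ORDER BY bm25(rag_fts_chunks)  -- best match first
        LIMIT ? OFFSET ?
        """,
        (query, k, max(offset, 0)),
    ).fetchall()
    return [{"chunk_id": cid, "score_fts_raw": raw} for cid, raw in rows]


# Demo with an in-memory table and toy rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE rag_fts_chunks USING fts5(chunk_id UNINDEXED, title, body)"
)
conn.executemany(
    "INSERT INTO rag_fts_chunks VALUES (?, ?, ?)",
    [
        ("posts:12345#0", "How to parse JSON in MySQL 8?", "I tried JSON_EXTRACT in MySQL 8"),
        ("posts:9876#1", "Query JSON columns efficiently", "Use generated columns and indexes"),
    ],
)
results = search_fts(conn, "json_extract mysql", k=5)
```

The query text is bound as a parameter, so it cannot inject SQL, but it is still parsed as FTS5 query syntax; free-form user text containing punctuation may need quoting before it reaches MATCH.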

3.3 Response schema

{
  "results": [
    {
      "chunk_id": "posts:12345#0",
      "doc_id": "posts:12345",
      "source_id": 1,
      "source_name": "stack_posts",
      "score_fts": 0.73,
      "title": "How to parse JSON in MySQL 8?",
      "metadata": { "Tags": "<mysql><json>", "Score": "12" }
    }
  ],
  "truncated": false,
  "stats": {
    "k_requested": 10,
    "k_returned": 10,
    "ms": 12
  }
}

4. Tool: `rag.search_vector`

Semantic search over rag_vec_chunks.

4.1 Request schema (text input)

{
  "query_text": "How do I extract JSON fields in MySQL?",
  "k": 10,
  "filters": { },
  "embedding": {
    "model": "text-embedding-3-large"
  }
}

4.2 Request schema (precomputed vector)

values_b64 holds the float32 array, packed and base64-encoded.

{
  "query_embedding": {
    "dim": 1536,
    "values_b64": "AAAA..."
  },
  "k": 10,
  "filters": { }
}

4.3 Semantics

If query_text is provided, ProxySQL computes embedding internally (preferred for agents).
If query_embedding is provided, ProxySQL uses it directly (useful for advanced clients).
Returns nearest chunks by distance/similarity.
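
For the precomputed-vector path, decoding values_b64 could be sketched as follows. Little-endian byte order is an assumption (the schema above only says float32, packed and base64-encoded), and both helper names are hypothetical.

```python
import base64
import struct


def decode_embedding(dim: int, values_b64: str) -> list[float]:
    """Decode a base64 string of packed float32 values and check its length."""
    raw = base64.b64decode(values_b64, validate=True)
    if len(raw) != 4 * dim:
        raise ValueError(f"expected {4 * dim} bytes for dim={dim}, got {len(raw)}")
    return list(struct.unpack(f"<{dim}f", raw))  # "<" = little-endian (assumed)


def encode_embedding(values: list[float]) -> str:
    """Client-side counterpart: pack float32 values and base64-encode them."""
    return base64.b64encode(struct.pack(f"<{len(values)}f", *values)).decode("ascii")
```

Checking the byte length against dim before unpacking turns a mismatched or truncated payload into a clear validation error rather than a misaligned vector.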

4.4 Response schema

{
  "results": [
    {
      "chunk_id": "posts:9876#1",
      "doc_id": "posts:9876",
      "source_id": 1,
      "source_name": "stack_posts",
      "score_vec": 0.82,
      "title": "Query JSON columns efficiently",
      "metadata": { "Tags": "<mysql><json>", "Score": "8" }
    }
  ],
  "truncated": false,
  "stats": {
    "k_requested": 10,
    "k_returned": 10,
    "ms": 18
  }
}

5. Tool: `rag.search_hybrid`

Hybrid search combining FTS and vectors. Supports two modes:

Mode A: parallel FTS + vector, fuse results (RRF recommended)
Mode B: broad FTS candidate generation, then vector rerank

5.1 Request schema (Mode A: fuse)

{
  "query": "json_extract mysql",
  "k": 10,
  "filters": { },
  "mode": "fuse",
  "fuse": {
    "fts_k": 50,
    "vec_k": 50,
    "rrf_k0": 60,
    "w_fts": 1.0,
    "w_vec": 1.0
  }
}

5.2 Request schema (Mode B: candidates + rerank)

{
  "query": "json_extract mysql",
  "k": 10,
  "filters": { },
  "mode": "fts_then_vec",
  "fts_then_vec": {
    "candidates_k": 200,
    "rerank_k": 50,
    "vec_metric": "cosine"
  }
}

5.3 Semantics (Mode A)

Run FTS top fts_k
Run vector top vec_k
Merge candidates by chunk_id
Compute fused score (RRF recommended)
Return top k
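
The Mode A steps can be sketched with weighted reciprocal rank fusion, where each list contributes weight / (rrf_k0 + rank) for every chunk it contains. Applying w_fts and w_vec as per-list multipliers is an assumption about how the request fields are meant to be used, and the function name is hypothetical.

```python
def rrf_fuse(fts_ids, vec_ids, k, rrf_k0=60, w_fts=1.0, w_vec=1.0):
    """Fuse two best-first lists of chunk_ids and return the top k entries."""
    fused = {}
    for weight, rank_key, ranked in (
        (w_fts, "rank_fts", fts_ids),
        (w_vec, "rank_vec", vec_ids),
    ):
        for rank, chunk_id in enumerate(ranked, start=1):
            entry = fused.setdefault(chunk_id, {"chunk_id": chunk_id, "score": 0.0})
            entry["score"] += weight / (rrf_k0 + rank)
            entry[rank_key] = rank  # kept for the optional debug block
    # Higher fused score first; chunk_id breaks ties deterministically.
    ordered = sorted(fused.values(), key=lambda e: (-e["score"], e["chunk_id"]))
    return ordered[:k]
```

Raw RRF scores are small (at most about 2 / (rrf_k0 + 1) with unit weights), so a final rescaling step would be needed if score is expected to fall in a range like the one shown in the response example.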

5.4 Semantics (Mode B)

Run FTS top candidates_k
Compute vector similarity within those candidates
- either by joining candidate chunk_ids to stored vectors, or
- by embedding candidate chunk text on the fly (not recommended)
Return top k reranked results
Optionally return debug info about candidate stages
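
A sketch of the rerank stage for Mode B, using the first option (stored vectors looked up by chunk_id). It assumes rerank_k means how many of the FTS candidates are rescored, which this document does not state explicitly, and it represents stored vectors as a plain dictionary for the demo.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(query_vec, candidate_ids, stored_vectors, k, rerank_k=50):
    """Rescore the first rerank_k FTS candidates by cosine similarity."""
    pool = candidate_ids[:rerank_k]
    scored = [
        (cosine(query_vec, stored_vectors[cid]), cid)
        for cid in pool
        if cid in stored_vectors  # skip chunks that have no stored vector
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [{"chunk_id": cid, "score_vec": score} for score, cid in scored[:k]]
```

In a real implementation the dictionary lookup would be a join from the candidate chunk_ids to rag_vec_chunks.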

5.5 Response schema

{
  "results": [
    {
      "chunk_id": "posts:12345#0",
      "doc_id": "posts:12345",
      "source_id": 1,
      "source_name": "stack_posts",
      "score": 0.91,
      "score_fts": 0.74,
      "score_vec": 0.86,
      "title": "How to parse JSON in MySQL 8?",
      "metadata": { "Tags": "<mysql><json>", "Score": "12" },
      "debug": {
        "rank_fts": 3,
        "rank_vec": 6
      }
    }
  ],
  "truncated": false,
  "stats": {
    "mode": "fuse",
    "k_requested": 10,
    "k_returned": 10,
    "ms": 27
  }
}

6. Tool: `rag.get_chunks`

Fetch chunk bodies by chunk_id. This is how agents obtain grounding text.

6.1 Request schema

{
  "chunk_ids": ["posts:12345#0", "posts:9876#1"],
  "return": {
    "include_title": true,
    "include_doc_metadata": true,
    "include_chunk_metadata": true
  }
}

6.2 Response schema

{
  "chunks": [
    {
      "chunk_id": "posts:12345#0",
      "doc_id": "posts:12345",
      "title": "How to parse JSON in MySQL 8?",
      "body": "<p>I tried JSON_EXTRACT...</p>",
      "doc_metadata": { "Tags": "<mysql><json>", "Score": "12" },
      "chunk_metadata": { "chunk_index": 0 }
    }
  ],
  "truncated": false,
  "stats": { "ms": 6 }
}

Hard limit recommendation

Cap total returned chunk bytes to a safe maximum (e.g. 1–2 MB).
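
One way to apply this cap is sketched below. The function name is hypothetical, and the 1,000,000-byte default is just one point in the range suggested above. Only the body field is counted, which is a simplification.

```python
def take_within_budget(chunks: list[dict], max_bytes: int = 1_000_000):
    """Keep chunks in order until the body byte budget is used up.

    Returns (kept_chunks, truncated).
    """
    kept, used = [], 0
    for chunk in chunks:
        size = len(chunk["body"].encode("utf-8"))
        if used + size > max_bytes:
            return kept, True  # stop at the first chunk that does not fit
        kept.append(chunk)
        used += size
    return kept, False
```

Stopping at the first chunk that does not fit preserves the requested order, and the second return value maps directly onto the truncated field of the response.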

7. Tool: `rag.get_docs`

Fetch full canonical documents by doc_id (not chunks). Useful for inspection, or for documents short enough to return whole.

7.1 Request schema

{
  "doc_ids": ["posts:12345"],
  "return": {
    "include_body": true,
    "include_metadata": true
  }
}

7.2 Response schema

{
  "docs": [
    {
      "doc_id": "posts:12345",
      "source_id": 1,
      "source_name": "stack_posts",
      "pk_json": { "Id": 12345 },
      "title": "How to parse JSON in MySQL 8?",
      "body": "<p>...</p>",
      "metadata": { "Tags": "<mysql><json>", "Score": "12" }
    }
  ],
  "truncated": false,
  "stats": { "ms": 7 }
}

8. Tool: `rag.fetch_from_source`

Refetch authoritative rows from the source DB using doc_id (via pk_json).

8.1 Request schema

{
  "doc_ids": ["posts:12345"],
  "columns": ["Id","Title","Body","Tags","Score"],
  "limits": {
    "max_rows": 10,
    "max_bytes": 200000
  }
}

8.2 Semantics

Look up doc(s) in rag_documents to get source_id and pk_json
Resolve source connection from rag_sources
Execute a parameterized query by primary key
Return requested columns only
Enforce strict limits

8.3 Response schema

{
  "rows": [
    {
      "doc_id": "posts:12345",
      "source_name": "stack_posts",
      "row": {
        "Id": 12345,
        "Title": "How to parse JSON in MySQL 8?",
        "Score": 12
      }
    }
  ],
  "truncated": false,
  "stats": { "ms": 22 }
}

Security note

This tool must not allow arbitrary SQL.
Only allow fetching by primary key and a whitelist of columns.
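
A sketch of how the query could be built under these constraints. It assumes the table name and primary-key column come from trusted rag_sources configuration rather than from the caller, and it uses the %s placeholder style accepted by common Python MySQL drivers. The function name and the table name in the usage example are hypothetical.

```python
def build_fetch_query(table, pk_column, allowed_columns, requested_columns,
                      pk_values, max_rows=10):
    """Return (sql, params) for a primary-key lookup limited to allowed columns."""
    rejected = [c for c in requested_columns if c not in allowed_columns]
    if rejected:
        raise ValueError(f"columns not allowed: {rejected}")
    if not requested_columns or not pk_values:
        raise ValueError("columns and primary keys must be non-empty")
    params = list(pk_values)[:max_rows]  # enforce max_rows on the input side too
    columns = ", ".join(f"`{c}`" for c in requested_columns)
    placeholders = ", ".join(["%s"] * len(params))
    sql = (
        f"SELECT {columns} FROM `{table}` "
        f"WHERE `{pk_column}` IN ({placeholders}) LIMIT {int(max_rows)}"
    )
    return sql, params


sql, params = build_fetch_query(
    "posts", "Id", {"Id", "Title", "Body", "Tags", "Score"}, ["Id", "Title"], [12345]
)
```

Caller input only ever selects among pre-approved column names or travels as a bound parameter, so no caller-supplied text is spliced into the SQL string.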

9. Tool: `rag.admin.stats` (recommended)

Operational visibility for dashboards and debugging.

9.1 Request

{}

9.2 Response

{
  "sources": [
    {
      "source_id": 1,
      "source_name": "stack_posts",
      "docs": 123456,
      "chunks": 456789,
      "last_sync": null
    }
  ],
  "stats": { "ms": 5 }
}

10. Tool: `rag.admin.sync` (optional in v0; required in v1)

Triggers ingestion for one source or for all sources. In v0, ingestion may run as a separate process; once the engine is integrated into ProxySQL, this call would trigger an internal job.

10.1 Request

{
  "source_names": ["stack_posts"]
}

10.2 Response

{
  "accepted": true,
  "job_id": "sync-2026-01-19T10:00:00Z"
}

11. Implementation notes (what the coding agent should implement)

Input validation and caps for every tool.
Consistent filtering across FTS/vector/hybrid.
Stable scoring semantics (higher-is-better recommended).
Efficient joins:
- vector search returns chunk_ids; join to rag_chunks/rag_documents for metadata.
Hybrid modes:
- Mode A (fuse): implement RRF
- Mode B (fts_then_vec): candidate set then vector rerank
Error model:
- return structured errors with codes (e.g. INVALID_ARGUMENT, LIMIT_EXCEEDED, INTERNAL)
Observability:
- return stats.ms in responses
- track tool usage counters and latency histograms
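
The error model and the stats.ms requirement can be combined in one wrapper around every tool handler, as sketched below. The ToolError class, the run_tool name, and the {"error": {"code", "message"}} envelope are assumptions; this document only specifies that errors are structured and carry codes.

```python
import time


class ToolError(Exception):
    """Raised by handlers for expected failures such as INVALID_ARGUMENT."""

    def __init__(self, code: str, message: str):
        super().__init__(message)
        self.code = code
        self.message = message


def run_tool(handler, request: dict) -> dict:
    """Call a handler, convert failures to structured errors, and record stats.ms."""
    started = time.perf_counter()
    try:
        response = dict(handler(request))
    except ToolError as exc:
        response = {"error": {"code": exc.code, "message": exc.message}}
    except Exception:
        # Do not leak internal details to the agent.
        response = {"error": {"code": "INTERNAL", "message": "internal error"}}
    elapsed_ms = int((time.perf_counter() - started) * 1000)
    response.setdefault("stats", {})["ms"] = elapsed_ms
    return response
```

The same wrapper is a natural place to increment per-tool usage counters and feed latency histograms.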

12. Summary

These MCP tools define a stable retrieval interface:

Search: rag.search_fts, rag.search_vector, rag.search_hybrid
Fetch: rag.get_chunks, rag.get_docs, rag.fetch_from_source
Admin: rag.admin.stats, optionally rag.admin.sync
