MCP Tool Discovery Guide

This guide explains how to discover and call the MCP tools exposed on each endpoint. It focuses on the Query endpoint, which provides database exploration and two-phase discovery tools.

Overview

The MCP (Model Context Protocol) Query endpoint provides dynamic tool discovery through the tools/list method. This allows clients to:

  1. Discover all available tools at runtime
  2. Get detailed schemas for each tool (parameters, requirements, descriptions)
  3. Dynamically adapt to new tools without code changes

Endpoint Information

  • URL: https://127.0.0.1:6071/mcp/query
  • Protocol: JSON-RPC 2.0 over HTTPS
  • Authentication: Bearer token (optional, if configured)

Getting the Tool List

Basic Request

curl -k -X POST https://127.0.0.1:6071/mcp/query \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/list",
    "id": 1
  }' | jq

With Authentication

If authentication is configured:

curl -k -X POST https://127.0.0.1:6071/mcp/query \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/list",
    "id": 1
  }' | jq

Using Query Parameter (Alternative)

If the client cannot set an Authorization header, pass the token as a query parameter instead:

curl -k -X POST "https://127.0.0.1:6071/mcp/query?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/list",
    "id": 1
  }' | jq

Response Format

{
  "id": "1",
  "jsonrpc": "2.0",
  "result": {
    "tools": [
      {
        "name": "tool_name",
        "description": "Tool description",
        "inputSchema": {
          "type": "object",
          "properties": {
            "param_name": {
              "type": "string|integer",
              "description": "Parameter description"
            }
          },
          "required": ["param1", "param2"]
        }
      }
    ]
  }
}
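
Once a tools/list response has been parsed, each tool's inputSchema can drive simple client-side checks before a call is made. The following is a minimal sketch, assuming the response shape shown above; `index_tools` and `missing_required` are hypothetical helper names and the sample values are illustrative only.

```python
def index_tools(response):
    """Map tool name -> tool definition from a parsed tools/list response."""
    return {tool["name"]: tool for tool in response["result"]["tools"]}


def missing_required(tool, arguments):
    """Return the required parameter names that are absent from `arguments`."""
    required = tool.get("inputSchema", {}).get("required", [])
    return [name for name in required if name not in arguments]


# Sample response shaped like the format above (illustrative values only).
sample_response = {
    "id": "1",
    "jsonrpc": "2.0",
    "result": {
        "tools": [
            {
                "name": "describe_table",
                "description": "Get detailed table schema",
                "inputSchema": {
                    "type": "object",
                    "properties": {
                        "schema": {"type": "string", "description": "Schema name"},
                        "table": {"type": "string", "description": "Table name"},
                    },
                    "required": ["schema", "table"],
                },
            }
        ]
    },
}

tools = index_tools(sample_response)
print(missing_required(tools["describe_table"], {"schema": "testdb"}))  # prints ['table']
```

Checking arguments locally keeps obviously incomplete calls off the wire, but the server remains the authority on what it accepts.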

Available Query Endpoint Tools

Inventory Tools

list_schemas

List all available schemas/databases.

Parameters:

  • page_token (string, optional) - Pagination token
  • page_size (integer, optional) - Results per page (default: 50)

list_tables

List tables in a schema.

Parameters:

  • schema (string, required) - Schema name
  • page_token (string, optional) - Pagination token
  • page_size (integer, optional) - Results per page (default: 50)
  • name_filter (string, optional) - Filter table names by pattern
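
The page_token and page_size parameters suggest a token-based pagination loop. This guide does not state how the next token is returned, so the sketch below assumes a `next_page_token` field alongside `data` in the result; that field name, the `iter_pages` helper, and the injected `call_tool` callable are all assumptions to verify against real output.

```python
def iter_pages(call_tool, name, arguments, page_size=50, token_field="next_page_token"):
    """Yield items from a token-paginated tool until no further token is returned.

    ASSUMPTION: each result looks like {"data": [...], "<token_field>": "..."}.
    The real field name is not documented here; check the tool's actual output.
    """
    args = dict(arguments, page_size=page_size)
    while True:
        result = call_tool(name, args)
        yield from result.get("data", [])
        token = result.get(token_field)
        if not token:
            break
        args["page_token"] = token


# Stand-in for a real client, used only to show the control flow.
def fake_call_tool(name, args):
    pages = {
        None: {"data": ["t1", "t2"], "next_page_token": "p2"},
        "p2": {"data": ["t3"]},
    }
    return pages[args.get("page_token")]


tables = list(iter_pages(fake_call_tool, "list_tables", {"schema": "testdb"}, page_size=2))
print(tables)  # prints ['t1', 't2', 't3']
```

Passing `call_tool` in as a parameter keeps the loop independent of the HTTP client, so the same generator could serve list_schemas as well.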

Structure Tools

describe_table

Get detailed table schema including columns, types, keys, and indexes.

Parameters:

  • schema (string, required) - Schema name
  • table (string, required) - Table name

get_constraints

Get constraints (foreign keys, unique constraints, etc.) for a table.

Parameters:

  • schema (string, required) - Schema name
  • table (string, optional) - Table name

Profiling Tools

table_profile

Get table statistics including row count, size estimates, and data distribution.

Parameters:

  • schema (string, required) - Schema name
  • table (string, required) - Table name
  • mode (string, optional) - Profile mode: "quick" or "full" (default: "quick")

column_profile

Get column statistics including distinct values, null count, and top values.

Parameters:

  • schema (string, required) - Schema name
  • table (string, required) - Table name
  • column (string, required) - Column name
  • max_top_values (integer, optional) - Maximum top values to return (default: 20)

Sampling Tools

sample_rows

Get sample rows from a table (with a hard cap on the number of rows returned).

Parameters:

  • schema (string, required) - Schema name
  • table (string, required) - Table name
  • columns (string, optional) - Comma-separated column names
  • where (string, optional) - WHERE clause filter
  • order_by (string, optional) - ORDER BY clause
  • limit (integer, optional) - Maximum rows (default: 20)

sample_distinct

Sample distinct values from a column.

Parameters:

  • schema (string, required) - Schema name
  • table (string, required) - Table name
  • column (string, required) - Column name
  • where (string, optional) - WHERE clause filter
  • limit (integer, optional) - Maximum values (default: 50)

Query Tools

run_sql_readonly

Execute a read-only SQL query with safety guardrails enforced.

Parameters:

  • sql (string, required) - SQL query to execute
  • max_rows (integer, optional) - Maximum rows to return (default: 200)
  • timeout_sec (integer, optional) - Query timeout in seconds (default: 2)

Safety rules:

  • Must start with SELECT
  • No dangerous keywords (DROP, DELETE, INSERT, UPDATE, etc.)
  • SELECT * requires a LIMIT clause
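
A client can approximate these rules locally to catch obvious rejections before sending a query. The sketch below is not the server's implementation: `precheck_sql` is a hypothetical helper, it checks only the four keywords named above (the list ends in "etc.", so the server may reject more), and its regex matching can flag keywords that appear inside string literals.

```python
import re

# Only the keywords named in the safety rules; the server may reject others.
DANGEROUS = ("DROP", "DELETE", "INSERT", "UPDATE")


def precheck_sql(sql):
    """Return a list of rule violations; an empty list means the sketch found none."""
    problems = []
    text = sql.strip()
    if not re.match(r"(?i)select\b", text):
        problems.append("must start with SELECT")
    for keyword in DANGEROUS:
        # Word boundaries avoid flagging identifiers such as updated_at.
        if re.search(rf"(?i)\b{keyword}\b", text):
            problems.append(f"contains {keyword}")
    if re.search(r"(?i)\bselect\s+\*", text) and not re.search(r"(?i)\blimit\b", text):
        problems.append("SELECT * requires LIMIT")
    return problems


print(precheck_sql("SELECT * FROM customers LIMIT 10"))  # prints []
print(precheck_sql("SELECT * FROM customers"))           # prints ['SELECT * requires LIMIT']
```

A passing pre-check does not guarantee acceptance; the guardrails enforced by run_sql_readonly remain authoritative.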

explain_sql

Explain a query execution plan using EXPLAIN or EXPLAIN ANALYZE.

Parameters:

  • sql (string, required) - SQL query to explain

Relationship Inference Tools

suggest_joins

Suggest table joins based on heuristic analysis of column names and types.

Parameters:

  • schema (string, required) - Schema name
  • table_a (string, required) - First table
  • table_b (string, optional) - Second table (if omitted, checks all)
  • max_candidates (integer, optional) - Maximum join candidates (default: 5)

find_reference_candidates

Find tables that might be referenced by a foreign key column.

Parameters:

  • schema (string, required) - Schema name
  • table (string, required) - Table name
  • column (string, required) - Column name
  • max_tables (integer, optional) - Maximum tables to check (default: 50)

Catalog Tools (LLM Memory)

catalog_upsert

Store or update an entry in the catalog (LLM external memory).

Parameters:

  • kind (string, required) - Entry kind (e.g., "table", "relationship", "insight")
  • key (string, required) - Unique identifier
  • document (string, required) - JSON document with data
  • tags (string, optional) - Comma-separated tags
  • links (string, optional) - Comma-separated related keys
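
Because document, tags, and links are all string-typed, structured data has to be serialized before the call. A minimal sketch of building the arguments follows; `catalog_upsert_args` is a hypothetical helper, and the key format and document contents are illustrative assumptions.

```python
import json


def catalog_upsert_args(kind, key, document, tags=(), links=()):
    """Build the `arguments` object for catalog_upsert.

    `document` is serialized to a JSON string and `tags`/`links` are joined
    with commas, matching the string-typed parameters listed above.
    """
    args = {"kind": kind, "key": key, "document": json.dumps(document)}
    if tags:
        args["tags"] = ",".join(tags)
    if links:
        args["links"] = ",".join(links)
    return args


payload = {
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
        "name": "catalog_upsert",
        "arguments": catalog_upsert_args(
            "table",
            "testdb.customers",                   # hypothetical key format
            {"purpose": "customer master data"},  # illustrative document
            tags=["core", "pii"],
        ),
    },
    "id": 3,
}
print(json.dumps(payload, indent=2))
```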

catalog_get

Retrieve an entry from the catalog.

Parameters:

  • kind (string, required) - Entry kind
  • key (string, required) - Entry key

catalog_search

Search the catalog for entries matching a query.

Parameters:

  • query (string, required) - Search query
  • kind (string, optional) - Filter by kind
  • tags (string, optional) - Filter by tags
  • limit (integer, optional) - Maximum results (default: 20)
  • offset (integer, optional) - Results offset (default: 0)

catalog_list

List catalog entries by kind.

Parameters:

  • kind (string, optional) - Filter by kind
  • limit (integer, optional) - Maximum results (default: 50)
  • offset (integer, optional) - Results offset (default: 0)

catalog_merge

Merge multiple catalog entries into a single consolidated entry.

Parameters:

  • keys (string, required) - Comma-separated keys to merge
  • target_key (string, required) - Target key for merged entry
  • kind (string, optional) - Entry kind (default: "domain")
  • instructions (string, optional) - Merge instructions

catalog_delete

Delete an entry from the catalog.

Parameters:

  • kind (string, required) - Entry kind
  • key (string, required) - Entry key

Two-Phase Discovery Tools

discovery.run_static

Run Phase 1 of two-phase discovery: static harvest of database metadata.

Parameters:

  • schema_filter (string, optional) - Filter schemas by name pattern
  • table_filter (string, optional) - Filter tables by name pattern
  • run_id (string, optional) - Custom run identifier

Returns:

  • run_id - Unique identifier for this discovery run
  • objects_count - Number of database objects discovered
  • schemas_count - Number of schemas processed
  • tables_count - Number of tables processed
  • columns_count - Number of columns processed
  • indexes_count - Number of indexes processed
  • constraints_count - Number of constraints processed

agent.run_start

Start a new agent run for discovery coordination.

Parameters:

  • run_id (string, required) - Discovery run identifier
  • agent_id (string, required) - Agent identifier
  • capabilities (array, optional) - List of agent capabilities

agent.run_finish

Mark an agent run as completed.

Parameters:

  • run_id (string, required) - Discovery run identifier
  • agent_id (string, required) - Agent identifier
  • status (string, required) - Final status ("success", "error", "timeout")
  • summary (string, optional) - Summary of work performed

agent.event_append

Append an event to an agent run.

Parameters:

  • run_id (string, required) - Discovery run identifier
  • agent_id (string, required) - Agent identifier
  • event_type (string, required) - Type of event
  • data (object, required) - Event data
  • timestamp (string, optional) - ISO8601 timestamp
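
The discovery and agent tools above are meant to be used in sequence: harvest, start a run, append events, finish. The sketch below shows one possible ordering under stated assumptions: that the returned fields (such as run_id) sit under "data" in the result, that `call_tool` is an injected callable, and that "harvest_reviewed" is an acceptable event type. None of these are confirmed by this guide.

```python
from datetime import datetime, timezone


def run_discovery(call_tool, agent_id, schema_filter=None):
    """Drive one static harvest followed by a single agent run."""
    static_args = {"schema_filter": schema_filter} if schema_filter else {}
    static = call_tool("discovery.run_static", static_args)
    # ASSUMPTION: the returned fields sit under "data" in the result envelope.
    run_id = static["data"]["run_id"]

    call_tool("agent.run_start", {"run_id": run_id, "agent_id": agent_id})
    status = "success"
    try:
        call_tool("agent.event_append", {
            "run_id": run_id,
            "agent_id": agent_id,
            "event_type": "harvest_reviewed",  # hypothetical event type
            "data": {"tables_count": static["data"].get("tables_count")},
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })
    except Exception:
        status = "error"
        raise
    finally:
        # Always close the run so it is not left open after a failure.
        call_tool("agent.run_finish", {
            "run_id": run_id, "agent_id": agent_id, "status": status,
        })
    return run_id


# Recording stand-in for a real client, used only to show the call order.
calls = []

def fake_call_tool(name, args):
    calls.append(name)
    return {"success": True, "data": {"run_id": "r1", "tables_count": 3}}


print(run_discovery(fake_call_tool, "agent-1"), calls)
```

Finishing the run in a `finally` block means an error during event recording is still reported through agent.run_finish.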

LLM Interaction Tools

llm.summary_upsert

Store or update a table/column summary generated by LLM.

Parameters:

  • schema (string, required) - Schema name
  • table (string, required) - Table name
  • column (string, optional) - Column name (if column-level summary)
  • summary (string, required) - LLM-generated summary
  • confidence (number, optional) - Confidence score (0.0-1.0)

llm.summary_get

Retrieve LLM-generated summary for a table or column.

Parameters:

  • schema (string, required) - Schema name
  • table (string, required) - Table name
  • column (string, optional) - Column name

llm.relationship_upsert

Store or update an inferred relationship between tables.

Parameters:

  • source_schema (string, required) - Source schema
  • source_table (string, required) - Source table
  • target_schema (string, required) - Target schema
  • target_table (string, required) - Target table
  • confidence (number, required) - Confidence score (0.0-1.0)
  • description (string, required) - Relationship description
  • type (string, optional) - Relationship type ("fk", "semantic", "usage")
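
Since confidence is required here and documented as a 0.0-1.0 score, it can be worth validating before the call. A minimal sketch follows; `relationship_args` is a hypothetical helper, the table names are made up, and whether the server itself rejects out-of-range scores is not stated in this guide.

```python
def relationship_args(source, target, confidence, description, rel_type=None):
    """Build llm.relationship_upsert arguments from (schema, table) pairs."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be within 0.0-1.0")
    source_schema, source_table = source
    target_schema, target_table = target
    args = {
        "source_schema": source_schema,
        "source_table": source_table,
        "target_schema": target_schema,
        "target_table": target_table,
        "confidence": confidence,
        "description": description,
    }
    if rel_type is not None:
        args["type"] = rel_type  # passed through unchecked
    return args


args = relationship_args(
    ("testdb", "orders"),     # hypothetical source table
    ("testdb", "customers"),  # hypothetical target table
    0.8,
    "orders.customer_id appears to reference customers.id",
    rel_type="fk",
)
print(args)
```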

llm.domain_upsert

Store or update a business domain classification.

Parameters:

  • domain_id (string, required) - Domain identifier
  • name (string, required) - Domain name
  • description (string, required) - Domain description
  • confidence (number, optional) - Confidence score (0.0-1.0)
  • tags (array, optional) - Domain tags

llm.domain_set_members

Set the members (tables) of a business domain.

Parameters:

  • domain_id (string, required) - Domain identifier
  • members (array, required) - List of table identifiers
  • confidence (number, optional) - Confidence score (0.0-1.0)

llm.metric_upsert

Store or update a business metric definition.

Parameters:

  • metric_id (string, required) - Metric identifier
  • name (string, required) - Metric name
  • description (string, required) - Metric description
  • formula (string, required) - SQL formula or description
  • domain_id (string, optional) - Associated domain
  • tags (array, optional) - Metric tags

llm.question_template_add

Add a question template that can be answered using this data.

Parameters:

  • template_id (string, required) - Template identifier
  • question (string, required) - Question template with placeholders
  • answer_plan (object, required) - Steps to answer the question
  • complexity (string, optional) - Complexity level ("low", "medium", "high")
  • estimated_time (number, optional) - Estimated time in minutes
  • tags (array, optional) - Template tags

llm.note_add

Add a general note or insight about the data.

Parameters:

  • note_id (string, required) - Note identifier
  • content (string, required) - Note content
  • type (string, optional) - Note type ("insight", "warning", "recommendation")
  • confidence (number, optional) - Confidence score (0.0-1.0)
  • tags (array, optional) - Note tags

llm.search

Search LLM-generated content and insights.

Parameters:

  • query (string, required) - Search query
  • type (string, optional) - Content type to search ("summary", "relationship", "domain", "metric", "note")
  • schema (string, optional) - Filter by schema
  • limit (number, optional) - Maximum results (default: 10)

Calling a Tool

Request Format

curl -k -X POST https://127.0.0.1:6071/mcp/query \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
      "name": "list_tables",
      "arguments": {
        "schema": "testdb"
      }
    },
    "id": 2
  }' | jq

Response Format

{
  "id": "2",
  "jsonrpc": "2.0",
  "result": {
    "success": true,
    "data": [...]
  }
}

Error Response

{
  "id": "2",
  "jsonrpc": "2.0",
  "result": {
    "success": false,
    "error": "Error message"
  }
}
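
Because tool failures arrive inside result rather than as a transport error, callers need to inspect the success flag. The following sketch turns the two formats above into a return value or an exception; `MCPToolError` and `unwrap` are hypothetical names, and the handling of a top-level "error" member (the standard JSON-RPC 2.0 failure shape) assumes the endpoint may emit one.

```python
class MCPToolError(Exception):
    """Raised when a tools/call response reports a failure."""


def unwrap(response):
    """Return result["data"], or raise MCPToolError with the reported message."""
    # ASSUMPTION: protocol-level failures may use the JSON-RPC "error" member.
    if "error" in response:
        raise MCPToolError(str(response["error"]))
    result = response["result"]
    if not result.get("success"):
        raise MCPToolError(result.get("error", "unknown error"))
    return result.get("data")


ok = {"id": "2", "jsonrpc": "2.0", "result": {"success": True, "data": [{"name": "customers"}]}}
failed = {"id": "2", "jsonrpc": "2.0", "result": {"success": False, "error": "Error message"}}

print(unwrap(ok))
try:
    unwrap(failed)
except MCPToolError as exc:
    print(f"tool call failed: {exc}")
```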

Python Examples

Basic Tool Discovery

import requests
import json

# Get tool list
response = requests.post(
    "https://127.0.0.1:6071/mcp/query",
    json={
        "jsonrpc": "2.0",
        "method": "tools/list",
        "id": 1
    },
    verify=False  # For self-signed cert
)

tools = response.json()["result"]["tools"]

# Print all tools
for tool in tools:
    print(f"\n{tool['name']}")
    print(f"  Description: {tool['description']}")
    print(f"  Required: {tool['inputSchema'].get('required', [])}")

Calling a Tool

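import json
import requests

def call_tool(tool_name, arguments):
    """Invoke one MCP tool via tools/call and return the JSON-RPC result."""
    response = requests.post(
        "https://127.0.0.1:6071/mcp/query",
        json={
            "jsonrpc": "2.0",
            "method": "tools/call",
            "params": {
                "name": tool_name,
                "arguments": arguments
            },
            "id": 2
        },
        verify=False  # For self-signed cert
    )
    response.raise_for_status()  # Surface HTTP-level failures early
    return response.json()["result"]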

# List tables
result = call_tool("list_tables", {"schema": "testdb"})
print(json.dumps(result, indent=2))

# Describe a table
result = call_tool("describe_table", {
    "schema": "testdb",
    "table": "customers"
})
print(json.dumps(result, indent=2))

# Run a query
result = call_tool("run_sql_readonly", {
    "sql": "SELECT * FROM customers LIMIT 10"
})
print(json.dumps(result, indent=2))

Complete Example: Database Discovery

import requests
import json

class MCPQueryClient:
    def __init__(self, host="127.0.0.1", port=6071, token=None):
        self.url = f"https://{host}:{port}/mcp/query"
        self.headers = {
            "Content-Type": "application/json",
            **({"Authorization": f"Bearer {token}"} if token else {})
        }

    def list_tools(self):
        response = requests.post(
            self.url,
            json={"jsonrpc": "2.0", "method": "tools/list", "id": 1},
            headers=self.headers,
            verify=False
        )
        return response.json()["result"]["tools"]

    def call_tool(self, name, arguments):
        response = requests.post(
            self.url,
            json={
                "jsonrpc": "2.0",
                "method": "tools/call",
                "params": {"name": name, "arguments": arguments},
                "id": 2
            },
            headers=self.headers,
            verify=False
        )
        return response.json()["result"]

    def explore_schema(self, schema):
        """Explore a schema: list tables and their structures"""
        print(f"\n=== Exploring schema: {schema} ===\n")

        # List tables
        tables = self.call_tool("list_tables", {"schema": schema})
        for table in tables.get("data", []):
            table_name = table["name"]
            print(f"\nTable: {table_name}")
            print(f"  Type: {table['type']}")
            print(f"  Rows: {table.get('row_count', 'unknown')}")

            # Describe table
            schema_info = self.call_tool("describe_table", {
                "schema": schema,
                "table": table_name
            })

            if schema_info.get("success"):
                print(f"  Columns: {', '.join([c['name'] for c in schema_info['data']['columns']])}")

# Usage
client = MCPQueryClient()
client.explore_schema("testdb")

Using the Test Script

The test script provides a convenient way to discover and test tools:

# List all discovered tools (without testing)
./scripts/mcp/test_mcp_tools.sh --list-only

# Test only query endpoint
./scripts/mcp/test_mcp_tools.sh --endpoint query

# Test specific tool with verbose output
./scripts/mcp/test_mcp_tools.sh --endpoint query --tool list_tables -v

# Test all endpoints
./scripts/mcp/test_mcp_tools.sh

Other Endpoints

The same discovery pattern works for all MCP endpoints:

  • Config: /mcp/config - Configuration management tools
  • Query: /mcp/query - Database exploration, query, and discovery tools
  • Admin: /mcp/admin - Administrative operations
  • Cache: /mcp/cache - Cache management tools
  • Observe: /mcp/observe - Monitoring and metrics tools
  • AI: /mcp/ai - AI and LLM features

Simply change the endpoint URL:

curl -k -X POST https://127.0.0.1:6071/mcp/config \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc": "2.0", "method": "tools/list", "id": 1}'
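
The same idea in Python: a small sketch that builds the URL and tools/list request body for each endpoint listed above. `endpoint_url` and `tools_list_request` are hypothetical helpers, and the host and port defaults are taken from the examples in this guide.

```python
# Endpoint names taken from the list above.
ENDPOINTS = ("config", "query", "admin", "cache", "observe", "ai")


def endpoint_url(name, host="127.0.0.1", port=6071):
    """Return the MCP URL for one of the documented endpoints."""
    if name not in ENDPOINTS:
        raise ValueError(f"unknown MCP endpoint: {name}")
    return f"https://{host}:{port}/mcp/{name}"


def tools_list_request(request_id=1):
    """Return the JSON-RPC body for a tools/list call."""
    return {"jsonrpc": "2.0", "method": "tools/list", "id": request_id}


for name in ENDPOINTS:
    print(endpoint_url(name), tools_list_request())
```

Each URL and body pair can then be sent with any HTTP client, in the same way as the curl command above.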

Version

  • Last Updated: 2026-01-19
  • MCP Protocol: JSON-RPC 2.0 over HTTPS