mcp tests: add phase-B TAP coverage and optional real Claude CLI E2E runner

Add two complementary TAP tests for phase-B validation and an optional real Claude CLI E2E helper, so we can validate both 'without Claude credentials' and 'with real Claude CLI' workflows.

What was added:

1) CI-safe deterministic phase-B TAP
   - New: test/tap/tests/test_mcp_llm_discovery_phaseb-t.sh
   - Validates MCP phase-B primitives end-to-end without external LLM API:
     - list_targets
     - discovery.run_static (target_id-scoped setup)
     - catalog.list_objects
     - agent.run_start / agent.run_finish
     - llm.summary_upsert / llm.summary_get
     - llm.domain_upsert / llm.domain_set_members
     - llm.metric_upsert
     - llm.question_template_add
     - llm.search
   - Uses unique per-run markers to assert persisted artifacts are retrievable

2) Claude headless flow TAP smoke
   - New: test/tap/tests/test_mcp_claude_headless_flow-t.sh
   - Always validates integration path without external dependencies:
     - static_harvest.sh wrapper executes and yields run_id
     - two_phase_discovery.py --dry-run executes with target_id/run_id context
   - Optional real Claude execution:
     - enabled via TAP_RUN_REAL_CLAUDE=1
     - skipped by default to keep CI deterministic

3) Manual real-CLI E2E runner
   - New: scripts/mcp/DiscoveryAgent/ClaudeCode_Headless/run_real_claude_e2e.sh
   - Runs full two-step flow manually when credentials/CLI are available:
     - phase-A static harvest (or --skip-phase-a + --run-id)
     - phase-B real Claude run via two_phase_discovery.py

4) Documentation updates
   - scripts/mcp/DiscoveryAgent/ClaudeCode_Headless/README.md:
     - documents run_real_claude_e2e.sh usage
   - test/tap/groups/ai/README.md:
     - adds manual run instructions for:
       - phase-A static harvest test
       - phase-B deterministic TAP test
       - Claude headless smoke (with optional real mode)

Validation run:
   - bash -n:
     - test_mcp_llm_discovery_phaseb-t.sh
     - test_mcp_claude_headless_flow-t.sh
     - run_real_claude_e2e.sh
     - static_harvest.sh
   - python3 -m py_compile:
     - two_phase_discovery.py
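
The hand-off from phase A to phase B depends on parsing a `Run ID: N` line out of the static harvest output. A minimal sketch of that extraction follows, assuming the wrapper prints the line in exactly that form. The function name `extract_run_id` and the sample output are illustrative and not part of this commit; only the `sed` expression is taken from the scripts in the diff.

```shell
# Hypothetical helper (not in the commit): return the first numeric run_id
# found on a line of the form "Run ID: <digits>", or nothing if absent.
extract_run_id() {
    sed -n 's/^Run ID:[[:space:]]*\([0-9][0-9]*\)$/\1/p' <<<"$1" | head -n1
}

# Made-up harvest output, for illustration only.
sample_out=$'Harvest complete\nRun ID: 42\nObjects: 7'

run_id="$(extract_run_id "${sample_out}")"
if [[ -z "${run_id}" ]]; then
    # Phase B cannot start without a run_id, so report the problem.
    echo "Error: could not extract run_id from static harvest output" >&2
else
    echo "run_id=${run_id}"
fi
```

Anchoring the pattern with `^` and `$` means a line such as `Run ID: abc` or `Previous Run ID: 3` yields an empty result instead of a wrong id, which is why the callers treat an empty `RUN_ID` as a failure.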
1 day ago · 9ffc3f8d71
parent ade0130e67
commit 9ffc3f8d71
5 changed files with 478 additions and 0 deletions
--- a/scripts/mcp/DiscoveryAgent/ClaudeCode_Headless/README.md
+++ b/scripts/mcp/DiscoveryAgent/ClaudeCode_Headless/README.md
@@ -67,10 +67,20 @@ cp mcp_config.example.json mcp_config.json
 | File | Purpose |
 |------|---------|
 | `two_phase_discovery.py` | Orchestration script for Phase 2 |
+| `run_real_claude_e2e.sh` | Manual real-CLI E2E runner (phase A + phase B) |
 | `mcp_config.example.json` | Example MCP configuration for Claude Code |
 | `prompts/two_phase_discovery_prompt.md` | System prompt for LLM agent |
 | `prompts/two_phase_user_prompt.md` | User prompt template |

+### Manual Real Claude E2E
+
+```bash
+./run_real_claude_e2e.sh \
+  --target-id tap_mysql_default \
+  --schema testdb \
+  --mcp-config ./mcp_config.json
+```
+
 ### Documentation

 See [Two_Phase_Discovery_Implementation.md](../../../../doc/Two_Phase_Discovery_Implementation.md) for complete implementation details.
--- a/scripts/mcp/DiscoveryAgent/ClaudeCode_Headless/run_real_claude_e2e.sh
+++ b/scripts/mcp/DiscoveryAgent/ClaudeCode_Headless/run_real_claude_e2e.sh
@@ -0,0 +1,106 @@
+#!/usr/bin/env bash
+#
+# run_real_claude_e2e.sh
+#
+# Manual end-to-end runner:
+# 1) phase-A static harvest
+# 2) phase-B real Claude Code execution
+#
+# This script is intentionally NOT used by default TAP CI.
+#
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+STATIC_HARVEST="${SCRIPT_DIR}/static_harvest.sh"
+TWO_PHASE="${SCRIPT_DIR}/two_phase_discovery.py"
+
+TARGET_ID="${MCP_TARGET_ID:-}"
+SCHEMA="${TEST_DB_NAME:-}"
+MCP_CONFIG="${SCRIPT_DIR}/mcp_config.json"
+MODEL="${CLAUDE_MODEL:-claude-3.5-sonnet}"
+NOTES="real_claude_e2e"
+ENDPOINT="${PROXYSQL_MCP_ENDPOINT:-https://127.0.0.1:6071/mcp/query}"
+SKIP_PHASE_A=0
+
+usage() {
+    cat <<EOF
+Usage: $0 --target-id ID --schema NAME [options]
+
+Required:
+  --target-id ID           MCP target_id
+  --schema NAME            Schema/database to harvest
+
+Optional:
+  --mcp-config PATH        Claude Code MCP config (default: ${SCRIPT_DIR}/mcp_config.json)
+  --model NAME             Claude model name (default: ${MODEL})
+  --notes TEXT             Notes for static harvest (default: ${NOTES})
+  --endpoint URL           MCP query endpoint (default: ${ENDPOINT})
+  --skip-phase-a           Skip static harvest and use --run-id
+  --run-id N               Existing run_id (required when --skip-phase-a)
+  -h, --help               Show this help
+
+Examples:
+  $0 --target-id tap_mysql_default --schema testdb --mcp-config ./mcp_config.json
+  $0 --target-id tap_pgsql_default --schema public --skip-phase-a --run-id 42
+EOF
+}
+
+RUN_ID=""
+
+while [[ $# -gt 0 ]]; do
+    case "$1" in
+        --target-id) TARGET_ID="$2"; shift 2 ;;
+        --schema) SCHEMA="$2"; shift 2 ;;
+        --mcp-config) MCP_CONFIG="$2"; shift 2 ;;
+        --model) MODEL="$2"; shift 2 ;;
+        --notes) NOTES="$2"; shift 2 ;;
+        --endpoint) ENDPOINT="$2"; shift 2 ;;
+        --skip-phase-a) SKIP_PHASE_A=1; shift ;;
+        --run-id) RUN_ID="$2"; shift 2 ;;
+        -h|--help) usage; exit 0 ;;
+        *) echo "Unknown option: $1" >&2; usage; exit 1 ;;
+    esac
+done
+
+if [[ -z "${TARGET_ID}" || -z "${SCHEMA}" ]]; then
+    usage
+    exit 1
+fi
+
+if [[ ! -f "${MCP_CONFIG}" ]]; then
+    echo "Error: MCP config file not found: ${MCP_CONFIG}" >&2
+    exit 1
+fi
+
+if ! command -v claude >/dev/null 2>&1; then
+    echo "Error: claude CLI is not installed or not in PATH" >&2
+    exit 1
+fi
+
+if [[ "${SKIP_PHASE_A}" -eq 0 ]]; then
+    echo "[1/2] Running static harvest..."
+    out="$("${STATIC_HARVEST}" --endpoint "${ENDPOINT}" --target-id "${TARGET_ID}" --schema "${SCHEMA}" --notes "${NOTES}")"
+    echo "${out}"
+    RUN_ID="$(echo "${out}" | sed -n 's/^Run ID:[[:space:]]*\([0-9][0-9]*\)$/\1/p' | head -n1)"
+    if [[ -z "${RUN_ID}" ]]; then
+        echo "Error: could not extract run_id from static harvest output" >&2
+        exit 1
+    fi
+else
+    if [[ -z "${RUN_ID}" ]]; then
+        echo "Error: --run-id is required when --skip-phase-a is used" >&2
+        exit 1
+    fi
+fi
+
+echo "[2/2] Running real Claude phase-B discovery..."
+echo "target_id=${TARGET_ID} schema=${SCHEMA} run_id=${RUN_ID} model=${MODEL}"
+python3 "${TWO_PHASE}" \
+    --mcp-config "${MCP_CONFIG}" \
+    --target-id "${TARGET_ID}" \
+    --schema "${SCHEMA}" \
+    --run-id "${RUN_ID}" \
+    --model "${MODEL}"
+
+echo "Done."
--- a/test/tap/groups/ai/README.md
+++ b/test/tap/groups/ai/README.md
@@ -53,6 +53,27 @@ bash test/tap/tests/test_mcp_static_harvest-t.sh
 bash test/tap/groups/ai/post-proxysql.bash
 ```

+For phase-B MCP discovery primitives (agent/llm/catalog tools, CI-safe):
+
+```bash
+source test/tap/groups/ai/env.sh
+bash test/tap/groups/ai/pre-proxysql.bash
+bash test/tap/tests/test_mcp_llm_discovery_phaseb-t.sh
+bash test/tap/groups/ai/post-proxysql.bash
+```
+
+For Claude headless flow smoke (dry-run + optional real Claude execution):
+
+```bash
+source test/tap/groups/ai/env.sh
+bash test/tap/groups/ai/pre-proxysql.bash
+bash test/tap/tests/test_mcp_claude_headless_flow-t.sh
+# Optional real run:
+# TAP_RUN_REAL_CLAUDE=1 TAP_CLAUDE_MCP_CONFIG=./scripts/mcp/DiscoveryAgent/ClaudeCode_Headless/mcp_config.json \
+#   bash test/tap/tests/test_mcp_claude_headless_flow-t.sh
+bash test/tap/groups/ai/post-proxysql.bash
+```
+
 ## Notes

 - All variables can be overridden from the environment before running hooks.
--- a/test/tap/tests/test_mcp_claude_headless_flow-t.sh
+++ b/test/tap/tests/test_mcp_claude_headless_flow-t.sh
@@ -0,0 +1,120 @@
+#!/usr/bin/env bash
+#
+# test_mcp_claude_headless_flow-t.sh
+#
+# TAP smoke test for ClaudeCode_Headless integration artifacts:
+# - static_harvest.sh wrapper
+# - two_phase_discovery.py orchestration script (dry-run always)
+# - optional real Claude execution (opt-in)
+#
+
+set -euo pipefail
+
+PLAN=6
+DONE=0
+FAIL=0
+
+REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../../.." && pwd)"
+HEADLESS_DIR="${REPO_ROOT}/scripts/mcp/DiscoveryAgent/ClaudeCode_Headless"
+STATIC_HARVEST="${HEADLESS_DIR}/static_harvest.sh"
+TWO_PHASE="${HEADLESS_DIR}/two_phase_discovery.py"
+
+TARGET_ID="${MCP_TARGET_ID:-tap_mysql_default}"
+SCHEMA_NAME="${TEST_DB_NAME:-testdb}"
+MCP_CONFIG_PATH="${TAP_CLAUDE_MCP_CONFIG:-${HEADLESS_DIR}/mcp_config.example.json}"
+RUN_REAL_CLAUDE="${TAP_RUN_REAL_CLAUDE:-0}"
+CLAUDE_TIMEOUT_SEC="${TAP_CLAUDE_TIMEOUT:-900}"
+
+RUN_ID=""
+
+tap_ok() {
+    DONE=$((DONE + 1))
+    echo "msg: ok ${DONE} - $1"
+}
+
+tap_not_ok() {
+    DONE=$((DONE + 1))
+    FAIL=$((FAIL + 1))
+    echo "msg: not ok ${DONE} - $1"
+    if [[ $# -gt 1 ]]; then
+        echo "msg: # $2"
+    fi
+}
+
+tap_skip() {
+    DONE=$((DONE + 1))
+    echo "msg: ok ${DONE} - $1 # SKIP $2"
+}
+
+echo "msg: 1..${PLAN}"
+echo "msg: # MCP Claude Headless Flow Smoke Test"
+
+if [[ -x "${STATIC_HARVEST}" && -f "${TWO_PHASE}" ]]; then
+    tap_ok "Claude headless scripts exist"
+else
+    tap_not_ok "Claude headless scripts exist" "Missing ${STATIC_HARVEST} or ${TWO_PHASE}"
+fi
+
+if command -v jq >/dev/null 2>&1 && command -v python3 >/dev/null 2>&1; then
+    tap_ok "jq and python3 available"
+else
+    tap_not_ok "jq and python3 available" "jq/python3 required"
+fi
+
+static_out="$("${STATIC_HARVEST}" --target-id "${TARGET_ID}" --schema "${SCHEMA_NAME}" --notes "tap_claude_headless_flow" 2>&1 || true)"
+RUN_ID="$(echo "${static_out}" | sed -n 's/^Run ID:[[:space:]]*\([0-9][0-9]*\)$/\1/p' | head -n1)"
+if [[ -n "${RUN_ID}" ]]; then
+    tap_ok "static_harvest.sh returns run id"
+else
+    tap_not_ok "static_harvest.sh returns run id" "${static_out}"
+fi
+
+dry_out="$(python3 "${TWO_PHASE}" --mcp-config "${MCP_CONFIG_PATH}" --target-id "${TARGET_ID}" --schema "${SCHEMA_NAME}" --run-id "${RUN_ID}" --dry-run 2>&1 || true)"
+if echo "${dry_out}" | grep -q "\[DRY RUN\]" && echo "${dry_out}" | grep -q "Target ID: ${TARGET_ID}"; then
+    tap_ok "two_phase_discovery.py dry-run includes target_id and succeeds"
+else
+    tap_not_ok "two_phase_discovery.py dry-run includes target_id and succeeds" "${dry_out}"
+fi
+
+if [[ "${RUN_REAL_CLAUDE}" != "1" ]]; then
+    tap_skip "real Claude execution" "set TAP_RUN_REAL_CLAUDE=1 to enable"
+else
+    if ! command -v claude >/dev/null 2>&1; then
+        tap_skip "real Claude execution" "claude CLI not found"
+    else
+        set +e
+        if command -v timeout >/dev/null 2>&1; then
+            timeout "${CLAUDE_TIMEOUT_SEC}" python3 "${TWO_PHASE}" \
+                --mcp-config "${MCP_CONFIG_PATH}" \
+                --target-id "${TARGET_ID}" \
+                --schema "${SCHEMA_NAME}" \
+                --run-id "${RUN_ID}"
+            rc=$?
+        else
+            python3 "${TWO_PHASE}" \
+                --mcp-config "${MCP_CONFIG_PATH}" \
+                --target-id "${TARGET_ID}" \
+                --schema "${SCHEMA_NAME}" \
+                --run-id "${RUN_ID}"
+            rc=$?
+        fi
+        set -e
+        if [[ ${rc} -eq 0 ]]; then
+            tap_ok "real Claude execution completed"
+        else
+            tap_not_ok "real Claude execution completed" "exit_code=${rc}"
+        fi
+    fi
+fi
+
+if [[ -f "${MCP_CONFIG_PATH}" ]]; then
+    tap_ok "MCP config path exists (${MCP_CONFIG_PATH})"
+else
+    tap_skip "MCP config path exists" "${MCP_CONFIG_PATH} not present (dry-run still valid)"
+fi
+
+if [[ "${FAIL}" -ne 0 ]]; then
+    echo "msg: # FAILURES=${FAIL}/${PLAN}"
+    exit 1
+fi
+exit 0
--- a/test/tap/tests/test_mcp_llm_discovery_phaseb-t.sh
+++ b/test/tap/tests/test_mcp_llm_discovery_phaseb-t.sh
@@ -0,0 +1,221 @@
+#!/usr/bin/env bash
+#
+# test_mcp_llm_discovery_phaseb-t.sh
+#
+# TAP test for MCP phase-B (LLM-driven discovery primitives), CI-safe:
+# - no external LLM credentials required
+# - validates agent/catalog/llm tools end-to-end on harvested catalog data
+#
+
+set -euo pipefail
+
+PLAN=14
+DONE=0
+FAIL=0
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+HELPERS="${SCRIPT_DIR}/mcp_rules_testing/mcp_test_helpers.sh"
+
+if [[ ! -f "${HELPERS}" ]]; then
+    echo "msg: 1..1"
+    echo "msg: not ok 1 - missing helper ${HELPERS}"
+    exit 1
+fi
+source "${HELPERS}"
+
+MCP_MYSQL_TARGET_ID="${MCP_TARGET_ID:-tap_mysql_default}"
+MYSQL_SCHEMA="${MYSQL_DATABASE:-testdb}"
+
+RUN_ID=""
+AGENT_RUN_ID=""
+OBJECT_ID=""
+UNIQ_KEY="tap_phaseb_$(date +%s)"
+
+tap_ok() {
+    DONE=$((DONE + 1))
+    echo "msg: ok ${DONE} - $1"
+}
+
+tap_not_ok() {
+    DONE=$((DONE + 1))
+    FAIL=$((FAIL + 1))
+    echo "msg: not ok ${DONE} - $1"
+    if [[ $# -gt 1 ]]; then
+        echo "msg: # $2"
+    fi
+}
+
+tap_skip() {
+    DONE=$((DONE + 1))
+    echo "msg: ok ${DONE} - $1 # SKIP $2"
+}
+
+mcp_tool_call() {
+    local tool_name="$1"
+    local args_json="$2"
+    local req_id="$3"
+    local req
+    req="$(jq -cn --arg name "${tool_name}" --argjson args "${args_json}" --argjson id "${req_id}" \
+        '{jsonrpc:"2.0",id:$id,method:"tools/call",params:{name:$name,arguments:$args}}')"
+    mcp_request "query" "${req}"
+}
+
+mcp_success_text() {
+    local resp="$1"
+    if echo "${resp}" | jq -e '.error' >/dev/null 2>&1; then
+        return 1
+    fi
+    if echo "${resp}" | jq -e '.result.isError == true' >/dev/null 2>&1; then
+        return 1
+    fi
+    echo "${resp}" | jq -r '.result.content[0].text // empty'
+    return 0
+}
+
+echo "msg: 1..${PLAN}"
+echo "msg: # MCP Phase-B LLM Discovery Tooling Test Suite"
+
+if check_proxysql_admin; then
+    tap_ok "ProxySQL admin reachable"
+else
+    tap_not_ok "ProxySQL admin reachable"
+fi
+
+if check_mcp_server; then
+    tap_ok "MCP server reachable"
+else
+    tap_not_ok "MCP server reachable"
+fi
+
+targets_resp="$(mcp_tool_call "list_targets" '{}' 1)"
+if echo "${targets_resp}" | grep -q "\"target_id\":\"${MCP_MYSQL_TARGET_ID}\""; then
+    tap_ok "MySQL target_id present in list_targets"
+else
+    tap_not_ok "MySQL target_id present in list_targets" "${targets_resp}"
+fi
+
+harvest_resp="$(mcp_tool_call "discovery.run_static" "{\"target_id\":\"${MCP_MYSQL_TARGET_ID}\",\"schema_filter\":\"${MYSQL_SCHEMA}\",\"notes\":\"${UNIQ_KEY}\"}" 2)"
+harvest_text=""
+if harvest_text="$(mcp_success_text "${harvest_resp}")"; then
+    RUN_ID="$(echo "${harvest_text}" | jq -r '.run_id // empty' 2>/dev/null || true)"
+fi
+if [[ -n "${RUN_ID}" ]]; then
+    tap_ok "discovery.run_static returns run_id for phase-B setup"
+else
+    tap_not_ok "discovery.run_static returns run_id for phase-B setup" "${harvest_resp}"
+fi
+
+list_resp="$(mcp_tool_call "catalog.list_objects" "{\"target_id\":\"${MCP_MYSQL_TARGET_ID}\",\"run_id\":\"${RUN_ID}\",\"schema_name\":\"${MYSQL_SCHEMA}\",\"object_type\":\"table\",\"page_size\":20}" 3)"
+list_text=""
+if list_text="$(mcp_success_text "${list_resp}")"; then
+    OBJECT_ID="$(echo "${list_text}" | jq -r '.results[0].object_id // empty' 2>/dev/null || true)"
+fi
+if [[ -n "${OBJECT_ID}" ]]; then
+    tap_ok "catalog.list_objects returns object_id for run"
+else
+    tap_not_ok "catalog.list_objects returns object_id for run" "${list_resp}"
+fi
+
+agent_resp="$(mcp_tool_call "agent.run_start" "{\"target_id\":\"${MCP_MYSQL_TARGET_ID}\",\"run_id\":\"${RUN_ID}\",\"model_name\":\"tap-ci-model\",\"prompt_hash\":\"${UNIQ_KEY}\"}" 4)"
+agent_text=""
+if agent_text="$(mcp_success_text "${agent_resp}")"; then
+    AGENT_RUN_ID="$(echo "${agent_text}" | jq -r '.agent_run_id // empty' 2>/dev/null || true)"
+fi
+if [[ -n "${AGENT_RUN_ID}" ]]; then
+    tap_ok "agent.run_start returns agent_run_id"
+else
+    tap_not_ok "agent.run_start returns agent_run_id" "${agent_resp}"
+fi
+
+summary_payload="$(jq -cn \
+    --arg target_id "${MCP_MYSQL_TARGET_ID}" \
+    --arg run_id "${RUN_ID}" \
+    --argjson object_id "${OBJECT_ID:-null}" \
+    --argjson agent_run_id "${AGENT_RUN_ID:-null}" \
+    --arg uniq "${UNIQ_KEY}" \
+    '{
+      target_id:$target_id,
+      agent_run_id:$agent_run_id,
+      run_id:$run_id,
+      object_id:$object_id,
+      summary:{
+        hypothesis:("phaseb-summary-" + $uniq),
+        grain:"one row per entity",
+        primary_key:["id"],
+        time_columns:[],
+        dimensions:[],
+        measures:[],
+        join_keys:[],
+        example_questions:["What is this table used for?"],
+        warnings:[]
+      },
+      confidence:0.75,
+      status:"draft",
+      sources:{source:"tap-phaseb"}
+    }')"
+summary_resp="$(mcp_tool_call "llm.summary_upsert" "${summary_payload}" 5)"
+if mcp_success_text "${summary_resp}" >/dev/null; then
+    tap_ok "llm.summary_upsert succeeds"
+else
+    tap_not_ok "llm.summary_upsert succeeds" "${summary_resp}"
+fi
+
+summary_get_resp="$(mcp_tool_call "llm.summary_get" "{\"target_id\":\"${MCP_MYSQL_TARGET_ID}\",\"run_id\":\"${RUN_ID}\",\"object_id\":${OBJECT_ID},\"latest\":1}" 6)"
+summary_get_text=""
+if summary_get_text="$(mcp_success_text "${summary_get_resp}")" && echo "${summary_get_text}" | grep -q "phaseb-summary-${UNIQ_KEY}"; then
+    tap_ok "llm.summary_get returns persisted summary marker"
+else
+    tap_not_ok "llm.summary_get returns persisted summary marker" "${summary_get_resp}"
+fi
+
+domain_key="tap_domain_${UNIQ_KEY}"
+domain_resp="$(mcp_tool_call "llm.domain_upsert" "{\"target_id\":\"${MCP_MYSQL_TARGET_ID}\",\"agent_run_id\":${AGENT_RUN_ID},\"run_id\":\"${RUN_ID}\",\"domain_key\":\"${domain_key}\",\"title\":\"TAP Domain\",\"description\":\"TAP phaseb domain\",\"confidence\":0.7}" 7)"
+if mcp_success_text "${domain_resp}" >/dev/null; then
+    tap_ok "llm.domain_upsert succeeds"
+else
+    tap_not_ok "llm.domain_upsert succeeds" "${domain_resp}"
+fi
+
+members_json="$(jq -cn --argjson oid "${OBJECT_ID}" '[{object_id:$oid,role:"entity",confidence:0.8}]')"
+domain_members_resp="$(mcp_tool_call "llm.domain_set_members" "{\"target_id\":\"${MCP_MYSQL_TARGET_ID}\",\"agent_run_id\":${AGENT_RUN_ID},\"run_id\":\"${RUN_ID}\",\"domain_key\":\"${domain_key}\",\"members\":${members_json}}" 8)"
+if mcp_success_text "${domain_members_resp}" >/dev/null; then
+    tap_ok "llm.domain_set_members succeeds"
+else
+    tap_not_ok "llm.domain_set_members succeeds" "${domain_members_resp}"
+fi
+
+metric_key="tap_metric_${UNIQ_KEY}"
+metric_resp="$(mcp_tool_call "llm.metric_upsert" "{\"target_id\":\"${MCP_MYSQL_TARGET_ID}\",\"agent_run_id\":${AGENT_RUN_ID},\"run_id\":\"${RUN_ID}\",\"metric_key\":\"${metric_key}\",\"title\":\"TAP Metric\",\"description\":\"metric from tap phaseb\",\"domain_key\":\"${domain_key}\",\"grain\":\"daily\",\"unit\":\"count\",\"sql_template\":\"SELECT COUNT(*) FROM ${MYSQL_SCHEMA}.tap_mysql_static_customers\",\"depends\":[],\"confidence\":0.65}" 9)"
+if mcp_success_text "${metric_resp}" >/dev/null; then
+    tap_ok "llm.metric_upsert succeeds"
+else
+    tap_not_ok "llm.metric_upsert succeeds" "${metric_resp}"
+fi
+
+qt_resp="$(mcp_tool_call "llm.question_template_add" "{\"target_id\":\"${MCP_MYSQL_TARGET_ID}\",\"agent_run_id\":${AGENT_RUN_ID},\"run_id\":\"${RUN_ID}\",\"title\":\"tap question ${UNIQ_KEY}\",\"question_nl\":\"How many static customers exist?\",\"template\":{\"kind\":\"single_metric\"},\"example_sql\":\"SELECT COUNT(*) FROM ${MYSQL_SCHEMA}.tap_mysql_static_customers\",\"related_objects\":[\"tap_mysql_static_customers\"],\"confidence\":0.66}" 10)"
+if mcp_success_text "${qt_resp}" >/dev/null; then
+    tap_ok "llm.question_template_add succeeds"
+else
+    tap_not_ok "llm.question_template_add succeeds" "${qt_resp}"
+fi
+
+search_resp="$(mcp_tool_call "llm.search" "{\"target_id\":\"${MCP_MYSQL_TARGET_ID}\",\"run_id\":\"${RUN_ID}\",\"query\":\"phaseb-summary-${UNIQ_KEY}\",\"limit\":5}" 11)"
+search_text=""
+if search_text="$(mcp_success_text "${search_resp}")" && echo "${search_text}" | grep -q "phaseb-summary-${UNIQ_KEY}"; then
+    tap_ok "llm.search finds persisted phase-B artifact"
+else
+    tap_not_ok "llm.search finds persisted phase-B artifact" "${search_resp}"
+fi
+
+finish_resp="$(mcp_tool_call "agent.run_finish" "{\"agent_run_id\":${AGENT_RUN_ID},\"status\":\"success\"}" 12)"
+if mcp_success_text "${finish_resp}" >/dev/null; then
+    tap_ok "agent.run_finish succeeds"
+else
+    tap_not_ok "agent.run_finish succeeds" "${finish_resp}"
+fi
+
+if [[ "${FAIL}" -ne 0 ]]; then
+    echo "msg: # FAILURES=${FAIL}/${PLAN}"
+    exit 1
+fi
+exit 0
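
Both test scripts declare a fixed `PLAN` and increment `DONE` once per test point, so the declared plan has to equal the number of `tap_ok` / `tap_not_ok` / `tap_skip` calls that actually run. A minimal sketch of that bookkeeping with an explicit consistency check is shown here. The helpers mirror the ones in the diff (including the `msg: ` prefix), while `tap_plan_matches` and the three sample test points are hypothetical additions for illustration.

```shell
PLAN=3
DONE=0
FAIL=0

# Helpers mirroring the ones used by the tests in this commit.
tap_ok()     { DONE=$((DONE + 1)); echo "msg: ok ${DONE} - $1"; }
tap_not_ok() { DONE=$((DONE + 1)); FAIL=$((FAIL + 1)); echo "msg: not ok ${DONE} - $1"; }
tap_skip()   { DONE=$((DONE + 1)); echo "msg: ok ${DONE} - $1 # SKIP $2"; }

# Hypothetical guard (not in the commit): fail when the number of emitted
# test points differs from the declared plan.
tap_plan_matches() {
    if [[ "${DONE}" -ne "${PLAN}" ]]; then
        echo "msg: # planned ${PLAN} test points but ran ${DONE}"
        return 1
    fi
}

echo "msg: 1..${PLAN}"
tap_ok "first check"
tap_skip "real Claude execution" "set TAP_RUN_REAL_CLAUDE=1 to enable"
tap_not_ok "third check"

# A skipped point still counts toward the plan; only not-ok points count as failures.
if tap_plan_matches; then
    echo "msg: # plan consistent, failures=${FAIL}/${PLAN}"
fi
```

Calling a guard like this just before the final `exit` would surface a plan that has drifted from the number of checks, which a TAP harness otherwise reports only as a generic plan mismatch.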