mcp tests: add phase-B TAP coverage and optional real Claude CLI E2E runner

Add two complementary TAP tests for phase-B validation and an optional real Claude CLI E2E helper, so we can validate both 'without Claude credentials' and 'with real Claude CLI' workflows.

What was added:

1) CI-safe deterministic phase-B TAP
- New: test/tap/tests/test_mcp_llm_discovery_phaseb-t.sh
- Validates MCP phase-B primitives end-to-end without external LLM API:
  - list_targets
  - discovery.run_static (target_id-scoped setup)
  - catalog.list_objects
  - agent.run_start / agent.run_finish
  - llm.summary_upsert / llm.summary_get
  - llm.domain_upsert / llm.domain_set_members
  - llm.metric_upsert
  - llm.question_template_add
  - llm.search
- Uses unique per-run markers to assert persisted artifacts are retrievable

2) Claude headless flow TAP smoke
- New: test/tap/tests/test_mcp_claude_headless_flow-t.sh
- Always validates integration path without external dependencies:
  - static_harvest.sh wrapper executes and yields run_id
  - two_phase_discovery.py --dry-run executes with target_id/run_id context
- Optional real Claude execution:
  - enabled via TAP_RUN_REAL_CLAUDE=1
  - skipped by default to keep CI deterministic

3) Manual real-CLI E2E runner
- New: scripts/mcp/DiscoveryAgent/ClaudeCode_Headless/run_real_claude_e2e.sh
- Runs full two-step flow manually when credentials/CLI are available:
  - phase-A static harvest (or --skip-phase-a + --run-id)
  - phase-B real Claude run via two_phase_discovery.py

4) Documentation updates
- scripts/mcp/DiscoveryAgent/ClaudeCode_Headless/README.md:
  - documents run_real_claude_e2e.sh usage
- test/tap/groups/ai/README.md:
  - adds manual run instructions for:
    - phase-A static harvest test
    - phase-B deterministic TAP test
    - Claude headless smoke (with optional real mode)

Validation run:
- bash -n:
  - test_mcp_llm_discovery_phaseb-t.sh
  - test_mcp_claude_headless_flow-t.sh
  - run_real_claude_e2e.sh
  - static_harvest.sh
- python3 -m py_compile:
  - two_phase_discovery.py
v3.0-MCP_multi
Rene Cannao 1 day ago
parent ade0130e67
commit 9ffc3f8d71

@ -67,10 +67,20 @@ cp mcp_config.example.json mcp_config.json
| File | Purpose |
|------|---------|
| `two_phase_discovery.py` | Orchestration script for Phase 2 |
| `run_real_claude_e2e.sh` | Manual real-CLI E2E runner (phase A + phase B) |
| `mcp_config.example.json` | Example MCP configuration for Claude Code |
| `prompts/two_phase_discovery_prompt.md` | System prompt for LLM agent |
| `prompts/two_phase_user_prompt.md` | User prompt template |
### Manual Real Claude E2E
```bash
./run_real_claude_e2e.sh \
--target-id tap_mysql_default \
--schema testdb \
--mcp-config ./mcp_config.json
```
### Documentation
See [Two_Phase_Discovery_Implementation.md](../../../../doc/Two_Phase_Discovery_Implementation.md) for complete implementation details.

@ -0,0 +1,106 @@
#!/usr/bin/env bash
#
# run_real_claude_e2e.sh
#
# Manual end-to-end runner:
# 1) phase-A static harvest
# 2) phase-B real Claude Code execution
#
# This script is intentionally NOT used by default TAP CI.
#
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
STATIC_HARVEST="${SCRIPT_DIR}/static_harvest.sh"
TWO_PHASE="${SCRIPT_DIR}/two_phase_discovery.py"
TARGET_ID="${MCP_TARGET_ID:-}"
SCHEMA="${TEST_DB_NAME:-}"
MCP_CONFIG="${SCRIPT_DIR}/mcp_config.json"
MODEL="${CLAUDE_MODEL:-claude-3.5-sonnet}"
NOTES="real_claude_e2e"
ENDPOINT="${PROXYSQL_MCP_ENDPOINT:-https://127.0.0.1:6071/mcp/query}"
SKIP_PHASE_A=0
usage() {
cat <<EOF
Usage: $0 --target-id ID --schema NAME [options]
Required:
--target-id ID MCP target_id
--schema NAME Schema/database to harvest
Optional:
--mcp-config PATH Claude Code MCP config (default: ${SCRIPT_DIR}/mcp_config.json)
--model NAME Claude model name (default: ${MODEL})
--notes TEXT Notes for static harvest (default: ${NOTES})
--endpoint URL MCP query endpoint (default: ${ENDPOINT})
--skip-phase-a Skip static harvest and use --run-id
--run-id N Existing run_id (required when --skip-phase-a)
-h, --help Show this help
Examples:
$0 --target-id tap_mysql_default --schema testdb --mcp-config ./mcp_config.json
$0 --target-id tap_pgsql_default --schema public --skip-phase-a --run-id 42
EOF
}
RUN_ID=""
while [[ $# -gt 0 ]]; do
case "$1" in
--target-id) TARGET_ID="$2"; shift 2 ;;
--schema) SCHEMA="$2"; shift 2 ;;
--mcp-config) MCP_CONFIG="$2"; shift 2 ;;
--model) MODEL="$2"; shift 2 ;;
--notes) NOTES="$2"; shift 2 ;;
--endpoint) ENDPOINT="$2"; shift 2 ;;
--skip-phase-a) SKIP_PHASE_A=1; shift ;;
--run-id) RUN_ID="$2"; shift 2 ;;
-h|--help) usage; exit 0 ;;
*) echo "Unknown option: $1" >&2; usage; exit 1 ;;
esac
done
if [[ -z "${TARGET_ID}" || -z "${SCHEMA}" ]]; then
usage
exit 1
fi
if [[ ! -f "${MCP_CONFIG}" ]]; then
echo "Error: MCP config file not found: ${MCP_CONFIG}" >&2
exit 1
fi
if ! command -v claude >/dev/null 2>&1; then
echo "Error: claude CLI is not installed or not in PATH" >&2
exit 1
fi
if [[ "${SKIP_PHASE_A}" -eq 0 ]]; then
echo "[1/2] Running static harvest..."
out="$("${STATIC_HARVEST}" --endpoint "${ENDPOINT}" --target-id "${TARGET_ID}" --schema "${SCHEMA}" --notes "${NOTES}")"
echo "${out}"
RUN_ID="$(echo "${out}" | sed -n 's/^Run ID:[[:space:]]*\([0-9][0-9]*\)$/\1/p' | head -n1)"
if [[ -z "${RUN_ID}" ]]; then
echo "Error: could not extract run_id from static harvest output" >&2
exit 1
fi
else
if [[ -z "${RUN_ID}" ]]; then
echo "Error: --run-id is required when --skip-phase-a is used" >&2
exit 1
fi
fi
echo "[2/2] Running real Claude phase-B discovery..."
echo "target_id=${TARGET_ID} schema=${SCHEMA} run_id=${RUN_ID} model=${MODEL}"
python3 "${TWO_PHASE}" \
--mcp-config "${MCP_CONFIG}" \
--target-id "${TARGET_ID}" \
--schema "${SCHEMA}" \
--run-id "${RUN_ID}" \
--model "${MODEL}"
echo "Done."

@ -53,6 +53,27 @@ bash test/tap/tests/test_mcp_static_harvest-t.sh
bash test/tap/groups/ai/post-proxysql.bash
```
For phase-B MCP discovery primitives (agent/llm/catalog tools, CI-safe):
```bash
source test/tap/groups/ai/env.sh
bash test/tap/groups/ai/pre-proxysql.bash
bash test/tap/tests/test_mcp_llm_discovery_phaseb-t.sh
bash test/tap/groups/ai/post-proxysql.bash
```
For Claude headless flow smoke (dry-run + optional real Claude execution):
```bash
source test/tap/groups/ai/env.sh
bash test/tap/groups/ai/pre-proxysql.bash
bash test/tap/tests/test_mcp_claude_headless_flow-t.sh
# Optional real run:
# TAP_RUN_REAL_CLAUDE=1 TAP_CLAUDE_MCP_CONFIG=./scripts/mcp/DiscoveryAgent/ClaudeCode_Headless/mcp_config.json \
# bash test/tap/tests/test_mcp_claude_headless_flow-t.sh
bash test/tap/groups/ai/post-proxysql.bash
```
## Notes
- All variables can be overridden from the environment before running hooks.

@ -0,0 +1,120 @@
#!/usr/bin/env bash
#
# test_mcp_claude_headless_flow-t.sh
#
# TAP smoke test for ClaudeCode_Headless integration artifacts:
# - static_harvest.sh wrapper
# - two_phase_discovery.py orchestration script (dry-run always)
# - optional real Claude execution (opt-in)
#
set -euo pipefail
PLAN=6
DONE=0
FAIL=0
REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
HEADLESS_DIR="${REPO_ROOT}/scripts/mcp/DiscoveryAgent/ClaudeCode_Headless"
STATIC_HARVEST="${HEADLESS_DIR}/static_harvest.sh"
TWO_PHASE="${HEADLESS_DIR}/two_phase_discovery.py"
TARGET_ID="${MCP_TARGET_ID:-tap_mysql_default}"
SCHEMA_NAME="${TEST_DB_NAME:-testdb}"
MCP_CONFIG_PATH="${TAP_CLAUDE_MCP_CONFIG:-${HEADLESS_DIR}/mcp_config.example.json}"
RUN_REAL_CLAUDE="${TAP_RUN_REAL_CLAUDE:-0}"
CLAUDE_TIMEOUT_SEC="${TAP_CLAUDE_TIMEOUT:-900}"
RUN_ID=""
tap_ok() {
DONE=$((DONE + 1))
echo "msg: ok ${DONE} - $1"
}
tap_not_ok() {
DONE=$((DONE + 1))
FAIL=$((FAIL + 1))
echo "msg: not ok ${DONE} - $1"
if [[ $# -gt 1 ]]; then
echo "msg: # $2"
fi
}
tap_skip() {
DONE=$((DONE + 1))
echo "msg: ok ${DONE} - $1 # SKIP $2"
}
echo "msg: 1..${PLAN}"
echo "msg: # MCP Claude Headless Flow Smoke Test"
if [[ -x "${STATIC_HARVEST}" && -f "${TWO_PHASE}" ]]; then
tap_ok "Claude headless scripts exist"
else
tap_not_ok "Claude headless scripts exist" "Missing ${STATIC_HARVEST} or ${TWO_PHASE}"
fi
if command -v jq >/dev/null 2>&1 && command -v python3 >/dev/null 2>&1; then
tap_ok "jq and python3 available"
else
tap_not_ok "jq and python3 available" "jq/python3 required"
fi
static_out="$("${STATIC_HARVEST}" --target-id "${TARGET_ID}" --schema "${SCHEMA_NAME}" --notes "tap_claude_headless_flow" 2>&1 || true)"
RUN_ID="$(echo "${static_out}" | sed -n 's/^Run ID:[[:space:]]*\([0-9][0-9]*\)$/\1/p' | head -n1)"
if [[ -n "${RUN_ID}" ]]; then
tap_ok "static_harvest.sh returns run id"
else
tap_not_ok "static_harvest.sh returns run id" "${static_out}"
fi
dry_out="$(python3 "${TWO_PHASE}" --mcp-config "${MCP_CONFIG_PATH}" --target-id "${TARGET_ID}" --schema "${SCHEMA_NAME}" --run-id "${RUN_ID}" --dry-run 2>&1 || true)"
if echo "${dry_out}" | grep -q "\[DRY RUN\]" && echo "${dry_out}" | grep -q "Target ID: ${TARGET_ID}"; then
tap_ok "two_phase_discovery.py dry-run includes target_id and succeeds"
else
tap_not_ok "two_phase_discovery.py dry-run includes target_id and succeeds" "${dry_out}"
fi
if [[ "${RUN_REAL_CLAUDE}" != "1" ]]; then
tap_skip "real Claude execution" "set TAP_RUN_REAL_CLAUDE=1 to enable"
else
if ! command -v claude >/dev/null 2>&1; then
tap_skip "real Claude execution" "claude CLI not found"
else
set +e
if command -v timeout >/dev/null 2>&1; then
timeout "${CLAUDE_TIMEOUT_SEC}" python3 "${TWO_PHASE}" \
--mcp-config "${MCP_CONFIG_PATH}" \
--target-id "${TARGET_ID}" \
--schema "${SCHEMA_NAME}" \
--run-id "${RUN_ID}"
rc=$?
else
python3 "${TWO_PHASE}" \
--mcp-config "${MCP_CONFIG_PATH}" \
--target-id "${TARGET_ID}" \
--schema "${SCHEMA_NAME}" \
--run-id "${RUN_ID}"
rc=$?
fi
set -e
if [[ ${rc} -eq 0 ]]; then
tap_ok "real Claude execution completed"
else
tap_not_ok "real Claude execution completed" "exit_code=${rc}"
fi
fi
fi
if [[ -f "${MCP_CONFIG_PATH}" ]]; then
tap_ok "MCP config path exists (${MCP_CONFIG_PATH})"
else
tap_skip "MCP config path exists" "${MCP_CONFIG_PATH} not present (dry-run still valid)"
fi
if [[ "${FAIL}" -ne 0 ]]; then
echo "msg: # FAILURES=${FAIL}/${PLAN}"
exit 1
fi
exit 0

@ -0,0 +1,221 @@
#!/usr/bin/env bash
#
# test_mcp_llm_discovery_phaseb-t.sh
#
# TAP test for MCP phase-B (LLM-driven discovery primitives), CI-safe:
# - no external LLM credentials required
# - validates agent/catalog/llm tools end-to-end on harvested catalog data
#
set -euo pipefail
PLAN=12
DONE=0
FAIL=0
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
HELPERS="${SCRIPT_DIR}/mcp_rules_testing/mcp_test_helpers.sh"
if [[ ! -f "${HELPERS}" ]]; then
echo "msg: 1..1"
echo "msg: not ok 1 - missing helper ${HELPERS}"
exit 1
fi
source "${HELPERS}"
MCP_MYSQL_TARGET_ID="${MCP_TARGET_ID:-tap_mysql_default}"
MYSQL_SCHEMA="${MYSQL_DATABASE:-testdb}"
RUN_ID=""
AGENT_RUN_ID=""
OBJECT_ID=""
UNIQ_KEY="tap_phaseb_$(date +%s)"
tap_ok() {
DONE=$((DONE + 1))
echo "msg: ok ${DONE} - $1"
}
tap_not_ok() {
DONE=$((DONE + 1))
FAIL=$((FAIL + 1))
echo "msg: not ok ${DONE} - $1"
if [[ $# -gt 1 ]]; then
echo "msg: # $2"
fi
}
tap_skip() {
DONE=$((DONE + 1))
echo "msg: ok ${DONE} - $1 # SKIP $2"
}
mcp_tool_call() {
local tool_name="$1"
local args_json="$2"
local req_id="$3"
local req
req="$(jq -cn --arg name "${tool_name}" --argjson args "${args_json}" --argjson id "${req_id}" \
'{jsonrpc:"2.0",id:$id,method:"tools/call",params:{name:$name,arguments:$args}}')"
mcp_request "query" "${req}"
}
mcp_success_text() {
local resp="$1"
if echo "${resp}" | jq -e '.error' >/dev/null 2>&1; then
return 1
fi
if echo "${resp}" | jq -e '.result.isError == true' >/dev/null 2>&1; then
return 1
fi
echo "${resp}" | jq -r '.result.content[0].text // empty'
return 0
}
echo "msg: 1..${PLAN}"
echo "msg: # MCP Phase-B LLM Discovery Tooling Test Suite"
if check_proxysql_admin; then
tap_ok "ProxySQL admin reachable"
else
tap_not_ok "ProxySQL admin reachable"
fi
if check_mcp_server; then
tap_ok "MCP server reachable"
else
tap_not_ok "MCP server reachable"
fi
targets_resp="$(mcp_tool_call "list_targets" '{}' 1)"
if echo "${targets_resp}" | grep -q "\"target_id\":\"${MCP_MYSQL_TARGET_ID}\""; then
tap_ok "MySQL target_id present in list_targets"
else
tap_not_ok "MySQL target_id present in list_targets" "${targets_resp}"
fi
harvest_resp="$(mcp_tool_call "discovery.run_static" "{\"target_id\":\"${MCP_MYSQL_TARGET_ID}\",\"schema_filter\":\"${MYSQL_SCHEMA}\",\"notes\":\"${UNIQ_KEY}\"}" 2)"
harvest_text=""
if harvest_text="$(mcp_success_text "${harvest_resp}")"; then
RUN_ID="$(echo "${harvest_text}" | jq -r '.run_id // empty' 2>/dev/null || true)"
fi
if [[ -n "${RUN_ID}" ]]; then
tap_ok "discovery.run_static returns run_id for phase-B setup"
else
tap_not_ok "discovery.run_static returns run_id for phase-B setup" "${harvest_resp}"
fi
list_resp="$(mcp_tool_call "catalog.list_objects" "{\"target_id\":\"${MCP_MYSQL_TARGET_ID}\",\"run_id\":\"${RUN_ID}\",\"schema_name\":\"${MYSQL_SCHEMA}\",\"object_type\":\"table\",\"page_size\":20}" 3)"
list_text=""
if list_text="$(mcp_success_text "${list_resp}")"; then
OBJECT_ID="$(echo "${list_text}" | jq -r '.results[0].object_id // empty' 2>/dev/null || true)"
fi
if [[ -n "${OBJECT_ID}" ]]; then
tap_ok "catalog.list_objects returns object_id for run"
else
tap_not_ok "catalog.list_objects returns object_id for run" "${list_resp}"
fi
agent_resp="$(mcp_tool_call "agent.run_start" "{\"target_id\":\"${MCP_MYSQL_TARGET_ID}\",\"run_id\":\"${RUN_ID}\",\"model_name\":\"tap-ci-model\",\"prompt_hash\":\"${UNIQ_KEY}\"}" 4)"
agent_text=""
if agent_text="$(mcp_success_text "${agent_resp}")"; then
AGENT_RUN_ID="$(echo "${agent_text}" | jq -r '.agent_run_id // empty' 2>/dev/null || true)"
fi
if [[ -n "${AGENT_RUN_ID}" ]]; then
tap_ok "agent.run_start returns agent_run_id"
else
tap_not_ok "agent.run_start returns agent_run_id" "${agent_resp}"
fi
summary_payload="$(jq -cn \
--arg target_id "${MCP_MYSQL_TARGET_ID}" \
--arg run_id "${RUN_ID}" \
--argjson object_id "${OBJECT_ID}" \
--argjson agent_run_id "${AGENT_RUN_ID}" \
--arg uniq "${UNIQ_KEY}" \
'{
target_id:$target_id,
agent_run_id:$agent_run_id,
run_id:$run_id,
object_id:$object_id,
summary:{
hypothesis:("phaseb-summary-" + $uniq),
grain:"one row per entity",
primary_key:["id"],
time_columns:[],
dimensions:[],
measures:[],
join_keys:[],
example_questions:["What is this table used for?"],
warnings:[]
},
confidence:0.75,
status:"draft",
sources:{source:"tap-phaseb"}
}')"
summary_resp="$(mcp_tool_call "llm.summary_upsert" "${summary_payload}" 5)"
if mcp_success_text "${summary_resp}" >/dev/null; then
tap_ok "llm.summary_upsert succeeds"
else
tap_not_ok "llm.summary_upsert succeeds" "${summary_resp}"
fi
summary_get_resp="$(mcp_tool_call "llm.summary_get" "{\"target_id\":\"${MCP_MYSQL_TARGET_ID}\",\"run_id\":\"${RUN_ID}\",\"object_id\":${OBJECT_ID},\"latest\":1}" 6)"
summary_get_text=""
if summary_get_text="$(mcp_success_text "${summary_get_resp}")" && echo "${summary_get_text}" | grep -q "phaseb-summary-${UNIQ_KEY}"; then
tap_ok "llm.summary_get returns persisted summary marker"
else
tap_not_ok "llm.summary_get returns persisted summary marker" "${summary_get_resp}"
fi
domain_key="tap_domain_${UNIQ_KEY}"
domain_resp="$(mcp_tool_call "llm.domain_upsert" "{\"target_id\":\"${MCP_MYSQL_TARGET_ID}\",\"agent_run_id\":${AGENT_RUN_ID},\"run_id\":\"${RUN_ID}\",\"domain_key\":\"${domain_key}\",\"title\":\"TAP Domain\",\"description\":\"TAP phaseb domain\",\"confidence\":0.7}" 7)"
if mcp_success_text "${domain_resp}" >/dev/null; then
tap_ok "llm.domain_upsert succeeds"
else
tap_not_ok "llm.domain_upsert succeeds" "${domain_resp}"
fi
members_json="$(jq -cn --argjson oid "${OBJECT_ID}" '[{object_id:$oid,role:"entity",confidence:0.8}]')"
domain_members_resp="$(mcp_tool_call "llm.domain_set_members" "{\"target_id\":\"${MCP_MYSQL_TARGET_ID}\",\"agent_run_id\":${AGENT_RUN_ID},\"run_id\":\"${RUN_ID}\",\"domain_key\":\"${domain_key}\",\"members\":${members_json}}" 8)"
if mcp_success_text "${domain_members_resp}" >/dev/null; then
tap_ok "llm.domain_set_members succeeds"
else
tap_not_ok "llm.domain_set_members succeeds" "${domain_members_resp}"
fi
metric_key="tap_metric_${UNIQ_KEY}"
metric_resp="$(mcp_tool_call "llm.metric_upsert" "{\"target_id\":\"${MCP_MYSQL_TARGET_ID}\",\"agent_run_id\":${AGENT_RUN_ID},\"run_id\":\"${RUN_ID}\",\"metric_key\":\"${metric_key}\",\"title\":\"TAP Metric\",\"description\":\"metric from tap phaseb\",\"domain_key\":\"${domain_key}\",\"grain\":\"daily\",\"unit\":\"count\",\"sql_template\":\"SELECT COUNT(*) FROM ${MYSQL_SCHEMA}.tap_mysql_static_customers\",\"depends\":[],\"confidence\":0.65}" 9)"
if mcp_success_text "${metric_resp}" >/dev/null; then
tap_ok "llm.metric_upsert succeeds"
else
tap_not_ok "llm.metric_upsert succeeds" "${metric_resp}"
fi
qt_resp="$(mcp_tool_call "llm.question_template_add" "{\"target_id\":\"${MCP_MYSQL_TARGET_ID}\",\"agent_run_id\":${AGENT_RUN_ID},\"run_id\":\"${RUN_ID}\",\"title\":\"tap question ${UNIQ_KEY}\",\"question_nl\":\"How many static customers exist?\",\"template\":{\"kind\":\"single_metric\"},\"example_sql\":\"SELECT COUNT(*) FROM ${MYSQL_SCHEMA}.tap_mysql_static_customers\",\"related_objects\":[\"tap_mysql_static_customers\"],\"confidence\":0.66}" 10)"
if mcp_success_text "${qt_resp}" >/dev/null; then
tap_ok "llm.question_template_add succeeds"
else
tap_not_ok "llm.question_template_add succeeds" "${qt_resp}"
fi
search_resp="$(mcp_tool_call "llm.search" "{\"target_id\":\"${MCP_MYSQL_TARGET_ID}\",\"run_id\":\"${RUN_ID}\",\"query\":\"phaseb-summary-${UNIQ_KEY}\",\"limit\":5}" 11)"
search_text=""
if search_text="$(mcp_success_text "${search_resp}")" && echo "${search_text}" | grep -q "phaseb-summary-${UNIQ_KEY}"; then
tap_ok "llm.search finds persisted phase-B artifact"
else
tap_not_ok "llm.search finds persisted phase-B artifact" "${search_resp}"
fi
finish_resp="$(mcp_tool_call "agent.run_finish" "{\"agent_run_id\":${AGENT_RUN_ID},\"status\":\"success\"}" 12)"
if mcp_success_text "${finish_resp}" >/dev/null; then
tap_ok "agent.run_finish succeeds"
else
tap_not_ok "agent.run_finish succeeds" "${finish_resp}"
fi
if [[ "${FAIL}" -ne 0 ]]; then
echo "msg: # FAILURES=${FAIL}/${PLAN}"
exit 1
fi
exit 0
Loading…
Cancel
Save