Dogfooding Session: Wire Smoke Validation

This is a real example of how CQ was used to build CQ's own wire validation system. Every step — from exploration to commit — was orchestrated through CQ itself, on the same codebase being changed.

Step 1: /pi — Exploring the Problem

The question was simple but the answer wasn't obvious:

/pi "How to verify that the wire actually works in prod"

/pi spins up a research loop. It pulls relevant knowledge from the knowledge base, scans recent traces, and drafts an idea document. The AI explored three approaches:

L2 Smoke — spawn the MCP stdio process, dispatch a known-safe tool call, verify the response schema. Fast (under 5 seconds), no side effects, runs in CI.

L3 Canary — deploy a shadow instance alongside prod, route 1% of real traffic through it, diff responses. Accurate but requires infra overhead.

Breadth-first dispatch — call every registered tool with a minimal payload, check for panics and schema drift. Comprehensive but slow.

The discussion converged on L2 smoke as the right first layer:

"A canary tells you what broke in production. A smoke test tells you whether production is even reachable. You need the smoke test first."

Output: .c4/ideas/wire-smoke-validation.md — a structured idea document with problem statement, constraints, and a proposed implementation sketch.

Step 2: /c4-plan — Design + Tasks

/c4-plan

/c4-plan reads the idea document and converts it into EARS-format requirements, then generates a task graph.

Requirements generated:

WHEN the binary is rebuilt, the wire smoke test SHALL verify all registered MCP tools respond within 2 seconds
WHERE a tool returns a non-200 response or schema mismatch, the script SHALL exit non-zero and print the failing tool name
IF --all flag is passed, the script SHALL test every tool; otherwise only the c4_status sentinel
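As a rough illustration, the sentinel-versus-all rule and the hard-fail reporting could map onto shell logic like the sketch below. The function names select_tools and report_failures, and the idea of passing tool names as arguments, are assumptions made for this example; the source states only the required behaviour, not how wire-smoke.sh implements it.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the requirement logic; not the actual wire-smoke.sh.

SENTINEL_TOOL="c4_status"   # sentinel named in the requirements

# Print the tools to test, one per line.
# $1 is the mode flag ("--all" or anything else); the remaining
# arguments are the registered tool names (assumed to be known already).
select_tools() {
  local mode="${1:-}"
  shift || true
  if [ "$mode" = "--all" ]; then
    printf '%s\n' "$@"
  else
    printf '%s\n' "$SENTINEL_TOOL"
  fi
}

# Print every failing tool name and return non-zero if there is at least one.
report_failures() {
  local failed=0 tool
  for tool in "$@"; do
    echo "FAIL: $tool" >&2
    failed=1
  done
  return "$failed"
}
```

Called as `select_tools --all tool_a tool_b` (placeholder names), the first function prints both tools; with any other flag it prints only the sentinel. A caller would end with `report_failures "$@" || exit 1` to satisfy the non-zero exit requirement.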

Design decisions recorded:

Decision	Choice	Rationale
Script vs inline Go	Shell script	Portable, no recompile needed, readable in CI logs
Hard fail vs soft	Hard fail	Silent failures are worse than noisy ones
MCP transport	stdio	Same path as real agent connections

Three tasks created:

T-WSV-01: Write wire-smoke.sh — spawn stdio MCP, dispatch tool calls, verify responses
T-WSV-02: Integrate into /c4-finish as Step 6.5 — run smoke test before commit
T-WSV-03: Integration test — full --all run against local build, 0 failures required

Dependencies: T-WSV-02 blocked on T-WSV-01. T-WSV-03 blocked on T-WSV-02.
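
The blocking chain above is a small topological ordering problem. CQ's own scheduler is not shown in the source, so the following is only an analogy: the standard tsort utility derives the same execution order from "blocker blocked" pairs. The function name task_order is made up for this sketch.

```shell
#!/usr/bin/env bash
# Illustration only: derive an execution order from the task dependencies.
# Each input line is "blocker blocked": the left task must finish first.
task_order() {
  tsort <<'EOF'
T-WSV-01 T-WSV-02
T-WSV-02 T-WSV-03
EOF
}

task_order
```

Because the chain is strictly linear, only one ordering is valid: T-WSV-01, then T-WSV-02, then T-WSV-03.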

Step 3: /c4-run — Implementation

/c4-run

Workers spawned. Each picked up one task from the queue.

T-WSV-01 — the worker read the idea document, checked existing MCP handler registration patterns in c4-core/internal/mcp/, and wrote scripts/wire-smoke.sh. The script:

Builds the binary (make install)
Spawns cq serve in stdio mode with the --smoke flag
Sends a JSON-RPC tools/call for each registered tool
Validates the response envelope matches the MCP schema
Reports pass/fail per tool, exits 1 on any failure
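
The request and validation steps can be sketched as two small helpers. This is a simplified illustration under stated assumptions, not the contents of scripts/wire-smoke.sh: the helper names build_request and check_envelope are hypothetical, the empty arguments object is an assumed "minimal payload", the MCP initialize handshake that precedes tool calls is omitted, and the envelope check uses crude string matching where a real script would more likely parse the JSON.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and payload shape are assumptions.

# Build a JSON-RPC 2.0 tools/call request for one tool.
# $1 = request id, $2 = tool name. Arguments are left empty.
build_request() {
  printf '{"jsonrpc":"2.0","id":%s,"method":"tools/call","params":{"name":"%s","arguments":{}}}\n' "$1" "$2"
}

# Crude envelope check on one response line.
# Passes only if the response is JSON-RPC 2.0, carries a result,
# and reports neither a protocol error nor a tool-level isError.
check_envelope() {
  local compact
  # Drop whitespace so '"jsonrpc": "2.0"' and '"jsonrpc":"2.0"' both match.
  compact=$(printf '%s' "$1" | tr -d '[:space:]')
  case "$compact" in
    *'"jsonrpc":"2.0"'*) ;;
    *) return 1 ;;
  esac
  case "$compact" in
    *'"error":'*|*'"isError":true'*) return 1 ;;
  esac
  case "$compact" in
    *'"result":'*) return 0 ;;
  esac
  return 1
}
```

A per-tool loop would write `build_request "$id" "$tool"` to the server's stdin, read one response line back, and record the tool as failed whenever check_envelope returns non-zero.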

T-WSV-02 — unblocked as soon as T-WSV-01 completed. The worker found the /c4-finish skill definition and inserted Step 6.5 between "build verification" and "commit":

bash

# Step 6.5: Wire smoke validation
if [ -f scripts/wire-smoke.sh ]; then
  bash scripts/wire-smoke.sh --sentinel || exit 1
fi

T-WSV-03 — the integration test worker ran the full suite:

Testing 183 tools... [183/183] PASS
Wire smoke: 0 failures

All three tasks completed. No human intervention between task spawn and final result.

Step 4: /c4-finish — Verify + Commit

/c4-finish

The finish skill runs its standard checklist:

Step 1: git status — 4 files modified, 1 new file
Step 2: go build ./... — OK
Step 3: go vet ./... — OK
Step 4: go test ./... — OK (2m14s)
Step 5: Lint — OK
Step 6.5: Wire smoke (sentinel) — PASS [c4_status responded in 341ms]
Step 7: Commit
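
The checklist behaves like a hard-fail pipeline: each step must pass before the next one runs, and the commit happens only if everything before it succeeded. A minimal sketch of that control flow follows. The run_checklist function is hypothetical and is not the /c4-finish implementation, which the source does not show.

```shell
#!/usr/bin/env bash
# Hypothetical runner: execute commands in order, stop at the first failure.
# Each argument is one shell command line.
run_checklist() {
  local n=0 cmd
  for cmd in "$@"; do
    n=$((n + 1))
    if sh -c "$cmd" >/dev/null 2>&1; then
      echo "Step $n: $cmd OK"
    else
      echo "Step $n: $cmd FAILED" >&2
      return 1
    fi
  done
}
```

With the commands from the checklist above, a call might look like `run_checklist "go build ./..." "go vet ./..." "go test ./..." "bash scripts/wire-smoke.sh --sentinel" && git commit`, so a failing smoke test blocks the commit.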

Commit message:

feat(infra): wire-smoke.sh L2 validation

Add shell-based smoke test that spawns MCP stdio and verifies
tool dispatch end-to-end. Integrated into /c4-finish as Step 6.5.
183 tools validated, sentinel mode for fast CI path.

Closes T-WSV-01 T-WSV-02 T-WSV-03

What Happened Next

The same /pi → /c4-plan → /c4-run → /c4-finish pattern was applied to adjacent systems:

CF Worker smoke test (cf-smoke.sh) — validates that the Cloudflare Worker serving the Streamable MCP surface returns correct tool schemas. Added to deployment pipeline.

Edge Functions smoke test (edge-smoke.sh) — validates Supabase Edge Functions. On its first run against production, it caught a live bug: telegram-notify was returning 200 even on authentication failure (fail-open instead of fail-closed). The bug had been silent for 11 days.
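
A fail-open bug of this kind is only visible to a negative test: send a request without valid credentials and require that it be rejected. The sketch below shows the shape of such an assertion. It is not taken from edge-smoke.sh; the function name assert_fail_closed and the choice of 401/403 as the only acceptable statuses are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical negative check for an authenticated endpoint.
# $1 = HTTP status returned for a request sent WITHOUT valid credentials.
# Fail-closed means that request must be rejected (401 or 403 assumed here).
assert_fail_closed() {
  case "$1" in
    401|403) return 0 ;;
    *)
      echo "fail-open: unauthenticated request returned $1" >&2
      return 1
      ;;
  esac
}
```

The status could be captured with something like `curl -s -o /dev/null -w '%{http_code}' "$URL"`, where `$URL` stands in for the function's endpoint. A 200 in that position is exactly the symptom described above.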

L3 canary in GitLab CI — after L2 smoke proved stable, the canary layer was added. It routes synthetic tool calls through the production relay and diffs responses against a known-good snapshot. Runs nightly.
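
The snapshot comparison at the heart of the canary can be illustrated with plain diff. This is a sketch under assumptions: the function name diff_against_snapshot and the file-based interface are invented for the example, and the real canary job may normalise responses before comparing them.

```shell
#!/usr/bin/env bash
# Hypothetical snapshot check.
# $1 = known-good snapshot file, $2 = file holding the current response.
diff_against_snapshot() {
  if diff -u "$1" "$2"; then
    echo "canary: match"
  else
    # diff has already printed the differing lines to stdout.
    echo "canary: DRIFT" >&2
    return 1
  fi
}
```

A nightly job would call this once per synthetic tool call and fail the pipeline on the first non-zero return.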

Key Takeaway

"CQ builds CQ" is not a slogan. The wire validation system that now protects every CQ deployment was itself built through the same loop it enforces:

/pi — understand the problem
/c4-plan — make the design explicit
/c4-run — implement without drift
/c4-finish — verify before shipping

The edge-smoke finding — a silent auth failure in production — is the clearest evidence that the system works. The bug wasn't caught in code review. It wasn't caught in unit tests. It was caught by running the actual wire under controlled conditions and checking what came back.

That is what wire validation is for. And that is what CQ is for.
