channel-debug-core
$
npx mdskill add vercel-labs/vercel-openclaw/channel-debug-coreProve deployment state and gather evidence before proposing fixes.
- Diagnoses delivery failures across Slack, Telegram, Discord, and WhatsApp.
- Integrates Vercel logs, sandbox runtime, and workflow state tools.
- Launches parallel diagnostic lanes to collect direct evidence first.
- Delivers verified path diagrams and hypothesis tables for handoff.
SKILL.md
.github/skills/channel-debug-coreView on GitHub ↗
---
name: channel-debug-core
description: "Channel webhook triage for vercel-openclaw Slack/Telegram/Discord/WhatsApp issues: prove deployment state, collect admin readiness endpoints, build evidence-first handoff before fixes."
---
# Channel Debug Core
Use this skill for any Slack, Telegram, Discord, or WhatsApp delivery issue.
## Non-Negotiables
Before proposing a fix, produce:
1. Deployment-state proof.
2. Runtime path diagram with each edge marked `unknown`, `verified-good`, or `verified-bad`.
3. Hypothesis table with fastest falsifier and status.
4. Channel Specialist Handoff.
Do not write "most likely" twice. Gather direct evidence.
## Parallel First Rule
For live deployments, channel debugging starts with parallel lanes. The parent agent creates one artifact root and launches these lanes before any patch proposal:
| Lane | Agent role | Evidence focus |
|---|---|---|
| Vercel/app logs | `channel_vercel_logs` | Project targeting proof, deployment proof, admin readiness JSON, Vercel runtime logs, request/delivery correlation |
| Sandbox runtime | `channel_sandbox_runtime` | `npx sandbox` or app admin SSH fallback, process/port/config/log evidence |
| Workflow state | `channel_workflow_state` | Project targeting proof, `npx workflow inspect runs --backend vercel`, drain workflow state, fallback log correlation |
| Channel specialist | `channel_<channel>` | Channel route semantics, signatures/raw body, dedup, boot message, terminal-path audit |
No fix may be proposed until the merge gate is satisfied:
1. Each lane has returned a handoff, or the lane is marked unavailable with the attempted fallback.
2. At least one runtime edge is `verified-bad`.
3. Contradictory lane evidence has been added to the hypothesis table.
4. Prior known fixes have been compared against current evidence.
## Project Targeting Preflight
Before any `vercel`, `npx sandbox`, or `npx workflow` investigation, prove the command is targeting the intended Vercel project and team.
Always inspect `.vercel/project.json` and compare it to the incident target. If they differ, do not rely on inferred project context. Use explicit project/team/deployment targeting and record the mismatch in the handoff.
Known sharp edge: this repo may be linked to release 45 in `.vercel/project.json` (`projectName: vercel-openclaw-45`, `projectId: prj_ag6Uzj9b4deNK98jVS5SO0H2Shmx`). For release 46, use explicit target context (`projectName: vercel-openclaw-46`, `projectId: prj_JSzrXyJiMzT6F7BUA76Naa7nkjRa`, team `vercel-internal-playground`) rather than trusting `.vercel` inference.
If a tool prints that it inferred the project from `.vercel`, treat the output as suspect until the inferred project is proven to match the incident target. This is especially important for `npx workflow`, which can silently inspect the wrong project when run from a repo linked to a different release.
## Required Admin Surfaces
Collect these first:
```bash
GET /api/admin/why-not-ready
GET /api/channels/summary
GET /api/admin/sandbox-diag
GET /api/admin/logs
```
Save the raw JSON before continuing. Logs are a ring buffer and can evict important events.
## Readiness Triad
Report these separately:
- `route-ready`: platform route/native route appears registered.
- `native-accepted`: `lastForward.ok` true with `classification:"accepted"`, or equivalent handler acceptance evidence.
- `user-visible-reply`: real user saw a Slack/Telegram/Discord/WhatsApp reply or status update.
Never collapse these into `connected`.
## Deployment Proof
Run or request equivalent evidence:
```bash
git rev-parse HEAD
git ls-remote origin main
curl "$URL/api/admin/sandbox-diag"
```
If the deployed runtime cannot be tied to the source being read, say so and stop code-level conclusions.
## Evidence Artifact Rule
Write evidence under:
```text
.agent-runs/channel-debug/<timestamp>/<channel>/
```
Do not commit runtime evidence. Redact admin secrets, bypass secrets, bot tokens, webhook secrets, and platform access tokens.
## Release Env Files
When the incident references a release-specific env file, such as `.env.45`, use it only to investigate that release's live run.
Allowed uses:
- deployment URL / app URL
- Vercel project/team/deployment context
- Vercel token for logs/workflow/sandbox CLI
- `ADMIN_SECRET`
- Deployment Protection bypass
- platform credentials required to verify the specific channel run
Rules:
- Source the file; do not print it.
- Do not copy it into artifacts.
- Do not include raw values in handoffs.
- Record only the file name and variable names used.
- Treat env files as credential/context input, not deployment proof. Deployment proof still comes from git, project targeting proof, and live runtime signals.
## Sandbox Runtime Probe
The sandbox lane must first identify the actual Vercel Sandbox runtime ID. Use `/api/admin/sandbox-diag` and `/api/channels/summary` before CLI access.
If the ID is `oc-prj*`, `prj_*`, `OPENCLAW_INSTANCE_ID`, or `VERCEL_PROJECT_ID`, do not pass it as `<sandbox_id>` to `sandbox exec` or `sandbox connect`.
When `npx sandbox` / `sandbox` works, collect read-only evidence:
```bash
sandbox exec "$SANDBOX_ID" sh -lc '
set -eu
echo "== process =="
ps -eo pid,ppid,comm,args | grep -E "[o]penclaw|[n]ode" || true
echo "== ports =="
(ss -ltnp || netstat -ltnp) 2>/dev/null | grep -E ":(3000|8787)\\b" || true
echo "== openclaw logs =="
tail -200 /tmp/openclaw/openclaw-*.log 2>/dev/null || true
'
```
For config, print only shape, never secrets:
```bash
sandbox exec "$SANDBOX_ID" node - <<'NODE'
const fs = require("fs");
const p = "/home/vercel-sandbox/.openclaw/openclaw.json";
const j = JSON.parse(fs.readFileSync(p, "utf8"));
console.log(JSON.stringify({
channels: Object.keys(j.channels || {}),
pluginAllow: j.plugins?.allow || null,
hasSlack: Boolean(j.channels?.slack),
hasTelegram: Boolean(j.channels?.telegram),
hasDiscord: Boolean(j.channels?.discord),
hasWhatsApp: Boolean(j.channels?.whatsapp),
}, null, 2));
NODE
```
If CLI access fails because the ID is incompatible or auth cannot resolve the sandbox, save the CLI failure and switch to the app admin SSH/exec fallback. The fallback must run the same read-only probes.
## Workflow State Probe
The workflow lane must inspect Vercel Workflow state before treating the workflow path as healthy or broken:
```bash
npx workflow inspect runs --backend vercel > "$ART/workflow/runs.txt" 2> "$ART/workflow/runs.stderr.txt"
```
Correlate by channel, incident time, request ID, delivery ID, platform event ID, and `drainChannelWorkflow`.
Collect workflow run ID, status, current/last step, retry count or repeated step evidence, terminal error, whether native forward attempts happened, and whether `channels.forward_outcome` was written.
If `npx workflow` is unavailable or cannot inspect Vercel state, save the failure and fall back to Vercel logs filtered to workflow/channel events, `/api/admin/logs`, and `/api/channels/summary.lastForward`. Do not use local workflow data as evidence for production behavior.
## Prior Fix Comparison
Before proposing a patch, compare current evidence to known prior fixes and failure signatures:
| Prior fix / failure mode | Current matching evidence | Current contradicting evidence | Match? |
|---|---|---|---|
| openclaw-42 zero-plugin gateway wedge | | | |
| stale sandbox public URL / sandbox-not-listening | | | |
| workflow retry exhaustion | | | |
| route-ready vs native-accepted mismatch | | | |
| recent accepted-forward override of stale configSync failure | | | |
| channel-specific signature/raw-body/secret issue | | | |
A repeated operator-facing message is not enough. Match on runtime signals: status codes, route probes, process/log evidence, Workflow state, and `lastForward`.
## Fallback When Live Access Is Unavailable
If you cannot call the deployed endpoints, produce a static code-path audit and mark runtime edges `unknown`. Do not claim runtime behavior from code alone.
More from vercel-labs/vercel-openclaw
- admin-ui-debugAdmin UI and operator surface debugging for vercel-openclaw: command shell design, admin actions, request core, status panels, launch verification UI, channel readiness UI, and local read-only production-data workflows. Use when the root admin UI, controls, visual state, or operator copy is wrong.
- auth-store-debugAuth and store debugging for vercel-openclaw: admin-secret mode, Sign in with Vercel, session cookies, CSRF, LOCAL_READ_ONLY, Redis vs memory store, keyspace namespacing, and metadata shape migrations. Use when login, route authorization, Redis persistence, or metadata state is suspect.
- channel-forward-parityWebhook route parity audit for channel delivery changes: ensure terminal paths log, record lastForward, classify failures, and refresh stale sandbox port URLs.
- cron-watchdog-debugCron and watchdog debugging for vercel-openclaw: Vercel Cron auth, persisted OpenClaw jobs, cron wake keys, token refresh, restore oracle, hot spare, and watchdog reports. Use when scheduled OpenClaw jobs fail to wake or run, watchdog status is wrong, cron persistence is suspect, or /api/cron/watchdog behavior changes.
- discord-deliveryDiscord channel specialist workflow: debug interaction webhooks, Ed25519 signatures, deferred replies, workflow forwarding to /discord-webhook, integration reconcile, and token expiry.
- firewall-ai-gateway-debugFirewall and Vercel AI Gateway debugging for vercel-openclaw: network policy allowlists, OIDC token refresh, AI Gateway transform rules, firewall learning/enforcement, and sandbox.update networkPolicy calls. Use when model calls, egress, token refresh, or firewall policy application fails.
- gateway-proxy-debugGateway and proxy debugging for vercel-openclaw: /gateway routing, HTML injection, WebSocket rewrite, gateway-token handoff, waiting page, status heartbeat, sandbox port URL cache, and proxy auth. Use when the OpenClaw UI, WebSockets, gateway proxying, or waiting-page flow breaks.
- lat-md>-
- launch-verify-debugLaunch verification and remote smoke debugging for vercel-openclaw: preflight, queue ping, ensureRunning, chatCompletions, wakeFromSleep, restorePrepared, channelReadiness, NDJSON progress, and vclaw create readiness. Use when launch verification, vclaw create validation, or remote smoke checks fail.
- openclaw-bootstrap-debugOpenClaw bootstrap, bundle, config, and restore-asset debugging for vercel-openclaw: openclaw.bundle sidecars, plugin discovery, channel catalog, restart scripts, config hashes, dynamic resume files, and fast restore. Use when setup, gateway boot, plugin loading, or bundle-sidecar compatibility fails.