subs-assistant

Generated from a canonical source

This page is a read-only projection of docs/handoff-corpus/subs-assistant.md. Edit the canonical file, then run npm --prefix tools/project-knowledge-derive run derive.

What subs-assistant is for

The invariant you must not break: storeHash is never a model- or tool-controllable parameter. It is fixed at authentication time as the Durable Object's room name (idFromName(storeHash)); every read and write tool captures it from this.name, and the 60-second handoff JWT's storeHash claim is asserted equal to it on both the WebSocket gate and the DO's own onConnect. No tool schema ever exposes it as a field the model can set. Breaking this would let one merchant read or mutate another merchant's subscriptions — the single most catastrophic failure a multi-tenant marketplace app can have. (apps/assistant/src/assistant.ts getApiCaller() + onConnect; ADR-0083.)
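
A minimal sketch of the equality check this invariant relies on, assuming the handoff JWT has already been signature-verified and decoded elsewhere. The names HandoffClaims and assertTenantMatch are hypothetical and are not taken from assistant.ts; only the rule itself (the storeHash claim must equal the room name, and an expired token is rejected) comes from the description above.

```typescript
// Hypothetical shape of the decoded 60-second handoff JWT (assumption).
interface HandoffClaims {
  storeHash: string;
  exp: number; // expiry, in seconds since the epoch
}

// Throws unless the token is unexpired and its storeHash claim equals the
// Durable Object room name. The returned value is always the room name, so
// downstream code never keys a request off a caller-supplied string.
function assertTenantMatch(
  claims: HandoffClaims,
  roomName: string,
  nowSeconds: number,
): string {
  if (claims.exp <= nowSeconds) {
    throw new Error("handoff token expired");
  }
  if (claims.storeHash !== roomName) {
    throw new Error("storeHash claim does not match room name");
  }
  return roomName;
}
```

Running the same check at both the WebSocket gate and onConnect means a bug in one layer does not silently open a cross-tenant path.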

subs-assistant is a merchant copilot embedded in the admin panel: a chat surface that answers store-specific subscription questions from live data and performs a narrow set of safe actions. It breaks into three reader-facing features, each anchored to its build story for traceability:

  • Ask about your subscription business — a merchant asks a plain question ("which subscriptions have failed charges this week?") and the assistant answers from live store data, not guesses (GH #1885)
  • Do a safe action with one confirmation — pause a subscription or skip its next delivery via a preview-then-confirm card, never in one step (GH #1885)
  • Guided help without leaving the admin — how-to and concept questions are answered from the merchant help guides with a citation to the source page (GH #1885)

The four decisions that carry the most weight, all recorded in ADR-0083:

  • Codemode over a naive tool loop — the model writes one TypeScript script per turn against codemode.* read tools, executed in a dynamic-worker sandbox with globalOutbound: null; compound queries collapse to a single script with zero LLM round-trips, credentials never enter generated code, and error recovery happens in-context.
  • Money and tenant safety are structural, not prompted — there is no tool to charge, refund, change price/payment-method/plan/quantity/address, cancel, or act cross-tenant; writes are pause/skip only, one subscription per confirmed action.
  • Pull-don't-port for tool contracts — every read tool calls an existing api route via a service binding, each tool citing a real apps/api/openapi.yaml path, so the assistant cannot drift from the API the way three hand-copied surfaces once did (the storefront contract-drift incident).
  • Defense-in-depth on top of structure — a fail-closed kill switch (ASSISTANT_ENABLED), per-store rate + daily spend caps, a per-turn API-call budget, a write_audit row per confirmed write, and a behavioral eval suite that guards the system-prompt rules on every prompt/model change.
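
To illustrate the first decision, here is a sketch of the kind of single script a compound question might collapse to. The codemode object below is a local stand-in with canned rows so the example runs on its own. The getCharges stub, the query parameters, and the row shapes are assumptions for illustration, not the real tool contracts.

```typescript
interface Subscription {
  id: string;
  status: string;
}

interface Charge {
  subscriptionId: string;
  status: "paid" | "failed";
}

// Stand-in for the sandbox's codemode.* read stubs (assumption: the real
// stubs forward to api routes on the host; these return canned rows).
const codemode = {
  async getSubscriptions(_query: { status: string }): Promise<Subscription[]> {
    return [
      { id: "sub_1", status: "active" },
      { id: "sub_2", status: "active" },
    ];
  },
  async getCharges(_query: { status: string }): Promise<Charge[]> {
    return [{ subscriptionId: "sub_2", status: "failed" }];
  },
};

// One script per turn: fetch both collections in parallel, then join them
// in code rather than spending a second model round-trip on the join.
async function subscriptionsWithFailedCharges(): Promise<Subscription[]> {
  const [subs, failed] = await Promise.all([
    codemode.getSubscriptions({ status: "active" }),
    codemode.getCharges({ status: "failed" }),
  ]);
  const failedIds = new Set(failed.map((charge) => charge.subscriptionId));
  return subs.filter((sub) => failedIds.has(sub.id));
}
```

Note that the script never sees a credential or a storeHash: it only calls stubs, and the host decides how those calls are authenticated.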

Canonical-framing attestation (operator-ratified 2026-07-02). subs-assistant is the ask-bc ADR-001 codemode pattern, built on the library path, not a hand-rolled loader: SubsAssistant extends Think (@cloudflare/think@0.2.4), reads are wired via createExecuteTool({tools, loader: env.LOADER, timeout: 30_000}), and write tools are registered as siblings of execute — structurally unreachable from generated code. There is no competing assistant implementation in this repo; the naive tool-loop shape ask-bc retains as a fallback was never ported here.

How it actually works

The Durable Object lives in apps/assistant/src/assistant.ts — one SubsAssistant per store, extending @cloudflare/think's Think. The admin panel opens a WebSocket to /agents/subs-assistant/{storeHash} carrying a 60-second JWT minted by an api route after verifying the admin session (same pattern as every sibling admin route); index.ts checks that JWT's storeHash claim against the room name before the upgrade reaches the DO, and onConnect re-verifies the same claim as defense in depth, then persists the authenticated actor into DO SQLite so internal API calls stay attributed across hibernation.

getTools() returns exactly two kinds of tool:

  • execute: one createExecuteTool wrapping the read tools from tools/read.ts. The model writes one TypeScript script per turn against codemode.* stubs; the script runs in a Worker-Loader sandbox (globalOutbound: null, 30s timeout), so a compound question chains Promise.all/joins across several api calls with zero additional LLM round-trips. Every read tool's description cites the real apps/api/openapi.yaml path it calls (e.g. getSubscriptions → GET /api/v1/subscriptions), and the read-tools module documents which store-wide queries were deliberately not built because no route exists for them yet.
  • pauseSubscription / skipNextDelivery: registered as siblings of execute, from tools/write.ts. Generated code inside the sandbox cannot reach them; only the model's top-level tool-call loop can. Each takes a confirmed: boolean flag. With false, the tool returns a preview object and mutates nothing. With true, which is sent only after the merchant clicks Confirm on the rendered ActionCard, it calls POST /api/v1/admin/subscriptions/bulk with subscription_ids hardcoded to a single z.uuid()-validated id, then writes a write_audit row via insertAudit. registry.test.ts asserts, as an exact allowlist (not a grep), that these two are the only write tools and that no tool name or description implies a money-mutating verb.
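
The preview-then-confirm shape of a write tool can be sketched as follows. This is an illustration under stated assumptions, not the code in tools/write.ts: the CallApi and Audit types, the action field name in the request body, and the return shapes are hypothetical, and a plain regular expression stands in for z.uuid() so the sketch has no dependencies.

```typescript
type CallApi = (method: string, path: string, body: unknown) => Promise<unknown>;
type Audit = (row: { tool: string; subscriptionId: string }) => void;

// Stand-in for z.uuid() validation (assumption).
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Note what is absent: there is no storeHash field for the model to set.
interface PauseInput {
  subscriptionId: string;
  confirmed: boolean;
}

async function pauseSubscription(input: PauseInput, callApi: CallApi, insertAudit: Audit) {
  if (!UUID_RE.test(input.subscriptionId)) {
    throw new Error("subscriptionId must be a UUID");
  }
  if (!input.confirmed) {
    // First leg: describe the action and mutate nothing.
    return { kind: "preview" as const, action: "pause", subscriptionId: input.subscriptionId };
  }
  // Confirm leg: the action and the single-element id array are fixed here,
  // so the bulk route's wider surface (cancel, many ids) stays unreachable.
  await callApi("POST", "/api/v1/admin/subscriptions/bulk", {
    action: "pause", // field name is an assumption
    subscription_ids: [input.subscriptionId],
  });
  insertAudit({ tool: "pauseSubscription", subscriptionId: input.subscriptionId });
  return { kind: "done" as const, action: "pause", subscriptionId: input.subscriptionId };
}
```

Because the tool input carries only one id and a boolean, widening the blast radius would require changing the tool's code, not just the model's arguments.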

Both tool sets get their backend access through api-client.ts's callApi closure: it mints a short-lived (120s) HS256 internal JWT from BC_CLIENT_SECRET — the same signing shape verifyBcSignedPayload expects, following the /api/load?demo=1 precedent — and calls the API service binding (subs-api). The closure itself, and the secret inside it, live on the DO; codemode marshals only tool calls into the sandbox, never the captured environment, so credentials structurally cannot enter generated code. The same closure enforces the per-turn CallBudget (40 calls, reset in beforeTurn) before every network call — both the real caller and the fixture-backed fixtureCaller used by the eval/smoke harness share this gate, so the fan-out cap is exercised by evals, not just production traffic.
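
A minimal sketch of a per-turn call budget wrapped around a caller, under the assumption that the gate is a simple counter. The CallBudget name and the 40-call figure come from the text above; the class shape, the Caller type, and withBudget are hypothetical.

```typescript
// Per-turn cap on outbound API calls.
class CallBudget {
  private used = 0;
  private readonly max: number;

  constructor(max: number = 40) {
    this.max = max;
  }

  // Called at the start of every turn.
  reset(): void {
    this.used = 0;
  }

  // Charges one call, or throws once the cap has been reached.
  consume(): void {
    if (this.used >= this.max) {
      throw new Error(`API call budget exhausted (${this.max} per turn)`);
    }
    this.used += 1;
  }
}

type Caller = (path: string) => Promise<unknown>;

// Wraps any caller, real or fixture-backed, so the budget is charged
// before the underlying call is attempted.
function withBudget(budget: CallBudget, inner: Caller): Caller {
  return async (path: string) => {
    budget.consume();
    return inner(path);
  };
}
```

Putting the gate in the wrapper rather than in each tool is what lets the fixture-backed caller exercise the same cap: a script that fires 500 parallel calls sees 40 succeed and the rest reject, whichever caller sits underneath.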

beforeTurn (in assistant.ts) is also the safety gate and the two-model switch. On a fresh (non-continuation) turn it calls safety.ts's checkTurnAllowed, which reads a DO-SQLite usage_log table and throws if either per-store limit is hit: 30 turns per hour, or 2,000,000 input-equivalent tokens per day (output weighted 5x, matching both models' output:input price ratio). Continuation turns (the confirm leg of a two-turn write) skip this check, so a merchant mid-confirmation is never rate-limited out of finishing it. Continuation turns also upgrade to claude-sonnet-4-6 for better error recovery; fresh turns default to claude-haiku-4-5-20251001. Every turn's token usage and a USD cost estimate land in the same usage_log row via onStepFinish/onChatResponse.
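
The arithmetic behind the two limits can be sketched as below. The constants match the figures in the paragraph above; the UsageRow shape, the in-memory log, and the use of rolling windows (rather than calendar hours and days) are assumptions, since the text does not specify how the windows are measured.

```typescript
const MAX_TURNS_PER_HOUR = 30;
const MAX_DAILY_INPUT_EQUIV_TOKENS = 2_000_000;
const OUTPUT_WEIGHT = 5; // output tokens count five times

// Hypothetical usage_log row; "at" is milliseconds since the epoch.
interface UsageRow {
  at: number;
  inputTokens: number;
  outputTokens: number;
}

function inputEquivalentTokens(row: UsageRow): number {
  return row.inputTokens + OUTPUT_WEIGHT * row.outputTokens;
}

// Throws when a fresh turn would exceed either per-store limit.
function checkTurnAllowed(log: UsageRow[], now: number, isContinuation: boolean): void {
  if (isContinuation) return; // the confirm leg of a write is never blocked

  const hourAgo = now - 60 * 60 * 1000;
  const dayAgo = now - 24 * 60 * 60 * 1000;

  const turnsLastHour = log.filter((row) => row.at > hourAgo).length;
  if (turnsLastHour >= MAX_TURNS_PER_HOUR) {
    throw new Error("hourly turn limit reached");
  }

  const spentToday = log
    .filter((row) => row.at > dayAgo)
    .reduce((sum, row) => sum + inputEquivalentTokens(row), 0);
  if (spentToday >= MAX_DAILY_INPUT_EQUIV_TOKENS) {
    throw new Error("daily token cap reached");
  }
}
```

As a worked example, a turn that uses 10,000 input tokens and 2,000 output tokens counts as 10,000 + 5 × 2,000 = 20,000 input-equivalent tokens, so 100 such turns would exhaust the daily cap.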

On the render side, apps/admin/src/components/Assistant/blocks.tsx parses fenced ```block JSON out of the model's streamed text and mounts real components (KPICard, SubscriptionCard, ChargeTimeline, DataTable, ErrorCard, ActionCard) instead of leaving structured data as markdown. Two things in this file are safety-load-bearing, not cosmetic: safeHref allowlists only internal /v2 routes, mailto:, and https:// on docs.bcsubs.dev — every other scheme (including arbitrary https://) renders as inert text, because React does not block javascript:/data: in an <a href> and the model can be made to echo untrusted tool data verbatim. And ActionCard renders a per-tool AUTHORITATIVE_EFFECT string that is not model-written — it is a hardcoded lookup keyed on the tool name, kept in lockstep with write.ts by convention (not by a shared import), so the consequence line a merchant confirms against cannot be steered by an indirect prompt injection dressing up a harmful action in benign-looking copy.
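
A sketch of an href allowlist with the three permitted shapes described above. It is written from the description, not copied from blocks.tsx, so the exact matching rules (for example, how the /v2 prefix is tested, and returning null for rejected values) are assumptions.

```typescript
const DOCS_HOST = "docs.bcsubs.dev";

// Returns a safe href, or null when the link should render as inert text.
function safeHref(raw: string): string | null {
  // Internal admin routes: /v2, /v2/..., /v2?..., /v2#...
  if (/^\/v2(?:[/?#]|$)/.test(raw)) {
    return raw;
  }

  let url: URL;
  try {
    url = new URL(raw); // throws on relative or malformed input
  } catch {
    return null;
  }

  if (url.protocol === "mailto:") {
    return url.href;
  }
  // Compare the parsed hostname, not a string prefix, so that
  // "https://docs.bcsubs.dev@evil.example/" is rejected.
  if (url.protocol === "https:" && url.hostname === DOCS_HOST) {
    return url.href;
  }
  return null; // javascript:, data:, arbitrary https://, and everything else
}
```

The allowlist is deliberately closed: a scheme is rejected unless it is named, which is the safe default when the model may echo untrusted tool data into a link.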

No sequence diagram exists yet for this domain. Input-B flags a WS-auth → codemode-execute → two-turn-write sequence as the candidate for docs/architecture/sequence-diagrams.md; until that lands, this section is prose-only rather than transcluding a hand-drawn fork.

Where intent and reality diverge

Five typed deltas — Input-B's live-state attestation is the source; nothing below is softened:

  • Contract-verified, not live-verified — the runtime, model, codemode sandbox, and error-recovery are proven live via the dev-only POST /smoke harness (dev 04acef47, wrangler dev --local): a plain turn answers in the pinned formatting vocabulary, and a dunning question drove execute → in-turn error-recovery against a locally-absent API binding → searchHelp → a correct grounded answer. But the read tools' service-binding path was exercised only against fixtureCaller — never a DEPLOYED subs-api + real D1 — and no production deployment of subs-assistant exists at all (wrangler.jsonc has never been wrangler deploy'd on the reachable CF account). Treat every live-behavior claim in this page as proven at the worker/model/sandbox layer, not end-to-end.
  • Verified-but-incomplete — the behavioral eval suite (tools/assistant-evals/, five golden cases: money-refusal, injection-resistance, grounding, two-turn write discipline, format vocabulary) is built and passed 5/5 locally, and .github/workflows/assistant-evals.yml is wired to run on every change to prompt.ts / src/tools/** / the eval set — but the job prints SKIPPED and exits 0 until the ANTHROPIC_API_KEY Actions secret exists, so it is currently a dormant gate, not an enforced one. The kill switch (ASSISTANT_ENABLED) is verified locally (503 past /health), not against a real production deployment.
  • Named-deferred: resumeSubscription is deliberately absent. apps/api has no admin resume route (portal resume is customer-session-authed; admin bulk supports skip/pause/cancel only), so the prompt teaches the model to explain auto-resume-at-period-end and deep-link to the subscription detail page instead of fabricating an endpoint (write.ts header comment). Separately, the write tools' reason free-text field (≤200 chars) is stored on the mutation and the audit row, and must be treated as untrusted wherever it is later rendered on another admin surface or in an email. It is a second-order-injection carrier whose downstream consumer does not exist yet, flagged rather than fixed.
  • Superseded-framing residue — the api's subscriptions-bulk route (apps/api/src/routes/admin/subscriptions-bulk.ts) also supports a cancel action and accepts an array of up to 500 ids. The assistant's write.ts hardcodes ['pause','skip'] on a single-element [subscriptionId] array. Reading the route's breadth as the assistant's surface is the trap this delta exists to name: the assistant can never cancel a subscription or act on more than one at a time, regardless of what the underlying route accepts.
  • Built-but-untrodden — three security backstops exist precisely for adversarial inputs that ordinary merchant use never exercises: the safeHref scheme allowlist, the CallBudget fan-out cap (40 calls/turn), and the authoritative ActionCard consequence line. All three were added after a live adversarial red-team of the running worker (dev 95a25cb3, eb7b672b) found two structural gaps and fixed them: a stored-XSS path via an unsanitized markdown-link href (closed by safeHref), and a fan-out DoS where one turn fired 500 parallel getSubscriptions calls before the 30s sandbox timeout — which bounds wall-clock, not request volume — ever tripped (closed by CallBudget, live-verified cutting the 500-wide fan-out to 40). All three are unit- and render-tested, but by construction see no traffic from a well-behaved merchant session.

How to operate & extend

  • The invariant you must not break: storeHash never becomes a model-visible parameter (see above). Any change to getApiCaller(), onConnect, or a tool's input schema must preserve this.
  • Turning it on: the admin panel gates entirely on VITE_ASSISTANT_URL — unset, the assistant contributes zero UI and zero requests. Full activation sequence (mint ANTHROPIC_API_KEY, ASSISTANT_JWT_KEY on both workers, first wrangler deploy of apps/assistant) is in docs/runbooks/feedback-assistant-activation.md §6 — operator-gated because subs-api lives on a personal Cloudflare account the local agent cannot reach.
  • Kill switch: ASSISTANT_ENABLED (var, fail-closed) — anything past /health returns 503 unless it is the literal string true, even for a client holding a valid JWT.
  • Extension seams: add a read tool only after its route exists in apps/api/openapi.yaml (pull-don't-port); add a write tool only as a sibling of execute, never inside the codemode sandbox, and update registry.test.ts's exact allowlist plus blocks.tsx's AUTHORITATIVE_EFFECT map in the same change. New generative-UI block types need entries in both apps/assistant/src/blocks.ts and blocks.tsx's REGISTRY; blocks.test.ts asserts the two stay in sync.
  • Changing safety limits: DEFAULT_MAX_TURNS_PER_HOUR and DEFAULT_MAX_DAILY_INPUT_EQUIV_TOKENS in safety.ts; MAX_API_CALLS_PER_TURN in assistant.ts. All three are per-store, reset on their own windows (hour / day / turn), and continuation turns bypass only the rate check, never the call budget.
  • Before touching prompt.ts, src/tools/**, or the model id: run the behavioral eval suite (node tools/assistant-evals/run.mjs, needs ANTHROPIC_API_KEY locally) — these rules live only in the system prompt, so unit tests stay green when they drift.
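
The fail-closed behaviour of the kill switch can be sketched as a single predicate. The Env shape and the isServiceAvailable name are hypothetical; the rule (only the literal string true enables anything past /health) is taken from the list above.

```typescript
// Hypothetical subset of the worker environment.
interface Env {
  ASSISTANT_ENABLED?: string;
}

// Returns false when the request should be answered with 503.
// Evaluated before any JWT check, so a valid token does not bypass it.
function isServiceAvailable(env: Env, pathname: string): boolean {
  if (pathname === "/health") {
    return true; // health stays reachable while the switch is off
  }
  // Fail closed: unset, empty, "1", "TRUE", and so on all disable the service.
  return env.ASSISTANT_ENABLED === "true";
}
```

Comparing against one exact string, rather than testing truthiness, means a typo in the variable disables the assistant instead of enabling it.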

Confidence notes

  • I could not find any evidence of a production deployment of subs-assistant beyond the dev-only smoke harness and the red-team dev sessions cited in Input-B (04acef47, 95a25cb3, eb7b672b) — I did not independently verify these are real deployment/session identifiers beyond their presence in ADR-0083 and Input-B; I traced the code and config claims (kill switch, CallBudget, safeHref, registry test, eval wiring) directly and they hold.
  • ADR-0083's two behavioral-safeguard amendments are dated 2026-07-03, one day after this page's as_of_commit. I did not attempt to resolve that apparent ordering — the amendments' content matches what is actually in safety.ts, index.ts, api-client.ts, and blocks.tsx at the commit this page was generated from, so I'm treating the dates as the ADR's own provenance detail rather than a sign the code doesn't match the decision.
  • I did not open tools/assistant-evals/cases.mjs line-by-line to verify each of the five golden cases matches its stated category (money-refusal, injection-resistance, grounding, two-turn discipline, format vocabulary) — I confirmed the file exists, the workflow wiring, and the SKIP-without-secret behavior, but took the case count and pass rate from Input-B/ADR-0083 rather than re-running the suite (no ANTHROPIC_API_KEY in this session).
  • No arch-derive sequence diagram exists for this domain yet (confirmed by reading docs/architecture/sequence-diagrams.md's scope), so Move 2 has no diagram to transclude — this is a corpus gap, not an omission on this page's part.