subs-assistant
Generated from a canonical source
This page is a read-only projection of docs/handoff-corpus/subs-assistant.md.
Edit the canonical file, then run npm --prefix tools/project-knowledge-derive run derive.
What subs-assistant is for
The invariant you must not break: storeHash is never a model- or
tool-controllable parameter. It is fixed at authentication time as the
Durable Object's room name (idFromName(storeHash)); every read and write
tool captures it from this.name, and the 60-second handoff JWT's
storeHash claim is asserted equal to it on both the WebSocket gate and the
DO's own onConnect. No tool schema ever exposes it as a field the model can
set. Breaking this would let one merchant read or mutate another merchant's
subscriptions — the single most catastrophic failure a multi-tenant
marketplace app can have. (apps/assistant/src/assistant.ts getApiCaller()
+ onConnect; ADR-0083.)
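The closure-capture half of this invariant can be sketched as follows. This is a minimal illustration, not the code in assistant.ts: makeGetSubscriptionsTool, ApiCall, and the status/limit query keys are hypothetical names chosen for the example. Only the GET /api/v1/subscriptions path comes from this page.

```typescript
// Hypothetical shapes for illustration; none of these names come from the real code.
interface ApiCall {
  storeHash: string;
  path: string;
  query: Record<string, string>;
}

// The tenant id is bound once, when the per-store object builds its tools.
function makeGetSubscriptionsTool(
  storeHash: string,
  send: (call: ApiCall) => void,
): (modelInput: Record<string, unknown>) => void {
  // Only these (assumed) keys are copied from model-supplied input.
  // storeHash is deliberately not one of them.
  const allowedKeys = ["status", "limit"];
  return (modelInput) => {
    const query: Record<string, string> = {};
    for (const key of allowedKeys) {
      const value = modelInput[key];
      if (value !== undefined) query[key] = String(value);
    }
    // storeHash comes from the closure, never from modelInput.
    send({ storeHash, path: "/api/v1/subscriptions", query });
  };
}
```

Even if the model supplies a storeHash key in its input, the tool ignores it: the only tenant id that reaches the outgoing call is the one captured when the tool was built.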
subs-assistant is a merchant copilot embedded in the admin panel: a chat surface that answers store-specific subscription questions from live data and performs a narrow set of safe actions. It breaks into three reader-facing features, each anchored to its build story for traceability:
- Ask about your subscription business — a merchant asks a plain question ("which subscriptions have failed charges this week?") and the assistant answers from live store data, not guesses (GH #1885)
- Do a safe action with one confirmation — pause a subscription or skip its next delivery via a preview-then-confirm card, never in one step (GH #1885)
- Guided help without leaving the admin — how-to and concept questions are answered from the merchant help guides with a citation to the source page (GH #1885)
The four decisions that carry the most weight, all recorded in ADR-0083:
- Codemode over a naive tool loop: the model writes one TypeScript script per turn against codemode.* read tools, executed in a dynamic-worker sandbox with globalOutbound: null; compound queries collapse to a single script with zero additional LLM round-trips, credentials never enter generated code, and error recovery happens in-context.
- Money and tenant safety are structural, not prompted: there is no tool to charge, refund, change price/payment-method/plan/quantity/address, cancel, or act cross-tenant; writes are pause/skip only, one subscription per confirmed action.
- Pull-don't-port for tool contracts: every read tool calls an existing api route via a service binding, each tool citing a real apps/api/openapi.yaml path, so the assistant cannot drift from the API the way three hand-copied surfaces once did (the storefront contract-drift incident).
- Defense-in-depth on top of structure: a fail-closed kill switch (ASSISTANT_ENABLED), per-store rate + daily spend caps, a per-turn API-call budget, a write_audit row per confirmed write, and a behavioral eval suite that guards the system-prompt rules on every prompt/model change.
Canonical-framing attestation (operator-ratified 2026-07-02).
subs-assistant is the ask-bc ADR-001 codemode pattern, built on the library
path, not a hand-rolled loader: SubsAssistant extends Think
(@cloudflare/think@0.2.4), reads are wired via
createExecuteTool({tools, loader: env.LOADER, timeout: 30_000}), and write
tools are registered as siblings of execute — structurally unreachable from
generated code. There is no competing assistant implementation in this repo;
the naive tool-loop shape ask-bc retains as a fallback was never ported here.
How it actually works
The Durable Object lives in
apps/assistant/src/assistant.ts —
one SubsAssistant per store, extending @cloudflare/think's Think. The
admin panel opens a WebSocket to /agents/subs-assistant/{storeHash} carrying
a 60-second JWT minted by an api route after verifying the admin session
(same pattern as every sibling admin route); index.ts checks that JWT's
storeHash claim against the room name before the upgrade reaches the DO,
and onConnect re-verifies the same claim as defense in depth, then persists
the authenticated actor into DO SQLite so internal API calls stay attributed
across hibernation.
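The claim check that runs at both the WebSocket gate and onConnect might look roughly like this. It is a sketch under assumptions: it covers only the comparison that follows signature verification, and HandoffClaims, roomNameFromPath, and claimsMatchRoom are hypothetical names rather than identifiers from index.ts or assistant.ts.

```typescript
// Hypothetical claim shape; the real token may carry more fields.
interface HandoffClaims {
  storeHash: string;
  exp: number; // expiry, in seconds since the Unix epoch
}

// Extract the room name from /agents/subs-assistant/{storeHash}.
function roomNameFromPath(pathname: string): string | null {
  const match = /^\/agents\/subs-assistant\/([^/]+)$/.exec(pathname);
  return match?.[1] ?? null;
}

// Intended to run after the JWT signature has been verified,
// once before the upgrade and once more inside the Durable Object.
function claimsMatchRoom(
  claims: HandoffClaims,
  roomName: string,
  nowSeconds: number,
): boolean {
  if (claims.exp <= nowSeconds) return false; // token has expired
  return claims.storeHash === roomName; // tenant must equal the room
}
```

Running the same pure check in two places is what makes the second one defense in depth: a routing bug that lets a request skip the gate still cannot connect to another store's room.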
getTools() returns exactly two kinds of tool:
- execute: one createExecuteTool wrapping the read tools from tools/read.ts. The model writes one TypeScript script per turn against codemode.* stubs; the script runs in a Worker-Loader sandbox (globalOutbound: null, 30s timeout), so a compound question chains Promise.all/joins across several api calls with zero additional LLM round-trips. Every read tool's description cites the real apps/api/openapi.yaml path it calls (e.g. getSubscriptions → GET /api/v1/subscriptions), and the read-tools module documents which store-wide queries were deliberately not built because no route exists for them yet.
- pauseSubscription / skipNextDelivery: registered as siblings of execute, from tools/write.ts. Generated code inside the sandbox cannot reach them; only the model's top-level tool-call loop can. Each takes confirmed: boolean. false returns a preview object and mutates nothing; true (only sent after the merchant clicks Confirm on the rendered ActionCard) calls POST /api/v1/admin/subscriptions/bulk with subscription_ids hardcoded to a single z.uuid()-validated id, then writes a write_audit row via insertAudit. registry.test.ts asserts, as an exact allowlist (not a grep), that these two are the only write tools and that no tool name or description implies a money-mutating verb.
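The preview-then-confirm discipline of the write tools can be sketched as below. Several details are assumptions made for the example: the action field name in the request body, the WriteDeps injection, the synchronous calls, and the regular expression standing in for z.uuid(). The real tools in tools/write.ts may be shaped differently.

```typescript
// Assumed request body: only subscription_ids is named on this page.
interface BulkBody {
  action: "pause" | "skip";
  subscription_ids: [string]; // a one-element tuple, never a longer array
}
interface AuditRow { tool: string; subscriptionId: string }
interface WriteDeps {
  postBulk: (body: BulkBody) => void;
  insertAudit: (row: AuditRow) => void;
}
interface WriteResult {
  kind: "preview" | "done";
  tool: string;
  subscriptionId: string;
}

// Stand-in for z.uuid(); a simplified pattern for illustration only.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function pauseSubscription(
  deps: WriteDeps,
  input: { subscriptionId: string; confirmed: boolean },
): WriteResult {
  const { subscriptionId, confirmed } = input;
  if (!UUID_RE.test(subscriptionId)) {
    throw new Error("subscriptionId must be a UUID");
  }
  if (!confirmed) {
    // First turn: describe the action, mutate nothing.
    return { kind: "preview", tool: "pauseSubscription", subscriptionId };
  }
  // Second turn: exactly one id, a fixed action, then an audit row.
  deps.postBulk({ action: "pause", subscription_ids: [subscriptionId] });
  deps.insertAudit({ tool: "pauseSubscription", subscriptionId });
  return { kind: "done", tool: "pauseSubscription", subscriptionId };
}
```

Typing subscription_ids as a one-element tuple means a multi-id call fails to compile, which mirrors the structural (rather than prompted) limit of one subscription per confirmed action.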
Both tool sets get their backend access through
api-client.ts's callApi
closure: it mints a short-lived (120s) HS256 internal JWT from
BC_CLIENT_SECRET — the same signing shape verifyBcSignedPayload expects,
following the /api/load?demo=1 precedent — and calls the API service
binding (subs-api). The closure itself, and the secret inside it, live on
the DO; codemode marshals only tool calls into the sandbox, never the
captured environment, so credentials structurally cannot enter generated
code. The same closure enforces the per-turn CallBudget (40 calls, reset in
beforeTurn) before every network call — both the real caller and the
fixture-backed fixtureCaller used by the eval/smoke harness share this gate,
so the fan-out cap is exercised by evals, not just production traffic.
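A per-turn call budget of this kind can be sketched in a few lines. The CallBudget name and the limit of 40 come from this page; the class body, the withBudget wrapper, and the synchronous caller signature are assumptions for illustration.

```typescript
class CallBudget {
  private used = 0;
  private readonly max: number;

  constructor(max: number) {
    this.max = max;
  }

  // Called at the start of each turn (beforeTurn in the real worker).
  reset(): void {
    this.used = 0;
  }

  // Called before every network call; throws once the budget is spent.
  spend(): void {
    if (this.used >= this.max) {
      throw new Error(`API call budget of ${this.max} exhausted for this turn`);
    }
    this.used += 1;
  }
}

// Wrap any caller (real or fixture-backed) so both share the same gate.
function withBudget<A, R>(
  budget: CallBudget,
  caller: (arg: A) => R,
): (arg: A) => R {
  return (arg) => {
    budget.spend(); // checked before the underlying call is made
    return caller(arg);
  };
}
```

Because the check sits in the wrapper rather than in either caller, a 500-wide fan-out from one generated script reaches the backend at most 40 times, whichever caller is plugged in.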
beforeTurn (in assistant.ts) is also the safety gate and the two-model
switch. On a fresh (non-continuation) turn it calls safety.ts's
checkTurnAllowed, which reads a DO-SQLite usage_log table and throws if
either per-store limit is hit: 30 turns per hour, or 2,000,000
input-equivalent tokens per day (output tokens weighted 5x, matching both
models' output:input price ratio). Continuation turns (the confirm leg of a
two-turn write) skip this check, so a merchant mid-confirmation is never
rate-limited out of finishing it; they also upgrade to claude-sonnet-4-6 for
better error recovery, while fresh turns default to
claude-haiku-4-5-20251001. Every turn's token usage and a USD cost estimate
land in the same usage_log row via onStepFinish/onChatResponse.
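The two limits and the 5x output weighting can be worked through in a small sketch. The constant names and values are the ones this page cites for safety.ts; the UsageRow shape, the in-memory array standing in for the usage_log table, and the rolling-window interpretation of "hour" and "day" are assumptions.

```typescript
// Hypothetical row shape standing in for the DO-SQLite usage_log table.
interface UsageRow {
  atMs: number; // when the turn ran, in milliseconds since the epoch
  inputTokens: number;
  outputTokens: number;
}

const DEFAULT_MAX_TURNS_PER_HOUR = 30;
const DEFAULT_MAX_DAILY_INPUT_EQUIV_TOKENS = 2_000_000;
const OUTPUT_WEIGHT = 5; // one output token counts as five input tokens

function inputEquivalentTokens(row: UsageRow): number {
  return row.inputTokens + OUTPUT_WEIGHT * row.outputTokens;
}

// Throws if a fresh turn should be refused. Assumes rolling windows;
// the real implementation may use calendar windows instead.
function checkTurnAllowed(log: UsageRow[], nowMs: number): void {
  const hourAgo = nowMs - 60 * 60 * 1000;
  const dayAgo = nowMs - 24 * 60 * 60 * 1000;

  const turnsLastHour = log.filter((r) => r.atMs > hourAgo).length;
  if (turnsLastHour >= DEFAULT_MAX_TURNS_PER_HOUR) {
    throw new Error("hourly turn limit reached");
  }

  const tokensLastDay = log
    .filter((r) => r.atMs > dayAgo)
    .reduce((sum, r) => sum + inputEquivalentTokens(r), 0);
  if (tokensLastDay >= DEFAULT_MAX_DAILY_INPUT_EQUIV_TOKENS) {
    throw new Error("daily spend cap reached");
  }
}
```

As a worked example of the weighting, a turn that used 1,000,000 input tokens and 200,000 output tokens counts as 1,000,000 + 5 × 200,000 = 2,000,000 input-equivalent tokens, which alone exhausts the daily cap.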
On the render side,
apps/admin/src/components/Assistant/blocks.tsx
parses fenced ```block JSON out of the model's streamed text and mounts
real components (KPICard, SubscriptionCard, ChargeTimeline, DataTable,
ErrorCard, ActionCard) instead of leaving structured data as markdown.
Two things in this file are safety-load-bearing, not cosmetic: safeHref
allowlists only internal /v2 routes, mailto:, and https:// on
docs.bcsubs.dev — every other scheme (including arbitrary https://)
renders as inert text, because React does not block javascript:/data: in
an <a href> and the model can be made to echo untrusted tool data verbatim.
And ActionCard renders a per-tool AUTHORITATIVE_EFFECT string that is
not model-written — it is a hardcoded lookup keyed on the tool name, kept
in lockstep with write.ts by convention (not by a shared import), so the
consequence line a merchant confirms against cannot be steered by an indirect
prompt injection dressing up a harmful action in benign-looking copy.
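Both safety-load-bearing pieces can be illustrated with a short sketch. The function and map names follow this page, but the bodies are assumptions: the regular expressions are one possible way to express the allowlist, and the effect strings are placeholder copy, not the text in blocks.tsx.

```typescript
// Returns the href if it is on the allowlist, otherwise null so the
// caller can render the link as inert text.
function safeHref(href: string): string | null {
  const value = href.trim();
  // Internal admin routes under /v2 (a single leading slash only).
  if (/^\/v2([/?#]|$)/.test(value)) return value;
  // Email links.
  if (/^mailto:/i.test(value)) return value;
  // https on the docs host only; the character after the host must end
  // the authority, which rejects look-alike hosts and userinfo tricks.
  if (/^https:\/\/docs\.bcsubs\.dev([/?#]|$)/i.test(value)) return value;
  return null;
}

// Placeholder copy: the real strings are hardcoded in blocks.tsx.
const AUTHORITATIVE_EFFECT = new Map<string, string>([
  ["pauseSubscription", "Pauses this one subscription."],
  ["skipNextDelivery", "Skips the next delivery of this one subscription."],
]);

// The consequence line is looked up by tool name only. Model-written
// text is not an input, so it cannot change what the merchant confirms.
function effectLine(toolName: string): string | null {
  return AUTHORITATIVE_EFFECT.get(toolName) ?? null;
}
```

An allowlist is used rather than a blocklist of dangerous schemes, so a scheme nobody thought to list is rejected by default.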
No sequence diagram exists yet for this domain. Input-B flags a
WS-auth → codemode-execute → two-turn-write sequence as the candidate for
docs/architecture/sequence-diagrams.md; until that lands, this section is
prose-only rather than transcluding a hand-drawn fork.
Where intent and reality diverge
Five typed deltas — Input-B's live-state attestation is the source; nothing below is softened:
- Contract-verified, not live-verified: the runtime, model, codemode sandbox, and error-recovery are proven live via the dev-only POST /smoke harness (dev 04acef47, wrangler dev --local). A plain turn answers in the pinned formatting vocabulary, and a dunning question drove execute → in-turn error-recovery against a locally-absent API binding → searchHelp → a correct grounded answer. But the read tools' service-binding path was exercised only against fixtureCaller, never a DEPLOYED subs-api + real D1, and no production deployment of subs-assistant exists at all (wrangler.jsonc has never been wrangler deploy'd on the reachable CF account). Treat every live-behavior claim in this page as proven at the worker/model/sandbox layer, not end-to-end.
- Verified-but-incomplete: the behavioral eval suite (tools/assistant-evals/, five golden cases: money-refusal, injection-resistance, grounding, two-turn write discipline, format vocabulary) is built and passed 5/5 locally, and .github/workflows/assistant-evals.yml is wired to run on every change to prompt.ts / src/tools/** / the eval set. But the job prints SKIPPED and exits 0 until the ANTHROPIC_API_KEY Actions secret exists, so it is currently a dormant gate, not an enforced one. The kill switch (ASSISTANT_ENABLED) is verified locally (503 past /health), not against a real production deployment.
- Named-deferred: resumeSubscription is deliberately absent. apps/api has no admin resume route (portal resume is customer-session-authed; admin bulk supports skip/pause/cancel only), so the prompt teaches the model to explain auto-resume-at-period-end and deep-link to the subscription detail page instead of fabricating an endpoint (write.ts header comment). Separately, the write tools' reason free-text field (≤200 chars) is stored on the mutation and the audit row, and must be treated as untrusted wherever it is later rendered on another admin surface or in an email: a second-order-injection carrier whose downstream consumer does not exist yet, flagged rather than fixed.
- Superseded-framing residue: the api's subscriptions-bulk route (apps/api/src/routes/admin/subscriptions-bulk.ts) also supports a cancel action and accepts an array of up to 500 ids. The assistant's write.ts hardcodes ['pause','skip'] on a single-element [subscriptionId] array. Reading the route's breadth as the assistant's surface is the trap this delta exists to name: the assistant can never cancel a subscription or act on more than one at a time, regardless of what the underlying route accepts.
- Built-but-untrodden: three security backstops exist precisely for adversarial inputs that ordinary merchant use never exercises: the safeHref scheme allowlist, the CallBudget fan-out cap (40 calls/turn), and the authoritative ActionCard consequence line. All three were added after a live adversarial red-team of the running worker (dev 95a25cb3, eb7b672b) found two structural gaps and fixed them: a stored-XSS path via an unsanitized markdown-link href (closed by safeHref), and a fan-out DoS where one turn fired 500 parallel getSubscriptions calls before the 30s sandbox timeout (which bounds wall-clock, not request volume) ever tripped (closed by CallBudget, live-verified cutting the 500-wide fan-out to 40). All three are unit- and render-tested, but by construction see no traffic from a well-behaved merchant session.
How to operate & extend
- The invariant you must not break: storeHash never becomes a model-visible parameter (see above). Any change to getApiCaller(), onConnect, or a tool's input schema must preserve this.
- Turning it on: the admin panel gates entirely on VITE_ASSISTANT_URL; unset, the assistant contributes zero UI and zero requests. Full activation sequence (mint ANTHROPIC_API_KEY, ASSISTANT_JWT_KEY on both workers, first wrangler deploy of apps/assistant) is in docs/runbooks/feedback-assistant-activation.md §6, operator-gated because subs-api lives on a personal Cloudflare account the local agent cannot reach.
- Kill switch: ASSISTANT_ENABLED (var, fail-closed). Anything past /health returns 503 unless it is the literal string true, even for a client holding a valid JWT.
- Extension seams: add a read tool only after its route exists in apps/api/openapi.yaml (pull-don't-port); add a write tool only as a sibling of execute, never inside the codemode sandbox, and update registry.test.ts's exact allowlist plus blocks.tsx's AUTHORITATIVE_EFFECT map in the same change. New generative-UI block types need entries in both apps/assistant/src/blocks.ts and blocks.tsx's REGISTRY; blocks.test.ts asserts the two stay in sync.
- Changing safety limits: DEFAULT_MAX_TURNS_PER_HOUR and DEFAULT_MAX_DAILY_INPUT_EQUIV_TOKENS in safety.ts; MAX_API_CALLS_PER_TURN in assistant.ts. All three are per-store, reset on their own windows (hour / day / turn), and continuation turns bypass only the rate check, never the call budget.
- Before touching prompt.ts, src/tools/**, or the model id: run the behavioral eval suite (node tools/assistant-evals/run.mjs, needs ANTHROPIC_API_KEY locally); these rules live only in the system prompt, so unit tests stay green when they drift.
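The fail-closed behavior of the kill switch described in the list above can be sketched as a single pure function. killSwitchStatus is a hypothetical name, and the null return value (meaning "continue to normal handling") is a convention chosen for this example, not the worker's actual control flow.

```typescript
// Returns 503 when the request must be refused, or null when it may
// proceed to normal handling (JWT checks, routing, and so on).
function killSwitchStatus(
  pathname: string,
  assistantEnabled: string | undefined,
): 503 | null {
  // The health endpoint stays reachable so the switch itself is observable.
  if (pathname === "/health") return null;
  // Fail closed: only the exact string "true" enables the assistant.
  // An unset var, "TRUE", "1", or "yes" all leave it disabled.
  return assistantEnabled === "true" ? null : 503;
}
```

Comparing against one exact string, instead of treating any truthy value as enabled, means a typo or a missing variable in a deployment config disables the assistant rather than enabling it.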
Confidence notes
- I could not find any evidence of a production deployment of subs-assistant beyond the dev-only smoke harness and the red-team dev sessions cited in Input-B (04acef47, 95a25cb3, eb7b672b). I did not independently verify these are real deployment/session identifiers beyond their presence in ADR-0083 and Input-B; I traced the code and config claims (kill switch, CallBudget, safeHref, registry test, eval wiring) directly and they hold.
- ADR-0083's two behavioral-safeguard amendments are dated 2026-07-03, one day after this page's as_of_commit. I did not attempt to resolve that apparent ordering: the amendments' content matches what is actually in safety.ts, index.ts, api-client.ts, and blocks.tsx at the commit this page was generated from, so I'm treating the dates as the ADR's own provenance detail rather than a sign the code doesn't match the decision.
- I did not open tools/assistant-evals/cases.mjs line-by-line to verify each of the five golden cases matches its stated category (money-refusal, injection-resistance, grounding, two-turn discipline, format vocabulary). I confirmed the file exists, the workflow wiring, and the SKIP-without-secret behavior, but took the case count and pass rate from Input-B/ADR-0083 rather than re-running the suite (no ANTHROPIC_API_KEY in this session).
- No arch-derive sequence diagram exists for this domain yet (confirmed by reading docs/architecture/sequence-diagrams.md's scope), so Move 2 has no diagram to transclude; this is a corpus gap, not an omission on this page's part.