ARCHITECTURE.md · last updated 2026-05-24

Raw architecture document, no interpretation. For a guided review entry point, see Code Review. For the full entity-relationship diagram (85 tables, mechanically generated from the live schema), see the data-model ERD, which renders natively on GitHub.

One term used below without inline definition: "Hive" is this project's internal proposal/decision/task coordination substrate (process tooling — issue tracking + an automation layer), not a runtime dependency of the shipped product. Where you see "Hive #N" or "Hive proposal", that's a reference to that internal tracker, not something the running application calls. The rendered document below also carries other internal shorthand without inline definition — ADR numbers and bare synthesis IDs (decision-record references), and BC-payments acronyms like MIT (Merchant-Initiated Transaction) and NTI (Network Transaction Identifier) — which this wrapper page doesn't gloss term-by-term since that content lives in the canonical doc itself, not this page.

Solution Architecture: BC-Native Subscription Platform

Companion to: PRD.md, BRD.md, APP-ADMIN-SPEC.md
Authoritative phasing: ADR-0030 (Cloudflare → GCP) + ADR-0021 (tier cohorts)
Status: Draft v0.2. Current state is in §0; §5–§7 and §10 are historical decision-time content

Read this first. §1–§4 + §0 below describe how bc-subscriptions is built today. §5–§7 and §10 are preserved as the decision-time comparison that produced ADR-0030; they read as forward-looking but the decision has shipped. §8 (cost model), §9 (operational), and §11 (open architectural decisions) remain load-bearing. §12–§14 (the Approach-A1 / Approach-B renewal sequences + the capability→service comparison table) are decision-time comparison content like §5–§7 — they depict rejected stacks and are kept only for the ADR-0030 decision trail. The authoritative as-built sequence, container, context, flow, and integration diagrams live in docs/architecture/ (see §0), render natively on GitHub, and are the ones to cite.


Table of Contents

  0. Current Architecture (as built): start here
  1. Solution Context
  2. Shared Architectural Decisions
  3. Systems of Record
  4. Capability Map
  5. Approach A: Vercel-Native (historical, decision-time)
  6. Approach B: GCP-Only (historical, decision-time)
  7. Side-by-Side Comparison (historical, decision-time)
  8. Rough Cost Model
  9. Operational Considerations
  10. Recommendation (historical, superseded by ADR-0030)
  11. Open Architectural Decisions
  12. Appendix A: Renewal Sequence (Approach A1, Vercel) (historical, decision-time; see docs/architecture/sequence-diagrams.md for the as-built)
  13. Appendix B: Renewal Sequence (Approach B, GCP) (historical, decision-time)
  14. Summary Table: Capability to Service (historical, decision-time)

0. Current Architecture (as built)

Phase 1 (in production today): Cloudflare. All five deployables run on the Cloudflare platform.

| Component | Service |
| --- | --- |
| Admin UI (apps/admin/, React + BigDesign, BC App Extension iframe) | Cloudflare Pages |
| Subscriber portal (apps/storefront-svelte/) | Cloudflare Pages |
| Catalyst storefront integration (apps/storefront-catalyst/) | Cloudflare Pages |
| Docs site (apps/docs-site/, MkDocs Material) | Cloudflare Pages |
| API + webhooks (apps/api/, Hono) | Cloudflare Workers |
| Hive substrate MCP server | Cloudflare Workers (subs-hive-mcp.bigcommerce-testing-7727.workers.dev) |
| Archaeology / chat-island Worker | Cloudflare Workers (subs-archaeology.bigcommerce-testing-7727.workers.dev) |
| Primary OLTP | Cloudflare D1 |
| Cache + locks + rate limits | Cloudflare KV |
| Async jobs (charges, webhooks, reconciliation) | Cloudflare Queues (EVENTS_QUEUE → subs-events) |
| Cron / scheduled triggers | Cloudflare Cron Triggers |
| Secrets | Cloudflare Workers secrets (BC API tokens, encryption keys) + per-store D1 columns (encrypted at rest via HKDF-derived keys per Hive f80e037c) |
| Observability (Phase 1) | Workers Analytics + Sentry; sink-abstracted per tools/observability/ for Phase 2 cutover |
| CI/CD | GitHub Actions → wrangler deploy |

Phase 2 (planned): GCP. Migration shape ratified in ADR-0030 and aligned with the reference dms-self-serve GCP pattern.

| Component | Phase 2 service |
| --- | --- |
| All web surfaces | Cloud Run (containerized) |
| Primary OLTP | Cloud SQL (MySQL) |
| Cache + sessions | Memorystore (Redis) |
| Async jobs + events | Pub/Sub + Cloud Tasks |
| Object storage | Cloud Storage |
| Cron / orchestration | Cloud Scheduler + Cloud Workflows |
| Secrets | Secret Manager |
| Identity for CI | Workload Identity Federation (per-env pool, repo+environment attribute conditions) |
| Observability | Cloud Logging + Cloud Monitoring + Sentry; same sink abstraction |
| IaC | Terraform (hand-rolled google_*, not terraform-gcp-platform) per infra/terraform/ |

What is explicitly NOT in the architecture.

  • Vercel is not a deployment target. It appears in §5–§7 below as decision-time comparison content and in package.json only as a peer-dependency footprint of certain next packages. No production surface runs on Vercel.
  • Supabase is not in use. Earlier Hive substrate explorations (ai-hive, hackathon-hive) ran on Supabase + Vercel; the production substrate for bc-subscriptions is Cloudflare Workers + D1 with GitHub-derived proposal lifecycle (ADR-0055).
  • Direct Stripe vault / Braintree SDK is fallback, not primary. The canonical charge rail is BigCommerce's stored-instruments vault per ADR-0037. Direct gateway SDKs are capability fallbacks for merchants whose gateways don't expose the BC stored-instruments path.
  • No object-store (R2) binding in the product runtime. No product-runtime Worker (apps/api, apps/storefront-svelte, apps/email-consumer) declares an r2_buckets binding — verified against infra/cloudflare/wrangler.toml + the Worker Env interface by tools/arch-derive/. The only R2 bucket in the repo belongs to the archaeology substrate tool (process tooling, not a product deployable). Audit-log/export durability is served by the append-only events table in D1 (§2.5), not object storage.

Marketplace shape. Distributed as a BigCommerce Marketplace app per ADR-0029 — destination, not waypoint, with native-readiness preserved as a design constraint. See memory: marketplace-first-native-ready for the fork-revisit conditions.

Substrate (process side, not runtime). Hive proposal/decision/task substrate per METHODOLOGY.md §4b and WAYS-OF-WORKING.md. The substrate runtime is the Cloudflare Worker named above; the lifecycle is derived from GitHub issue state per ADR-0055.


1. Solution Context

From the PRD and BRD, the platform must:

  • Install as a BC Marketplace app with OAuth + signed-JWT load flow
  • Render merchant UI inside the BC admin iframe with partitioned cookies
  • Host a storefront widget that works on Stencil and Catalyst
  • Host a subscriber portal (both hosted and headless SDK)
  • Run a durable billing engine with scheduling, retries, dunning, idempotency
  • Integrate two processor adapters at MVP (BC Payments via Braintree, Stripe)
  • Create native BC orders on every successful renewal
  • Deliver outbound webhooks to merchant integrations
  • Maintain PCI SAQ-A boundary (no card data in our systems)
  • Support GDPR DSAR + erasure end-to-end
  • Scale to 100M active subscriptions across ~10K stores

The architecture must favor platforms that minimize time-to-ship for this workload, not generic flexibility.


2. Shared Architectural Decisions

These decisions apply to both approaches. They are platform-agnostic and should not change based on hosting choice.

2.1 Component-Level Isolation

Five distinct deployables, each with independent scaling and failure domains:

  1. Admin UI — BC-iframe-embedded merchant app (React + BigDesign v2, apps/admin/, Cloudflare Pages)
  2. Subscriber Portal — hosted customer-facing portal (SvelteKit + Tailwind, apps/storefront-svelte/, Cloudflare Pages)
  3. Storefront Widget — vanilla-JS IIFE embeddable in any theme (~15KB gzipped)
  4. API Layer — Hono on Cloudflare Workers (apps/api/) for CRUD + webhooks + public REST API
  5. Billing Engine — the Worker scheduled() handler + Cloudflare Cron Triggers (apps/api/src/scheduler.ts::runScheduledTick) driving scheduling, executor, dunning, reconciliation
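As an illustration of item 5, the selection step of one scheduler tick might look like the sketch below. It is not the actual runScheduledTick; the ScheduledCharge shape, its field names, and the batchLimit parameter are assumptions made for the example.

```typescript
// Hypothetical row shape; the real schema and runScheduledTick are not shown in this document.
interface ScheduledCharge {
  chargeId: string;
  storeHash: string;
  nextChargeAt: number; // epoch milliseconds
  status: "scheduled" | "in_flight" | "succeeded" | "failed";
}

// Pick the charges one tick should enqueue: still scheduled and due at or before `now`,
// oldest first, capped so a single tick cannot flood the queue.
function selectDueCharges(
  rows: readonly ScheduledCharge[],
  now: number,
  batchLimit: number,
): ScheduledCharge[] {
  return rows
    .filter((row) => row.status === "scheduled" && row.nextChargeAt <= now)
    .sort((a, b) => a.nextChargeAt - b.nextChargeAt)
    .slice(0, batchLimit);
}
```

Sorting oldest-first and capping the batch means a backlog is drained in due-date order without one tick enqueueing an unbounded number of charges.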

The admin UI and subscriber portal share the same API backend but deploy separately so a merchant-admin release doesn't couple to subscriber portal timing.

2.2 Framework Choices

  • Admin UI: React + BigDesign v2 on Cloudflare Pages — React is forced by @bigcommerce/big-design being React-only. Matches aisles-admin and ask-bc precedent in the workspace.
  • Subscriber Portal: SvelteKit + Tailwind on Cloudflare Pages (apps/storefront-svelte/) — the as-built choice. (The decision-time draft had proposed matching the admin's stack for team consistency; the shipped portal is SvelteKit.)
  • Storefront Widget: vanilla TypeScript — no framework dependency so it drops into any Stencil theme or Catalyst build. Compiled to an IIFE bundle.
  • Billing Engine: TypeScript on the Cloudflare Worker scheduled() handler + Cron Triggers (apps/api/src/scheduler.ts). Phase 2 maps this to Cloud Scheduler + Cloud Workflows per ADR-0030; the decision-time comparison of workflow runtimes is preserved in §12–§14 (historical).

For the always-current, drift-gated view of the deployables and their bindings, see the generated docs/architecture/c4-container.md.

2.3 Tenancy

  • Multi-tenant, row-level isolated. Every D1 table carries store_hash; tenant isolation is enforced by mandatory store_hash scoping on every query (Cloudflare D1 is SQLite and has no row-level-security engine), not by database-level RLS policies. Phase 2's managed SQL tier can add RLS per ADR-0030.
  • Per-store encryption: access tokens encrypted with AES-256-GCM using per-store key material derived from store_hash + platform master key (§0 describes these as HKDF-derived keys).
  • All BC-issued secret material is per-merchant identity, not per-app: the OAuth access_token (granted at install), the Storefront API token (minted per-store via POST /stores/{store_hash}/v3/storefront/api-token — see ADR-0059), and the webhook signing secret (the client_secret of the OAuth client that registered the webhook — see BC_WEBHOOK_SIGNING_SECRET env decoupling from BC_CLIENT_SECRET). The app-wide env-var pattern is sandbox-only legacy and breaks by construction for any multi-store deployment.
  • Noisy-neighbor protection: per-store rate limits on BC API calls and processor charges.
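The mandatory store_hash scoping in the first bullet can be sketched as a query helper that callers cannot bypass. The helper name scopedSelect, the ScopedQuery shape, and the identifier allow-list are hypothetical; the real code builds queries with Drizzle and may enforce scoping differently.

```typescript
// Hypothetical helper: builds a parameterized SELECT that always filters on store_hash.
interface ScopedQuery {
  sql: string;
  params: (string | number)[];
}

function scopedSelect(
  table: string,
  storeHash: string,
  filters: Record<string, string | number> = {},
): ScopedQuery {
  if (storeHash.trim() === "") {
    throw new Error("store_hash is required for every tenant-scoped query");
  }
  const entries = Object.entries(filters);
  // Identifiers cannot be bound as parameters, so restrict them to a safe pattern.
  const identifier = /^[a-z_][a-z0-9_]*$/;
  for (const ident of [table, ...entries.map(([column]) => column)]) {
    if (!identifier.test(ident)) throw new Error(`unsafe identifier: ${ident}`);
  }
  if (entries.some(([column]) => column === "store_hash")) {
    throw new Error("store_hash is set by the helper, not by callers");
  }
  // The tenant clause always comes first; caller filters are appended after it.
  const clauses = ["store_hash = ?", ...entries.map(([column]) => `${column} = ?`)];
  return {
    sql: `SELECT * FROM ${table} WHERE ${clauses.join(" AND ")}`,
    params: [storeHash, ...entries.map(([, value]) => value)],
  };
}
```

Because D1 has no row-level-security engine, the point of a helper like this is that there is no code path that produces a tenant-table query without the store_hash predicate.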

2.4 Data Boundaries

  • PCI scope: zero. No PAN, CVV, or full card data ever enters our systems. At standard Stencil/Catalyst checkout, the shopper enters card data into BC's hosted checkout UI; BigPay vaults the card with the provider and returns a bigpay_token. Our app receives only the opaque vault token via GET /v3/orders/{id}/transactions in the webhook handler (ADR-0037). For our portal add-payment-method flow (BRD US-19.1), shoppers add cards via an Instrument Access Token (IAT) issued by our backend, which also involves zero PAN exposure. No gateway-specific tokenization SDK (Stripe Elements, Braintree Hosted Fields) is required by our subscription engine for the standard recurring-charge rail; the Stripe-direct edge case uses apps/api/src/adapters/stripe.ts.
  • PII minimization. We store subscriber email, name, and address snapshots on subscriptions (not copied continuously). We never store BC customer auth credentials.
  • GDPR region affinity. EU merchants' data lives in EU-region databases (both approaches support this via region-pinning).

2.5 Observability

  • Structured logs with {store_hash, subscription_id, charge_id, correlation_id} on every event
  • Append-only events table per store (our application audit log)
  • Platform metrics (latency, error rate, queue depth) exported to the platform's metrics stack
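A structured log line carrying the four correlation fields from the first bullet might be produced as in the sketch below. The function name formatLogLine, the ts/level/event keys, and the choice to emit null for identifiers that do not apply are assumptions, not the project's actual logger.

```typescript
// The four correlation fields named above; null marks "not applicable to this event"
// so every line still carries all four keys (an assumption for this sketch).
interface LogContext {
  store_hash: string;
  subscription_id: string | null;
  charge_id: string | null;
  correlation_id: string;
}

function formatLogLine(
  level: "info" | "warn" | "error",
  event: string,
  context: LogContext,
  timestamp: Date = new Date(),
): string {
  // One JSON object per line keeps the output parseable by log sinks.
  return JSON.stringify({
    ts: timestamp.toISOString(),
    level,
    event,
    ...context,
  });
}
```

Keeping the keys fixed on every line makes it possible to filter a whole renewal by correlation_id or a whole tenant by store_hash without per-event parsing rules.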

2.6 Idempotency

  • Every processor charge uses charge.id as the idempotency key
  • Every state-changing API accepts an optional Idempotency-Key header
  • Inbound webhook delivery uses HMAC signatures; signing identity is per-registering-OAuth-client, not per-app (BC signs with the client_secret of whichever client registered the webhook). The verification secret is stored in BC_WEBHOOK_SIGNING_SECRET, decoupled from the app-wide BC_CLIENT_SECRET. Receiver is responsible for idempotent handling.
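The optional Idempotency-Key behavior in the second bullet can be sketched as follows. This is a minimal in-memory illustration: the IdempotencyStore class, the fingerprint parameter, and the synchronous handler are hypothetical, and a real Worker would persist results in durable storage rather than a Map.

```typescript
interface StoredResult {
  fingerprint: string; // hypothetical: a hash or canonical form of the request body
  response: string;
}

class IdempotencyStore {
  private readonly results = new Map<string, StoredResult>();

  // Runs `handler` at most once per (storeHash, key) and replays the stored response afterwards.
  execute(
    storeHash: string,
    key: string | undefined,
    fingerprint: string,
    handler: () => string,
  ): { response: string; replayed: boolean } {
    if (key === undefined) {
      // The header is optional: without it the request is simply executed.
      return { response: handler(), replayed: false };
    }
    const scopedKey = `${storeHash}:${key}`; // keys are scoped per tenant
    const existing = this.results.get(scopedKey);
    if (existing !== undefined) {
      if (existing.fingerprint !== fingerprint) {
        throw new Error("Idempotency-Key reused with a different request body");
      }
      return { response: existing.response, replayed: true };
    }
    const response = handler();
    this.results.set(scopedKey, { fingerprint, response });
    return { response, replayed: false };
  }
}
```

For processor charges the same idea applies with charge.id as the key, so a retried charge request maps back to the first attempt instead of creating a second one.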

2.7 Shared Stack Elements (Platform-Agnostic)

  • Language: TypeScript 5 everywhere
  • Data: Cloudflare D1 (SQLite) via the Workers DB binding; Drizzle for query building + migrations
  • Validation: Zod v4
  • Env: the typed Worker Env interface (apps/api/src/types.ts) — bindings + secrets, no external env framework
  • Auth libraries: jose for JWT
  • Email: Resend (SendGrid as swap candidate)
  • SMS: Twilio (P2)
  • Testing: Vitest — including @cloudflare/vitest-pool-workers running the G4 behavioral scenarios over a real D1 instance — plus Playwright for e2e. See docs/methodology/test-strategy.md for the full G1–G5 verification ladder

3. Systems of Record

<!-- traceability:start:ARCH:3 -->

Prototype: Membership · Event Timeline · Drift Sweep · Replay Tool · Structured Logs

<!-- traceability:end:ARCH:3 -->

Clear SoR boundaries prevent reconciliation nightmares. This table is authoritative.

| Domain | System of Record | Why | Our Copy? |
| --- | --- | --- | --- |
| Product catalog, variants | BigCommerce | BC owns the catalog merchant edits | Read-through; never cached beyond 5 min |
| Pricing (base) | BigCommerce | BC is the merchant's price truth | Read-through |
| Price Lists | BigCommerce | Subscription pricing often references these | Read-through |
| Customer profile | BigCommerce | BC owns customer records + auth | Read; mirror only what we need (name, email for notifications) |
| Customer group membership | BigCommerce | Used for plan scoping + Price List resolution | Read at runtime |
| Orders (including subscription renewals) | BigCommerce | Every renewal produces a BC order | We store a bc_order_id pointer only |
| Order status + fulfillment state | BigCommerce | BC's OMS/WMS integrations operate here | Read via webhook |
| Payment method tokens (vault) | Processor (BC Payments → Braintree, or Stripe) | PCI scope; processor owns tokens | We store payment_method_ref only |
| Payment settlement + payout | Processor + BC Money Dashboard (for BC Payments) | Money movement | Read for reconciliation |
| Subscription agreement | Us | BC has no subscription model | We are the SoR |
| Charge history (attempts, outcomes) | Us | Derived from processor events + our scheduler | We are the SoR |
| Plan configuration | Us | No BC equivalent | We are the SoR (pointer stored as BC metafield for discoverability) |
| Eligibility rules | Us | No BC equivalent | We are the SoR |
| Promotions (subscription-specific) | Us | BC's native promos don't handle subscription semantics | We are the SoR |
| Dunning policies | Us | No BC equivalent | We are the SoR |
| Event / audit log | Us | Platform-specific audit requirements | We are the SoR |
| Subscriber portal auth session | Us (magic link) | BC's customer auth not reusable in iframe-less context | We are the SoR |
| App install credentials | Us | OAuth result specific to our app | We are the SoR; encrypted at rest |
| Entitlement state (grant lifecycle) | Us | Engine owns grant/revoke state machine tied to subscription lifecycle (PRD-COMPANION D1) | We are the SoR; a pluggable provider adapter applies the external effect (v1: BC customer-group assignment; post-v1: SSO, feature-flag, license-server). The adapter's target system is its own SoR for whatever it controls, but our Entitlement row is the truth of whether access is granted. |
| AllotmentGrant (admin-granted recurring quota) | Us | Independent of paid-subscription Entitlement; admin-issued credit/quota with refresh cadence (PRD-COMPANION D18) | We are the SoR; debits link optionally to BC orders or our charges. Audit on granted_by + reason. |
| Multi-actor membership on subscription (actors[]) | Us | BC owns customer identity; role assignment (owner / payer / beneficiary / manager / org_admin) is our concept (PRD-COMPANION D19, ADR-0023) | We are the SoR; bc_customer_id is FK to BC's customer SoR. payer.role is unique-per-subscription via partial index. |
| DeliveryInstance (per-shipment row) | Us | Lazily populated; BC has no native concept; required when delivery cadence ≠ billing cadence (PRD-COMPANION D2, ADR-0024) | We are the SoR; n:1 charge → delivery fan-out permitted (Cintas weekly delivery / monthly billing). bc_order_id set when the per-instance order materializes. |
| CustomFieldDefinition (merchant-defined typed schema) | Us | Schema for extensible per-subscription / per-plan / per-grant data (PRD-COMPANION D20) | We are the SoR for the schema; field data lives on parent entity's metadata.custom_fields JSONB key. |
| Per-org processor connection routing | v2: Us | RESERVED for marketplace MoR v2 (ADR-0022); single-tenant MoR in Phase 1 | Phase 1: store-level processor_connection_id is our SoR. v2: processor_connections keyed on (store_hash, payer_org_id) is our SoR; Actor.processor_connection_ref is the resolver column. |
| Funds Merchant-of-Record | Phase 1: platform tenant. v2: per-buyer-org configurable. | Determined by which processor connection settles the charge (ADR-0022). | Pointer; we do not hold funds. |

The SoR discipline dictates the architecture: we are a system of systems, not a commerce platform. We own the subscription, charge, and rule data models; everything else we read-through, webhook-consume, or pointer-reference.
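The "read-through; never cached beyond 5 min" rule for catalog data can be sketched as a small TTL cache. The ReadThroughCache class, the injected clock, and the synchronous loader are assumptions for illustration; a real loader would call the BigCommerce APIs asynchronously, and the as-built cache lives in Cloudflare KV rather than process memory.

```typescript
const CATALOG_TTL_MS = 5 * 60 * 1000; // upper bound from the table: 5 minutes

class ReadThroughCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so expiry can be exercised without waiting on a real clock.
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value while it is fresh; otherwise reloads from the system of record.
  get(key: string, loadFromSource: () => V): V {
    const current = this.now();
    const hit = this.entries.get(key);
    if (hit !== undefined && current < hit.expiresAt) {
      return hit.value;
    }
    const value = loadFromSource();
    this.entries.set(key, { value, expiresAt: current + this.ttlMs });
    return value;
  }
}
```

The cache never becomes a second source of truth: once an entry is older than the TTL, the next read goes back to the system of record.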


4. Capability Map

Regardless of hosting approach, these capabilities must exist. Each will map to concrete services in Approach A and B.

graph TD
  subgraph "Presentation"
    A1[Admin UI<br/>React + BigDesign<br/>BC iframe]
    A2[Subscriber Portal<br/>SvelteKit]
    A3[Storefront Widget<br/>vanilla TS]
    A4[Headless SDK<br/>TypeScript]
  end

  subgraph "API & Auth"
    B1[OAuth + BC JWT<br/>session mint]
    B2[Magic-link auth<br/>subscriber]
    B3[Public REST API<br/>v1]
    B4[GraphQL API<br/>Phase 2]
    B5[Inbound webhooks<br/>BC + processor]
    B6[Outbound webhooks<br/>signed + retried]
  end

  subgraph "Billing Engine"
    C1[Scheduler cron<br/>every 15 min]
    C2[Charge Executor<br/>durable workflow]
    C3[Dunning state<br/>machine]
    C4[Reconciliation<br/>daily sweep]
    C5[Replay tool]
  end

  subgraph "Processor Adapters"
    D1[BC Payments<br/>Braintree under]
    D2[Stripe Connect]
    D3[Future adapters<br/>Braintree, Adyen, ...]
  end

  subgraph "Integrations"
    E1[BigCommerce APIs<br/>V2/V3/GraphQL/Storefront]
    E2[Resend - email]
    E3[Twilio - SMS, P2]
    E4[Klaviyo / Gorgias / ... <br/>via webhooks]
  end

  subgraph "Data"
    F1[(D1 SQLite<br/>transactional)]
    F2[(KV + ratelimit<br/>cache, locks,<br/>rate limits)]
    F3[CSV exports,<br/>DSAR bundles<br/>P2+]
    F4[(Analytics warehouse<br/>P2+)]
  end

  A1 --> B1
  A2 --> B2
  A3 --> B3
  A4 --> B3

  B1 --> F1
  B3 --> F1
  B5 --> F1
  B5 --> C1

  C1 --> C2
  C2 --> D1
  C2 --> D2
  C2 --> E1
  C2 --> F1
  C2 --> F2
  C3 --> C2

  C4 --> F1
  C4 --> E1
  C4 --> D1
  C4 --> D2

  B6 --> E4
  B3 --> E2
  C2 --> E2

  F1 --> F4

5. Approach A — Vercel-Native

Historical, decision-time content. §5–§7 below preserve the original Vercel-vs-GCP comparison that produced ADR-0030. The ratified architecture is Cloudflare (Phase 1) → GCP (Phase 2). See §0 above for current state. Preserved here because the comparison framework + cost model + service mapping retain audit value when revisiting the migration shape.

5.1 Summary

Built on Vercel's platform with Neon for Postgres, Upstash for Redis, and Resend for email. Two sub-variants differ only in where the billing engine runs.

  • A1 (Pure Vercel): billing engine on Vercel Workflow + Vercel Queues + Vercel Cron. Single platform, single deploy pipeline, single observability surface.
  • A2 (Vercel + Cloudflare for billing engine): admin/portal/API on Vercel; billing engine on Cloudflare Workers + Durable Objects (following the ask-bc precedent). Trades operational complexity for per-store locality and sub-10ms hot paths.

Supabase is offered as an optional substitution for Neon if merchants want Realtime (live exception queue) and Storage (DSAR bundles) in the same SKU. Neon + Vercel Blob covers the same needs with simpler vendor count.

5.2 Component-to-Service Mapping

| Component | Service | Notes |
| --- | --- | --- |
| Admin UI | Next.js 16 App Router on Vercel | Fluid Compute runtime; BigDesign v2; partitioned cookies for iframe |
| Subscriber Portal | Next.js 16 App Router on Vercel | Separate Vercel project; optional custom domain per merchant (wildcard cert) |
| Storefront Widget | Static IIFE served via Vercel's edge network | Bundled by Vite; deployed as a Vercel static project with long cache + versioned paths |
| API Layer | Next.js route handlers, Node.js runtime on Fluid Compute | Shared between admin + portal; same codebase |
| Inbound webhooks | Next.js route handlers, Node runtime | BC webhooks + processor webhooks |
| Scheduler cron | Vercel Cron (every 15 min) | Picks up due charges, enqueues to Vercel Queue |
| Charge executor | A1: Vercel Workflow. A2: Cloudflare Durable Object per store. | Durable state, checkpointed steps |
| Dunning state machine | Same runtime as executor | Re-enters executor with new retry_attempt |
| Reconciliation sweep | Vercel Cron (daily) + Vercel Queue | Long-running; fan out to worker invocations |
| Outbound webhook delivery | Vercel Queue consumer | Exponential backoff retry, dead-letter after 24h |
| Postgres | Neon Serverless (default) or Supabase Postgres (alt) | Branching for preview environments; per-region primary |
| Redis | Upstash Redis | Multi-region with per-store read replicas |
| Object storage | Vercel Blob (default) or Supabase Storage (alt) | DSAR bundles, CSV exports, migration files |
| Secrets | Vercel Environment Variables + @t3-oss/env-nextjs validation | OIDC-based integrations where possible |
| Email | Resend | Transactional only; MVP |
| SMS | Twilio (P2) | With opt-in consent tracking |
| Analytics warehouse | Neon Postgres read replica + ClickHouse (P3) | OR BigQuery via Vercel Integrations |
| Metrics / logs | Vercel's Observability (logs, traces, speed insights) | + OpenTelemetry export to Datadog (P2) |
| Error tracking | Sentry | Both Vercel-hosted and Cloudflare Workers (A2) supported |
| CDN / edge | Built into Vercel | Global anycast; widget served edge-cached |
| Rate limiting | @upstash/ratelimit in middleware | Per-store + per-IP |
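The outbound webhook delivery row names a retry policy (exponential backoff, dead-letter after 24h) without giving its parameters. A minimal sketch of such a policy follows; the base delay, the per-attempt cap, and the function name nextDeliveryAttempt are assumptions, and only the 24-hour dead-letter bound comes from the table.

```typescript
const DEAD_LETTER_AFTER_MS = 24 * 60 * 60 * 1000; // from the table: dead-letter after 24h

// Hypothetical tuning values; the document does not specify them.
const BASE_DELAY_MS = 60 * 1000;
const MAX_DELAY_MS = 6 * 60 * 60 * 1000;

type RetryDecision =
  | { action: "retry"; delayMs: number }
  | { action: "dead_letter" };

// Decide what to do after a failed delivery.
// failedAttempts counts failures so far (1 after the first failure).
function nextDeliveryAttempt(failedAttempts: number, msSinceFirstAttempt: number): RetryDecision {
  if (msSinceFirstAttempt >= DEAD_LETTER_AFTER_MS) {
    return { action: "dead_letter" };
  }
  // Delay doubles with each failure (1x, 2x, 4x, ... the base) up to the cap.
  const exponent = Math.max(1, failedAttempts) - 1;
  const delayMs = Math.min(BASE_DELAY_MS * 2 ** exponent, MAX_DELAY_MS);
  return { action: "retry", delayMs };
}
```

Capping the per-attempt delay keeps late retries from being spaced so far apart that a recovered endpoint waits most of the 24-hour window for its next delivery.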

5.3 Topology (A1 — Pure Vercel)

graph LR
  subgraph Browser
    MA[Merchant Admin<br/>BC iframe]
    SP[Subscriber Portal]
    SF[Storefront PDP<br/>with widget]
  end

  subgraph "Vercel"
    direction TB
    UIs[Next.js<br/>Admin + Portal]
    API[Next.js API<br/>routes + webhooks]
    WIDGET[Widget IIFE<br/>static]
    CRON[Vercel Cron]
    Q[Vercel Queue]
    WF[Vercel Workflow<br/>charge executor]
  end

  subgraph "Data"
    NEON[(Neon<br/>Postgres)]
    UPS[(Upstash<br/>Redis)]
    BLOB[Vercel Blob]
  end

  subgraph "External"
    BC[BigCommerce]
    BCP[BC Payments<br/>Braintree]
    STRIPE[Stripe]
    RESEND[Resend]
    KLAVIYO[Klaviyo etc]
  end

  MA <--> UIs
  SP <--> UIs
  SF <--> WIDGET
  SF <--> API

  UIs --> API
  API <--> NEON
  API <--> UPS
  API --> BLOB
  API <--> BC

  BC -.webhooks.-> API
  BCP -.webhooks.-> API
  STRIPE -.webhooks.-> API

  CRON --> Q
  Q --> WF
  WF --> BCP
  WF --> STRIPE
  WF --> BC
  WF <--> NEON
  WF <--> UPS
  WF --> RESEND
  WF --> Q

  Q -.outbound webhooks.-> KLAVIYO

5.4 Topology (A2 — Vercel + Cloudflare)

A2 is identical to A1 except for the billing engine. The Cloudflare Worker hosts a Durable Object per store, giving us:

  • Per-store locality — all that store's charge state lives in one DO with local SQLite storage
  • Durable state — DO persists across restarts; no need for external workflow orchestrator
  • Low latency — sub-10ms state access for hot paths
  • Independent scaling — hot stores don't starve cold stores
graph LR
  subgraph Vercel
    CRON[Vercel Cron] --> DISPATCH[Dispatcher<br/>Vercel Function]
  end

  subgraph Cloudflare
    DO1[Durable Object<br/>store A<br/>+ SQLite]
    DO2[Durable Object<br/>store B<br/>+ SQLite]
    DO3[Durable Object<br/>store C<br/>+ SQLite]
  end

  DISPATCH -->|enqueue charge| DO1
  DISPATCH -->|enqueue charge| DO2
  DISPATCH -->|enqueue charge| DO3

  DO1 --> BC[BigCommerce]
  DO1 --> PROC[Processor Adapter]
  DO2 --> BC
  DO2 --> PROC
  DO3 --> BC
  DO3 --> PROC

When A2 wins over A1:

  • More than 10K charges per store per day (per-store concurrency matters)
  • Merchants who tolerate a second platform (no additional merchant-facing friction)
  • Cost at scale (Durable Objects are cheaper per-invocation than Workflow for millions of charges)

When A1 wins over A2:

  • MVP and early Phase 2 (scale not yet a concern)
  • Single-platform operations are simpler for small teams
  • Vercel Workflow's durable execution is enough at < 10K charges/day/store

Recommendation within Approach A: Start at A1 for MVP. Measure. Migrate the billing engine to A2 (Cloudflare) only if P95 latency or cost math requires it. The dispatcher-to-DO pattern is additive, not replacement — A1 can evolve into A2 without rewriting the API layer.

5.5 Approach A Strengths

  • Workspace precedent. aisles-admin (Next.js + BigDesign + Neon + Upstash) and ask-bc (Next.js + Cloudflare Workers + DO) already establish both patterns. Team knows them.
  • DevEx. Deploy is git push; previews are automatic; branch databases via Neon; environment management via vercel env.
  • Framework leverage. Next.js middleware, Server Components, vercel.ts config, Fluid Compute concurrency reuse, Vercel Queues + Workflow all fit this workload natively.
  • iframe hosting. Vercel's response-header and partitioned-cookie defaults match BC admin embed requirements (verified against aisles-admin and ask-bc in production).
  • Time-to-ship. The aisles-admin skeleton + ask-bc agent-runtime patterns cut weeks off setup.
  • Edge CDN for widget. Vercel's global anycast ensures < 50ms widget delivery worldwide without extra configuration.
  • Integration ecosystem. Vercel Marketplace for Neon, Upstash, Resend auto-provisions keys via OIDC.

5.6 Approach A Weaknesses / Risks

  • Vercel Queues is in public beta (per current platform state). Committing MVP to a beta product carries some risk; mitigate by designing abstractions so a swap to a mature queue (e.g., upstash-queue, RabbitMQ on a VM) is possible.
  • Vercel Workflow durability guarantees must be verified for our billing use case before MVP commit (flagged in BRD open questions #4).
  • Multi-cloud hybrid in A2 adds operational complexity: two deploy pipelines, two secrets stores, two observability surfaces. Acceptable for a billing engine, but worth naming.
  • Single-vendor concentration risk. Vercel outages affect admin, portal, and API simultaneously. SLA monitoring plus a status-page subscription is essential.

6. Approach B — GCP-Only

6.1 Summary

All components are hosted on Google Cloud Platform: container-based services on Cloud Run, Cloud SQL for Postgres, Cloud Tasks + Cloud Workflows for orchestration, and Pub/Sub for internal eventing. Firestore is considered only for ephemeral data.

6.2 Component-to-Service Mapping

| Component | Service | Notes |
| --- | --- | --- |
| Admin UI | Next.js 16 on Cloud Run (Gen 2, min instances > 0 for admin iframe to avoid cold starts) | Containerized; served behind Cloud Load Balancer |
| Subscriber Portal | Next.js 16 on Cloud Run | Separate service; separate domain |
| Storefront Widget | Cloud Storage static bucket + Cloud CDN | Versioned paths; immutable cache |
| API Layer | Next.js route handlers on Cloud Run | Same codebase as UIs; separate service for scaling independence |
| Inbound webhooks | Cloud Run service dedicated to webhook ingest | Fronted by Cloud Armor for DDoS / rate limiting |
| Scheduler cron | Cloud Scheduler (cron) → Pub/Sub topic | Every 15 min |
| Charge executor | Cloud Workflows (durable) OR Cloud Run Jobs | Workflow for step-by-step durability with checkpointing |
| Dunning state machine | Cloud Workflows | Re-enters executor with new state |
| Reconciliation sweep | Cloud Scheduler (daily) → Cloud Run Job | |
| Outbound webhook delivery | Cloud Tasks with retries + Cloud Run consumer | Native exponential backoff + DLQ |
| Postgres | Cloud SQL for Postgres with HA config, read replicas per region | |
| Redis | Memorystore for Redis (Standard HA tier) | Regional |
| Object storage | Cloud Storage | DSAR bundles, CSV exports, static assets |
| Secrets | Secret Manager + workload identity federation | |
| Email | Resend (external) or SendGrid via Marketplace | SendGrid has better GCP integration |
| SMS | Twilio (external) | Same as Approach A |
| Analytics warehouse | BigQuery + Dataflow ingestion from Cloud SQL via Datastream | Strong analytics story is a GCP pro |
| Metrics / logs | Cloud Logging + Cloud Monitoring | Integrated dashboards, SLO tooling |
| Error tracking | Sentry (external) or Error Reporting (GCP-native) | |
| CDN / edge | Cloud CDN via Cloud Load Balancer | Global; integrates with Cloud Armor |
| Rate limiting | Cloud Armor at edge + per-service Redis-backed limiting | |
| Identity (subscriber) | Identity Platform (magic-link flows) OR custom magic-link over Cloud SQL | Custom is simpler for our scope |
| IaC | Terraform + Google Cloud provider | Required; no equivalent to Vercel's deploy simplicity |
| CI/CD | Cloud Build or GitHub Actions → Artifact Registry → Cloud Run deploy | More configuration than Vercel |

6.3 Topology

graph LR
  subgraph "Browser"
    MA[Merchant Admin<br/>BC iframe]
    SP[Subscriber Portal]
    SF[Storefront PDP]
  end

  subgraph "GCP Edge"
    LB[Cloud Load Balancer<br/>+ Cloud Armor]
    CDN[Cloud CDN]
  end

  subgraph "GCP Services"
    CR_UI[Cloud Run<br/>Admin + Portal]
    CR_API[Cloud Run<br/>API]
    CR_WH[Cloud Run<br/>Webhook Ingest]
    SCHED[Cloud Scheduler]
    PS[Pub/Sub]
    WF[Cloud Workflows<br/>charge executor]
    CT[Cloud Tasks]
    CR_REC[Cloud Run Job<br/>Reconciliation]
  end

  subgraph "GCP Data"
    SQL[(Cloud SQL<br/>Postgres HA)]
    MEMO[(Memorystore<br/>Redis)]
    GCS[Cloud Storage]
    BQ[(BigQuery)]
  end

  subgraph "External"
    BC[BigCommerce]
    PROCS[Processors:<br/>BC Payments / Stripe]
    EMAIL[SendGrid / Resend]
    INT[Klaviyo / Gorgias]
  end

  MA --> LB
  SP --> LB
  SF --> CDN
  SF --> LB

  LB --> CR_UI
  LB --> CR_API
  LB --> CR_WH

  CR_UI --> CR_API
  CR_API --> SQL
  CR_API --> MEMO
  CR_API --> GCS
  CR_API --> BC
  CR_API --> CT

  BC -.webhooks.-> CR_WH
  PROCS -.webhooks.-> CR_WH
  CR_WH --> PS

  SCHED --> PS
  PS --> WF

  WF --> PROCS
  WF --> BC
  WF --> SQL
  WF --> MEMO
  WF --> EMAIL
  WF --> CT

  CT -.outbound webhooks.-> INT

  SCHED --> CR_REC
  CR_REC --> SQL
  CR_REC --> BC
  CR_REC --> PROCS

  SQL -->|Datastream| BQ

6.4 Approach B Strengths

  • Single-cloud compliance story. Some BC enterprise merchants (healthcare, regulated finance) require single-vendor cloud footprint. GCP-only is a clean compliance argument.
  • BigQuery native. Cohort retention, LTV modeling, subscriber analytics all benefit from BigQuery as an analytics warehouse with Dataflow streaming from Cloud SQL. Phase 2 analytics epics are easier here.
  • Cloud Workflows is GA (since 2020) and has documented durable execution semantics — lower adoption risk than Vercel Workflow (currently beta).
  • SLO / monitoring ecosystem. Cloud Monitoring has native SLO + error budget objects, which shortens the path to formal SLO tracking for SOC 2 (vs. wiring Vercel logs into an external SLO tool).
  • Cloud Armor for edge security is a real advantage for webhook ingest + public API (Vercel has basic but less sophisticated WAF options).
  • Cost predictability at scale. Cloud Run Gen 2 + committed-use discounts can undercut Vercel for large workloads once steady-state is known.
  • Workload Identity Federation eliminates long-lived secrets for inter-service auth — stronger security posture than managing rotated secrets.

6.5 Approach B Weaknesses / Risks

  • Cold starts. Cloud Run Gen 2 is improved but still has perceptible cold-start latency vs. Vercel Fluid Compute (which reuses function instances across concurrent requests). For admin iframe load, this is felt by the merchant; mitigate with min-instances = 1 per service (ongoing cost).
  • DevEx. No git push to deploy; requires Terraform, Cloud Build, Artifact Registry, environment management via Deployment Manager or config files. Preview environments per PR require custom Cloud Build triggers. Team needs GCP ops expertise.
  • iframe hosting. Partitioned cookie handling, SameSite=none enforcement across Cloud Load Balancer, and iframe-safe redirects are all achievable but require explicit header configuration — Vercel ships these as defaults.
  • No framework-native features. Next.js runs in container mode (fine), but loses some of Vercel's Next.js-specific optimizations (ISR coordination, streaming edge features, vercel.ts config).
  • More moving parts. Admin + Portal + API + Webhook Ingest = 4 Cloud Run services plus 2 jobs plus Scheduler/Tasks/Workflows. Each has IAM, networking, Monitoring dashboards. Operational surface area is 3× Approach A.
  • Slower iteration for a small team. If Phase 1 is 14 weeks (per PRD), Approach A probably ships in 14 weeks; Approach B probably needs 18–20 weeks to cover the GCP setup tax.
  • Widget edge delivery. Cloud CDN is fine, but Vercel's global anycast for Next.js is noticeably lower-latency for dynamic edges.

6.6 Firestore Deliberately Not Used

Firestore's document data model does not fit the relational subscription / charge / event domain. Firestore would require extensive denormalization, lose referential integrity guarantees, and complicate the rule engine for eligibility/promotions. Cloud SQL Postgres is the right answer.

Exception: Firestore could be used for ephemeral presence data (like real-time exception queue live-updates to replace WebSocket infrastructure) — but that optimization can wait.


7. Side-by-Side Comparison

| Dimension | Approach A1 (Pure Vercel) | Approach A2 (Vercel + CF) | Approach B (GCP) |
| --- | --- | --- | --- |
| Admin UI hosting | Vercel Fluid Compute | Vercel Fluid Compute | Cloud Run Gen 2 |
| Subscriber Portal hosting | Vercel | Vercel | Cloud Run |
| Widget hosting | Vercel Edge | Vercel Edge | Cloud CDN + Storage |
| API hosting | Vercel Fluid Compute | Vercel Fluid Compute | Cloud Run |
| Billing engine runtime | Vercel Workflow | Cloudflare DO | Cloud Workflows |
| Scheduler | Vercel Cron | Vercel Cron | Cloud Scheduler |
| Queue | Vercel Queue (beta) | Vercel Queue + CF Queues | Pub/Sub + Cloud Tasks |
| Postgres | Neon (or Supabase) | Neon | Cloud SQL |
| Redis | Upstash | Upstash + per-DO SQLite | Memorystore |
| Object store | Vercel Blob (or Supabase) | Vercel Blob | Cloud Storage |
| Secrets | Vercel env vars | Vercel + CF secrets | Secret Manager |
| CDN / edge | Vercel native | Vercel + CF native | Cloud CDN |
| IaC | Vercel config (vercel.ts) | Vercel + Wrangler (CF) | Terraform (mandatory) |
| CI/CD | git push + Vercel | git push + Vercel + Wrangler | Cloud Build / GHA + Terraform |
| Preview envs | Automatic per PR | Automatic (Vercel); manual (CF) | Custom Cloud Build setup |
| Observability | Vercel native + Sentry | Vercel + CF + Sentry | Cloud Logging/Monitoring + Sentry |
| SLO tooling | Basic (via external) | Basic | Native, mature |
| WAF / DDoS | Vercel basic | CF advanced | Cloud Armor advanced |
| Single-cloud compliance | No | No | Yes |
| Analytics warehouse | Neon / ClickHouse P3 | Neon / ClickHouse P3 | BigQuery native |
| Cold start risk (admin) | Very low (Fluid Compute) | Very low | Low with min-instances |
| MVP time-to-ship | ✅ Fastest (14 wk target) | Slower (~15 wk) | Slowest (~18–20 wk) |
| Operational surface | Smallest | Medium | Largest |
| Cost at MVP scale | Low-mid | Low-mid | Mid |
| Cost at 100M subs scale | Mid-high | Mid | Predictable low-mid |
| PCI DSS SAQ-A | | | |
| GDPR EU residency | ✅ (Neon EU + Vercel EU) | | ✅ (regional Cloud SQL) |
| Team expertise fit | ✅ High (aisles-admin, ask-bc precedent) | ✅ High | Medium (GCP ops curve) |

8. Rough Cost Model

Monthly estimates for two scale points, steady-state, per store unless noted. Not a quote — a sanity check.

8.1 MVP scale (50 stores, 5K active subs, 500K total API calls/mo)

| Category | Approach A (Vercel) | Approach B (GCP) |
| --- | --- | --- |
| Compute (UI + API + webhooks) | $200–$400 | $300–$500 (min-instances) |
| Postgres | $50 (Neon Scale) | $150 (Cloud SQL HA) |
| Redis | $30 (Upstash) | $100 (Memorystore Standard) |
| Object storage | < $10 | < $10 |
| Secrets / auth | included | included |
| Email (Resend) | $30 | $30 |
| CDN | included | $20 |
| Observability | $50 (Sentry) | $80 (Cloud Ops + Sentry) |
| Total / month | ~$370–$570 | ~$690–$890 |

8.2 Growth scale (1K stores, 500K active subs, 50M API calls/mo)

| Category | Approach A | Approach B |
| --- | --- | --- |
| Compute | $2K–$4K | $1.5K–$3K (CUD discounts) |
| Postgres | $800 (Neon Business) | $1K (Cloud SQL HA + read replicas) |
| Redis | $400 | $500 |
| Object storage | $50 | $50 |
| Email | $500 | $500 |
| CDN + egress | included | $300 |
| Observability | $400 | $500 |
| BigQuery analytics | n/a | $300 |
| Total / month | ~$4.15K–$6.15K | ~$4.65K–$6.15K |

Conclusion: Vercel is roughly 40% cheaper at MVP scale. At growth scale the two approaches reach approximate parity: GCP edges ahead on pure compute (CUDs) but incurs storage, egress, and ops overhead.
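
The totals can be sanity-checked by summing the line items as low/high ranges. The sketch below does that in TypeScript. The `Range` type and helper names are illustrative assumptions, "< $10" is treated as a flat $10, and "included" or "n/a" as $0.

```typescript
// Illustrative helpers; not part of the platform codebase.
type Range = [number, number]; // [low, high] in USD per month

const fixed = (usd: number): Range => [usd, usd];

function sumRanges(items: Range[]): Range {
  let low = 0;
  let high = 0;
  for (const [lo, hi] of items) {
    low += lo;
    high += hi;
  }
  return [low, high];
}

const midpoint = ([lo, hi]: Range): number => (lo + hi) / 2;

// Section 8.1, MVP scale. Row order: compute, Postgres, Redis, object storage,
// secrets, email, CDN, observability.
const mvpVercel = sumRanges([
  [200, 400], fixed(50), fixed(30), fixed(10), fixed(0), fixed(30), fixed(0), fixed(50),
]);
const mvpGcp = sumRanges([
  [300, 500], fixed(150), fixed(100), fixed(10), fixed(0), fixed(30), fixed(20), fixed(80),
]);

// Section 8.2, growth scale. Row order: compute, Postgres, Redis, object storage,
// email, CDN + egress, observability, BigQuery.
const growthVercel = sumRanges([
  [2000, 4000], fixed(800), fixed(400), fixed(50), fixed(500), fixed(0), fixed(400), fixed(0),
]);
const growthGcp = sumRanges([
  [1500, 3000], fixed(1000), fixed(500), fixed(50), fixed(500), fixed(300), fixed(500), fixed(300),
]);

// Fraction by which the Vercel midpoint undercuts the GCP midpoint at MVP scale.
const mvpSaving = 1 - midpoint(mvpVercel) / midpoint(mvpGcp);

console.log({ mvpVercel, mvpGcp, growthVercel, growthGcp, mvpSaving });
```

Compared midpoint to midpoint, the MVP-scale gap comes out at roughly 40%, which is the figure the conclusion relies on.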


9. Operational Considerations

9.1 Deployment

  • Approach A: git push deploys; preview URL per PR; production deploy is a promote in the Vercel dashboard or via vercel deploy --prod. A2 adds wrangler deploy for the Cloudflare piece.
  • Approach B: Terraform plan/apply for infrastructure; Cloud Build triggers for app deploys; blue/green or canary via Cloud Run traffic splits. Requires discipline around state management and IAM.

9.2 Disaster Recovery

  • RPO (data loss tolerance): 5 min for charges; 0 for events; 1 hour for other tables
  • RTO (restore time): 1 hour for read-only operation; 4 hours for full write restoration
  • Approach A: Neon point-in-time recovery to any second within 7d (Scale) / 30d (Business); Upstash snapshot restore; Vercel Blob versioning
  • Approach B: Cloud SQL automated backups + PITR; Memorystore replica failover; Cloud Storage versioning + cross-region replication

Both meet RPO/RTO for MVP. Approach B's Cloud SQL offers stronger cross-region failover at higher cost.

9.3 Security Posture

  • Approach A: Vercel SOC 2 Type II + HIPAA-eligible (Enterprise plan); Neon SOC 2; Upstash SOC 2. Stacked vendor compliance with narrow blast radius per vendor.
  • Approach B: GCP's unified compliance (SOC 2, ISO 27001, HIPAA, PCI DSS shared responsibility). Single-vendor evidence collection simplifies SOC 2 audit.

Either approach meets PCI SAQ-A eligibility (we never see card data).

9.4 Incident Response

  • Approach A: Vercel and Cloudflare both publish fast status-page updates with incident history. Incident response runbooks span 2–3 vendors.
  • Approach B: GCP status + Cloud Monitoring alert policies; single incident surface; stronger diagnostic tooling via Cloud Profiler / Cloud Trace.

9.5 Multi-Region / Data Residency

  • Approach A: Multi-region is opt-in per service. Neon multi-region is available; Vercel multi-region compute is the norm. EU-data-stays-in-EU requires Neon EU region + Vercel EU hosting + Upstash EU — achievable but vendor-by-vendor configuration.
  • Approach B: Region-pinning is uniform across every service via a single regional deployment; EU data residency is one Terraform variable.

Approach B wins clearly on data residency for Phase 2 international expansion.

9.6 Vendor Concentration Risk

  • Approach A1: Medium (Vercel outage affects all user-facing surfaces)
  • Approach A2: Medium-low (Cloudflare billing engine keeps running during Vercel outages; admin UI degrades but subscribers' existing subscriptions still bill)
  • Approach B: Medium (GCP outage affects all)

9.7 Talent Market

  • Approach A: Next.js + Vercel experience is widely available; Cloudflare Workers more specialized but growing
  • Approach B: GCP expertise is deep but narrower than AWS; Cloud Run + Workflows + Terraform combination requires stronger platform engineering

10. Recommendation

Superseded by ADR-0030. The recommendation below reflects the original April 2026 decision-time analysis (favoring Approach A1 / Vercel for MVP). The ratified architecture is Cloudflare (Phase 1) → GCP (Phase 2) — see §0 above. Preserved verbatim because the reasoning structure (time-to-ship, workspace precedent, cost-at-MVP, GCP-trigger conditions) remains useful when re-litigating platform choices.

Original (April 2026) recommendation: Choose Approach A1 (Pure Vercel) for MVP. Plan an evolution path to A2 (Vercel + Cloudflare for billing engine) if and when scale warrants.

Reasoning

  1. Time-to-ship is the primary risk. BC Payments launched March 2026; the strategic window to ship a BC-Payments-unified subscription app is measured in quarters, not years. Approach A1 ships in 14 weeks; Approach B takes 18–20. That 4–6 week gap is strategically material.

  2. Workspace precedent is a real multiplier. aisles-admin, ask-bc, and aisles-storefront have already paid the setup cost for Vercel + Next.js + Neon + Upstash + BigDesign. Approach B throws that away and adds a GCP learning curve the team has not yet climbed.

  3. Approach A1 → A2 evolution is clean. Migrating the billing engine from Vercel Workflow to Cloudflare Durable Objects is additive: introduce a dispatcher that routes charges to DOs for stores where the scale math justifies it, leave the Workflow path for others. This is the ask-bc worker pattern applied to billing. No rewrite.

  4. GCP's advantages don't fire at MVP scale. BigQuery analytics, Cloud Armor, Cloud Workflows maturity, single-cloud compliance — all real wins, but Phase 2/3 needs. Phase 1 doesn't touch any of them materially.

  5. The cost gap at MVP scale is meaningful (~40% cheaper). At growth scale they converge, and the cost-of-change to migrate to GCP later is non-trivial but not prohibitive.
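
The additive A1 → A2 evolution described in point 3 can be sketched as a dispatcher that picks an executor per store. This is a minimal illustration, not the ask-bc implementation: `ChargeExecutor`, `RecordingExecutor`, `Dispatcher`, and the store hashes are all hypothetical names, and the "scale math" is reduced to a plain allow-list of stores.

```typescript
// Hypothetical names throughout; nothing here is taken from the codebase.
interface ChargeExecutor {
  readonly name: string;
  execute(chargeId: string, storeHash: string): Promise<string>;
}

// Stand-in executor that only records which charges it was asked to run.
class RecordingExecutor implements ChargeExecutor {
  readonly name: string;
  readonly handled: string[] = [];

  constructor(name: string) {
    this.name = name;
  }

  async execute(chargeId: string, storeHash: string): Promise<string> {
    this.handled.push(`${storeHash}/${chargeId}`);
    return `${this.name}:${chargeId}`;
  }
}

class Dispatcher {
  private readonly workflowPath: ChargeExecutor;
  private readonly durableObjectPath: ChargeExecutor;
  private readonly storesOnDurableObjects: Set<string>;

  constructor(
    workflowPath: ChargeExecutor,
    durableObjectPath: ChargeExecutor,
    storesOnDurableObjects: string[],
  ) {
    this.workflowPath = workflowPath;
    this.durableObjectPath = durableObjectPath;
    this.storesOnDurableObjects = new Set(storesOnDurableObjects);
  }

  // Stores on the allow-list go to the Durable Object path; all others stay on Workflow.
  route(storeHash: string): ChargeExecutor {
    return this.storesOnDurableObjects.has(storeHash)
      ? this.durableObjectPath
      : this.workflowPath;
  }

  dispatch(chargeId: string, storeHash: string): Promise<string> {
    return this.route(storeHash).execute(chargeId, storeHash);
  }
}

const workflow = new RecordingExecutor("workflow");
const durableObject = new RecordingExecutor("durable-object");
const dispatcher = new Dispatcher(workflow, durableObject, ["store_big"]);

dispatcher.dispatch("charge_1", "store_small"); // stays on the Workflow path
dispatcher.dispatch("charge_2", "store_big"); // routed to the Durable Object path
```

Because callers only see `dispatch`, moving a store between paths is a change to the allow-list rather than to business logic.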

When to Reconsider Approach B

Revisit GCP if any of these become true:

  • A merchant pipeline of enterprise accounts with GCP-single-cloud requirements materializes (regulated finance, healthcare). Revenue-weighted: > 30% of pipeline.
  • Cost at steady-state exceeds $15K/month on Approach A with no clear optimization path.
  • Team gains meaningful GCP operational depth (hires, acquisitions).
  • BigQuery analytics becomes a competitive differentiator the Approach A warehouse pattern can't match.
  • Phase 2 international expansion makes the data-residency setup cost of Approach A's per-vendor configuration exceed its payoff.

What to Decide Now

  • A1 for MVP. Commit to Vercel + Neon + Upstash + Resend stack.
  • Abstract the billing engine. Write the scheduler, executor, and dunning logic against interfaces that let us swap Vercel Workflow → Cloudflare DOs → Cloud Workflows without rewriting business logic.
  • Abstract the queue. Vercel Queue is beta; wrap it behind a JobQueue interface. If it becomes a problem, swap to Upstash-backed queue or Cloudflare Queues without touching callers.
  • Abstract object storage. Behind a BlobStore interface; Vercel Blob today, swap if needed.
  • Monitor three specific metrics that would trigger A2 migration:
    • P95 charge execution latency > 5s sustained for 7 days
    • Charges per store per day > 10K on > 5% of stores
    • Cost per 1000 charges > $0.05 steady-state
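
The three A2 triggers above are simple enough to evaluate mechanically. The sketch below is one possible reading, with assumptions stated: `BillingMetrics` and `a2Triggers` are hypothetical names, and "sustained for 7 days" is interpreted as every one of the last seven daily P95 samples exceeding 5 seconds.

```typescript
// Hypothetical metric shape; the field names are illustrative assumptions.
interface BillingMetrics {
  // One P95 charge-execution latency sample per day, in seconds, newest last.
  dailyP95LatencySeconds: number[];
  // Charges per day for each store, keyed by store hash.
  chargesPerStorePerDay: Record<string, number>;
  // Steady-state infrastructure cost per 1000 charges, in USD.
  costPer1000ChargesUsd: number;
}

type Trigger = "latency" | "volume" | "cost";

function a2Triggers(m: BillingMetrics): Trigger[] {
  const fired: Trigger[] = [];

  // P95 latency > 5s on each of the last 7 daily samples.
  const lastWeek = m.dailyP95LatencySeconds.slice(-7);
  if (lastWeek.length === 7 && lastWeek.every((s) => s > 5)) {
    fired.push("latency");
  }

  // More than 5% of stores exceed 10K charges per day.
  const volumes = Object.keys(m.chargesPerStorePerDay).map(
    (store) => m.chargesPerStorePerDay[store],
  );
  const heavyStores = volumes.filter((v) => v > 10000).length;
  if (volumes.length > 0 && heavyStores / volumes.length > 0.05) {
    fired.push("volume");
  }

  // Cost per 1000 charges above $0.05.
  if (m.costPer1000ChargesUsd > 0.05) {
    fired.push("cost");
  }

  return fired;
}

// Example: 100 stores, 6 of them heavy, a slow week, and cost under threshold.
const exampleStores: Record<string, number> = {};
for (let i = 0; i < 100; i++) {
  exampleStores[`store_${i}`] = i < 6 ? 12000 : 300;
}

const exampleTriggers = a2Triggers({
  dailyP95LatencySeconds: [6, 6.5, 7, 5.2, 5.1, 8, 6],
  chargesPerStorePerDay: exampleStores,
  costPer1000ChargesUsd: 0.04,
});

console.log(exampleTriggers);
```

Returning the list of fired triggers, rather than a single boolean, keeps the reason for an A2 migration discussion visible.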

Supabase / Cloudflare Optional Components (within Approach A)

  • Supabase as Neon substitute: Skip unless the team wants Realtime (live exception queue) and Storage in one SKU. Neon is the simpler choice and matches existing workspace pattern.
  • Cloudflare Workers for widget only: Skip. Vercel Edge is sufficient for widget CDN.
  • Cloudflare for billing engine (A2): Defer until scale metrics trigger. Pattern already exists in ask-bc.

11. Open Architectural Decisions

These require a decision before engineering kickoff. Most are flagged in PRD.md §13 and BRD-specs.md Appendix C; surfacing the architectural implications here.

  1. Vercel Workflow GA / durability commitment — confirm with Vercel that Workflow meets our durability requirements for billing, or we wrap it in abstractions that allow swap-out.
  2. Vercel Queue GA timing — if still beta at MVP ship, commit to the abstraction layer approach.
  3. BC Payments MIT surface: RESOLVED (synthesis 7574bb48, decision 256591ec; further reframed by ADR-0037 2026-05-15). Day-0 spike eb94cf4a confirmed BC's BigPay layer auto-classifies API-initiated stored-instrument charges as MIT (MREC flag for recurring); production proof at Wren Laboratories. The stored-instruments vault rail (payments.bigcommerce.com) is the canonical path for any BC-supported gateway with "Stored Payments" enabled; direct Braintree SDK is a fallback capability only. ADR-0037 reframes PI-5062: it is a Worldpay/Paymetric-specific NTI threading issue, not a blocker for the standard vault rail. Adapter-side NTI persistence bridges for Worldpay/Paymetric merchants until PI-5062 closes (PRD-COMPANION D17).
  4. Neon vs Supabase Postgres — recommend Neon for pattern consistency, but Supabase is viable.
  5. Subscriber portal custom domains — wildcard cert provisioning automation. Vercel supports this; confirm pricing at scale.
  6. Multi-region from day one vs. single-region with migration path — recommend single US region for MVP; EU expansion in Phase 2.
  7. Analytics warehouse — Neon read replica for MVP; ClickHouse or BigQuery for Phase 2 when cohort/LTV queries become expensive.
  8. Merchant-tier-aware infrastructure — do enterprise-tier merchants get dedicated compute pools? Deferred to Phase 3.
  9. Storefront extension on B2B-Edition stores: RESOLVED (synthesis eb678099, ADR-0023). When the BigCommerce B2B Edition Buyer Portal is mounted on a Catalyst storefront (per the Open Source Buyer Portal working with One Click Catalyst integration guide), our subscription widget detects Buyer Portal presence (window.B2B namespace probe) and degrades to a non-purchasable "contact your account manager" CTA. Subscription enrollment for B2B buyer-orgs runs through CS tools (Epic 22) rather than the standard storefront cart-capture flow. The cart-capture handler (Epic 9) is bypassed because the Buyer Portal owns its own quote-to-cart conversion flow and our on-cart-created listener never fires for Buyer Portal-routed checkouts. Cart-merge (where we own checkout for B2B-Edition stores by intercepting Buyer Portal's flow) is deferred to v3 behind spike #113 (Epic 24 unblocker). Composite PDP architecture: standard widget renders for non-B2B contexts; B2B variant renders when Buyer Portal SDK is loaded. Reference files: core/b2b/loader.tsx, core/b2b/use-b2b-cart.ts, core/b2b/use-product-details.tsx from the Catalyst integration guide.
  10. Marketplace Merchant-of-Record (per-buyer-org processor routing): RESOLVED (synthesis eb678099, ADR-0022). Out of scope for Phase 1. Phase 1 ships single-tenant MoR (one processor connection per BC store). The multi-actor primitive (PRD-COMPANION D19) carries the v2 seam via Actor.processor_connection_ref reserved column. v2 migration is additive: adds processor_connections keyed on (store_hash, payer_org_id), backfills existing subscriptions to the store's primary connection, extends Epic 14 charge orchestration to resolve payer.processor_connection_ref ?? store.primary_connection_ref. No Phase-1 charge or subscription row is invalidated by the v2 migration.
  11. NTI freshness for long-cycle subscriptions: RESOLVED (synthesis eb678099, PRD-COMPANION D21). Phase 2 implementation slot. Workflow extension fires a MUSE verification charge before the renewal MIT when now - subscriptions.last_nti_refreshed_at > merchant.nti_freshness_days (default 365). Verification success persists fresh NTI; verification failure routes to dunning as PM-decline. Adapter contract (D17) is unchanged; only the workflow gains a verification stage before the existing MIT call.
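
Items 10 and 11 each reduce to a small decision rule. The sketch below shows both under stated assumptions: the field names (processor_connection_ref, primary_connection_ref, last_nti_refreshed_at, nti_freshness_days) come from the text above, while the interface shapes and function names are hypothetical.

```typescript
// Interface shapes are illustrative; only the field names come from the document.
interface Actor {
  processor_connection_ref: string | null; // reserved column, null in Phase 1
}

interface Store {
  primary_connection_ref: string;
}

// Item 10: prefer the payer's own connection, fall back to the store's primary one.
function resolveProcessorConnection(payer: Actor, store: Store): string {
  return payer.processor_connection_ref ?? store.primary_connection_ref;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Item 11: a verification charge is needed when the stored NTI is older than
// the merchant's freshness window (default 365 days).
function needsNtiVerification(
  now: Date,
  lastNtiRefreshedAt: Date,
  ntiFreshnessDays: number = 365,
): boolean {
  return now.getTime() - lastNtiRefreshedAt.getTime() > ntiFreshnessDays * DAY_MS;
}

const store: Store = { primary_connection_ref: "conn_store_primary" };

console.log(resolveProcessorConnection({ processor_connection_ref: null }, store));
console.log(resolveProcessorConnection({ processor_connection_ref: "conn_org_42" }, store));
console.log(
  needsNtiVerification(
    new Date("2026-06-01T00:00:00Z"),
    new Date("2025-05-01T00:00:00Z"),
  ),
);
```

With every Phase-1 row leaving processor_connection_ref null, the first rule always resolves to the store's primary connection, which is why the v2 migration can be additive.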

12. Appendix A — Sequence: End-to-End Renewal (Approach A1)

Historical: decision-time (rejected stack). This sequence depicts the Vercel-native Approach A1 (Vercel Cron/Queue/Workflow, Neon, Upstash) that was not built. It is retained only as part of the ADR-0030 decision trail. The as-built renewal / subscribe / dunning sequences (Cloudflare Cron → Worker scheduled() / scheduler.ts → BC Payments vault → D1 → Queues) are the authoritative ones: see docs/architecture/sequence-diagrams.md. Do not cite the diagram below as current.

sequenceDiagram
  participant Cron as Vercel Cron
  participant Disp as Dispatcher (Next.js API)
  participant Q as Vercel Queue
  participant WF as Vercel Workflow
  participant DB as Neon Postgres
  participant R as Upstash Redis
  participant Proc as Processor (BC Payments)
  participant BC as BigCommerce API
  participant Email as Resend

  Cron->>Disp: trigger (every 15m)
  Disp->>DB: SELECT charges due
  Disp->>Q: enqueue(charge_id × N)
  Q->>WF: start workflow(charge_id)
  WF->>R: acquire lock charge_id
  WF->>DB: load subscription + plan
  WF->>BC: GET variant stock
  WF->>BC: POST checkout price quote (tax + shipping)
  WF->>Proc: charge(amount, pm_ref, idempotency_key)
  Proc-->>WF: settled ok (txn_id)
  WF->>BC: POST /v2/orders (payment_status=captured)
  BC-->>WF: order_id
  WF->>DB: charges.status=succeeded, bc_order_id=...
  WF->>DB: schedule next charge (anchor + interval)
  WF->>Email: send renewal receipt
  WF->>R: release lock
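
The ordering in the diagram above (lock, charge with an idempotency key, create order, finalize, release) can be sketched in a stack-neutral way. This is a condensed illustration, not the rejected Vercel implementation or the as-built one: `RenewalDeps`, `runRenewal`, and `fakeDeps` are hypothetical names, and the stock check, price quote, and next-charge scheduling steps are omitted for brevity.

```typescript
// Hypothetical dependency surface mirroring the diagram's arrows; no real SDK calls.
interface RenewalDeps {
  acquireLock(chargeId: string): Promise<boolean>;
  releaseLock(chargeId: string): Promise<void>;
  loadAmountCents(chargeId: string): Promise<number>;
  charge(amountCents: number, idempotencyKey: string): Promise<string>; // txn id
  createOrder(chargeId: string, txnId: string): Promise<string>; // order id
  markSucceeded(chargeId: string, orderId: string): Promise<void>;
  sendReceipt(chargeId: string): Promise<void>;
}

type RenewalResult =
  | { status: "skipped" }
  | { status: "succeeded"; orderId: string };

async function runRenewal(chargeId: string, deps: RenewalDeps): Promise<RenewalResult> {
  // Another worker already holds the lock: do nothing rather than double-charge.
  if (!(await deps.acquireLock(chargeId))) {
    return { status: "skipped" };
  }
  try {
    const amountCents = await deps.loadAmountCents(chargeId);
    // The idempotency key is derived from the charge id so a retry reuses it.
    const txnId = await deps.charge(amountCents, `charge:${chargeId}`);
    const orderId = await deps.createOrder(chargeId, txnId);
    await deps.markSucceeded(chargeId, orderId);
    await deps.sendReceipt(chargeId);
    return { status: "succeeded", orderId };
  } finally {
    // Release the lock on both the success and the failure path.
    await deps.releaseLock(chargeId);
  }
}

// In-memory fake that records the order of calls.
function fakeDeps(log: string[]): RenewalDeps {
  const locks = new Set<string>();
  return {
    async acquireLock(chargeId) {
      if (locks.has(chargeId)) return false;
      locks.add(chargeId);
      log.push("lock");
      return true;
    },
    async releaseLock(chargeId) {
      locks.delete(chargeId);
      log.push("unlock");
    },
    async loadAmountCents() {
      log.push("load");
      return 2500;
    },
    async charge(_amountCents, idempotencyKey) {
      log.push(`charge ${idempotencyKey}`);
      return "txn_1";
    },
    async createOrder() {
      log.push("order");
      return "order_1";
    },
    async markSucceeded() {
      log.push("finalize");
    },
    async sendReceipt() {
      log.push("email");
    },
  };
}

const callLog: string[] = [];
runRenewal("charge_1", fakeDeps(callLog)).then((result) => console.log(result, callLog));
```

Placing the release in `finally` means a processor error still frees the lock, so a later retry with the same idempotency key can proceed.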

13. Appendix B — Sequence: End-to-End Renewal (Approach B)

Historical — decision-time (Phase-2 target, not yet built). This sequence depicts the GCP Approach B (Cloud Scheduler / Pub/Sub / Cloud Workflows / Cloud SQL / Memorystore) — the ratified Phase-2 migration shape per ADR-0030, not the current runtime. The as-built Phase-1 sequences live in docs/architecture/sequence-diagrams.md.

sequenceDiagram
  participant Sched as Cloud Scheduler
  participant PS as Pub/Sub
  participant WF as Cloud Workflows
  participant DB as Cloud SQL
  participant R as Memorystore
  participant Proc as Processor
  participant BC as BigCommerce API
  participant Email as Email

  Sched->>PS: publish(tick)
  PS->>WF: trigger dispatcher
  WF->>DB: SELECT charges due
  loop per charge
    WF->>WF: execute charge workflow
    WF->>R: acquire lock
    WF->>DB: load state
    WF->>BC: stock + pricing
    WF->>Proc: charge
    Proc-->>WF: result
    WF->>BC: create order
    WF->>DB: finalize
    WF->>Email: notify
  end

14. Summary Table — Capability to Service (One Page)

Historical — decision-time comparison. This one-page table compares the three candidate stacks (Approach A1 Vercel-native, A2 Vercel+Cloudflare hybrid, B GCP) that fed ADR-0030. The shipped Phase-1 stack is §0's Cloudflare column (closest to A2's Cloudflare primitives); this table is retained for the decision trail, not as a description of what runs today. For the as-built container + binding view, see docs/architecture/c4-container.md.

| Capability | Approach A1 | Approach A2 | Approach B |
| --- | --- | --- | --- |
| Admin UI | Next.js on Vercel | Next.js on Vercel | Next.js on Cloud Run |
| Subscriber Portal | Next.js on Vercel | Next.js on Vercel | Next.js on Cloud Run |
| Storefront Widget | Vercel Edge static | Vercel Edge static | Cloud CDN + Cloud Storage |
| API + Webhooks | Next.js Routes on Vercel | Next.js Routes on Vercel | Cloud Run services |
| Scheduler | Vercel Cron | Vercel Cron | Cloud Scheduler |
| Job Queue | Vercel Queue | Vercel Queue + CF Queue | Cloud Tasks + Pub/Sub |
| Charge Executor | Vercel Workflow | Cloudflare Durable Object | Cloud Workflows |
| Dunning State | Vercel Workflow | Cloudflare DO | Cloud Workflows |
| Reconciliation | Vercel Cron + Queue | Vercel Cron + Queue | Cloud Scheduler + Cloud Run Job |
| Postgres | Neon | Neon | Cloud SQL |
| Redis | Upstash | Upstash (+ DO SQLite) | Memorystore |
| Object Storage | Vercel Blob | Vercel Blob | Cloud Storage |
| Secrets | Vercel Env + OIDC | Vercel + CF secrets | Secret Manager |
| Email | Resend | Resend | SendGrid / Resend |
| SMS | Twilio | Twilio | Twilio |
| CDN | Vercel native | Vercel + Cloudflare | Cloud CDN |
| Observability | Vercel + Sentry | Vercel + CF + Sentry | Cloud Ops + Sentry |
| Analytics | Neon replica (P3: ClickHouse) | Neon replica | BigQuery |
| IaC | vercel.ts | vercel.ts + wrangler.toml | Terraform |
| CI/CD | Vercel | Vercel + Wrangler | Cloud Build |