Checks

A check is a single execution of your test suite against your app. Think of flows as “what to test” and checks as “when we tested it”.

The lifecycle of a check

Every check moves through these states:

  • scheduled — the check has been created (manually, via cron, via the GitHub App, or via API) and is waiting to be picked up by a worker.
  • running — a worker has the check and is actively executing flows.
  • finished — all selected flows ran. The check shows pass/fail/skip counts and per-flow results.
  • failed — the check itself crashed (e.g. the app URL was unreachable). This is different from a check that finished with failing flows.
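
The states above can be modelled as a small state machine. The sketch below is illustrative only: the `CheckStatus` type, the transition table, and both helper functions are assumptions inferred from the descriptions above, not Qassandra's actual schema. In particular, it assumes a check only fails after a worker has picked it up.

```typescript
// Hypothetical model of the check lifecycle; all names are illustrative.
type CheckStatus = "scheduled" | "running" | "finished" | "failed";

// Assumed transitions: a worker picks up a scheduled check, then the
// check either completes (finished) or crashes (failed).
const allowedTransitions: Record<CheckStatus, readonly CheckStatus[]> = {
  scheduled: ["running"],
  running: ["finished", "failed"],
  finished: [], // terminal
  failed: [], // terminal
};

function canTransition(from: CheckStatus, to: CheckStatus): boolean {
  return allowedTransitions[from].includes(to);
}

// A finished check can still contain failing flows; only a crash of the
// check itself produces the failed status.
function isTerminal(status: CheckStatus): boolean {
  return allowedTransitions[status].length === 0;
}
```

Keeping the transitions in a table makes the difference between failed and “finished with failing flows” explicit: flow outcomes never change the check status.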

What happens during a check

  1. Bootstrap (if needed) — if the app has zero flows, the check first runs discovery and auto-approves the discovered flows.
  2. Optional discovery pass — if the check was triggered with discover_flows: true, an explicit discovery pass runs in parallel.
  3. Flow selection — the check reads any attached git/PR metadata and asks an LLM to decide which flows are likely affected. If no useful metadata is available, it runs all flows.
  4. Parallel execution — selected flows run in parallel with a bounded concurrency limit. Each flow runs in its own browser context.
  5. Per-step screenshots — every step captures a screenshot. These are compared against the latest baseline (see Visual regression).
  6. Assertions — after the steps for a flow complete, the AI verifier evaluates each assertion against the final state.
  7. Reporting — results, screenshots, diffs, and any new flow suggestions are saved to the check. The GitHub status (if applicable) is updated.
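
Step 4 can be pictured as a small worker pool. The sketch below is a generic bounded-concurrency pattern, not Qassandra's actual executor: `runWithConcurrency` is a hypothetical helper, and the real concurrency limit is not documented here.

```typescript
// Runs `worker` over `items` with at most `limit` tasks in flight.
// Results are returned in the same order as `items`.
async function runWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;

  // Each lane claims the next unprocessed index until the list is exhausted.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

In this model each call to `worker` would run one flow in its own browser context. If any worker rejects, the returned promise rejects too, so a real executor would likely catch per-flow errors and record them as flow failures instead.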

Smart flow selection

Running every flow on every change is wasteful — a mature app may have hundreds. When you attach git or PR metadata to a check, Qassandra uses it to decide which flows are relevant:

  • The PR title and description are passed to an LLM alongside the flow names and categories.
  • The LLM classifies each flow as likely affected, possibly affected, or unaffected.
  • Only the likely affected and possibly affected flows are executed.
  • Skipped flows are recorded so the UI can explain “skipped because the PR only changed billing code”.

When no metadata is available (e.g. a manual button-click check), Qassandra falls back to running every active flow.
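
The selection rule, including the fallback, can be sketched as a pure function. Everything here is hypothetical: the `Flow` and `Relevance` shapes are invented for illustration, `rankings` stands in for whatever the LLM returns, and treating an unranked flow as possibly affected is an assumption rather than documented behaviour.

```typescript
// Hypothetical shapes; the real flow and ranking schema may differ.
type Relevance = "likely_affected" | "possibly_affected" | "unaffected";

interface Flow {
  id: string;
  name: string;
  active: boolean;
}

interface Selection {
  run: Flow[];
  skipped: Flow[];
}

// `rankings` maps flow id to relevance. Pass `null` when the check has
// no git/PR metadata attached.
function selectFlows(
  flows: readonly Flow[],
  rankings: ReadonlyMap<string, Relevance> | null,
): Selection {
  const active = flows.filter((flow) => flow.active);

  // Fallback: with no metadata there is nothing to rank, so run everything.
  if (rankings === null) {
    return { run: active, skipped: [] };
  }

  const run: Flow[] = [];
  const skipped: Flow[] = [];
  for (const flow of active) {
    // Assumption: a flow missing from the rankings is run, not skipped.
    const relevance = rankings.get(flow.id) ?? "possibly_affected";
    (relevance === "unaffected" ? skipped : run).push(flow);
  }
  return { run, skipped };
}
```

Returning the skipped flows alongside the selected ones mirrors the point above that skips are recorded, so the UI has something to explain.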

How a check gets triggered

  • Manually — the Trigger check button in the app dashboard.
  • Scheduled — a per-app cron expression (UTC). See Scheduled checks.
  • GitHub App — automatically on opened or synchronised pull requests on the connected repo. See GitHub App.
  • HTTP API — POST /api/apps/{appId}/trigger from your CI, deploy webhook, or any other system. See Trigger a check.
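
For the HTTP API route, a request could be assembled as in the sketch below. Only the path and the `discover_flows` option come from this page. The base URL, the `buildTriggerRequest` helper, and sending `discover_flows` as a JSON body field are assumptions, and authentication is deliberately left out because it is not described here. See Trigger a check for the actual contract.

```typescript
interface TriggerOptions {
  discoverFlows?: boolean;
}

interface TriggerRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds the request without sending it.
function buildTriggerRequest(
  baseUrl: string,
  appId: string,
  options: TriggerOptions = {},
): TriggerRequest {
  // Strip trailing slashes so the path is not doubled up.
  const root = baseUrl.replace(/\/+$/, "");
  const url = `${root}/api/apps/${encodeURIComponent(appId)}/trigger`;

  // Assumption: options travel as a JSON body.
  const payload: Record<string, unknown> = {};
  if (options.discoverFlows !== undefined) {
    payload.discover_flows = options.discoverFlows;
  }

  return {
    url,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}
```

From a CI job, the result could then be sent with `fetch(request.url, request)` once the required credentials have been added to `headers`.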

One queue, many sources

No matter how a check is triggered, it ends up as a row in the checks table with status scheduled. The same shared check processor runs them in production (via a Supabase Edge Function webhook) and locally (via scripts/process-checks-queue.ts), so behaviour is identical across environments.
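
The single-queue idea can be illustrated with an in-memory stand-in for the checks table. This is a sketch under stated assumptions: the `CheckRow` fields, the `CheckQueue` class, and first-in-first-out claiming are all hypothetical, and the real processor works against the database rather than an array.

```typescript
type TriggerSource = "manual" | "cron" | "github_app" | "api";

// Hypothetical row shape; the real checks table has its own columns.
interface CheckRow {
  id: number;
  appId: string;
  source: TriggerSource;
  status: "scheduled" | "running" | "finished" | "failed";
}

class CheckQueue {
  private rows: CheckRow[] = [];
  private nextId = 1;

  // Every trigger source goes through the same path and produces the
  // same kind of row, always starting as scheduled.
  enqueue(appId: string, source: TriggerSource): CheckRow {
    const row: CheckRow = { id: this.nextId++, appId, source, status: "scheduled" };
    this.rows.push(row);
    return row;
  }

  // Assumption: the processor claims the oldest scheduled row first.
  claimNext(): CheckRow | undefined {
    const row = this.rows.find((r) => r.status === "scheduled");
    if (row) {
      row.status = "running";
    }
    return row;
  }
}
```

Because the processor only ever looks at rows, it does not need to know which source created a check, which is what keeps behaviour the same in production and locally.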

Reading a check report

The check detail page shows:

  • An overall pass/fail/skip summary.
  • Per-flow rows you can expand to see steps and screenshots.
  • The Browserbase replay link for each flow run (when Browserbase is enabled).
  • Visual diffs against the baseline, classified by severity.
  • Newly discovered flow suggestions, ready to approve or reject.
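
The overall summary is a tally of the per-flow results. A minimal sketch, assuming each flow result carries one of three outcomes (the `FlowResult` shape and `summarise` function are hypothetical, not the report's actual data model):

```typescript
type FlowOutcome = "pass" | "fail" | "skip";

// Hypothetical per-flow result; the real report holds much more detail.
interface FlowResult {
  flowName: string;
  outcome: FlowOutcome;
}

// Counts how many flows ended in each outcome.
function summarise(results: readonly FlowResult[]): Record<FlowOutcome, number> {
  const counts: Record<FlowOutcome, number> = { pass: 0, fail: 0, skip: 0 };
  for (const result of results) {
    counts[result.outcome] += 1;
  }
  return counts;
}
```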