Checks

A check is a single execution of your test suite against your app. Think of flows as “what to test” and checks as “when we tested it”.

The lifecycle of a check

Every check moves through these states:

scheduled — the check has been created (manually, via cron, via the GitHub App, or via API) and is waiting to be picked up by a worker.
running — a worker has the check and is actively executing flows.
finished — all selected flows ran. The check shows pass/fail/skip counts and per-flow results.
failed — the check itself crashed (e.g. the app URL was unreachable). Different from “finished with failing flows”.
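
The four states above can be modelled as a small state machine. This is a minimal sketch, not Qassandra's implementation: `CheckStatus`, `canTransition`, and `isTerminal` are hypothetical names, and the allowed transitions are an assumption inferred from the state descriptions.

```typescript
// Hypothetical model of the four check states described above.
type CheckStatus = "scheduled" | "running" | "finished" | "failed";

// Assumed transitions: a worker picks up a scheduled check, which then
// either finishes or crashes. Terminal states have no outgoing transitions.
const TRANSITIONS: Record<CheckStatus, readonly CheckStatus[]> = {
  scheduled: ["running"],
  running: ["finished", "failed"],
  finished: [],
  failed: [],
};

// True when moving from `from` to `to` is allowed by the table above.
function canTransition(from: CheckStatus, to: CheckStatus): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}

// A finished check may still contain failing flows; only a crash of the
// check itself produces the "failed" status. Both states are terminal.
function isTerminal(status: CheckStatus): boolean {
  return TRANSITIONS[status].length === 0;
}
```

Under these assumptions, `canTransition("scheduled", "running")` is allowed, while `canTransition("finished", "running")` is not.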

What happens during a check

Bootstrap (if needed) — if the app has zero flows, the check first runs discovery and auto-approves the discovered flows.
Optional discovery pass — if the check was triggered with discover_flows: true, an explicit discovery pass runs in parallel.
Flow selection — the check reads any attached git/PR metadata and asks an LLM to decide which flows are likely affected. If no useful metadata is available, it runs all flows.
Parallel execution — selected flows run in parallel with a bounded concurrency limit. Each flow runs in its own browser context.
Per-step screenshots — every step captures a screenshot. These are compared against the latest baseline (see Visual regression).
Assertions — after the steps for a flow complete, the AI verifier evaluates each assertion against the final state.
Reporting — results, screenshots, diffs, and any new flow suggestions are saved to the check. The GitHub status (if applicable) is updated.
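
The parallel execution step can be illustrated with a generic bounded-concurrency runner. This is a sketch of the general technique, not Qassandra's code: `runWithLimit`, the task shape, and the concurrency value are all hypothetical.

```typescript
// Run async tasks with at most `limit` in flight at once.
// Results are returned in the same order as the input tasks.
async function runWithLimit<T>(
  tasks: Array<() => Promise<T>>,
  limit: number,
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0;

  // Each worker repeatedly claims the next unclaimed task index.
  // Claiming is synchronous, so two workers never take the same task.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  // Never start more workers than there are tasks, and always at least one.
  const workerCount = Math.max(1, Math.min(limit, tasks.length));
  const workers: Promise<void>[] = [];
  for (let i = 0; i < workerCount; i++) workers.push(worker());
  await Promise.all(workers);
  return results;
}
```

A call such as `runWithLimit(flows.map((flow) => () => runFlow(flow)), 4)` would keep at most four flows running at a time, where `runFlow` is a hypothetical function that opens a fresh browser context for one flow. In this sketch a rejected task rejects the whole run, so a flow runner would normally catch its own errors and return a failed result instead.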

Smart flow selection

Running every flow on every change is wasteful — a mature app may have hundreds. When you attach git or PR metadata to a check, Qassandra uses it to decide which flows are relevant:

The PR title and description are passed to an LLM alongside the flow names and categories.
The LLM classifies each flow as likely affected, possibly affected, or unaffected.
Only the likely-affected and possibly-affected flows are executed.
Skipped flows are recorded so the UI can explain why they did not run, e.g. “skipped because the PR only changed billing code”.

When no metadata is available (e.g. a manual button-click check), Qassandra falls back to running every active flow.
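
The selection and fallback rules can be sketched as a pure function over the LLM's output. The `Flow` and `Relevance` shapes, the `selectFlows` name, and the decision to run flows the LLM did not classify are assumptions made for illustration, not documented behaviour.

```typescript
// Hypothetical shapes; the real flow and classification records may differ.
type Relevance = "likely" | "possibly" | "unaffected";

interface Flow {
  name: string;
  category: string;
  active: boolean;
}

interface Selection {
  run: Flow[];
  skipped: Flow[];
}

// `rankings` stands in for the LLM output, keyed by flow name.
// Pass `null` when the check has no git or PR metadata.
function selectFlows(
  flows: Flow[],
  rankings: Map<string, Relevance> | null,
): Selection {
  const active = flows.filter((flow) => flow.active);

  // No metadata: fall back to running every active flow.
  if (rankings === null) return { run: active, skipped: [] };

  const run: Flow[] = [];
  const skipped: Flow[] = [];
  for (const flow of active) {
    // Assumption: a flow the LLM did not classify is run rather than skipped.
    const relevance = rankings.get(flow.name) ?? "possibly";
    if (relevance === "unaffected") skipped.push(flow);
    else run.push(flow);
  }
  return { run, skipped };
}
```

Returning the skipped flows alongside the selected ones keeps enough information to report why a flow did not run.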

How a check gets triggered

Manually — the Trigger check button in the app dashboard.
Scheduled — a per-app cron expression (UTC). See Scheduled checks.
GitHub App — automatically on opened or synchronised pull requests on the connected repo. See GitHub App.
HTTP API — POST /api/apps/{appId}/trigger from your CI, deploy webhook, or any other system. See Trigger a check.
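
Triggering a check over HTTP from a CI script might look like the sketch below. Only the `POST /api/apps/{appId}/trigger` path and the `discover_flows` option come from this page. The base URL, the bearer-token header, and sending `discover_flows` in a JSON body are assumptions, so check them against the Trigger a check reference before use.

```typescript
// Hypothetical inputs: base URL and token-based auth are assumptions.
interface TriggerOptions {
  baseUrl: string;
  appId: string;
  token: string;
  discoverFlows?: boolean;
}

interface TriggerRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the request without sending it, so it can be inspected or tested.
function buildTriggerRequest(opts: TriggerOptions): TriggerRequest {
  const base = opts.baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${base}/api/apps/${encodeURIComponent(opts.appId)}/trigger`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${opts.token}`, // assumed auth scheme
    },
    // Assumption: discover_flows is passed in the JSON body.
    body: JSON.stringify({ discover_flows: opts.discoverFlows ?? false }),
  };
}

// Send the request with the global fetch (Node 18+ or a browser).
async function triggerCheck(opts: TriggerOptions): Promise<number> {
  const req = buildTriggerRequest(opts);
  const res = await fetch(req.url, {
    method: req.method,
    headers: req.headers,
    body: req.body,
  });
  if (!res.ok) throw new Error(`Trigger failed with HTTP ${res.status}`);
  return res.status;
}
```

Separating `buildTriggerRequest` from `triggerCheck` keeps the URL and payload logic testable without network access.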

One queue, many sources

No matter how a check is triggered, it ends up as a row in the checks table with status scheduled. The same shared check processor runs them in production (via a Supabase Edge Function webhook) and locally (via scripts/process-checks-queue.ts), so behaviour is identical across environments.
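
The single-queue idea can be illustrated with an in-memory stand-in for the checks table. Everything here is hypothetical: the `CheckQueue` class, the row shape, the `source` labels, and the oldest-first claiming order are assumptions for illustration, and the real processor works against database rows rather than an array.

```typescript
// Hypothetical labels for the four trigger sources described above.
type Source = "manual" | "cron" | "github" | "api";

interface CheckRow {
  id: number;
  source: Source;
  status: "scheduled" | "running" | "finished" | "failed";
}

// In-memory stand-in for the checks table; illustrative only.
class CheckQueue {
  private rows: CheckRow[] = [];
  private nextId = 1;

  // Every trigger source produces the same kind of row.
  enqueue(source: Source): CheckRow {
    const row: CheckRow = { id: this.nextId++, source, status: "scheduled" };
    this.rows.push(row);
    return row;
  }

  // Claim the oldest scheduled check, regardless of where it came from,
  // and mark it as running. Returns undefined when nothing is waiting.
  claimNext(): CheckRow | undefined {
    const row = this.rows.find((r) => r.status === "scheduled");
    if (row) row.status = "running";
    return row;
  }
}
```

Because the processor only looks at `status`, a check triggered by cron and one triggered through the API follow exactly the same path once enqueued.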

Reading a check report

The check detail page shows:

An overall pass/fail/skip summary.
Per-flow rows you can expand to see steps and screenshots.
The Browserbase replay link for each flow run (when Browserbase is enabled).
Visual diffs against the baseline, classified by severity.
Newly discovered flow suggestions, ready to approve or reject.