Investigating an Alert
Read the alert summary, follow the playbook, check related alerts, and act — without having to know every detector's internals.
The shape of an alert page
Every alert page is organised top-to-bottom for how an operator actually reads an incident:
- Header — severity, status, and a live-state pill that tells you whether the underlying condition is still happening.
- What happened — a one-sentence summary in natural language. No jargon, no chipped columns. "🇬🇧 Cloudflare (AS13335) · GB is sending 2.7× its normal traffic — 1.23k req/min vs. 450 baseline."
- What to do — a detector-specific playbook with 3–5 concrete actions, plus a "When to close" criterion.
- Actions — Acknowledge / Resolve buttons, plus deep-links to related pages (ASN detail, path detail, bgp.tools, edit rule).
- Related open alerts — other alerts sharing the same target or the same detector kind. Answers "is this one of a cluster?".
- Event timeline — what happened when, in plain English.
- Rule details — the raw detector mechanics (thresholds, window sizes, context JSON). Collapsed by default; for SREs.
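The numbers in the example summary can be reproduced by hand: 1,230 req/min against a 450 req/min baseline is about 2.7×. The sketch below shows that arithmetic. The function names and the rounding rules (one decimal for the ratio, two for the "k" rate) are assumptions for illustration, not the product's actual formatter.

```python
def format_rate(req_per_min: float) -> str:
    """Compact rate label, e.g. 1230 -> '1.23k' (assumed display format)."""
    if req_per_min >= 1000:
        return f"{req_per_min / 1000:.2f}k"
    return f"{req_per_min:.0f}"


def spike_ratio(current: float, baseline: float) -> float:
    """How many times the baseline the current rate is."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return current / baseline


ratio = spike_ratio(1230, 450)  # 1230 / 450 = 2.733...
print(f"{ratio:.1f}×", format_rate(1230), "vs.", format_rate(450))
```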
Live / Cooling / Stale
Every alert has a small coloured pill near the header:
- Live (red, pulsing) — the detector evaluated this within the last 5 minutes. The underlying condition is ongoing.
- Cooling (amber) — 5–60 minutes since last update. The condition may have just stopped.
- Stale (grey) — more than an hour since last update. The incident has likely passed even if the alert is still technically open.
Use this to prioritise: Live alerts need attention now; Stale alerts can usually be resolved without investigation.
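If you pull alerts into your own tooling, the pill can be approximated from the alert's last-update time. This is a minimal sketch: the function name is hypothetical, and how the product treats the exact 5-minute and 60-minute boundaries is an assumption.

```python
from datetime import datetime, timedelta, timezone


def live_state(last_update: datetime, now: datetime) -> str:
    """Classify an alert as Live, Cooling, or Stale by time since last update."""
    age = now - last_update
    if age < timedelta(minutes=5):
        return "Live"      # evaluated within the last 5 minutes
    if age <= timedelta(minutes=60):
        return "Cooling"   # 5 to 60 minutes since last update
    return "Stale"         # more than an hour since last update


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(live_state(now - timedelta(minutes=2), now))
print(live_state(now - timedelta(minutes=30), now))
print(live_state(now - timedelta(hours=2), now))
```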
Severity
- Info — worth knowing about, not urgent
- Warning — the detector's primary threshold tripped
- Critical — threshold exceeded by 3× or novel condition (e.g. new-traffic ASN from a type we expect clean traffic from)
Severity is shown as a coloured left-rail on the Alerts list and as a pill on the detail page.
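One way to read the Warning and Critical rules is sketched below. It assumes "exceeded by 3×" means the observed value is at least three times the detector's primary threshold; that interpretation, the function name, and the parameters are all assumptions. Info is not modelled, because the rules above do not tie it to a threshold.

```python
from typing import Optional


def severity(value: float, threshold: float, novel: bool = False) -> Optional[str]:
    """Return 'Critical', 'Warning', or None if the threshold has not tripped."""
    if novel or value >= 3 * threshold:
        return "Critical"  # 3x the threshold (assumed reading) or a novel condition
    if value >= threshold:
        return "Warning"   # primary threshold tripped
    return None            # below threshold: this sketch raises no alert


print(severity(120, 100))
print(severity(300, 100))
print(severity(120, 100, novel=True))
```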
Kind filter + grouping on the Alerts list
The Alerts page groups open alerts by detector kind into collapsible sections and shows chip-style filters across the top. Click a chip to narrow the list to one detector. On the Overview dashboard, clicking a group in the Active Alerts panel deep-links straight into the same filtered list.
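The same grouping is easy to reproduce over exported alert data. The sketch below assumes each alert is a dict with hypothetical `status` and `kind` fields; the kind names are made up for the example.

```python
from collections import defaultdict
from typing import Dict, List


def group_open_by_kind(alerts: List[dict]) -> Dict[str, List[dict]]:
    """Group open alerts by detector kind, like the collapsible sections."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for alert in alerts:
        if alert["status"] == "open":
            groups[alert["kind"]].append(alert)
    return dict(groups)


alerts = [
    {"id": 1, "kind": "asn_spike", "status": "open"},
    {"id": 2, "kind": "path_scan", "status": "open"},
    {"id": 3, "kind": "asn_spike", "status": "resolved"},
    {"id": 4, "kind": "asn_spike", "status": "open"},
]
groups = group_open_by_kind(alerts)
# Clicking a chip corresponds to picking one key from the groups.
print([a["id"] for a in groups["asn_spike"]])
```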
Acknowledging vs resolving
- Acknowledge — "I've seen this, I'm investigating." The detector keeps evaluating. The alert moves to the Acknowledged tab, stops appearing in the unread count, but continues to update if the condition persists.
- Resolve — "I've taken action (or the condition cleared)." The alert moves to Resolved and stops updating.
Most detectors auto-resolve when the condition clears (e.g. an ASN's ratio drops below the multiplier for a sustained period). Manual resolution is for when you've applied a WAF rule or decided to accept the alert.
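The difference between the two actions can be pictured as a small state machine. This is an illustrative model only: the class, field, and method names are hypothetical, and it captures just the behaviour described above (acknowledged alerts keep updating but leave the unread count; resolved alerts stop updating).

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    OPEN = "open"
    ACKNOWLEDGED = "acknowledged"
    RESOLVED = "resolved"


@dataclass
class Alert:
    status: Status = Status.OPEN
    updates: int = 0

    def acknowledge(self) -> None:
        """'I've seen this, I'm investigating.'"""
        if self.status is Status.OPEN:
            self.status = Status.ACKNOWLEDGED

    def resolve(self) -> None:
        """'I've taken action (or the condition cleared).'"""
        self.status = Status.RESOLVED

    def record_evaluation(self) -> bool:
        """Apply a detector update; resolved alerts ignore it."""
        if self.status is Status.RESOLVED:
            return False
        self.updates += 1
        return True

    @property
    def counts_as_unread(self) -> bool:
        return self.status is Status.OPEN


alert = Alert()
alert.acknowledge()
print(alert.counts_as_unread, alert.record_evaluation())
alert.resolve()
print(alert.record_evaluation())
```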
The "When to close" hint
Every detector has a specific close criterion written into its playbook. Example from ASN Spike: "Close when this (ASN, country) pair's ratio returns below 1.5× the type-specific multiplier for 15 minutes, or when a WAF mitigation is in place."
This takes the "have I done enough?" guesswork out of resolution. If the criterion is met, close. If not, acknowledge and keep watching.
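To make the ASN Spike criterion concrete, here is a rough check you could run against a series of ratio samples. It assumes samples arrive at regular intervals with no gaps, that the limit is 1.5 times the type-specific multiplier, and that "for 15 minutes" means the trailing run of below-limit samples spans at least 15 minutes. All names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

Sample = Tuple[datetime, float]  # (evaluation time, observed ratio)


def quiet_since(samples: List[Sample], limit: float) -> Optional[datetime]:
    """Start of the trailing run of samples below `limit`, or None."""
    start = None
    for ts, ratio in reversed(samples):
        if ratio >= limit:
            break
        start = ts
    return start


def can_close(samples: List[Sample], multiplier: float, now: datetime,
              waf_mitigated: bool = False, factor: float = 1.5,
              hold: timedelta = timedelta(minutes=15)) -> bool:
    """True if a WAF mitigation is in place or the ratio has stayed low for `hold`."""
    if waf_mitigated:
        return True
    start = quiet_since(samples, factor * multiplier)
    return start is not None and now - start >= hold


t0 = datetime(2024, 1, 1, 12, 0)
samples = [
    (t0, 3.5),
    (t0 + timedelta(minutes=5), 2.9),
    (t0 + timedelta(minutes=10), 2.0),
    (t0 + timedelta(minutes=20), 1.5),
]
# With a multiplier of 2.0 the limit is 3.0; the quiet run starts at t0 + 5 min.
print(can_close(samples, 2.0, now=t0 + timedelta(minutes=20)))
```

If the check returns False, the guidance above still applies: acknowledge and keep watching.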
Related open alerts
If other open alerts share the same target (same ASN / path / source) or the same detector kind, they appear in the Related open alerts card, which has two blocks:
- Same target — other alerts on this exact dimension (e.g. three detectors all firing on AS13335 in GB = one incident, three alerts)
- Other open [kind] alerts — other alerts of the same detector kind on different targets
Use this to avoid duplicate investigation work and to build a fuller picture of the incident.
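The two blocks amount to two filters over the open alerts. The sketch below assumes each alert carries hypothetical `id`, `kind`, and `target` fields, with the target written as an (ASN, country) tuple.

```python
from typing import List, Tuple


def related_open_alerts(current: dict, open_alerts: List[dict]) -> Tuple[List[dict], List[dict]]:
    """Split other open alerts into 'same target' and 'same kind, different target'."""
    others = [a for a in open_alerts if a["id"] != current["id"]]
    same_target = [a for a in others if a["target"] == current["target"]]
    same_kind = [
        a for a in others
        if a["kind"] == current["kind"] and a["target"] != current["target"]
    ]
    return same_target, same_kind


open_alerts = [
    {"id": 1, "kind": "asn_spike", "target": ("AS13335", "GB")},
    {"id": 2, "kind": "new_asn", "target": ("AS13335", "GB")},
    {"id": 3, "kind": "asn_spike", "target": ("AS15169", "US")},
    {"id": 4, "kind": "path_scan", "target": ("/login",)},
]
same_target, same_kind = related_open_alerts(open_alerts[0], open_alerts)
print([a["id"] for a in same_target], [a["id"] for a in same_kind])
```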
Pausing alert emails
If you're in the middle of a known incident and don't want more notification emails, use Settings → Pause alert emails. The detector keeps firing and building an audit trail; webhooks still fire (so external integrations keep working); you just stop getting emails.
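In effect, the pause setting gates only the email channel. The sketch below models that behaviour with hypothetical names; the callables and the audit list stand in for whatever the real notification pipeline uses.

```python
from typing import Callable, List


def dispatch(alert: dict, emails_paused: bool,
             send_email: Callable[[dict], None],
             send_webhook: Callable[[dict], None],
             audit_log: List[dict]) -> None:
    """Record and notify for one alert, skipping email while paused."""
    audit_log.append(alert)   # the audit trail is always built
    send_webhook(alert)       # webhooks still fire while paused
    if not emails_paused:
        send_email(alert)     # only emails are suppressed


emails: List[dict] = []
webhooks: List[dict] = []
audit: List[dict] = []
dispatch({"id": 1}, True, emails.append, webhooks.append, audit)
dispatch({"id": 2}, False, emails.append, webhooks.append, audit)
print(len(emails), len(webhooks), len(audit))
```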