A/B testing subject lines and email bodies

How to set up an A/B test, what sample size and metric to choose, and how the winner is auto-promoted.

3 min read · Last updated 17 June 2026

A/B testing lets you send two variants of a campaign to a slice of your audience, measure which performs better, and automatically send the winner to everyone else.

What you can test

Today you can A/B test on one of two variables:

Subject line — same email body, two different subject lines.
Email body — same subject line, two different bodies (different copy, layout, CTA, anything).

Testing the from-name or the send time, and running three-way tests, are deferred to a future release. Stick to subject or body for now.

Setting it up

Open the campaign you want to test. In the editor, click Add A/B variant at the top.

You'll then configure:

Variant A vs Variant B: edit each variant. The new variant starts as a copy of the existing one, so you only change what you want to test.
Sample size: the percentage of your audience that receives a test variant, split 50/50 between A and B. Defaults to 50%, meaning 25% gets A, 25% gets B, and the remaining 50% is held back for the winner.
Duration: how long we measure before declaring a winner. Defaults to 24 hours. Choose a longer window if your list has slow openers.
Metric: what we're optimising for. Pick one:
- Open rate — best for subject-line tests.
- Click rate — best for body tests.
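
To make the sample-size arithmetic concrete, here is a minimal Python sketch of how a list splits for a given setting. The function name `split_audience` and the rounding of odd counts are assumptions for illustration, not a description of how the product rounds internally.

```python
def split_audience(audience_size: int, sample_pct: float = 50.0) -> dict[str, int]:
    """Return how many recipients land in A, B, and the held-back remainder."""
    if not 0 < sample_pct <= 100:
        raise ValueError("sample_pct must be greater than 0 and at most 100")
    # Assumption: fractional recipients are rounded down into the remainder.
    test_slice = int(audience_size * sample_pct / 100)
    variant_a = test_slice // 2
    variant_b = test_slice - variant_a  # B takes the odd recipient, if any
    held_back = audience_size - test_slice
    return {"a": variant_a, "b": variant_b, "held_back": held_back}


print(split_audience(10_000))    # {'a': 2500, 'b': 2500, 'held_back': 5000}
print(split_audience(800, 100))  # {'a': 400, 'b': 400, 'held_back': 0}
```

With the default 50% setting, a 10,000-person list sends 2,500 emails per variant and holds 5,000 back for the winner.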

What happens at send time

We send to the test slice first (e.g. 25% gets A, 25% gets B).
We measure for the configured duration.
At the end, we calculate the difference between A and B on your chosen metric and run a statistical-significance check.
If one variant wins at statistical significance, it is sent to the held-back remainder.
If neither variant won at statistical significance, we declare a draw and send variant A to the remainder by default.
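
The decision at the end of the measurement window can be sketched as follows. This article does not say which statistical test is used, so the pooled two-proportion z-test and the 0.05 threshold below are assumptions chosen for illustration, as are the function names.

```python
import math


def two_proportion_p_value(success_a: int, n_a: int, success_b: int, n_b: int) -> float:
    """Two-sided p-value for a pooled two-proportion z-test (assumed test)."""
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # both rates are 0% or 100%: no evidence of a difference
    z = (success_a / n_a - success_b / n_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


def pick_variant(success_a: int, n_a: int, success_b: int, n_b: int,
                 alpha: float = 0.05) -> tuple[str, str]:
    """Return (variant to send to the remainder, outcome)."""
    p_value = two_proportion_p_value(success_a, n_a, success_b, n_b)
    if p_value >= alpha:
        return ("A", "draw")  # no significant winner: fall back to variant A
    winner = "A" if success_a / n_a > success_b / n_b else "B"
    return (winner, "winner")


print(pick_variant(500, 2_500, 600, 2_500))  # ('B', 'winner')
print(pick_variant(20, 100, 22, 100))        # ('A', 'draw')
```

Here `success_a` and `success_b` stand for opens or clicks, depending on the metric you picked. The second call shows the draw case: B looks slightly better, but the gap is not significant, so A goes to the remainder.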

The honesty badge

After each test, the campaign detail page shows a significance badge:

✅ Significant: the difference between A and B was large enough that it is very unlikely to be chance. Trust the result.
⚠️ Inconclusive: one variant looked better, but the sample wasn't big enough to rule out chance. Treat it as a hint, not a verdict.
❌ No difference: A and B performed roughly the same. Neither is clearly better.

This matters. A 2-percentage-point difference between A and B at 200 recipients is most likely noise. The same difference at 20,000 recipients is very likely real. The badge tells you which case you're in.
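
To see why the same gap means different things at different sizes, the sketch below computes an approximate 95% interval for the difference between two rates. The 20% and 22% open rates are made-up numbers, the recipients are assumed to be split evenly between A and B, and the normal approximation is a textbook method rather than a description of how the badge is calculated.

```python
import math


def diff_confidence_interval(rate_a: float, rate_b: float, n_per_variant: int,
                             z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% interval for (rate_b - rate_a), normal approximation."""
    se = math.sqrt(rate_a * (1 - rate_a) / n_per_variant
                   + rate_b * (1 - rate_b) / n_per_variant)
    diff = rate_b - rate_a
    return (diff - z * se, diff + z * se)


for total_recipients in (200, 20_000):
    low, high = diff_confidence_interval(0.20, 0.22, total_recipients // 2)
    print(f"{total_recipients:>6} recipients: {low:+.3f} to {high:+.3f}")
# 200 recipients:    -0.093 to +0.133  (includes zero, so could be noise)
# 20,000 recipients: +0.009 to +0.031  (excludes zero, so likely real)
```

At 200 recipients the interval comfortably spans zero, so B could just as easily be worse than A. At 20,000 the whole interval sits above zero.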

Sample sizing — what to pick

A few rules of thumb:

If your audience is under 1,000, set sample size to 100% — there's no held-back remainder. You're really just running both variants and seeing what each did. Significance will rarely be reached, but you'll learn something.
If your audience is 1,000–10,000, set sample size to 20–30%. Enough to detect a real difference, with a sizable remainder for the winner.
If your audience is 10,000+, set sample size to 10–20%. Smaller test slices, plenty of statistical power.
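
If you plan campaigns in a script or spreadsheet, the rules of thumb above can be encoded as a small lookup. This is only a sketch: `suggested_sample_pct` is a hypothetical helper, and treating an audience of exactly 10,000 as part of the larger band is an assumption, since the ranges above overlap at that point.

```python
def suggested_sample_pct(audience_size: int) -> tuple[int, int]:
    """Return the (low, high) sample-size percentage from the rules of thumb."""
    if audience_size < 1_000:
        return (100, 100)  # no held-back remainder
    if audience_size < 10_000:
        return (20, 30)
    return (10, 20)  # assumption: exactly 10,000 falls in this band


for size in (800, 5_000, 50_000):
    low, high = suggested_sample_pct(size)
    print(f"{size:>6} recipients: sample size {low}% to {high}%")
```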

When not to A/B test

One-off important campaigns — your product launch shouldn't have a B variant nobody approved.
Tiny lists — if your audience is under 500, the test won't reach significance and you're just delaying the send.
Time-sensitive campaigns — if you have to send in the next hour, skip the test. The measurement window doesn't fit.