Stop Guessing Your Subject Lines: A/B Test Them on Autopilot

A detailed walkthrough of wiring a behaviour-triggered, always-on A/B test for trial-start email subject lines in Spreeflo, covering journey setup, random splits, tagging, analytics, and how to turn subject-line testing into a habitual growth system.

CartWizard’s founder knew the “welcome to your free trial” email was doing some work. Installs kept coming in from the Shopify App Store, but trial-to-paid conversion had flatlined. Open rates hovered around 32%, then 31%, then 33%. Close enough to noise that it felt pointless to tweak.

So he did what most of us do: rewrote the subject line every few weeks based on gut. Sometimes it bumped opens a little. Sometimes it tanked them. There was no clean way to tell what actually worked, and no time to run manual experiments.

The sequence at the top of this page is the opposite of that. It’s a tiny, behaviour-triggered A/B test that runs 24/7, splits new trial users 50/50, and tells you which subject line wins on real data.

In this article, we’ll unpack that journey node by node, show how it’s configured in Spreeflo, and talk about how to read the results. The goal isn’t to turn you into a full-time growth analyst. It’s to set up a small system once, so your subject lines quietly improve while you ship features.

Founder-led businesses win on leverage, not headcount. This is what that looks like in practice.

Why subject-line tests are such an efficient dial for Shopify apps

If you sell a Shopify or ecommerce app, your install funnel is brutally simple:

Merchant discovers you on the app store.
They install and poke around.
Either they hit activation and see value fast, or they drift away.

Your trial-start email is the one message every new account sees. If that subject line gets 10–20% more people to open, every downstream metric has a better chance: feature activation, first revenue recovered, first report viewed.

This is why subject-line tests punch above their weight:

They’re fast to create. You don’t need new designs or flows.
They’re low-risk. The body of the email stays the same.
They’re reusable. Once the experiment harness exists, you can plug it into any point in your lifecycle: trial start, upgrade nudges, win-backs.

The journey on this page is that harness. Let’s walk it step by step.

The journey at a glance: behaviour-triggered, 50/50 split

Before we zoom in, here’s the high-level shape of the sequence at the top of this page:

Trigger: a behaviour, like trial_started or app_installed.
Time Delay: wait 1 hour.
Random Split: send 50% of contacts down path A and 50% down path B.
Path A: Send Email with Subject A.
Path B: Send Email with Subject B.
Optional: Add Tag on each branch, then Merge back into your standard onboarding.

Everything else you do with the data happens in analytics and segments. No complicated in-journey reporting logic.

Now, onto the nodes.

Step 1: Start from a behaviour, not a one-off list

At the very top of the sequence you’ll see a trigger node. For our ecommerce app example, we’ll use a Custom Event trigger:

Event name: trial_started (or app_installed)
Re-enrollment: off (isReEnrollment=false)

You’ll send that event into Spreeflo from your backend or via the SDK whenever a merchant starts a trial. The Spreeflo API makes that part straightforward: one event per trial, tagged with the contact’s email.
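To make that concrete, here is a minimal sketch of the payload your backend might assemble when a trial starts. The field names (event, email, properties, occurred_at) and the helper build_trial_started_event are assumptions for illustration, not Spreeflo's documented schema, so check the API reference for the real shape before sending anything.

```python
import json
from datetime import datetime, timezone


def build_trial_started_event(email: str, store_name: str) -> dict:
    """Assemble an illustrative trial_started payload.

    Field names here are hypothetical, not Spreeflo's documented schema.
    """
    return {
        "event": "trial_started",          # event name the journey trigger listens for
        "email": email,                    # identifies the contact
        "properties": {"store_name": store_name},
        "occurred_at": datetime.now(timezone.utc).isoformat(),
    }


payload = build_trial_started_event("owner@example-store.com", "Example Store")
print(json.dumps(payload, indent=2))
```

The point of the sketch is the shape: one event per trial, keyed to the contact's email, with any attributes the email body needs riding along as properties.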

Why a behavioural trigger?

Timing stays relevant. The email always goes out right after they install or start a trial, not whenever you remember to run a campaign.
The test runs forever. You don’t have to “launch an experiment.” Every new trial becomes part of it.

We keep re-enrollment off here because we want each merchant to count once in this experiment. If a contact later triggers trial_started again (say, they uninstall and reinstall), we don’t want them seeing two variants and polluting the data.

If your product’s concept of “trial user” is defined by attributes instead of events, you can swap this node for a Criteria Match trigger with the right filters, or a Join Segment trigger tied to a saved “In Trial” segment. The rest of the journey doesn’t change.

Step 2: A short pause so the email lands at the right moment

Immediately after the trigger, we add a Time Delay node:

Unit: Hour(s)
Count: 1

This does two useful things:

It avoids pinging the user while they’re still in the middle of installation or onboarding inside Shopify.
It gives your app time to create any account data you reference in the email (store name, first recommended action, etc.).

You can experiment with this delay, but start with 1 hour. Short enough to feel immediate, long enough to not be noise during setup. If you already have a more elaborate onboarding sequence, just make sure this delay keeps you compliant with your overall pacing rules (no back-to-back emails).

Step 3: Split traffic cleanly with Random Split

Next comes the heart of the experiment: a Random Split process node.

Configuration:

Branch A label: something simple like subject_a
Branch B label: subject_b
Percentage weight: 0.5 (i.e., 50% to branch A, 50% to branch B)

This node sends each contact down one of two paths at random, based on the chosen weighting. At 50/50, both variants get equal traffic over time.

A few points that matter:

Keep the percentage at 50/50 while you’re running the initial test. Tilting it early starves one variant of data, and changing the weighting mid-test mixes time periods unevenly between the two variants.
Random Split is better than running two separate journeys, because Spreeflo controls the assignment centrally. A contact can only be in one path at a time.
If you later want to keep testing, you can adjust the percentage to, say, 90/10 so most new trials see the current winner but a trickle continue hitting new ideas.
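If it helps to picture what a weighted random split does, the sketch below simulates one in plain Python. It is not how Spreeflo implements the node (that isn't described here); assign_branch is a hypothetical helper that only illustrates the idea of a per-contact coin flip with a configurable weight.

```python
import random
from collections import Counter


def assign_branch(rng: random.Random, weight_a: float = 0.5) -> str:
    """Return 'subject_a' with probability weight_a, otherwise 'subject_b'."""
    return "subject_a" if rng.random() < weight_a else "subject_b"


rng = random.Random(42)  # seeded so the simulation is reproducible

# Simulate 10,000 new trials flowing through a 50/50 split.
counts = Counter(assign_branch(rng) for _ in range(10_000))
print(counts)
```

Each individual assignment is random, but over enough contacts the totals settle close to the configured weighting. Passing weight_a=0.9 gives the 90/10 arrangement described above.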

Each branch’s first node is a Send Email action. They’re nearly identical, except for the subject.

Step 4: Two Send Email nodes, identical except for subject

On the subject_a branch, add a Send Email action:

Template: Trial Welcome – Subject A
From email: your usual sender
Send only once: on

On the subject_b branch, add another Send Email:

Template: Trial Welcome – Subject B
From email: same as A
Send only once: on

In Spreeflo, each Send Email node references a marketing email template. This is where the email builder pays off:

Create your base trial welcome email with the body you want.
Save it as Trial Welcome – Subject A with your first subject line.
Duplicate that template, change only the subject and preview text, and save as Trial Welcome – Subject B.

Now you wire template A into branch A and template B into branch B.

Experiment hygiene matters here:

Keep the body identical. Same CTAs, same layout, same sender.
Change only the subject line (and optional preview text). Otherwise you’re not testing “which subject works better,” you’re testing a bundle of changes at once.
Avoid mixing in send-time changes while the test runs. If you also want to test send time, do that in a separate experiment.

Because each Send Email node points to a different template, Spreeflo automatically tracks opens, clicks, and other email activity separately for A and B.

Step 5: (Optional) Tag contacts by variant, then Merge

For simple subject-line tests, you could stop the branches right after the Send Email nodes. In practice, you probably want this test to sit inside a longer onboarding journey.

To cleanly rejoin both paths:

On each branch, add an Add Tag action after the Send Email:

Branch A: Add Tag subject_test_a
Branch B: Add Tag subject_test_b
Force tag trigger: off

After each Add Tag node, point the next step to a single Merge node.

These tags are optional but handy. They give you another way to segment contacts later (for example, to check long-term retention differences between variants) using the segment builder.

The Merge node has no configuration. Its job is just to bring both paths back together so you can continue with the rest of your onboarding flow without duplicating it.

Downstream of Merge, you might have:

A Time Delay of 1–2 days.
A second onboarding email that isn’t part of this experiment.
A Wait Condition that looks for a key in-app action.

The subject-line A/B test becomes the first “branch” in that larger journey, but you only had to touch this one section to implement it.

How to read the results and pick a winner

Once the journey is live and traffic is flowing, the real question is: when do you trust the data enough to declare a winner?

Spreeflo gives you two layers of insight here.

1. Compare email performance directly

Each marketing email template has its own analytics: deliveries, opens, click-through rate, and so on. Since our A and B variants are two different templates, you can:

Open the analytics for Trial Welcome – Subject A.
Note the open rate and click rate over a fixed window (say, the last 30 days).
Do the same for Trial Welcome – Subject B.

If you’ve kept the traffic split 50/50 and run the test long enough, differences in open rate point to the better subject. As a rough rule of thumb:

Aim for at least a few hundred sends per variant before deciding.
Give the test at least 7–14 days so weekly patterns even out.
Look for a clear gap (for example, 32% vs 26%), not a rounding error.
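If you want a quick sanity check on whether a gap is more than noise, a two-proportion z-test is one common option. The sketch below uses only the Python standard library; the counts are made up to match the 32% vs 26% example, and two_proportion_z_test is a helper written for this article, not something Spreeflo provides.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(opens_a: int, sends_a: int,
                          opens_b: int, sends_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in open rates."""
    rate_a = opens_a / sends_a
    rate_b = opens_b / sends_b
    # Pooled open rate under the assumption that both subjects perform the same.
    pooled = (opens_a + opens_b) / (sends_a + sends_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / sends_a + 1 / sends_b))
    z = (rate_a - rate_b) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up counts: 400 sends per variant, 32% vs 26% open rate.
z, p = two_proportion_z_test(opens_a=128, sends_a=400, opens_b=104, sends_b=400)
print(f"z = {z:.2f}, p = {p:.3f}")
```

With these illustrative numbers the p-value comes out a little above the conventional 0.05 threshold, so even a six-point gap at 400 sends per variant is suggestive rather than settled. That is a good argument for letting the test run a bit longer before you call it.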

For early-stage apps with lower install volume, it’s better to run one good test longer than to churn through lots of tiny, inconclusive ones.

2. Tie each variant to downstream conversion

Open rate is a proxy. What really matters is whether a subject leads to more trial conversions, first revenue generated, or activation events.

You can approximate this in Spreeflo using segments built on email activity and custom events. For example:

Segment A: contacts who opened Trial Welcome – Subject A and triggered trial_converted.
Segment B: contacts who opened Trial Welcome – Subject B and triggered trial_converted.

The segment builder lets you define this precisely:

Add an Email Activity rule filtered to the specific template, with “opened at least 1 time.”
AND a Custom Events rule for trial_converted with “triggered at least 1 time.”

The size of each segment, relative to how many total opens each email had, gives you a rough conversion rate per variant. It’s not a clinical experiment framework, but for a small team it’s more than enough signal to favour one subject over the other.
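The arithmetic is simple enough to do in a spreadsheet, but here it is as a short sketch. The numbers are invented for illustration, and "unique_openers" assumes you are counting contacts who opened at least once rather than raw open events.

```python
def conversion_rate(converted_openers: int, unique_openers: int) -> float:
    """Share of openers who went on to convert; 0.0 when nobody opened."""
    if unique_openers == 0:
        return 0.0
    return converted_openers / unique_openers


# Made-up numbers. segment_size is the count of contacts who opened the
# template AND triggered trial_converted (the segment defined above).
variants = {
    "Subject A": {"segment_size": 18, "unique_openers": 128},
    "Subject B": {"segment_size": 11, "unique_openers": 104},
}

rates = {
    name: conversion_rate(v["segment_size"], v["unique_openers"])
    for name, v in variants.items()
}
for name, rate in rates.items():
    print(f"{name}: {rate:.1%} of openers converted")
```

Note that this measures conversion among openers only. If you also want conversion per send, divide the same segment size by each variant's delivery count instead.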

If you used Add Tag nodes (subject_test_a / subject_test_b), you can slice long-term metrics by tag in a similar way: feature usage, upsell purchases, and so on.

When (and how) to stop the test

Once one subject is consistently outperforming the other on opens and your chosen conversion metric, you have three options.

1. Roll out the winner fully.

Simplest path: edit the losing email template and change its subject to match the winner. Now both branches send the same subject, and the experiment is effectively “off” without rewiring the journey.
Alternatively, remove the Random Split node and one email branch entirely, then connect the trigger and delay directly to the winning Send Email node.

2. Keep a trickle of traffic for ongoing testing.

Adjust the Random Split weighting to 90/10 or 95/5.
Keep the current winner on the 90% path.
Use the 10% path as your sandbox: swap in new subject ideas or more aggressive copy.

This way, you keep benefiting from the best-known subject while still exploring for better ones.

3. Clone the pattern for other key emails.

Take this exact set of nodes (trigger + delay + split + emails + merge) and adapt it for:
Upgrade prompts (e.g., when feature_limit_reached fires).
Saved-cart reminders if your app manages carts.
Win-back messages for at-risk accounts.

Once you’re comfortable wiring up one of these, cloning and adjusting for new use cases takes minutes, not hours.

Avoid these common A/B testing traps

A few pitfalls will waste your effort if you’re not careful.

Testing too many things at once. Keep body content, sender, and send time the same between variants. Only change the subject (and maybe preview text).
Stopping the test too early. A week of data from 50 installs isn’t enough. Wait for a meaningful sample size, even if it takes a month.
Changing the journey mid-test. If you adjust the trigger, add new filters, or change send time halfway through, you’re mixing two experiments in one data set.
Forgetting about re-enrollment. If you flip the trigger’s re-enrollment toggle on and your contacts can hit the same journey multiple times, they may see both variants over time. For clean subject tests, stick to one run per contact.
Overfitting to opens only. A subject that spikes opens but attracts the wrong kind of behaviour (people who don’t convert, or mark you as spam) is worse than a lower-open, higher-convert alternative.
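On the "stopping too early" trap: a standard sample-size approximation for comparing two proportions can tell you roughly how many sends each variant needs before a given gap is detectable. The sketch below is one textbook formula, not a Spreeflo feature; the defaults (5% significance, 80% power) are conventional choices, and sends_per_variant is a name invented here.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sends_per_variant(p_base: float, p_target: float,
                      alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate sends needed per variant to detect p_base vs p_target."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_base + p_target) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_base * (1 - p_base) + p_target * (1 - p_target))
    ) ** 2
    return ceil(numerator / (p_base - p_target) ** 2)


# How many sends per variant to reliably tell 26% from 32% open rate?
n = sends_per_variant(0.26, 0.32)
print(n)
```

For a six-point gap like this, the answer lands in the high hundreds per variant, and it climbs quickly as the gap you want to detect shrinks. That is why 50 installs in a week can't settle much.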

Treat each experiment as a small, focused question. Then let the journey system do the repetitive work.

Turning one subject test into a habit of experimentation

The big win here isn’t “A vs B for one email.” It’s shifting how your team treats communication.

Instead of rewriting copy whenever you feel like it, you:

Wire an experiment harness once, using journeys and campaigns.
Clone that harness to every important lifecycle touchpoint.
Use analytics and segments to pick winners without manual wrangling.
Let those decisions quietly compound over months.

You’re not hiring a growth team. You’re installing one small, always-on system that makes your emails a bit better every week.

And because Spreeflo’s templates support AI-powered personalization, you can even combine subject tests with dynamic body copy later on, using AI variables to reference each store’s use case while you keep the experiment structure the same.

If you want to go further, the full feature list and ready-made templates show other patterns you can plug into the same mindset: cart recovery, re-engagement, upgrade nudges.

For now, focus on this: wire up the simple subject-line A/B journey you see at the top of this page for your trial-start email. Give it a few weeks. Watch which line actually moves the needle.

That’s what it looks like when a lean ecommerce app team wins on leverage, not headcount.