Five ways to make more of your A/B email tests
25 Oct 2011
So you test two subject lines on a decent number of recipients and one performs better than the other. Yay! We have a winner... send it out to everyone else.
What’s not to like about that? Hopefully nothing, but the simplicity of “version A versus version B” testing can lull us into superficial thinking. Test results can deceive. But they can also tell us far more than we realise.
To help you squeeze a little more out of such tests, here are a few tips I picked up from my own and others’ experiences.
1. Test for unexpected results, too
Sometimes innocuous changes can make a big difference. It may be a purple button getting a third more clicks than an orange one. Or the addition of a small image boosting clicks by more than half.
Even seemingly ill-judged alternatives can sometimes come out on top. Like capitalising the subject line for 18% more opens.
All-capital subject lines rarely appear in the big book of recommended practices. They’re generally seen as a bad idea. But given the right circumstances, test results can surprise you. So don’t be afraid to experiment.
The beauty of testing is...it’s a test! You're not exposing your hunch or theory, however illogical or unrealistic, to the whole list. The test is there to confirm or refute your idea before you send it out to the majority of subscribers.
2. Be sure to test what you actually want to test
The best A/B test not only comes up with a winner from version A and B, but also provides insight into why A beat B: insight that lets you produce better emails in the future, too.
Many subject line tests, though, look like this:
Version A: Free shipping + 20% off everything!
Version B: BIG SALE: 2 days only!
It’s great to know which one to send out to the list, based on test results. But whether A or B wins, you can’t tell why. The two versions differ in length, capitalisation, urgency and message content – you have little insight into what played the key role.
If there is only one difference between A and B, then the cause of any difference in response is much clearer. (And if you need to test several elements at once, consider multivariate testing.)
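If you build your test sends programmatically, a clean single-variable split might look something like the rough Python sketch below. The subject lines, addresses and assignment function here are all hypothetical, not from any real campaign; the point is simply that the two versions differ in capitalisation and nothing else:

```python
import hashlib

# Minimal sketch (illustrative only): a deterministic 50/50 split that varies
# one element -- the subject line's capitalisation -- so any difference in
# response can be attributed to that single change.

SUBJECT_A = "Big sale: 2 days only!"   # control
SUBJECT_B = "BIG SALE: 2 DAYS ONLY!"   # same copy, capitalisation only

def assign_variant(email_address: str) -> str:
    """Hash the address so each recipient always lands in the same group."""
    digest = hashlib.md5(email_address.lower().encode()).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"

# Hypothetical test sample
test_sample = ["ann@example.com", "bob@example.com", "carol@example.com"]

for address in test_sample:
    variant = assign_variant(address)
    subject = SUBJECT_A if variant == "A" else SUBJECT_B
    print(f"{address} -> version {variant}: {subject}")
```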
3. Test again
Repetitive design templates and copywriting elements encourage familiarity and recognition. But change can invite attention, simply because something is different... particularly attention from those who’ve learned to quickly gloss over your message.
How long that attention (and any resulting response boost) lasts depends on how much of it is down to this oft-forgotten novelty factor. So it’s important to revisit tests down the road.
Another reason for retesting is that the passage of time brings many changes to the nature of your list and to the numerous factors that contribute to email response. This might mean today’s winning version is the loser a few weeks or months later.
AWeber, for example, found button links initially outperformed their standard text links. The effect not only wore off, but eventually reversed: buttons ended up performing worse than text.
Equally, there’s perhaps a role here for simply changing things up now and then as a wake-up slap tactic.
4. Borrow with care
Results from other people’s tests aren’t necessarily transferable to your situation.
Certain insights are likely universal. Other test results are too dependent on the audience and campaign environment. So published test results serve as excellent inspiration for your own tests, but not necessarily as insights for immediate implementation.
I recently found that adding my first name to my e-newsletter’s “from” line lifted clicks. Would that work for you? Maybe. Maybe not.
The nature of my e-newsletter lends itself to a more human “from” line. For example, I write and sign an editorial in each issue. There is a higher degree of potential personal name recognition in that B2B e-newsletter than there might be with Big Brand’s “Saturday Sale” B2C promotion. The only way to know for sure is (drum roll)... to test for yourself.
5. Think through the results
The success of a test also rests on how you interpret the numbers that pop out at the end. There are two particular issues here.
First, the winning version doesn’t necessarily come out ahead on every potential success metric. You see it often with subject line tests, where the version producing the most opens isn’t always the one that produces the most clicks or conversions.
An important part of test evaluation is making sure you measure the effects on the metrics that really matter: what measure of success are you actually trying to improve?
Second, the outcome of the test is the sum of the responses across all recipients, and not everybody reacts in the same way. You’re seeing the overall result, which masks how different individuals and segments might be responding.
So, for example, a particular call to action (CTA) might get a huge response from one segment and leave another cold, compared to the alternative. The test sample as a whole hasn’t voted for the new CTA, just a large-enough segment to make that CTA the winner.
That’s an opportunity: can you perhaps also segment by response to particular copywriting approaches? So the ones who react well to zany subject lines get...well... zany subject lines. The ones who want straightforward facts get those.
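To make that concrete, here’s a minimal sketch (in Python, with entirely made-up numbers and segment names) of how an overall winner can hide a segment that preferred the other version:

```python
# Illustrative figures only: the same CTA test summed across the whole sample
# versus broken down by segment. Segments and numbers are invented.

results = {
    # segment: {variant: (recipients, clicks)}
    "new subscribers":  {"A": (1000, 30), "B": (1000, 90)},
    "long-time buyers": {"A": (1000, 80), "B": (1000, 40)},
}

def rate(recipients, clicks):
    return clicks / recipients

# Aggregate view: the figure a simple A/B report would show
for variant in ("A", "B"):
    sent = sum(results[seg][variant][0] for seg in results)
    clicked = sum(results[seg][variant][1] for seg in results)
    print(f"Overall, version {variant}: {rate(sent, clicked):.1%} click rate")

# Segment view: the detail the aggregate masks
for segment, variants in results.items():
    for variant, (sent, clicked) in variants.items():
        print(f"{segment}, version {variant}: {rate(sent, clicked):.1%}")
```

In this invented example, version B wins on the overall click rate, yet the long-time buyers clicked version A twice as often – exactly the kind of detail the aggregate figure hides.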
For more testing tips, try some of these posts from the DMA Email Marketing Council blog:
Split testing sample size lookup table
A/B email split testing: good things come to those who wait
Testing – Ten Mistakes to Avoid: Part 1 and Part 2
Mark Brownlow, freelance business writer and publisher of the Email Marketing Reports website and blog.