Split testing, sometimes called A/B testing, is an important skill to understand and be able to perform in affiliate marketing. Far from being simply “throwing enough s*** at the wall to see what sticks,” there is actually a methodology and strategy for achieving useful results.
Why do we test?
The simple answer is money, and getting the most of it possible. Small changes in the way an offer is presented can have a significant impact on its success. This could be the image used or your ad copy text. However, those aren’t the only things you should be testing. In fact, people too often think of split testing as just finding the right button color or image on their Landing Page, and forget about other important factors such as the offers themselves and the networks they run on. Most importantly, testing can be the crucial difference between achieving XXX numbers a day on a campaign and winding up stuck with a negative ROI.
How do we test?
In a nutshell, a basic split test (or A/B test) simply means testing a control (A) against some variable (B). You’ll run the variable against the control over a period of time to see which performs best.
Single variable (A/B) – This is the ideal model for accuracy because only one thing changes at a time, so it’s clear what caused any difference in results.
Example: Let’s say I use ad copy on one LP that says “Get the New Secret to Younger Looking Skin” as my control (A). On my other LP I use a headline that reads “Dermatologist Reveals Secret to Younger Looking Skin” as my variable (B). I set up the test so that 50% of visitors go to LP (A) and 50% go to LP (B).
From the results of this data, it will become very clear which one is the better performer over a given period of time with a significant sample.
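The 50/50 split above can be sketched in code. This is a minimal illustration, not any particular tracker’s implementation; the visitor ID is a hypothetical identifier from your tracker, and hashing it keeps the assignment stable so a returning visitor always sees the same lander:

```python
import hashlib

def assign_variant(visitor_id, variants=("A", "B")):
    """Hash the visitor ID to assign a lander deterministically,
    giving a roughly 50/50 split across visitors."""
    digest = hashlib.md5(str(visitor_id).encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

# The same visitor always gets the same LP on repeat visits
print(assign_variant("visitor-1001"), assign_variant("visitor-1001"))
```

A random coin flip per page load would also split traffic 50/50, but the hash approach keeps the experience consistent for each visitor, which avoids polluting your data with people who saw both landers.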
Statistical relevance – make sure that the results you’re seeing are caused by whatever you’re testing, rather than just being a random coincidence. To do so, you’ll need to figure out the traffic sample size required for each variation in order to reach a statistically significant result. In other words, work out the number of visits to each of your pages (control and variation) needed before the result means anything. For this, you can use a tool called a statistical relevance calculator. These are free and easy to find with a quick Google search.
How to use a statistical relevance calculator. Once you have your data, you’ll compare the number of attempts made vs. the goals that were reached for each variation. If conversions on the LPs we’re testing are our goal, then the traffic each LP received counts as the attempts.
| Attempts | Goal (Conversions) | Rate Percentage |
| --- | --- | --- |
The commonly accepted level of significance is 95%. What this means is that there’s only a 5% chance that your results are due to chance.
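If you’d rather see what an online calculator is doing under the hood, the usual underlying math is a two-proportion z-test. Here’s a minimal sketch in Python; the visit and conversion numbers are made up purely for illustration:

```python
import math

def significance(attempts_a, conv_a, attempts_b, conv_b):
    """Two-proportion z-test: returns the two-sided p-value for the
    difference between conversion rates of A and B."""
    p_a = conv_a / attempts_a
    p_b = conv_b / attempts_b
    # Pooled rate under the assumption that A and B perform the same
    p_pool = (conv_a + conv_b) / (attempts_a + attempts_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / attempts_a + 1 / attempts_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Hypothetical data: 5,000 visits per LP, A converts 2.0%, B converts 2.6%
p = significance(5000, 100, 5000, 130)
print(f"p-value: {p:.3f}, significant at 95%: {p < 0.05}")
```

A p-value below 0.05 corresponds to the 95% significance level described above: there’s less than a 5% chance the observed difference is just noise.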
Setting your range. In many cases, you will need to set a range for your test. Often this will be a set period of time, but it could also be a number of attempts. For a date range, this should be a minimum of one week to get actionable results.
Setting your budget. You’ll need to determine what your ad cost will be for a set number of desired results. Some sources claim the minimum should be 100 desired results (conversions, for example) to reach significance, but there are flaws in sticking to a so-called “magic number”. For example, if you have a site that does 100,000 transactions per day, then 100 conversions can’t possibly be representative of overall traffic. Keep in mind what your ad spend is per day, and remember that the traffic will be split in half between the two variations, so to get equivalent traffic numbers for each, you’ll need to double your spend or run the test longer. Knowing this will help you set your budget.
What do we test and (mostly) in what order?
Here are some good basic test sets to consider. These aren’t all in exact order (when testing audience types, for example), and it also depends on what your goals are. However, make sure you’re starting off with a solid offer.
Offers – Since starting with a good offer is fundamental to your overall success, make sure you are applying testing to your offer as well to see which offers give the best results.
There are a lot of factors that can lead to one offer pulling in more money than another. People sometimes mistakenly believe that the higher payout among similar offers is going to be the winner, but if one offer pays $5 and another pays only $4 yet consistently outperforms it, the choice should be clear. Of course, we’d like the $5 one to be the winner, but do your best to make sure it’s a solid offer and don’t get distracted by the bigger payout amount.
Networks – Once you know you have a strong offer, you might want to test that offer on different networks to ensure that there are no other factors involved which may affect your bottom line.
Not all networks are created equal, even for the same offer. For example, networks could be using different servers or different tracking platforms. There is also the matter of scrubbing. Scrubbing is when a network purposely doesn’t credit 100% of the successful conversions. There are a few reasons why this might happen; sometimes it’s an effort at quality control when validating leads. However, there can be more nefarious motivations. A network might advertise higher-than-average payouts for the same offer than other networks, knowing some people just chase the higher payout (remember what we said about testing offers), then make up for its “generosity” by shaving off a few conversions on the back end. Another possibility is that a network uses scrubbing as a tool to stay in the black or hit its profit goals. If they are struggling, the temptation is always there.
LPs – OK, this one should seem more obvious and it’s also probably the most common test people get started with.
Companies have long studied how people respond to the way things are phrased and packaged. Google, for example, famously tested 41 different shades of blue to see which one people responded to better. You, however, are not operating on a scale as massive as Google, so please don’t use that as an equivalent example. The point is they did it for a reason: even the most subtle things can really shift response results. You don’t need to obsess over button color, but ad copy, followed by which image attracts more results, is worth running tests on. The right little tweak in your ad copy could send the success of your campaign skyrocketing.
The Ad – Ads are another common example and many traffic source platforms make it easy to set up ad testing.
As we’ve discussed, people’s reactions to marketing can be driven by many factors. Again, ad copy and images are the most common features to test on an ad, but it could also be your angles, so make sure to run some tests to see what you might be missing out on.
GEOs – If you’re trying to expand into a different country, or you’re interested in how countries compare, then geo testing is important.
Even if you’ve been amazingly successful in one country, it doesn’t mean you can just copy-paste and be as successful in another. This is especially true if the culture and language differences are significant. Geo testing can also be a way for you to discover scaling opportunities, particularly where traffic costs are lower.
Access method or device – This is going to be how the campaign is seen and in some cases what types of audiences see it.
Knowing how your ad performs on a mobile device vs. desktop is important since it can impact what kinds of traffic sources you use. The same goes for device type. Keep in mind there is usually an age split between desktop and mobile users. It’s also worth pointing out that most iPhone users require a credit card to set up their online app store accounts, which isn’t the case with Android. If your offer is something that requires a credit card, this is one crude way of pre-filtering your audience. This in no way means that Android users don’t have credit cards, but it’s not a bad point to consider.
Age range and gender – This is going to tell you which genders and age groups seem to respond better to certain types of materials or angles.
Some offers are heavily tied to specific audience types and sometimes it might even be a condition of the offer. However, even if the offer is open, your angles and marketing materials will most likely resonate differently depending on factors such as age and gender. That’s why more deeply defining your audience type in this way can be so important.
Best methods and tools for the testing process
Testing is essential, but as we’ve seen, you shouldn’t just go in blind. Once you have a clear strategy, you’ll want to do whatever you can to streamline the process. Jumping from system to system and trying to fold your data together increases the chance of mistakes and wastes time unnecessarily. Also, consider that you’ll want to be able to construct variables (such as modifications to your landers) to test as quickly and easily as possible.
Setting up the test. As one of the better solutions for this, we’ll take a look at some of the advantages of Zeustrak. Right off the bat, there are some key advantages to using this system to run your campaigns.
Some examples include GEO-customizable cloud-based landers. This makes setting up in any GEO easy and provides lightning-fast page loads all over the world, so your testing results tend to be much more accurate than with non-cloud-based systems.
One extremely convenient advantage is that your LPs are stored and editable directly in the interface, so you don’t have to waste time editing, uploading and configuring your landers when developing variables or running tests.
Split testing in Zeustrak – Next, if you are ready to run the test once the LP or offer is in the system, just follow these steps:
*Below we can see two landers being tested with the same offer.
As you can see, once you’re set up, it’s all right there in one master control interface, and the real-time editing and page storage come in very handy. Split testing can be challenging enough when deciding what models to use, so once you know what you want to do, running those tests should be as quick and as easy as possible.
Common testing issues to avoid
Here are some common pitfalls to be aware of when split testing. Don’t worry, these are all quite easily manageable once you’re aware of them.
- Don’t get distracted by other people’s data. All audiences are different, so a change that worked well for someone else won’t necessarily work for you. Just because a certain button color worked well for one person, it really only indicates that it worked well for their very specific set of details. If a green button converted far better in a test against a yellow button, it in no way means green buttons are always superior.
- Don’t run short tests. If you test for too short a period of time, you won’t be able to gather enough data to even make the test worthwhile. Remember, there is a threshold here that you need to cross to get statistically relevant data. Anything less is quite literally irrelevant and a waste of time and money.
- Consider external factors. The world is a more complicated place beyond our nice clean testing scenarios, so make sure you examine potential external factors that could be affecting your data. If testing comparative offers across GEOs, for example, make sure you’re set up for the relative peak times in those GEOs. Also, the data you get from a location may not be typical if it’s collected over a holiday season.
One of the worst things you can do (unless you absolutely love wasting money and carving a path to failure) is not to test at all. The second biggest mistake is doing it wrong and basing it on some sense of random instinct rather than a data-driven methodology. Even then, no system is ever perfect, and as always, performance marketing is an elusive animal to pin down. As such, try to apply as much logical structure as you can and find methods and tools to perform your testing as accurately and efficiently as possible. By doing even that, you’ll gain a distinct advantage.