A/B testing is fundamental to any successful campaign. In this article, we want to clarify some of the basic issues and questions around the topic for the benefit of anyone just starting out with this important marketing tool. Here you will find everything you wanted to know about A/B tests, explained so that you can understand the topic even if you are not a mathematician.
What are A/B tests?
A/B tests are also called split tests because they split traffic between two different versions. We can then compare the results of those two versions to understand which one performs better.
Performance usually means the conversion rate, but it could be an open rate, revenue or any other metric.
A/B testing allows us to answer very important questions for every business.
It is an example of a statistical testing process: we start from a hypothesis about two data sets and check whether it is true or not. To do this, we need to test the hypothesis in two groups, compare their results and determine whether the difference is statistically significant.
How to formulate your hypothesis correctly?
As we mentioned before, everything starts from the hypothesis. There are many approaches to this issue, but let’s use this example:
How can we develop this idea?
Let’s assume that we want to check if a yellow banner will perform better in our Facebook campaign. An example can look like this:
“Because we saw research about brighter colors in social media advertisements, we would like to use them in our next campaign.
We expect that a higher CTR among our new customers and new traffic will cause a revenue uplift.
We expect to see 2x higher CTR over a month.”
The third point in this example is the most important because the main goal of A/B testing is to implement the appropriate changes. This is the reason why we have to clearly describe what kind of results we expect and after what time. Based on that assumption, we will be able to measure it later.
Remember external factors!
Every experiment can be affected by outside influences:
- Black Friday and Cyber Monday sales
- A positive or negative press mention
- Another campaign launched simultaneously
- The day of the week
- The changing seasons
- And many more
A/B test checklist
Below you will find a list of the most important requirements for preparing an A/B test:
- Pick one variable to test; if you change several variables at once, you cannot measure the effect of any particular one
- Make sure you’re only running one test at a time on any campaign; again, to measure one metric you cannot have multiple tests running at the same time
- Test both variations simultaneously; it means that the different testing versions should be presented to users at the same time
- Ask for feedback from users; this lets you see the real people, and their point of view, behind the numbers
- Split your sample groups equally; as close to 50/50 as possible in order to have comparable sample sizes
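One common way to get a split close to 50/50 while still showing both versions at the same time is to assign each user by hashing an identifier. The sketch below is only an illustration: the function name `assign_variant`, the experiment label and the user ids are all hypothetical and do not come from any particular tool.

```python
import hashlib


def assign_variant(user_id: str, experiment: str = "yellow-banner") -> str:
    """Deterministically assign a user to variant 'A' or 'B'."""
    # Hash the user id together with the experiment name, so the same user
    # always sees the same variant within one experiment.
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Even hash values go to A, odd ones to B: roughly a 50/50 split.
    return "A" if int(digest, 16) % 2 == 0 else "B"


# Hypothetical user ids, for illustration only.
users = [f"user-{i}" for i in range(10_000)]
share_a = sum(assign_variant(u) == "A" for u in users) / len(users)
print(f"Share of users in variant A: {share_a:.3f}")
```

Because the assignment depends only on the identifier, a returning user keeps the same variant without any stored state.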
Assumptions you have to remember:
You should answer three key questions when planning A/B tests:
- How many test subjects do you have to select? -> Determine your sample size
- At what point are the results useful? -> Decide how significant your results need to be to make an informed decision about which version works best
- How much time do you need? -> Give the A/B test enough time to produce useful data for you
One word about sampling
If possible, you should split your test subjects as close to 50/50 as possible in order to have an equitable distribution.
Why is this so important?
First of all, the larger the sample size is, the less variability there will be. It means that results are more likely to be accurate.
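To make this concrete, the short sketch below uses the textbook standard error of a proportion, sqrt(p * (1 - p) / n), as a rough measure of how much an observed conversion rate can wobble from sample to sample. The 10% conversion rate and the sample sizes are assumed values chosen only for illustration.

```python
import math


def standard_error(conversion_rate: float, sample_size: int) -> float:
    """Standard error of an observed conversion rate for a given sample size."""
    return math.sqrt(conversion_rate * (1 - conversion_rate) / sample_size)


# Assumed 10% conversion rate; only the sample size changes.
for n in (100, 1_000, 10_000):
    se = standard_error(0.10, n)
    print(f"n={n:>6}: standard error of the conversion rate = {se:.4f}")
```

The standard error shrinks as the sample grows, which is why larger samples give steadier results.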
Let’s look at an example. Imagine that you want to test a treatment or a drug and you choose totally different samples, e.g.
- in the first group, 100% of the people are ill
- in the second group, 50% of the people are ill and 50% are not
As you can see, the results will be totally different, and it will not be possible to effectively compare the results from those samples.
To avoid this and keep variability low, you need to define which part of the Customer Journey or funnel will be tested.
You should also run each A/B test within a single segment. This means that you send the campaign to people who are similar in some way and meet the same conditions at the same time, so the variability will be lower.
Within a population, the spread of values in two samples can be totally different. In statistics, this spread is measured by the variance. We try to keep the difference as small as possible, but we should remember that we are not living in a perfect world where samples are fully representative of the population.
Below we can see a graph presenting so-called regression towards the mean, also known as reversion to the mean or regression towards the average.
If the black line is the average (mean), the trend oscillates up and down around it. The trend moves above the average, then regresses back towards it, moving up and down.
If the variable we are tracking is extreme on its first measurement, it will tend to be closer to the average on the second measurement. If we test a big enough sample for a sufficient amount of time, we will get more accurate results.
That’s why you need to define those two variables: how big your sample should be and how much time you need to track this trend to make it statistically significant.
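Regression towards the mean can be illustrated with a small simulation. The sketch below makes strong simplifying assumptions: every campaign has the same true conversion rate of 10%, each is measured twice on 200 visitors, and all numbers are invented for illustration.

```python
import random


def observed_rate(rng: random.Random, true_rate: float, visitors: int) -> float:
    """Simulate one measurement: the share of visitors who convert."""
    conversions = sum(rng.random() < true_rate for _ in range(visitors))
    return conversions / visitors


def simulate(campaigns: int = 1000, visitors: int = 200,
             true_rate: float = 0.10, seed: int = 42) -> tuple[float, float]:
    rng = random.Random(seed)
    first = [observed_rate(rng, true_rate, visitors) for _ in range(campaigns)]
    second = [observed_rate(rng, true_rate, visitors) for _ in range(campaigns)]
    # Pick the 10% of campaigns that looked best on the first measurement.
    order = sorted(range(campaigns), key=lambda i: first[i], reverse=True)
    top = order[: campaigns // 10]
    first_avg = sum(first[i] for i in top) / len(top)
    second_avg = sum(second[i] for i in top) / len(top)
    return first_avg, second_avg


first_avg, second_avg = simulate()
print(f"Top campaigns, first measurement:  {first_avg:.3f}")
print(f"Same campaigns, second measurement: {second_avg:.3f}")
```

The campaigns that looked best at first drift back towards the true rate on the second measurement, even though nothing about them changed. An early "winner" on a small sample can be pure noise.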
Size of the sample and significance – Useful tools
Size of the sample
Remember that you need enough data to detect the specific uplift you want to achieve. A calculator can help with this question. Here you can find a link to a calculator which will help you decide how many subjects are needed for your A/B test.
This tool requires you to add two indicators:
- Baseline conversion rate; the current conversion level you have.
- The minimum detectable effect; the effect that you want to achieve.
Based on these two indicators, the calculator will work out the recommended sample size for both variants of your A/B test.
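For readers curious about what such a calculator might do internally, here is a rough sketch based on a standard normal-approximation formula for comparing two proportions. It is not the formula of the linked calculator, which may differ. The 5% significance level, the 80% power and the treatment of the minimum detectable effect as a relative uplift are all assumptions made for this example.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline_rate: float, relative_mde: float,
                            alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate number of subjects needed in each variant."""
    p1 = baseline_rate
    # Assumption: the minimum detectable effect is a relative uplift,
    # e.g. 0.20 means going from 10% to 12%.
    p2 = baseline_rate * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance_sum / (p2 - p1) ** 2
    return math.ceil(n)


# Assumed inputs: 10% baseline conversion rate, 20% relative uplift.
print(sample_size_per_variant(baseline_rate=0.10, relative_mde=0.20))
```

Note how quickly the requirement grows when the effect you want to detect gets smaller: halving the minimum detectable effect roughly quadruples the sample.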
Statistical significance of the results
This tool will help you determine if the results are statistically significant or not.
This tool requires you to add the number of people in each sample and the number of successes (e.g. conversions) in each group. The tool then shows whether the difference between the results is significant or not.
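A check like this can be approximated with a two-proportion z-test. The sketch below is one common approach and not necessarily what the tool itself uses. The visitor and conversion counts are invented, and the 5% threshold is an assumed convention.

```python
import math
from statistics import NormalDist


def two_proportion_p_value(visitors_a: int, conversions_a: int,
                           visitors_b: int, conversions_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates."""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    # Pooled rate under the hypothesis that both versions perform the same.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    if se == 0:
        return 1.0  # no conversions (or only conversions): nothing to compare
    z = (rate_b - rate_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Invented example: 100 of 1000 convert on A, 130 of 1000 convert on B.
p_value = two_proportion_p_value(1000, 100, 1000, 130)
print(f"p-value: {p_value:.4f}, significant at 5%: {p_value < 0.05}")
```

A small p-value means that a difference this large would be unlikely if both versions really performed the same.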
Number of conversions needed to make the test significant
Another tool will help you speed things up by giving you the minimum number of conversions you need to reach before you can finish the test.
This calculator shows us a so-called stop factor: the number of people you need to reach in your test before you can finish it.
This last tool will be useful in estimating how many days you have to wait to make your test significant.
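A back-of-the-envelope version of this estimate divides the total required sample by the daily traffic. The sketch assumes constant daily traffic with every visitor entering the test, and the numbers are invented for illustration.

```python
import math


def days_to_finish(sample_per_variant: int, daily_visitors: int,
                   variants: int = 2) -> int:
    """Estimate how many full days are needed to collect the required sample."""
    total_needed = sample_per_variant * variants
    return math.ceil(total_needed / daily_visitors)


# Invented example: 3839 subjects per variant, 500 visitors per day.
print(days_to_finish(sample_per_variant=3839, daily_visitors=500))  # prints 16
```

Since the day of the week is one of the external factors listed earlier, it may be worth rounding such an estimate up to whole weeks.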
Prioritizing the hypothesis
If you want to test multiple aspects of your campaigns using A/B tests, it’s useful to prioritize the hypotheses that will help you get results fastest. According to the ICE method, we can take three conditions into consideration:
- Impact – how big an impact this hypothesis has on our business
- Confidence – how confident you feel about this theory
- Effort – how much effort it takes to test this hypothesis
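A minimal Python sketch of ICE scoring could look like the following. The hypotheses and their points are invented for illustration. The sketch also assumes that more points for effort means less effort is needed, so that a higher total always marks a better candidate.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    name: str
    impact: int      # points for expected business impact
    confidence: int  # points for how sure we are about the theory
    effort: int      # assumption: more points = less effort needed

    @property
    def ice_score(self) -> int:
        return self.impact + self.confidence + self.effort


# Invented hypotheses and points, for illustration only.
hypotheses = [
    Hypothesis("Shorter sign-up form", impact=8, confidence=5, effort=4),
    Hypothesis("Yellow banner", impact=7, confidence=6, effort=9),
    Hypothesis("New subject line", impact=4, confidence=7, effort=10),
]

# The hypothesis with the most points is tested first.
ranked = sorted(hypotheses, key=lambda h: h.ice_score, reverse=True)
for h in ranked:
    print(f"{h.name}: {h.ice_score}")
```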
You have to create a table and write down all the hypotheses you have. Then assign a number of points for each of these three indicators to every hypothesis; the one with the most points will be the first one to test. Happy testing!