Why you (always) need a control group in marketing campaigns

Is your marketing strategy working? Do you need more tools and systems to sell more? These are hard questions that everyone in marketing needs to ask. In the past, answering them was largely guesswork. Luckily, today we can accurately measure the additional value generated by marketing campaigns.
Let us take a look at the charts below:


Is the first one good and the second one bad? Well, we don't know. The proper way to answer the question "Is it much?" in economics is to ask "Compared to what?".
We know for sure that in the first chart profits go up; however, since they are denominated in absolute values (let's say dollars), this growth could be caused by hyperinflation. In the second chart we clearly see that profits decrease rapidly from May, but we can't tell whether that's a bad result without seeing the big picture. If the company was producing plastic straws and the whole sector lost 25% as a result of the eco-revolution, a loss of 19% looks really good (even if producing plastic straws is still a bad idea).
Well, these are extreme examples, but the important message here is that economic phenomena have more than one cause, and the causes are usually not obvious, since we can rarely isolate all the factors and measure their individual impact on the result.
Control group
A control group is simply a tool that allows you to answer the question "Compared to what?" and put the final result into the right perspective. By using it you can isolate the impact of one factor from all other factors and measure it.
In traditional marketing it was very difficult to do this. John Wanamaker famously said: “Half the money I spend on advertising is wasted; the trouble is I don’t know which half.” Not so long ago you couldn't really tell how many of your customers came to your offline store persuaded by leaflets or posters, and how many saw the ad on TV. Today, with the ability to track massive amounts of data generated during purchasing processes, we are able to measure this. But we need to remember to always use a control group.
There are two important requirements for control groups: they have to be large enough to produce statistically significant results, and they have to be randomly selected. Random selection creates a representative sample of the whole population. For example, if the distribution of a certain value in the whole population is normal (or close to normal), its distribution in a randomly selected control group will also be approximately normal.
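A minimal sketch of this property (the numbers are invented for illustration): a large random sample drawn from a normally distributed population ends up with nearly the same mean and standard deviation as the population itself.

```python
import random
import statistics

random.seed(42)

# Population: 100,000 customers whose monthly spending is ~ N($50, $10)
population = [random.gauss(50, 10) for _ in range(100_000)]

# Randomly select a 20% control group
control = random.sample(population, k=20_000)

# The sample statistics track the population statistics closely
print(round(statistics.mean(population), 2), round(statistics.stdev(population), 2))
print(round(statistics.mean(control), 2), round(statistics.stdev(control), 2))
```

With a sample this large, the sampling error of the mean is only about $0.07, which is why the two pairs of numbers come out almost identical.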

This means that processes that we can observe in a control group are very similar to the ones that we can observe in the whole population. In other words, if we isolate people from a control group from some factor, we can simulate what would happen in the whole population if this factor didn't exist.
Incremental value
Such a simulation can give us an answer about the impact of a marketing campaign on our revenue, for it is not always direct. Let us consider an example of calculating the revenue from a recommendations campaign.
I go to an ecommerce website, I browse for pasta and eventually I add a pack of spaghetti to my shopping cart. The Synerise recommendations engine suggests that I may also be interested in some tomato sauce. I add this item to my cart and finalize the transaction. In a simple metric that compares products that were shown to me in recommendations and the ones that I bought, I've got information that the tomato sauce is a purchase that comes from recommendations (the id of an item is equal to the one from the page I clicked).

But how do we know that I would not have bought this product if it had not been recommended to me? Since spaghetti with tomato sauce is a very popular dish, it is possible that I would have found this item and bought it anyway. To answer this question I need a simulation of a scenario in which the product was not shown. This is where we use a control group: I can check how many people from the control group purchased tomato sauce with spaghetti and compare it with the numbers from the test group. If 20% of people from the control group did this, while in the test group it was 30%, I know that 10 percentage points of tomato sauce buyers come from the Synerise recommendations engine.
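The comparison above can be sketched in a few lines of code (the group sizes are hypothetical; the rates are the ones from the example):

```python
# Uplift from recommendations: compare purchase rates between the test
# group (saw the recommendation) and the control group (did not).
test_buyers, test_size = 300, 1_000        # 30% bought tomato sauce
control_buyers, control_size = 200, 1_000  # 20% bought it anyway

test_rate = test_buyers / test_size
control_rate = control_buyers / control_size

# Incremental purchase rate attributable to the recommendation
uplift = test_rate - control_rate
print(f"{uplift:.0%} of buyers are incremental")  # → 10% of buyers are incremental
```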
However, when we consider how complex the processes that lead to a purchase are, we can't limit our calculation to the direct impact (message sent/not sent) of a campaign. Using a simulation, we can also estimate the things we can't measure directly and arrive at the total incremental value.
To make this happen, we need four values:
- Revenue from the test group (R);
- Number of customers assigned to the test group (P);
- Conversion rate in the control group (C);
- Average spending per client in the control group (A).
P, C and A are necessary to simulate the revenue without the campaign; R – (P*C*A) gives us the difference.
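As a sketch, the formula can be written as a small function (the function name and the example numbers are mine, not from any library):

```python
def incremental_value(r, p, c, a):
    """Incremental revenue of a campaign.

    r -- revenue from the test group
    p -- number of customers in the test group
    c -- conversion rate in the control group
    a -- average spending per client in the control group

    p * c * a simulates the test group's revenue without the campaign;
    subtracting it from the actual revenue r gives the increment.
    """
    return r - (p * c * a)

# Hypothetical example: $100,000 actual revenue, 10,000 customers,
# 3% control conversion, $30 average control spending
print(incremental_value(100_000, 10_000, 0.03, 30))  # → 91000.0
```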
Historical data fallacy
It's worth mentioning that even very small differences in the conversion rates and average order values can produce unexpectedly large numbers when calculated this way. Let's take a look at the chart showing the revenue from a marketing campaign launched on the 1st of July:

As we can see, the company generated a stable 2% month-over-month increase in revenue from January to June. From July we observe a 5% increase. If we tried to calculate the effect just by comparing the second half of the year with the first, we would attribute only 3 percentage points of growth to the campaign. However, since the campaign was launched with a randomly selected control group, the simulation shows quite a different number:
Incremental revenue in July

| | Number of customers | Conversion rate | Buying customers | Revenue | Average spending per client |
| --- | --- | --- | --- | --- | --- |
| All users | 1123456 | 4,90% | 55051 | $1 559 938 | $28,34 |
| Target group (80%) | 898765 | 5,00% | 44939 | $1 279 149 | $28,46 |
| Control group (20%) | 224692 | 4,50% | 10112 | $280 788 | $27,77 |
Let's substitute these values into our formula:
$1 279 149 – ((898765 * 4,5%) * $27,77) = $1 279 149 – $1 123 141 = $156 008
This means that, without the campaign, the company would have earned in July not $1 513 140 (the figure the naive comparison with the first half of the year suggests, implying just $46 798 of additional value), but roughly $1 403 930 – that is, $156 008 less than the actual $1 559 938.
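For a quick sanity check, here is the same computation in code, using the rounded values from the table (which is why the result can differ from the hand calculation by a dollar or so):

```python
# Simulated revenue of the target group had there been no campaign:
# customers * control conversion rate * control average spending
r = 1_279_149  # actual revenue of the target group
p = 898_765    # customers in the target group
c = 0.045      # conversion rate in the control group
a = 27.77      # average spending per client in the control group

simulated = p * c * a
incremental = r - simulated
print(round(simulated), round(incremental))
```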

There are a lot of things that can change in the meantime: the buying habits of existing customers, global demand for particular products, a new product from a competitor, and so on. We don't know exactly what would have happened without the campaign, but we do know that the revenue figures from the first half of the year tell us very little about it.
Possible issues
It is possible, however, that unexpectedly high results are simply wrong. This usually comes down to one of the two factors mentioned before:
- The control group or the whole population is too small

| | Number of customers | Conversion rate | Buying customers | Revenue | Average spending per client |
| --- | --- | --- | --- | --- | --- |
| All users | 13450 | 2,15% | 289 | $15 603,00 | $53,99 |
| Target group (80%) | 10760 | 2,00% | 248 | $12 794,46 | $51,59 |
| Control group (20%) | 2690 | 1,50% | 41 | $2 808,50 | $68,50 |
If we take these numbers and put them into our formula:
$12 794,46 – ((10760 * 1,5%) * $68,50) = $12 794,46 – $11 055,90
we will get $1 738,56 of supposed incremental revenue, which is absurd, given that the only factor the campaign changed was the conversion rate. Users in the control group clearly spend more on average than users in the test group, so changing a single factor (the conversion rate) by 0,5 percentage point could not change the revenue by more than 10%. With groups this small we can't do this calculation – just one or two users (e.g. heavy buyers) can swing the whole result.
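A tiny sketch (with invented order values) of why such a small control group is fragile: with only 41 buying customers, a single heavy buyer shifts the group's average spending – and therefore the whole simulation – dramatically.

```python
import statistics

# 40 "ordinary" control-group orders around $50, plus one heavy buyer
orders = [50.0] * 40 + [800.0]

with_heavy = statistics.mean(orders)          # 2800 / 41 ≈ 68.29
without_heavy = statistics.mean(orders[:40])  # 50.00

print(round(with_heavy, 2), round(without_heavy, 2))
```

One $800 order moves the group average by more than a third, and that distorted average then multiplies through P * C * A in the simulation.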
- The control group selection is not fully random
Only randomized selection guarantees that the control group is a representative sample of the whole population, which means we can extrapolate the processes we observe in the sample. It can happen that, due to some glitch or bug, users are not randomly assigned between groups, and that makes the whole simulation unreliable. For instance, if users of a certain operating system are assigned only to the control group, this can create a pattern in the group's purchases that seriously skews the results (e.g. the average spending per user).
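One common way to get a stable yet effectively random 80/20 split is to hash the user id and derive the group from the hash (a sketch under my own assumptions, not a description of any particular product). The same user always lands in the same group, and an unbiased hash keeps the assignment independent of attributes like operating system:

```python
import hashlib

def assign_group(user_id: str, control_share: int = 20) -> str:
    """Deterministically assign a user to 'control' or 'test'.

    The MD5 digest of the user id is effectively uniform, so taking it
    modulo 100 gives each user a stable bucket from 0 to 99.
    """
    bucket = int(hashlib.md5(user_id.encode()).hexdigest(), 16) % 100
    return "control" if bucket < control_share else "test"

# Roughly 20% of users should end up in the control group
groups = [assign_group(f"user-{i}") for i in range(10_000)]
share = groups.count("control") / len(groups)
print(f"control share: {share:.1%}")
```

Because assignment depends only on the user id, re-running the campaign pipeline never shuffles users between groups, which avoids exactly the kind of accidental pattern described above.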