- Estimate how long an experiment will take
- Decide how "sensitive" an experiment should be
- Decide how many variations to run
Once you decide on a hypothesis, you’ll design an experiment. How many variations should you create? What kind of experiment should you run: A/B, multivariate, or multi-page?
Experiment design is important, because it's a key part of the cost calculation of experimentation. The design and scope of your experiment determine how long it will take to reach statistical significance.
Use this information to consider:
Are the results of this experiment likely valuable enough to justify the amount of traffic or time? Are other, potentially more impactful ideas that you could be experimenting on?
Should you reduce the number of variations to speed up my experiment? If so, how would you re-design this experiment?
Should you increase the drama - or degree of difference - between the variation and the original to reach statistical significance sooner and speed up the experiment?
How can you design variations that focus on maximizing lift for your primary goal?
A statistical calculation called the minimum detectable effect (MDE) can help you connect cost to your experiment design. Use it to make informed decisions about your experiment parameters.
Minimum detectable effect (MDE) is a calculation that estimates the smallest improvement you’re willing to be able to detect. It determines how "sensitive" an experiment is.
Use MDE to estimate how long an experiment will take given the following:
Baseline conversion rate
You can use Optimizely’s Sample Size Calculator to make this calculation.
For example, imagine these parameters:
Your baseline conversion rate is 15%
You'd like to measure statistical significance to 95%
You'd like to detect a 10% lift at minimum (this is your MDE)
According to the Sample Size Calculator, you’d need ~8,000 visitors per variation to reach statistical significance.
In reality, you don't know the actual lift in advance. If you did, you wouldn't be running the experiment, right? By estimating the minimum lift you'd like to detect, with a given level of certainty, you establish boundaries for how much traffic or time you'll invest in this experiment. You can plan and scope your experiment more accurately.
Let's follow the example above one step further.
You design the experiment above with four variations. Your site averages 10,000 unique visitors per week. If you show this experiment to 100% of visitors, it will probably take 3.2 weeks to reach significance.
8,000 visitors per variation x 4 variations = 32,000 visitors
32,000 visitors / 10,000 visitors per week = 3.2 weeks
At this stage, consider whether the traffic and time is worth it, and how you might design a faster experiment.
Here are a few best practices for designing an experiment with MDE in mind.
Use potential business impact to decide on the sensitivity of your experiment.
Use MDE as a guide rather than an exact prediction.
Design impactful variations.