relevant products:
  • Optimizely X Web Experimentation
  • Optimizely X Web Personalization
  • Optimizely X Full Stack

THIS ARTICLE WILL HELP YOU:
  • Understand Optimizely's Stats Accelerator, its algorithms, and how it affects your results
  • Distinguish between the two Stats Accelerator algorithms
  • Determine whether to use Stats Accelerator for your experiments, as well as which algorithm to use
  • Enable Stats Accelerator (beta) for your account

Stats Accelerator helps you algorithmically capture more value from your experiments, either by reducing the time to statistical significance or by increasing the number of conversions collected. To do this, Stats Accelerator monitors ongoing experiments and automatically adjusts traffic distribution among variations.

You may hear Stats Accelerator concepts described as the “multi-armed bandit” or “multi-armed bandit algorithms.” The Key terminology section below clarifies these terms and related concepts.

Stats Accelerator algorithms

When enabled, Stats Accelerator applies one of two algorithms (or optimization strategies) for the primary metric: Accelerate Learnings or Accelerate Impact. Keep reading to learn more about their differences and use cases.

Accelerate Learnings

The Accelerate Learnings algorithm seeks to reduce experiment duration by showing more visitors the variations that have a better chance of reaching statistical significance. Accelerate Learnings attempts to discover as many significant variations as possible.

Accelerate Learnings helps you maximize the number of learnings from experiments in a given time frame, so you spend less time waiting for results.

For Accelerate Learnings, we require at least 3 variations, including the original or holdback (baseline) variation.

Accelerate Impact

The Accelerate Impact algorithm seeks to maximize the payoff of the experiment by showing more visitors the leading variation(s).

Accelerate Impact helps you exploit as much value from the leading variation as possible during the experiment lifecycle, so you avoid the opportunity cost of showing sub-optimal experiences.

For Accelerate Impact, we require at least 2 variations, including the original or holdback (baseline) variation.

Use cases

If you experiment in high volumes, you face two challenges. First, data collection is costly: time spent experimenting means less time to exploit the value of the eventual winner. Both algorithms address this problem, either by reducing time to significance or by maximally exploiting overperforming variations during an experiment.

Second, you may worry that creating more than one or two variations will delay statistical significance for too long. Accelerate Learnings lets you be bold and create more variations while shrinking the time to significance, because it quickly identifies the variations that have a chance of reaching statistical significance.

Most experiments with a clear primary metric tracking unique conversions can benefit from Accelerate Learnings. Read our Stats Accelerator technical FAQ to learn more.

Here are a couple of cases that may be a better fit for Accelerate Impact:

  • Promotions and offers: Users who sell consumer goods on their sites often focus on driving higher conversion rates. To do so, many retailers offer special promotions that last only a limited time. Instead of running a standard A/B/n test, you can use Accelerate Impact to send more traffic to the overperforming variations and less traffic to the underperforming ones.

  • Long-running campaigns: Some Optimizely Personalization users have long-running campaigns to which they continually add variations for each Experience. For example, an airline may deliver destination-specific experiences on the homepage based on past searches. Over time, the airline might add different images and messaging. For long-running Personalization campaigns, the overall objective is often to drive as many conversions as possible, making them a perfect fit for Accelerate Impact.

Stats Accelerator works with Optimizely X Web Experimentation, Personalization, and Full Stack. For Personalization, the Stats Accelerator makes adjustments to the traffic distribution among variations within an experience.

Enable Stats Accelerator

If you want to use Stats Accelerator, please contact your Customer Success Manager.

After Stats Accelerator is enabled for your account, you will find the two algorithms alongside the traffic allocation settings for your variations.

stats_accelerator.png

Key terminology

Algorithm
The mathematical formula used to execute against the chosen strategy. For a given optimization strategy, there are several possible algorithms we could use and an infinite number of ways to tune the algorithm’s parameters. See the FDR Control with Adaptive Sequential Experimental Design white paper for technical details.
Multi-armed bandit
A class of problems in probability theory where an agent must decide how to best distribute resources among several options with initially unknown payout structures.
The term "multi-armed bandit" derives from a person, the “bandit,” playing slots. Given a sack of coins and several slot machines with unknown payout ratios, the bandit must decide how to distribute the coins to get the highest payout. At first, the bandit does not know what each machine will pay out, but as she pulls each one she gets a better understanding.
See the definition for "regret minimization” below for a solution to the multi-armed bandit problem.
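The slot-machine story above can be simulated in a few lines. The sketch below is an illustration of the multi-armed bandit problem only, not Optimizely's algorithm: a simple epsilon-greedy agent mostly pulls the machine with the best observed payout so far, and occasionally explores a random one.

```python
import random

def pull(true_rates, arm):
    """One coin in machine `arm`; pays out 1 with its hidden probability."""
    return 1 if random.random() < true_rates[arm] else 0

def play(true_rates, n_pulls=10_000, epsilon=0.1, seed=0):
    random.seed(seed)
    k = len(true_rates)
    counts, rewards = [0] * k, [0] * k
    for _ in range(n_pulls):
        if random.random() < epsilon or 0 in counts:
            arm = random.randrange(k)  # explore a random machine
        else:
            # exploit the machine with the best observed payout rate
            arm = max(range(k), key=lambda i: rewards[i] / counts[i])
        counts[arm] += 1
        rewards[arm] += pull(true_rates, arm)
    return counts

# Machines pay out 2%, 5%, and 8% of the time; the agent does not know this
# and must learn it from the pulls themselves.
counts = play([0.02, 0.05, 0.08])
```

Over many pulls, the agent concentrates most of its coins on the machine with the highest hidden payout rate while still sampling the others occasionally.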
Optimization strategy
The optimization strategy defines the goal or objective a user chooses when enabling Stats Accelerator. We offer two optimization strategies: Accelerate Learnings and Accelerate Impact. These are often referred to as “time minimization” and “regret minimization” frameworks, respectively.
Regret minimization
A framework or strategy that seeks to minimize the number of times a new visitor is shown a sub-optimal experience with respect to the primary metric. In doing so, the framework minimizes the level of quantified “regret” observed after the conclusion of the experiment and maximizes the payout.
Regret minimization is the traditional approach to solving a multi-armed bandit problem. We call it "Accelerate Impact" in our Stats Accelerator.
Time minimization
A framework or strategy that aims to identify and distribute visitors to variations with a relative likelihood of reaching statistical significance that is higher than other variations. To be technically precise, time minimization addresses a set of problems that are not defined as multi-armed bandit problems.
Optimizely developed this framework, and we call it "Accelerate Learnings" in our Stats Accelerator.

Technical FAQ

How does Stats Accelerator work with Stats Engine?
Stats Engine continues to decide when a variation has a statistically significant difference from the control, just as it always has; introducing Stats Accelerator does not compromise statistical validity. But because some differences are easier to detect than others, the number of samples each variation needs to reach significance varies from variation to variation.
For the Accelerate Learnings approach, Stats Accelerator decides how many samples each variation should be allocated in real-time to get the same statistically significant results as standard A/B/n testing, but in less time. Additionally, these algorithms are only compatible with always-valid p-values, such as those used in Stats Engine. These p-values hold with all sample sizes and support continuous peeking/monitoring. This means that you may use the Results page for Stats Accelerator-enabled experiments just like any other experiment.
What algorithms or frameworks does Stats Accelerator support?
Stats Accelerator is not a single algorithm, but a suite of algorithms that each adapts its allocation for a different specified goal.
The first algorithm is Accelerate Impact, and it seeks to maximize the number of conversions, revenue, or other measure that you’ve defined in your primary metric. If your primary metric is supposed to decrease a particular figure, such as abandonment rate, we will try to minimize that figure. For tasks that balance exploration-versus-exploitation (Accelerate Impact), perhaps to tune the allocation to maximize revenue or conversions, we use a procedure inspired by Thompson Sampling, which is known to be optimal in this regime (Russo, Van Roy 2013).
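As a rough illustration of the Thompson Sampling idea behind Accelerate Impact (a sketch of the general technique, not Optimizely's production algorithm): each incoming visitor is assigned to the variation whose conversion rate, drawn at random from a Beta posterior over the data observed so far, comes out highest.

```python
import random

def thompson_allocate(successes, failures, n_visitors=1000, seed=1):
    """Assign each visitor to the variation whose sampled conversion
    rate (drawn from its Beta posterior) is highest."""
    random.seed(seed)
    k = len(successes)
    assignments = [0] * k
    for _ in range(n_visitors):
        # One posterior draw per variation; uniform Beta(1, 1) prior.
        draws = [random.betavariate(successes[i] + 1, failures[i] + 1)
                 for i in range(k)]
        arm = max(range(k), key=draws.__getitem__)
        assignments[arm] += 1
    return assignments

# Variation B has the stronger observed record (40/200 vs. 20/200),
# so it should receive the bulk of the next 1,000 visitors.
alloc = thompson_allocate(successes=[20, 40], failures=[180, 160])
```

Because the allocation is driven by posterior draws rather than a hard winner-take-all rule, the trailing variation still receives some traffic in proportion to its remaining chance of being best.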
The second algorithm is Accelerate Learnings, and it aims to reduce the time it takes for your variations to reach statistical significance. Many of the principles underlying the algorithms come from the rich research area of multi-armed bandits. Specifically, for our pure-exploration tasks (Accelerate Learnings), such as discovering all variants that have statistically significant differences from the control, we use algorithms based on the popular upper confidence bound heuristic that is known to be optimal for pure-exploration tasks (Jamieson, Malloy, Nowak, Bubeck 2014).
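The upper confidence bound heuristic mentioned above can be sketched as follows (illustrative only, not Optimizely's exact procedure): each variation is scored by its observed mean plus an exploration bonus that shrinks as the variation accumulates samples, and the highest score wins the next visitor.

```python
import math

def ucb_pick(conversions, visitors, t):
    """Upper-confidence-bound choice: pick the variation whose
    optimistic estimate (observed mean + exploration bonus) is largest.
    `t` is the total number of visitors seen so far."""
    scores = []
    for c, n in zip(conversions, visitors):
        if n == 0:
            return visitors.index(0)  # sample untried variations first
        bonus = math.sqrt(2 * math.log(t) / n)
        scores.append(c / n + bonus)
    return scores.index(max(scores))

# With equal sample sizes the bonuses cancel, so the better observed
# mean (variation 1) wins; a badly undersampled variation can still win
# on its bonus even with a lower observed mean.
pick_equal = ucb_pick([5, 9], [100, 100], t=200)
pick_undersampled = ucb_pick([2, 40], [10, 400], t=410)
```

The bonus term grows with the logarithm of total traffic and shrinks with each variation's own sample count, which is what keeps undersampled variations in the running.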
Can I use my own algorithm?
Using the REST API, you can programmatically adjust traffic allocation weights as needed. Optimizely’s out-of-the-box Stats Accelerator feature was finely tuned using millions of historical data points and state-of-the-art work in the field of bandits and adaptive sampling.
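For illustration, a custom allocation policy might push new weights through the REST API. Everything below is an assumption to check against the Optimizely REST API documentation, not a confirmed contract: the payload shape, the `weight` field name, and the convention that weights are expressed in basis points summing to 10,000 are all hypothetical here.

```python
def build_weight_update(variation_ids, weights_pct):
    """Convert percentage weights into a hypothetical PATCH payload.
    Assumes (unverified) that the API expects per-variation weights
    in basis points that sum to 10,000."""
    assert abs(sum(weights_pct) - 100) < 1e-9, "weights must sum to 100%"
    return {
        "variations": [
            {"variation_id": vid, "weight": int(round(pct * 100))}
            for vid, pct in zip(variation_ids, weights_pct)
        ]
    }

# Hypothetical variation IDs; the real IDs come from your experiment.
payload = build_weight_update(["A", "B", "C"], [50, 30, 20])
# This dict would then be sent as the JSON body of a PATCH request to
# the experiment-update endpoint described in the REST API docs.
```

Consult the REST API reference for the actual endpoint path, authentication, and field names before using anything like this in practice.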
What is the estimated time savings when I use the Accelerate Learnings algorithm?
The time you save by using Stats Accelerator depends on the variations you are exploring. However, users typically achieve statistical significance two to three times faster than standard A/B/n testing when using Accelerate Learnings. This means with the same amount of traffic, you can reach significance using two to three times as many variants at a time as was possible with standard A/B/n testing.
How often does Stats Accelerator make a decision?
The model that drives Stats Accelerator is updated hourly. Even for Optimizely users with the highest traffic, this frequency is more than sufficient to get the maximum benefit of a dynamic, adaptive allocation. If you require a higher or lower frequency of model updates, please let us know - it is valuable feedback!
What happens if I change the baseline on the Results page?
There is no adverse impact to selecting another baseline, but the numbers may be difficult to interpret. We suggest keeping the original baseline when you interpret Results data.
What happens if I change my primary metric?
The Stats Accelerator scheme reacts and adapts to the primary metric. If you change the primary metric mid-experiment, the Stats Accelerator scheme will start changing its policy to optimize that metric. For this reason, we suggest that you do not change the primary metric once you begin the experiment or campaign.
What happens when I pause or stop a variation?
If you pause or stop a variation, Stats Accelerator ignores that variation’s results data when adjusting traffic distribution among the remaining live variations.
How does Stats Accelerator handle revenue and numeric metrics?
During the initial beta period, we will only support non-numeric and non-revenue metrics for Experiments using Accelerate Learnings. This means you should not use “Revenue” or “Total Conversions” in an experiment if you select the Accelerate Learnings algorithm. In Personalization, we will support numeric or revenue metrics.
Here's the eventual approach we will take to handle revenue and numeric metrics in Stats Accelerator. Statistics surrounding binary metrics, like click or no-click, are easy to reason about because there is only a single parameter that describes the entire distribution: the probability of click. For numeric metrics like revenue, the number of parameters to fully describe the distribution, in general, may be unbounded. In practice, we use robust estimators for the first few moments (for example, the mean, variance, and skew) to construct confidence bounds that are used, just like those of binary metrics.
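To make the moment-based idea concrete, here is a minimal sketch: a plain normal-approximation interval built from the first two sample moments (mean and variance). This is an illustration of the general technique only; the always-valid bounds Stats Engine actually uses are constructed differently.

```python
import math

def mean_confidence_bounds(values, z=1.96):
    """Normal-approximation confidence interval for a numeric metric,
    built from the sample mean and sample variance."""
    n = len(values)
    mean = sum(values) / n
    # Unbiased sample variance (second central moment).
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    half_width = z * math.sqrt(var / n)
    return mean - half_width, mean + half_width

# Revenue-per-visitor samples: many zeros plus a few purchases.
samples = [0.0, 12.5, 0.0, 49.99, 0.0, 12.5, 0.0, 75.0]
lo, hi = mean_confidence_bounds(samples)
```

Unlike a binary metric, where the mean fully determines the distribution, a numeric metric needs at least the variance as well before a useful bound can be formed, which is the point the paragraph above makes.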
Does Stats Accelerator work with Full Stack?
Yes, Stats Accelerator works properly with all Optimizely products and experiment types. Because traffic distribution will be updated frequently, Full Stack customers should implement sticky bucketing to avoid exposing the same visitor to multiple variations.
How does Stats Accelerator work with Personalization?
Stats Accelerator will automatically adjust traffic distribution between variations within campaign experiences. This will not affect the holdback. To maximize the benefit of Accelerate Learnings, we recommend increasing your holdback to a level that would normally represent uniform distribution. For example, if you have 3 variations and a holdback, consider a 25% holdback.
Do you recommend a minimum level of traffic or number of variations to use with Stats Accelerator?
We do not impose any requirements for traffic level or number of variations, but we suggest a few guidelines. Stats Accelerator works best with multiple variations, but not so many that each variation is starved of traffic from the outset. In practice, we suggest at least 2 variations for Accelerate Impact and at least 3 variations for Accelerate Learnings, both including the original or holdback (baseline) variation.
How will Stats Accelerator affect reporting for my campaign?
Experiments with Stats Accelerator get all the same Results page benefits as any other experiment. Just remember that because Stats Accelerator adjusts the percentage of visitors who see each variation, you will see visitor counts that reflect the distribution decisions of the Stats Accelerator feature.
For which use cases is Stats Accelerator most effective?
We offer a few use case suggestions in the main article above, and here's the mathematical foundation for determining which use cases are a good fit.
For Accelerate Learnings, when you have more than 2 alternative hypotheses you want to test, and some of them don’t make a meaningful impact and some of them do, we’ll help you discover the ones that do much faster than you would be able to with uniform distribution.
For Accelerate Impact, when you have multiple variations and their true conversion rates differ by a lot, we’ll help you drive more traffic to the better variations so you don’t have to waste too many opportunities with the poorly performing variations.
Is there any use case for which I should not use Stats Accelerator? 
Stats Accelerator is driven only by the primary metric. Metrics are often correlated, so optimizing one optimizes another (for example, revenue and conversion rate). However, if metrics are independent of each other, optimizing the allocation for the primary metric may come at the expense of the secondary metric.
What is the mathematical difference between Accelerate Learnings (pure exploration) and Accelerate Impact (explore-vs.-exploit)?
In simple terms, if your goal is to learn whether any variations are better or worse than the baseline and take actions that have longer-term impact to your business based on these learnings, use Accelerate Learnings. On the other hand, if you do not care too much about comparing variations to baseline and just want to maximize conversions among these variations, choose Accelerate Impact.
In traditional A/B/n testing, a control is compared against a number of variants to determine whether each performs better or worse than the control. Typically, such an experiment is run on a fraction of web traffic to determine the potential benefit or detriment of using a particular variant instead of the control. If the absolute difference between a variant and the control is large, only a small number of impressions of that variant are necessary to confidently declare the variant as different (and by how much). When the difference is small, more impressions are necessary to spot it. The goal of Accelerate Learnings is to spot the big differences quickly and divert more traffic to the variants that need more impressions to attain statistical significance. Although nothing can be said with 100% certainty in statistical testing, we guarantee that the false discovery rate (FDR) is controlled, which bounds the expected proportion of variants falsely claimed as having a statistically significant difference when no true difference exists (users commonly control the FDR at 5%).
In a nutshell, use Accelerate Learnings when you have a control or default and you’re investigating optional variants before committing to one and replacing the control. In Accelerate Impact, the variants and control (if it exists) are on equal footing. Instead of merely trying to reach statistical significance on the hypotheses that each variant is either different or the same as the control, Accelerate Impact attempts to adapt the allocation to the variant that has the best performance.
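The classic textbook procedure for controlling the false discovery rate is Benjamini-Hochberg. The sketch below illustrates the concept; Stats Engine's sequential FDR control differs in its details, since it must also remain valid under continuous monitoring.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure: return the indices of
    hypotheses declared significant with the FDR controlled at alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    # Find the largest rank whose p-value clears its stepped threshold.
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            k = rank
    return sorted(order[:k])

# Four variants tested against the control; two clearly differ, one is
# borderline, and one shows no evidence of a difference.
significant = benjamini_hochberg([0.001, 0.20, 0.03, 0.004])
```

Note that the stepped thresholds (rank/m · alpha) are more generous than a Bonferroni correction, which is what lets FDR control scale to many variants without sacrificing all power.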
How should I approach Stats Accelerator if my conversion rate changes over time or I am worried about Simpson's Paradox?
Time variation is defined as non-stationarity in the underlying distribution of the metric value.  More simply, time variation occurs when a metric’s conversion rate changes over time. Both Stats Engine and Stats Accelerator’s algorithms assume these distributions are stationary.
Time variation is caused by a change in the underlying conditions that affect visitor behavior. Examples include more purchasing visitors on weekends; an aggressive new discount that yields more customer purchases; and a marketing campaign in a new market that brings in a large number of visitors with different interaction behavior from existing visitors.
We assume stationarity because this assumption enables us to support continuous monitoring and faster learning (see the Stats Engine article for details). However, both Stats Engine and Stats Accelerator algorithms have a built-in mechanism to detect violations of this assumption.  When a violation is detected, we update our statistical significance calculations accordingly. We call this a “stats reset.”
The impact of time variation is much more pronounced when the metric is numeric (non-binary), such as revenue goals or total conversions. Time variation is observed less frequently and to a lesser degree in binary metrics such as unique conversions. The effect of time variation is difficult to quantify and varies from case to case.
Furthermore, time variation has less effect on the Accelerate Impact approach because it does not seek to reduce the time to declaring statistical significance. Rather, Accelerate Impact evaluates the best balance between exploiting leading variations and exploring variations with the greatest potential. Stats resets are less common with Accelerate Impact because it is not aggressively trying to identify statistically significant variations faster.
To mitigate the effects of time variation even further for Accelerate Impact, we are implementing an exponential decay function that will weigh more recent visitor behavior more strongly to adapt to the effect of time variation more quickly. For both Accelerate Learnings and Accelerate Impact, we reserve a portion of the traffic for pure exploration so that we can detect when time variation happens. 
An exponential decay function, which we are implementing for Accelerate Impact, is a good approach to addressing time variation. Exponential decay is a smooth mathematical function to give less weight to earlier observations and more weight to recent observations. It is broadly used to model the effect of early observations, gradually becoming less relevant in the face of changing trends over time.
If you suspect your visitors' behavior may change over time (and therefore, non-stationarity is possible), here's how to approach Stats Accelerator. If dramatic shifts in the underlying distributions are expected in the planned timeframe of the experiment, and your intention is not to learn about these shifts as they take place but to rather study the average effects over the timeframe, we suggest using Accelerate Impact instead of Accelerate Learnings.
 

Additional resources