- Evaluate the tradeoffs of testing with a low sample size
- Discover strategies that help you get better results with low traffic
When traffic is a precious commodity, it makes sense to implement your A/B tests with as much forethought as possible. This article discusses how to get the most from Optimizely, even if your website receives relatively low visitor traffic and/or conversions.
We recommend using this information in tandem with our support article How long to run a test, which explains, based on sample size and statistical measures, how long you should let a test run before checking the results.
Why can low traffic pose a challenge?
The goals that you set up for your experiment determine the metrics by which a variation is considered successful. Popular goals measure significant actions such as completion of a sign-up form or placement of an order. However, if your website only gets a few such conversions a week, it will take a long time to determine with certainty which version is the winner, particularly if the difference in conversion rate between the original and variation is small.
To illustrate this, let's imagine you're running two A/B tests, both of which have so far received 500 visitors to each variation and a handful of conversions each on the main goal. Test 1 changes the text of a sign-up button from "Buy now" to "Try it!" Test 2 adds a pop-up advertising free shipping. For Test 1, the improvement in conversion rate for the variation is 5%. For Test 2, the improvement in conversion rate is 50%.
By taking a look at our sample size calculator, you can check how many visitors would be required in each branch of the test to prove that the results have reached statistical significance and that the uplift isn't just down to pure chance.
As illustrated in the screenshots below, the difference between proving an uplift of 5% and an uplift of 50% is enormous. What this means in practice is that testing macro conversions, such as order completions, runs the risk of requiring so much time to reach the required number of visitors or conversions as to make it unrealistic for a website with lower traffic.
Visitors required per variation to prove 5% uplift:
Visitors required per variation to prove 50% uplift:
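The size of this gap can be approximated with the classical fixed-horizon sample size formula for comparing two conversion rates. The sketch below is not a reproduction of Optimizely's sample size calculator, whose method may differ. The 3% baseline conversion rate, 95% significance, and 80% power are assumptions chosen for illustration, and visitors_per_variation is a hypothetical helper name.

```python
from math import ceil, sqrt
from statistics import NormalDist


def visitors_per_variation(baseline_rate, relative_uplift,
                           significance=0.95, power=0.80):
    """Approximate visitors needed per variation (two-sided two-proportion test).

    All defaults are illustrative assumptions, not Optimizely settings.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_uplift)  # rate if the uplift is real
    p_avg = (p1 + p2) / 2

    z_alpha = NormalDist().inv_cdf(1 - (1 - significance) / 2)
    z_beta = NormalDist().inv_cdf(power)

    numerator = (z_alpha * sqrt(2 * p_avg * (1 - p_avg))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Assumed baseline: 3% of visitors convert on the main goal.
for uplift in (0.05, 0.50):
    needed = visitors_per_variation(0.03, uplift)
    print(f"{uplift:.0%} uplift: about {needed:,} visitors per variation")
```

Under these assumptions, the 5% uplift needs many times more visitors per variation than the 50% uplift, because the required sample grows roughly with the inverse square of the difference you are trying to detect.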
This is not to say that low-traffic websites should not conduct A/B tests, but rather that strategy should play a greater role in deciding what tests to run than if you had unlimited traffic to play with. The following recommendations will help websites with lower traffic get the best value from Optimizely and A/B testing in general.
What can sites with low traffic do to maximize the value of A/B testing?
1. Test high-impact changes
Sometimes small changes have a big impact on conversion rates, but testing something big is more likely to have a noticeable effect. Focusing your tests on areas of your site that visitors consider important is more likely to have a big impact on conversion rate than testing very small modifications on niche pages.
While it might seem intimidating to test out something radical, when you test high-impact changes, the likelihood of achieving a drastic difference in conversion rates increases and with it the chance of being able to achieve statistically significant results within a reasonable timeframe. Even a negative outcome can result in valuable insights about your customers' values and behavior, which can be used to inform future tests.
The only drawback to testing very radical changes is that, if you change many things at once, it may be difficult to attribute the change in conversion rate to one specific element on the page. This risk can be mitigated by testing different themes rather than randomly moving elements around on the page. If you can figure out what is important to your customers, you can apply these learnings elsewhere on the site.
2. Focus on micro-conversions
Your main goal may be to increase conversions or sign-ups, but does it really make sense to position them as your main metric if it would take an extremely long time to gather enough conversions to verify the results you've collected? Probably not.
Testing conversions on the micro level, at which conversions are more plentiful, can help you determine the immediate effect that an A/B test has on a page and will help you call your results more quickly. Examples of micro-conversions include engaging with the page, clicking an "Add to cart" button, viewing a certain number of pages, or clicking through to a product detail page.
Other goals to consider might be setting up a conversion goal that fires when a visitor has scrolled a certain percentage of the way down your long-copy page or a custom event goal which fires only for users who stayed 30 seconds or longer on your site. If you are using our Google Analytics integration, another possibility would be to watch metrics such as bounce rate or average pageviews per visitor.
3. Test the page directly
Instead of measuring final conversions that take place several pages away, measure changes that take place directly on the page where the experiment is running. For example, if you want to A/B test your product pages but don't have enough conversions to make setting up an order confirmation goal worthwhile, instead test the product pages with the goal of bringing more relevant traffic to the next step of the process. Testing an earlier step in the process will give you more freedom to test and help you learn more about your users each step of the way.
Imagine you are running a test on a product listing page where visitors can either click "Add to basket" or "View details" for each product. If you choose to make orders placed your main goal, not only will there be fewer of these conversions in total, but the distance between the page on which the experiment is running and the conversion page is large and thus the impact is minimized. If you instead choose that your main goals will be to increase engagement with the "Add to basket" and "View details" buttons, you'll get more conversions to work with and these metrics will be more likely to be directly impacted by the changes you've made as the distance between the changes made and the conversion measured is small.
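To see why a more frequent goal needs less traffic, here is a rough sketch using Lehr's rule of thumb (about 16 * p(1 - p) / delta^2 visitors per variation, for roughly 80% power and a two-sided 5% false positive rate). The 2% order rate, 20% button click rate, and 10% relative uplift are made-up figures for illustration, and the estimate is not what Optimizely itself would report.

```python
from math import ceil


def lehr_visitors_per_variation(baseline_rate, relative_uplift):
    """Lehr's rule of thumb: n ~= 16 * p(1 - p) / delta^2 per variation."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_uplift)
    p_avg = (p1 + p2) / 2
    delta = p2 - p1  # absolute difference between the two rates
    return ceil(16 * p_avg * (1 - p_avg) / delta ** 2)


# Hypothetical rates: 2% of visitors place an order, 20% click a button.
orders = lehr_visitors_per_variation(0.02, 0.10)
clicks = lehr_visitors_per_variation(0.20, 0.10)
print(f"Orders placed goal: about {orders:,} visitors per variation")
print(f"Button click goal:  about {clicks:,} visitors per variation")
```

For the same relative uplift, the goal with the higher baseline rate needs far fewer visitors, which is the practical argument for measuring engagement close to the change.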
4. Consider a lower statistical significance setting
Optimizely allows you to change the statistical significance level for each Project. This is the level at which Optimizely declares that your results are likely due to actual changes in visitor behavior rather than noise or randomness. Our articles on statistical significance and the statistical significance setting cover most of the information you'll need in this scenario, but here's the summary:
A higher statistical significance setting will produce fewer false positives, but Optimizely will generally take longer to declare results.
By lowering your statistical significance setting, you increase the likelihood that you'll see false positives, but you'll also be able to run experiments at a higher velocity because Optimizely will require a smaller sample size.
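As a rough illustration of this tradeoff, in a classical fixed-horizon test the required sample size is approximately proportional to (z_alpha + z_beta)^2, so the saving from a lower significance level can be estimated directly. This is a textbook approximation under an assumed 80% power, not a description of how Optimizely computes results, and relative_sample_size is a hypothetical helper name.

```python
from statistics import NormalDist


def relative_sample_size(significance, reference=0.95, power=0.80):
    """Sample size at `significance` relative to the `reference` level.

    Uses the classical approximation n proportional to (z_alpha + z_beta)^2.
    """
    z = NormalDist().inv_cdf

    def factor(level):
        # Two-sided critical value plus the power term, squared.
        return (z(1 - (1 - level) / 2) + z(power)) ** 2

    return factor(significance) / factor(reference)


for level in (0.95, 0.90, 0.80):
    ratio = relative_sample_size(level)
    print(f"{level:.0%} significance: {ratio:.2f}x the visitors needed at 95%")
```

The saving in traffic comes at the cost of a higher false positive rate: at 80% significance, about one in five experiments with no real effect would still be expected to show a winner or loser.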
5. Don't be tempted by multivariate testing
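The more variations you test, the more overall traffic will be required and the longer it will take for your results to reach statistical significance. A multivariate test splits your traffic across every combination of the changes being tested, so the number of variations grows quickly. Stick to A/B tests until you have sufficient traffic to direct into multiple variations.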
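A back-of-the-envelope sketch of how the number of variations stretches a test's duration. The figures (2,500 visitors needed per variation, 200 visitors per day, three page sections with two options each) are hypothetical, and the calculation ignores any extra correction for comparing many variations at once, which would lengthen the test further.

```python
from math import ceil, prod


def days_to_finish(visitors_per_variation, variation_count, daily_visitors):
    """Days needed when traffic is split evenly across all variations."""
    total_visitors = visitors_per_variation * variation_count
    return ceil(total_visitors / daily_visitors)


# A/B test: original plus one variation.
ab_days = days_to_finish(2500, 2, 200)

# Multivariate test: 3 sections with 2 options each = 8 combinations.
mvt_combinations = prod([2, 2, 2])
mvt_days = days_to_finish(2500, mvt_combinations, 200)

print(f"A/B test: {ab_days} days")
print(f"Multivariate test ({mvt_combinations} combinations): {mvt_days} days")
```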
6. Avoid niche testing
Avoid testing areas of your site that get very few hits. Instead, make your targeting conditions as wide as possible so as to include as many visitors as is feasible. Site-wide banner tests, landing page tests, and the like will take advantage of the traffic you do have and are more likely to reach statistical significance in a shorter period of time than if you were to test only one specific product page, for instance.
7. Test to improve SEO and user experience
If you are conducting activities aimed at improving your site's SEO, you can set up experiments to determine which actions have the biggest impact.
Perhaps you want to find out which AdWords or organic search terms are most effective at getting users to click through to the site and take an action. You could test this by setting up individual tests which target individual AdWords parameters in the URL of your site or which look for a certain search term in the document.referrer. Alternatively, set up various CTAs in an email campaign to lead to different URLs and track conversions for each of these URLs. Which search terms lead to the most conversions or engagement? Which AdWords listings are clicked most often and have the best conversion rates? Which variant of your email campaign had the biggest impact?
Once you've determined which search terms to focus on, you can target your SEO activities toward them, thus saving valuable time and ad spend.
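If you export visit data for offline analysis, comparing search terms could look something like the sketch below. It assumes the term arrives in a utm_term query parameter on the landing URL, a common tagging convention that may not match your own setup, and it uses made-up visit records. The helper names are hypothetical and nothing here calls an Optimizely API.

```python
from collections import defaultdict
from urllib.parse import parse_qs, urlparse


def search_term(url, param="utm_term"):
    """Return the search term from a URL's query string, or None if absent."""
    values = parse_qs(urlparse(url).query).get(param)
    return values[0] if values else None


def conversion_rate_by_term(visits):
    """Map each search term to conversions divided by visits."""
    totals = defaultdict(lambda: [0, 0])  # term -> [visits, conversions]
    for landing_url, converted in visits:
        term = search_term(landing_url) or "(none)"
        totals[term][0] += 1
        totals[term][1] += int(converted)
    return {term: conv / count for term, (count, conv) in totals.items()}


# Made-up records: (landing URL, whether the visitor converted).
visits = [
    ("https://example.com/?utm_term=running+shoes", True),
    ("https://example.com/?utm_term=running+shoes", False),
    ("https://example.com/?utm_term=trail+shoes", False),
    ("https://example.com/", True),
]
for term, rate in conversion_rate_by_term(visits).items():
    print(f"{term}: {rate:.0%} conversion rate")
```

With low traffic, treat per-term rates like these as directional only; each term receives just a slice of your visitors, so the differences may well be noise.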
Another interesting option is to test whether users who have clicked on a certain call to action in an email or AdWords listing are more likely to convert when greeted with the same message they clicked on than when greeted with a standard message. A standard A/B test in which the original shows the standard message and the variation shows a personalized message would be a great way of discovering this and offering a more personalized experience to your website's users.
8. Use the difference interval when looking at results
If you have a low-traffic site, you may find yourself in a position where you have to make a call on an experiment that hasn't declared a winner or loser yet. If this happens, you can use the difference interval on the Results page to make a (rough) determination.
The difference interval is a confidence interval on the difference in conversion rates. In other words, it shows you the range of values that likely contains the actual (absolute) difference between the conversion rates that would show up if you were to implement that variation.
In the example above, there is a 90% chance that the absolute difference in conversion rates lies between -10.93% and 29.18% if you were to implement Variation #1. That means there is a better chance that this test has a positive effect than a negative effect. Of course, it’s still risky to implement this variation. But if your time is limited and you need to make a decision one way or the other, it might be a risk you’re willing to take.
A useful and easy risk analysis that you can do with a difference interval is to report the best case, worst case, and middle ground estimates of predicted lift by reading the upper endpoint, lower endpoint, and center of the difference interval, respectively.
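Using the interval from the example (-10.93% to 29.18%), this reading can be sketched as follows. The function names and the figure of 1,000 visitors are hypothetical, and the endpoints are treated as absolute differences in percentage points, as described above.

```python
def risk_summary(lower, upper):
    """Worst case, middle ground, and best case from a difference interval."""
    return {
        "worst_case": lower,
        "middle_ground": (lower + upper) / 2,  # center of the interval
        "best_case": upper,
    }


def extra_conversions(difference_in_points, visitors):
    """Convert an absolute difference (percentage points) into conversions."""
    return visitors * difference_in_points / 100


summary = risk_summary(-10.93, 29.18)
for label, difference in summary.items():
    change = extra_conversions(difference, 1000)
    print(f"{label}: {difference:+.2f} points, "
          f"about {change:+.0f} conversions per 1,000 visitors")
```

Reporting all three figures together makes the downside explicit, which is useful when you have to decide before the experiment has declared a winner or loser.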
As the test runs longer and gathers more data, the difference interval will shrink. If this variation is actually a winner, the difference interval will move entirely above 0, at the same time that statistical significance increases to 90%.