Optimizely Knowledge Base

Best practices: Datafile management in Full Stack

relevant products:
  • Optimizely X Full Stack

THIS ARTICLE WILL HELP YOU:
  • Balance latency and freshness in datafile management
  • Navigate the tradeoffs for your implementation and performance constraints
  • Decide how to implement datafile synchronization  

In Optimizely X Full Stack, you configure A/B tests and variations in a web interface hosted on app.optimizely.com. Then, you implement an SDK in your application to bucket users into the right variations based on that configuration. The link between these two systems is called the datafile.

The datafile is a JSON representation of all the A/B tests, audiences, metrics, and other elements that you’ve configured in a project. Here's how to access the datafile for a Full Stack project.

Whenever you make a change in Optimizely's interface, like starting an A/B test or changing its traffic allocation, Optimizely automatically builds a new revision of the datafile and serves it on cdn.optimizely.com. To incorporate this change, your application needs to download the new datafile and cache it locally to make decisions quickly. We call this process datafile synchronization. 

There are several approaches you can take to datafile synchronization, depending on your application’s needs. In general, these approaches involve a tradeoff between latency and freshness. Finding the right balance ensures that your datafile stays up to date without slowing down your application. This guide walks through the best practices and alternatives for striking this balance.

There is no single right answer for managing the datafile because every application has different implementation and performance constraints. We recommend evaluating these options and standardizing your approach to datafile management by building your own wrapper around our SDKs. This wrapper can also capture other context-specific options like event dispatching and logging.

Datafile management is implemented out-of-the-box in our iOS and Android SDKs. For more information, see initialization in our developer documentation. This guide focuses on managing datafiles in server-side contexts.

Understanding the tradeoffs

To understand the tradeoffs in datafile management, it helps to consider the most naive approach.

Imagine that every time an A/B test runs, you fetch the latest datafile from the CDN, then use it to initialize an Optimizely client and make a decision. This approach guarantees you the latest datafile, but it comes at a major performance cost. Every decision requires a round-trip network request. In asynchronous contexts like SMS or chatbots, this can work, but we don’t recommend it for synchronous use cases like a web server or API.

Instead, we recommend caching a local copy of the datafile within your application, then synchronizing it periodically. This lets you make A/B test decisions immediately, without waiting for a network request, while keeping the configuration up to date.

For example, you can set up a timer to re-download the datafile every 5 minutes and store it in memory, then read from there every time you make a request. Or, you could use a webhook to keep a centralized service in sync and make internal HTTP requests for the datafile. As these examples illustrate, there are several choices to consider when implementing datafile synchronization:

  • Where to store the datafile: Locally in memory, on the filesystem, or on a separate service.

  • When to update it: Via a “pull” model that polls for updates on a regular interval, or by listening for a “push” from a webhook.

  • How to fetch it: Directly from the CDN, or a private authenticated endpoint.

The sections below walk through the best practices and trade-offs for each approach.

Storage options

Consider these options when deciding where to store the datafile.

With any of the options listed below, you have the flexibility to choose what format to cache the datafile in. The most common approach is to store the JSON string of the datafile itself. This is sufficient in many cases, but JSON parsing can take as long as 100 ms depending on the language, load, and datafile size. Even when parsing is much faster, be careful with implementations that require repeated JSON parsing. For example, if you re-instantiate the SDK within a loop, this cost can quickly add up. If you do need to instantiate repeatedly, pass in the already-parsed object or already-instantiated Optimizely client rather than the raw datafile to avoid this cost.
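As a minimal sketch of avoiding repeated parsing, the helper below keeps the parsed datafile next to the raw string and re-parses only when the string changes. The class name, the `parse_count` counter, and the sample datafile contents are hypothetical and only illustrate the idea. In a real application you would hand the parsed object (or an already-instantiated Optimizely client) to your code instead of the raw JSON.

```python
import json


class ParsedDatafileCache:
    """Hypothetical helper: parse a raw datafile string once and reuse the result."""

    def __init__(self):
        self._raw = None
        self._parsed = None
        self.parse_count = 0  # exposed only to make the saving visible

    def get(self, raw_json):
        # Re-parse only when the raw string actually changed.
        if raw_json != self._raw:
            self._parsed = json.loads(raw_json)
            self._raw = raw_json
            self.parse_count += 1
        return self._parsed


cache = ParsedDatafileCache()
raw = '{"revision": "42", "experiments": []}'  # made-up, minimal datafile
for _ in range(1000):
    datafile = cache.get(raw)  # parsed on the first pass, reused afterwards
```

After the loop, `cache.parse_count` is 1 even though the datafile was requested 1,000 times.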

Best performance

We recommend storing the datafile directly in memory. You'll be able to look up the datafile with near-zero latency, so your web service stays performant. In a simple application, this can be done by instantiating the Optimizely object directly and passing it around as needed. In more complex applications, you can use an in-memory cache like Memcached.

Multiple processes need to share the same datafile configuration

We recommend keeping the datafile in some kind of local storage. For example, you can keep the JSON file directly on your local file system, which is generally slower than a memory lookup but faster than a network request. Alternatively, you can store the datafile in a distributed store like Redis. In general, we recommend systems that allow fast reads and relatively fast writes.
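When several processes read the same file, one concern is a reader seeing a half-written datafile. The sketch below shows one common way to avoid that: write to a temporary file in the same directory, then rename it over the target. The function names and the file path are hypothetical, and this is one possible approach rather than a prescribed one.

```python
import json
import os
import tempfile


def write_datafile_atomically(raw_json, path):
    """Write to a temp file in the target's directory, then rename over the target."""
    json.loads(raw_json)  # refuse to publish a datafile that is not valid JSON
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(raw_json)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Readers see either the old file or the new one, never a partial write.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def read_datafile(path):
    """Load and parse the shared datafile."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

A single updater process can call `write_datafile_atomically` after each download, while worker processes call `read_datafile` (ideally caching the parsed result in memory between reads).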

Colocated services

If you have many colocated services that all need to operate off the same datafile, consider hosting the Full Stack SDK as its own independent service. This service can expose SDK methods like activate and track as HTTP endpoints that other services can hit. Then, you can implement datafile synchronization within the service using any of the methods above. This approach adds a small latency hit from the internal network request, but it makes implementing in a microservice architecture substantially easier, especially if you have many different types of services operating in different languages. The centralized endpoint allows you to implement the logic just once, rather than separately in each service.
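To make the idea of a centralized service concrete, here is a rough sketch using only Python's standard library. The `/activate` route, its `experiment` and `user` query parameters, and the JSON response shape are all assumptions made for illustration. The `activate` argument stands in for whatever callable wraps your SDK client, and the sketch does not call the SDK itself.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse


def make_handler(activate):
    """Build a handler class around an `activate(experiment_key, user_id)` callable."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            url = urlparse(self.path)
            params = parse_qs(url.query)
            if url.path != "/activate" or "experiment" not in params or "user" not in params:
                self.send_error(404)
                return
            # Hypothetical contract: the callable returns a variation key or None.
            variation = activate(params["experiment"][0], params["user"][0])
            body = json.dumps({"variation": variation}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # silence default stderr logging in this sketch

    return Handler


def make_server(activate, host="127.0.0.1", port=0):
    # Port 0 asks the OS for any free port; read it back from server.server_address.
    return ThreadingHTTPServer((host, port), make_handler(activate))
```

Other services would then issue an internal request such as `GET /activate?experiment=my_test&user=user_1` instead of embedding an SDK, and datafile synchronization lives only inside this one service.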

Update options

The other key consideration is when to update the datafile. In general, you have two choices: pull or push. We recommend using both approaches together, if possible. Use a webhook (push) as the primary means for keeping your datafile up to date, but keep polling (pull) at regular intervals in case the webhook fails.

Pull updates

The “pull” approach consists of polling the Optimizely CDN on a regular interval and updating the stored datafile whenever a new revision is available. Polling is generally easy to implement through a timer or cron job.

Pulling works best if you don’t need instant updates. For example, if you’re comfortable with pressing the “pause” button on an A/B test that’s performing badly and waiting a while for the change to percolate to your users, polling on a 5- or 10-minute interval is fine.
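A polling loop along these lines might look like the sketch below. It assumes the datafile JSON carries a top-level `revision` field that changes with each new revision. The class name, the injectable `fetch` function, and the 300-second default are illustrative choices, not part of the SDK.

```python
import json
import threading
import urllib.request


class PollingDatafileStore:
    """Hypothetical in-memory store that re-downloads the datafile on a timer."""

    def __init__(self, url, interval_seconds=300, fetch=None):
        self._url = url
        self._interval = interval_seconds
        self._fetch = fetch or self._fetch_from_network
        self._lock = threading.Lock()
        self._datafile = None
        self._stop = threading.Event()

    def _fetch_from_network(self, url):
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.read().decode("utf-8")

    def refresh(self):
        """Fetch once; swap in the new datafile only if the revision changed."""
        parsed = json.loads(self._fetch(self._url))
        with self._lock:
            current = self._datafile
            # Assumption: "revision" identifies the datafile version.
            if current is not None and current.get("revision") == parsed.get("revision"):
                return False
            self._datafile = parsed
            return True

    def get(self):
        """Return the most recently stored datafile without touching the network."""
        with self._lock:
            return self._datafile

    def start(self):
        def loop():
            # wait() returns False on timeout, so the loop runs once per interval.
            while not self._stop.wait(self._interval):
                try:
                    self.refresh()
                except Exception:
                    pass  # keep serving the last good datafile; log this in real code

        self.refresh()  # load once before serving requests
        thread = threading.Thread(target=loop, daemon=True)
        thread.start()
        return thread

    def stop(self):
        self._stop.set()
```

Request handlers call `get()`, which never waits on the network. A failed poll leaves the previous datafile in place rather than clearing it.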

Push updates

If you need faster updates, such as every time a feature flag is toggled, we recommend “pushing” the changes as soon as you make them. You can configure a webhook to ping your server as soon as a change is made so you can pull the update down immediately.

This is the preferred approach for server-side contexts with a reliable network connection, but it doesn’t usually apply to web and mobile clients. For an example of webhooks in action, see our Python demo app.
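The receiving side of a webhook can be sketched independently of any web framework. The example below assumes the webhook request carries an HMAC-SHA1 signature of the request body in a header formatted as `sha1=<hex digest>`, computed with a shared secret. Treat that scheme, the function names, and the status codes as assumptions, and check them against your webhook configuration. The `refresh` argument is any callable that re-downloads the datafile.

```python
import hashlib
import hmac


def is_valid_signature(body, signature_header, secret):
    """Compare the received signature with one computed from the raw body bytes."""
    expected = "sha1=" + hmac.new(secret.encode("utf-8"), body, hashlib.sha1).hexdigest()
    received = signature_header or ""
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))


def handle_webhook(body, signature_header, secret, refresh):
    """Return an HTTP status code; trigger a datafile refresh only for signed requests."""
    if not is_valid_signature(body, signature_header, secret):
        return 403
    refresh()  # pull the new datafile immediately
    return 204
```

Your web framework's route handler would pass the raw request body, the signature header value, and the secret to `handle_webhook`, then respond with the returned status code.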

Download options

Regardless of when you choose to update the datafile, you have flexibility in terms of where you download it from. You can fetch the datafile for your Full Stack project from Optimizely’s CDN. For example, if the ID of your project is 12345, you can access the datafile at https://cdn.optimizely.com/json/12345.json.
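As a small illustration, the CDN URL pattern above can be wrapped in a helper that builds the URL from a project ID and downloads the JSON with Python's standard library. The function names and the 10-second timeout are hypothetical choices for this sketch.

```python
import json
import urllib.request

# URL pattern taken from the example above; the project ID fills the placeholder.
CDN_TEMPLATE = "https://cdn.optimizely.com/json/{project_id}.json"


def datafile_url(project_id):
    """Build the CDN URL for a project's datafile."""
    return CDN_TEMPLATE.format(project_id=project_id)


def download_datafile(project_id, timeout=10):
    """Download and parse the datafile; raises on network or JSON errors."""
    with urllib.request.urlopen(datafile_url(project_id), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For the project ID in the example, `datafile_url(12345)` returns `https://cdn.optimizely.com/json/12345.json`.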