A/B Testing and Delivery Pipelines

09-29-21 Adam Simpson

A/B tests in your web application can be useful in many scenarios. However, you need to make sure they are not a substitute for a healthy delivery pipeline.

A/B testing is predicated on one main idea: your product or service will be better if you can measure the effectiveness of small changes. In practice, you make an experimental change to your product, present it to a subset of users, and measure how it performs compared with users who still get the “old” way.

Note: If you’re unfamiliar with A/B testing I highly recommend checking out Netflix’s series of posts about how they tackle A/B testing.

Happy Path

At Sparkbox we’ve utilized two platforms to manage A/B tests for clients: Optimizely and Google Optimize.

Both Optimizely and Google Optimize have WYSIWYG tools that attempt to let you design A/B tests without touching your application code. This doesn’t work out well in practice for applications that use JS on the frontend because you end up in race conditions between the A/B testing tools and your application. The happy path is to eschew these WYSIWYG tools and instead limit their interaction with your product to two vectors: run-time variables and routing.

If you’re testing a feature that will be on an existing page you can have the A/B testing service inject a variable onto the page and then check for that variable in your application code. Essentially it’s a feature flag with traffic rules applied to it.
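As a rough illustration of the variable approach, here is a minimal sketch in TypeScript. It assumes the A/B testing service has been configured to set a global object on the page; the global name `__experiments`, the experiment key `checkout-button`, and the variation names are all hypothetical, not part of either vendor's API.

```typescript
// Hypothetical global name; the A/B testing service would be configured to set it.
const EXPERIMENT_GLOBAL = "__experiments";

// Read the variation for an experiment, falling back to a default when the
// variable is missing or malformed (service blocked, slow, or user not in the test).
function getVariation(
  experiment: string,
  fallback: string,
  scope: Record<string, unknown> = globalThis as unknown as Record<string, unknown>,
): string {
  const vars = scope[EXPERIMENT_GLOBAL];
  if (typeof vars !== "object" || vars === null) return fallback;
  const value = (vars as Record<string, unknown>)[experiment];
  return typeof value === "string" ? value : fallback;
}

// Application code only asks "which variation?" and never touches the vendor script.
function checkoutButtonLabel(scope?: Record<string, unknown>): string {
  return getVariation("checkout-button", "control", scope) === "variant-b"
    ? "Buy now"
    : "Add to cart";
}
```

Because the lookup always has a fallback, a user whose page never receives the variable simply sees the existing experience.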

If you’re testing the addition of a new page to a flow, you could still use the variable approach and have your router check for the variable as it resolves the route. Or you could have the service do the routing directly instead. Regardless of how the A/B testing tool is integrated into your application, it’s important to treat the tool like any other dependency and create a wrapper or service boundary around it so that failures in the service can’t impact your application.
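One way to picture that boundary is the sketch below. The `ExperimentProvider` interface is a hypothetical stand-in for whatever vendor script or SDK is in use, and the experiment name and route paths are made up for illustration. The point is only that every call into the tool goes through one place that catches failures and returns a safe default.

```typescript
// Hypothetical shape of the vendor integration; not a real Optimizely or Google Optimize API.
interface ExperimentProvider {
  getVariation(experiment: string): string | undefined;
}

// Wrap the provider so that a missing, slow, or throwing vendor script
// degrades to the control experience instead of breaking the application.
function createExperimentService(
  provider: ExperimentProvider | undefined,
  onError: (error: unknown) => void = () => {},
) {
  function variation(experiment: string, fallback = "control"): string {
    try {
      return provider?.getVariation(experiment) ?? fallback;
    } catch (error) {
      onError(error); // report the failure, but never rethrow into app code
      return fallback;
    }
  }

  return {
    variation,
    isEnabled: (experiment: string, variant: string): boolean =>
      variation(experiment) === variant,
  };
}

// A router can consult the same service when deciding which page comes next.
// The experiment name and paths here are illustrative only.
function nextCheckoutRoute(
  experiments: ReturnType<typeof createExperimentService>,
): string {
  return experiments.isEnabled("upsell-page", "variant-b")
    ? "/checkout/upsell"
    : "/checkout/payment";
}
```

With this in place, swapping vendors or removing the tool entirely means changing one module rather than every feature that participates in a test.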

A recent scenario with a client highlighted why A/B testing can be so useful. They were testing out a new feature on an existing page and the data coming back was showing that users clearly preferred the older version. They were able to revert the feature completely and plan out a new UX. The decision wasn’t a stressful moment; it was merely a matter of flipping a switch and then working on how to improve the feature.

Before you jump in and add A/B testing to your application, it’s important to evaluate your delivery pipeline and ensure A/B testing is actually the right tool for the job.

Build a Healthy Delivery Pipeline

The A/B testing happy path assumes a healthy delivery pipeline. We’ve written about healthy deployments in the past, but at a high level deployments should be:

  • Automatic
  • Collaborative

These characteristics allow releases to happen rapidly, ideally multiple times a day. If your delivery cadence is weeks or months, A/B testing is likely to cause more pain than it relieves.

Start with Feature Flags

We’re big fans of feature flags at Sparkbox because they break the tie between releases and deployments. You can deploy code at a regular cadence and separately control when each new feature is released by toggling its flag.

If you squint, A/B tests are just feature flags with measurement and user segmentation thrown in the mix. If you can get 80% of the benefit of A/B testing via feature flags, then stop there. Most organizations would benefit from having feature flags—a smaller percentage of that group would benefit from A/B testing.
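To make the "feature flag plus segmentation plus measurement" framing concrete, here is a small sketch. Everything in it is an assumption for illustration: the `Flag` shape, the percentage-based rollout, the hash used for bucketing, and the `recordExposure` callback standing in for an analytics call. It is not how Optimizely or Google Optimize assign users.

```typescript
// A plain feature flag: on or off for everyone.
// Adding rolloutPercent is the "user segmentation" part.
interface Flag {
  enabled: boolean;
  rolloutPercent: number; // 0 to 100
}

type Variant = "control" | "treatment";

interface Exposure {
  flagName: string;
  userId: string;
  variant: Variant;
}

// FNV-1a-style 32-bit hash over UTF-16 code units; any stable hash would do.
function hash32(input: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Deterministic bucket in [0, 100): the same user always lands in the same bucket
// for a given flag, so their experience does not flip between visits.
function bucketFor(flagName: string, userId: string): number {
  return hash32(`${flagName}:${userId}`) % 100;
}

// Recording the exposure is the "measurement" part that turns a flag into a test.
function assignVariant(
  flagName: string,
  flag: Flag,
  userId: string,
  recordExposure: (exposure: Exposure) => void,
): Variant {
  const variant: Variant =
    flag.enabled && bucketFor(flagName, userId) < flag.rolloutPercent
      ? "treatment"
      : "control";
  recordExposure({ flagName, userId, variant });
  return variant;
}
```

Drop the `rolloutPercent` and `recordExposure` pieces and what remains is an ordinary feature flag, which is often all a team needs.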

Match the Freedom of a Hotfix

Even the slowest moving teams have a process in place to handle showstopping bugs in production: a hotfix deployment. Unlike regular feature deployments, a hotfix deployment usually gets to production as quickly as possible. A hotfix deployment is an implicit acknowledgment that a normal feature deployment takes as long as it does because of artificial constraints.

I would suggest that any organization considering A/B testing first spend engineering time and budget on improving the cadence of its delivery pipeline. Focus on having feature deployments match the velocity of a hotfix before considering A/B testing.

Wrapping Up

When you’re looking to introduce A/B tests to your application, it’s also an excellent time to consider the health of your overall application delivery pipeline. If A/B tests are attractive because of how quickly you can enable or disable features, then you should invest in feature flags and shore up your delivery pipeline; focus on shortening the time it takes a feature to get to production. A/B testing is not a cure-all for delivery woes, but it is an excellent way to experiment with and iteratively improve an experience over time.