Everything That Can Go Wrong Will – Use Feature Flags to Manage the Risks

Edward A. Murphy Jr. was an American aerospace engineer who worked on safety-critical systems for the United States Air Force. You might wonder how this is relevant. You see, even though you may never have sat in a fighter jet and were probably still learning to tie your shoelaces when he died in 1990, you have either heard of or used his most famous invention: Murphy’s Law.

Murphy’s Law and product development

Product companies that rely on keeping their customers engaged understand that the speed of deploying new system capabilities is important. Their development teams are furiously pounding away at their keyboards, crafting code they hope will wow the customer, ensuring their work is properly merged in source control and seamlessly delivered for the world to use, enjoy, or hate (depending on which side of the bed the user woke up on that day).

Feature after feature is added to the main branch, and an automated deployment agent stands by to lift the code libraries and carefully place them on a production server. If configured properly, the entire end-to-end process is like a symphony, never missing a beat and flowing seamlessly from idea to value.

Enter Eddy and his ominous warning. It turns out a feature that was added at the last minute and not thoroughly tested has an edge-case defect that is very unlikely to occur. If it is invoked, however, it can lead to a major security flaw that compromises users’ sensitive data and ruins the company’s reputation.

Don’t call your lawyers yet; the development team uses feature flags

Feature flags, also known as feature toggles, allow safer delivery of features by decoupling deployment from feature release. Think of them as “kill switches” that can be used to dynamically reconfigure a production system if necessary. Feature flags support continuous integration/continuous delivery (CI/CD) and foster a culture of experimentation that keeps adding value to the customer’s engagement.
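The decoupling can be sketched in a few lines of Python. Everything here is illustrative: the flag name is made up, and an in-memory dictionary stands in for a real flag service or config store.

```python
# Minimal feature-flag sketch: the new code path is deployed, but stays
# dark until the flag is flipped, and can be killed instantly if needed.
# FLAGS stands in for a real flag service; names are illustrative.
FLAGS = {"new_checkout_flow": False}  # deployed, not yet released

def is_enabled(name: str) -> bool:
    """Look up a flag's current state, defaulting to off."""
    return FLAGS.get(name, False)

def checkout(cart) -> str:
    if is_enabled("new_checkout_flow"):
        return "new flow"   # code is deployed...
    return "old flow"       # ...but users see the old path until release
```

Flipping `FLAGS["new_checkout_flow"]` to `True` releases the feature without another deployment; flipping it back is the kill switch.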

Use feature flags in a repeatable way to ensure predictable outcomes

This analyst always found irony in the DevOps mantra of making processes predictable in order to figure out how to be innovative. In other words, use standardization to discover nonstandard ideas. This was, of course, before I realized how feature flags could support DevOps-inspired delivery pipelines.

In the spirit of standardizing the use of feature flags in product delivery, the following are some best practices to follow when using them:

1. Turn flags on or off on the server to avoid cache invalidation challenges.

Modern web applications, such as SPAs, are feature-rich. The desire to give users an incredible, mind-blowing experience has spawned a whole host of techniques (like caching) in which the browser is literally the computing agent (and with technologies like WebAssembly, this trend is only going to become more pervasive). However, just because client-side processing has become faster does not mean it is necessarily a good idea to manage flag toggling there.

For one, client-side caching (a widely used technique for improving application performance) interferes with keeping a flag’s status synchronized between server and client. If the client manages the flag’s setting and the kill switch for the feature is turned on at the server, what guarantee do you have that the client is connected to the server to pick up the change? Is there an automatic push mechanism from the server to the client?
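A minimal sketch of server-side evaluation follows; `FlagStore` and `render_page` are invented names, not a specific framework. Because the flag is consulted on every request, a flipped kill switch is honored on the very next page load, with no stale client cache in the way.

```python
# Sketch: the flag is evaluated on the server for every request, so a
# flipped kill switch takes effect immediately. FlagStore and
# render_page are illustrative names, assuming no particular framework.
class FlagStore:
    def __init__(self) -> None:
        self._flags: dict[str, bool] = {}

    def set(self, name: str, on: bool) -> None:
        self._flags[name] = on

    def get(self, name: str, default: bool = False) -> bool:
        return self._flags.get(name, default)

def render_page(store: FlagStore, user_id: str) -> str:
    # Decided server-side per request; the client never caches the flag.
    if store.get("beta_banner"):
        return f"beta banner for {user_id}"
    return f"standard page for {user_id}"
```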

2. Feature flags managed on the server exclusively reduce implementation complexity.

Yes, rich clients are more powerful now, with all the processing strength available on our local machines. Nevertheless, there will always be situations where server and client are each managing feature-flag state for different flags, and fragmenting that processing across two domains can get very complex in a hurry.

3. Make feature-flagging decisions close to the point of first contact with the user, especially in distributed microservice architectures built on a domain-driven design model.

Microservices have gained tremendous popularity when it comes to creating modular services that each manage one and only one business domain. In an intelligently designed retail web service, one would expect a microservice for customers, one for checkouts, one for ordering, and so on (with the caveat that sometimes it’s not as easy as alluded to here).

If the company wants to experiment with a feature flag for free priority shipping for a certain type of customer, it makes the most sense to toggle that flag in the customer domain. Realistically speaking, the flag could be manipulated in the checkout or ordering service, but that breaks the rule of domain independence. It is always better to make the feature-flag decision closer to the customer domain, because its outcome is most valuable to the customer.

Making feature-flag decisions close to the business logic controlling the flag’s toggle makes it unnecessary to share the user’s context with other code modules, ensuring the principles of modularity in code writing are maintained. This practice is, however, not easy to implement and requires strong architectural discipline.
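The idea can be sketched as follows, using the free-priority-shipping example above. The service names, the gold-tier rule, and the integer-cent amounts are all assumptions for illustration: the customer domain evaluates the flag once, and the downstream checkout service receives only the boolean outcome, never the user’s full context.

```python
# Sketch: the flag decision is made in the customer domain; only the
# outcome crosses the service boundary. All names and rules here are
# illustrative (gold tier => free priority shipping).
def customer_service_decide(customer: dict) -> bool:
    """Flag decision made close to the user, in the customer domain."""
    return customer.get("tier") == "gold"

def checkout_service(order_total_cents: int, free_priority_shipping: bool) -> int:
    """Downstream service sees only the decision, not the customer record."""
    shipping_cents = 0 if free_priority_shipping else 999
    return order_total_cents + shipping_cents
```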

4. Don’t forget to make the code changes that support feature flags testable.

Code testing for feature flags can span the macro and/or the micro. Macro-tests are black-box tests and should only be concerned with ensuring the expected user experience when the feature flag is turned on.

For those interested in auditing the intermediate steps that follow the activation of a feature flag, process-step-level testing should be done, at a minimum, for both states of the flag (i.e., on vs. off). Many times this analyst has come across situations where feature flags were tested only in the “on” state, based on the assumption that the “off” state was already covered by the core regression or functional test cycles.
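One way to make this concrete is to assert both states explicitly. The discount feature below is hypothetical, and plain assertions are used rather than any specific test framework:

```python
# Sketch: a hypothetical discounted-checkout feature, asserted in BOTH
# flag states rather than assuming "off" is covered elsewhere.
def apply_discount(total_cents: int, flag_on: bool) -> int:
    """10% discount behind a flag; integer cents avoid float drift."""
    return total_cents * 90 // 100 if flag_on else total_cents

def test_flag_on():
    assert apply_discount(1000, flag_on=True) == 900

def test_flag_off():
    # The "off" path gets its own explicit assertion.
    assert apply_discount(1000, flag_on=False) == 1000
```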

5. Database changes must be done systematically and with extreme caution.

Code changes to production systems often require a change to the data model. The schema needs to support any newly deployed code, and sometimes that means applying a migration to the database schema.

The migration can follow either of two approaches: Expand-Contract or Parallel Updates.

With Expand-Contract, the safe option is to update the data model (Expand) without putting referential constraints on the newly added tables. Once the Expand is done, use code changes to write to the new and old data models simultaneously until the switch-over to the new tables is complete. At this point, the old data model is retired, and the data model Contracts to a smaller size.

Parallel Updates require both code and data-model changes to hit production at the same time. Unless thoroughly tested, this is a risky mechanism and can lead to severe outcomes.
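A rough Expand-Contract sequence might look like the sketch below, using `sqlite3` purely as a stand-in for a production database; the table and column names are made up.

```python
# Sketch of Expand-Contract, with sqlite3 standing in for a real RDBMS.
# Table and column names are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, addr TEXT)")

# Expand: add the new column with no constraints; old code keeps working.
conn.execute("ALTER TABLE orders ADD COLUMN shipping_addr TEXT")

# Transition: write both the old and the new column until cut-over.
conn.execute(
    "INSERT INTO orders (addr, shipping_addr) VALUES (?, ?)",
    ("1 Main St", "1 Main St"),
)

# Contract: once every reader uses shipping_addr, drop the old column.
# (Left as a comment; run only after the cut-over has been verified.)
# conn.execute("ALTER TABLE orders DROP COLUMN addr")
```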

6. Remember to clean up flags that are no longer relevant.

Your A/B test was a success. Customers loved the priority shipping option for any order volume over Can$50. The product team is convinced this is a feature worth rolling out, and the enhancement is made public over the weekend. The flag served its purpose, and now it needs a graceful retirement.

Make flag retirement a project task that needs to be completed for project closure.

The principles of Lean and Kanban demand that every bit of work that needs to be done should have visibility. If you don’t see it, you won’t know it has to be done. The same applies to retiring feature flags. Make it part of the project closure tasks to make sure it gets done.

7. A feature flag by any name is a confusing flag.

Like all the variables we code and give unique names to, feature flags deserve clear, descriptive names. Instead of calling a flag “priority_shop,” it is better to label it “priority_shop_front_UI” or “priority_shop_front_DB,” etc.

8. Use server jobs to retire flags.

Use server-side cron jobs (or similar scheduled structures) to periodically check a flag’s best-by date and toggle it off when the conditions are satisfied.
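A scheduled job of this kind might look like the following sketch, where the registry layout, flag names, and dates are all illustrative assumptions:

```python
# Sketch: a job (e.g. run from cron) that turns off flags past their
# best-by date. The registry structure and flag names are illustrative.
from datetime import date

flag_registry = {
    "priority_shipping_ab": {"on": True, "expires": date(2023, 6, 1)},
    "new_search_ui":        {"on": True, "expires": date(2099, 1, 1)},
}

def retire_expired_flags(registry: dict, today: date) -> list:
    """Toggle off every flag whose best-by date has passed."""
    retired = []
    for name, meta in registry.items():
        if meta["on"] and meta["expires"] <= today:
            meta["on"] = False
            retired.append(name)
    return retired
```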

9. Integrate the outcomes from feature flags into a feedback loop for measuring impact on users and systems.

Using feature flags, businesses make managed changes to a system in the hope of observing their impact and improving the business. When assessing the impact of a new feature, it is important to look not only at increases in revenue-generating metrics but also at less obvious indicators like system reliability, scalability, and operational load.
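One hypothetical way to wire flags into such a feedback loop is to record an event for every flag exposure and compare the variants on non-revenue metrics too; every name and field below is an assumption for illustration.

```python
# Sketch: record each flag exposure with operational metrics so both
# variants can be compared. Event fields and names are illustrative.
import time

events = []

def record_exposure(flag: str, variant: str, latency_ms: float, error: bool):
    """Log one user's exposure to a flag variant, plus system metrics."""
    events.append({
        "flag": flag, "variant": variant,
        "latency_ms": latency_ms, "error": error, "ts": time.time(),
    })

def summarize(flag: str) -> dict:
    """Compare mean latency and error rate across the on/off variants."""
    out = {}
    for variant in ("on", "off"):
        rows = [e for e in events if e["flag"] == flag and e["variant"] == variant]
        if rows:
            out[variant] = {
                "mean_latency_ms": sum(r["latency_ms"] for r in rows) / len(rows),
                "error_rate": sum(r["error"] for r in rows) / len(rows),
            }
    return out
```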

Our Take

Feature flags help product delivery teams by reducing the risk that feature releases add to code deployments. They also provide a mechanism for feedback and iteration by linking features to changes in engineering KPIs and product metrics.

We all claim to be learning organizations that love to experiment. Using feature flags to assist with the continual evolution of the business is an approach tested by many of the world’s tech giants. We might not be tech giants, but who’s to say we can’t behave like one?


Want to know more?

Implement DevOps Practices That Work