Feature Flag Guarding
Every new Chromium feature is gated behind a feature flag from the moment its code lands. The flag defaults off, flips only after the launch gate authorizes it, and is removed once the feature has reached Stable without rollback.
A patch that compiles and passes the tests is not a patch that’s ready to reach users. In Chromium, the gap between those two states is bridged by a base::Feature declaration, a BASE_FEATURE definition that fixes the flag’s default state, and a call-site check that reads the value at runtime. Code that lands without this gate runs in Canary the same day it merges. Code that lands with it runs only when the experiment infrastructure, the Origin Trial portal, or the Intent to Ship gate has authorized exposure for that channel and that population.
Context
A Chromium feature is built by a small team, lives in chromium/src alongside thousands of other in-progress features, and reaches a user population that runs into the billions once it ships to Stable. The same source tree feeds the four channels in parallel: Canary builds from tip-of-tree every working day, while Stable builds from a branched milestone every four weeks. A feature’s code lands once; its exposure is what the channels and the experiment infrastructure modulate. The pattern operates in that gap between landing and exposure: at the call site, in the runtime check, and in the cleanup record after the feature stabilizes.
Problem
A feature owner has tests passing, OWNERS approval on the implementation, and a green commit queue. The natural move is to land the code, run it on Canary the same day, and let the next channel promotion carry it into Beta and Stable. That move exposes the feature to every Canary user before the Intent to Ship gate has been cleared, before any Origin Trial has produced compatibility data, and before any Finch experiment has measured stability under traffic. It also makes the feature impossible to disable without a revert: a kill-switch needs a flag to operate on. The recurring problem is how a project that lands hundreds of patches a day can land novel call-site behavior without simultaneously activating it.
Forces
- Code freshness vs. exposure control. Reviewers prefer features to land in small patches close to when they were written; product owners need exposure to be staged, optional, and revocable.
- One source tree vs. four channels. A single landing has to support a different runtime behavior in each of the four channels (for example, enabled for an experiment population in Canary and Dev, off in Beta and Stable) without forking the tree.
- Experiment infrastructure vs. call-site discipline. Finch and Origin Trials can flip a feature’s exposure, but only if the call site reads a value they can flip. Code that hard-codes its behavior bypasses both.
- Long-lived flags vs. dead code. A flag that outlives its feature swells binary size, complicates the call site, and invites the Zombie Origin Trial and Experiment That Became Permanent failures.
Solution
The Chromium project requires that every new feature land behind a base::Feature flag declared in a _features.h header and read at every call site through base::FeatureList::IsEnabled(). The flag has a canonical declaration shape, a documented default value, and a cleanup obligation at end of life.
A typical declaration, in a header such as content/browser/some_feature/some_feature_features.h:
BASE_DECLARE_FEATURE(kSomeFeature);
Its definition in the matching .cc file:
BASE_FEATURE(kSomeFeature,
             "SomeFeature",
             base::FEATURE_DISABLED_BY_DEFAULT);
The flag’s string name ("SomeFeature") is what Finch configs, Origin Trial registrations, and chrome://flags listings refer to. The default value (base::FEATURE_DISABLED_BY_DEFAULT or base::FEATURE_ENABLED_BY_DEFAULT) is what runs when no experiment, no Finch override, and no command-line switch is in effect.
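How a compiled-in default interacts with overrides can be modelled in a few lines. The sketch below is a stand-alone toy, not Chromium’s implementation: FeatureState, Overrides, and ResolveState are hypothetical names, and the precedence it encodes (command-line switch first, then experiment config, then the compiled default) is an assumption of the model.

```cpp
#include <optional>

// Toy model only; these types are hypothetical and do not mirror base::Feature.
enum class FeatureState { kDisabled, kEnabled };

struct Overrides {
  std::optional<FeatureState> command_line;  // e.g. a developer's local switch
  std::optional<FeatureState> experiment;    // e.g. a server-pushed config
};

// Returns the effective state. Assumed precedence: the first override that is
// present wins; the compiled-in default applies only when there is none.
inline FeatureState ResolveState(FeatureState compiled_default,
                                 const Overrides& overrides) {
  if (overrides.command_line) return *overrides.command_line;
  if (overrides.experiment) return *overrides.experiment;
  return compiled_default;
}
```

With both override slots empty, the function returns whatever default was compiled in, which is the case the paragraph above describes.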
Every call site that depends on the feature’s behavior reads the flag through base::FeatureList::IsEnabled(kSomeFeature), never through a hard-coded check or a build-time #ifdef. The check sits in front of the new code path; the existing path remains in place until the flag is removed at cleanup time. This shape opens three runtime levers. Finch can flip the value for any population it targets. The Origin Trial server can enable the feature for sites that hold a valid token. A release engineer can disable the feature for the entire user base by pushing a Finch kill-switch config, without shipping a binary.
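The shape of the call-site check can be illustrated without the Chromium tree. The following is a minimal stand-alone sketch: Feature, FeatureRegistry, SetOverride, HandleRequest, LegacyPath, and NewPath are all hypothetical stand-ins that imitate the pattern, not the real base::FeatureList API. In Chromium itself the branch would read base::FeatureList::IsEnabled(kSomeFeature).

```cpp
#include <string>
#include <unordered_map>

// Toy stand-ins; hypothetical names that only imitate the shape of the pattern.
struct Feature {
  const char* name;
  bool enabled_by_default;
};

class FeatureRegistry {
 public:
  // Models an override (server-side or local) keyed by the flag's string name.
  void SetOverride(const std::string& name, bool enabled) {
    overrides_[name] = enabled;
  }

  // An override wins if one exists; otherwise the compiled default applies.
  bool IsEnabled(const Feature& feature) const {
    auto it = overrides_.find(feature.name);
    return it != overrides_.end() ? it->second : feature.enabled_by_default;
  }

 private:
  std::unordered_map<std::string, bool> overrides_;
};

constexpr Feature kSomeFeature{"SomeFeature", /*enabled_by_default=*/false};

inline std::string LegacyPath() { return "legacy"; }
inline std::string NewPath() { return "new"; }

// The call site: the check sits in front of the new path, and the legacy
// path stays in place until the flag is removed at cleanup.
inline std::string HandleRequest(const FeatureRegistry& registry) {
  if (registry.IsEnabled(kSomeFeature)) {
    return NewPath();
  }
  return LegacyPath();
}
```

Because HandleRequest consults the registry on every call, flipping the override changes behavior without touching the call site, which is the property the kill-switch relies on.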
The cleanup obligation closes the loop. Once the feature has cleared Intent to Ship, has reached 100% of Stable, and has held there long enough to confirm no rollback is forthcoming (typically two stable cycles), the flag and its default-disabled code path are removed in a follow-up patch. The flag’s owner, named in the flag_metadata.json entry, is on the hook for the cleanup. Long-lived flags that miss cleanup show up in the periodic flag-audit sweep and generate tracking bugs.
What makes the pattern work is the absence of escape hatches. The gate is a runtime check, not a build switch, so a feature can’t ship to Canary while staying off in Stable through a compilation flag. The flag’s name is registered in flag_metadata.json and surfaced in chrome://flags, so a release engineer or QA contractor running into the feature on a Canary build can name it without reading the source. The cleanup obligation is tracked in flag_metadata.json’s expiration field; flags past their cleanup target produce build-time warnings.
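An audit sweep over expiration data can be pictured as a simple filter. The sketch below is illustrative only: FlagRecord, its field names, and FindOverdueFlags are hypothetical and are not claimed to match the real metadata schema or the project’s actual tooling.

```cpp
#include <string>
#include <vector>

// Hypothetical record; field names are illustrative, not the real schema.
struct FlagRecord {
  std::string name;
  std::string owner;
  int expiry_milestone;  // milestone at which the flag counts as overdue
};

// Returns the names of flags whose cleanup target is at or before
// `current_milestone`, in their original order.
inline std::vector<std::string> FindOverdueFlags(
    const std::vector<FlagRecord>& flags, int current_milestone) {
  std::vector<std::string> overdue;
  for (const FlagRecord& flag : flags) {
    if (flag.expiry_milestone <= current_milestone) {
      overdue.push_back(flag.name);
    }
  }
  return overdue;
}
```

Each name returned would correspond to one warning or tracking bug addressed to the recorded owner.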
How It Plays Out
A team at Igalia lands a new Web API behind kMyApi, defaulted off. The first Canary build carries the new code path but doesn’t execute it; the existing call site routes through the legacy path unchanged. A blink-dev Intent to Experiment thread requests Origin Trial registration. The team configures the Origin Trial server to accept tokens scoped to kMyApi. Three weeks later the trial is live: Canary, Dev, Beta, and Stable users hitting sites that include a valid token execute the new path; everyone else continues to run the legacy code.
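The per-site gating in this scenario can be sketched as a two-step decision. This is a simplified model under loud assumptions: TrialToken, TokenEnablesFeature, and IsEnabledForPage are hypothetical names, and the token is a plain struct, whereas real Origin Trial tokens are signed and verified cryptographically.

```cpp
#include <cstdint>
#include <string>

// Hypothetical, simplified token. Signature verification is deliberately
// omitted from this sketch.
struct TrialToken {
  std::string origin;
  std::string feature_name;
  std::int64_t expiry_unix_seconds;
};

// A token applies only to its own origin and feature, and only before expiry.
inline bool TokenEnablesFeature(const TrialToken& token,
                                const std::string& page_origin,
                                const std::string& feature_name,
                                std::int64_t now_unix_seconds) {
  return token.origin == page_origin && token.feature_name == feature_name &&
         now_unix_seconds < token.expiry_unix_seconds;
}

// A page runs the new path if the flag is on for the whole client, or if the
// page presents a token matching its origin and the feature.
inline bool IsEnabledForPage(bool flag_enabled_globally,
                             const TrialToken* token,
                             const std::string& page_origin,
                             const std::string& feature_name,
                             std::int64_t now_unix_seconds) {
  if (flag_enabled_globally) return true;
  return token != nullptr &&
         TokenEnablesFeature(*token, page_origin, feature_name,
                             now_unix_seconds);
}
```

Pages without a token, with a token for another origin, or with an expired token all fall through to the legacy path in this model.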
Compatibility data accumulates; the team revises the API; the trial ends; the Intent to Ship thread clears with three API-owner LGTMs; Finch begins a 1% Stable rollout, then 10%, then 100%. Two stable cycles after 100%, a cleanup CL removes the flag and the legacy path. The feature is now baseline. The whole arc, from first landing to flag removal, has spanned roughly six months. The same arc without a feature flag would have begun with a Canary regression on day one.
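A staged rollout of this kind needs each client to land in a stable bucket, so that a client enabled at 1% stays enabled at 10% and 100%. One way to get that property is deterministic hashing, sketched below. This is an illustration, not Finch’s actual assignment algorithm: Fnv1a, BucketFor, and InRollout are hypothetical helpers, and FNV-1a is chosen only because it is short and deterministic.

```cpp
#include <cstdint>
#include <string>

// FNV-1a, 32-bit: a simple deterministic hash used here only for illustration.
inline std::uint32_t Fnv1a(const std::string& text) {
  std::uint32_t hash = 2166136261u;
  for (char c : text) {
    hash ^= static_cast<unsigned char>(c);
    hash *= 16777619u;
  }
  return hash;
}

// Maps a (feature, client) pair to a stable bucket in [0, 100). Mixing in the
// feature name keeps one client from sharing a bucket across every rollout.
inline std::uint32_t BucketFor(const std::string& client_id,
                               const std::string& feature_name) {
  return Fnv1a(feature_name + ":" + client_id) % 100u;
}

// A client is in the rollout when its bucket falls below the percentage.
// Raising `percent` only ever adds clients; it never removes them.
inline bool InRollout(const std::string& client_id,
                      const std::string& feature_name,
                      std::uint32_t percent) {
  return BucketFor(client_id, feature_name) < percent;
}
```

Moving from 1% to 10% to 100% then amounts to changing a single number in configuration while every client’s bucket stays fixed.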
A second team lands code without a flag. The patch is technically correct, OWNERS-approved, and passes presubmit. It changes the behavior of a navigation throttle in a way that turns out to interact badly with an enterprise policy still in active use. Canary users at managed-Chromium deployments hit the regression within twenty-four hours; an incident report lands on the team’s calendar that afternoon; the Tree Sheriff reverts the patch the next morning without waiting for the author to triage. The team rewrites the change behind kNavigationThrottleNewBehavior, lands it defaulted-off, runs a two-week Finch experiment, finds the same interaction, fixes it, and ships through the normal pipeline. The team has spent an extra two engineering weeks and surfaced one incident report; the cost of the missing flag was paid in operational disruption and a revert on the public record.
A third case: a Finch kill-switch incident. A Stable feature defaulted on at 100% begins to show elevated crash rates two days after a milestone promotion. The release engineering team pushes a Finch config that overrides kThatFeature to disabled for the entire Stable population; the feature is off across roughly a billion installs within hours, no binary update required. The team triages the crash, fixes the underlying issue in a follow-up patch behind the same flag, and re-enables Finch traffic gradually. The kill-switch only worked because the call site read the flag through base::FeatureList::IsEnabled(); a hard-coded path would have required a binary respin and a stable-channel emergency release.
Consequences
Benefits. Every feature reaches Canary as inert code. Every subsequent stage of channel exposure or trial enrollment is a deliberate configuration change rather than a code change. Every Stable launch has a kill-switch the release-engineering team can pull without a binary update. The pattern is also a structural defense against the Experiment That Became Permanent antipattern: the cleanup obligation gives the project a defensible mechanism for removing trial surface once a feature has stabilized.
Liabilities. Flag overhead is real. Every call-site check runs at runtime, every flag occupies a slot in the FeatureList registry, and a long-lived flag’s two code paths both have to be maintained until cleanup. Some features carry their flags for years past the point where they should have been cleaned up; the audit and warning machinery is what keeps that tail bounded. The pattern also places an ongoing obligation on every feature owner. Cleanup is rarely as exciting as launch, and an owner who has moved on can leave a dangling flag that no one else feels responsible for.
The pattern doesn’t guarantee that a feature is correct. It guarantees something narrower: the feature’s exposure is decoupled from its implementation, a problem found in Canary or Beta can be silenced without a code change, and the project always has a path to disable a feature server-side at Stable. That alone is what allows Chromium to land code at the rate it does without breaking the channels it ships through.
Notes for Agent Context
When implementing a new Chromium feature, declare a base::Feature in a _features.h header before writing the call-site code, and read it through base::FeatureList::IsEnabled() at every branch that depends on the new behavior. Never hard-code the new path on, never gate it behind a #ifdef, and never rely on a build flag for runtime behavior. Set the default to base::FEATURE_DISABLED_BY_DEFAULT unless the feature has already cleared Intent to Ship and is in cleanup. Register the flag in flag_metadata.json with an owner email and a target expiration date. The implementation is not complete until the call site uses IsEnabled() and the flag has a flag_metadata.json entry. An OWNERS-approved patch that lands without a flag will be reverted by the Tree Sheriff once the Canary regression report arrives.
Related Patterns
Sources
The canonical reference is the Chromium base::Feature system, designed by the //base team to give the project a uniform runtime gate that Finch, Origin Trials, and chrome://flags could all target. The flag-ownership policy was formalized in docs/flag_ownership.md to assign cleanup responsibility to a named individual, after a series of long-lived flags accreted as effectively-permanent surface and produced binary-size regressions that traced back to no clear owner. The flag-cleanup expectation aligns with the project’s broader compatibility commitment described in Web Platform Backward Compatibility: features land behind a flag so they can be removed without breaking sites if the rollout reveals a problem the design review did not anticipate.
Technical Drill-Down
- Chromium Feature List API (base/feature_list.h): the canonical declaration site for BASE_FEATURE, BASE_DECLARE_FEATURE, and base::FeatureList::IsEnabled().
- Chromium flag-ownership documentation: the per-flag ownership and expiration policy, including the flag_metadata.json format and the cleanup-warning mechanism.
- chrome://flags exposure: the surface that exposes named feature flags to developers and QA contractors at runtime.
- Finch experiment documentation: how Finch configs target named feature flags to flip default values for population subsets.
- Origin Trials developer documentation: the site-operator-facing surface that issues tokens scoped to named feature flags.