[DO NOT MERGE] featureflag: prototype license-aware feature gates#171314
[DO NOT MERGE] featureflag: prototype license-aware feature gates#171314vishalv1994 wants to merge 7 commits into
Conversation
|
Merging to
After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here |
|
Detected infrastructure failure (matched: self-hosted runner lost communication with the server). Automatically rerunning failed jobs. (run link) |
dt
left a comment
There was a problem hiding this comment.
This looks more or less like what I as was picturing. I'd say it's OK to change the existing featureflag call sites even more if we wanted, e.g. to remove the explicit setting registration they're doing and have WithSetting subsume the setting registration too, but that can easily happen later too.
| // gate does not register a new setting; the caller supplies one of the existing | ||
| // feature.*.enabled bools. When the setting is false, Enabled denies the | ||
| // feature even if the license permits it. | ||
| func WithSetting(s *settings.BoolSetting) Option { |
There was a problem hiding this comment.
i think this could take the setting name and register it for them (since Register is an init-time call too) if we wanted to really slim down the boilerplate of defining a feature flag that has the option of operator control.
There was a problem hiding this comment.
Yes, updated with the latest push. WithSetting now takes just the name and handles registration internally.
| // features lists the entitlements this license grants its holder. Field | ||
| // number 12 is chosen to stay wire-compatible with the managed-service | ||
| // proto, which already uses field numbers 8 through 11. | ||
| repeated Feature features = 12; |
There was a problem hiding this comment.
Impl nit: I think we should switch this to a compact []byte packed a bitflag (using emum IDs for shifts) since 4 bytes or 8 bytes per bool is pretty bloated for something that gets moved around by humans as a string on their terminal or clipboard. wire-compat with manage-service doesn't seem worth a worse UX for our users (of having big, multi-kb license strings) so we could either just not worry about identical wire format or, better yet, managed-service could switch to the superior representation as well if/when we decide we want them to match.
There was a problem hiding this comment.
Makes sense! The bitflag approach is definitely more efficient. My main question is whether we should make that change now or defer it.
With only 8 features today, the difference in license string length is roughly 10 characters, so it’s not really a UX concern yet.
If we ship the repeated Feature encoding first and decide to move to a bitmask later, we’ll either need to support decoding both formats or reissue licenses. Since there’s currently no migration cost, I’m leaning toward doing the bitflag encoding now.
Does that align with your thinking?
There was a problem hiding this comment.
only 8 features today
Alternative framing : we haven't even launched this yet or incorporated it into every DB team's day to day workflows and we're already at 8 -- and 15% increase in the user-visible/user-handled license strings. It seems likely that we'll want teams to put every feature/option/aspect of a feature that we might want to at some point unbundle/make an add-on/repackage behind its own discrete flag to ensure those decisions can be made externally at lic-gen time, so we should probably expect this list to only grow from here.
currently no migration cost, I’m leaning toward doing the bitflag encoding now.
Agreed: it's much easier to make the switch now before we externally release anything than do it later when it'll become a Migration™. There's ~no reason not to just do it now, it's already a win for the lic size and will only be moreso later.
| // installed. It is populated by an init() in pkg/server/license to avoid an | ||
| // import cycle (server/license imports featureflag, not the reverse). When | ||
| // nil, the license gate is permissive. | ||
| var GetLicenseHook func(st *cluster.Settings) (*licensepb.License, error) |
There was a problem hiding this comment.
not a blocking concern right now but eventually I think this would be cleaner if we could have featureflag depend on the license setting; perhaps that means the setting is in the wrong place if pkg/server/license is depending on featureflag and we'd want to move the setting itself to a util pkg separate from the the higher level code in pkg/server that is probably what's pulling in featureflag.
but again, not blocking for now -- just something i'd maybe have in a TODO.
There was a problem hiding this comment.
Agreed, the hook is there purely to break the cycle.
Moving the license setting (or at least the GetLicense accessor) to a lower-level package that featureflag can import directly would be cleaner. I'll add a TODO for this. It'll make the code more straightforward once we sort out the package layering.
I think those are't feature flags; I see two answers there: a) can we redefine them as feature flags? e.g. would product be able to define their segmentation as "replication factors > 3 are now defined as 'enhanced replication controls', which is a feature that has to be granted" rather than retaining control over specific numbers? that seems like it could work if there are one or two step-change/categorical values (e.g. >3, >5). b) If not, I'd say calling code should just call GetLicense and read the specific constant, since this is so specific to that exact case that there isn't really a generalizable, reusable pattern for us to encode in common infra here. |
I agree with this approach. Looking at the edition and add-on features we’ve identified so far, representing them as boolean flags seems like a clean and straightforward solution. Keeping any capabilities that don’t naturally fit into the |
Add a top-level Feature enum and a repeated Feature features field (number 12, wire-compatible with the managed-service proto's 8-11) so a license can carry the set of entitlements it grants. This is the first building block for license-aware feature gates. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add a FeatureGate type that combines a license-entitlement check (read directly from the license proto via a hook) with an optional operator cluster setting, plus experimental/cloud-only stubs. This is the core of the license-aware feature-gate prototype; existing call sites are not yet migrated. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Populate featureflag.GetLicenseHook from an init() so feature gates can read license entitlements. The dependency direction is server/license -> featureflag, avoiding an import cycle. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Demonstrate the feature-gate API shape with a license-only gate (multi-region) and a license+setting gate (changefeed), exercising Register, WithSetting, and Enabled without touching real call sites. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Generated by ./dev generate bazel as a consequence of adding the featureflag example test target. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…enial Reword the WithSetting comment to describe current behavior rather than narrate an alternative, and add a test that overrides GetLicenseHook to exercise the license-entitlement deny path (a license granting BACKUP but not CHANGEFEED), demonstrating the core gating mechanism. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Address review feedback: WithSetting now takes the setting name (string) and calls settings.RegisterBoolSetting on the caller's behalf, removing the need for each call site to separately register a feature.*.enabled bool. A new Setting() accessor exposes the registered *BoolSetting for direct reads or test overrides. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
fedc50a to
6ddd2ba
Compare
Summary
Interface-skeleton prototype of a license-aware feature-gate model: extend the
existing
featureflagpackage so a single gate can combine a license-entitlementcheck (read directly from the license proto — not denormalized settings) with the
existing optional operator cluster setting, plus experimental/cloud-only stubs.
licensepb: add a top-levelFeatureenum andrepeated Feature features = 12(field 12 chosen to stay wire-compatible with the managed-service proto's 8-11).
featureflag: addFeatureGatewithRegister+ functional options(
WithSetting/WithName/WithExperimental/WithCloudOnly) andEnabled,which evaluates the license gate then the operator setting.
server/license: populatefeatureflag.GetLicenseHookfrominit()— theacyclic dependency direction (
server/license→featureflag).license+setting) plus the license-denial path.
This deliberately does NOT migrate the ~9 existing real call sites (changefeed,
backup, etc.); that migration is a follow-up. Early commits add the proto + gate
machinery; the later commits wire the hook and add tests.
Open questions (for review)
featuresis empty(old-style license), should gates be permissive or strict? Prototype is
permissive and flags the decision inline.
repeated Featureenum — they need typed fields and a non-boolean gatemethod. Deferred.
WithExperimental/WithCloudOnlysemantics: what do they evaluate (buildtag? version gate? license environment? injected setting?). Stubbed.
features = 12andresolve edition+add-ons into it. Out of scope for this skeleton; required
counterpart.
Epic: none
Release note: None