Skip to content

license key implementation #31766

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Apr 3, 2025
Merged

license key implementation #31766

merged 5 commits into from
Apr 3, 2025

Conversation

doy-materialize
Copy link
Contributor

Motivation

https://github.com/MaterializeInc/cloud/issues/10759

this is the first draft of an implementation of license keys (more details can be found at https://www.notion.so/materialize/License-keys-1a713f48d37b80298db2d241b90fcfd5). the basic outline is that environmentd will read the max credit consumption rate from the signed license key as well as a flag to determine whether both the credit consumption rate limit and the credits each cluster replica uses can be overridden by the user. if no license key is provided, the user will be limited to 24G of clusters. in cloud, we'll be able to provide a special license key which disables these checks.

Tips for reviewer

opening this as a draft to see how it works in ci and if there is anything that needs to be fixed in the test suite. also, the public key used for license validation is currently coming from a random kms key in the scratch account, and i'll need to do the real implementation of license key issuing on the cloud side first before we can merge this, in order to get the real pubkey to use.

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

@doy-materialize doy-materialize force-pushed the push-uzwmlvurxytq branch 5 times, most recently from 7823431 to 71a91df Compare March 5, 2025 23:08
@@ -43,6 +43,8 @@ pub struct SystemParameterSyncConfig {
/// to use when populating the [SynchronizedParameters]
/// instance in [SystemParameterFrontend::pull].
ld_key_map: BTreeMap<String, String>,
/// max_credit_consumption_rate override from the license key.
pub max_credit_consumption_rate: Option<f64>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering if we are overloading max_credit_consumption_rate here rather than moving to something like max_memory_consumption_rate . The latter would open the door for max_disk_consumption_rate as well, but it means adding additional validations during ddl planning?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we probably will need to do this eventually, but that would be a more complicated change that i'd like to leave for the future, to limit the scope here (there's already a lot going on in this pr)

if !license_key_config.allow_credit_consumption_override {
for (name, replica) in cluster_replica_sizes.0.iter_mut() {
let memory_limit = replica.memory_limit.get_or_insert_with(|| {
warn!("No memory limit found in cluster definition for {name}, defaulting to 4GiB");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems odd. Whenever these values are read in we want them to be parsed, for instance, if we added a systemvar to allow defining cluster sizes we'd also need to apply this logic there. It might make sense to this to be a function on ClusterReplicaSizeMap, or have a builder for ClusterReplicaSizeMap that allows somethingl ike a with_default_memory_limit . Alternatively we added memory requests it may not be obvious that we should update this to be the request value rather than 4Gi.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, i wasn't quite sure what to do here - if a user is using a license key with limited credits/memory usage, we can't allow using clusters that don't have memory limits set, but a lot of our tests currently want to do this. it might be better to just fail here instead of setting a default, and update the tests instead, i'll look into it

@doy-materialize doy-materialize force-pushed the push-uzwmlvurxytq branch 2 times, most recently from 90677a1 to d1cee6a Compare March 13, 2025 19:48
Copy link
Member

@ParkMyCar ParkMyCar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me! Thanks for backing out the SystemVar changes!

Comment on lines 705 to 708
} else if matches!(args.orchestrator, OrchestratorKind::Kubernetes) {
panic!("--license-key is required when running in Kubernetes");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any chance we could further match here on the cloud provider to skip this check entirely for the Cloud offering? (Note: I'm not sure what cloud provider self hosted folks get reported as)

Also any reason for a panic instead of returning an error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think we want to be able to bypass needing a license key like that? my understanding was we wanted all community edition usage to be restricted. self-hosted users will also be using the aws cloud provider (the same thing our cloud product uses) if they are running in aws

@doy-materialize
Copy link
Contributor Author

fixed up the panics and hardcoded constants

@doy-materialize doy-materialize force-pushed the push-uzwmlvurxytq branch 6 times, most recently from 79d765d to 6e671df Compare March 27, 2025 16:55
@doy-materialize doy-materialize force-pushed the push-uzwmlvurxytq branch 8 times, most recently from d8cff0a to 3db5e25 Compare March 28, 2025 23:22
@doy-materialize doy-materialize force-pushed the push-uzwmlvurxytq branch 5 times, most recently from 3286322 to be5770d Compare April 1, 2025 15:29
@doy-materialize
Copy link
Contributor Author

okay, this appears to be working now - i still need to do some testing on the cloud side to make sure that things don't break when a release with these changes goes out, but it should be ready for one final review

@doy-materialize doy-materialize force-pushed the push-uzwmlvurxytq branch 3 times, most recently from 9bea33c to 1b753bc Compare April 2, 2025 18:20
if no license key is provided, default to a limit of 24 (cloud will use
a special license key which disables the check entirely)

if we have loaded a max_credit_consumption_rate, don't allow it to be
modified by anyone. this should ensure that we can continue managing
max_credit_consumption_rate as before in cloud (which will use a
specially issued license key which disables these checks) but anyone
using the emulator/self-hosted will either have a custom limit loaded
from their license key, or a default limit if they don't provide a
license key, neither of which can be overridden.
cloudtests are using the deployed docker image but want to run the tests
in kubernetes, but the image that we provide won't run in kubernetes
without a license key

additionally, some of the non-kubernetes tests want to run more than 24G
of clusters, so we need to pass through the license key for those too
@doy-materialize doy-materialize marked this pull request as ready for review April 3, 2025 16:35
@doy-materialize doy-materialize requested review from a team and ggevay as code owners April 3, 2025 16:35
@doy-materialize doy-materialize requested a review from aljoscha April 3, 2025 16:35
@doy-materialize doy-materialize changed the title [wip] license key implementation license key implementation Apr 3, 2025
@doy-materialize
Copy link
Contributor Author

fairly sure the remaining sqlsmith errors are #32081, so i'm going to ignore those for now

@doy-materialize doy-materialize merged commit c4b507c into main Apr 3, 2025
253 of 255 checks passed
@doy-materialize doy-materialize deleted the push-uzwmlvurxytq branch April 3, 2025 18:44
@@ -289,6 +290,16 @@ def __init__(
config["platform"] = platform

volumes = []

if image_version is None or image_version >= "v0.140.0-dev":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does string comparison work for a Version object?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants