Proposal: prefix output #126


Open
Gowiem opened this issue Apr 6, 2021 · 5 comments

Comments

Gowiem (Member) commented Apr 6, 2021

Describe the Feature

I find it common that folks will utilize what I think of as the label's prefix (e.g. ${namespace}-${stage}-${environment}) as a string interpolation prefix to some identifier that they want to create without building a full label.

Most of the time this is used as a crutch to avoid having to create a label module for each namable thing that you're working on. It could definitely be argued that this is a bad practice and that it shouldn't be done... but I've seen it enough now across multiple codebases that I know I'm not the only one who does it so figured this proposal was worth the discussion.

The proposal is to add a new output called prefix that is simply all label components up until the name component. e.g. ${namespace}-${stage}-${environment}, ${namespace}-${stage}, ${namespace}-${environment}, etc.

I would need to look into the code more to determine how we would handle that with regard to label ordering, but figured I could bring this up for discussion before coming up with a proper solution.
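One hedged sketch of how such an output could respect the configured label ordering, assuming the module internally holds an ordered list of component keys and a map of their resolved values (the local names `label_order`, `delimiter`, and `id_parts` here are hypothetical, not the module's actual internals):

```hcl
# Hypothetical sketch only -- emit every ordered label component that
# precedes "name", joined with the configured delimiter.
locals {
  # e.g. label_order = ["namespace", "environment", "stage", "name", "attributes"]
  prefix_order = slice(local.label_order, 0, index(local.label_order, "name"))
  prefix_parts = [for k in local.prefix_order : local.id_parts[k] if local.id_parts[k] != ""]
}

output "prefix" {
  value       = join(local.delimiter, local.prefix_parts)
  description = "All enabled label components preceding the name, joined by the delimiter"
}
```

Because the prefix is derived from the same ordered list as the full ID, any `label_order` customization would automatically carry over to the prefix.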

Use Case

This would enable not having to create this label prefix for ad-hoc usages and therefore DRY up some Terraform string interpolation code.
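As a before/after illustration of the use case (resource names here are made up; the `prefix` output is the proposal, not an existing feature):

```hcl
# Today: the prefix gets rebuilt by hand wherever an ad-hoc name is needed.
resource "aws_sqs_queue" "orders" {
  name = "${var.namespace}-${var.stage}-${var.environment}-orders"
}

# With the proposed output, the label module stays the single source of
# truth for component ordering and the delimiter.
resource "aws_sqs_queue" "orders" {
  name = "${module.label.prefix}-orders"
}
```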

Alternatives Considered

Alternatives:

  1. Don't do this... I'd be fine to hear the feedback here say something along the lines of: Hey that should be considered bad practice and we don't want to support it.
  2. Create a prefix label that is only made up of the components that you want instead of outputting the prefix from more specific labels.
Jamie-BitFlight (Contributor) commented Apr 6, 2021 via email

Gowiem (Member, Author) commented Apr 6, 2021

@Jamie-BitFlight yeah, I get that POV, and after getting thoughts from the wider Cloud Posse team on this I'm going to close this out.

Gowiem closed this as completed Apr 6, 2021

Gowiem (Member, Author) commented Feb 22, 2024

I'm going to reopen this for discussion. I still see it being done, and I honestly think it has merit. I'm not sure if this is just me being bullheaded and misremembering my conversation with the community, but I want this discussed here in this issue before calling it closed, as I'm coming back to this issue years later and still wanting this functionality.

Gowiem reopened this Feb 22, 2024

john-heinnickel commented

I'm just arriving at this module today after a series of ocean-boiling attempts to find a DRY way to manage tags and labels. The pain point, as I see it, is less the proliferation of module declarations than the number of variables that have to get propagated alongside the functional metadata defining qualities like instance type, capacity pre-provisioning, queue type, message count, or whatever size and behavior modifiers are needed to configure the behavior and characteristics of whatever named resources a module is created for.

Among the things we've tried is consolidating all resource name and tag set generation into a module that receives a three-tiered map, where the outer map is keyed by service type, the second tier by resource type, and the third tier by a functional role name leading to a leaf node object that collects all the values that distinguish one resource from another:

  • Its "functional" name (optional--the key at the innermost level is used instead if omitted)
  • The "workload" it belongs to (a grouping token bigger than a bread box, smaller than your apartment building--borrowed from AWS's naming conventions for their solutions)
  • A structure with the functional value-matching tokens that map to the tags used in our attempt at ABAC policy
  • An indicator of whether the resource acts as a Principal or a Resource when tagged for policy
  • A multiplier attribute for stateful set expansion
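The leaf node object described above could be declared roughly like this in Terraform (all names are hypothetical, and `optional()` attributes with defaults assume Terraform 1.3+):

```hcl
# Hypothetical sketch of the three-tiered catalog:
# service type -> resource type -> functional role -> naming/tagging leaf
variable "resource_catalog" {
  type = map(map(map(object({
    functional_name = optional(string)     # falls back to the role key when omitted
    workload        = string               # grouping token for the naming convention
    abac_tags       = map(string)          # value-matching tokens mapped to ABAC policy tags
    policy_role     = string               # "principal" or "resource" when tagged for policy
    multiplier      = optional(number, 1)  # stateful set expansion count
  }))))
}
```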

We came to find that some rare cases need resource-type-specific naming convention changes:

  • Load Balancer Target groups require aggressive character sparing, to the point of cutting out vowels or using a shorter form of the Workload token
  • IAM Policies, IAM Roles, SSM Parameters, and S3 objects make semantic use of path hierarchy that merits use of a "/" separator that would lead to an illegal name in other resource types
  • Some service consoles let tags serve as console columns, so non-identifying attributes can remain tags; consoles that don't (e.g. SQS) make a stronger case for packing more qualifiers into names, though not a strong enough case to forgo also keeping the attributes in tags for programmatic matching.

When we started considering the "Workload" dimension in particular (and, I suspect, had our application made more use of multiple regions), we came to appreciate that scoping often changes the number of qualifiers we need in certain scenarios:

  • Development environments sometimes used more than one VPC to host applications, but leveraged a central service VPC in each region, with a hub and spoke arrangement allowing Private Link endpoints created in a shared VPC to handle traffic from each of the application-hosting VPCs without paying to replicate PrivateLink endpoint for each application. Some of these VPCs were partitioned to repeat the same isolated groups of public/shared/backend subnets.
    • Deploying an application into these cost-sparing configurations required more disambiguation in its naming, producing conventions that would look silly if translated 1:1 to production; conventions that seem obvious at first may end up justifying some refactoring.
      • Consider an account with three VPCs that are each provisioned to support eight parallel subnet sets. The IP ranges are preallocated, so it's not practical to resize them, which is how we ended up with three VPCs instead of one larger VPC with 24 subnet groups. That, and it's hard to find a naming convention with 24 recognizable names.
        • The VPCs are named with a weather convention: "wind", "rain", and "snow".
        • The subnet groups use eight color names each (chosen for minimal character count): Red, Blue, Gray, Cyan, Lime, Teal, Navy, Pink
      • In this convention, it makes sense to see the VPCs themselves named "wind", "rain", and "snow", while the gateways and subnets have names that add functional and availability zone modifiers (public/private/isolated):
        • wind-red-public-a
        • snow-lime-private-c
    • It did not, however, turn out so useful for EC2 instances and volumes deployed here.
      • Two logically distinct instances of the same app would be deployed on one of these networks at any point in time, but an instance might get redeployed later on a different subnet and still be the same application. The instances would be recreated when they moved, but the data volumes were retained as-is.
        • If we put wind-red in a Volume's name while its compute nodes are on that subnet, do we rename it when the node moves to snow-gray?
        • If so, how does the name help us recognize the same volumes over time? Names are a lot harder to use when they keep changing, particularly if some other logical instance is later started in wind-red and takes the name our deployment used to have.
      • The weather and color values were useful as auxiliary tags providing that location metadata, but the namespace prefix only needed a single term, one that distinguished this occurrence of the application from every other and could stay the same no matter how many times the deployment moved around.

We still wanted color and weather as properties collected in the deployment properties for an instance of this app, but we only wanted them in name generation when the purpose was to produce a name for a data source that would look up a subnet, a security group, or something else provisioned with VPC-oriented namespacing. For the resources deployed into the contexts those tokens implied, we only wanted to use them to tag the resources and to configure their CloudMap discovery DNS domain, while we used a different sequence of names to track the logical instances in use at any point in time for some development effort.

Centralizing all the label construction in a single place initially let us avoid repeating the constant parts of a prefix, but accommodating two or three "scopes" for selecting the right elements exposed every resource to the semantics of placing itself in the right scope, which wasn't always easy or intuitive because the central registry was far removed from where the names actually got used.

While looking at this repository today, though, I saw something that reminded me of something very useful from my functional programming class--currying! I noticed that, for the sake of reusing parts of a label's definition, there is a context output that can be provided as a single input to another label module instantiation. I think there is a lot of value in that kind of construct: configuring a labelling module verbosely near the application's root to supply the parameter-rich patterns that get reused, and possibly to fix the values that are not going to change from one use to another, reduces the state needed to reuse the parts of a naming scheme locked in by outer modules, while still allowing partial clarification and override at the deeper layers without passing a library of individual attributes.
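For reference, the context chaining described above looks roughly like this with this module (exact source pins and available attributes may vary by version):

```hcl
# Root-level label: captures the values shared by everything downstream.
module "base_label" {
  source    = "cloudposse/label/null"
  namespace = "acme"
  stage     = "dev"
  tags      = { Team = "platform" }
}

# Deeper modules supply only what differs; everything else rides along
# in the context object -- the "curried" reuse described above.
module "queue_label" {
  source  = "cloudposse/label/null"
  context = module.base_label.context
  name    = "orders-queue"
}
```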

When you think about it, anyone using one label module to create a prefix and then passing that string around is already using a poor man's memento pattern based on a concatenation heuristic. It works for the simplistic concatenation case, and if an application has only three or four naming scopes it would probably do this three or four times and pass just those prefixes around, but it does so at the loss of tag set propagation, case conversion, and other features. Passing a black-box memento from a labelling scope, one that hides the distinct values behind the decisions it captures, may be a fine compromise: a module call that provides a black-box memento plus some minimal local customizations for a related use seems more usable than needing to receive all the arguments of the full signature and remember to pass them all back. It could also provide some future-proofing, depending on the impact of any new attributes a caller might start using and then need downstream modules to carry and pass on for their own use.

Without having given the implementation details a lot of thought, I can say there was a certain sense of epiphany when I saw the use of "context" and remembered my first exposure to the "Memento" design pattern, which is all about externalizing the means to reconstruct an object with a complex creational pattern as an opaque value that can be passed around inside a program and later supplied back to its original provider class to reconstitute the complex object without knowing anything about the complexity involved. Terraform doesn't have custom functions that would let us pass around curried functions, but it seems we do have the means to use a pair of modules to implement the memento pattern, and labelling may fit that design pattern extremely well...

Instead of attempting to perform all name generation in a single module that receives both the rules metadata I don't want to repeat and all the inputs I want labelled using those rules, I could use a policy memento module to capture the fixed values and set the required call signature at my root, and then pass as many of these mementos as my submodules need for whatever scope their naming requires.

For the purpose of using a memento, there are probably at least two kinds of stakeholders:

  • Those creating a resource may need the full-fledged policy application module that takes in the additional values left unbound when the memento was created and returns both the ID and a tag set containing tags inherited through the Memento as well as those based on the additional partial input.
  • Others using a memento to locate resources (e.g. subnet lookup) or create forward references to them (e.g. policy construction)--resources either created in other project layers or ones that would create a circular reference if passed directly--might just need ID generation, and could pass their mementos to an apply module that only generates the requested ID, avoiding the effort of creating tags that were not actually needed.
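One sketch of the two-stakeholder split, with entirely hypothetical module sources and attribute names (nothing here exists in this repository today):

```hcl
# Factory: captures the fixed policy values, emits only a memento (no label).
module "naming_policy" {
  source    = "./modules/label-memento-factory"  # hypothetical
  namespace = "acme"
  stage     = "dev"
}

# Full-fledged application: memento + unbound values -> ID and tag set.
module "queue_label" {
  source  = "./modules/label-apply"              # hypothetical
  memento = module.naming_policy.memento
  name    = "orders"
}

# Lookup-only use: just the ID, skipping tag computation entirely.
module "subnet_lookup_name" {
  source  = "./modules/label-id-only"            # hypothetical
  memento = module.naming_policy.memento
  name    = "wind-red-public-a"
}
```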

Anyhow, enough rambling from me for now. I want to download this module and see just how much reuse I can already get from its context chaining.

john-heinnickel commented

Distilling the feedback I offered above, I'll reduce it to a question: is there any benefit, in terms of implementation cost, to splitting up a single module that must both apply a naming policy and construct a context object? I would anticipate things like this going on in the implementation:

  • Extra code to reconcile the precedence of inputs that may come either from a context object or through the inputs that start a chain
  • In an application that doesn't leverage context chaining but still calls this module extensively, there is likely an accumulation of unused output context objects and wasted computation in creating them all
  • There could be ambiguity about whether a context object's state is meant to be augmented or replaced when a caller provides both a context object and some direct inputs. The conservative thing is to let the direct inputs override the context; additional disambiguating inputs taken when the context was created would help, but they become nonsense parameters when the module is used without any intent to chain its context into later calls.

If so, then the Law of Demeter may be worth applying here, as an observation that constructing a memento to reuse a policy, acting on a memento to perform labelling, and acting on raw input alone to perform labelling are three distinct use cases and merit distinct module entry points.

  • Instead of publishing just one module that produces both a Label and a Context object each time it is used, take advantage of the fact that most callers are only going to use either the returned Label or the returned Context object, but rarely (I suspect) both.
    • Split the responsibilities into purpose-specific modules
      • One module becomes three:
        • A Memento factory captures the policy inputs and fixed values that will be the same in all subsequent calls, but does not actually produce a label
        • A Memento-dependent module takes a Memento and only the more specific values not captured in the memento to produce a Label
        • A Memento-agnostic module neither produces nor consumes a Memento, and offers no special reuse mechanism
      • Move all context chain creation logic to a module that only returns a chain context (Memento object)
        • Specialize here by removing the concurrent requirement to create a label
        • Gain license to implement any additional disambiguating inputs that would allow a memento factory to support additional contracts that only make sense when there is no expectation of producing a label along with the Context
      • Provide both context-free and context-required variants of the existing module behavior.
        • If the form that accepts a Context memento can reduce its signature by treating some inputs as context-only, then there are fewer cases to handle and a simpler call signature can be offered.
        • The context-free form then supports those who, through their choice of module, have agreed they will not need a context object, and spares the effort of creating one.
      • Keep the context-free and the memento-constructing interfaces as similar as possible to help with migration if a developer later discovers a desire to use chaining after starting with full-argument passing.
