Consider adding enum or sum types #33916

Open
lorenzofelletti opened this issue Sep 20, 2023 · 4 comments
Comments

@lorenzofelletti

Terraform Version

1.5.7

Use Cases

An enum type would help in many use cases.
Right now, anyone who wants enum-like behavior has to write a validation rule: a regex if the variable happens to be a string, or something more complex (and arguably less readable than an enum definition) otherwise.
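
For illustration, the two kinds of validation rule described above might look like this in current Terraform. This is only a sketch: the variable names and allowed values are made up for the example.

```hcl
# String case: a regex stands in for the enum.
variable "environment" {
  type        = string
  description = "Hypothetical enum-like string"

  validation {
    condition     = can(regex("^(dev|staging|prod)$", var.environment))
    error_message = "Must be one of \"dev\", \"staging\", or \"prod\"."
  }
}

# Non-string case: each allowed object is compared attribute by attribute.
variable "tier" {
  type = object({
    id   = number
    name = string
  })
  description = "Hypothetical enum-like object"

  validation {
    condition = anytrue([
      for allowed in [
        { id = 1, name = "first" },
        { id = 2, name = "second" },
      ] : allowed.id == var.tier.id && allowed.name == var.tier.name
    ])
    error_message = "Must be either {id = 1, name = \"first\"} or {id = 2, name = \"second\"}."
  }
}
```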

Moreover, with an enum type, it should become easier to implement the feature of having a variable module source (something similar to what is asked by #25587 or #33793). In fact, in principle, if you restrict the source field to string or enum, you can always know beforehand all the possible values for the field, and install all the dependencies beforehand, then at run-time only choose which one to use.
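
The "install everything beforehand, choose at run-time" idea can already be approximated by hand today, which may help show what an enum-typed source would automate. A minimal sketch, assuming two hypothetical local modules and a made-up selector variable:

```hcl
variable "storage_backend" {
  type        = string
  description = "Hypothetical selector for which module to instantiate"

  validation {
    condition     = contains(["s3", "gcs"], var.storage_backend)
    error_message = "Must be either \"s3\" or \"gcs\"."
  }
}

# Both sources are static strings, so terraform init can install them up front.
module "storage_s3" {
  source = "./modules/storage-s3" # hypothetical path
  count  = var.storage_backend == "s3" ? 1 : 0
}

module "storage_gcs" {
  source = "./modules/storage-gcs" # hypothetical path
  count  = var.storage_backend == "gcs" ? 1 : 0
}
```

The cost of this workaround is one module block per possible value, which is the repetition a variable source restricted to an enum would remove.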

Attempted Solutions

N/A

Proposal

# first proposal
variable "my_string_enum" {
  type        = enum(string, "string1", "string2", "string3")
  description = "enum of strings"
}

variable "my_object_enum" {
  type = enum(object({
    id   = number
    name = string
  }), object({
    id   = 1
    name = "first"
  }), object({
    id   = 2
    name = "second"
  }))
  description = "enum of objects"
}

# second proposal
# exploiting the default field
variable "my_object_enum" {
  type = enum(object({
    id   = number
    name = string
  }))
  description = "enum of objects"
  default     = [
    {
      id   = 1
      name = "one"
    },
    {
      id   = 2
      name = "two"
    },
  ]
}

References

lorenzofelletti added the enhancement and new (new issue not yet triaged) labels Sep 20, 2023
jbardin added the upstream and config labels and removed the new (new issue not yet triaged) label Sep 20, 2023
@apparentlymart
Contributor

apparentlymart commented Sep 20, 2023

Thanks for sharing this proposal, @lorenzofelletti.

I remember that somewhere in our big heap of issues there are suggestions for two related but not equivalent features:

  • Something like tagged unions (sum types) which somehow allows declaring that a variable can match any one of a set of type constraints.
  • A similar fixed set of valid values but without requiring that they are all of the same type, possibly mixed with the previous point so that it would be possible to say something like "can be any object of this object type or the fixed string "something"".

In order to keep the language coherent I think we'd want to choose which subset of these goals we actually want to meet and design a single feature that solves everything in that subset; there's enough overlap of use-cases between these different approaches that it would likely be confusing to have a separate language feature for each of them.

I wasn't able to quickly find the other issues I'm thinking of above. Hopefully I or someone else will find them in future triage walks and will be able to link them all together retroactively.


In the past I've prototyped adopting some of the ideas from CUE, whose design offers the insight that it's possible to unify the concept of types with the concept of values by defining a concrete value like "hello" as a subtype of string.

Under that model, we could imagine saying that a variable can have the type string to permit any string, or could have a type like "foo" || "bar" to permit only exactly those two strings, or even more complicated interactions like my example above of "something" || object({foo = string}).
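
For comparison, a rough approximation of "something" || object({foo = string}) in today's Terraform is to declare the variable as any and check its shape in a validation rule. The following is a sketch with a made-up variable name, not a substitute for the type-level design being discussed:

```hcl
variable "something_or_object" {
  type        = any
  description = "Hypothetical variable: the string \"something\" or an object with a string attribute foo"

  validation {
    condition = (
      # Comparing values of different types yields false rather than an error.
      var.something_or_object == "something" ||
      # can() turns a failed attribute lookup or conversion into false.
      can(tostring(var.something_or_object.foo))
    )
    error_message = "Must be the string \"something\" or an object with a string attribute \"foo\"."
  }
}
```

Because the declared type is any, tooling sees no constraint at all here, which is part of what a unified type-and-value model would improve.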

That way of thinking seems to neatly solve all of the above variations under a single model, but of course does so with a higher level of complexity than solving just one problem in isolation, such as the enum proposal in this issue.

I was prototyping those in the context of my personal project cty that Terraform is built around, rather than within Terraform itself, and so that was my own personal work rather than work I did while wearing my "Terraform Team at HashiCorp" hat. I suspect that to be successful it would be better to design this at the Terraform and HCL layer and then see what supporting help those might need from cty, if anything.


I assume the intent of the "upstream" label here was to represent that the type constraint expression syntax belongs to HCL rather than directly to Terraform. However, the design of that part of HCL tends to be primarily influenced by Terraform's needs, and so I expect it would end up being a Terraform contributor that ultimately figures out what we want to do here and drives it to implementation; there are no separate HCL maintainers to drive this independently.

@lorenzofelletti
Author

Thank you for your insightful reply, @apparentlymart.

I really like the idea from CUE of using "foo" || "bar"; the syntax also seems much more convenient than the one I proposed.
I also like the possibility of "mixing up with types" like in your example ("something" || object({foo = string})), although I think a convenient syntax should be provided to handle "choosing" the variant in this case.

If the possible values are all of the same type, then the problem does not arise, but if var.my_var can be either a string or an object, then we may need a way to discriminate between them, and a special syntax for it may be useful in order to keep the code as readable as possible.
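
To make the discrimination problem concrete, this is roughly what it takes today when a variable declared as any may hold either a string or an object. It is a sketch only: the attribute name field and the local names are assumptions made for the example.

```hcl
variable "my_var" {
  type        = any
  description = "Hypothetical variable: either a string or an object with a field attribute"
}

locals {
  # True when the caller passed an object that has a "field" attribute.
  my_var_is_object = can(var.my_var.field)

  # Use the object's field when present, otherwise fall back to the value itself.
  my_var_value = try(var.my_var.field, var.my_var)
}
```

The intent is buried in can and try calls, which is the readability concern a dedicated syntax would address.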

@apparentlymart
Contributor

The problem of "choosing" is indeed a part I glossed over here. Languages which have that sort of type system tend to also have something like "pattern-matching" syntax for concisely evaluating different expressions based on what kind of value was provided, and I agree that designing something in that area is an important part of designing this should we decide to make this more like sum-types and less like "fixed set of values of a single type".

I don't have any sufficiently concrete proposal for that yet, but an analogy to help frame what we're talking about might be Rust's match expressions, which allow concisely taking different evaluation paths based on a value and destructuring complex values to more conveniently use their constituent parts.


Looking back at my previous (long) reply I realize that I forgot to say that the language today does have a way to express that sort of constraint as a validation rule, rather than as a type constraint:

variable "my_string_enum" {
  type        = string
  description = "enum of strings"

  validation {
    condition     = contains(["string1", "string2", "string3"], var.my_string_enum)
    error_message = "Must be either \"string1\", \"string2\", or \"string3\"."
  }
}

This approach does functionally work today, but of course there are some ergonomic annoyances with it:

  • It's harder to automatically extract documentation saying which values are acceptable, since the condition is a dynamic expression that can be written in various different ways, rather than a static constraint.
  • It's considerably harder to use this pattern when the "enum" is for a nested value inside a collection or structural type; it tends to require using alltrue with for expressions, or other similar patterns that are functionally equivalent but hard to write and maintain.
  • With validation as currently designed, there's no way to share the set of allowed values between condition and error_message, so we typically end up having to write the set of valid values twice so that the error message will be actionable. (There is a more focused proposal somewhere for something like allowing locals blocks inside validation blocks for tightly-scoped locals that can be shared across both condition and error_message, but that's a far blunter solution than treating it as a type-system-shaped problem.)
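
The nested case from the second point above can be sketched as follows. The variable and attribute names are hypothetical, and the allowed values still have to be written twice, as the third point describes:

```hcl
variable "rules" {
  type = list(object({
    name   = string
    action = string
  }))
  description = "Hypothetical list whose action attribute is enum-like"

  validation {
    # Every element must pass the same membership test.
    condition = alltrue([
      for rule in var.rules : contains(["allow", "deny"], rule.action)
    ])
    error_message = "Each rule's action must be either \"allow\" or \"deny\"."
  }
}
```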

I would recommend that anyone with this need should try to follow the pattern I showed above with today's Terraform. Ergonomic concerns aside, it does functionally meet the use-case as stated. However, I don't find this answer satisfying and would like to continue researching alternatives like those we've been discussing in this issue.

@lorenzofelletti
Author

Taking inspiration from Rust's pattern-matching to implement this feature would be a great thing, in my opinion.

I have been thinking about potential implementations lately, and here are my thoughts on how it could be achieved.

First, I think that a way to distinguish and reference each possible variant in an easy and readable way is needed. Something like this to be clear:

// variable definition proposal
variable "sum_type" {
  // each branch is given a unique key
  type        = a_string: "STRING1" || a_number: number || an_obj: object({ field = string })
  description = "desc"
}

// variable definition proposal 2
variable "sum_type" {
  type = union({
    // each branch is given a unique key
    a_string = "STRING1"
    an_obj   = object({
      field = string
    })
    a_number = number
  })
  description = "desc"
}

Moreover, I think a crucial question is at which level we want the match to occur.

For example, there are at least two different levels at which a hypothetical match operator could fit:

  • As a right-hand side construct, not much dissimilar to a function that maps each “branch” to a value
  • As a resource/module-level block, which could allow for potentially more complex scenarios.

In the example below, a match construct is introduced as a right-hand side expression:

// proposal for match as right-hand side expression
resource "a_resource" "my_res" {
  a_field = match(var.sum_type, {
    // the key is used for branching
    a_string => var.sum_type
    an_obj   => var.sum_type.field
  })

  another_field = match(var.sum_type, {
    a_string => var.sum_type
    // and a convenient else to unify the logic for all remaining branches
    else     => ""
  })
}

But, I can already imagine cases in which such a match construct will just clutter the code (e.g. when you need to repeat the same match logic for many fields of a resource).

Thus, here is a proposal for adding it as a module/resource-level block:

// proposal for match as a module/resource-level block
resource "a_resource" "my_res" {
  name = var.res_name

  match(var.sum_type) {
    // for each branch you define which variables to pass
    a_string {
      // a self/this keyword would make it even more readable
      a_field = var.sum_type
    }
    an_obj {
      a_field       = var.sum_type.field
      another_field = var.sum_type.field
    }
  }
}

This second approach would, in my opinion, cover more of the cases where such a construct may be needed, without hindering readability.

One may also think of a convenient way to check the actual “variant” (useful in constructs like ifs, and to selectively create resources e.g. with count). Something like:

count = variant(var.sum_type, a_string) ? 0 : 1

lorenzofelletti changed the title from "Consider adding the enum type" to "Consider adding enum or sum types" Oct 20, 2023
kwohlfahrt added a commit to kwohlfahrt/tf-k8s that referenced this issue Dec 1, 2024
This effectively side-steps the entire Terraform type-system. The issue
is that in Terraform, a list containing `DynamicValue`, still has to
have all of the dynamic values reconcilable to the same type. This
doesn't apply to `x-kubernetes-int-or-string`, though there it can be
worked around by converting all to string. However, with more
complicated types that are `x-kubernetes-preserve-unknown-fields`, this
will not be possible.

Therefore, make everything dynamic, and handle checking ourselves. The
one downside is that IDE integrations etc can't use the schema types for
type-hints.

This may be worth revisiting when github.com/hashicorp/terraform#33916
is resolved.