Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce "union" type for variables. #32587

Open
mkielar opened this issue Jan 27, 2023 · 8 comments
Open

Introduce "union" type for variables. #32587

mkielar opened this issue Jan 27, 2023 · 8 comments

Comments

@mkielar
Copy link

mkielar commented Jan 27, 2023

Terraform Version

1.3.7

Use Cases

Specific use case is as follows, but please read below for explanation of a more generic use-case.

We have a terraform module that provisions scheduled scaling of various compute resources. In that module, we've added scaler_configuration variable that looks like this:

variable "scaler" { 
  type = string
}
variable "scaler_configuration" {
  type = any
}

The module then provisions EventBridge Event with payload that is specific for each scaler. This payload then triggers a Lambda function that knows what to do.

Now, we have several types of resources we scale, and several methods of scaling of each resource. Each of these is represented as a separate "scaler", and required slightly different configuration. Unfortunately, due to current terraform constraints, we need to use any type for scaler_configuration. This makes documenting this variable a tedious process, and makes developer work harder, because of lack of autocomplete features.

Attempted Solutions

There are multiple ways currently to achieve that:
Probably the most obvious is to implement dedicated module for each scaler. That's a fair point, however sometimes that "strategy" could be just a small feature in a larger module. Using the same example as above: We could have an appmesh-app that provisions an ECS Service, it's CloudMap entries, AppMesh configuration, Alarms, Custom Metrics, etc, and additionally allows developers to specify scheduled scaling for that service. In that case, the scaling configuration of the service is just one of many features the module provides, and we'd like to have it customizable with different scalers. Strategy pattern feels ideal for such cases. You can also fall back to any and implement dedicated validation with precondition rules.

We're using both of these approaches quite successfully, but would wish for more explicit ways of defining variables.

Proposal

It would be nice, if - instead using any for scaler_configuration, we could use something like this:

variable "scaler_configuration" {
  type = union([
    # For scaling ECS Services that have Autoscaling Enabled:
    object({
      ecs_cluster_arn                   = string
      ecs_service_arn                   = string
      min_count                         = int
      max_count                         = int
      fit_desired_count_to_new_boundary = bool
    }),

    # For scaling ECS Services that don't have Autoscaling Enabled:
    object({
      ecs_cluster_arn = string
      ecs_service_arn = string
      desired_count   = int
    }),

    # For scaling RDS/Aurora Clusters:
    object({
      rds_cluster_id     = string
      reader_nodes_count = int
    }),

    # Etc..
  ])
}

A more general explanation:
Using union types would allow implementing "strategy" design patterns into modules.
Additionally, with #25609 in place, this would allow for single-place validation and extendibility of such configurations.

References

@mkielar mkielar added enhancement new new issue not yet triaged labels Jan 27, 2023
@jbardin
Copy link
Member

jbardin commented Jan 27, 2023

Hi @mkielar,

Can you explain further how a union fits into a static structural type system? All expressions must always evaluate to a single type, and without a single type for a value Terraform has no way to infer the resulting type of an expression. Also due to the lack of nominative typing, I'm not certain how one would differentiate between the given types in a union.

Thanks!

@jbardin jbardin added waiting-response An issue/pull request is waiting for a response from the community config and removed new new issue not yet triaged labels Jan 27, 2023
@mkielar
Copy link
Author

mkielar commented Jan 27, 2023

Well, if I understand things correctly, in #32590 you have explained that terraform identifies two different object types, based on their internal structure. Thus, I assume that terraform should already be able to tell the difference between:

object({
  rds_cluster_id     = string
  reader_nodes_count = int
})

and

object({
  ecs_cluster_arn = string
  ecs_service_arn = string
  desired_count   = int
}),

and treat the two as two different types.

What I propose is an approach based on duck-typing, and the differentiation that terraform seems to already be doing. The feature is also inspired on Python's ability to support multiple types for a single attribute - Python, although dynamically typed in nature, allows you to specify type hints and then use static linters that ensure you're passing properly typed values. So, in Python you can define a function as follows:

def fn(my_arg: MyAwesomeClass | List[string] | int):
  pass

and linters will be able to issue warnings if you try to pass something that's not an instance of MyAwesomeClass, not a list-of-strings and not an int.

The obvious difference between what Python and Terraform do is of course in the fact, that in Python you can say MyAwesomeClass is a type (ie define custom types and name them), and in Terraform you cannot. But that should not matter as long as Terraform can tell that a value assigned to given union-variable matches at least one of the types in the union. And Terraform can do that by duck-typing - if the value satisfies any of the object definitions within the union, then assume that it's of that type. Thus, it also satisfies the overall union-type.

@jbardin
Copy link
Member

jbardin commented Jan 27, 2023

There may be something here in the area of using a type constraint for validation, however that constraint would need to be fully incorporated into the language or they would devolve into any outside of the variable itself (or maybe the constraint is sufficiently useful within the variable validation alone?)

Another point to consider is how to deal with ambiguous type assignment. For example, a value with the following type would be perfectly valid as an assignment to any of those individual types:

  type = object({
    ecs_cluster_arn                   = string
    ecs_service_arn                   = string
    min_count                         = int
    max_count                         = int
    desired_count                     = int
    rds_cluster_id                    = string
    reader_nodes_count                = int
    fit_desired_count_to_new_boundary = bool
  })

@mkielar
Copy link
Author

mkielar commented Jan 29, 2023

So it seems that Terraform uses types in two ways, and treats them differently:

  1. There are types of the values, and these are verified when they're passed around (e.g. in an inline "if" expression)
  2. There are types of the variables, and they are used for validation upon assignment.

Except - as you showed above - pt.2. doesn't quite work. That's another inconsistency - Terraform complains when you try to pass a variable that doesn't exist when invoking a module, but it's fine when you try to pass an attribute that doesn't exist in variable object-type definition. I'm not sure why that is, and I was surprised first time I realised that. Should that not be the case, and Terraform would be more strict when validating inputs, the example you provided should not ever be accepted.

I track Terraform progress since 2016 (it was v0.6, back then IIRC), and my impression is that over the years the works are going in the direction of making input validation rules more strict, and implementing more and more type-safety features. With that in mind, the union type proposal could work on one of the two premises:

  1. The example you provided would not be legal at all - all attributes in the value must match the variable type, that's it.
  2. The example you provided would be generally allowed, but if it would cause ambiguity in case of the union type, terraform would simply throw an error and force the developer to either adjust the value type or the variable type.

@jbardin
Copy link
Member

jbardin commented Jan 31, 2023

I think there's still some imprecision here which is causing confusion. All values have a type, only in the case of a null or unknown value can that type be considered a type constraint when it contains a DynamicPseudoType (i.e. any). Variable types are no different, they use the same type system, however when assigning a value the type is used as a constraint for type conversion. You are not directly assigning the attributes of a variable object (which isn't possible in the language), you are assigning the entire variable object and allowing for type conversion.

This process is designed so that a module can specify only the data it needs internally, and the calling module does not need to be aware of changes within the provider or callee to take different parts of the given value. The canonical example is a module which wants a specific subset of attributes from a resource. If the object type needed to match the resource schema exactly, then either the module would either need to closely track the provider version and update the variable definition accordingly, or the caller would need to filter the attributes to always match the callee. Since the addition of attributes is a non-breaking change for providers, this would frequently cause unexpected errors in modules consuming that provider's resources as they evolve along with the systems they manage.

I think in order for something like a union to work, it would have to be paired with a new concept like "variable assignment without type conversion".

@apparentlymart
Copy link
Contributor

apparentlymart commented Jan 31, 2023

The kind of "union" described in the initial writeup here seems like it wants to be a tagged union, which is to say that any value of that type includes both an indication of which of the possible variants is dynamically selected and the value of that variant.

I think for that to work in Terraform's type system we would need two additional operations that we don't currently have:

  • To make a dynamic decision based on which variant is selected for that value.
  • To assert that the value has a particular variant and get the underlying value or an error if the value has a different variant.

It is already possible to construct something representing the same information as a tagged union using object types and some usage conventions, optionally enforced by variable validation:

variable "example" {
  type = object({
    ecs_autoscaling = optional(object({
      ecs_cluster_arn                   = string
      ecs_service_arn                   = string
      min_count                         = int
      max_count                         = int
      fit_desired_count_to_new_boundary = bool
    }))
    ecs = optional(object({
      ecs_cluster_arn = string
      ecs_service_arn = string
      desired_count   = int
    }))
    rds = optional(object({
      rds_cluster_id     = string
      reader_nodes_count = int
    }))
  })

  validation {
    condition     = length([for o in var.example : o if o != null]) == 1
    error_message = "Must set exactly one of the top-level attributes to choose a configuration variant."
  }
}

locals {
  # This allows a dynamic decision based on which was selected
  selected_variant_attr = one([select k, o in var.example : k if o != null])

  # This is an assertion about which one is selected in return for the value of that variant
  selected_variant = var.example[local.selected_variant_attr]
}

When calling this module the caller would need to specify both which variant they are intending to use and the value for that variant, which in this convention-based example means choosing exactly one of the top-level attributes to set, and leaving the rest unset:

module "example" {
  # ...

  example = {
    rds = {
      rds_cluster_id     = aws_rds_cluster.example.id
      reader_nodes_count = 2
    }
  }
}

The above adds some additional conventions on top of object types to create something which represents the same information as a tagged union, albeit with some clumsy syntax.

I mention this not because I think this is how tagged unions should appear syntactically in our language, but because I think it demonstrates one possible model for how they could work alongside Terraform's existing type system, assuming we had some extra syntax to make this be a real language feature rather than just a convention. (There's a bunch of examples on the Wikipedia page for Tagged Unions that I linked to above of what syntax was chosen for these features in some other languages.)

Of course the usual question applies of whether this arises frequently enough in practice to warrant additional complexity in the language vs. just a convention in a particular module. But it does seem like at least there is some way we could model it so that it co-operates with the rest of the type system and is unambiguous. In the meantime, a convention similar to the above could be used for modules that really need a solution to this in today's Terraform.


New language features that require new syntax rather than just new identifiers are typically the slowest to design because they have a wide scope of effect, and so although we are discussing some possibilities I would not expect to see rapid progress here.

If anyone else finds this issue and has a similar request then we'd love to see some more concrete, real-world examples of use-cases for this similar to the one in the original writeup, since having more examples helps put some bounds on the problem and thus helps find a minimal design to hopefully meet a need while minimizing negative impacts on other parts of the language.

@jbardin jbardin removed the waiting-response An issue/pull request is waiting for a response from the community label Apr 10, 2023
@sidekick-eimantas
Copy link

Finding myself in need of unions as well
Here's what I'm trying to do:

variable "webhook_policies" {
  type = list(
    object(
      {
        endpoint                = string
        ingest_enabled          = bool
        allowed_methods         = list(string)
        allowed_content_types   = list(string)
        allowed_ip_addresses    = list(string)
        enforce_headers_present = list(string)
        enforce_header_values   = map(list(string))
        signature_verification = union(
          [
            object(
              {
                type         = optional(string, "HTTP")
                header_name  = string
                header_value = string
              }
            ),
            object(
              {
                type       = optional(string, "MODULR")
                webhook_id = string
                secret     = string
              }
            )
          ]
        )
        response_status_code = number
        response_headers     = map(string)
        response_content     = optional(string)
        deliver_to = list(
          object(
            {
              type              = string # HTTP | AWS_S3
              name              = string
              delivery_enabled  = bool
              retry_delay_ms    = number
              delivery_attempts = number

              config = union(
                [
                  object(
                    {
                      endpoint                      = string
                      timeout                       = number
                      headers_override              = map(string)
                      method_override               = string
                      follow_redirects              = bool
                      delivery_success_status_codes = list(int)
                      verify                        = bool
                    }
                  ),
                  object(
                    {
                      bucket_name = string
                      region_name = string
                      use_ssl     = bool
                      verify      = bool
                    }
                  )
                ]
              )
            }
          )
        )
      }
    )
  )
}

@bleuse
Copy link

bleuse commented Dec 3, 2024

I'm wondering how this issue relates to #33916.
Would sum types help in this case?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants