[BUG]: AADSTS700024 Encountered In AzureResourceManagerTemplateDeployment@3 near 60 Minute Mark #20784

Open
4 of 7 tasks
aolszowka opened this issue Jan 14, 2025 · 0 comments

New issue checklist

Task name

AzureResourceManagerTemplateDeployment

Task version

3

Issue Description

I believe we are affected by the issue described by @jiasli in Azure/azure-cli#28708.

@kboom has worked on an experimental workaround for the AzureCLIv2 task in #19989; however, I believe this same issue is going to be present in several pipeline-related tasks, just exposed in different ways.

Rather than continuing to post on that unrelated PR, I am opening this new issue to focus on scenarios that involve AzureResourceManagerTemplateDeployment@3.

The underlying issue is well documented, and I see that several maintainers are active in the #19989 thread. I believe the problem is well understood by all parties, but if you wind up here via Google, here is my interpretation of the issue:

  1. After converting our Azure service connections to workload identity federation, as recommended by Microsoft, we are now subject to a 60-minute limit on the tokens generated.
  2. If you deploy a long-running ARM template via AzureResourceManagerTemplateDeployment@3, you will eventually encounter this error as the deployment reaches the 60-minute mark.
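
The timing described in the two points above can be sketched in Python. This is a simplified model for illustration, not the task's actual code: the 60-minute lifetime is taken from this issue's description, and `ASSERTION_LIFETIME`, `assertion_is_valid`, and the start time are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: lifetime taken from the 60-minute limit described in this issue.
ASSERTION_LIFETIME = timedelta(minutes=60)


def assertion_is_valid(issued_at: datetime, now: datetime,
                       lifetime: timedelta = ASSERTION_LIFETIME) -> bool:
    """Return True if `now` falls inside the assertion's validity window."""
    return issued_at <= now < issued_at + lifetime


# Hypothetical time at which the pipeline job obtained its assertion.
issued = datetime(2025, 1, 14, 19, 15, tzinfo=timezone.utc)

# A token request 30 minutes into the deployment is still inside the window.
early = assertion_is_valid(issued, issued + timedelta(minutes=30))

# A request 61 minutes in falls outside the window; per the description
# above, this is the point where AADSTS700024 would be reported.
late = assertion_is_valid(issued, issued + timedelta(minutes=61))

print(early, late)
```

Under these assumptions, any deployment that needs a fresh access token after the window closes fails, regardless of how healthy the ARM deployment itself is.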

Our use case is deploying infrastructure via a pipeline with this task. Scenarios that can push us past the 60-minute mark include:

  1. Provisioning new SQL MI instances - Initialization of a new instance has an SLA of 6+ hours.
  2. Scaling SQL MI instances - Much like creating a new instance, this can have an extended SLA depending on how far you are scaling.
  3. Performing cross-subscription database restores - Depending on the size of the database and the performance of the point-in-time restore, this can easily exceed 60 minutes of runtime.

Really, any complex ARM deployment will eventually encounter this scenario. If it would help, I can work towards providing a turnkey PoC, but I believe this issue is well understood without a reproducing test case.

You will notice in the linked thread that others are encountering similar issues, albeit using different wrappers that eventually lead them to the same scenario (for example, the most recent post in that thread at the time of writing involved a Terraform wrapper).

It is unclear what our options are here. Do we know if converting back to a connection that does not use workload identity federation would work around the issue? That is possible, but as more attempts are made to harden environments, this problem will become far more prevalent, as outlined in the upstream issue.

Environment type (Please select at least one environment where you face this issue)

  • Self-Hosted
  • Microsoft Hosted
  • VMSS Pool
  • Container

Azure DevOps Server type

dev.azure.com (formerly visualstudio.com)

Azure DevOps Server Version (if applicable)

No response

Operating system

ubuntu-latest

Relevant log output

##[error]Could not fetch access token for Azure. Status code: invalid_client, status message: Error(s) 700024 - Timestamp: 2025-01-14 20:16:02Z - Description: AADSTS700024: Client assertion is not within its valid time range.
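
For anyone scanning pipeline logs for this failure, the following Python sketch extracts the AADSTS code and timestamp from a line shaped like the one above. The regular expression is fitted to this single log line only, so the format is an assumption; `parse_aadsts_error` is a hypothetical helper and not part of any Azure tooling.

```python
import re
from datetime import datetime, timezone

# The log line from this issue, copied verbatim.
LOG_LINE = (
    "##[error]Could not fetch access token for Azure. Status code: invalid_client, "
    "status message: Error(s) 700024 - Timestamp: 2025-01-14 20:16:02Z - "
    "Description: AADSTS700024: Client assertion is not within its valid time range."
)

# Assumption: the timestamp precedes the AADSTS code, as in the line above.
PATTERN = re.compile(
    r"Timestamp: (?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})Z.*?(?P<code>AADSTS\d+)"
)


def parse_aadsts_error(line):
    """Return (code, UTC timestamp) for a matching line, or None otherwise."""
    match = PATTERN.search(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
    return match.group("code"), timestamp.replace(tzinfo=timezone.utc)


result = parse_aadsts_error(LOG_LINE)
print(result)
```

Comparing the parsed timestamp with the job's start time is a quick way to confirm whether a failure lines up with the 60-minute mark.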

Full task logs with system.debug enabled

Repro steps
