@jtnord
Member

@jtnord jtnord commented Jan 5, 2026

diagnosis for #995 (comment) / jenkins-infra/helpdesk#4939


@jtnord
Member Author

jtnord commented Jan 5, 2026

17:41:58  Provider: SecureRandom.DRBG algorithm from: SUN
17:41:58  provider: Failed to use operating system seed generator: java.io.IOException: Required native CryptoAPI features not  available on this machine
17:41:58  provider: Using default threaded seed generator

which I do not see locally.
https://stackoverflow.com/questions/49322948/slow-securerandom-initialization/49322949#49322949 implies this was resolved, but I guess there is something about our container infrastructure that is broken? Not exactly sure what OS/hosts we run for Windows, @timja.

@jtnord
Member Author

jtnord commented Jan 5, 2026

https://github.com/openjdk/jdk21u-dev/blob/bad21fbe258402e7697279fdbdf7d67e02d20c03/src/java.base/windows/native/libjava/WinCAPISeedGenerator.c#L49C13-L52 alas this code does not call GetLastError to find out why it fails... 😮‍💨

Could be some missing libraries in the container, could be permissions, could be something entirely different...

@timja
Member

timja commented Jan 5, 2026

Cc @dduportal / @lemeurherve otherwise I’ll check later on.

@jtnord
Member Author

jtnord commented Jan 5, 2026

I would guess it is either NTE_BAD_KEY_STATE (given there is some potential funkiness around passwords in containers) or NTE_PROV_TYPE_NO_MATCH, because we are using a slim container (?) without the required support (the JDK requests PROV_RSA_FULL as the type).

@timja
Member

timja commented Jan 5, 2026

I don’t think it’s in a container, but I will check when I’m back at my laptop.

@timja
Member

timja commented Jan 5, 2026

These are the labels on the VM:

image

It's a Windows 2019 VM on AWS

@lemeurherve
Member

It's a Windows 2019 VM on AWS

And here is how it's provisioned: https://github.com/jenkins-infra/packer-images/blob/main/provisioning/windows-provision.ps1

@timja
Member

timja commented Jan 5, 2026

@lemeurherve
Member

lemeurherve commented Jan 5, 2026

@timja
Member

timja commented Jan 5, 2026

I logged into a running Windows VM and ran:

pwsh

cd C:\tools\jdk-17\bin
$url = 'https://gist.githubusercontent.com/timja/18b75ae57ecc2d517a8fce0811c98bdd/raw/83c73c7a59a8a3ef3b094166cdb9b35bf51ce7cb/gistfile1.txt'
Invoke-WebRequest $url -Outfile Main.java
.\javac Main.java
.\java '-Djava.security.debug="provider"' Main
provider: Failed to use operating system seed generator: java.io.IOException: Required native CryptoAPI features not  available on this machine
provider: Using default threaded seed generator
Provider: MessageDigest.SHA algorithm from: SUN
Provider: MessageDigest.SHA algori

I tried it on Java 25 as well with the same result.
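
The contents of the gist are not reproduced in this thread. As a rough idea of what a probe like this can look like, here is a minimal sketch; it is not the gist's Main.java, and the class name SeedProbe and its helper method are hypothetical. It creates a default SecureRandom, prints which algorithm and provider were selected, and times the first draw of random bytes.

```java
import java.security.SecureRandom;

// Hypothetical probe class; not the Main.java from the gist.
class SeedProbe {
    /** Creates a fresh SecureRandom, draws numBytes from it, and returns the elapsed time in milliseconds. */
    static long timeFirstBytesMillis(int numBytes) {
        long start = System.nanoTime();
        SecureRandom random = new SecureRandom();
        // The first draw is where a self-seeding generator has to obtain its seed.
        random.nextBytes(new byte[numBytes]);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        SecureRandom random = new SecureRandom();
        System.out.println("algorithm: " + random.getAlgorithm());
        System.out.println("provider:  " + random.getProvider().getName());
        System.out.println("first 16 bytes took " + timeFirstBytesMillis(16) + " ms");
    }
}
```

Saved as SeedProbe.java, it can be launched directly on JDK 11 or newer with java -Djava.security.debug=provider SeedProbe.java, which should surface the same "provider:" diagnostics as above.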

@timja
Member

timja commented Jan 5, 2026

I think it's best to just write a little C program and run it on the VM.

@timja
Member

timja commented Jan 6, 2026

Going to try this:
debug-secure-random-2.zip

@timja
Member

timja commented Jan 6, 2026

This code (from Microsoft Learn, on how to use the API):

#include <Windows.h>
#include <wincrypt.h>
#include <stdio.h>
#include <stdlib.h>   // exit()

int main(void) {
    printf("Debug Secure Random 2\n");

    //-------------------------------------------------------------------
    // Declare and initialize variables.

    HCRYPTPROV hCryptProv = 0;     // handle for a cryptographic provider context
                                   // (HCRYPTPROV is an integer type, not a pointer)
    LPCSTR UserName = "J2SETest";  // name of the key container to be used

    //-------------------------------------------------------------------
    // Attempt to acquire a context and a key
    // container. The context will use the default CSP
    // for the RSA_FULL provider type. dwFlags is set to zero
    // to attempt to open an existing key container.

    // The explicit ANSI variant matches the LPCSTR container name even in UNICODE builds.
    if (CryptAcquireContextA(
        &hCryptProv,               // handle to the CSP
        UserName,                  // container name 
        NULL,                      // use the default provider
        PROV_RSA_FULL,             // provider type
        0))                        // flag values
    {
        printf("A cryptographic context with the %s key container \n",
            UserName);
        printf("has been acquired.\n\n");
    }
    else
    {
        //-------------------------------------------------------------------
        // An error occurred in acquiring the context. This could mean
        // that the key container requested does not exist. In this case,
        // the function can be called again to attempt to create a new key 
        // container. Error codes are defined in Winerror.h.
        DWORD errCode = GetLastError();
        char errMsg[512] = "";  // stays empty if FormatMessageA fails
        FormatMessageA(
            FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS,
            NULL,
            errCode,
            0,
            errMsg,
            sizeof(errMsg),
            NULL);
        if (errCode == NTE_BAD_KEYSET)
        {
            if (CryptAcquireContextA(
                &hCryptProv,
                UserName,
                NULL,
                PROV_RSA_FULL,
                CRYPT_NEWKEYSET))
            {
                printf("A new key container has been created.\n");
            }
            else
            {
                DWORD createErrCode = GetLastError();
                char createErrMsg[512] = "";  // stays empty if FormatMessageA fails
                FormatMessageA(
                    FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS,
                    NULL,
                    createErrCode,
                    0,
                    createErrMsg,
                    sizeof(createErrMsg),
                    NULL);
                printf("Could not create a new key container. Error %lu: %s\n", createErrCode, createErrMsg);
                fflush(stdout);
                exit(1);
            }
        }
        else
        {
            printf("A cryptographic service handle could not be acquired. Error %lu: %s\n", errCode, errMsg);
            fflush(stdout);
            exit(1);
        }

    } // End of else.
    //-------------------------------------------------------------------
    // A cryptographic context and a key container are available. Perform
    // any functions that require a cryptographic provider handle.

    //-------------------------------------------------------------------
    // When the handle is no longer needed, it must be released.

    if (CryptReleaseContext(hCryptProv, 0))
    {
        printf("The handle has been released.\n");
    }
    else
    {
        printf("The handle could not be released.\n");
    }
}

Works on my Windows machine and fails on the VM with:
"Could not create a new key container"

@timja
Member

timja commented Jan 6, 2026

Another iteration:
debug-secure-random-2.zip

@jtnord
Member Author

jtnord commented Jan 6, 2026

The error handling there is questionable (no better than the JDK!).
The failure branch should call GetLastError and output the result.
On phone so cannot provide the code yet.

@timja
Member

timja commented Jan 6, 2026

debug-secure-random-2.zip

I've copiloted this:

                DWORD errCode = GetLastError();
                char errMsg[512] = "";  // stays empty if FormatMessageA fails
                FormatMessageA(
                    FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS,
                    NULL,
                    errCode,
                    0,
                    errMsg,
                    sizeof(errMsg),
                    NULL);
                printf("Could not create a new key container. Error %lu: %s\n", errCode, errMsg);
                exit(1);

trying now

@timja
Member

timja commented Jan 6, 2026

Debug Secure Random 2
Could not create a new key container. Error 5: Access is denied.
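
Note that error 5 is the plain Win32 ERROR_ACCESS_DENIED, not one of the NTE_* codes guessed earlier in the thread. Since the tool prints the raw value with %lu, a small lookup helps when reading its output. The sketch below is illustrative only: the class name CryptoErrorNames is hypothetical, and the table covers just the codes named in this conversation, with values as defined in winerror.h.

```java
import java.util.Map;

// Hypothetical helper; the table only covers the codes mentioned in this thread.
class CryptoErrorNames {
    // GetLastError() returns a DWORD, so the codes are kept as unsigned 32-bit values in a long.
    static final Map<Long, String> NAMES = Map.of(
        5L, "ERROR_ACCESS_DENIED",
        0x8009000BL, "NTE_BAD_KEY_STATE",
        0x80090016L, "NTE_BAD_KEYSET",
        0x8009001BL, "NTE_PROV_TYPE_NO_MATCH");

    /** Formats a code as "NAME (decimal, 0xHEX)"; negative (signed 32-bit) inputs are normalized first. */
    static String describe(long code) {
        long unsigned = code & 0xFFFFFFFFL;
        String name = NAMES.getOrDefault(unsigned, "unknown error");
        return String.format("%s (%d, 0x%08X)", name, unsigned, unsigned);
    }

    public static void main(String[] args) {
        System.out.println(describe(5));
        System.out.println(describe(0x80090016L));
    }
}
```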

@timja
Member

timja commented Jan 6, 2026

The user is running as an admin:

PS Z:\jenkins\debug> $user = [Security.Principal.WindowsIdentity]::GetCurrent();
PS Z:\jenkins\debug> (New-Object Security.Principal.WindowsPrincipal $user).IsInRole([Security.Principal.WindowsBuiltinRole]::Administrator)
True

@timja
Member

timja commented Jan 6, 2026

PS Z:\jenkins\debug> icacls "$env:APPDATA\Microsoft\Crypto\RSA"
Z:\jenkins\AppData\Roaming\Microsoft\Crypto\RSA NT AUTHORITY\SYSTEM:(I)(OI)(CI)(F)
                                                BUILTIN\Administrators:(I)(OI)(CI)(F)
                                                EC2AMAZ-44RG0FU\jenkins:(I)(OI)(CI)(F)

Successfully processed 1 files; Failed processing 0 files

@timja
Member

timja commented Jan 6, 2026

With a different key container name:
debug-secure-random-2.zip

@timja
Member

timja commented Jan 6, 2026

With more error handling:
debug-secure-random-2.zip

@timja
Member

timja commented Jan 6, 2026

Hmm, weird: I thought I'd try on windows-2022, but my exe doesn't even run there =/

No error message or anything; nothing is output at all.

@timja
Member

timja commented Jan 6, 2026

@jtnord
Member Author

jtnord commented Jan 6, 2026

Hmm, weird: I thought I'd try on windows-2022, but my exe doesn't even run there =/

No error message or anything; nothing is output at all.

missing the specific vc runtime?

@jtnord
Member Author

jtnord commented Jan 6, 2026

Are you logged in via RDP to the console (for servers, historically you needed to use /console on Remote Desktop to get the console session as opposed to a new session; not sure that is still needed), or via SSH or WinRM? WinRM is by its nature restricted, and I think SSH has some restrictions.

@timja
Member

timja commented Jan 6, 2026

Are you logged in via RDP to the console (for servers, historically you needed to use /console on Remote Desktop to get the console session as opposed to a new session; not sure that is still needed), or via SSH or WinRM? WinRM is by its nature restricted, and I think SSH has some restrictions.

I'm logged in over SSH (which is the same as Jenkins is doing)


missing the specific vc runtime?

Probably

@jtnord
Member Author

jtnord commented Jan 6, 2026

missing the specific vc runtime?

Probably

image

@timja
Member

timja commented Jan 6, 2026

New version should be statically linked:
debug-secure-random-2.zip

@dduportal

  • Trying with WORKGROUP as the domain (both UPN and Netlogon syntaxes) does not work: SSH refuses authentication with both password and public key.
  • However, both syntaxes work when using the value from $env:COMPUTERNAME.

I'm now playing around with using the VM hostname instead of its IPv4 address.

@dduportal

  • Trying with WORKGROUP as the domain (both UPN and Netlogon syntaxes) does not work: SSH refuses authentication with both password and public key.
  • However, both syntaxes work when using the value from $env:COMPUTERNAME.

I'm now playing around with using the VM hostname instead of its IPv4 address.

Using the hostname does not work either: same behavior. I'm going back to the password init in non-interactive mode.

@dduportal

Ah, I might have found a technique with the Posh-SSH module. Let me retry with cloud-init.

@dduportal

YEEEPEEKAY

21:55:20  Running on [EC2 (aws-us-east-2) - Windows Infra Test (i-01e25c18c0cd94c7d)](https://ci.jenkins.io/computer/EC2%20%28aws%2Dus%2Deast%2D2%29%20%2D%20Windows%20Infra%20Test%20%28i%2D01e25c18c0cd94c7d%29/) in Z:/agent/workspace/ra_acceptance-tests_infra-checks
21:55:20  [Pipeline] {
21:55:20  [Pipeline] powershell
21:55:33  Debug Secure Random 2
21:55:36  A cryptographic context with the J2SETest key container 
21:55:36  has been acquired.
21:55:36  
21:55:36  The handle has been released.

https://ci.jenkins.io/job/Infra/job/acceptance-tests/job/infra-checks/736/console

# Requires using YAML for the Windows "Cloud Init" stuff. Multipart upload of a powershell script does not work.
version: 1.1
tasks:
- task: executeScript
  inputs:
  - frequency: always
    type: powershell
    runAs: localSystem
    content: |-
      ## Set up permissions context (as you are Administrator here)
      Set-ExecutionPolicy Unrestricted -Scope LocalMachine -Force -ErrorAction Ignore
      # Don't set this before Set-ExecutionPolicy as it throws an error
      $ErrorActionPreference = "stop"

      ## Setup NVMe(s) and map it to the Z: drive
      $nb = Get-Disk | Where-Object PartitionStyle -eq 'RAW' | tee -Variable Disks | measure
      Write-Output "$($nb.Count) disk found."
      Switch ($nb.Count)
      {
        0 {Write-Output "No RAW disk found."}
        1 {
            Write-Output "1 disk found."
            $Disks | Initialize-Disk -PartitionStyle MBR
            $Disks | New-Partition -UseMaximumSize -MbrType IFS
            $Partition = Get-Partition -DiskNumber $Disks.Number
            $Partition | Format-Volume -FileSystem NTFS -Confirm:$false
            $Partition | Add-PartitionAccessPath -AccessPath "Z:"
            Get-WmiObject Win32_Volume | Format-Table Name, Label, FreeSpace, Capacity
        }
        default {
            Write-Output "$($nb.Count) disks found."
            $Disks | ForEach-Object -Begin {Get-Date} -Process {
                    Initialize-Disk -PartitionStyle MBR -PassThru -DiskNumber $_.Number
                    New-Partition -UseMaximumSize -MbrType IFS
                    $Partition = Get-Partition -DiskNumber $_.Number
                    $Partition | Format-Volume -FileSystem NTFS -Confirm:$false
                    $Partition | Add-PartitionAccessPath -AccessPath "Z:"
                } -End {Get-Date}
            Get-WmiObject Win32_Volume | Format-Table Name, Label, FreeSpace, Capacity
        }
      }

      # Set up Windows default user profile location to Z: drive
      $userpath = 'Z:\Users'
      $regpath = "HKLM:\Software\Microsoft\Windows NT\CurrentVersion\ProfileList"
      $regname = "ProfilesDirectory"
      $username="jenkins"
      $userHome = "$userpath\$username"
      $userSSHDir = "$userHome\.ssh"
      $authorizedKeysFile = "$userSSHDir\authorized_keys"


      set-itemproperty -path $regpath -name $regname -value $userpath
      Write-Output "Set up default user profiles to $userpath"

      # Create user and init its profile
      $pw = ConvertTo-SecureString -String '<redacted>' -AsPlainText -Force
      New-LocalUser -Name $username -Password $pw
      Add-LocalGroupMember -Group "openssh users" -Member $username
      Write-Output "Created the user $username"
      Install-Module -Name Posh-SSH -Force
      Write-Output "Installed PoshSSH module"
      $cred = New-Object System.Management.Automation.PSCredential ($username,$pw)
      $SessionID = New-SSHSession -ComputerName "localhost" -AcceptKey -Credential $cred
      Write-Output "Got SSH Session"
      Invoke-SSHCommand -Index $SessionID.Sessionid -Command 'whoami'

      $cryptoDebugUrl = 'https://github.com/user-attachments/files/24455803/debug-secure-random-2.zip'
      Invoke-SSHCommand -Index $SessionID.Sessionid -Command "powershell Invoke-WebRequest $cryptoDebugUrl -OutFile debug-secure-random-2.zip"
      Write-Output "Downloaded $cryptoDebugUrl in debug-secure-random-2.zip"
      Invoke-SSHCommand -Index $SessionID.Sessionid -Command "powershell Expand-Archive debug-secure-random-2.zip"
      Write-Output "Expanded debug-secure-random-2.zip"
      Invoke-SSHCommand -Index $SessionID.Sessionid -Command "powershell .\debug-secure-random-2\debug-secure-random-2.exe"
      Write-Output "Ran .\debug-secure-random-2\debug-secure-random-2.exe"

      # Setup SSH key for user in its profile once the initial SSH password login has been performed to init CryptoAPI
      
      $keyUrl = 'https://raw.githubusercontent.com/jenkins-infra/aws/main/ec2_agents_authorized_keys'
      

      Write-Output "Starting setting up SSH key"
      Invoke-SSHCommand -Index $SessionID.Sessionid -Command "powershell New-Item -ItemType Directory -Path $userSSHDir -Force"
      Invoke-SSHCommand -Index $SessionID.Sessionid -Command "powershell Invoke-WebRequest $keyUrl -OutFile $authorizedKeysFile"
      Write-Output "Finished setting up SSH key"

      ## Setup datadog
      (Get-Content C:\ProgramData\Datadog\datadog.yaml -Raw) -Replace 'api_key:', 'api_key: <redacted>' | Set-Content C:\ProgramData\Datadog\datadog.yaml
      & "$env:ProgramFiles\Datadog\Datadog Agent\bin\agent.exe" restart-service

      ## Disable WinRM
      Remove-Item -Path WSMan:\Localhost\listener\listener* -Recurse
      cmd.exe /c net stop winrm

      ## Mark cloud init as finished using a marker file
      New-Item -Path "Z:/Temp" -ItemType "Directory"
      New-Item -Path "Z:/Temp/.cloud-init.done" -ItemType "File" -Value "Cloud Init"

@timja
Member

timja commented Jan 14, 2026

Nice!

Is Posh-SSH needed? Isn't the fix to do an SSH to localhost when the user is created using password auth first and then key based auth?

The server should have SSH pre-installed on it?

I guess it makes it easier to e.g. work with a password and use powershell objects

@dduportal

The server should have SSH pre-installed on it?

Is Posh-SSH needed?

Could be. I focused on validating the "patch" (i.e. verifying that the CryptoAPI can be used in the pipeline by performing a first SSH login with a password in non-interactive mode), and I wanted to rule out the "how to pass the password to the ssh client" question.

I guess it makes it easier to e.g. work with a password and use powershell objects

Exactly, that's why I started with it. The ssh client does not support reading the password from stdin.
I can try with the SSH_ASKPASS environment variable (which I usually do on Unix), but only now that we are sure it is worth it ;).
If it does not work, I'll propose a PR to have the module pre-installed in our AMI to gain some time during init (as it takes ~4 min end to end before the agent is ready).

Isn't the fix to do an SSH to localhost when the user is created using password auth first and then key based auth?

Absolutely. But the sequence of tasks is important:

  • The privileged operations can only be run by the "cloud init" ("EC2Launch v2"), i.e. during the boot phase. The "init script" of the ec2 plugin is expected to be run by the (almost) non-privileged user jenkins.
  • However, since the ec2 plugin loops on its SSH connection attempts, the jenkins user SSH session is opened "asap": the cloud-init might not be (and usually is not) finished when the connection happens. To fix this, I've moved the configuration of authorized_keys to the last step (after we are sure that the user is properly set up).
  • We have not baked the user jenkins into the AMIs: it used to provide faster agent startup (less than 1 min) but it prevented us from having anything on the Z: (NVMe) drive: the Jenkins ec2 plugin had already connected the (prebaked) jenkins user before cloud-init had even finished formatting and mounting Z:.

@jtnord
Member Author

jtnord commented Jan 14, 2026

So I'm not clear whether the fix is just "ssh with user:pass" or whether you need to "run something that sets up the CryptoAPI that has a user/PW token"
(or whether it's a combination of both).

If it's the latter, then Start-Process will probably work to just run our debug tool, without needing Posh-SSH.

@dduportal

So I'm not clear whether the fix is just "ssh with user:pass" or whether you need to "run something that sets up the CryptoAPI that has a user/PW token" (or whether it's a combination of both).

If it's the latter, then Start-Process will probably work to just run our debug tool, without needing Posh-SSH.

I initially thought we ran this scenario (setup CryptoAPI with the Start-Process), but not sure. Retrying to be sure

@dduportal

dduportal commented Jan 14, 2026

So I'm not clear whether the fix is just "ssh with user:pass" or whether you need to "run something that sets up the CryptoAPI that has a user/PW token" (or whether it's a combination of both).

If it's the latter, then Start-Process will probably work to just run our debug tool, without needing Posh-SSH.

Start-Process is really an awful function. Can't find how to show its stdout / stderr
/me is tired: https://learn.microsoft.com/fr-fr/powershell/module/microsoft.powershell.management/start-process?view=powershell-7.5#-redirectstandardoutput

@jtnord
Member Author

jtnord commented Jan 14, 2026

So I'm not clear whether the fix is just "ssh with user:pass" or whether you need to "run something that sets up the CryptoAPI that has a user/PW token" (or whether it's a combination of both).
If it's the latter, then Start-Process will probably work to just run our debug tool, without needing Posh-SSH.

Start-Process is really an awful function. Can't find how to show its stdout / stderr :'(

You don't really need to. Just check manually with ssh -i ... in the resulting VM?

@dduportal

So I'm not clear whether the fix is just "ssh with user:pass" or whether you need to "run something that sets up the CryptoAPI that has a user/PW token" (or whether it's a combination of both).
If it's the latter, then Start-Process will probably work to just run our debug tool, without needing Posh-SSH.

Start-Process is really an awful function. Can't find how to show its stdout / stderr :'(

You don't really need to. Just check manually with ssh -i ... in the resulting VM?

Actually I do: it does not seem to do anything at all. I'm not sure what I'm missing in the call.

Start-Process -FilePath "cmd.exe" -ArgumentList "/c pwd" -Credential $cred -LoadUserProfile -Wait -RedirectStandardOutput 'Z:\output.log' -RedirectStandardError 'Z:\err.log'

generates the log files but they are empty (both of them). I can't even get an exit code. What is wrong with my instruction?

@jtnord
Member Author

jtnord commented Jan 15, 2026

I'm not sure what I'm missing in the call.
... pwd ...

pwd is not a command you can execute with cmd.exe!
(Now, cmd should still run and fail, though...) (The command works for me locally.)

@dduportal

I'm retrying to see what we can do.

@dduportal

@timja
Member

timja commented Jan 15, 2026

Don’t we have 7 installed?

@lemeurherve
Member

Don’t we have 7 installed?

We're ensuring both are installed in packer images (at least): https://github.com/jenkins-infra/packer-images/blob/317e2015fd5d9ee2670a6da01ada4b2700d953d8/provisioning/windows-provision.ps1#L381-L394

@dduportal

Oh right, I just have to use pwsh instead of powershell \o/

@dduportal

dduportal commented Jan 15, 2026

https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/invoke-command?view=powershell-7.5#-sshtransport

This parameter was introduced in PowerShell 6.0.

:'(

Doesn't support password auth for SSH 😮‍💨 (edit: not as a non-interactive script - https://learn.microsoft.com/en-us/powershell/scripting/security/remoting/ssh-remoting-in-powershell)

@dduportal

I give up: I'll go with the Posh-SSH module.

  • The SSH_ASKPASS technique fails for non-interactive SSH sessions.
  • Invoke-Command requires WinRM for password-based authentication.
  • The Start-Process command does not seem to work when trying to run as the jenkins user. As soon as I add the -Credential attribute (with a valid PSCredential), it seems to run "something" but it never works (it can't create a file or directory anywhere). I guess there is an error, but the stdout/stderr redirection only produces empty content.

I've exhausted both my skills and patience: if no one objects, I'll go forward with the Posh-SSH for an initial implementation.

@dduportal

Gotcha, I finally succeeded with the native ssh.exe \o/

@dduportal

dduportal commented Jan 18, 2026

Gotcha, I finally succeeded with the native ssh.exe \o/

Here is the userdata code applied to the agent template infratest-windows with the following changes:

  • CryptoAPI works in the pipeline (at least the exe file from @timja)

  • The jenkins user is not admin anymore but can use Docker

  • No need to wait ~80-90s for the Posh-SSH module installation: the native ssh.exe is used with the SSH_ASKPASS variables (see https://man7.org/linux/man-pages/man1/ssh.1.html for details) to provide the password for the first connection only.

    • The trick was to set the environment variable DISPLAY to 0 (i.e. screen 0) to support the non-interactive case. It worked without it in my initial tests, which were interactive pwsh sessions, but failed without it in the non-interactive cloud-init script.
  • Note: debug-secure-random-2.exe must be called as the jenkins user in its SSH password-auth session. Only initializing the user home and profile with ssh.exe and a simple command (tried with docker info to ensure it has permissions) is not sufficient.

    • I'm not sure if there would be a powershell command to replace this exe execution
  • Verified with Windows Server 2019, 2022 and 2025

    • Note: Only 2025 has OpenSSH already installed with the SSH users group
  • There are optimizations which may be made to decrease the number of moving parts (which may fail agent initialization):

    • Add the debug-secure-random-2.exe in the packer-image AMIs (to avoid downloading it on each agent)
    • Ensure we have the additional groups "Docker Users" (all) and "OpenSSH users" (2022/2019 only)
    • Copy the SSH key from the admin's SSH setup (it is the same key) to avoid additional download during cloud init
    • Cleanup packer leftovers in Packer image itself
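
For readers less familiar with the askpass mechanism described in the list above, here is a minimal sketch of the same idea expressed with Java's ProcessBuilder. It is not part of the userdata script: the class name AskPassSsh and its build method are hypothetical, and the helper path is only an example. The three environment variables (DISPLAY, SSH_ASKPASS_REQUIRE, SSH_ASKPASS) and the ssh options are the ones used in the script in this thread.

```java
import java.util.List;
import java.util.Map;

// Hypothetical illustration of launching ssh with an askpass helper; nothing is started here.
class AskPassSsh {
    /** Builds (but does not start) an ssh invocation that takes its password from an askpass helper. */
    static ProcessBuilder build(String user, String host, String askPassHelper, String remoteCommand) {
        ProcessBuilder pb = new ProcessBuilder(List.of(
            "ssh", user + "@" + host,
            "-oStrictHostKeyChecking=no",
            "-oUserKnownHostsFile=/dev/null",
            remoteCommand));
        Map<String, String> env = pb.environment();
        env.put("DISPLAY", "0");                // reported above as required for the non-interactive case
        env.put("SSH_ASKPASS_REQUIRE", "force"); // use the helper even when no terminal is attached
        env.put("SSH_ASKPASS", askPassHelper);   // program that prints the password on stdout
        pb.inheritIO();
        return pb;
    }

    public static void main(String[] args) {
        // Example values only; the helper path and the remote command are assumptions.
        ProcessBuilder pb = build("jenkins", "localhost", "C:\\askpass.bat", "whoami");
        System.out.println(String.join(" ", pb.command()));
        // To actually run it: int exitCode = pb.start().waitFor();
    }
}
```

The sketch deliberately stops short of starting the process, so it can be inspected without an SSH server at hand.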

User data code:

# Requires using YAML for the Windows "Cloud Init" stuff. Multipart upload of a powershell script does not work.
version: 1.1
tasks:
- task: executeScript
  inputs:
  - frequency: always
    type: powershell
    runAs: localSystem
    content: |-
      ## Set up permissions context (as you are Administrator here)
      Set-ExecutionPolicy Unrestricted -Scope LocalMachine -Force -ErrorAction Ignore
      # Don't set this before Set-ExecutionPolicy as it throws an error
      $ErrorActionPreference = "stop"

      ## Setup datadog
      (Get-Content C:\ProgramData\Datadog\datadog.yaml -Raw) -Replace 'api_key:', 'api_key: <redacted>' | Set-Content C:\ProgramData\Datadog\datadog.yaml
      & "$env:ProgramFiles\Datadog\Datadog Agent\bin\agent.exe" restart-service
      Write-Output 'Datadog service setup'
      Get-Date

      ## Disable WinRM
      Remove-Item -Path WSMan:\Localhost\listener\listener* -Recurse
      cmd.exe /c net stop winrm
      Write-Output 'WinRM disabled'
      Get-Date

      ## Setup NVMe(s) and map it to the Z: drive
      $nb = Get-Disk | Where-Object PartitionStyle -eq 'RAW' | tee -Variable Disks | measure
      Write-Output "$($nb.Count) disk found."
      Get-Date
      Switch ($nb.Count)
      {
        0 {Write-Output "No RAW disk found."}
        1 {
            $Disks | Initialize-Disk -PartitionStyle MBR
            $Disks | New-Partition -UseMaximumSize -MbrType IFS
            $Partition = Get-Partition -DiskNumber $Disks.Number
            $Partition | Format-Volume -FileSystem NTFS -Confirm:$false
            $Partition | Add-PartitionAccessPath -AccessPath "Z:"
            Get-WmiObject Win32_Volume | Format-Table Name, Label, FreeSpace, Capacity
        }
        default {
            $Disks | ForEach-Object -Begin {Get-Date} -Process {
                    Initialize-Disk -PartitionStyle MBR -PassThru -DiskNumber $_.Number
                    New-Partition -UseMaximumSize -MbrType IFS
                    $Partition = Get-Partition -DiskNumber $_.Number
                    $Partition | Format-Volume -FileSystem NTFS -Confirm:$false
                    $Partition | Add-PartitionAccessPath -AccessPath "Z:"
                } -End {Get-Date}
            Get-WmiObject Win32_Volume | Format-Table Name, Label, FreeSpace, Capacity
        }
      }
      Write-Output 'Disk setup finished.'
      Get-Date

      ## Setup Docker Engine
      $dockerGroup = 'docker-users'
      try {Get-LocalGroup -Name $dockerGroup;} catch {New-LocalGroup -Name $dockerGroup;}

      # Note: file path MUST use Unix-style separator (/)
      @"
      {
        "hosts": ["npipe://"],
        "data-root": "Z:/docker",
        "group": "$dockerGroup"
      }
      "@ | Set-Content C:\ProgramData\Docker\config\daemon.json
      # Restart docker engine
      Restart-Service docker
      docker info
      Write-Output 'Docker Engine setup finished.'
      Get-Date

      # Create the 'jenkins' agent's user account
      $username = 'jenkins'
      ## TODO: generate random password
      $userPassword = '<redacted>'
      $pw = ConvertTo-SecureString -String $userPassword -AsPlainText -Force
      New-LocalUser -Name $username -Password $pw

      $sshGroup = "openssh users"
      try {Get-LocalGroup -Name $sshGroup;} catch {New-LocalGroup -Name $sshGroup;}

      Add-LocalGroupMember -Group $sshGroup -Member $username
      Add-LocalGroupMember -Group $dockerGroup -Member $username

      Write-Output "User $username created."
      Get-Date
      # Set up Windows default Users profiles location to the custom data disk drive
      $userpath = 'Z:\Users'
      $regpath = 'HKLM:\Software\Microsoft\Windows NT\CurrentVersion\ProfileList'
      $regname = 'ProfilesDirectory'
      Set-ItemProperty -path $regpath -name $regname -value $userpath
      Write-Output "Set up default user profiles to $userpath"
      Get-Date

      $cryptoDebugUrl = 'https://github.com/user-attachments/files/24455803/debug-secure-random-2.zip'
      $cryptoBaseDir = 'C:'
      $cryptoBasename = "$cryptoBaseDir\debug-secure-random-2"
      $cryptoExe = "$cryptoBasename.exe"
      $cryptoArchive = "$cryptoBasename.zip"
      Invoke-WebRequest $cryptoDebugUrl -OutFile $cryptoArchive
      Expand-Archive "$cryptoArchive" -DestinationPath "$cryptoBaseDir\"
      Write-Output "Debug Tool to setup CryptoApi deployed to $cryptoExe"
      Get-Date

      $env:DISPLAY = '0'
      $env:SSH_ASKPASS_REQUIRE = 'force'
      $env:SSH_ASKPASS = 'C:\askpass.bat'
      @"
      @echo off
      echo $userPassword
      "@ | Set-Content $env:SSH_ASKPASS

      ssh $username@localhost -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null "$cryptoExe"
      Remove-Item -Path "$env:SSH_ASKPASS" -Force
      Remove-Item -Path "$cryptoExe" -Force
      Remove-Item -Path "$cryptoArchive" -Force
      Remove-Item -Path 'C:\goss-windows-2019.yaml' -Force
      Remove-Item -Path 'C:\goss-windows-2022.yaml' -Force
      Remove-Item -Path 'C:\addSSHPubKey.ps1' -Force
      Write-Output 'Initialized User Profile and Crypto through SSH with password authentication'
      Get-Date

      # Setup SSH key for user in its profile
      $userSSHDir = "$userpath\$username\.ssh"
      New-Item -ItemType Directory -Path "$userSSHDir" -Force | Out-Null
      $authorizedKeysFile = "$userSSHDir\authorized_keys"
      $keyUrl = 'https://raw.githubusercontent.com/jenkins-infra/aws/main/ec2_agents_authorized_keys'
      Write-Host "Downloading SSH key from $keyUrl to $authorizedKeysFile"
      Invoke-WebRequest $keyUrl -OutFile $authorizedKeysFile

      ## Mark cloud init as finished using a marker file
      New-Item -Path "Z:/Temp" -ItemType "Directory"
      New-Item -Path "Z:/Temp/.cloud-init.done" -ItemType "File" -Value "Cloud Init"

Build replay testing this agent template: https://ci.jenkins.io/job/Plugins/job/credentials-plugin/job/PR-999/22/

with no Linux and a Windows JDK17 custom platform:

buildPlugin(useContainerAgent: false, configurations: [
  [platform: 'infratest-windows', jdk: 17],
])

@dduportal

Update: adding back the jenkins user in the Administrator group as build 22 failed due to missing permissions (pipeline library git config --global xxx commands).

Retrying with https://ci.jenkins.io/job/Plugins/job/credentials-plugin/job/PR-999/23

@dduportal

dduportal commented Jan 18, 2026

All this work is an absolute failure: we still have the same message about system seed generator (twice) in https://ci.jenkins.io/job/Plugins/job/credentials-plugin/job/PR-999/23:

provider: Failed to use operating system seed generator: java.io.IOException: Required native CryptoAPI features not  available on this machine

(edit)

@jtnord
Member Author

jtnord commented Jan 19, 2026

  • I'm not sure if there would be a powershell command to replace this exe execution

You have Java installed, so you can probably run something like the following, converted to PowerShell:

echo java.security.SecureRandom.getInstanceStrong().generateSeed(1) | jshell

I've now forgotten if we need getInstanceStrong() or getInstance() (and this will not output any diagnostics, unless you pass the extra flags with -R?).
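
On the getInstanceStrong() question: SecureRandom.getInstance requires an algorithm name, so the no-argument alternative is new SecureRandom(). One way to see how the two differ on a given machine is to print what each resolves to. The following is only a sketch under that assumption; the class name StrongVsDefault is hypothetical, and it does not claim which of the two triggers the failing seed generator on the agents.

```java
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.security.Security;

// Hypothetical comparison of the default and the "strong" SecureRandom on the current JVM.
class StrongVsDefault {
    /** Returns "ALGORITHM from PROVIDER" for the given instance. */
    static String describe(SecureRandom random) {
        return random.getAlgorithm() + " from " + random.getProvider().getName();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // The security property that getInstanceStrong() consults.
        System.out.println("securerandom.strongAlgorithms = "
            + Security.getProperty("securerandom.strongAlgorithms"));
        System.out.println("new SecureRandom():  " + describe(new SecureRandom()));
        System.out.println("getInstanceStrong(): " + describe(SecureRandom.getInstanceStrong()));
        // Same call as in the jshell one-liner, on the default instance.
        System.out.println("seed bytes generated: " + new SecureRandom().generateSeed(1).length);
    }
}
```

Running it with -Djava.security.debug=provider on the java command line (or passing the flag through jshell's -R option) should show the provider diagnostics alongside this output.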

@jtnord
Member Author

jtnord commented Jan 19, 2026

adding back the jenkins user in the Administrator group

CryptoAPI behaves differently for members of the Administrators group than for non-administrators.
When you create a context as an administrator, it is shared by all administrators; when you create it as a user, it is only available to that user. Could something else happening while doing this be what causes #999 (comment)?

@jtnord
Member Author

jtnord commented Jan 19, 2026

Update: adding back the jenkins user in the Administrator group as build 22 failed due to missing permissions (pipeline library git config --global xxx commands).

Retrying with https://ci.jenkins.io/job/Plugins/job/credentials-plugin/job/PR-999/23

git config set --system core.autocrlf true

Do not do this as part of the pipeline library; this is wrong. Either do this when you install Git for Windows, or configure the global (aka user) setting in the script.

@dduportal

Update: adding back the jenkins user in the Administrator group as build 22 failed due to missing permissions (pipeline library git config --global xxx commands).
Retrying with https://ci.jenkins.io/job/Plugins/job/credentials-plugin/job/PR-999/23

git config set --system core.autocrlf true

Do not do this as part of the pipeline library; this is wrong. Either do this when you install Git for Windows, or configure the global (aka user) setting in the script.

I agree in principle: we should not do this. But changing this right now means we should first go back in history to find out why it was put here in the first place (I guess a need for a quick fix + missing knowledge + a spread of Windows configurations, which is not the case nowadays).

If that's OK for you, we can continue on that part (i.e. dropping the admin permission), but expect a bit of delay then 😅 (we have an incoming 2.541.1 LTS this week).

Proposed plan (does it look good to you?):

  • Work on removing the git config --global from the pipeline library to allow running jenkins as "non-admin"
  • Retry the full JVM build here to verify whether moving to non-admin solves your problem (or not)
  • Then, if it solves the issue, maybe optimize by replacing the debug-xxx.exe with the JVM command you provided

dduportal added a commit to dduportal/jenkins-infra that referenced this pull request Jan 23, 2026