Issue Details
Describe the bug
The cloud is configured to automatically resubmit builds that were aborted because their node was terminated, but the builds are not resubmitted.
To Reproduce
- Create a pipeline:
node("windows-small") {
    while (true) {
        echo "Retriger test!"
    }
}
- Trigger it so that it runs on an AWS Spot Instance.
- Terminate the Spot Instance in the AWS web console while the job is running.
** Logs **
Jenkins job log:
09:56:58.232 Retriger test!
09:56:58.245 [Pipeline] echo
09:56:58.248 Retriger test!
09:56:58.263 EC2 instance for node AWS Windows i-0ec91fe6950f53b51 was terminated
09:56:58.270 [Pipeline] echo
09:56:58.275 Retriger test!
09:56:58.288 [Pipeline] echo
09:56:58.292 Retriger test!
09:56:58.301 [Pipeline] }
09:56:58.322 [Pipeline] // node
09:56:58.340 [Pipeline] End of Pipeline
09:56:58.513 org.jenkinsci.plugins.workflow.actions.ErrorAction$ErrorId: 028d9fd0-de7d-4aac-b021-55f267dd9fef
09:56:58.526 Finished: ABORTED
Jenkins system log (note that the timestamps differ because the slave and master are in different time zones):
Jan 17, 2025 8:56:58 AM INFO com.amazon.jenkins.ec2fleet.EC2FleetAutoResubmitComputerLauncher afterDisconnect
DISCONNECTED: AWS Windows i-0ec91fe6950f53b51
Jan 17, 2025 8:56:58 AM INFO com.amazon.jenkins.ec2fleet.EC2FleetAutoResubmitComputerLauncher afterDisconnect
Start retriggering executors for AWS Windows i-0ec91fe6950f53b51
Jan 17, 2025 8:56:58 AM SEVERE hudson.slaves.SlaveComputer$1 onClosed
Launcher com.amazon.jenkins.ec2fleet.EC2FleetAutoResubmitComputerLauncher@2876def1s afterDisconnect method propagated an exception when {1}s connection was closed: Cannot invoke "org.jenkinsci.plugins.workflow.job.WorkflowRun.getActions(java.lang.Class)" because "failedBuild" is null
java.lang.NullPointerException: Cannot invoke "org.jenkinsci.plugins.workflow.job.WorkflowRun.getActions(java.lang.Class)" because "failedBuild" is null
at PluginClassLoader for ec2-fleet//com.amazon.jenkins.ec2fleet.EC2FleetAutoResubmitComputerLauncher.afterDisconnect(EC2FleetAutoResubmitComputerLauncher.java:106)
at hudson.slaves.SlaveComputer$1.onClosed(SlaveComputer.java:650)
at hudson.remoting.Channel.terminate(Channel.java:1143)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:90)
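The stack trace above shows that `failedBuild` was null when `afterDisconnect` called `getActions` on it. The following is only a minimal sketch of a null guard for that kind of loop, not the plugin's actual code: `Build`, `ExecutorSlot`, and `collectResubmittable` are hypothetical stand-ins invented for illustration, and why the build reference is null here has not been established.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ResubmitSketch {
    /** Hypothetical stand-in for a build; not a Jenkins type. */
    static final class Build {
        final String name;
        Build(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for an executor whose build reference may be null. */
    static final class ExecutorSlot {
        final Build currentBuild; // null when no build could be resolved
        ExecutorSlot(Build currentBuild) { this.currentBuild = currentBuild; }
    }

    /** Returns the names of builds that can be resubmitted, skipping null references. */
    static List<String> collectResubmittable(List<ExecutorSlot> slots) {
        List<String> names = new ArrayList<>();
        for (ExecutorSlot slot : slots) {
            Build failedBuild = slot.currentBuild;
            if (failedBuild == null) {
                // Skip instead of dereferencing, so one unresolved build
                // does not abort handling of the remaining executors.
                continue;
            }
            names.add(failedBuild.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<ExecutorSlot> slots = Arrays.asList(
            new ExecutorSlot(new Build("job-a #1")),
            new ExecutorSlot(null),
            new ExecutorSlot(new Build("job-b #7")));
        System.out.println(collectResubmittable(slots)); // prints [job-a #1, job-b #7]
    }
}
```

A guard like this would only prevent the exception. It would not by itself resubmit the build whose reference was null, so the underlying cause would still need to be found.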
Role attached in AWS:
"ec2:*": This grants full access to all EC2 actions, including DescribeInstances, TerminateInstances, and DescribeSpotFleetRequests. This would allow the Jenkins EC2 Fleet plugin to perform any action on EC2 resources.
Environment Details
Plugin Version?
EC2 Fleet 3.2.0
Jenkins Version?
Jenkins 2.462.1
Spot Fleet or ASG?
Spot Fleet
Label based fleet?
No
Linux or Windows?
Identical behaviour for Linux and Windows slaves. The Jenkins master runs Linux.
EC2Fleet Configuration as Code
Cloud AWS Windows Configuration
Name
AWS Windows
Select AWS Credentials or leave set to none to use AWS EC2 Instance Role
AWS Credentials
- none -
Region
eu-central-1 EU (Frankfurt)
Endpoint like https://ec2.us-east-2.amazonaws.com
Endpoint
- empty -
Fleet list will be available once region and credentials are specified. Only maintain supported, see help
EC2 Fleet
Auto Scaling Group - jenkins-spot-agents-windows-small
[ ] Show all fleets
Launcher
Launch agents via SSH
Credentials
jenkins/****** (jenkins windows ssh)
Host Key Verification Strategy
Non verifying Verification Strategy
Connect to instances via private IP instead of public IP
[x] Private IP
Always reconnect to offline nodes after instance reboot or connection loss
[x] Always Reconnect
Only build jobs with label expressions matching this node
[x] Restrict Usage
Labels to add to instances in this fleet
Label
ec2-fleet windows-small
Default is /tmp/jenkins-
Jenkins Filesystem Root
C:\Jenkins
Testing Number of executors per instance
Number of Executors
4
Scale Executors
No scaling
How long to keep an idle node. If set to 0, never scale down
Max Idle Minutes Before Scaledown
0
Minimum Cluster Size
1
Maximum Cluster Size
1
Minimum Spare Size
0
Maximum Total Uses
-1
Disable auto resubmitting a build if it failed due to an EC2 instance termination like a Spot interruption
[ ] Disable Build Resubmit
Maximum time to wait for EC2 instance startup
Maximum Init Connection Timeout in sec
180
Interval for updating EC2 cloud status
Cloud Status Interval in sec
10
Enable faster provision when queue is growing
[ ] No Delay Provision Strategy
Anything else unique about your setup?
<Yes…/No>