@alaypatel07
Contributor

@alaypatel07 alaypatel07 commented Sep 25, 2025

This test exercises the DRA tests at 5000-node scale.

It is created as a separate test to check stability. Eventually it will be merged into the release-informing tests, and this test will be removed to save infrastructure costs.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Sep 25, 2025
@k8s-ci-robot k8s-ci-robot added area/config Issues or PRs related to code in /config area/jobs sig/scalability Categorizes an issue or PR as relevant to SIG Scalability. sig/testing Categorizes an issue or PR as relevant to SIG Testing. wg/device-management Categorizes an issue or PR as relevant to WG Device Management. labels Sep 25, 2025
@alaypatel07
Contributor Author

/assign @wojtek-t

@pohly pohly moved this from 🆕 New to 👀 In review in Dynamic Resource Allocation Sep 29, 2025
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 30, 2025
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 1, 2025
@alaypatel07 alaypatel07 force-pushed the dra-5000-gce-test branch 4 times, most recently from 7c96747 to a0bb5f4 Compare October 1, 2025 14:48
@BenTheElder
Member

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 1, 2025
- --env=KUBE_DNS_MEMORY_LIMIT=300Mi
- --extract=ci/fast/latest-fast
- --gcp-nodes=5000
- --gcp-project-type=scalability-scale-project
Member

There is only one of these projects currently, so this job would have to be deconflicted with the existing 5k node jobs.

- "perfDashBuildsCount: 270"
- "perfDashJobType: performance"
# TODO (alaypatel07): increase this interval once stable
interval: 12h
Member

We almost certainly need to use a cron to avoid scheduling conflicts, and we should not increase the interval. These jobs are really expensive, and it takes some time to coordinate permission for quota this large in a project without a contract 🙃

@BenTheElder
Member

what prevents us from including DRA in any of the existing 5k node jobs? we spend a lot of time and resources just doing things like dumping all of the logs to storage

@alaypatel07
Contributor Author

what prevents us from including DRA in any of the existing 5k node jobs? we spend a lot of time and resources just doing things like dumping all of the logs to storage

The intent is to avoid disrupting the stability of the existing 5k node jobs with a new job, since those are release-informing and monitored very closely.

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Oct 1, 2025
@BenTheElder
Member

/assign wojtek-t marseel

@jackfrancis
Contributor

/lgtm
/approve

from DRA perspective

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 6, 2025
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Oct 13, 2025
@alaypatel07
Contributor Author

@BenTheElder @jackfrancis the release blocking jobs run on odd days:

- cron: '1 17 1-31/2 * *' # Run on odd days at 9:01PST (17:01 UTC)

I have configured this job to run on even days at 9:01 AM EST. Please check whether this works and approve/lgtm so we can get unblocked here.

cc @klueska @pohly
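
To sanity-check the odd/even split described above, here is a minimal Python sketch that expands a cron day-of-month field and confirms that the two schedules never share a day. The field `1-31/2` is taken from the release-blocking job quoted above. The even-day field `2-31/2` and the helper name `expand_day_of_month` are assumptions made for illustration, not values copied from the merged config. The helper only handles the plain `A`, `A-B`, and `A-B/STEP` forms, not lists or `*`.

```python
def expand_day_of_month(field: str) -> set[int]:
    """Expand a cron day-of-month field of the form 'A', 'A-B' or 'A-B/STEP'."""
    range_part, _, step_part = field.partition("/")
    start_text, _, end_text = range_part.partition("-")
    start = int(start_text)
    end = int(end_text) if end_text else start
    step = int(step_part) if step_part else 1
    return set(range(start, end + 1, step))


# Field used by the existing release-blocking job (from the comment above).
odd_days = expand_day_of_month("1-31/2")
# Hypothetical even-day field; the real job config may use a different value.
even_days = expand_day_of_month("2-31/2")

print("odd days: ", sorted(odd_days))
print("even days:", sorted(even_days))
print("overlap:  ", sorted(odd_days & even_days))
```

An empty overlap means the two jobs never start on the same calendar day. It does not by itself guarantee that a long run started on one day has finished before the other job starts the next day, so run duration still needs to be checked against the start times.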

@alaypatel07
Contributor Author

/hold

Contributor

@jackfrancis jackfrancis left a comment

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 13, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alaypatel07, jackfrancis

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 14, 2025
@alaypatel07
Contributor Author

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 14, 2025
Signed-off-by: Alay Patel <[email protected]>
Contributor

@nojnhuh nojnhuh left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 14, 2025
@k8s-ci-robot k8s-ci-robot merged commit 418185d into kubernetes:master Oct 14, 2025
6 checks passed
@k8s-ci-robot
Contributor

@alaypatel07: Updated the job-config configmap in namespace default at cluster test-infra-trusted using the following files:

  • key sig-scalability-periodic-dra.yaml using file config/jobs/kubernetes/sig-scalability/DRA/sig-scalability-periodic-dra.yaml

In response to this:

This test exercises the DRA tests at 5000-node scale.

It is created as a separate test to check stability. Eventually it will be merged into the release-informing tests, and this test will be removed to save infrastructure costs.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@pohly pohly moved this from 👀 In review to ✅ Done in Dynamic Resource Allocation Oct 15, 2025
Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/config Issues or PRs related to code in /config area/jobs cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/scalability Categorizes an issue or PR as relevant to SIG Scalability. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. wg/device-management Categorizes an issue or PR as relevant to WG Device Management.

Projects

Status: Done

7 participants