add Kubernetes AI Conformance working group #8515

Open
wants to merge 15 commits into
base: master
from

Conversation

fedebongio
Contributor

Which issue(s) this PR fixes:

This PR is a follow-up to the email sent two weeks ago about WG creation; see https://groups.google.com/a/kubernetes.io/g/dev/c/u6I_mCRC4lE

cc @janetkuo

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. committee/steering Denotes an issue or PR intended to be handled by the steering committee. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. labels Jul 9, 2025
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jul 9, 2025
@k8s-ci-robot k8s-ci-robot requested a review from dims July 9, 2025 17:53
@cblecker
Member

cblecker commented Jul 9, 2025

/hold
for review

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 9, 2025
@fedebongio
Contributor Author

cc @mfahlandt

efficiently run AI/ML workloads. \n"
charter_link: charter.md
stakeholder_sigs:
- Architecture
Member

I believe this should list the same SIGs that are listed as stakeholders in your charter.

Member

+1

Member

@BenTheElder BenTheElder Jul 10, 2025

IMHO this should also list SIG Testing, given that a suite of tests is in scope.

cc @aojea @pohly

As a TL I am +1 to sponsor the WG.

Contributor Author

Okay, so does everybody agree to list only SIG Architecture and SIG Testing for now? Please let me know so I can make the change and get this merged ASAP so we can start meeting.

Contributor

+1 as SIG Testing TL.

Member

+1 as SIG Arch Chair

@janetkuo
Member

janetkuo commented Jul 9, 2025

cc SIG Arch chairs @derekwaynecarr @dims @johnbelamaric for approval

@dims
Member

dims commented Jul 9, 2025

@janetkuo, let me paste @derekwaynecarr's question from the doc here:

Derek Carr
5:25 PM Jul 7
I apologize for being out of office during the meeting, but it looks like this proposal has evolved from an earlier form I had seen for networking conformance/benchmarking.

I want to understand more, but my general concern is that the linked document is larger in scope, and it's not clear how a single vendor may or may not be able to assert conformance depending on what it requires as prerequisites in its current form (particularly if it requires multiple vendors to reach a solution). For example, no GPU vendor (let alone a Kubernetes distribution) offers a production DRA driver at this time.

Isn't this inverting the process?  Should a WG be formed to explore what should or should not be done (including if conformance is needed) and then allow vendors (hardware and software) to participate to appropriately determine if the requirements as written could be fulfilled?

@franciscojavierarceo franciscojavierarceo left a comment

I'd like to advocate for an explicit nomination process for the leads of AI Conformance.


#### Code, Binaries and Services

- The primary artifact will be the Kubernetes AI Conformance specification and a suite of tests to demonstrate conformance.
Contributor

In which repository will this be maintained? Who will be the maintainers of the conformance specification?

Member

This is an open question from the SIG arch meeting.

IIRC there was loose agreement to follow up in the WG to discuss options for how best to implement and host this and present them to the relevant groups (SIG Arch, SIG Testing, CNCF, ...).

I think this should be clarified: it is undetermined whether there will be binary artifacts in the kubernetes organization. They may be in the CNCF, or the working group might determine that there's an alternate approach.

@andreyvelich andreyvelich left a comment

Thank you for this @fedebongio!

fedebongio and others added 3 commits July 10, 2025 16:55
Co-authored-by: Janet Kuo <[email protected]>
Co-authored-by: Janet Kuo <[email protected]>
Fixing 80 char line correctly
@fedebongio
Contributor Author

/retest

@fedebongio
Contributor Author

fedebongio commented Jul 14, 2025

@janetkuo, let me paste @derekwaynecarr's question from the doc here:

Derek Carr
5:25 PM Jul 7
I apologize for being out of office during the meeting, but it looks like this proposal has evolved from an earlier form I had seen for networking conformance/benchmarking.

I want to understand more, but my general concern is that the linked document is larger in scope, and it's not clear how a single vendor may or may not be able to assert conformance depending on what it requires as prerequisites in its current form (particularly if it requires multiple vendors to reach a solution). For example, no GPU vendor (let alone a Kubernetes distribution) offers a production DRA driver at this time.

Isn't this inverting the process?  Should a WG be formed to explore what should or should not be done (including if conformance is needed) and then allow vendors (hardware and software) to participate to appropriately determine if the requirements as written could be fulfilled?

For consistency, I'm pointing here to my reply in the doc: https://docs.google.com/document/d/1BlmHq5uPyBUDlppYqAAzslVbAO8hilgjqZUTaNXUhKM/edit?disco=AAABjeq6AqY

@johnbelamaric
Member

I am +1 for this with my SIG Arch hat.


#### Code, Binaries and Services

- The primary artifact will be the Kubernetes AI Conformance specification and a suite of tests to demonstrate conformance.

@fedebongio I guess we should change this, right?

Suggested change (original line, then proposed line)
- The primary artifact will be the Kubernetes AI Conformance specification and a suite of tests to demonstrate conformance.
- The primary artifact will be the CNCF Kubernetes AI Conformance specification and a suite of tests to demonstrate conformance.

Member

We can only name things Kubernetes; for it to be CNCF, it would have to be blessed by CNCF bodies like the TOC and TAGs.

Member

(that can happen later)

Member

@mfahlandt mfahlandt Jul 15, 2025

I think conformance naming is eventually done by the GB, based on the CNCF Charter, sections 5(b)(iv) and 5(d)(viii),

but I agree with @dims that this can happen and be changed later.

@k8s-ci-robot k8s-ci-robot added the sig/testing Categorizes an issue or PR as relevant to SIG Testing. label Jul 15, 2025
@fedebongio fedebongio requested a review from mfahlandt July 15, 2025 17:08
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fedebongio, franciscojavierarceo, janetkuo, mfahlandt

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 17, 2025
@mfahlandt
Member

I think we are good to go to move it to review for steering?

@kubernetes/steering-committee
/assign @kubernetes/steering-committee

Comment on lines +14 to +19
- The primary artifact will be the (working title for now) "CNCF Kubernetes AI Conformance" specification and a suite of tests to demonstrate conformance.

#### Cross-cutting and Externally Facing Processes

- The Working Group will consider its primary problem-solving objective complete upon the successful definition and initial adoption of a stable (working title for now) "CNCF Kubernetes AI Conformance" specification.
- Once the foundational conformance is established and widely recognized, the ongoing maintenance and evolution of the conformance would be evaluated, and could ideally transition to an existing or newly formed Special Interest Group (SIG) with a long-term charter, at which point the Working Group would dissolve.
Member

@aojea aojea Jul 18, 2025

I have some doubts about this paragraph.

The WG definition in https://github.com/kubernetes/community/blob/master/committee-steering/governance/wg-governance.md#is-it-a-working-group-yes-if
clearly indicates that a WG:

It does not own any code

A suite of tests is code, and there are also references to:

and could ideally transition to an existing or newly formed Special Interest Group (SIG) with a long-term charter

So, if this code is in Kubernetes, it has to be owned by a SIG beforehand, since "transition" implies it was owned by someone before, and a WG cannot be the owner, as noted above. We need to clarify this.

The reference to "newly formed Special Interest Group (SIG)" is tendentious and irrelevant to the work and scope of this WG; I do not think it should be here.

#### Cross-cutting and Externally Facing Processes

- The Working Group will consider its primary problem-solving objective complete upon the successful definition and initial adoption of a stable (working title for now) "CNCF Kubernetes AI Conformance" specification.
- Once the foundational conformance is established and widely recognized, the ongoing maintenance and evolution of the conformance would be evaluated, and could ideally transition to an existing or newly formed Special Interest Group (SIG) with a long-term charter, at which point the Working Group would dissolve.
Member

Suggested change (original line, then proposed line)
- Once the foundational conformance is established and widely recognized, the ongoing maintenance and evolution of the conformance would be evaluated, and could ideally transition to an existing or newly formed Special Interest Group (SIG) with a long-term charter, at which point the Working Group would dissolve.
- Once the foundational conformance is established and widely recognized, the ongoing maintenance and evolution of the conformance would be evaluated, and could ideally transition to a Special Interest Group (SIG) with a long-term charter, at which point the Working Group would dissolve.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. committee/steering Denotes an issue or PR intended to be handled by the steering committee. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.