docs(principles): add operational principles document by andreahlert · Pull Request #147 · apache/airflow-steward

andreahlert · 2026-05-13T17:48:07Z

Summary

Adds PRINCIPLES.md at the repo root. Proposed, not landed-as-final. The whole point of the PR is to bike-shed the principles themselves before they bind anyone.

Motivation

RFC-AI-0004 sets the six baseline principles every adopter signs. It is deliberately minimal because it has to travel: anything more specific would make the contract harder for non-Steward projects to take on.

The framework itself needs a tighter ruler. Skills are the unit of authorship here, and skills are subjective by construction. The same skill catalogue will end up touching PR triage, security-report handling, release artefacts, mailing-list drafts, contributor mentoring. Those surfaces have very different blast radius, and very different trust requirements.

When one markdown file can reach that many places, "the RFC permits it" is too coarse a check at PR-review time. Reviewers need something finer they can point at: which commitments block a release, what evidence promotes a mode, when telemetry is allowed at all, what an auditable agent action looks like.

This document is that ruler. It restates the six baseline principles in their operational shape, and adds the project-internal commitments the RFC leaves out on purpose: eval as a release blocker, contributor-sentiment gating on mode promotion, no default telemetry, reproducible releases from signed source, maintainer education shipped alongside the code.

What's inside

19 ordered principles. Earlier ones outrank later ones when they collide. A PR or skill that violates a principle is wrong even if every test passes, and any committer can block on principle grounds until the change complies, or until an amendment carries through governance.

The doc header positions this explicitly as built on top of RFC-AI-0004, not as a competing RFC.

Happy to drop, fold, or rewrite anything based on the thread.

PRINCIPLES.md restates RFC-AI-0004's six baseline principles in their operational shape and adds the project-internal commitments the RFC deliberately defers: eval as release blocker, contributor-sentiment gating, no default telemetry, reproducibility from signed source, maintainer education shipped with the platform. 19 ordered principles. Earlier outranks later when they collide. Amendment process matches the release-vote process (>=3 binding +1, no binding -1, 72h window, no lazy consensus). Positioned as project-internal operating contract, not a competing RFC.
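As a rough illustration of the amendment rule quoted above (at least three binding +1, no binding -1, a 72-hour window, no lazy consensus), the tally could be sketched as follows. The `Vote` type and the `amendment_carries` function are hypothetical names made up for this sketch and are not part of the PR. Treating the 72h window as a minimum time the vote stays open is also an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Vote:
    voter: str
    value: int     # +1, 0, or -1
    binding: bool  # True for a PMC member's vote


def amendment_carries(votes, opened_at, now, window=timedelta(hours=72)):
    """Return True if the amendment vote passes under the rule above."""
    # Assumption: the vote must stay open for the full window before closing.
    if now - opened_at < window:
        return False
    binding = [v for v in votes if v.binding]
    plus = sum(1 for v in binding if v.value == 1)
    minus = sum(1 for v in binding if v.value == -1)
    # No lazy consensus: silence is never counted as a +1.
    return plus >= 3 and minus == 0
```

Non-binding votes are ignored by the tally, so a committer's -1 in this sketch does not sink the amendment; only a binding -1 does.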

potiuk · 2026-05-14T10:01:24Z

+
+Automation rolls out in order of reversibility and blast radius:
+
+- Read-only suggestions and conversational help before agent-drafted artefacts.


I would also add (we will need it to optimize things) - deterministic checks with executable code as first pass - or something like that. While the whole orchestration is done in an agentic way, as much as can be done deterministically by python/groovy scripts - we should do it first to save on token usage

Picked this up but it landed in principle 5 instead of here, the block you commented on got split out into principle 4.

The deterministic-first bit is the last line of 5: "Where a deterministic check (script, linter, schema validation) can replace an LLM pass, it runs first; LLM passes are not spent on what executable code already decides"

Covers the token saving. one thing I narrowed though, it only says deterministic check where it can replace an LLM pass. your comment read broader, more like do everything you can in python/groovy before the agentic orchestration even starts. If you meant the wider scope I can reword 5 or add it back to 4.
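To make the narrower reading concrete, here is a minimal sketch of "deterministic check runs first, LLM pass only for what is left". Everything named here (`has_license_header`, `within_line_limit`, `review`, the `llm_pass` callable) is hypothetical and only illustrates the ordering; it is not how the framework implements principle 5.

```python
from typing import Callable

Check = Callable[[str], bool]


def has_license_header(text: str) -> bool:
    """Hypothetical deterministic check: plain substring match."""
    return "Licensed to the Apache Software Foundation" in text


def within_line_limit(text: str) -> bool:
    """Hypothetical deterministic check: file stays at or under 500 lines."""
    return len(text.splitlines()) <= 500


def review(text: str, checks: list[Check], llm_pass: Callable[[str], bool]) -> dict:
    """Run deterministic checks first; call the LLM only if they all pass."""
    failed = [check.__name__ for check in checks if not check(text)]
    if failed:
        # Executable code already decided: no tokens are spent.
        return {"ok": False, "failed": failed, "llm_called": False}
    return {"ok": llm_pass(text), "failed": [], "llm_called": True}
```

The wider scope from the original comment would move this gate in front of the orchestration itself rather than in front of a single LLM pass; the ordering logic stays the same.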

potiuk · 2026-05-14T10:16:16Z

+
+## 18. Maintainer education ships with the platform
+
+Most maintainers have never built an agentic application. The mental model is different: behaviour is probabilistic, prompts are code, evaluation is harder than testing a function. Every release ships the docs, patterns, eval examples, and workshop material maintainers actually need. A platform without the education stream alongside it is not adoptable, regardless of code quality.


Possibly we could have some chapters here - while keeping the sequencing - splitting the principles into "general", "maintainer", "user", "authorship" sections (proposal - not sure if this is a good split) - I find some of those principles more important than some of those above, but in a different context. For example licence is one of the most important for the "authorship" section, while releases are important for the "maintainer" section... I think this would also make a nice "frame" for the skills we will have - which will also generally fall into one or more of those categories.

It might complicate the document though, so probably we have to think and discuss a bit more how to approach it.

you're right that we need to clarify more, and I believe that docs folder needs better grouping, I feel that gap too. I'm actually working on it now, PR should land by end of the week. It'll be closer to a detailed CONTRIBUTING, with the rules and the stuff that matters, and there I can absolutely split things by audience and context.

but I'd keep PRINCIPLES.md out of that split. the way I see the layering: rules can flex along the PMC > committer > contributor chain, that's where context and seniority play in. Principles are the layer underneath, the solid indisputable base everyone shares. that's why even a committer can block a PMC's PR if they flag a principle was broken. if I start chaptering principles by role, that hierarchy stops working, because suddenly a principle "belongs" to one group and the others feel out of place calling it out.

so my plan is: keep PRINCIPLES.md short, flat, ordered, readable by agents and humans in one pass. put the role-aware structure in the contributing rework that's coming. happy to ping you on that PR when it's up so we can shape the chapters there together.

btw, full disclosure, this is my opinion as the parent of the child here 😄 .

I'm building a small framework around using principles as a moral ruler for repos, so I'm naturally biased toward keeping it tight. totally open to working out whatever fits the project best and hearing what other PMCs think on this.

No.. I like the simplicity. It's maybe a far-fetched comparison but it's almost like 10 commandments -> there is no "ifs" and "buts" ... but also 10 commandments are not prioritized (1st is more important than 2nd).

I think maybe then prioritisation according to sequence is not needed? I am not sure if those principles might contradict each other - sometimes they can possibly - however, I think prioritisation according to sequential number is not a good idea?

I think - eventually - all those principles are important and if we have conflict - we should not decide on the sequential number - but possibly we should surface the conflict to the human and let the human decide?

If we do that - then we can leave the simplicity and conditionality, and we will optimistically assume decisions are non-conflicting, but when they are - we should let the human make the decision.

funny thing, I always thought the commandments were ordered, at least 1 lol. "no other gods before me" always read to me like it outranked the rest. but ok, point taken on the no-ifs-and-buts spirit.

Here's why I'd still keep the order. the difference between this doc and the commandments is who reads it. commandments are read by humans who can sit with ambiguity and judge.

PRINCIPLES.md is read by agents too, and an agent that hits a conflict with no tiebreaker doesn't pause and reflect, it stalls or picks one at random. the numeric order gives it a deterministic line of reasoning: two principles collide, lower number wins, move on. that's what keeps the agent coherent instead of guessing.

And we already prioritize, just not globally. P1 says security ships before features. P4's whole ladder is ordered (read-only > drafts > state-change > merge). so the doc isn't order-free today, it has local orderings and no global one, which is the actually inconsistent state imo.

I'd keep "surface to human", but as the last resort not the default. if the conflict is two readings of the same principle, or something the order genuinely can't separate, sure, kick it to a human (for the case when a committer is judging). But making human escalation the default conflict path turns every ambiguous call into a synchronous human gate, which fights the automation ladder we just wrote two principles down.

So my take: numeric order stays as the agent's deterministic tiebreaker, human escalation covers what the order can't resolve. that also kills justinmclean's "families" gap, the order is the family.
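A minimal sketch of that tiebreaker, assuming principles are identified by their number: the lower number wins, and a collision the order cannot separate (two readings of the same principle) returns `None` to signal human escalation. The `resolve` function is a hypothetical name for illustration only.

```python
from typing import Optional


def resolve(a: int, b: int) -> Optional[int]:
    """Return the winning principle number, or None to escalate to a human."""
    if a == b:
        # Same principle, two readings: the numeric order cannot separate them.
        return None
    # Earlier principles outrank later ones.
    return min(a, b)
```

For example, `resolve(6, 4)` picks principle 4, while `resolve(5, 5)` returns `None` and the call goes to a human.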

WDYT? Would be great to have more opinions on this.

Yep. Would love to hear from others :).

- Replace SPDX with full ASF v2 license header (jbonofre)
- Clarify binding audience: contributors, committers, PMC, unmodified adopters (jbonofre)
- Extend apache#5 with deterministic-first execution to save tokens (potiuk)
- Extend apache#6 with explicit human sign-off for outbound human communication (RussellSpitzer)
- Rework apache#9 around capability floor instead of "same code on all backends", add justified-and-minimized clause, add end-to-end single-machine config requirement (RussellSpitzer)
- Standardize on US English (analyze, artifact, behavior, catalog, license, specialized)

potiuk · 2026-05-15T14:14:12Z

Turning to draft to avoid accidental merge

…-language independent' (RussellSpitzer)

justinmclean · 2026-05-16T09:08:36Z

+## 6. The human is always in the loop, until they choose otherwise
+
+Every agent-authored output (comment, label, draft, issue, PR) is a proposal a human signs off on. The agent never merges its own work. Auto-merge, where it exists, is narrow, opt-in per project AND per change class, and never touches security-class changes. **The opt-out never extends to communication aimed at a human: any outbound message a person will read as if a maintainer wrote it (reporter mail, PR or issue comment, review reply, mailing-list post, mentoring message) requires explicit human sign-off, regardless of mode.** Agent-sent prose is impersonation, and impersonation never graduates to an auto-mode.
+


P6 has an internal contradiction, and a one-word fix resolves it.
"The agent never merges its own work" is immediately followed by "Auto-merge, where it exists..." — but if the agent never merges its own work, auto-merge can't exist by definition. The two sentences contradict each other within the same principle.

Suggested fix: "The agent never directly initiates a merge of its own work."

Good catch but I don't think it's a contradiction, it's the verb "merges" being ambiguous. "the agent merges its own work" can mean the agent is the one doing the merge, or just that the agent's work gets merged. P6 means the first. auto-merge is the second, and there the merger is the platform + project policy, not the agent. so auto-merge can exist fine without the agent ever being the one merging.

directly initiates works but it leaves a gap, the agent could still enable auto-merge on its own PR and that's just indirect initiation. P6 already says auto-merge is opt-in per project and per change class, so that should be a policy decision, not something the agent flips on itself.

Went with unilaterally: "the agent never merges its own work unilaterally". auto-merge isn't unilateral by definition, it needs the project opt-in, so that closes the gap too. updated P6 in the PR.

WDYT?

Good reasoning, but I think the real issue is whether an agent reading this sees a contradiction, not whether a human does.

A human can resolve the ambiguity of "merges" from context and intent. An agent processing this as a behavioural rule may not. If it reads "The agent never merges its own work" in the passive sense (agent work never gets merged) then the auto-merge carve-out directly contradicts it. If it reads it in the active sense (the agent never presses merge) there's no contradiction. The text doesn't signal which reading is intended, so the agent may resolve it inconsistently or incorrectly.

"Unilaterally" nudges toward the active reading, which helps. But since this document governs agent behaviour, making the subject explicit would remove the ambiguity entirely:

"The agent never performs a merge of its own work."

That leaves no room for misreading, and the auto-merge sentence follows cleanly.

What do you think?

justinmclean · 2026-05-16T09:10:32Z

+## 3. Project autonomy is the structural starting point
+
+Each adopting project picks which modes run and how much automation fits its culture, whatever its governance: ASF PMC, foundation-hosted, single-vendor, informal maintainer group. The framework offers a range, never mandates a level. Non-ASF adopters are first-class citizens. Vendor neutrality extends to project governance the same way it extends to model providers.
+


P3 claims that non-ASF adopters are first-class citizens, but the amendment process structurally excludes them.
The amendment section requires threads on [email protected] and [email protected], as well as binding votes from PMC members. A non-ASF project has none of these. There's no stated path for a non-ASF adopter to propose or ratify a principle change.

So we'd need to either extend the amendment section with an equivalent process for non-ASF adopters or scope the claim more honestly.

I think this is a wording problem more than a structural one. P3 is about adoption autonomy, "first-class" there means a non-ASF project gets every mode and skill without being a second-tier user. it was never claiming a vote on the framework's own governance. apache-steward is an ASF project, so amending its own PRINCIPLES.md goes through its PMC, same as any upstream's governance doc. a downstream consumer doesn't get a binding vote upstream, that's normal.

Small correction on the proposal path: an amendment is a PR against the file, anyone can open that. what a non-ASF adopter can't do is cast a binding vote to ratify. so it's the ratification that's ASF-only, not the proposal.

The "equivalent process for non-ASF adopters" option doesn't really work, you can't bolt a non-ASF ratification track onto an ASF project's governance doc, binding votes belong to the PMC by ASF rules. best you could do is a consultation channel.

So I went with scoping the claim. changed it to "first-class adopters, not a compatibility afterthought" and added a line to the amendment section: anyone can propose via PR, ratification is the apache-steward PMC's, and adopters who need a principle to read differently locally use overrides (P13) instead of amending the file. the doc already binds adopters only "to the extent they consume the framework unmodified", so it's a voluntary and overridable binding, not a trap.

WDYT?

@potiuk could you help us with extra eyes and opinion?

The proposal/ratification split resolves it cleanly. Anyone can open a PR, binding votes stay with the PMC, that's accurate and easy to follow.

"First-class adopters, not a compatibility afterthought" is better wording too.

justinmclean · 2026-05-16T09:13:05Z

+These principles regulate what this framework is and how it evolves. Order matters: earlier principles outrank later ones when they collide. Within the same family, the stricter reading wins until governance documents otherwise.
+
+A change (PR, skill, tool adapter, release) that violates a principle is wrong even if every test passes. Any committer may block it on principle grounds. The block lifts when the change complies, or when a principle-amendment proposal carries through governance with the same weight as a release vote.
+


Committer blocks have no resolution path.

The preamble gives any committer the power to block a change on principle, but the amendment process grants binding votes only to PMC members. In ASF governance, these are distinct roles.

So a committer can raise a block, but has no vote in resolving it — either the block is toothless (PMC overrides it anyway) or it's a deadlock with no exit.

Suggestion: "any committer may raise a principle objection" (non-binding, for PMC consideration)

There's a real gap here but I think it's a different one than "committer has no vote". the block already has a vote-free exit: the change complies. author fixes it to not violate the principle and the block is gone, nobody votes. the missing case is when the author and the committer disagree on whether it actually complies. that's the hole, no adjudicator for a contested block.

The "PMC overrides it anyway" branch assumes a PMC override that the doc never grants. there's no override clause anywhere. so it's not toothless-or-deadlock, it's just the one deadlock when compliance is disputed.

I'd rather not demote the block to a non-binding objection. a committer being able to block on principle, even against a PMC member's PR, is intentional, principles are the shared base and that's the one place seniority doesn't buy a pass. making it "for PMC consideration" guts that.

so I kept the block binding and added a third resolution path instead: when the author and the blocking committer disagree on whether the change complies, a PMC vote settles whether the principle is violated. the committer doesn't need a vote to raise the block, they need the dispute to have an exit. raising the objection and adjudicating it are different roles, the committer does the first, the PMC does the second. updated the preamble in the PR.

@justinmclean, @potiuk and other PMC folks, would like your read on this one specifically. the committer-can-block-a-PMC-PR bit is a deliberate design choice and I want to make sure the resolution path feels right before it sets in.

The adjudicator/objector split is the right framing and the third resolution path closes the gap.

The part that feels off is jumping straight to a PMC vote to settle a disputed block. ASF values consensus over voting, a vote is a last resort when consensus fails, not a first-line dispute mechanism. Framing it as "PMC vote settles it" sets the wrong default.

Better to have the disputed block trigger a consensus-seeking step first, discussion on the PR, maybe a lazy consensus window, with a formal vote only if consensus genuinely can't be reached.

justinmclean · 2026-05-16T09:15:52Z

+## 3. Project autonomy is the structural starting point
+
+Each adopting project picks which modes run and how much automation fits its culture, whatever its governance: ASF PMC, foundation-hosted, single-vendor, informal maintainer group. The framework offers a range, never mandates a level. Non-ASF adopters are first-class citizens. Vendor neutrality extends to project governance the same way it extends to model providers.
+


"Agent-sent prose is impersonation" overclaims.

The clause requiring human sign-off on outbound messages is solid. But the next sentence — "Agent-sent prose is impersonation" — is too absolute. A clearly attributed, human-approved message ("Claude drafted this, maintainer X approved it") isn't impersonation by any ordinary definition.

Agreed, the sentence overclaims. the problem is it drops the qualifier the sentence before it sets up. P6 defines the protected class as a message someone reads "as if a maintainer wrote it". then "agent-sent prose is impersonation" comes out unqualified, so it reads as all agent prose, which isn't right. a clearly labeled bot message that nobody reads as maintainer-authored isn't impersonation, and P6 doesn't even require sign-off for that case since it's outside the protected class.

one nit on your example though, "Claude drafted this, maintainer X approved it" is the human sign-off case, which P6 already allows explicitly. so that one's fine regardless. the actual overclaim is the labeled-bot-message case, no sign-off, not passing as a maintainer.

fix I'd go with is to tie impersonation back to the qualifier instead of dropping it: "Sending such prose without that sign-off is impersonation, and impersonation never graduates to an auto-mode." keeps the point, stops it from sweeping in labeled bot output.

WDYT?

Agreed, tying impersonation back to the qualifier cleanly fixes the overclaim. "Sending such prose without that sign-off is impersonation" preserves the intent without sweeping in labeled bot output that nobody would read as maintainer-authored.

Worth noting that the ASF generative tooling guidance already has a labeling norm, recommending contributors include a "Generated-by:" token in commit messages for AI-assisted content. P6 sits comfortably alongside that: labeled, disclosed output is the normal case, and the sign-off requirement kicks in specifically when the output is intended to be read as if a person wrote it.

The proposed wording handles that distinction well.

justinmclean · 2026-05-16T09:16:42Z

+
+# Apache Steward Design Principles
+
+These principles regulate what this framework is and how it evolves. Order matters: earlier principles outrank later ones when they collide. Within the same family, the stricter reading wins until governance documents otherwise.


The preamble says "Within the same family, the stricter reading wins" — but the principles aren't organized into families anywhere in the document. That clause has no referent and can't be applied as written.

Good catch.

I'm not going to add families, the flat ordered list is deliberate and I want to keep it readable in one pass. instead I reworded the clause to say what it was meant to say: "Where a single principle admits more than one reading, the stricter reading wins until governance documents otherwise." updated in the PR.

Just to add: what "stricter reading wins" is actually for is interpreting a single principle when it can be read loose or strict, take the strict one. the word "family" was a leftover from an earlier draft and never meant grouping.

The rewording is clear and says exactly what it means. No objection.

…auto-merge tension (justinmclean)

…t proposal path (justinmclean)

…ocks (justinmclean)

…intainer-authored (justinmclean)

…rinciple interpretation rule (justinmclean)

A skill is always a directory with SKILL.md as entrypoint, even for one-file workflows. SKILL.md stays under 500 lines; longer reference material moves into sibling markdown linked one level deep. Matches the runtime contract documented at https://code.claude.com/docs/en/skills and https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices, and reflects how skills in this repo (contributor-nomination, pr-management-code-review, pr-management-mentor) are already authored.
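In the spirit of deterministic-first, the layout rule above lends itself to a scripted check. The sketch below is an assumption about how such a check might look, not a tool that exists in this repo: `check_skill` and `MAX_LINES` are hypothetical names, and the "linked one level deep" rule for sibling markdown is deliberately left out.

```python
from pathlib import Path

MAX_LINES = 500  # SKILL.md must stay under this many lines


def check_skill(skill_dir: Path) -> list[str]:
    """Return a list of layout problems; an empty list means the skill passes."""
    if not skill_dir.is_dir():
        return [f"{skill_dir} is not a directory"]
    entry = skill_dir / "SKILL.md"
    if not entry.is_file():
        return [f"{skill_dir.name}: missing SKILL.md entrypoint"]
    problems = []
    line_count = len(entry.read_text(encoding="utf-8").splitlines())
    if line_count >= MAX_LINES:
        problems.append(
            f"{skill_dir.name}: SKILL.md has {line_count} lines, "
            f"limit is under {MAX_LINES}"
        )
    return problems
```

Running it over each skill directory and failing on any non-empty result would catch a missing entrypoint or an oversized SKILL.md before review.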