Merged
26 changes: 13 additions & 13 deletions .github/skills/airflow-translations/locales/th.md
@@ -13,8 +13,8 @@ The following technical terms should **remain in English** in Thai translations:

 ### Core Technical Terms (คำศัพท์ทางเทคนิค)

-- **DAG** (Directed Acyclic Graph) - Keep as "DAG"
-- **DAG Run** - Keep as "DAG Run"
+- **Dag** - Keep as "Dag" (Airflow convention; never write "DAG")
+- **Dag Run** - Keep as "Dag Run"
 - **Task Instance** - Keep as "Task Instance"
 - **XCom** - Keep as "XCom"
 - **Asset** - Keep as "Asset"
@@ -24,7 +24,7 @@ The following technical terms should **remain in English** in Thai translations:
 - **Sensor** - Keep as "Sensor"
 - **Hook** - Keep as "Hook"
 - **Operator** - Keep as "Operator" (โอเปอเรเตอร์) or in English
-- **DAGBag** - Keep as "DAGBag"
+- **DagBag** - Keep as "DagBag"

 ### UI Components (ส่วนประกอบของอินเทอร์เฟซ)

@@ -123,14 +123,14 @@ In Airflow UI and messages, numerals are typically formatted as:
 Example:

 ```text
-งาน DAG รันสำเร็จ (DAG run successful)
+งาน Dag รันสำเร็จ (Dag run successful)
 ```

 ## Translation Style Guidelines

 ### 1. Technical Terminology

-Keep technical terms like DAG, XCom, Operator in English when:
+Keep technical terms like Dag, XCom, Operator in English when:

 - They appear in code or configuration examples
 - No clear Thai equivalent exists
@@ -181,8 +181,8 @@ Prefer translation for:

 ### 1. "Run" Context

-- "Run DAG" → "รัน DAG" or "ดำเนินการ DAG"
-- "DAG run" (noun) → "การรัน DAG" or "DAG Run"
+- "Run Dag" → "รัน Dag" or "ดำเนินการ Dag"
+- "Dag run" (noun) → "การรัน Dag" or "Dag Run"
 - "Run ID" → "รันไอดี" or "Run ID"

 ### 2. "Task" Context
@@ -191,11 +191,11 @@ Prefer translation for:
 - "Task instance" → "Task Instance" or "อินสแตนซ์งาน"
 - "Task ID" → "Task ID" or "ไอดีงาน"

-### 3. "DAG" Context
+### 3. "Dag" Context

-- "DAG run" → "การรัน DAG" or "DAG Run"
-- "DAG ID" → "DAG ID" or "ไอดี DAG"
-- "Sub DAG" → "DAG ย่อย" or "Sub DAG"
+- "Dag run" → "การรัน Dag" or "Dag Run"
+- "Dag ID" → "Dag ID" or "ไอดี Dag"
+- "Sub Dag" → "Dag ย่อย" or "Sub Dag"

 ### 4. Configuration

@@ -252,14 +252,14 @@ However, these are typically omitted in technical documentation to maintain conc
 "Tree View" → "มุมมองต้นไม้" or "Tree View"
 "Graph View" → "มุมมองกราฟ" or "Graph View"
 "Task Instances" → "Task Instances" or "อินสแตนซ์งาน"
-"DAG Runs" → "DAG Runs" or "การรัน DAG"
+"Dag Runs" → "Dag Runs" or "การรัน Dag"
 ```

 ### Message Examples

 ```text
 "Task failed" → "งานล้มเหลว"
-"DAG run successful" → "การรัน DAG สำเร็จ"
+"Dag run successful" → "การรัน Dag สำเร็จ"
 "XCom pushed" → "ดัน XCom แล้ว" or "XCom pushed"
 "Connection test failed" → "การทดสอบการเชื่อมต่อล้มเหลว"
 ```
15 changes: 15 additions & 0 deletions AGENTS.md
@@ -3,6 +3,21 @@

# AGENTS instructions

## Naming

Write **Dag** (title case) in all prose. Keep the all-caps or lowercase
spelling only when reproducing a literal code token — never rewrite these,
even inside fenced code blocks:

- Python: the SDK class `DAG` (`from airflow.sdk import DAG`,
`dag = DAG("my_dag", ...)`); identifiers like `dag_id`, `dag`, `my_dag`.
- CLI: `airflow dags list`, `airflow dags test`, etc.
- Paths and config keys: `dag_processing/`, `dagprocessor`, `get_dag`, etc.
- Anti-pattern quotes that show the wrong form to teach the rule itself
(e.g., `Use "DAG" — always write "Dag"`).

Don't spell out **Directed Acyclic Graph** except for historical context.
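The naming rule above could be checked mechanically. The following is a minimal sketch, assuming Markdown input in which literal code tokens are always wrapped in backticks or fenced blocks; `find_all_caps_dag` is a hypothetical helper, not an existing Airflow check, and it does not understand unquoted anti-pattern examples.

```python
import re

# Fenced code blocks and inline code spans hold literal tokens and are skipped.
FENCED_BLOCK = re.compile(r"```.*?```", re.DOTALL)
INLINE_CODE = re.compile(r"`[^`\n]*`")
# All-caps "DAG" or "DAGs" as a standalone word in prose.
ALL_CAPS_DAG = re.compile(r"\bDAGs?\b")


def find_all_caps_dag(markdown: str) -> list[str]:
    """Return all-caps spellings found in prose (hypothetical helper)."""
    prose = INLINE_CODE.sub("", FENCED_BLOCK.sub("", markdown))
    return ALL_CAPS_DAG.findall(prose)


sample = (
    "Each DAG has a `dag_id`. Import it with `from airflow.sdk import DAG`.\n"
    "```python\n"
    'dag = DAG("my_dag")\n'
    "```\n"
    "A Dag run starts on schedule.\n"
)
matches = find_all_caps_dag(sample)
```

In this sample only the first prose occurrence is reported; the import statement, the fenced block, and the title-case "Dag" are left alone.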

## Environment Setup

- Install prek: `uv tool install prek`
88 changes: 88 additions & 0 deletions providers/AGENTS.md
@@ -17,3 +17,91 @@ Each provider is an independent package with its own `pyproject.toml`, tests, an
- Don't upper-bound dependencies by default; add limits only with justification.
- Tests live alongside the provider — mirror source paths in test directories.
- Full guide: [`contributing-docs/12_provider_distributions.rst`](../contributing-docs/12_provider_distributions.rst)

## Changelog — never use newsfragments

**Never create newsfragments for providers.** Providers are released from `main` in waves, so
per-PR newsfragments are not consumed by the release process — the release manager regenerates the
changelog from `git log`. The towncrier-managed `newsfragments/` workflow is used only by
`airflow-core/`, `chart/`, and `dev/mypy/`. (`airflow-ctl/` follows the same "no newsfragments,
direct edit" pattern as providers — see `dev/README_RELEASE_AIRFLOWCTL.md`.)

When a provider change needs a user-visible note (typically a breaking change or important behavior
change that warrants explanation), update the provider's `docs/changelog.rst` directly in the same
PR, just below the `Changelog` header — exactly as the in-file `NOTE TO CONTRIBUTORS` block
describes. Routine entries (features, bug fixes, misc) are collected automatically by the release
manager from commit messages, so most PRs do not need to touch the changelog at all.

## Security: Connection extras must not be forwarded blindly to hooks/operators

**Never pass the whole `Connection.extra` dict (or `**conn.extra_dejson`) as
keyword arguments to a hook, operator, client constructor, or any underlying
library call.** Forward only the specific extra keys you have explicitly
reviewed and know are safe to expose.

### Why this matters — different security boundaries

Airflow has two distinct user roles with different trust levels:

- **Connection editors** — UI/API users with permission to create or edit
Connections. They control the host, login, password, and the `extra` JSON
blob.
- **Dag authors** — users who write the Python code that constructs hooks
and operators. They control which arguments are passed at call sites.

These are *not* the same population, and the security model treats them
differently. A Connection editor is trusted to supply credentials for a target
system; they are **not** trusted to alter how the worker process behaves, load
arbitrary Python code, change file paths the worker reads, or pass options
into client libraries that the Dag author did not opt into.

### What goes wrong when extras are forwarded blindly

Many client libraries accept constructor or method kwargs that are dangerous
when attacker-controlled. Concrete examples seen in the wild:

- A kwarg that names a Python callable, plugin path, or import string —
attacker sets it to a module they control → **remote code execution** on
the worker.
- A kwarg that points at a local file (cert path, key file, log path,
config file) — attacker redirects it to read or overwrite worker-side
files.
- A kwarg that sets a proxy, endpoint URL, or hostname — attacker
redirects traffic to an MITM endpoint and harvests credentials or
tokens for *other* systems.
- A kwarg that toggles TLS verification, signing, or auth — attacker
silently downgrades security.
- A kwarg that controls subprocess execution, shell invocation, or
template rendering — attacker reaches command/template injection.

In all of these, the Connection editor effectively gains capabilities
(RCE, file read/write, traffic redirection, auth bypass) that the security
model does not grant them.
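To make the failure mode concrete, here is a small illustration of the anti-pattern. It assumes a made-up `FakeClient` class standing in for a third-party library and a plain dict standing in for `conn.extra_dejson`; neither name comes from a real provider.

```python
from typing import Any


class FakeClient:
    """Hypothetical stand-in for a third-party client; not a real library class."""

    def __init__(
        self,
        endpoint_url: str = "https://api.example.com",
        verify: bool = True,
    ) -> None:
        self.endpoint_url = endpoint_url
        self.verify = verify


# Stand-in for conn.extra_dejson as written by a Connection editor.
editor_controlled_extra: dict[str, Any] = {
    "endpoint_url": "https://attacker.example",
    "verify": False,
}

# Anti-pattern: every extra key silently becomes a client kwarg.
client = FakeClient(**editor_controlled_extra)
```

After this call the client talks to the editor-chosen endpoint with TLS verification off, even though the Dag author never opted into either setting.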

### The rule

- **Allowlist, never passthrough.** In the hook, read each extra key by
name (`conn.extra_dejson.get("region_name")`,
`conn.extra_dejson.get("verify")`, …) and pass only those named values
forward. Reject or ignore unknown keys.
- **Do not** write `SomeClient(**conn.extra_dejson)`,
`hook = MyHook(**conn.extra_dejson)`, or
`kwargs.update(conn.extra_dejson)` followed by a downstream call.
- When adding support for a new extra key, treat it like any other public
argument: review what the underlying library does with it, and document
it in the provider's connection docs.
- If a Dag author genuinely needs to pass a non-allowlisted option, that
option should be a **Dag-author-supplied argument** on the operator or
hook (with its own review), not something a Connection editor can set.
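The allowlist rule above might look roughly like this in a hook. This is a sketch under stated assumptions: `ALLOWED_EXTRA_KEYS` and `build_client_kwargs` are hypothetical names, the `extra` argument stands in for `conn.extra_dejson`, and rejecting unknown keys is only one of the two options the rule permits (ignoring them is the other).

```python
from typing import Any

# Hypothetical allowlist for an imaginary provider; each real hook defines its own
# after reviewing what the underlying library does with every key.
ALLOWED_EXTRA_KEYS = frozenset({"region_name", "verify"})


def build_client_kwargs(extra: dict[str, Any]) -> dict[str, Any]:
    """Return only reviewed extra keys; `extra` stands in for conn.extra_dejson."""
    unknown = sorted(set(extra) - ALLOWED_EXTRA_KEYS)
    if unknown:
        # Rejecting is one choice; a hook could log and ignore unknown keys instead.
        raise ValueError(f"Unsupported connection extra keys: {unknown}")
    # Only allowlisted names are ever forwarded to the client.
    return {key: extra[key] for key in ALLOWED_EXTRA_KEYS if key in extra}


safe_kwargs = build_client_kwargs({"region_name": "eu-west-1"})
```

Because the allowlist lives next to the hook code, adding a key becomes a reviewed code change rather than something a Connection editor can do from the UI.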

### When reviewing provider PRs

Flag any of these patterns:

- `**conn.extra_dejson` or `**self.extra_dejson` spread into a constructor
or call.
- Looping over `conn.extra_dejson.items()` and forwarding every key.
- New code paths where extras are merged into `kwargs` and then passed on
unfiltered.
- "Convenience" features that let users put arbitrary client kwargs into
`extra` — they widen the Connection-editor blast radius.
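A reviewer could pre-screen a change for the first three patterns with a rough text scan. The sketch below is a heuristic only: `find_blind_forwarding` and its regexes are assumptions made for illustration, not an existing Airflow tool, and they will miss aliased variables and flag some harmless lines.

```python
import re

# Heuristic patterns; hypothetical and intentionally simple.
SUSPICIOUS_PATTERNS = [
    re.compile(r"\*\*\s*\w+\.extra_dejson\b"),           # **conn.extra_dejson spread
    re.compile(r"\.extra_dejson\.items\(\)"),            # looping over every key
    re.compile(r"\.update\(\s*\w+\.extra_dejson\s*\)"),  # merging extras into kwargs
]


def find_blind_forwarding(source: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) pairs that match a suspicious pattern."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if any(pattern.search(line) for pattern in SUSPICIOUS_PATTERNS):
            hits.append((lineno, line.strip()))
    return hits


snippet = (
    "client = SomeClient(**conn.extra_dejson)\n"
    'region = conn.extra_dejson.get("region_name")\n'
    "kwargs.update(conn.extra_dejson)\n"
)
hits = find_blind_forwarding(snippet)
```

The named lookup on the second line is not flagged, which matches the allowlist rule; anything the scan does report still needs a human to confirm where the values end up.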