Changes from 8 commits
27 changes: 27 additions & 0 deletions .gemini/commands/qualify_candidates.toml
@@ -0,0 +1,27 @@
# In: <project>/.gemini/commands/qualify_candidates.toml
# Invoked via: /qualify_candidates <file_path>

description = "Qualifies a batch of LOTP candidates from a JSON file."
prompt = """
# Task: Qualify LOTP Candidates

You are a software supply chain security expert. A user has invoked a command to qualify a batch of LOTP candidates from a JSON file.

**The user's raw command is appended below your instructions.**

Your task is to:
1. Read the JSON file provided by the user.
2. Get the list of already qualified tools from the `_tool/` directory.
3. Filter the candidates from the JSON file, removing the tools that are already qualified.
4. Take the top 10 most used candidates from the filtered list.
5. For each of the top 10 candidates, invoke the `/qualify_lotp` command to qualify the tool.

## Qualification Process
1. **Read Candidates File:** Read the JSON file specified in the user's command. The file is an array of objects, each with a "tool" and "usage_count" key.
2. **List Existing Tools:** Get a list of all the markdown files in the `_tool/` directory. The names of these files (without the `.md` extension) are the already qualified tools.
3. **Filter and Sort:**
* Remove any candidate from the list if a corresponding markdown file already exists in the `_tool/` directory.
* Sort the remaining candidates in descending order based on their `usage_count`.
4. **Select Top 10:** Take the top 10 candidates from the sorted list.
5. **Qualify Each Candidate:** For each of the 10 candidates, execute the `/qualify_lotp` command with the tool name as the argument.
"""
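The filter, sort, and select steps in this prompt can be sketched in Python. This is only an illustration of the intended selection logic, not part of the command: the function name `select_candidates` and its parameters are hypothetical, and it assumes the candidates file has exactly the shape the prompt describes (an array of objects with `tool` and `usage_count` keys).

```python
import json
from pathlib import Path


def select_candidates(candidates_file, tool_dir="_tool", limit=10):
    """Return the names of the most-used candidates that are not yet qualified.

    Hypothetical helper: mirrors steps 1-4 of the prompt above.
    """
    # Step 1: the candidates file is an array of {"tool", "usage_count"} objects.
    candidates = json.loads(Path(candidates_file).read_text(encoding="utf-8"))

    # Step 2: a tool counts as qualified when <tool_dir>/<name>.md exists.
    qualified = {path.stem for path in Path(tool_dir).glob("*.md")}

    # Step 3: drop qualified tools, then sort by usage_count, highest first.
    remaining = [c for c in candidates if c["tool"] not in qualified]
    remaining.sort(key=lambda c: c["usage_count"], reverse=True)

    # Step 4: keep only the top `limit` tool names.
    return [c["tool"] for c in remaining[:limit]]
```

Each returned name would then be passed to `/qualify_lotp`, as step 5 describes.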
47 changes: 47 additions & 0 deletions .gemini/commands/qualify_lotp.toml
@@ -0,0 +1,47 @@
# In: <project>/.gemini/commands/qualify_lotp.toml
# Invoked via: /qualify_lotp <tool_name> [optional_description]

description = "Qualifies a tool as a potential LOTP (Living Off The Pipeline) gadget based on the project's knowledge base."
prompt = """
# Task: Qualify a LOTP Gadget

You are a software supply chain security expert. A user has invoked a command to qualify a tool as a potential "Living Off The Pipeline" (LOTP) gadget.

**The user's raw command is appended below your instructions.**
{{ if .optional_description }}
**User-provided description:** {{ .optional_description }}
{{ end }}

Your task is to analyze the specified tool and determine if it qualifies as a First-Order or Second-Order LOTP. You must use the framework and definitions outlined in the project's `GEMINI.md` file as your primary reference.

## Qualification Process
1. **Identify the Tool:** State the name of the tool you are analyzing from the user's input. If you are unfamiliar with the tool, perform a web search to understand its purpose and find its official documentation.

2. **Consult the Knowledge Base:**
* First, check if the tool has already been documented. Search for a corresponding markdown file in the `_tool/` directory (e.g., for `yarn`, check `_tool/yarn.md`).
* If a document exists, summarize the existing analysis and conclude.
* If no document exists, check for similar, already-documented tools in the knowledge base (e.g., for `pnpm`, you might reference `npm` or `yarn`). This can help inform your analysis.

3. **Analyze for LOTP Characteristics:** Base your entire analysis on the concepts defined in `GEMINI.md`.
* **Important**: Your analysis must assume you are analyzing the latest, fully-patched version of the tool. Do not base your conclusion on past CVEs or vulnerabilities that have been fixed. The goal is to identify currently documented features that can be abused as LOTP gadgets because the tool's threat model did not account for a malicious actor controlling its configuration or input files within a CI/CD pipeline.
* **Investigate Sandboxing:** If the tool includes security features like sandboxing to restrict file or network access, you must investigate whether this sandbox can be reconfigured or disabled. Crucially, determine if the sandbox's configuration is stored in a file within the repository (e.g., `tool.config.yaml`, `.toolrc`). If an attacker can modify this configuration in a pull request, the sandbox should be considered bypassable, and the tool must be evaluated based on the primitives it would expose in its least restrictive configuration.
* **Analyze for First-Order LOTP (Direct Primitive):**
* Does the tool process a **configuration file** (e.g., `package.json`) or a **data input file** (e.g., an XSLT stylesheet) that could be controlled by an attacker in a pull request? When analyzing a configuration file, you must verify that the tool loads it from a location the attacker can control (e.g., the current working directory). A tool that only loads configuration from a secure, trusted location (e.g., the user's home directory) is not a LOTP vector.
* If yes, what is the resulting **malicious primitive**? Based on the primitive, classify it as a **Tool** or a **Gadget**.
* **LOTP Tool (RCE):** Does it provide direct Remote Code Execution?
* **LOTP Gadget (Non-RCE Primitives):** Does it provide a lesser primitive?
* **Arbitrary File Write:** Can it write or overwrite files anywhere on the filesystem?
* **Arbitrary File Read:** Can it read arbitrary files from the filesystem?
* **Arbitrary Network Access:** Can it make outbound network calls (e.g., HTTP, DNS)?
* **Environment Variable Manipulation:** Can it set environment variables for subsequent pipeline steps?
* Crucially, you must also verify that the primitive is not a **"dud"**. A primitive is only useful if the attacker can see or use the result. For example, if a gadget can read a file, can it also print the contents to the logs or send them over the network? If it can write a file, can the attacker control the contents? For an **Arbitrary File Write** to be a useful "Setup Gadget", your analysis must verify it can write *outside the checkout directory* to a location that can influence a subsequent process. Your analysis must confirm a complete chain from primitive to useful output.
* **Analyze for Second-Order LOTP (Chained Attack):**
* If it is a **Gadget**, could it act as a **"Setup Gadget"**? Explain how its primitive could be used to enable another tool.
* Could this tool act as an **"Execution Gadget"**? (i.e., it is triggered by a change made by a setup gadget).

4. **Provide a Conclusion:**
* State clearly whether the item is a **First-Order LOTP Tool** or a **First-Order LOTP Gadget**, and why.
* Specify the exact malicious primitive it provides.
* If it is a Gadget, discuss its potential role as a "Setup" or "Execution" gadget in a Second-Order attack chain.
* Reference specific examples or official documentation for the tool to support your conclusion.
Comment on lines +45 to +46
Collaborator

Not sure about this one. Do we want to describe potential Second-Order use for a First-Order tool? It seems like noise, since the Second-Order chains are already defined in the tools that provide the Second-Order vector.

Contributor Author

Yeah it's kind of confusing.

My idea was that it's First-Order if you get a foothold (a major or minor advantage) as "Initial Access". A First-Order Tool gives you RCE with a sniper shot, while a First-Order Gadget gives you something constrained that can still help you reach a Second-Order Tool. Typically those are going to be the ones that need an env var.

"""
43 changes: 43 additions & 0 deletions GEMINI.md
@@ -0,0 +1,43 @@
A tool is considered a "Living Off The Pipeline" (LOTP) gadget when it can be abused by an attacker to achieve a malicious outcome within a CI/CD pipeline. This typically occurs in a Poisoned Pipeline Execution (PPE) scenario, where an attacker submits a pull request containing a malicious file that a pipeline tool consumes.

There are two main concepts in this framework:

### First-Order LOTP (Direct Primitive)

This is a tool or utility that, in a single step, produces a malicious primitive by processing an attacker-controlled file. These are the fundamental building blocks of pipeline attacks. To better classify the risk, we distinguish between "Tools" and "Gadgets":

* A **First-Order LOTP Tool** provides direct **Remote Code Execution (RCE)**. These are the most critical vulnerabilities, as they give an attacker immediate control over the pipeline runner.
* *Examples:* `make` executing a command from a `Makefile`, `npm` running a malicious `postinstall` script in `package.json`.
* A **First-Order LOTP Gadget** provides a lesser, non-RCE primitive. While not immediately granting RCE, these primitives are powerful building blocks for data exfiltration or for setting up a Second-Order attack.
* *Examples:* a tool with a templating feature allowing arbitrary network access for data exfiltration, an XSLT processor writing a file to a known location.

An attacker-controlled file can be one of two types:

1. **Configuration File:** A file the tool implicitly loads from a location an attacker can control (e.g., the current working directory, the repository root). An attacker modifies this file to alter the tool's behavior. This is distinct from configuration files loaded from secure, trusted locations (e.g., a user's home directory), which do not constitute a LOTP vector.
2. **Data Input File:** A file that is explicitly passed to the tool as its primary input.

The outcome of a First-Order LOTP is a **malicious primitive**. Common primitives include:
* **Remote Code Execution (RCE):** The most powerful primitive, giving the attacker full control.
* *Examples:* `make` executing a command from a `Makefile`, `npm` running a malicious `postinstall` script in `package.json`.
* **Arbitrary File Write:** The ability to write or overwrite files on the runner's filesystem. This is a powerful "Setup" primitive.
* *Examples:* An XSLT processor using a function with a `file:///` URI, a static site generator with a configurable output directory that allows path traversal.
* **Arbitrary File Read:** The ability to read files from the runner's filesystem. This can be used for data exfiltration.
* *Examples:* An XML processor that resolves external entities (`XXE`), a template engine that can include local files.
* **Arbitrary Network Access:** The ability to make network requests to arbitrary endpoints. This is often used for data exfiltration.
* *Examples:* A tool with a templating feature that can perform DNS lookups or HTTP requests, an XSLT processor using the `document()` function to fetch a remote DTD or stylesheet.
* **Environment Variable Manipulation:** The ability to set or modify environment variables for subsequent steps in the pipeline. This is a classic "Setup" primitive.
Collaborator

Should we add environment variable read primitive?

Contributor Author
fproulx-boostsecurity, Oct 20, 2025

Good point, like it was for Trivy. Reading an env var is not enough, unless you have a way to exfil (via logs or network)


It is critical to not only identify a primitive but also to verify its usefulness. A primitive is only valuable to an attacker if the data it exposes can be exfiltrated or acted upon. For example, an **Arbitrary File Read** is a **"dud primitive"** if the tool provides no mechanism to either print the file's contents to the CI/CD logs or send them over the network. An attacker can read the file, but they cannot see the contents. Similarly, an **Arbitrary File Write** is a dud if the attacker cannot control the file's contents or if the write is confined to a subdirectory within the attacker's own checkout. For a "Setup Gadget" to be effective, its file write primitive must allow writing to a location *outside* the checkout directory, thereby influencing a separate, trusted process (e.g., by overwriting a file like `$GITHUB_ENV` or dropping a malicious configuration file in a predictable system path). Therefore, a complete LOTP gadget must include both the primitive itself and a channel for output or exfiltration.
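The "dud" check described above can be written down as a small predicate. The following is a minimal sketch under stated assumptions: the `Primitive` class, its field names, and `is_dud` are hypothetical names introduced for illustration, and only the two primitives discussed in the paragraph (file read and file write) are modeled.

```python
from dataclasses import dataclass


@dataclass
class Primitive:
    """Hypothetical record of what an analyst found for one tool."""
    kind: str                        # "file_read" or "file_write" in this sketch
    output_channel: bool = False     # read: contents can reach CI logs or the network
    controls_contents: bool = False  # write: attacker chooses what is written
    escapes_checkout: bool = False   # write: target can lie outside the checkout


def is_dud(p: Primitive) -> bool:
    """Apply the usefulness rules from the paragraph above."""
    if p.kind == "file_read":
        # Reading is useless if the attacker can never see the contents.
        return not p.output_channel
    if p.kind == "file_write":
        # Writing needs both controlled contents and a target outside the checkout.
        return not (p.controls_contents and p.escapes_checkout)
    raise ValueError(f"primitive kind not modeled in this sketch: {p.kind}")
```

For example, `is_dud(Primitive("file_write", controls_contents=True))` is true, because the write is still confined to the attacker's own checkout.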

A critical aspect of this analysis is to consider the tool's security controls, such as sandboxing. A tool may appear safe because its dangerous features (e.g., filesystem or network access) are restricted by a default security policy. However, if this policy is defined in a configuration file that resides within the repository, it is under the attacker's control. In a PPE scenario, an attacker can submit a pull request that modifies this configuration to weaken or disable the sandbox, thereby unlocking the tool's full potential as a LOTP gadget. Therefore, the analysis must not only identify sandboxing features but also verify whether their configuration is secure from attacker influence.
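One way to frame the question "is the sandbox configuration under the attacker's control?" is to test whether the configuration path resolves to a location inside the repository checkout. The sketch below assumes that framing; the function name `is_attacker_controlled` and the example paths are hypothetical, and a real analysis would also need to know which paths the tool actually searches.

```python
from pathlib import Path


def is_attacker_controlled(config_path, checkout_root):
    """Return True when config_path resolves to a location inside the checkout.

    Hypothetical helper. Relative paths are resolved against the current
    working directory, and ".." segments are normalized before comparing.
    """
    config = Path(config_path).resolve()
    root = Path(checkout_root).resolve()
    return config == root or root in config.parents


# A policy file in the repository can be changed by a pull request.
print(is_attacker_controlled("/work/repo/.toolrc", "/work/repo"))
# A policy file in the runner's home directory cannot.
print(is_attacker_controlled("/home/runner/.toolrc", "/work/repo"))
```

If the first case applies, the tool should be evaluated as if its sandbox were in its least restrictive configuration.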

### Second-Order LOTP (Chained Gadget Attack)

A Second-Order LOTP is not a type of gadget, but rather an **attack chain** that involves at least two gadgets:

1. **The "Setup" Gadget:** A First-Order LOTP that provides a non-RCE primitive, such as writing a file, reading a secret, or setting an environment variable.
2. **The "Execution" Gadget:** A subsequent tool in the pipeline that is triggered by the change made by the setup gadget, leading to full RCE.

A prime example involves `bash` in GitHub Actions. A "setup" gadget (any First-Order LOTP with a file write primitive) could write `BASH_ENV=./attacker-script.sh` to the file at `$GITHUB_ENV`. The "execution" gadget would be `bash` itself in a later `run` step, which would then execute the script because the `BASH_ENV` variable causes `bash` to source the specified script upon startup. For more details on this mechanism, see: [`_tool/bash.md`](_tool/bash.md).
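The setup half of this chain can be modeled in a few lines. This is a deliberately simplified sketch, not the runner's actual implementation: it handles only plain `NAME=value` lines, the helper names `parse_env_file` and `env_for_next_step` are hypothetical, and a string stands in for the file at `$GITHUB_ENV`.

```python
def parse_env_file(text):
    """Parse plain NAME=value lines (the only form this sketch handles)."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name:
            env[name] = value
    return env


def env_for_next_step(base_env, env_file_text):
    """Model a later step inheriting variables declared in the env file."""
    merged = dict(base_env)
    merged.update(parse_env_file(env_file_text))
    return merged


# Stand-in for the contents of the file at $GITHUB_ENV.
env_file = ""
# The setup gadget's file write appends one line.
env_file += "BASH_ENV=./attacker-script.sh\n"

next_env = env_for_next_step({"CI": "true"}, env_file)
print(next_env["BASH_ENV"])
```

In this model, any later step started with `next_env` would carry `BASH_ENV`, which is the condition the execution gadget (`bash` in a `run` step) relies on.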
Collaborator

I'm not sure this is the best example: since `$GITHUB_ENV` points to a UUID-named file, we would need an environment variable read primitive to locate and modify it.

Contributor Author

Yeah, but if I recall correctly, there is a single GITHUB_ENV file at any one time, and even if there are several, `*` globbing would let us affect it without knowing the exact file name. I agree this has some extra constraints compared with a more static file.


It is crucial to understand that while many potential "setup" gadgets exist, they only become exploitable vulnerabilities if a corresponding "execution" gadget exists later in the *same pipeline workflow* to complete the chain.
Collaborator

I wonder if it is going to discard LOTPs based on this, which is workflow-dependent.

Contributor Author

Yeah, in terms of the LLM reasoning, it should not get stuck on "same pipeline workflow" too much; it's more like a rhetorical "imagine a workflow".