
Arm backend: Allow Arm executor_runner CMake to run standalone#19018

Open
usamahz wants to merge 14 commits into
pytorch:mainfrom
usamahz:feature/standalone-runner

@usamahz
Collaborator

@usamahz usamahz commented Apr 21, 2026

Summary

This PR modernizes the ExecuTorch Arm bare-metal runner workflow so users can move from a PyTorch model to a runnable
Arm executor runner with fewer manual build-system steps, stronger validation, and faster repeated local iteration.

The main change is a new standalone Arm executor runner CMake entry point. run.sh now acts as the orchestration
layer for common Ethos-U bare-metal flows: it can derive build directories, configure the standalone runner with Arm
bare-metal defaults, stage generated PTE/BPTE files, validate reused CMake caches, build the needed runner target,
locate the runner binary, and invoke FVP.

Problem

Before this change, the Arm runner workflow depended on manually stitching together ExecuTorch build/install
artifacts, runner CMake configuration, PTE input wiring, toolchain and target settings, optional debug features, and
repeated install/export steps.

That made the workflow harder to explain, fragile in CI, slower to iterate on locally, and easy to break when reusing
a build directory configured for a different target or feature set.

CMake Architecture Change

flowchart LR
    subgraph Before
        A1["Build ExecuTorch<br/>arm-baremetal preset"] --> A2["Install/export artifacts"]
        A2 --> A3["Configure runner CMake<br/>examples/arm/executor_runner"]
        A4["PTE / BPTE"] --> A3
        A3 --> A5["arm_executor_runner ELF"]
    end

    subgraph After
        B1["run.sh"] --> B2["Validate / choose build dir"]
        B2 --> B3["Standalone runner CMake<br/>examples/arm/executor_runner/standalone"]
        B4["PTE / BPTE"] --> B1
        B3 --> B5["ExecuTorch top-level CMake<br/>as subdirectory"]
        B3 --> B6["Arm CMake helpers + presets"]
        B5 --> B7["arm_executor_runner ELF"]
        B6 --> B7
    end

What Changed

  • Added examples/arm/executor_runner/standalone as the supported standalone CMake entry point for
    arm_executor_runner.
  • Added shared Arm CMake helpers for Ethos-U SDK setup, required target validation, and predictable runner output
    paths.
  • Updated build_executor_runner.sh and run.sh to use the standalone runner workflow.
  • Added deterministic default build directories under --et_build_root.
  • Added cache validation for reused build directories, including target, toolchain, selected ops, PTE placement,
    BundleIO, ETDump, and devtools settings.
  • Added PTE/BPTE staging so repeated runs can reuse the same configured CMake build directory.
  • Integrated selective-op handling into the standalone runner path.
  • Cleaned up bare-metal install/export behavior so standalone builds can consume reusable build-tree artifacts.
  • Updated Arm README and notebooks for the new workflow.
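
The cache validation item above can be pictured as comparing entries in a reused build directory's `CMakeCache.txt` with the values the current invocation expects. The following is a minimal sketch, not the code in this PR: `cache_matches` is a hypothetical helper, the toolchain path is made up for the demo, and the `FILEPATH` type shown for `ET_PTE_FILE_PATH` is an assumption.

```shell
#!/bin/sh
# Hypothetical helper (not from run.sh): succeed only when KEY in the build
# directory's CMakeCache.txt holds the expected value.
# CMakeCache.txt stores entries as NAME:TYPE=VALUE.
cache_matches() {
    cache_file="$1/CMakeCache.txt"
    key="$2"
    expected="$3"
    [ -f "$cache_file" ] || return 1
    # Strip "KEY:TYPE=" from the matching line and keep the first hit.
    actual=$(sed -n "s/^${key}:[^=]*=//p" "$cache_file" | head -n 1)
    [ "$actual" = "$expected" ]
}

# Demo data only: a throwaway build directory with two cache entries.
demo_dir=$(mktemp -d)
printf '%s\n' \
    'CMAKE_TOOLCHAIN_FILE:FILEPATH=/opt/toolchains/arm-none-eabi-gcc.cmake' \
    'ET_PTE_FILE_PATH:FILEPATH=/tmp/model.pte' > "$demo_dir/CMakeCache.txt"

if cache_matches "$demo_dir" ET_PTE_FILE_PATH /tmp/model.pte; then
    echo "cache is reusable"
else
    echo "cache mismatch: reconfigure or choose another build directory"
fi
```

Failing early on a mismatch is what keeps a directory configured for a different target or feature set from being silently reused.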

Iteration Speed

Repeated local PTE-to-runner iteration is now 8x faster because run.sh can reuse the configured standalone CMake build directory, stage updated PTE/BPTE payloads into the existing cache wiring, and rebuild only the needed runner target instead of repeating the full manual configure/install/export flow.

This is a developer workflow speedup, not a model runtime speedup.
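
One way to read the staging idea is that the build directory always points at the same staged file, and only the file's content changes between iterations. The sketch below illustrates that under stated assumptions: `stage_pte` and the `staged/` layout are hypothetical, and run.sh may implement staging differently.

```shell
#!/bin/sh
# Hypothetical helper (not from run.sh): copy a freshly exported PTE to a fixed
# location inside the build directory, but only when its content has changed.
# Skipping the copy keeps the staged file's timestamp stable.
stage_pte() {
    src="$1"
    staged="$2"
    if [ -f "$staged" ] && cmp -s "$src" "$staged"; then
        echo "unchanged"
        return 0
    fi
    mkdir -p "$(dirname "$staged")"
    cp "$src" "$staged"
    echo "staged"
}

# Demo with throwaway files.
work=$(mktemp -d)
printf 'model-v1' > "$work/model.pte"
stage_pte "$work/model.pte" "$work/build/staged/model.pte"   # prints "staged"
stage_pte "$work/model.pte" "$work/build/staged/model.pte"   # prints "unchanged"
```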

Result

For common Ethos-U bare-metal usage, the user-facing path is now script-owned and repeatable:

  1. Run Arm setup.
  2. Run examples/arm/run.sh with a model and target.
  3. Reuse or inspect the generated build directory under --et_build_root.
  4. Iterate by regenerating the PTE/BPTE and rebuilding through the same validated CMake cache.
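
As an illustration of step 2, the snippet below only assembles and prints a run.sh command line; it does not execute it. `--et_build_root` appears in this PR, while the `--model_name` and `--target` spellings and the example values are assumptions to check against `run.sh --help`.

```shell
#!/bin/sh
# Example values only; the model, target, and build root are assumptions.
MODEL="mv2"
TARGET="ethos-u55-128"
ET_BUILD_ROOT="arm_test"

# Flag spellings for model and target are assumed, not confirmed by this PR.
RUN_CMD="examples/arm/run.sh --model_name=${MODEL} --target=${TARGET} --et_build_root=${ET_BUILD_ROOT}"

# Print the command instead of running it, so the sketch has no side effects.
echo "$RUN_CMD"
```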

VGF host flows remain explicit: run.sh requires an existing --build-dir for VGF-style host builds rather than
auto-configuring them as bare-metal runner builds.
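
The VGF rule above amounts to a guard that rejects VGF targets without an existing build directory while letting bare-metal targets fall through to auto-configuration. A minimal sketch, assuming a hypothetical helper `require_build_dir_for_vgf` and assuming VGF targets are spelled with a `vgf` prefix:

```shell
#!/bin/sh
# Hypothetical guard (not from run.sh): VGF-style targets need an explicit,
# existing build directory; other targets may have one derived for them.
require_build_dir_for_vgf() {
    target="$1"
    build_dir="$2"
    case "$target" in
        vgf*)
            if [ -z "$build_dir" ]; then
                echo "error: target '$target' requires an existing --build-dir" >&2
                return 1
            fi
            if [ ! -d "$build_dir" ]; then
                echo "error: --build-dir '$build_dir' does not exist" >&2
                return 1
            fi
            ;;
    esac
    return 0
}

# A bare-metal target passes without a build directory.
require_build_dir_for_vgf ethos-u55-128 "" && echo "bare-metal: build dir can be derived"
```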

Testing

Validated through the Arm backend runner, bare-metal, VGF, and CI workflows covered by this stack.

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell @rascani

@pytorch-bot

pytorch-bot Bot commented Apr 21, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19018

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 10 New Failures, 1 Cancelled Job, 3 Unrelated Failures

As of commit 949526a with merge base 8020fe0 (image):

NEW FAILURES - The following jobs have failed:

CANCELLED JOB - The following job was cancelled. Please retry:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 21, 2026
@github-actions github-actions Bot added ciflow/trunk module: arm Issues related to arm backend labels Apr 21, 2026
@usamahz usamahz marked this pull request as draft April 21, 2026 12:31
@usamahz
Collaborator Author

usamahz commented Apr 21, 2026

I will merge after all 8 commits land in this PR - so until then please do not merge! :)

usamahz added 6 commits May 8, 2026 15:33
- Detect standalone invocation, derive EXECUTORCH_ROOT, and expose
  ARM_EXECUTOR_RUNNER_STANDALONE for diagnostics.
- Load ExecuTorch presets and add_subdirectory(EXECUTORCH_ROOT ...) when
  building out of tree.
- Refresh pte_to_header.py with a shebang and updated Arm copyright.

Change-Id: I4582326c72a0b571c495aca64b2c58e45bfbb5be
Signed-off-by: Usamah Zaheer <usamah.zaheer@arm.com>
- Auto-detect Python and corstone helpers so standalone builds mirror
  setup.sh and run.sh.
- Reuse the in-tree Ethos-U core driver unless a custom path is supplied
  and optionally fetch the SDK into arm-scratch.
- Validate delegate prerequisites and enforce ET_PTE_FILE_PATH behavior.
- Halt when no PTE or semihosting mode is provided.

Change-Id: Iadd5dcd5e1a12dca7a00117c7778e9580364294a
Signed-off-by: Usamah Zaheer <usamah.zaheer@arm.com>
- Call gen_oplist.py through the configured Python interpreter only when
  a model PTE exists.
- Reference CMAKE_CURRENT_SOURCE_DIR for generated headers and linker
  scripts so out-of-tree builds resolve paths correctly.
- Normalize runner outputs and sanitizer helpers so the standalone build
  mirrors the superbuild.
- When BundleIO reuses a separate ExecuTorch build tree where
  bundled_program is not part of this CMake graph, restrict the fallback
  lookup to caller-provided build directories so the runner does not pick
  up an unrelated host library.

Change-Id: I9932d8d7434e8a834b21ac9bbf290361d7ec117b
Signed-off-by: Usamah Zaheer <usamah.zaheer@arm.com>
- Honor EXECUTORCH_BAREMETAL_SKIP_INSTALL so embedders can disable
  install() rules.
- Propagate Ethos-U delegate includes, install the core driver when
  available, and copy CMSIS-NN headers for downstream toolchains.
- Route the arm_baremetal preset install output back into the build tree
  to keep standalone builds self-contained.

Change-Id: I84bb6a1ad64a404e10e8ce8897167e595b8b82fa
Signed-off-by: Usamah Zaheer <usamah.zaheer@arm.com>
- Force EXECUTORCH_BAREMETAL_SKIP_INSTALL=OFF so build_executorch.sh always exports the Arm runner dependencies.
- Stop building the install target on non-musl hosts; the default build target already covers what run.sh needs and avoids redundant installs.

Change-Id: Iecd91e4a3eb275ca67ce6593ebfb06d3d7ec42ef
Signed-off-by: Usamah Zaheer <usamah.zaheer@arm.com>
- Clarify help text for select_ops_list, toolchain choices, and add
  --build-dir reuse.
- Track whether select_ops_list was overridden, allow arbitrary cmake -D
  flags, and tidy scratch or toolchain warnings.
- Plumb the new option state through the control flow to prepare for
  automation.

Change-Id: I69b027e726eee0b23206e7e3c836db375a8bf5b6
Signed-off-by: Usamah Zaheer <usamah.zaheer@arm.com>
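
The option handling described in the commit above (tracking whether `select_ops_list` was overridden and forwarding arbitrary `-D` flags) could be sketched as the argument loop below. This is an illustration, not the script's code: `parse_args`, the variable names, and the `--select_ops_list=` spelling are assumptions.

```shell
#!/bin/sh
# Hypothetical option state (names are illustrative, not taken from the scripts).
select_ops_list=""
select_ops_overridden=false
extra_cmake_flags=""

parse_args() {
    for arg in "$@"; do
        case "$arg" in
            --select_ops_list=*)
                select_ops_list="${arg#*=}"
                select_ops_overridden=true
                ;;
            -D*)
                # Collect pass-through CMake definitions, space separated.
                # Values containing spaces would need an array-based approach.
                extra_cmake_flags="${extra_cmake_flags:+$extra_cmake_flags }$arg"
                ;;
            *)
                echo "unknown option: $arg" >&2
                return 1
                ;;
        esac
    done
}

parse_args --select_ops_list=aten::add.out -DCMAKE_BUILD_TYPE=Debug -DCMAKE_VERBOSE_MAKEFILE=ON
echo "ops overridden: $select_ops_overridden"
echo "extra flags: $extra_cmake_flags"
```

Recording the override separately from the value lets later steps tell a user-supplied list apart from a default one.
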
@usamahz usamahz force-pushed the feature/standalone-runner branch from 10dc5c9 to 9127071 Compare May 8, 2026 14:34
@usamahz usamahz marked this pull request as ready for review May 8, 2026 14:40
@usamahz usamahz marked this pull request as draft May 8, 2026 14:41
usamahz added 2 commits May 11, 2026 09:14
- Auto-derive arm_executor_runner build directories when --build-dir is
  omitted and configure them with the arm_baremetal preset.
- Add validation helpers that ensure standalone builds were configured
  with the right targets, toolchains, and BundleIO/devtools toggles.
- Teach the script to stage PTEs, reuse multi-config build trees, and
  drive FVP/BundleIO workflows from a single entry point.

Change-Id: If52327a1bc512c87fd2ce5d9ce89c352919fd447
Signed-off-by: Usamah Zaheer <usamah.zaheer@arm.com>
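
Auto-deriving a build directory when `--build-dir` is omitted can be thought of as a pure function of the build root, model, and target, so that the same inputs always map to the same directory. The naming scheme below is hypothetical and chosen only for illustration; the layout run.sh actually uses may differ.

```shell
#!/bin/sh
# Hypothetical derivation (not from run.sh): an explicit build dir wins,
# otherwise a deterministic path is built from root, model, and target.
derive_build_dir() {
    et_build_root="$1"
    model="$2"
    target="$3"
    explicit="$4"
    if [ -n "$explicit" ]; then
        printf '%s\n' "$explicit"
    else
        printf '%s/%s_%s/cmake-out\n' "$et_build_root" "$model" "$target"
    fi
}

derive_build_dir arm_test mv2 ethos-u55-128 ""             # arm_test/mv2_ethos-u55-128/cmake-out
derive_build_dir arm_test mv2 ethos-u55-128 /tmp/my-build  # /tmp/my-build
```
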
- Explain the auto-configured runner build flow and scratch directory
  expectations in examples/arm/README.md.
- Update the Ethos-U notebook to export EXECUTORCH_ROOT before calling
  standalone cmake.

Change-Id: If9f4f456c03b7a36a27ffdd1dfd1873ec286d07b
Signed-off-by: Usamah Zaheer <usamah.zaheer@arm.com>
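
The notebook change in the commit above follows the pattern of exporting `EXECUTORCH_ROOT` and then pointing CMake at the standalone entry point. The sketch below prints such a configure command rather than running it. The checkout location and the `build/arm_runner` directory are assumptions, and a real configure will likely need further `-D` options (toolchain, PTE path) that are omitted here.

```shell
#!/bin/sh
# Assumed checkout location; override by setting EXECUTORCH_ROOT beforehand.
EXECUTORCH_ROOT="${EXECUTORCH_ROOT:-$PWD/executorch}"
export EXECUTORCH_ROOT

# -S selects the source directory and -B the build directory (standard cmake options).
# The build directory name is an arbitrary example.
CONFIGURE_CMD="cmake -S ${EXECUTORCH_ROOT}/examples/arm/executor_runner/standalone -B build/arm_runner"

# Print instead of executing so the sketch has no side effects.
echo "$CONFIGURE_CMD"
```
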
@usamahz usamahz marked this pull request as ready for review May 11, 2026 08:17
@usamahz usamahz added the release notes: arm Changes to the ARM backend delegate label May 11, 2026
@usamahz usamahz force-pushed the feature/standalone-runner branch from 0a84180 to 423ca2e Compare May 12, 2026 10:49
Allow VGF host runner builds to reuse existing top-level CMake build
directories without requiring the bare-metal standalone marker.

Pin the standalone Arm runner registry size to the default capacity
unless the user overrides MAX_KERNEL_NUM. This prevents selected-op
cache sizing from undersizing binaries that also link quantized and
Cortex-M registration libraries.

Change-Id: I6716c454ec5d9d3adbff756afc14fe8739268520
Signed-off-by: Usamah Zaheer <usamah.zaheer@arm.com>
@usamahz usamahz force-pushed the feature/standalone-runner branch from 423ca2e to b4e729e Compare May 12, 2026 10:53
@usamahz usamahz requested a review from zingo May 12, 2026 11:02
Update generated Ethos-U docs and docgen templates to point users at
the standalone Arm executor runner CMake entry point.

This replaces the old two-step install and direct runner configure flow.

Signed-off-by: Usamah Zaheer <usamah.zaheer@arm.com>
Change-Id: I582b87033c7d50a4219fc01a01f1b5ddd980e8e4
@usamahz usamahz requested a review from mergennachin as a code owner May 12, 2026 11:03
@usamahz
Collaborator Author

usamahz commented May 12, 2026

This PR is ready to review

@zingo
Collaborator

zingo commented May 12, 2026

Hi @digantdesai, this is a large PR with a lot of CMake changes (it replaces, in a better way, a PR you reviewed yesterday). Is it OK to merge this?
We don't think it should need any buck2 changes.

Adding @GregoryComer as well, as you were added to the old PR.

Skip bundled FVP installation on non-Linux hosts and detect
repo-local FVPs-on-Mac wrappers or host-installed FVP binaries.

This keeps setup.sh from failing on macOS when FVP binaries are
provided outside the bundled Linux install path.

Signed-off-by: Usamah Zaheer <usamah.zaheer@arm.com>
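
The host check described in the commit above can be sketched as a small decision helper keyed on the OS name. `fvp_install_mode` and its return strings are hypothetical; the function takes the OS name as an argument so both branches can be exercised on any host.

```shell
#!/bin/sh
# Hypothetical helper (not from setup.sh): decide how FVP binaries are obtained.
# Pass "$(uname -s)" in real use.
fvp_install_mode() {
    case "$1" in
        Linux)
            # The bundled FVP install path is Linux only.
            echo "install-bundled"
            ;;
        *)
            # On macOS and other hosts, look for wrappers or host-installed
            # FVP binaries instead of failing.
            echo "detect-existing"
            ;;
    esac
}

echo "this host: $(fvp_install_mode "$(uname -s)")"
```
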
@zingo zingo added the help wanted Extra attention is needed label May 12, 2026
Member

@GregoryComer GregoryComer left a comment

Looks reasonable to me. I'll defer to @digantdesai, but I believe there should be no buck impact and you should be clear to merge.
