Arm backend: Allow Arm executor_runner CMake to run standalone #19018
usamahz wants to merge 14 commits into
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19018
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEV
There is 1 currently active SEV. If your PR is affected, please view it below.
❌ 10 New Failures, 1 Cancelled Job, 3 Unrelated Failures
As of commit 949526a with merge base 8020fe0:
NEW FAILURES - The following jobs have failed:
CANCELLED JOB - The following job was cancelled. Please retry:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
BROKEN TRUNK - The following job failed but was also failing on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
I will merge after all 8 commits land in this PR, so until then please do not merge! :)
- Detect standalone invocation, derive EXECUTORCH_ROOT, and expose ARM_EXECUTOR_RUNNER_STANDALONE for diagnostics.
- Load ExecuTorch presets and add_subdirectory(EXECUTORCH_ROOT ...) when building out of tree.
- Refresh pte_to_header.py with a shebang and updated Arm copyright.
Change-Id: I4582326c72a0b571c495aca64b2c58e45bfbb5be
Signed-off-by: Usamah Zaheer <usamah.zaheer@arm.com>
- Auto-detect Python and corstone helpers so standalone builds mirror setup.sh and run.sh.
- Reuse the in-tree Ethos-U core driver unless a custom path is supplied and optionally fetch the SDK into arm-scratch.
- Validate delegate prerequisites and enforce ET_PTE_FILE_PATH behavior.
- Halt when no PTE or semihosting mode is provided.
Change-Id: Iadd5dcd5e1a12dca7a00117c7778e9580364294a
Signed-off-by: Usamah Zaheer <usamah.zaheer@arm.com>
- Call gen_oplist.py through the configured Python interpreter only when a model PTE exists.
- Reference CMAKE_CURRENT_SOURCE_DIR for generated headers and linker scripts so out-of-tree builds resolve paths correctly.
- Normalize runner outputs and sanitizer helpers so the standalone build mirrors the superbuild.
- When BundleIO reuses a separate ExecuTorch build tree where bundled_program is not part of this CMake graph, restrict the fallback lookup to caller-provided build directories so the runner does not pick up an unrelated host library.
Change-Id: I9932d8d7434e8a834b21ac9bbf290361d7ec117b
Signed-off-by: Usamah Zaheer <usamah.zaheer@arm.com>
- Honor EXECUTORCH_BAREMETAL_SKIP_INSTALL so embedders can disable install() rules.
- Propagate Ethos-U delegate includes, install the core driver when available, and copy CMSIS-NN headers for downstream toolchains.
- Route the arm_baremetal preset install output back into the build tree to keep standalone builds self-contained.
Change-Id: I84bb6a1ad64a404e10e8ce8897167e595b8b82fa
Signed-off-by: Usamah Zaheer <usamah.zaheer@arm.com>
- Force EXECUTORCH_BAREMETAL_SKIP_INSTALL=OFF so build_executorch.sh always exports the Arm runner dependencies.
- Stop building the install target on non-musl hosts; the default build target already covers what run.sh needs and avoids redundant installs.
Change-Id: Iecd91e4a3eb275ca67ce6593ebfb06d3d7ec42ef
Signed-off-by: Usamah Zaheer <usamah.zaheer@arm.com>
- Clarify help text for select_ops_list, toolchain choices, and add --build-dir reuse.
- Track whether select_ops_list was overridden, allow arbitrary cmake -D flags, and tidy scratch or toolchain warnings.
- Plumb the new option state through the control flow to prepare for automation.
Change-Id: I69b027e726eee0b23206e7e3c836db375a8bf5b6
Signed-off-by: Usamah Zaheer <usamah.zaheer@arm.com>
10dc5c9 to 9127071
- Auto-derive arm_executor_runner build directories when --build-dir is omitted and configure them with the arm_baremetal preset.
- Add validation helpers that ensure standalone builds were configured with the right targets, toolchains, and BundledIO/devtools toggles.
- Teach the script to stage PTEs, reuse multi-config build trees, and drive FVP/BundleIO workflows from a single entry point.
Change-Id: If52327a1bc512c87fd2ce5d9ce89c352919fd447
Signed-off-by: Usamah Zaheer <usamah.zaheer@arm.com>
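The cache validation described in this commit can be pictured with a minimal Python sketch. It is not the script's actual implementation: it only assumes that a reused build directory contains a standard `CMakeCache.txt` (whose entries have the form `KEY:TYPE=VALUE`), and the helper names and the cache keys used in the usage note are illustrative assumptions.

```python
import re
from typing import Dict, List

# CMakeCache.txt entries look like KEY:TYPE=VALUE.
# Lines starting with '#' or '//' are comments and are skipped.
_CACHE_LINE = re.compile(r"^([^#/:=][^:=]*):([^=]*)=(.*)$")


def parse_cmake_cache(text: str) -> Dict[str, str]:
    """Return a KEY -> VALUE mapping for the entries in a CMake cache file."""
    entries: Dict[str, str] = {}
    for raw in text.splitlines():
        match = _CACHE_LINE.match(raw.strip())
        if match:
            entries[match.group(1)] = match.group(3)
    return entries


def cache_mismatches(cache: Dict[str, str], expected: Dict[str, str]) -> List[str]:
    """List human-readable problems; an empty list means the cache is reusable."""
    problems: List[str] = []
    for key, want in expected.items():
        have = cache.get(key)
        if have is None:
            problems.append(f"{key} is not set (expected {want})")
        elif have != want:
            problems.append(f"{key}={have} (expected {want})")
    return problems
```

A caller might read `CMakeCache.txt` from the candidate build directory, pass something like `{"TARGET_CPU": "cortex-m55"}` as `expected` (a hypothetical key and value), and refuse to reuse the directory when the returned list is non-empty.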
- Explain the auto-configured runner build flow and scratch directory expectations in examples/arm/README.md.
- Update the Ethos-U notebook to export EXECUTORCH_ROOT before calling standalone cmake.
Change-Id: If9f4f456c03b7a36a27ffdd1dfd1873ec286d07b
Signed-off-by: Usamah Zaheer <usamah.zaheer@arm.com>
0a84180 to 423ca2e
Allow VGF host runner builds to reuse existing top-level CMake build directories without requiring the bare-metal standalone marker.
Pin the standalone Arm runner registry size to the default capacity unless the user overrides MAX_KERNEL_NUM. This prevents selected-op cache sizing from undersizing binaries that also link quantized and Cortex-M registration libraries.
Change-Id: I6716c454ec5d9d3adbff756afc14fe8739268520
Signed-off-by: Usamah Zaheer <usamah.zaheer@arm.com>
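The registry sizing rule in this commit can be illustrated with a small Python sketch. It is not the CMake logic from the PR: `DEFAULT_MAX_KERNEL_NUM` is a placeholder value chosen for illustration, and the kernel counts per library are invented to show why sizing the registry from the selected-op list alone can be too small.

```python
from typing import Dict, Optional

# Placeholder default capacity, an assumption for illustration only.
DEFAULT_MAX_KERNEL_NUM = 2000


def registry_capacity(user_override: Optional[int],
                      default: int = DEFAULT_MAX_KERNEL_NUM) -> int:
    """Use the user's MAX_KERNEL_NUM when given, otherwise the default capacity."""
    if user_override is not None:
        if user_override <= 0:
            raise ValueError("MAX_KERNEL_NUM must be positive")
        return user_override
    return default


def would_overflow(capacity: int, kernels_per_library: Dict[str, int]) -> bool:
    """True when the linked libraries register more kernels than the registry holds."""
    return sum(kernels_per_library.values()) > capacity


# Hypothetical counts: 12 selected portable ops, plus kernels registered by
# the quantized and Cortex-M libraries that the selected-op list does not cover.
linked = {"selected_ops": 12, "quantized": 20, "cortex_m": 8}
```

With these made-up numbers, a registry sized to the 12 selected ops would overflow once the other two libraries register their kernels, while the pinned default would not.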
423ca2e to b4e729e
Update generated Ethos-U docs and docgen templates to point users at the standalone Arm executor runner CMake entry point. This replaces the old two-step install and direct runner configure flow.
Signed-off-by: Usamah Zaheer <usamah.zaheer@arm.com>
Change-Id: I582b87033c7d50a4219fc01a01f1b5ddd980e8e4
This PR is ready to review
Hi @digantdesai, this is a large PR with a lot of CMake changes (it replaces a PR you reviewed yesterday in a better way). Is it OK to merge this? Adding @GregoryComer also, as you were added to the old PR.
Skip bundled FVP installation on non-Linux hosts and detect repo-local FVPs-on-Mac wrappers or host-installed FVP binaries. This keeps setup.sh from failing on macOS when FVP binaries are provided outside the bundled Linux install path.
Signed-off-by: Usamah Zaheer <usamah.zaheer@arm.com>
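The host check and FVP lookup described in this commit could look roughly like the following Python sketch. The real change lives in setup.sh; the function names, the wrapper directory list, and the lookup order here are assumptions made for illustration.

```python
import os
import platform
import shutil
from typing import Callable, Iterable, Optional


def should_install_bundled_fvp(system: Optional[str] = None) -> bool:
    """The bundled FVP install is Linux-only, so skip it on other hosts."""
    return (system or platform.system()) == "Linux"


def find_fvp(binary_name: str,
             wrapper_dirs: Iterable[str],
             which: Callable[[str], Optional[str]] = shutil.which) -> Optional[str]:
    """Prefer a repo-local wrapper, then fall back to a binary found on PATH."""
    for directory in wrapper_dirs:
        candidate = os.path.join(directory, binary_name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    # Returns None when nothing is installed on the host either.
    return which(binary_name)
```

On macOS, a caller would skip the bundled install and call `find_fvp` with the directories where wrapper scripts are expected; a `None` result then becomes a clear error message rather than a failed install step.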
GregoryComer left a comment
Looks reasonable to me. I'll defer to @digantdesai, but I believe there should be no buck impact and you should be clear to merge.
Summary
This PR modernizes the ExecuTorch Arm bare-metal runner workflow so users can move from a PyTorch model to a runnable
Arm executor runner with fewer manual build-system steps, stronger validation, and faster repeated local iteration.
The main change is a new standalone Arm executor runner CMake entry point.
`run.sh` now acts as the orchestration layer for common Ethos-U bare-metal flows: it can derive build directories, configure the standalone runner with Arm
bare-metal defaults, stage generated PTE/BPTE files, validate reused CMake caches, build the needed runner target,
locate the runner binary, and invoke FVP.
Problem
Before this change, the Arm runner workflow depended on manually stitching together ExecuTorch build/install
artifacts, runner CMake configuration, PTE input wiring, toolchain and target settings, optional debug features, and
repeated install/export steps.
That made the workflow harder to explain, fragile in CI, slower to iterate on locally, and easy to break when reusing
a build directory configured for a different target or feature set.
CMake Architecture Change
flowchart LR
  subgraph Before
    A1["Build ExecuTorch<br/>arm-baremetal preset"] --> A2["Install/export artifacts"]
    A2 --> A3["Configure runner CMake<br/>examples/arm/executor_runner"]
    A4["PTE / BPTE"] --> A3
    A3 --> A5["arm_executor_runner ELF"]
  end
  subgraph After
    B1["run.sh"] --> B2["Validate / choose build dir"]
    B2 --> B3["Standalone runner CMake<br/>examples/arm/executor_runner/standalone"]
    B4["PTE / BPTE"] --> B1
    B3 --> B5["ExecuTorch top-level CMake<br/>as subdirectory"]
    B3 --> B6["Arm CMake helpers + presets"]
    B5 --> B7["arm_executor_runner ELF"]
    B6 --> B7
  end
What Changed
- `examples/arm/executor_runner/standalone` as the supported standalone CMake entry point for `arm_executor_runner`.
- … paths.
- `build_executor_runner.sh` and `run.sh` to use the standalone runner workflow.
- … `--et_build_root`.
- … BundleIO, ETDump, and devtools settings.
Iteration Speed
Repeated local PTE-to-runner iteration is now 8x faster because `run.sh` can reuse the configured standalone CMake build directory, stage updated PTE/BPTE payloads into the existing cache wiring, and rebuild only the needed runner target instead of repeating the full manual configure/install/export flow.
This is a developer workflow speedup, not a model runtime speedup.
Result
For common Ethos-U bare-metal usage, the user-facing path is now script-owned and repeatable:
- `examples/arm/run.sh` with a model and target.
- … `--et_build_root`.
VGF host flows remain explicit: `run.sh` requires an existing `--build-dir` for VGF-style host builds rather than auto-configuring them as bare-metal runner builds.
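The build-directory policy described here can be summarized with a short Python sketch. It is an illustration, not the logic in `run.sh`: the derived directory layout, the `vgf` prefix test, and the function name are assumptions, and only the rule itself (bare-metal builds may be auto-derived, VGF host builds need an explicit `--build-dir`) comes from the description above.

```python
from pathlib import Path
from typing import Optional


def resolve_build_dir(target: str,
                      build_dir: Optional[str],
                      et_build_root: str) -> Path:
    """Pick the runner build directory for a target.

    An explicit build_dir always wins. Otherwise bare-metal targets get a
    derived directory, while VGF-style host targets are rejected because
    they are not auto-configured as bare-metal runner builds.
    """
    if build_dir is not None:
        return Path(build_dir)
    if target.startswith("vgf"):
        raise ValueError("VGF host builds require an existing --build-dir")
    # Hypothetical layout: <et_build_root>/<target>/arm_executor_runner
    return Path(et_build_root) / target / "arm_executor_runner"
```

For example, `resolve_build_dir("ethos-u55-128", None, "arm_test")` yields `arm_test/ethos-u55-128/arm_executor_runner` under this assumed layout, whereas a `vgf` target without `build_dir` raises an error instead of silently configuring a bare-metal build.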
Testing
Validated through the Arm backend runner, bare-metal, VGF, and CI workflows covered by this stack.
cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell @rascani