
clean up some runtime potential bugs (#19447) #19447

Merged
meta-codesync[bot] merged 1 commit into pytorch:main from billmguo:export-D104615993 on May 11, 2026

Conversation

billmguo (Contributor) commented May 11, 2026

Summary:

  1. Zero cache on allocation (lines 65-66): std::fill on cache_data_ and update_data_ after allocator_.allocate() — eliminates uninitialized memory garbage that varies across devices.
  2. Zero cache on reset (line 191): std::fill on cache_data_ in reset() — ensures stale KV cache from a previous prompt is fully cleared, not just the position counters.
  3. Zero padding in last prefill chunk (lines 618-621): when batch_len < input_len, fill the tail of the input buffer with zeros — prevents stale tokens from a previous chunk leaking through the embedding layer (sa_runner.cpp).
  4. Call runner.reset() before each prompt in the multi-prompt loop, stdin prompt loop, and stdin tokens loop — ensures the KV cache, masks, and input_pos_ are fully reset between prompts (see the sketch after this list).
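
A minimal sketch of the per-prompt reset described in item 4, assuming a runner that exposes reset() and generate() (the Runner type, method names, and loop structure below are illustrative stand-ins, not the actual sa_runner.cpp interfaces):

#include <string>
#include <vector>

// Hypothetical stand-in for the real runner in sa_runner.cpp.
struct Runner {
  void reset();                       // clears KV cache, masks, and input_pos_
  void generate(const std::string&);  // prefill + decode for one prompt
};

void run_prompts(Runner& runner, const std::vector<std::string>& prompts) {
  for (const std::string& prompt : prompts) {
    runner.reset();  // fully reset state before every prompt, as in item 4
    runner.generate(prompt);
  }
}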

Reviewed By: Andriyluck, derekdixu

Differential Revision: D104615993

billmguo requested a review from lucylq as a code owner May 11, 2026 05:57
pytorch-bot (Bot) commented May 11, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19447

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 1cd3d16 with merge base 7127953:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla (Bot) added the CLA Signed label May 11, 2026
meta-codesync (Bot) commented May 11, 2026

@billmguo has exported this pull request. If you are a Meta employee, you can view the originating Diff in D104615993.

github-actions (Bot):

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

lucylq (Contributor) commented May 11, 2026

@claude review

claude (Bot) commented May 11, 2026

Claude finished @lucylq's task in 1m 23s


Review of PR #19447: clean up some runtime potential bugs


Overall: The changes are defensive improvements to zero-initialize memory in the static KV cache and input buffer. They address real potential issues with uninitialized/stale data. A few observations:


1. Zero cache on allocation (lines 67-68) ✅

std::fill(cache_data_, cache_data_ + cache_data_size_, T(0));
std::fill(update_data_, update_data_ + update_data_size_, T(0));

Good. AllocatorT is a template parameter, and its allocate() carries no guarantee of zero-initialization (e.g., std::allocator::allocate does not zero memory). This eliminates non-deterministic behavior from uninitialized reads if any code path touches cache regions before they are written.
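
For context, a minimal sketch of the allocate-then-zero pattern (the class, constructor, and default std::allocator below are assumptions for illustration; only the member names and the two std::fill calls mirror the actual change):

#include <algorithm>
#include <cstddef>
#include <memory>

// Sketch only: a raw allocation is not zero-initialized, so the buffers are
// explicitly zero-filled immediately after allocate().
template <typename T, typename AllocatorT = std::allocator<T>>
struct CacheBuffers {
  AllocatorT allocator_;
  std::size_t cache_data_size_;
  std::size_t update_data_size_;
  T* cache_data_;
  T* update_data_;

  CacheBuffers(std::size_t cache_size, std::size_t update_size)
      : cache_data_size_(cache_size), update_data_size_(update_size) {
    cache_data_ = allocator_.allocate(cache_data_size_);    // contents indeterminate
    update_data_ = allocator_.allocate(update_data_size_);  // contents indeterminate
    // The change in this PR: make the initial contents deterministic.
    std::fill(cache_data_, cache_data_ + cache_data_size_, T(0));
    std::fill(update_data_, update_data_ + update_data_size_, T(0));
  }

  ~CacheBuffers() {
    allocator_.deallocate(cache_data_, cache_data_size_);
    allocator_.deallocate(update_data_, update_data_size_);
  }
};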


2. Zero cache on reset (line 191) ✅

std::fill(cache_data_, cache_data_ + cache_data_size_, T(0));

This makes reset() truly clear the cache data rather than just resetting position counters. Important for multi-turn/multi-prompt scenarios where stale KV values could influence attention if the mask logic has any gaps.

Nit: Should update_data_ also be zeroed in reset()? Currently only cache_data_ is cleared. If the reasoning is that update_data_ is always overwritten before being read (since it's used as output buffer), that's fine — but it's worth confirming that invariant holds in all code paths.
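
Continuing the hypothetical CacheBuffers sketch above, a reset() matching this change might look like the following (input_pos_ is an assumed name for the position counter, not necessarily the real field):

  std::size_t input_pos_ = 0;  // assumed position-counter member

  // Sketch: reset() now clears the cache contents as well as the position
  // counter, so a new prompt starts from a fully clean state.
  void reset() {
    input_pos_ = 0;  // rewind the position counter (pre-existing behavior)
    // New in this PR: wipe stale KV values from the previous prompt.
    std::fill(cache_data_, cache_data_ + cache_data_size_, T(0));
    // update_data_ is deliberately left untouched here, on the assumption
    // that it is always overwritten before being read (see the nit above).
  }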


3. Zero padding in last prefill chunk (lines 619-622) ✅

if (batch_len < input_len) {
  std::fill(
      input_buffer.begin() + batch_len, input_buffer.end(), TokenT(0));
}

Correct fix. When the last chunk of tokens is smaller than input_len, the tail of input_buffer would otherwise contain stale tokens from a previous iteration of the loop (the prior std::copy only writes batch_len elements). Zero-padding prevents these from leaking through the embedding layer.
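
To make the leak concrete, here is a minimal sketch of a chunked prefill loop that reuses a fixed-size input buffer (the loop structure, TokenT alias, and run_model call are assumptions; only input_buffer, batch_len, input_len, and the std::fill mirror the actual code):

#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

using TokenT = int32_t;  // placeholder token type

// Hypothetical chunked prefill: input_buffer holds input_len slots and is
// reused for every chunk, so a short final chunk must zero the leftover tail.
void prefill(const std::vector<TokenT>& tokens, std::size_t input_len) {
  std::vector<TokenT> input_buffer(input_len);
  for (std::size_t start = 0; start < tokens.size(); start += input_len) {
    const std::size_t batch_len = std::min(input_len, tokens.size() - start);
    std::copy(
        tokens.begin() + start,
        tokens.begin() + start + batch_len,
        input_buffer.begin());
    if (batch_len < input_len) {
      // Without this, positions [batch_len, input_len) still hold tokens from
      // the previous chunk and would be fed through the embedding layer.
      std::fill(
          input_buffer.begin() + batch_len, input_buffer.end(), TokenT(0));
    }
    // run_model(input_buffer, batch_len);  // hypothetical model invocation
  }
}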


4. PR description vs. actual diff mismatch

The PR description mentions "Call runner.reset() before each prompt in the multi-prompt loop, stdin prompt loop, and stdin tokens loop" but this change is not present in the diff. Only the header file is modified. Was this change intended for a separate file (e.g., sa_runner.cpp) and accidentally omitted, or is it being landed separately?


Summary: The three changes that are present look correct and low-risk. They add O(n) zeroing overhead on allocation, reset, and the last prefill chunk — negligible compared to model inference time. The main question is whether the runner.reset() calls mentioned in the description are missing from this PR.

meta-codesync (Bot) changed the title from "clean up some runtime potential bugs" to "clean up some runtime potential bugs (#19447)" May 11, 2026
billmguo force-pushed the export-D104615993 branch from 9e0d34d to a77301d on May 11, 2026 17:33
billmguo added a commit to billmguo/executorch that referenced this pull request May 11, 2026
billmguo added a commit to billmguo/executorch that referenced this pull request May 11, 2026
billmguo force-pushed the export-D104615993 branch from a77301d to 82a6b89 on May 11, 2026 17:54
billmguo added a commit to billmguo/executorch that referenced this pull request May 11, 2026
billmguo force-pushed the export-D104615993 branch from 82a6b89 to 1fbd4e4 on May 11, 2026 18:34
billmguo force-pushed the export-D104615993 branch from 1fbd4e4 to 1cd3d16 on May 11, 2026 18:38
meta-codesync (Bot) merged commit 23a91d5 into pytorch:main on May 11, 2026
185 of 188 checks passed

Labels

CLA Signed · fb-exported · meta-exported
