@huitseeker huitseeker commented Jan 5, 2026

This PR merges main into next, bringing the decorator bypass optimization from #2529 into the next development branch.

What's Being Merged

The decorator bypass optimization that was merged to main via #2529. It includes the following changes:

  1. Gate all decorator retrieval behind in_debug_mode checks
  2. Prevent expensive decorator traversal when not in debug mode
  3. Maintain proper decorator storage structure even when stripped

* fix: bypass decorator retrieval in release mode

Gate all decorator retrieval calls behind `in_debug_mode` checks,
ensuring zero overhead when debugging is disabled.

Processor changes:
- before_enter/after_exit decorator loops
- decorators_for_op in basic block execution

FastProcessor changes:
- execute_before_enter_decorators early return
- execute_after_exit_decorators early return
- decorators_for_op in basic block execution

Includes spy tests to verify retrieval is bypassed.
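
The gating pattern and the spy-test idea described above can be sketched as follows. This is a minimal illustration, not the code from this PR: `SpyDecoratorStore` and `execute_ops` are hypothetical names, and only `decorators_for_op` and `in_debug_mode` are taken from the commit message. The store counts lookups so a test can check that retrieval is skipped entirely when debugging is off.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for decorator storage. It counts lookups so a test
/// can observe whether retrieval happened at all.
pub struct SpyDecoratorStore {
    decorators: Vec<Vec<&'static str>>, // decorators attached to each op index
    lookups: Cell<usize>,
}

impl SpyDecoratorStore {
    pub fn new(decorators: Vec<Vec<&'static str>>) -> Self {
        Self { decorators, lookups: Cell::new(0) }
    }

    /// Stands in for the expensive traversal; every call is counted.
    pub fn decorators_for_op(&self, op_idx: usize) -> &[&'static str] {
        self.lookups.set(self.lookups.get() + 1);
        match self.decorators.get(op_idx) {
            Some(d) => d.as_slice(),
            None => &[],
        }
    }

    pub fn lookups(&self) -> usize {
        self.lookups.get()
    }
}

/// Runs `num_ops` operations and returns the decorators that were executed.
pub fn execute_ops(
    store: &SpyDecoratorStore,
    num_ops: usize,
    in_debug_mode: bool,
) -> Vec<&'static str> {
    let mut executed = Vec::new();
    for op_idx in 0..num_ops {
        // The gate: skip retrieval entirely unless debugging is enabled.
        if in_debug_mode {
            executed.extend_from_slice(store.decorators_for_op(op_idx));
        }
        // ... the operation itself would execute here ...
    }
    executed
}
```

With `in_debug_mode` set to false the lookup counter stays at zero even though decorators are present in the store, which is the property a spy test would assert.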

* fix(core): strip decorators while maintaining valid CSR structure

Update strip_decorators() to create an empty but valid CSR structure
instead of calling clear(), which removed the structure entirely and
caused panics when accessing decorator information after stripping.

- Add DebugInfo::empty_for_nodes(num_nodes) to create valid empty CSR
- Update from_components to accept empty structures
- Add edge case tests for empty forest, idempotency, multiple node types
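
To illustrate why an empty but valid CSR differs from a cleared one, here is a minimal sketch. `DecoratorCsr` is a hypothetical simplification, not the actual `DebugInfo` type; only the names `empty_for_nodes` and `strip_decorators` come from the commit message, and the field layout is an assumption.

```rust
/// Hypothetical CSR (compressed sparse row) mapping from node index to
/// decorator ids. Invariant: `row_offsets.len() == num_nodes + 1`, and the
/// row for node `i` is `data[row_offsets[i]..row_offsets[i + 1]]`.
#[derive(Debug, Clone, PartialEq)]
pub struct DecoratorCsr {
    row_offsets: Vec<usize>,
    data: Vec<u32>,
}

impl DecoratorCsr {
    pub fn from_rows(rows: &[Vec<u32>]) -> Self {
        let mut row_offsets = Vec::with_capacity(rows.len() + 1);
        let mut data = Vec::new();
        row_offsets.push(0);
        for row in rows {
            data.extend_from_slice(row);
            row_offsets.push(data.len());
        }
        Self { row_offsets, data }
    }

    /// Empty but valid: every node still has a row, and every row is empty.
    pub fn empty_for_nodes(num_nodes: usize) -> Self {
        Self { row_offsets: vec![0; num_nodes + 1], data: Vec::new() }
    }

    pub fn num_nodes(&self) -> usize {
        self.row_offsets.len() - 1
    }

    /// Panics if `node >= num_nodes()`. Clearing the offsets outright would
    /// make this fail for every node, which is the failure mode described.
    pub fn decorators(&self, node: usize) -> &[u32] {
        &self.data[self.row_offsets[node]..self.row_offsets[node + 1]]
    }

    /// Strips all decorators while keeping one (empty) row per node.
    pub fn strip_decorators(&mut self) {
        *self = Self::empty_for_nodes(self.num_nodes());
    }
}
```

In this sketch, stripping is idempotent and lookups after stripping return an empty slice instead of panicking, which mirrors the edge cases listed above.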

* fix(processor): gate error context decorator access with in_debug_mode

This fix resolves a 20x performance degradation in release mode when
decorators were present in the MastForest but not being executed.

Root Cause:
The err_ctx! macro was unconditionally calling node.get_assembly_op(),
which traversed the CSR decorator storage on every operation (522,059
times in blake3 benchmark), even when in_debug_mode was false. This
caused execution time to increase from 191ms to 3,884ms.

Solution:
Modified the error context creation pipeline to accept and respect the
in_debug_mode flag:
- Updated err_ctx! macro to require in_debug_mode parameter
- Updated ErrorContextImpl::new() and new_with_op_idx() to accept
  in_debug_mode
- Modified precalc_label_and_source_file() to return early when
  !in_debug_mode, avoiding expensive decorator traversal
- Updated all err_ctx!() call sites in Process and FastProcessor to
  pass in_debug_mode flag
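
A rough sketch of the early-return shape described in this list, under stated assumptions: `Node`, `ErrorContext`, and `precalc_label` are hypothetical simplifications of the real types and of `precalc_label_and_source_file()`, and the lookup counter exists only to make the bypass observable.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a MAST node; counts AsmOp lookups.
pub struct Node {
    asm_op_label: &'static str,
    lookups: Cell<usize>,
}

impl Node {
    pub fn new(asm_op_label: &'static str) -> Self {
        Self { asm_op_label, lookups: Cell::new(0) }
    }

    /// Stands in for the expensive decorator traversal.
    pub fn get_assembly_op(&self) -> &'static str {
        self.lookups.set(self.lookups.get() + 1);
        self.asm_op_label
    }

    pub fn lookups(&self) -> usize {
        self.lookups.get()
    }
}

/// Simplified error context: only a label, no source file.
pub struct ErrorContext {
    label: Option<&'static str>,
}

impl ErrorContext {
    pub fn new(node: &Node, in_debug_mode: bool) -> Self {
        Self { label: precalc_label(node, in_debug_mode) }
    }

    pub fn label(&self) -> Option<&'static str> {
        self.label
    }
}

/// Returns early when not debugging, so the node's decorators are never touched.
fn precalc_label(node: &Node, in_debug_mode: bool) -> Option<&'static str> {
    if !in_debug_mode {
        return None;
    }
    Some(node.get_assembly_op())
}
```

The trade-off in this shape is that errors raised outside debug mode carry no label or source location, in exchange for skipping the lookup on every operation.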

Performance Impact:
- Before: Op execution time 3,884ms (7,439ns/op)
- After: Op execution time 191ms (366ns/op)
- Improvement: 20.3x faster

When in_debug_mode is false, decorators (including AsmOp decorators
used for error context) are no longer accessed, even if present in
the MastForest.

* chore: Changelog
@huitseeker huitseeker requested review from adr1anh and bobbinth January 5, 2026 11:09
@huitseeker huitseeker force-pushed the huitseeker/merge-main-into-next branch from 7bdf5b5 to ae25ef5 Compare January 5, 2026 11:24
@huitseeker huitseeker marked this pull request as ready for review January 5, 2026 11:27
@huitseeker huitseeker added the no changelog This PR does not require an entry in the `CHANGELOG.md` file label Jan 5, 2026
@huitseeker huitseeker marked this pull request as draft January 5, 2026 11:59
This merge brings the decorator bypass optimization from main (#2529) into next.

The changes adapt the decorator bypass optimization to next's architecture:
- No changes to legacy Process (removed in next)
- All changes apply to FastProcessor only
- Decorator retrieval gated behind in_debug_mode checks
- Error context creation respects in_debug_mode flag

Performance impact: ~10% overall speedup, 99.7% reduction in trace execution time.

Conflicts resolved using solutions from huitseeker/decorator-bypass-on-next rebase work.
@huitseeker huitseeker force-pushed the huitseeker/merge-main-into-next branch from ae25ef5 to 2112b5e Compare January 5, 2026 13:32
@huitseeker huitseeker marked this pull request as ready for review January 5, 2026 13:46
@plafer plafer left a comment

LGTM!

@bobbinth bobbinth left a comment

Looks good! Thank you!

A couple of comments for the future (which I also mentioned in other places):

  • We may want to simplify the execute_op_batch() method so that it takes node_id instead of basic_block as a parameter, and more generally, think about how to simplify the parameters passed into this method.
  • Ideally, with some future refactorings, stripping decorators would result in full removal of the DebugInfo struct (and we should probably rename strip_decorators() to clear_debug_info() or something similar).

@bobbinth bobbinth merged commit d72d043 into next Jan 5, 2026
16 checks passed
@bobbinth bobbinth deleted the huitseeker/merge-main-into-next branch January 5, 2026 17:29

bobbinth commented Jan 5, 2026

Ah - I should have merged this w/o squashing. Sorry! Will make a commit into next that fixes this.

plafer commented Jan 5, 2026

> Ah - I should have merged this w/o squashing. Sorry! Will make a commit into next that fixes this.

Given that my PR is stacked on this, it's probably fine/easier to leave that one as is? If we ever need to revert/bisect, it's still a relatively small commit, so it should be fine.
