
fix: defunctionalize pass on the caller runtime to apply #7100

Merged: 9 commits merged into master from fix-defunctionalize-runtime on Jan 17, 2025

Conversation

@aakoshh (Contributor) commented Jan 17, 2025

Description

Problem*

Extracted from #7072

Summary*

Updates defunctionalize to create the `apply` function with the runtime inherited from its caller.

If it's called from both ACIR and Brillig, two `apply` functions are created. At this point the functions have already been monomorphised, so having two versions lets us apply different rules per runtime. At the very least it keeps the logic simpler: if we use `--force-brillig` then we don't want ACIR functions to pop up; if we don't, then I suppose there wouldn't be any harm in calling an ACIR `apply` from Brillig, as it would be inlined.

Another potential issue with not separating the runtimes: if an ACIR-only instruction were ever added to `apply`, it could be inlined into a Brillig function and cause a panic later; although I'm not sure how this is avoided now (can't Brillig functions still call ACIR ones and inline their instructions?).
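As a rough illustration (not the actual compiler code), here is a minimal Rust sketch of keying the generated `apply` functions by caller runtime; `RuntimeType`, `Signature`, `FunctionId` and the builder shape are hypothetical stand-ins:

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the SSA types involved; the real pass has its
// own RuntimeType, Signature and FunctionId definitions.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
enum RuntimeType {
    Acir,
    Brillig,
}
type Signature = String;
type FunctionId = u32;

struct Defunctionalizer {
    // One apply per (signature, caller runtime): an ACIR caller and a Brillig
    // caller with the same signature get two distinct apply functions.
    apply_functions: HashMap<(Signature, RuntimeType), FunctionId>,
    next_id: FunctionId,
}

impl Defunctionalizer {
    /// Return the apply function for this signature in the caller's runtime,
    /// creating it on first use (the real pass would also build its body).
    fn apply_for(&mut self, signature: Signature, caller: RuntimeType) -> FunctionId {
        let next_id = &mut self.next_id;
        *self.apply_functions.entry((signature, caller)).or_insert_with(|| {
            let id = *next_id;
            *next_id += 1;
            id
        })
    }
}
```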

TODO:

  • Add test with expected runtime
  • Fix the SSA parser to handle function IDs as values
  • Cherry pick 64841c4
  • Add test to show that two applies are created if there are two different callers

See this and this comment about the increase in size of certain tests.

Additional Context

The problem with not inheriting the runtime was that it interfered with inlining: any Brillig function called from ACIR was considered an entry point and wasn't inlined into anything. On the other hand, the ACIR `apply` was inlined into its Brillig caller, which resulted in it being removed in the next inline pass, after which the original Brillig functions could be inlined in a third inline pass. Altogether confusing, and fixed in #7072.
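For context, the entry-point rule described above boils down to something like the following sketch; `Function` and its field are illustrative stand-ins, not the actual inliner code:

```rust
// A Brillig function with any ACIR caller counts as an inline entry point,
// so the inliner never inlines it away. Everything here is a stand-in.
struct Function {
    is_brillig: bool,
}

fn is_entry_point(callee: &Function, callers: &[&Function]) -> bool {
    callee.is_brillig && callers.iter().any(|caller| !caller.is_brillig)
}
```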

Documentation*

Check one:

  • No documentation needed.
  • Documentation included in this PR.
  • [For Experimental Features] Documentation to be submitted in a separate PR.

PR Checklist*

  • I have tested the changes locally.
  • I have formatted the changes with Prettier and/or cargo fmt on default settings.


github-actions bot commented Jan 17, 2025

Execution Report

| Program | Execution Time | % |
| --- | --- | --- |
| sha256_regression | 0.051s | 0% |
| regression_4709 | 0.001s | 0% |
| ram_blowup_regression | 0.608s | 1% |
| rollup-root | 0.105s | 0% |
| rollup-merge | 0.006s | 0% |
| rollup-block-root | 36.500s | 0% |
| rollup-block-merge | 0.105s | 0% |
| rollup-base-public | 1.226s | 0% |
| rollup-base-private | 0.452s | 0% |
| private-kernel-tail | 0.019s | 0% |
| private-kernel-reset | 0.313s | 0% |
| private-kernel-inner | 0.075s | 10% |


github-actions bot commented Jan 17, 2025

Compilation Report

| Program | Compilation Time | % |
| --- | --- | --- |
| sha256_regression | 1.160s | 16% |
| regression_4709 | 0.885s | 11% |
| ram_blowup_regression | 15.800s | -1% |
| rollup-root | 3.564s | -6% |
| rollup-merge | 2.028s | -7% |
| rollup-block-root-single-tx | 136.000s | -3% |
| rollup-block-root-empty | 1.998s | -8% |
| rollup-block-root | 136.000s | -3% |
| rollup-block-merge | 3.632s | -2% |
| rollup-base-public | 28.280s | -1% |
| rollup-base-private | 10.540s | 6% |
| private-kernel-tail | 1.018s | 3% |
| private-kernel-reset | 6.782s | 15% |
| private-kernel-inner | 1.962s | -7% |


github-actions bot commented Jan 17, 2025

Compilation Memory Report

| Program | Peak Memory |
| --- | --- |
| keccak256 | 77.65M |
| workspace | 123.98M |
| regression_4709 | 424.15M |
| ram_blowup_regression | 1.46G |
| rollup-root | 601.26M |
| rollup-merge | 494.24M |
| rollup-block-root-single-tx | 16.06G |
| rollup-block-root-empty | 488.41M |
| rollup-block-root | 16.07G |
| rollup-block-merge | 601.26M |
| rollup-base-public | 2.38G |
| rollup-base-private | 1.14G |
| private-kernel-tail | 207.34M |
| private-kernel-reset | 584.47M |
| private-kernel-inner | 294.71M |


github-actions bot commented Jan 17, 2025

Execution Memory Report

| Program | Peak Memory |
| --- | --- |
| keccak256 | 74.71M |
| workspace | 123.84M |
| regression_4709 | 316.02M |
| ram_blowup_regression | 512.62M |
| rollup-root | 498.31M |
| rollup-merge | 473.05M |
| rollup-block-root | 1.22G |
| rollup-block-merge | 498.33M |
| rollup-base-public | 734.17M |
| rollup-base-private | 590.51M |
| private-kernel-tail | 180.91M |
| private-kernel-reset | 245.52M |
| private-kernel-inner | 208.92M |


github-actions bot commented Jan 17, 2025

Changes to Brillig bytecode sizes

Generated at commit: a49c8624afc5bc815f925e875d80f4163f79f612, compared to commit: a6685befe3b103b236997843f69afc07b6ea0a8c

🧾 Summary (10% most significant diffs)

| Program | Brillig opcodes (+/-) | % |
| --- | --- | --- |
| higher_order_functions_inliner_max | +571 ❌ | +815.71% |
| brillig_fns_as_values_inliner_zero | +148 ❌ | +740.00% |
| higher_order_functions_inliner_zero | +59 ❌ | +8.86% |

Full diff report 👇
| Program | Brillig opcodes (+/-) | % |
| --- | --- | --- |
| higher_order_functions_inliner_max | 641 (+571) | +815.71% |
| brillig_fns_as_values_inliner_zero | 168 (+148) | +740.00% |
| higher_order_functions_inliner_zero | 725 (+59) | +8.86% |
| sha256_regression_inliner_max | 6,689 (+13) | +0.19% |
| sha256_inliner_max | 2,288 (+4) | +0.18% |
| sha256_var_padding_regression_inliner_max | 4,910 (+6) | +0.12% |
| hashmap_inliner_max | 19,770 (-8) | -0.04% |
| uhashmap_inliner_max | 12,703 (-8) | -0.06% |
| sha256_regression_inliner_min | 5,340 (-8) | -0.15% |
| sha256_regression_inliner_zero | 4,861 (-8) | -0.16% |
| sha256_var_padding_regression_inliner_min | 3,285 (-6) | -0.18% |
| sha256_var_padding_regression_inliner_zero | 2,957 (-6) | -0.20% |
| uhashmap_inliner_min | 9,804 (-24) | -0.24% |
| hashmap_inliner_zero | 7,947 (-22) | -0.28% |
| hashmap_inliner_min | 12,595 (-40) | -0.32% |
| uhashmap_inliner_zero | 7,518 (-37) | -0.49% |
| higher_order_functions_inliner_min | 1,432 (-12) | -0.83% |
| brillig_fns_as_values_inliner_min | 219 (-4) | -1.79% |


github-actions bot commented Jan 17, 2025

Changes to number of Brillig opcodes executed

Generated at commit: a49c8624afc5bc815f925e875d80f4163f79f612, compared to commit: a6685befe3b103b236997843f69afc07b6ea0a8c

🧾 Summary (10% most significant diffs)

| Program | Brillig opcodes (+/-) | % |
| --- | --- | --- |
| higher_order_functions_inliner_max | +1,054 ❌ | +1064.65% |
| brillig_fns_as_values_inliner_zero | +160 ❌ | +888.89% |
| higher_order_functions_inliner_zero | +131 ❌ | +9.88% |

Full diff report 👇
| Program | Brillig opcodes (+/-) | % |
| --- | --- | --- |
| higher_order_functions_inliner_max | 1,153 (+1,054) | +1064.65% |
| brillig_fns_as_values_inliner_zero | 178 (+160) | +888.89% |
| higher_order_functions_inliner_zero | 1,457 (+131) | +9.88% |
| sha256_regression_inliner_max | 111,646 (+141) | +0.13% |
| sha256_inliner_max | 13,197 (+4) | +0.03% |
| sha256_var_padding_regression_inliner_max | 196,670 (+6) | +0.00% |
| sha256_var_padding_regression_inliner_min | 270,969 (-24) | -0.01% |
| sha256_var_padding_regression_inliner_zero | 265,671 (-24) | -0.01% |
| sha256_regression_inliner_min | 161,181 (-28) | -0.02% |
| sha256_regression_inliner_zero | 155,415 (-28) | -0.02% |
| uhashmap_inliner_max | 137,636 (-60) | -0.04% |
| uhashmap_inliner_min | 229,698 (-108) | -0.05% |
| uhashmap_inliner_zero | 175,966 (-98) | -0.06% |
| hashmap_inliner_zero | 75,833 (-52) | -0.07% |
| hashmap_inliner_min | 113,401 (-108) | -0.10% |
| hashmap_inliner_max | 48,525 (-60) | -0.12% |
| higher_order_functions_inliner_min | 2,790 (-26) | -0.92% |
| brillig_fns_as_values_inliner_min | 268 (-6) | -2.19% |

@aakoshh aakoshh marked this pull request as ready for review January 17, 2025 13:35
@aakoshh aakoshh requested a review from a team January 17, 2025 13:38
@TomAFrench (Member) commented Jan 17, 2025

Pushed a small change to signal that we want to inline these functions more aggressively. The inliner isn't picking this up currently.

@aakoshh (Contributor, Author) commented Jan 17, 2025

That's a good idea; I was wondering about it too: the cost will always be above 0 by the looks of it, yet this seems like something that works better when inlined.

This is where InlineAlways should play a role in signalling. Did it not work?

@TomAFrench (Member) commented Jan 17, 2025
Nah, I looked at adding a line to the closure which determines whether to inline, forcing it to inline InlineAlways functions, and it still results in the same bytecode.
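For reference, the change being described amounts to something like this hypothetical short-circuit in the inline decision; the closure shape, `weight`, and `aggressiveness` are assumptions based on this thread, not the actual inliner code:

```rust
// Hypothetical shape of the inline decision: honour an explicit
// InlineAlways hint before consulting the cost model.
fn should_inline(always_inline: bool, weight: i64, aggressiveness: i64) -> bool {
    if always_inline {
        return true;
    }
    // Otherwise inline only while the estimated cost stays within budget.
    weight < aggressiveness
}
```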

@aakoshh (Contributor, Author) commented Jan 17, 2025

I think the InlineAlways worked; the problem is somewhere else. I'm testing with this:

```rust
fn main(w: Field) -> pub Field {
    let ret = twice(add1, 3);
    test_array_functions();
    w + ret
}
/// Test the array functions in std::array
fn test_array_functions() {
    let two = 2; // giving this a name, to ensure that the Option functions work with closures
    let myarray: [i32; 3] = [1, 2, 3];
    assert(myarray.any(|n| n > 2));
    assert(myarray.any(|n| n > two));

    let evens: [i32; 3] = myarray.map(|n| n * two); // [2, 4, 6]
    assert(evens.all(|n| n > 1));
    assert(evens.all(|n| n >= two));

    assert(evens.fold(0, |a, b| a + b) == 12);
    assert(evens.fold(0, |a, b| a + b + two) == 18);
    assert(evens.reduce(|a, b| a + b) == 12);
    assert(evens.reduce(|a, b| a + b + two) == 16);
    assert(evens.map(|n| n / 2) == myarray);
    assert(evens.map(|n| n / two) == myarray);
}

fn add1(x: Field) -> Field {
    x + 1
}

fn twice(f: fn(Field) -> Field, x: Field) -> Field {
    f(f(x))
}
```

Up to and including the Loop Invariant Code Motion pass, the difference between these calls is only in the inclusion of `inc_rc` on the Brillig side:

```sh
cargo run -q -p nargo_cli -- --program-dir . info --force --silence-warnings --inliner-aggressiveness 9223372036854775807 --show-ssa
cargo run -q -p nargo_cli -- --program-dir . info --force --silence-warnings --inliner-aggressiveness 9223372036854775807 --force-brillig --show-ssa
```

Then, after unrolling, in ACIR we are left with 13 blocks, vs 33 with `--force-brillig`. Maybe they don't count as small loops.

@aakoshh (Contributor, Author) commented Jan 17, 2025

Indeed, if I don't skip unrolling here, then the number of opcodes goes down from 641 to 23.

@aakoshh (Contributor, Author) commented Jan 17, 2025

Here are the loop statistics, used to decide if they are considered small or not:

```
STATS of b25: BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 1, all_instructions: 10 }; is_small=false
STATS of b23: BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 2, all_instructions: 11 }; is_small=false
STATS of b21: BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 1, all_instructions: 10 }; is_small=false
STATS of b19: BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 2, all_instructions: 10 }; is_small=true
STATS of b17: BoilerplateStats { iterations: 2, loads: 1, stores: 1, increments: 1, all_instructions: 29 }; is_small=false
STATS of b15: BoilerplateStats { iterations: 2, loads: 1, stores: 1, increments: 1, all_instructions: 19 }; is_small=false
STATS of b13: BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 1, all_instructions: 29 }; is_small=false
STATS of b11: BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 1, all_instructions: 19 }; is_small=false
STATS of b9: BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 1, all_instructions: 11 }; is_small=false
STATS of b7: BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 1, all_instructions: 9 }; is_small=true
STATS of b5: BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 2, all_instructions: 32 }; is_small=false
STATS of b3: BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 1, all_instructions: 10 }; is_small=false
STATS of b1: BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 1, all_instructions: 9 }; is_small=true
```

We have a check to limit bytecode growth percentage due to unrolling, but it's not set to anything by default. This `is_small` check is already preventing the loops from unrolling, regardless of what that limit would do. Only very small loops can be unrolled, because the constant overhead is not that big: the cost of keeping the loop is `all_instructions + 1`, versus `iterations * (all_instructions - increments - min(loads, stores) * 2 - 3)` when unrolled.
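That heuristic, as described, can be sketched like this; the struct mirrors the `BoilerplateStats` printed above, but the exact comparison in the compiler may differ:

```rust
// Mirrors the BoilerplateStats fields printed above; the comparison is a
// sketch of the described heuristic, not the exact compiler code.
struct BoilerplateStats {
    iterations: usize,
    loads: usize,
    stores: usize,
    increments: usize,
    all_instructions: usize,
}

impl BoilerplateStats {
    /// Cost of keeping the loop: all its instructions plus the back-edge.
    fn loop_cost(&self) -> usize {
        self.all_instructions + 1
    }

    /// Cost after unrolling: per iteration, the body minus the boilerplate
    /// (the increments, a load/store pair, and the loop bookkeeping).
    fn unrolled_cost(&self) -> usize {
        let boilerplate = self.increments + self.loads.min(self.stores) * 2 + 3;
        self.iterations * self.all_instructions.saturating_sub(boilerplate)
    }

    fn is_small(&self) -> bool {
        self.unrolled_cost() < self.loop_cost()
    }
}

fn main() {
    // b1 above: 3 * (9 - 1 - 2 - 3) = 9 < 9 + 1 = 10, so it is small.
    let b1 = BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 1, all_instructions: 9 };
    assert!(b1.is_small());
    // b3 above: 3 * (10 - 1 - 2 - 3) = 12 >= 10 + 1 = 11, so it stays a loop.
    let b3 = BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 1, all_instructions: 10 };
    assert!(!b3.is_small());
}
```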

I tried cheating by considering any loop with up to 15 instructions as small; it still resulted in 475 instructions. Only by unrolling all of them did it collapse down. Unfortunately, just unrolling b5 (the biggest) didn't do the trick; it's not special in that sense.

@aakoshh (Contributor, Author) commented Jan 17, 2025

On master it seems to be similar to #7072 (comment):

  • apply was ACIR, while main is Brillig
  • apply gets inlined into main, because apply is ACIR and currently that means inline-unless-entry
  • because apply is ACIR, it marked every Brillig function that called it as an entry point that should not be inlined, so after the 1st inlining a lot more functions are kept on master, whereas here we end up with just main, fully inlined (this is an oversight which is fixed in feat(ssa): Pass to preprocess functions #7072)
  • then comes the kicker: because on master many of the Brillig functions were actually not inlined, the loops into which they would have been inlined stayed smaller, and therefore they get unrolled:
```
BOILERPLATE STATS of b25: BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 1, all_instructions: 10 }; is_small=false
BOILERPLATE STATS of b23: BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 2, all_instructions: 10 }; is_small=true
BOILERPLATE STATS of b21: BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 1, all_instructions: 10 }; is_small=false
BOILERPLATE STATS of b19: BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 2, all_instructions: 10 }; is_small=true
BOILERPLATE STATS of b17: BoilerplateStats { iterations: 2, loads: 1, stores: 1, increments: 1, all_instructions: 8 }; is_small=true
BOILERPLATE STATS of b15: BoilerplateStats { iterations: 2, loads: 1, stores: 1, increments: 1, all_instructions: 8 }; is_small=true
BOILERPLATE STATS of b13: BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 1, all_instructions: 8 }; is_small=true
BOILERPLATE STATS of b11: BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 1, all_instructions: 8 }; is_small=true
BOILERPLATE STATS of b9: BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 1, all_instructions: 9 }; is_small=true
BOILERPLATE STATS of b7: BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 1, all_instructions: 9 }; is_small=true
BOILERPLATE STATS of b5: BoilerplateStats { iterations: 0, loads: 1, stores: 1, increments: 2, all_instructions: 10 }; is_small=true
BOILERPLATE STATS of b3: BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 1, all_instructions: 9 }; is_small=true
BOILERPLATE STATS of b1: BoilerplateStats { iterations: 3, loads: 1, stores: 1, increments: 1, all_instructions: 9 }; is_small=true
```

@aakoshh aakoshh added this pull request to the merge queue Jan 17, 2025
Merged via the queue into master with commit dd70845 Jan 17, 2025
99 checks passed
@aakoshh aakoshh deleted the fix-defunctionalize-runtime branch January 17, 2025 21:13