Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip javascriptcore's first parse step for ES Modules #15758

Merged
merged 73 commits into from
Jan 10, 2025

Conversation

pfgithub
Copy link
Contributor

@pfgithub pfgithub commented Dec 14, 2024

First half of #7384. Does not fix that problem, but gets closer to fixing it.

This pr:

  • Adds a test to prepare for when 7384 is fixed (marked with {todo: true} (Adds support to mark a test file as --todo with {todo: "true"} #15633)
  • Constructs ModuleInfo when printing with all the information needed to create a JavaScriptCore JSModuleRecord
  • Replaces javascriptcore's first parsing step to generate the module record with our own generated module record
  • Adds an env var COMPILE_ERRORS_ONLY=1 bun run build for building bun that skips codegen. This shows errors ~50% faster and can often reveal an error that was crashing zig previously

TODO:

  • Make sure memory is freed in the case where the module is discarded before it is loaded?

Check:

  • Make sure cached ModuleInfo in RuntimeTranspilerCache is used
  • Test all the types of module imports and exports and re-exports and star exports and make sure they still work
  • Consider whether to run the module_info building as a seperate pass at the end of printAst rather than integrating it into js_printer
  • Consider using Ast.import records, export records, scope, ... to find the needed items rather than finding them in printer. There are a few special cases like the jest thing (but maybe could be solved by moving that transformation to a step before printing?)
  • Does setFromUtf8 need to be used? What happens if it is called on a bad string containing non-utf8 bytes? Is this unsafe? Is it faster to use a setFromAscii fn when possible?
  • remove the eqlComptime("*default*") check somehow (either just remove it and it's fine or add special indexes for those)
  • Is ModuleInfo information needed in standalone module graph? see: module_loader.zig line 2633

Performance impact:

  • large file, with runtime transpiler cache: ~15% ± 1% faster (node is ~2x faster than bun at this)
  • large file, without runtime transpiler cache: ~5% ± 1% faster
  • large file with only one import: 12% faster with runtime transpiler cache, 5% faster without
  • small file, with imports: ~1% ± 2% slower
  • small file, without imports: ~1% ± 2% slower
  • (todo) real-world imports
Benchmark
// genlarge.js
let large_export_top = "";
let large_export_bottom = "export {\n";
let large_import_top = "import {\n";
let large_import_bottom = `function expect(a, b, c) {
    if(b !== a || c !== b) throw new Error("failure");
}\n`;

for(let i = 0; i < 30_000; i++) {
    large_export_top += `export const _${i} = ${i};\n`;
    large_export_bottom += "    _"+i+" as _"+i+"_d,\n";
    large_import_top += `    _${i}, _${i}_d,\n`;
    large_import_bottom += `expect(${i}, _${i}, _${i}_d);\n`;
}
large_export_bottom += "};\n";
large_import_top += "} from './large2.js';\n"

await Bun.write("large2.js", large_export_top + large_export_bottom);
await Bun.write("large.js", large_import_top + large_import_bottom);
// small.js
import { b } from "./small2.js";

console.log(b);
// small2.js
export const b = "abc";
// noimports.js
console.log("wow, no imports");
// runbench.js
const cmds = [
    "./bun-15758",
    "./bun-main",
];
const targets = [
    "./large.js woutrtc",
    "./large.js withrtc",
    "./largeni.js woutrtc",
    "./largeni.js withrtc",
    "./small.js",
    "./noimports.js",
];
for(const target of targets) {
    console.log("\n\n\nrunning benchmark:", target);
    await Bun.spawnSync({
        cmd: ["hyperfine", ...cmds.map(cmd => cmd + " " + target), "--warmup=5", "--shell=none"],
        env: {
            BUN_RUNTIME_TRANSPILER_CACHE_PATH: target.includes("withrtc") ? process.env.PWD + "/.rtc" : "0",
        },
        stdio: ["inherit", "inherit", "inherit"],
    });
}
// genlargeni.js
let large_export = "export const m = 25;\n";
let large_import = "import { m, n } from './largeni2.js';\n";
large_import += "console.log(m, n);\n";

for(let i = 0; i < 30_000; i++) {
    large_export += "console.log(1);\n";
    large_import += "console.log(1);\n";
}
large_export += "export const n = 25;\n";

await Bun.write("largeni2.js", large_export);
await Bun.write("largeni.js", large_import);

Run benchmark:

bun genlarge.js
bun genlargeni.js
bun runbench.js

@robobun
Copy link

robobun commented Dec 14, 2024

Updated 4:12 PM PT - Jan 9th, 2025

@pfgithub, your commit 3daed8d has passed in #9326! 🎉


🧪   try this PR locally:

bunx bun-pr 15758

@pfgithub pfgithub marked this pull request as ready for review January 6, 2025 21:48
@pfgithub pfgithub marked this pull request as draft January 8, 2025 01:43
@pfgithub pfgithub marked this pull request as ready for review January 10, 2025 02:55
@Jarred-Sumner Jarred-Sumner merged commit ccc7bde into main Jan 10, 2025
69 checks passed
@Jarred-Sumner Jarred-Sumner deleted the pfg/only-parse-twice branch January 10, 2025 03:31
Jarred-Sumner added a commit that referenced this pull request Jan 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants