Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

doc: add synchronous hooks proposal #198

Merged
merged 10 commits into from
Nov 5, 2024
Merged

Conversation

joyeecheung
Copy link
Member

cc @nodejs/loaders

Spinned from nodejs/node#52219 - I ended up writing it down as a proposal here, because there are some high-level questions and re-design ideas after I looked into existing usages of CJS monkey-patching. While resolve and load hooks work well enough for some cases, it also seems common in the wild to just override specifier -> module mapping directly, so I sketched out some higher level hooks as alternatives here as well.

Copy link
Member

@GeoffreyBooth GeoffreyBooth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! This seems close to being ready to implement.

doc/design/proposal-synchronous-hooks.md Outdated Show resolved Hide resolved
doc/design/proposal-synchronous-hooks.md Outdated Show resolved Hide resolved
doc/design/proposal-synchronous-hooks.md Outdated Show resolved Hide resolved
doc/design/proposal-synchronous-hooks.md Outdated Show resolved Hide resolved
doc/design/proposal-synchronous-hooks.md Outdated Show resolved Hide resolved
doc/design/proposal-synchronous-hooks.md Outdated Show resolved Hide resolved
doc/design/proposal-synchronous-hooks.md Outdated Show resolved Hide resolved
doc/design/proposal-synchronous-hooks.md Outdated Show resolved Hide resolved
doc/design/proposal-synchronous-hooks.md Outdated Show resolved Hide resolved
doc/design/proposal-synchronous-hooks.md Outdated Show resolved Hide resolved
doc/design/pirates.js Outdated Show resolved Hide resolved
@joyeecheung
Copy link
Member Author

joyeecheung commented Jun 6, 2024

After some thoughts I think the higher level requires()/link() hooks that span across resolve/load make more sense than lower level exports()/link() hooks that run after load because they are more powerful. The lower level hooks can only run things after load. If they need to get information (e.g. specifier) from previous steps the hook author needs to also override the earlier hooks (resolve/load) to save them somewhere. But with the higher level hooks they can choose to either run things before resolve() or after load() or they can do both in one closure which is a lot more convenient. Or it's possible to re-implement the lower-level exports()/link() using the higher-level hook, but not the other way around. Take require for example, it's possible to re-implement the after-load exports() with requires() like this:

function exports(url, context) { ... }

function requires(specifier, context, nextRequires) {
  // requires hook can choose to run something here, exports can't.
  // If hooks want to simply hijack a request and return early, skipping
  // resolve and load, they can do it here. They also don't need to care
  // about how different specifiers can be mapped to the same URL.
  if (specifier === 'pnpapi') {
    return { url: 'some://dummy/url', exports: pnpApiObject };
  }
  const result = nextRequires(specifier, context);  // Invokes resolve and load steps
  exports(result.url, result);
}

Or to re-implement the after-load link hook using the high-level link hook for import requests:

function linkAfterLoad(url, context) { ... }

const pnpApiModule = new vm.SyntheticModule([...], () => {
   pnpApiModule.setExports('default', pnpApiObject);
});

function link(specifier, context, nextLink) {
  // link hook can choose to run something here, linkAfterLoad can't.
  // If hooks want to simply hijack a request and return early, skipping
  // resolve and load, they can do it here. They also don't need to care
  // about how different specifiers can be mapped to the same URL.
  if (specifier === 'pnpapi') {
    return { url: 'some://dummy/url', module: pnpApiModule };
  }
  const result = nextLink(specifier, context);  // Invokes resolve and load steps
  linkAfterLoad(result.url, result);
}

In addition, if these hooks need to invoke say require.resolve() in an algorithm to determine whether they need to hijack a request, since they don't need to override resolve hooks to retain any information themselves, they won't run into weird infinite recursions.

@jsumners-nr
Copy link

After some thoughts I think the higher level...

I like this assessment.

@GeoffreyBooth
Copy link
Member

GeoffreyBooth commented Jun 6, 2024

After some thoughts I think the higher level requires()/link() hooks that span across resolve/load make more sense than lower level exports()/link() hooks

Let’s say I want to write CoffeeScript and have it instrumented. The CoffeeScript hooks might implement just load, or resolve and load, per https://nodejs.org/api/module.html#transpilation. Then if I also want to register my instrumentation library, and it uses requires/link, do the CoffeeScript hooks never get called because they’ve been bypassed? Or are you suggesting that we don’t have resolve/load anymore, just requires and link?

As an author, personally what I find easiest to follow is hooks that correspond to steps in the process. So if there’s a step before resolution, and a step after loading, then those can be additional hooks that still run in the sequence: something before resolution, resolve, load, something after loading.

@joyeecheung
Copy link
Member Author

joyeecheung commented Jun 7, 2024

Then if I also want to register my instrumentation library, and it uses requires/link, do the CoffeeScript hooks never get called because they’ve been bypassed? Or are you suggesting that we don’t have resolve/load anymore, just requires and link?

I am suggesting to have resolve and load and requires/link that wrap around the previous two steps. Hooks that only care about resolution implement resolve, hooks that only care about loading, (this is a bad name for source but meh it's already there) , like transpilers, implement load. These two apply to both CJS and ESM. Hooks that care about the CJS require results implement requires, hooks that care about ESM module import results implement link, hooks that care about both implement both (we can't unify this part in one hook due to design of ESM being essentially different from CJS - everything is immutable after linking, which happens before evaluation, so it's impossible to have a post-evaluation step that modifies the linking results surfaced to import).

@joyeecheung
Copy link
Member Author

joyeecheung commented Jun 10, 2024

After attempting to prototype the high-level wrappers in the ESM loader I realized that there are some nuances in the current design - the nextLoad and nextLink (or similar) in the ESM loader currently isn't easily synchronous. The only case where they have to be asynchronous is when there are http/https imports in the graph. Others are more superficial. So I think we have some options:

  1. Don't go with the nextHook design but instead split the hooks into something like beforeLoad/afterLoad, so users don't have to invoke the next hook. If they wish to skip the rest of the hooks, they can just return shortCircuit: true.
  2. Keep the nextHook design but throw an error or ignore it for http/https imports.
  3. Do a combination of both - we provide both beforeLoad/afterLoad that work with http/https imports, as well as load(..., nextLoad) which throws when http/https imports are encountered. Hooks that need to support http/https can implement the former, hooks that don't implement the latter. This also means we don't need to split resolve hooks which are still synchronous in all cases.

@arcanis
Copy link
Contributor

arcanis commented Jun 11, 2024

  1. Don't go with the nextHook design but instead split the hooks into something like beforeLoad/afterLoad, so users don't have to invoke the next hook. If they wish to skip the rest of the hooks, they can just return shortCircuit: true.

How does that work for loaders that are prerequisite for subsequent loaders to work? For example let's say a TS loader needing to resolve & load files from within zip archives?

@jsumners-nr
Copy link

  1. Don't go with the nextHook design but instead split the hooks into something like beforeLoad/afterLoad, so users don't have to invoke the next hook. If they wish to skip the rest of the hooks, they can just return shortCircuit: true.

I like this idea. I find the nextHook pattern difficult to reason through.

@mcollina
Copy link
Member

The only case where they have to be asynchronous is when there are http/https imports in the graph.

The alternative is to provide the DeasyncWorker we talked about, and use it to implement a synchronous http client we use for this. In the end that's what we will be telling our users to do.

@joyeecheung
Copy link
Member Author

joyeecheung commented Jun 12, 2024

How does that work for loaders that are prerequisite for subsequent loaders to work? For example let's say a TS loader needing to resolve & load files from within zip archives?

Can you describe how the module request would look like in this case? Would that just override the resolution/loading completely or still use part of the default logic? For example assuming that the request is require('foo') or import 'foo' and you want to make it load from a foo.ts in /path/to/package.zip, which gets transpiled to commonjs (so everything gets overridden completely):

function beforeResolve(specifier, context) {
  if (specifier === 'foo') {
    const url = 'file:///path/to/package.zip?file=foo.ts';  // Custom resolution
    return { format: 'zip+typescript', url, shortCircuit: true };
  }
  return { };  // do nothing, and the default resolution logic gets used.
}

function beforeLoad(url, context) {
  if (context.format === 'zip+typescript') {
    const source = unzipAndTranspile(url);  // Unzip path/to/package.zip,  and transpile foo.ts
    return { source, format: 'commonjs', shortCircuit: true };
  }
  return { };  // do nothing, and the default loading logic gets called.
}

(Not sure if the search params can be preserved in CJS loader yet, given the compat issue of Module._cache keys already described in the doc - if not, the best we can do is to pass some data via context on first load of the module that gets mapped to a specific file path - CJS loader only supports file URLs, after all, or hooks can customize the URL as something like file:///path/to/package.zip/foo.ts and they know how to extract a foo.ts out of /path/to/package.zip/ using this pattern).

Or if you are thinking about composing a zip loader + a typescript loader:

// Zip loader
function beforeResolve(specifier, context) {
  if (specifier === 'foo') {
    const url = 'file:///path/to/package.zip/foo.ts';  // Custom resolution
    return { format: 'zip', url, shortCircuit: true };
  }
  return { };  // do nothing, and the default resolution logic gets used.
}

function beforeLoad(url, context) {
  if (context.format === 'zip') {
    const source = unzip(url);  // Unzip path/to/package.zip, and locate foo.ts
    return { source, shortCircuit: true };
  }
  return { };  // do nothing, and the default loading logic gets called.
}

// TypeScript loader
function afterLoad(url, source, context) {
  if (url.endsWith('.ts')) {
    return { source: transpile(source), format: 'commonjs' };
  }
  return { };  // do nothing, and the default loading logic gets called.
}

(I think we might need to separate shortCircuit options into two: one that skips other hooks of the same kind and the default logic, one in before hooks that also skip after hooks).

@joyeecheung
Copy link
Member Author

joyeecheung commented Jun 12, 2024

The alternative is to provide the DeasyncWorker we talked about, and use it to implement a synchronous http client we use for this. In the end that's what we will be telling our users to do.

Yeah one alternative is to provide load(url, context, nextLoad) where the default nextLoad only works with non-network loading; Some user-land hook can be developed to support synchronous network loading which can be something like:

function load(url, context, nextLoad) {
   if (url.startsWith('http')) {
      const source = fetchSync(url);
      return { source, shortCircuit: true };
   }
   return nextLoad(url, context);
}

..which needs to be ensured to be at the bottom of the hook chain (next to the default one).

One downside of this is the fetching would be sequential and blocking even for ESM which is designed to make parallel loading of module requests in the same module possible. Theoretically with a slightly different design I think it's still possible to make the fetching part almost parallel:

function load(url, context, nextLoad) {
   if (url.startsWith('http')) {
      const syncResponse = fetchSyncPooled(url);  // This adds a fetch request to a pool
      return {
          // This blocks until the result is fetched; by default if this is the only load hook, the ESM loader
          // runs the hooks to get the source provider functions first (then the requests are all started)
          // and then run over the source provider functions to get the source (and start blocking on the
          // first module request).
          getSource() { return syncResponse.readAsBuffer(); } ,
          shortCircuit: true
      };
   }
   return nextLoad(url, context);
}

This would require significant refactoring in the ESM loader to make the creation of ModuleWraps lazy but it allows the loader to start the fetch requests sequentially fairly quickly in a pool (without waiting for the previous module request to actually complete), and then the worker pool can handle the requests in parallel. Any getSource() invocation from the main thread loader would block until the response is fetched, but at the mean time other threads in the worker pool can fetch other URLs just fine. Although with this the throughput is still not fully on par with fully asynchronous loading (uncle/aunt modules need to block on the loading of nephew modules) but at least this can offer parallelism on a single-module level.

On the other hand I think this might be a more compelling reason to split the hooks - with the nextLoad design, if the other hook invokes nextLoad to get the source, it can still end up too eager and few requests get parallelized; But if the hooks are split, the hooks that need the source would override afterLoad instead while the http loader can override beforeLoad, and the http loader can still parallelize as much as possible. Or it can do something like this to increase throughput:

// This needs to be the last of the beforeLoad chain:
function beforeLoad(url, context) {
   if (url.startsWith('http')) {
      const syncResponse = fetchSyncPooled(url);  // This adds a fetch request to a pool
      responseMap.set(url, syncResponse);
      return {
          source: 'dummy',
          shortCircuit: true
      };
   }
   return {};
}

// Node.js can run the beforeLoad eagerly for the module requests, and
// only start running afterLoad when the all beforeLoads of parallel module
// requests are run

// This needs to be the first of the afterLoad chain:
function afterLoad(url, source, context, nextLoad) {
   if (url.startsWith('http')) {
      // source was dummy at this point.
      const syncResponse = responseMap.get(url);
      return {
          source: syncResponse.readAsBuffer(),
      };
   }
   return {};
}

@Qard
Copy link
Member

Qard commented Jun 12, 2024

Would it make sense to have a think like the "link" hook that works for both module formats? I think we would generally prefer a single interface to work with, even if it's a bit sub-optimal for certain cases. We want to be able to apply CJS-based patches to ESM rewrites when they happen without needing to rewrite the patches. That's not a huge deal though.

@joyeecheung
Copy link
Member Author

joyeecheung commented Jun 12, 2024

Would it make sense to have a think like the "link" hook that works for both module formats?

I don't think so, because this is where one of the key difference of CJS and ESM lies: ESM linking (dependency resolution) is designed to be done before evaluation, for all module requests in the module, which relies on static analysis based on the import syntax, and it's also what allows parallel loading of ESM dependencies; CJS dependency resolution is done during evaluation (in the form of require() invocations), and therefore forces everything to be done sequentially. It's not impossible do some static analysis in CJS to detect dependencies, but it's bound to be hacky and can lead to corner cases (e.g. if the module renames require() or do a bunch of runtime magic with it, then the static analysis is likely to fail, not to mention that require() can take a string computed at run time instead of a static string, so any require(foo + bar) is not analyzable), at that point it would've been better to just do multiple hooks to ensure correctness.

@GeoffreyBooth GeoffreyBooth added the loaders-agenda Issues and PRs to discuss during the meetings of the Loaders team label Jun 13, 2024
@Qard
Copy link
Member

Qard commented Jun 14, 2024

I'm aware of the difference in order/timing. I was thinking more if some interface could exist which abstracts that away. Like if you allow the CJS side to see what the export names are without directly accessing them you could do a similar "replace when available" kind of semantic as we have to do with ESM. I feel like a similar shaped API should be possible between the two, it just might not be quite as friendly. 🤔

Anyway, two APIs are probably liveable, we'll just have to make sure our instrumentations can handle both scenarios with (hopefully) minimal breakage.

doc/design/proposal-synchronous-hooks.md Outdated Show resolved Hide resolved
doc/design/proposal-synchronous-hooks.md Outdated Show resolved Hide resolved
@joyeecheung
Copy link
Member Author

joyeecheung commented Jun 14, 2024

Like if you allow the CJS side to see what the export names are without directly accessing them you could do a similar "replace when available" kind of semantic as we have to do with ESM. I feel like a similar shaped API should be possible between the two, it just might not be quite as friendly. 🤔

That sounds cool but I think it would be more flexible if the built-in loader API provides the low-level hooks, and if a unified hook is desired, some user-land package can be developed on top of them (like a package that just unify import-in-the-middle and require-in-the-middle). Otherwise by only providing high-level unified hooks in core we will deprive the ability to do low-level customization completely from developers who need them, and only cater to the needs of developers who don't care about the differences.

@Qard
Copy link
Member

Qard commented Jun 14, 2024

Fair point. I'd be happy to help with constructing whatever that higher-level thing is later, but probably would want that to live in the Node.js org just to have an unbiased home for it that all the APM vendors can contribute to it. (Similar to what we're trying to do by donating IITM at the moment.)

@joyeecheung
Copy link
Member Author

The latest prototype is in https://github.com/joyeecheung/node/tree/sync-hooks-6 - I've got it working for ESM too, still need a bunch of testing to be opened as PR tough.

joyeecheung added a commit to joyeecheung/node that referenced this pull request Sep 4, 2024
This lays the foundation for adding synchronous hooks proposed in
nodejs/loaders#198. In addition this
corrects and adds several JSDoc comments for internal functions
of the ESM loader, as well as explaining how require() for
import CJS work in the special resolve/load paths. This doesn't
consolidate it with import in require(esm) yet due to caching
differences, which is left as a TODO.
joyeecheung added a commit to joyeecheung/node that referenced this pull request Sep 4, 2024
This lays the foundation for supporting synchronous hooks proposed
in nodejs/loaders#198 for ESM. In addition
this corrects and adds several JSDoc comments for internal functions
of the ESM loader, as well as explaining how require() for
import CJS work in the special resolve/load paths. This doesn't
consolidate it with import in require(esm) yet due to caching
differences, which is left as a TODO.
joyeecheung added a commit to joyeecheung/node that referenced this pull request Sep 4, 2024
This lays the foundation for supporting synchronous hooks proposed
in nodejs/loaders#198 for ESM.

- Corrects and adds several JSDoc comments for internal functions
  of the ESM loader, as well as explaining how require() for
  import CJS work in the special resolve/load paths. This doesn't
  consolidate it with import in require(esm) yet due to caching
  differences, which is left as a TODO.
- The moduleProvider passed into ModuleJob is replaced as
  moduleOrModulePromise, we call the translators directly in the
  ESM loader and verify it right after loading for clarity.
- Reuse a few refactored out helpers for require(esm) in
  getModuleJobForRequire().
joyeecheung added a commit to joyeecheung/node that referenced this pull request Sep 4, 2024
This lays the foundation for supporting synchronous hooks proposed
in nodejs/loaders#198 for ESM.

- Corrects and adds several JSDoc comments for internal functions
  of the ESM loader, as well as explaining how require() for
  import CJS work in the special resolve/load paths. This doesn't
  consolidate it with import in require(esm) yet due to caching
  differences, which is left as a TODO.
- The moduleProvider passed into ModuleJob is replaced as
  moduleOrModulePromise, we call the translators directly in the
  ESM loader and verify it right after loading for clarity.
- Reuse a few refactored out helpers for require(esm) in
  getModuleJobForRequire().
joyeecheung added a commit to joyeecheung/node that referenced this pull request Sep 4, 2024
This lays the foundation for supporting synchronous hooks proposed
in nodejs/loaders#198 for ESM.

- Corrects and adds several JSDoc comments for internal functions
  of the ESM loader, as well as explaining how require() for
  import CJS work in the special resolve/load paths. This doesn't
  consolidate it with import in require(esm) yet due to caching
  differences, which is left as a TODO.
- The moduleProvider passed into ModuleJob is replaced as
  moduleOrModulePromise, we call the translators directly in the
  ESM loader and verify it right after loading for clarity.
- Reuse a few refactored out helpers for require(esm) in
  getModuleJobForRequire().
joyeecheung added a commit to joyeecheung/node that referenced this pull request Sep 5, 2024
This lays the foundation for supporting synchronous hooks proposed
in nodejs/loaders#198 for ESM.

- Corrects and adds several JSDoc comments for internal functions
  of the ESM loader, as well as explaining how require() for
  import CJS work in the special resolve/load paths. This doesn't
  consolidate it with import in require(esm) yet due to caching
  differences, which is left as a TODO.
- The moduleProvider passed into ModuleJob is replaced as
  moduleOrModulePromise, we call the translators directly in the
  ESM loader and verify it right after loading for clarity.
- Reuse a few refactored out helpers for require(esm) in
  getModuleJobForRequire().
@joyeecheung joyeecheung removed the loaders-agenda Issues and PRs to discuss during the meetings of the Loaders team label Sep 9, 2024
joyeecheung added a commit to joyeecheung/node that referenced this pull request Sep 16, 2024
This lays the foundation for supporting synchronous hooks proposed
in nodejs/loaders#198 for ESM.

- Corrects and adds several JSDoc comments for internal functions
  of the ESM loader, as well as explaining how require() for
  import CJS work in the special resolve/load paths. This doesn't
  consolidate it with import in require(esm) yet due to caching
  differences, which is left as a TODO.
- The moduleProvider passed into ModuleJob is replaced as
  moduleOrModulePromise, we call the translators directly in the
  ESM loader and verify it right after loading for clarity.
- Reuse a few refactored out helpers for require(esm) in
  getModuleJobForRequire().
nodejs-github-bot pushed a commit to nodejs/node that referenced this pull request Sep 17, 2024
This lays the foundation for supporting synchronous hooks proposed
in nodejs/loaders#198 for ESM.

- Corrects and adds several JSDoc comments for internal functions
  of the ESM loader, as well as explaining how require() for
  import CJS work in the special resolve/load paths. This doesn't
  consolidate it with import in require(esm) yet due to caching
  differences, which is left as a TODO.
- The moduleProvider passed into ModuleJob is replaced as
  moduleOrModulePromise, we call the translators directly in the
  ESM loader and verify it right after loading for clarity.
- Reuse a few refactored out helpers for require(esm) in
  getModuleJobForRequire().

PR-URL: #54769
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Stephen Belanger <[email protected]>
Reviewed-By: James M Snell <[email protected]>
joyeecheung added a commit to joyeecheung/node that referenced this pull request Oct 1, 2024
This lays the foundation for supporting synchronous hooks proposed
in nodejs/loaders#198 for ESM.

- Corrects and adds several JSDoc comments for internal functions
  of the ESM loader, as well as explaining how require() for
  import CJS work in the special resolve/load paths. This doesn't
  consolidate it with import in require(esm) yet due to caching
  differences, which is left as a TODO.
- The moduleProvider passed into ModuleJob is replaced as
  moduleOrModulePromise, we call the translators directly in the
  ESM loader and verify it right after loading for clarity.
- Reuse a few refactored out helpers for require(esm) in
  getModuleJobForRequire().

PR-URL: nodejs#54769
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Stephen Belanger <[email protected]>
Reviewed-By: James M Snell <[email protected]>
targos pushed a commit to nodejs/node that referenced this pull request Oct 4, 2024
This lays the foundation for supporting synchronous hooks proposed
in nodejs/loaders#198 for ESM.

- Corrects and adds several JSDoc comments for internal functions
  of the ESM loader, as well as explaining how require() for
  import CJS work in the special resolve/load paths. This doesn't
  consolidate it with import in require(esm) yet due to caching
  differences, which is left as a TODO.
- The moduleProvider passed into ModuleJob is replaced as
  moduleOrModulePromise, we call the translators directly in the
  ESM loader and verify it right after loading for clarity.
- Reuse a few refactored out helpers for require(esm) in
  getModuleJobForRequire().

PR-URL: #54769
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Stephen Belanger <[email protected]>
Reviewed-By: James M Snell <[email protected]>
@Mars7005

This comment was marked as off-topic.

nodejs-github-bot pushed a commit to nodejs/node that referenced this pull request Oct 31, 2024
This refactors the CommonJS loading a bit to create a center point
that handles source loading (`loadSource`) and make format detection
more consistent to pave the way for future synchronous hooks.

- Handle .mjs in the .js handler, similar to how .cjs has been handled.
- Generate the legacy ERR_REQUIRE_ESM in a getRequireESMError() for
  both .mts and require(esm) handling (when it's disabled).

PR-URL: #55590
Refs: nodejs/loaders#198
Reviewed-By: Antoine du Hamel <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Marco Ippolito <[email protected]>
Reviewed-By: Juan José Arboleda <[email protected]>
RafaelGSS pushed a commit to nodejs/node that referenced this pull request Nov 1, 2024
This refactors the CommonJS loading a bit to create a center point
that handles source loading (`loadSource`) and make format detection
more consistent to pave the way for future synchronous hooks.

- Handle .mjs in the .js handler, similar to how .cjs has been handled.
- Generate the legacy ERR_REQUIRE_ESM in a getRequireESMError() for
  both .mts and require(esm) handling (when it's disabled).

PR-URL: #55590
Refs: nodejs/loaders#198
Reviewed-By: Antoine du Hamel <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Marco Ippolito <[email protected]>
Reviewed-By: Juan José Arboleda <[email protected]>
louwers pushed a commit to louwers/node that referenced this pull request Nov 2, 2024
This lays the foundation for supporting synchronous hooks proposed
in nodejs/loaders#198 for ESM.

- Corrects and adds several JSDoc comments for internal functions
  of the ESM loader, as well as explaining how require() for
  import CJS work in the special resolve/load paths. This doesn't
  consolidate it with import in require(esm) yet due to caching
  differences, which is left as a TODO.
- The moduleProvider passed into ModuleJob is replaced as
  moduleOrModulePromise, we call the translators directly in the
  ESM loader and verify it right after loading for clarity.
- Reuse a few refactored out helpers for require(esm) in
  getModuleJobForRequire().

PR-URL: nodejs#54769
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Stephen Belanger <[email protected]>
Reviewed-By: James M Snell <[email protected]>
louwers pushed a commit to louwers/node that referenced this pull request Nov 2, 2024
This refactors the CommonJS loading a bit to create a center point
that handles source loading (`loadSource`) and make format detection
more consistent to pave the way for future synchronous hooks.

- Handle .mjs in the .js handler, similar to how .cjs has been handled.
- Generate the legacy ERR_REQUIRE_ESM in a getRequireESMError() for
  both .mts and require(esm) handling (when it's disabled).

PR-URL: nodejs#55590
Refs: nodejs/loaders#198
Reviewed-By: Antoine du Hamel <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Marco Ippolito <[email protected]>
Reviewed-By: Juan José Arboleda <[email protected]>
Copy link
Member

@GeoffreyBooth GeoffreyBooth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this has been open long enough and the raised notes have either all been addressed or could be addressed via follow-up PRs to edit this document. I think it’s time to land this and continue iterating in follow-ups.

Copy link
Member

@JakobJingleheimer JakobJingleheimer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙌 LGTM and nicely aligns with existing conventions/APIs/signatures.

@joyeecheung joyeecheung merged commit a6d5d30 into nodejs:main Nov 5, 2024
tpoisseau pushed a commit to tpoisseau/node that referenced this pull request Nov 21, 2024
This lays the foundation for supporting synchronous hooks proposed
in nodejs/loaders#198 for ESM.

- Corrects and adds several JSDoc comments for internal functions
  of the ESM loader, as well as explaining how require() for
  import CJS work in the special resolve/load paths. This doesn't
  consolidate it with import in require(esm) yet due to caching
  differences, which is left as a TODO.
- The moduleProvider passed into ModuleJob is replaced as
  moduleOrModulePromise, we call the translators directly in the
  ESM loader and verify it right after loading for clarity.
- Reuse a few refactored out helpers for require(esm) in
  getModuleJobForRequire().

PR-URL: nodejs#54769
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Stephen Belanger <[email protected]>
Reviewed-By: James M Snell <[email protected]>
tpoisseau pushed a commit to tpoisseau/node that referenced this pull request Nov 21, 2024
This refactors the CommonJS loading a bit to create a center point
that handles source loading (`loadSource`) and make format detection
more consistent to pave the way for future synchronous hooks.

- Handle .mjs in the .js handler, similar to how .cjs has been handled.
- Generate the legacy ERR_REQUIRE_ESM in a getRequireESMError() for
  both .mts and require(esm) handling (when it's disabled).

PR-URL: nodejs#55590
Refs: nodejs/loaders#198
Reviewed-By: Antoine du Hamel <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Marco Ippolito <[email protected]>
Reviewed-By: Juan José Arboleda <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.