Heap out of memory when combining lastVersionOf with polling #79

Open
smessie opened this issue Jan 31, 2025 · 3 comments

Comments

@smessie
Member

smessie commented Jan 31, 2025

When you configure the ldes-client with lastVersionOf: true and polling: true (--last-version-only --follow), the client consumes more and more memory, until the Node process crashes with FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory.

Combining these parameters is not expected to be useful: because you configured the client to stay in sync, it can never know whether a future member will be a newer version of one it has already found, so it can never emit any members.
However, it should not cause memory issues.

The question now is how we want to handle this issue?

  • I feel like we should add a check that prevents users from combining these parameters (exit execution with an error when it happens).
    • Do we want to fail silently, i.e. return 0 members and log an error to the console, or do we want to throw an exception? Think of where we use it in rdfc pipelines or program code, as the difference in the CLI doesn't really matter.
  • Preventing the combination should already be enough, but do we also want to investigate and fix the underlying reason why this can lead to memory issues at all?
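
The "throw an exception" variant of the check could look roughly like the sketch below. The `ClientOptions` shape and the `validateOptions` helper are hypothetical names used for illustration only; they are not taken from the ldes-client code base, and the real config object may look different.

```typescript
// Hypothetical option shape (assumption): only the two fields discussed here.
interface ClientOptions {
  lastVersionOf?: boolean;
  polling?: boolean;
}

// Hypothetical helper (assumption): throws instead of failing silently, so
// that pipeline or program code calling the client notices the
// misconfiguration right away instead of receiving zero members.
function validateOptions(options: ClientOptions): void {
  if (options.lastVersionOf === true && options.polling === true) {
    throw new Error(
      "lastVersionOf cannot be combined with polling: while the client " +
        "stays in sync it can never know that a version is the last one.",
    );
  }
}
```

A CLI entry point could catch this error, print the message and exit with a non-zero code, while library users get a regular exception.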
@ajuvercr
Member

My initial thought was that these should be able to work together, but that the client emits the last versions in batches.
So the initial batch emits all last versions as expected, and then for each poll cycle emits the last version of updated members.

This way the consumer can interact with this by registering a listener for the poll event and starting up some logic of its own.
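
A minimal sketch of this batch idea, assuming each member carries the identifier of the object it is a version of and a comparable timestamp. The `Member` shape and the `LastVersionBatcher` class are hypothetical and only illustrate the bookkeeping; this is not how the client is currently implemented.

```typescript
// Hypothetical member shape (assumption).
interface Member {
  id: string; // IRI of this specific version
  isVersionOf: string; // IRI of the versioned object
  timestamp: number; // e.g. epoch milliseconds
}

// Hypothetical helper (assumption): collects members during one cycle and
// releases only the newest version per object when the cycle ends.
class LastVersionBatcher {
  // Newest not-yet-emitted version per object in the current cycle.
  private pending = new Map<string, Member>();
  // Timestamp of the version last emitted per object.
  private emittedAt = new Map<string, number>();

  add(member: Member): void {
    const emitted = this.emittedAt.get(member.isVersionOf);
    // Skip versions that are not newer than what was already emitted.
    if (emitted !== undefined && member.timestamp <= emitted) return;
    const current = this.pending.get(member.isVersionOf);
    if (current === undefined || member.timestamp > current.timestamp) {
      this.pending.set(member.isVersionOf, member);
    }
  }

  // Call at the end of the initial run and at the end of every poll cycle.
  flush(): Member[] {
    const batch = Array.from(this.pending.values());
    for (const member of batch) {
      this.emittedAt.set(member.isVersionOf, member.timestamp);
    }
    this.pending.clear();
    return batch;
  }
}
```

In this sketch both maps hold at most one entry per versioned object, so the state grows with the number of unique objects rather than with the total number of members.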

However, this does not mean that the client should go OOM, and I can't really say why this happens. Have you looked into the problem @smessie?
I know that we currently keep an array of all seen memberIds, which is growing, but that is perhaps to be expected. And I think this is not the issue you are seeing, or are you actually seeing like millions of different members?
Also, does the client emit members or not? When reading between the lines it looks like the client currently does not emit any members, is this correct?

@smessie
Member Author

smessie commented Jan 31, 2025

> My initial thought was that these should be able to work together, but that the client emits the last versions in batches.
> So the initial batch emits all last versions as expected, and then for each poll cycle emits the last version of updated members.

That sounds like a viable option as well. The poll event doesn't contain any data, but a client could work with that and reason that any members emitted after the poll event are newer versions of earlier emitted members.
Is the next poll interval only started once the previous cycle has completed? Otherwise the poll event could be emitted before all members of the previous cycle are emitted.
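
To make the question concrete: the ordering I would hope for is sketched below, where the wait only starts after the previous cycle has fully finished. All names (`pollAfterCompletion`, `runCycle`, `onPoll`, `sleep`) are hypothetical, and whether the client actually behaves like this is exactly what I'm asking.

```typescript
// Hypothetical sketch (assumption): the next wait starts only after the
// previous cycle has resolved, so the poll notification can never overtake
// members that are still being emitted.
async function pollAfterCompletion(
  runCycle: () => Promise<void>, // fetches and emits all members of one cycle
  onPoll: () => void, // stand-in for emitting the poll event
  sleep: (ms: number) => Promise<void>, // injected timer, e.g. a setTimeout wrapper
  intervalMs: number,
  cycles: number,
): Promise<void> {
  // Initial run: everything emitted here belongs to the first batch.
  await runCycle();
  for (let i = 1; i < cycles; i++) {
    // The interval is measured from the end of the previous cycle.
    await sleep(intervalMs);
    onPoll();
    await runCycle();
  }
}
```

With a fixed-rate timer such as `setInterval`, a slow cycle could still be emitting members when the next poll event fires, which is the situation described above.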

> Have you looked into the problem @smessie?

No, not yet. I experienced this issue but can now circumvent it by making sure I don't pass this combination of config parameters, so it's not a blocker for me and I have some other priorities first. I also wanted to discuss it here first.

> And I think this is not the issue you are seeing, or are you actually seeing like millions of different members?

The LDES contained 16800 members, of which 1042 were unique. So no, not millions of different members.

> Also, does the client emit members or not? When reading between the lines it looks like the client currently does not emit any members, is this correct?

It actually looks like it did.

@smessie
Member Author

smessie commented Jan 31, 2025

For an LDES containing 438057 members (of which 1797 are unique), it also goes OOM with only the --latest-version-only parameter (so even without the --follow flag). It does not go OOM without this parameter (that is, without any parameters).
