@kalindafab

See [JENKINS-75081](https://issues.jenkins.io/browse/JENKINS-75081).

### Testing done

I tested this change by running a Jenkins pipeline that generates a large amount of log output.

  • Tested with 7GB logs: the logs streamed correctly without an OutOfMemoryError.
  • Tested with 10GB logs: the issue still occurs, with the same behavior as before the change.
  • Heap memory usage monitored: inspected a heap dump and confirmed that the Jetty thread still holds log data in memory.
  • Manual verification: accessed the /logText/progressiveText API after each build and checked the response.
  • No automated tests added, because this issue requires large-scale log generation that is difficult to replicate in unit tests.

### Proposed changelog entries

  • Improve the /logText/progressiveText API to handle large logs more efficiently and reduce heap memory usage.
  • Stream log data in smaller chunks instead of loading large portions into memory at once.
  • Modify LargeText and AnnotatedLargeText to avoid excessive memory consumption for logs up to 7GB.
  • Partially fix heap exhaustion in large build logs; further improvements are needed for logs exceeding 10GB.

The fix works for logs up to 7GB, but the issue persists for logs over 10GB. I’d appreciate guidance on further improving memory handling in Jetty, and on any best practices for handling extremely large logs efficiently.

Looking forward to your feedback—thanks in advance! 🙌

### Proposed upgrade guidelines

N/A

### Submitter checklist
- [ ] The Jira issue, if it exists, is well-described.
- [ ] The changelog entries and upgrade guidelines are appropriate for the audience affected by the change (users or developers, depending on the change) and are in the imperative mood (see [examples](https://github.com/jenkins-infra/jenkins.io/blob/master/content/_data/changelogs/weekly.yml)). Fill in the **Proposed upgrade guidelines** section only if there are breaking changes or changes that may require extra steps from users during upgrade.
- [ ] There is automated testing or an explanation as to why this change has no tests.
- [ ] New public classes, fields, and methods are annotated with `@Restricted` or have `@since TODO` Javadocs, as appropriate.
- [ ] New deprecations are annotated with `@Deprecated(since = "TODO")` or `@Deprecated(forRemoval = true, since = "TODO")`, if applicable.
- [ ] New or substantially changed JavaScript is not defined inline and does not call `eval` to ease future introduction of Content Security Policy (CSP) directives (see [documentation](https://www.jenkins.io/doc/developer/security/csp/)).
- [ ] For dependency updates, there are links to external changelogs and, if possible, full differentials.
- [ ] For new APIs and extension points, there is a link to at least one consumer.

### Desired reviewers

@mention

Before the changes are marked as ready-for-merge:

### Maintainer checklist
- [ ] There are at least two (2) approvals for the pull request and no outstanding requests for change.
- [ ] Conversations in the pull request are over, or it is explicit that a reviewer is not blocking the change.
- [ ] Changelog entries in the pull request title and/or **Proposed changelog entries** are accurate, human-readable, and in the imperative mood.
- [ ] Proper changelog labels are set so that the changelog can be generated automatically.
- [ ] If the change needs additional upgrade steps from users, the `upgrade-guide-needed` label is set and there is a **Proposed upgrade guidelines** section in the pull request title (see [example](https://github.com/jenkinsci/jenkins/pull/4387)).
- [ ] If it would make sense to backport the change to LTS, a Jira issue must exist, be a _Bug_ or _Improvement_, and be labeled as `lts-candidate` to be considered (see [query](https://issues.jenkins.io/issues/?filter=12146)).

@welcome

welcome bot commented Feb 4, 2025

Yay, your first pull request towards Jenkins core was created successfully! Thank you so much!

A contributor will provide feedback soon. Meanwhile, you can join the chats and community forums to connect with other Jenkins users, developers, and maintainers.

@MarkEWaite MarkEWaite added the `bug` label (For changelog: Minor bug. Will be listed after features) Feb 11, 2025

@A1exKH A1exKH left a comment

@kalindafab LGTM.

```xml
</dependency>
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-server</artifactId>
```
Member

Why?

Comment on lines -237 to +251
```diff
-ObjectOutputStream oos = AnonymousClassWarnings.checkingObjectOutputStream(new GZIPOutputStream(new CipherOutputStream(baos, sym)));
-oos.writeLong(System.currentTimeMillis()); // send timestamp to prevent a replay attack
+ObjectOutputStream oos = AnonymousClassWarnings.checkingObjectOutputStream(
+new GZIPOutputStream(new CipherOutputStream(baos, sym))
+);
+
+oos.writeLong(System.currentTimeMillis());
```
Member

Seems to just be formatting changes (and deleting a comment)? Please try to keep diffs minimal.

Comment on lines +137 to +139
```java
if (pos >= chunkSize) {
    in.skip(2);
}
```
Member

What is this about?

Author

I attempted to chunk the logs while reading them, using a `pos >= chunkSize` check to track how much data has been read. When that threshold is reached, `in.skip(2)` moves past the delimiter or line break (`\r\n`) between log entries.
The intention was to read the logs in manageable chunks without holding the entire log in memory, and to prepare for streaming the output more efficiently.
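
For comparison, a chunked copy does not need to skip any bytes: reading into a fixed-size buffer and writing out exactly what was read keeps memory use bounded by the buffer size. The following is a minimal sketch of that idea, not the code from this PR; the names `ChunkedCopy` and `copyInChunks` are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Jenkins or Stapler.
final class ChunkedCopy {
    /**
     * Copies all bytes from in to out through a buffer of chunkSize bytes,
     * so this method holds at most chunkSize bytes of log data at a time.
     * Returns the number of bytes copied.
     */
    static long copyInChunks(InputStream in, OutputStream out, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        byte[] buf = new byte[chunkSize];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n); // write only the bytes actually read; nothing is skipped
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] log = "line 1\r\nline 2\r\n".getBytes(StandardCharsets.UTF_8);
        // The in-memory output stream is only for the demo; a real caller
        // would pass the response output stream instead.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copyInChunks(new ByteArrayInputStream(log), out, 4);
        System.out.println(copied + " bytes copied");
    }
}
```

Line breaks such as `\r\n` pass through unchanged here, because the copy works on raw bytes and never interprets delimiters.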

@jglick
Member

jglick commented Apr 7, 2025

I am surprised to see a proposed fix in this repo (other than a dependency bump), when the problem appears to be in https://github.com/jenkinsci/stapler/blob/fcb700b75e9ed613d7d58e47aa117f504d2affe6/core/src/main/java/org/kohsuke/stapler/framework/io/LargeText.java#L310-L319 which is attempting to first write the entire log text to a memory buffer and then stream it, which is obviously not what we want. If we did not bother sending `X-Text-Size` then the `CharSpool` could simply be dropped, but

```javascript
e.fetchedBytes = rsp.headers.get("X-Text-Size");
```

in fact uses this header, and you cannot set a header after streaming content. It may be possible to first `writeLogTo` a special counting `Writer` that discards content, then set the header, then write again, but this seems inefficient. Depending on the `Source` we could do better, I think.
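
The two-pass idea mentioned above could look roughly like the following. This is a minimal sketch under stated assumptions, not Stapler code: `TwoPassSketch`, `LogProducer`, `CountingDiscardWriter`, and `measure` are hypothetical names, `LogProducer` merely stands in for something like `writeLogTo`, and the counter counts chars, which may not be the unit the real header uses.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical names throughout; none of this is Stapler API.
final class TwoPassSketch {
    /** Stand-in for a method like writeLogTo: writes the log text to a Writer. */
    interface LogProducer {
        void writeTo(Writer w) throws IOException;
    }

    /** A Writer that discards all content and only counts the chars written. */
    static final class CountingDiscardWriter extends Writer {
        private long count;

        @Override
        public void write(char[] cbuf, int off, int len) {
            count += len; // nothing is stored, so memory use stays constant
        }

        @Override
        public void flush() {}

        @Override
        public void close() {}

        long getCount() {
            return count;
        }
    }

    /** Pass 1: run the producer against the counting writer to learn the size. */
    static long measure(LogProducer log) throws IOException {
        CountingDiscardWriter counter = new CountingDiscardWriter();
        log.writeTo(counter);
        return counter.getCount();
    }

    public static void main(String[] args) throws IOException {
        LogProducer log = w -> w.write("Started\nFinished: SUCCESS\n");
        long size = measure(log); // pass 1: size is known before any body is sent
        System.out.println("size header value would be: " + size);
        StringWriter body = new StringWriter(); // stands in for the response writer
        log.writeTo(body); // pass 2: produce the content again, this time for real
    }
}
```

The sketch also makes the inefficiency visible: the log is produced twice, and it ignores the possibility that the log grows between the two passes.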

Note that createWriter is protected for no obvious reason: it is never overridden.

https://github.com/jenkinsci/blueocean-plugin/blob/ab3a0465c4e68be8247158ae3c7733e88094d5a3/blueocean-rest-impl/src/main/java/io/jenkins/blueocean/service/embedded/rest/LogResource.java#L93 duplicates code but this is used only in Blue Ocean, which is not maintained anyway.

https://github.com/jenkinsci/pipeline-graph-view-plugin/blob/01dba0944a687b2ad5487ba8d0db49c1146458ff/src/main/java/io/jenkins/plugins/pipelinegraphview/consoleview/PipelineConsoleViewAction.java#L122-L138 also looks suspicious.

The whole design of LargeText is deeply flawed as it does not in fact handle large text well. https://github.com/jenkinsci/pipeline-cloudwatch-logs-plugin/blob/e075ad43a47010bc2ced615a5f222e32b9e6dd81/src/main/java/io/jenkins/plugins/pipeline_cloudwatch_logs/CloudWatchRetriever.java#L106-L112 shows how a plugin is forced to make a copy in memory of a build log.

@jglick
Member

jglick commented Apr 8, 2025

jenkinsci/stapler#657 might work but I did not try to test it yet.

@kalindafab
Author

@jglick
It might be worth considering a refactor of LargeText to support a true streaming mode where the log is written directly to the response without buffering. If X-Text-Size is required (e.g. by progressive-text.js), we could support an optional pre-pass using a CountingWriter to get the size without allocating memory.

@kalindafab
Author

Maybe this would significantly reduce memory usage for large logs while keeping backward compatibility where needed. Also, making createWriter() public or allowing configurable streaming behavior would help plugin developers avoid duplicating flawed logic.

@jglick
Member

jglick commented Apr 8, 2025

> a refactor of LargeText to support true streaming mode where the log is written directly to the response without buffering

That is what my PR attempts to do.

> an optional pre-pass using a CountingWriter to get the size without allocating memory

I took a different approach, since the current design only supports two content providers (file and byte buffer) both of which have a known length which can be looked up in advance.
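
One way to read "a known length which can be looked up in advance", sketched for the file case only: look up the size first, record it as the header value, then stream a bounded byte range. This is an illustration under assumptions, not the code in the Stapler PR; `KnownLengthSketch` and `streamFrom` are hypothetical, the `Map` stands in for response headers, and using the end offset as the `X-Text-Size` value is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not Stapler or Jenkins API.
final class KnownLengthSketch {
    /**
     * Records the file size as a header value before writing any body bytes,
     * then streams bytes [start, end) through a small fixed buffer.
     * Returns the end offset that was recorded.
     */
    static long streamFrom(Path log, long start, Map<String, String> headers, OutputStream body)
            throws IOException {
        long end = Files.size(log); // known up front, no counting pass needed
        if (start < 0 || start > end) {
            throw new IllegalArgumentException("start out of range");
        }
        headers.put("X-Text-Size", Long.toString(end)); // assumption: header carries the end offset
        try (RandomAccessFile raf = new RandomAccessFile(log.toFile(), "r")) {
            raf.seek(start);
            byte[] buf = new byte[8192];
            long remaining = end - start; // stop at the recorded size even if the file grows
            while (remaining > 0) {
                int n = raf.read(buf, 0, (int) Math.min(buf.length, remaining));
                if (n == -1) {
                    break; // file was truncated after the size lookup
                }
                body.write(buf, 0, n);
                remaining -= n;
            }
        }
        body.flush();
        return end;
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("build", ".log");
        Files.write(log, "Started\nFinished: SUCCESS\n".getBytes(StandardCharsets.UTF_8));
        Map<String, String> headers = new HashMap<>();
        ByteArrayOutputStream body = new ByteArrayOutputStream(); // demo only
        streamFrom(log, 0, headers, body);
        System.out.println(headers);
        Files.delete(log);
    }
}
```

Capping the copy at the size recorded up front keeps the header and the body consistent even when the build is still appending to the log.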

Member

@jglick jglick left a comment

I believe #10515 solves the problem more directly; please help test if you can.

@kalindafab
Author

> I believe #10515 solves the problem more directly; please help test if you can.

right

@github-actions github-actions bot added the `unresolved-merge-conflict` label (There is a merge conflict with the target branch.) Apr 30, 2025
@github-actions
Contributor

Please take a moment and address the merge conflicts of your pull request. Thanks!

@kalindafab
Author

> I believe #10515 solves the problem more directly; please help test if you can.

Sorry! I might be delayed because I don't have enough time to work on this.

@basil basil closed this in #10515 May 2, 2025
