Attempt to fix logText progressive memory issue in LargeText.java #10236

kalindafab · 2025-02-04T10:05:33Z

See [JENKINS-75081](https://issues.jenkins.io/browse/JENKINS-75081).

### Testing done

I tested this change by running a Jenkins pipeline that generates a large amount of log output.

- Tested with 7GB logs: the logs streamed correctly without an `OutOfMemoryError`.
- Tested with 10GB logs: the issue still occurs, with the same symptoms.
- Heap memory usage monitored: inspected a heap dump and confirmed that the Jetty thread still holds log content in memory.
- Manual verification: accessed the `/logText/progressiveText` API after each build and checked the response.
- No automated tests added, because this issue requires large-scale log generation that is difficult to replicate in unit tests.

### Proposed changelog entries

human-readable text

Improve /logText/progressiveText API to handle large logs more efficiently and reduce heap memory usage.
Stream log data in smaller chunks instead of loading large portions into memory at once.
Modify LargeText and AnnotatedLargeText to avoid excessive memory consumption for logs above 7GB.
Partial fix for heap exhaustion in large build logs; further improvements needed for logs exceeding 10GB.

The fix works for logs up to 7GB, but the issue persists for logs over 10GB. I’d appreciate guidance on further improving memory handling in Jetty, and any best practices for handling extremely large logs efficiently.

Looking forward to your feedback—thanks in advance! 🙌

### Proposed upgrade guidelines

N/A

### Submitter checklist
- [ ] The Jira issue, if it exists, is well-described.
- [ ] The changelog entries and upgrade guidelines are appropriate for the audience affected by the change (users or developers, depending on the change) and are in the imperative mood (see [examples](https://github.com/jenkins-infra/jenkins.io/blob/master/content/_data/changelogs/weekly.yml)). Fill in the **Proposed upgrade guidelines** section only if there are breaking changes or changes that may require extra steps from users during upgrade.
- [ ] There is automated testing or an explanation as to why this change has no tests.
- [ ] New public classes, fields, and methods are annotated with `@Restricted` or have `@since TODO` Javadocs, as appropriate.
- [ ] New deprecations are annotated with `@Deprecated(since = "TODO")` or `@Deprecated(forRemoval = true, since = "TODO")`, if applicable.
- [ ] New or substantially changed JavaScript is not defined inline and does not call `eval` to ease future introduction of Content Security Policy (CSP) directives (see [documentation](https://www.jenkins.io/doc/developer/security/csp/)).
- [ ] For dependency updates, there are links to external changelogs and, if possible, full differentials.
- [ ] For new APIs and extension points, there is a link to at least one consumer.

### Desired reviewers

@mention

Before the changes are marked as ready-for-merge:

### Maintainer checklist
- [ ] There are at least two (2) approvals for the pull request and no outstanding requests for change.
- [ ] Conversations in the pull request are over, or it is explicit that a reviewer is not blocking the change.
- [ ] Changelog entries in the pull request title and/or **Proposed changelog entries** are accurate, human-readable, and in the imperative mood.
- [ ] Proper changelog labels are set so that the changelog can be generated automatically.
- [ ] If the change needs additional upgrade steps from users, the `upgrade-guide-needed` label is set and there is a **Proposed upgrade guidelines** section in the pull request title (see [example](https://github.com/jenkinsci/jenkins/pull/4387)).
- [ ] If it would make sense to backport the change to LTS, a Jira issue must exist, be a _Bug_ or _Improvement_, and be labeled as `lts-candidate` to be considered (see [query](https://issues.jenkins.io/issues/?filter=12146)).

welcome · 2025-02-04T10:05:37Z

Yay, your first pull request towards Jenkins core was created successfully! Thank you so much!

A contributor will provide feedback soon. Meanwhile, you can join the chats and community forums to connect with other Jenkins users, developers, and maintainers.

A1exKH

@kalindafab LGTM.

jglick · 2025-04-07T23:07:13Z

core/pom.xml

    </dependency>
+    <dependency>
+      <groupId>org.eclipse.jetty</groupId>
+      <artifactId>jetty-server</artifactId>


jglick · 2025-04-07T23:08:13Z

core/src/main/java/hudson/console/AnnotatedLargeText.java

-        ObjectOutputStream oos = AnonymousClassWarnings.checkingObjectOutputStream(new GZIPOutputStream(new CipherOutputStream(baos, sym)));
-        oos.writeLong(System.currentTimeMillis()); // send timestamp to prevent a replay attack
+        ObjectOutputStream oos = AnonymousClassWarnings.checkingObjectOutputStream(
+                new GZIPOutputStream(new CipherOutputStream(baos, sym))
+        );
+
+        oos.writeLong(System.currentTimeMillis());


Seems to just be formatting changes (and deleting a comment)? Please try to keep diffs minimal.

jglick · 2025-04-07T23:09:23Z

core/src/main/java/hudson/util/ChunkedInputStream.java

+        if (pos >= chunkSize) {
+            in.skip(2);
+        }


What is this about?

kalindafab · Apr 8, 2025

I attempted to chunk the logs while reading them: a `pos >= chunkSize` check tracks how much data has been read, and once that threshold is reached, `in.skip(2)` moves past the delimiter or line break (`\r\n`) between log entries.
The intention was to read the logs in manageable chunks without holding the entire log in memory, and to prepare for more efficient streaming output.
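
For comparison, here is a minimal sketch of chunked copying that needs no delimiter handling at all: a fixed-size buffer bounds memory use regardless of where line breaks fall. The class and method names are hypothetical and are not part of this PR, Jenkins, or Stapler. It also sidesteps `InputStream.skip`, which may skip fewer bytes than requested, so its return value cannot safely be ignored.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Jenkins or Stapler.
class ChunkedCopySketch {

    /** Copies all bytes from in to out through a buffer of chunkSize bytes; returns the byte count. */
    static long copy(InputStream in, OutputStream out, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        byte[] buf = new byte[chunkSize];
        long total = 0;
        int n;
        // read() returns the number of bytes actually read, or -1 at end of stream.
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n); // line breaks are ordinary bytes and are copied as-is
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] log = "line 1\r\nline 2\r\nline 3\r\n".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(log), out, 4);
        System.out.println(copied + " bytes copied");
    }
}
```

At most `chunkSize` bytes are held at any time, so peak memory does not depend on the size of the log.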

jglick · 2025-04-07T23:42:05Z

I am surprised to see a proposed fix in this repo (other than a dependency bump), when the problem appears to be in https://github.com/jenkinsci/stapler/blob/fcb700b75e9ed613d7d58e47aa117f504d2affe6/core/src/main/java/org/kohsuke/stapler/framework/io/LargeText.java#L310-L319 which is attempting to first write the entire log text to a memory buffer and then stream it, which is obviously not what we want. If we did not bother sending X-Text-Size then the CharSpool could simply be dropped, but

`jenkins/core/src/main/resources/lib/hudson/progressive-text.js`, line 75 in 18b7916:

    e.fetchedBytes = rsp.headers.get("X-Text-Size");

in fact uses this header, and you cannot set a header after streaming content. It may be possible to first writeLogTo a special counting Writer that discards content, then set the header, then write again, but this seems inefficient. Depending on the Source we could do better, I think.
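
A rough sketch of that two-pass idea, under loud assumptions: `LogSource` is a hypothetical stand-in for whatever renders the log (it is not the Stapler API), the rendering is assumed to be repeatable, and the sketch counts characters, which may not be the value the real header needs to carry.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical sketch; none of these types exist in Jenkins or Stapler.
class CountingPrePassSketch {

    /** Stand-in for the log renderer; assumed to produce the same output on every call. */
    interface LogSource {
        void writeTo(Writer w) throws IOException;
    }

    /** Writer that discards every character and only counts them. */
    static final class CountingWriter extends Writer {
        private long count;

        @Override
        public void write(char[] cbuf, int off, int len) {
            count += len; // nothing is stored, so memory use stays constant
        }

        @Override
        public void flush() {}

        @Override
        public void close() {}

        long getCount() {
            return count;
        }
    }

    /** First pass: render into the counting writer to learn the size. */
    static long measure(LogSource source) throws IOException {
        CountingWriter counter = new CountingWriter();
        source.writeTo(counter);
        return counter.getCount();
    }

    public static void main(String[] args) throws IOException {
        LogSource source = w -> w.write("Started build\nFinished: SUCCESS\n");
        long size = measure(source);              // pass 1: count only
        System.out.println("X-Text-Size: " + size); // the header would be set here
        StringWriter body = new StringWriter();   // stands in for the response writer
        source.writeTo(body);                     // pass 2: actually stream the content
        System.out.print(body);
    }
}
```

The inefficiency mentioned above is visible here: the log is rendered twice, once only to be thrown away.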

Note that createWriter is protected for no obvious reason: it is never overridden.

https://github.com/jenkinsci/blueocean-plugin/blob/ab3a0465c4e68be8247158ae3c7733e88094d5a3/blueocean-rest-impl/src/main/java/io/jenkins/blueocean/service/embedded/rest/LogResource.java#L93 duplicates code but this is used only in Blue Ocean, which is not maintained anyway.

https://github.com/jenkinsci/pipeline-graph-view-plugin/blob/01dba0944a687b2ad5487ba8d0db49c1146458ff/src/main/java/io/jenkins/plugins/pipelinegraphview/consoleview/PipelineConsoleViewAction.java#L122-L138 also looks suspicious.

The whole design of LargeText is deeply flawed as it does not in fact handle large text well. https://github.com/jenkinsci/pipeline-cloudwatch-logs-plugin/blob/e075ad43a47010bc2ced615a5f222e32b9e6dd81/src/main/java/io/jenkins/plugins/pipeline_cloudwatch_logs/CloudWatchRetriever.java#L106-L112 shows how a plugin is forced to make a copy in memory of a build log.

jglick · 2025-04-08T00:29:06Z

jenkinsci/stapler#657 might work but I did not try to test it yet.

kalindafab · 2025-04-08T08:46:47Z

@jglick
It might be worth considering a refactor of LargeText to support true streaming mode where the log is written directly to the response without buffering. If X-Text-Size is required (e.g., by progressive-text.js), we could support an optional pre-pass using a CountingWriter to get the size without allocating memory.

kalindafab · 2025-04-08T08:47:38Z

That might significantly reduce memory usage for large logs while keeping backward compatibility where needed. Also, making createWriter() public or allowing configurable streaming behavior would help plugin developers avoid duplicating flawed logic.

jglick · 2025-04-08T13:11:26Z

> a refactor of LargeText to support true streaming mode where the log is written directly to the response without buffering

That is what my PR attempts to do.

> an optional pre-pass using a CountingWriter to get the size without allocating memory

I took a different approach, since the current design only supports two content providers (file and byte buffer) both of which have a known length which can be looked up in advance.
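
To illustrate the general shape of that idea (this is not the code from the Stapler PR), here is a minimal sketch for the file case: look up the length first, record the header, then stream through a fixed-size buffer. The class name, the `serve` method, and the use of a plain `Map` in place of a real response object are all assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.channels.Channels;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not the actual Stapler or Jenkins implementation.
class KnownLengthStreamingSketch {

    /** Records the size header before any content is written, then streams from start. */
    static void serve(Path log, long start, Map<String, String> headers, OutputStream body)
            throws IOException {
        long size = Files.size(log); // known up front, no need to read the content
        headers.put("X-Text-Size", Long.toString(size));
        try (SeekableByteChannel ch = Files.newByteChannel(log)) {
            ch.position(start);
            InputStream in = Channels.newInputStream(ch);
            byte[] buf = new byte[8192];
            // Stop at the advertised size even if the file grows while we stream.
            long remaining = Math.max(0, size - start);
            while (remaining > 0) {
                int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
                if (n == -1) {
                    break;
                }
                body.write(buf, 0, n);
                remaining -= n;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("build", ".log");
        try {
            Files.write(log, "Started build\nFinished: SUCCESS\n".getBytes(StandardCharsets.UTF_8));
            Map<String, String> headers = new LinkedHashMap<>();
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            serve(log, 14, headers, body); // resume after the first line
            System.out.println(headers);
            System.out.print(new String(body.toByteArray(), StandardCharsets.UTF_8));
        } finally {
            Files.delete(log);
        }
    }
}
```

Compared with a counting pre-pass, the content is read only once, and only one buffer of 8 KiB is held in memory at a time.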

jglick

I believe #10515 solves the problem more directly; please help test if you can.

kalindafab · 2025-04-28T13:14:38Z

> I believe #10515 solves the problem more directly; please help test if you can.

right

github-actions · 2025-04-30T15:42:35Z

Please take a moment and address the merge conflicts of your pull request. Thanks!

kalindafab · 2025-04-30T19:36:01Z

> I believe #10515 solves the problem more directly; please help test if you can.

Sorry! I might be delayed because I don't have enough time to work on this.

Attempt to fix logText progressive memory issue in LargeText.java

8226879

MarkEWaite added the `bug` label (For changelog: Minor bug. Will be listed after features) Feb 11, 2025

A1exKH approved these changes Feb 20, 2025

jglick reviewed Apr 7, 2025

core/pom.xml, on the added `org.eclipse.jetty:jetty-server` dependency:

jglick · Apr 7, 2025

Why?

jglick reviewed Apr 7, 2025

jglick mentioned this pull request Apr 7, 2025

More efficient LargeText.doProgressText jenkinsci/stapler#657

Merged

jglick mentioned this pull request Apr 8, 2025

[JENKINS-75081] Avoid heap allocation when rendering large logs #10515

Merged

jglick requested changes Apr 28, 2025

github-actions bot added the `unresolved-merge-conflict` label (There is a merge conflict with the target branch.) Apr 30, 2025

basil closed this in #10515 May 2, 2025
