Memory Usage Increase in Log4j2 2.24.3 with Async Logging Due to RingBufferLogEvent #3569

Open
shaonan666 opened this issue Mar 24, 2025 · 11 comments

Comments

@shaonan666

Description

I am using Log4j2 in my project and recently upgraded from version 2.23.1 to 2.24.3. The application employs an asynchronous logging strategy with Disruptor version 3.4.3. Below is the relevant configuration:

  • log4j2.contextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector
  • log4j2.asyncQueueFullPolicy=org.apache.logging.log4j.core.async.DiscardingAsyncQueueFullPolicy
  • log4j2.discardThreshold=ERROR
  • log4j2.asyncLoggerWaitStrategy=Yield
  • log4j2.asyncLoggerRingBufferSize=1048576
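
For reference, a sketch of how settings like these are typically supplied, either as -D JVM system properties or via a log4j2.component.properties file on the classpath; the values below simply mirror the list above:

log4j2.contextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector
log4j2.asyncQueueFullPolicy=org.apache.logging.log4j.core.async.DiscardingAsyncQueueFullPolicy
log4j2.discardThreshold=ERROR
log4j2.asyncLoggerWaitStrategy=Yield
log4j2.asyncLoggerRingBufferSize=1048576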

The logging frequency in my application is relatively low, averaging about 10 log entries per second. When using version 2.23.1, memory usage remained stable according to monitoring data. However, after upgrading to 2.24.3, I observed a gradual increase in memory usage over time, which eventually stabilized at a fixed threshold.
Figure 1

This behavior differs significantly from what I experienced with 2.23.1. Through heap dump analysis at different time intervals, I confirmed that the memory growth is primarily associated with RingBufferLogEvent objects within Log4j2’s RingBuffer implementation, with the RingBuffer’s memory footprint consistently increasing over time.

(Heap dump screenshots taken at different time intervals)

I noticed that the 2.24.3 release includes several changes related to memory management. Could any of these changes, particularly those affecting the RingBuffer or RingBufferLogEvent, explain the behavior I am observing?

If additional details are needed, please let me know, and I’d be happy to provide them.

Environment

  • Log4j2 Version: 2.24.3 (upgraded from 2.23.1)
  • Disruptor Version: 3.4.3
  • JVM: Zing24.08.101.0+1 (build 21.0.4.0.101+1-LTS)
  • Logging Frequency: ~10 logs/second
@ppkarwasz
Contributor

Hi @shaonan666,

RingBufferLogEvent does not have a fixed retained size and can grow over time if all garbage-free logging options are enabled. This is because reusable objects are kept and reused instead of being garbage-collected. A retained size of 744 bytes is nothing alarming.

You should check the garbage-free logging configuration options, in particular log4j2.enableThreadlocals and log4j2.garbagefreeThreadContextMap (discussed below).

Note

Log4j uses a fuzzy-matching algorithm that accepts both the official spelling of each configuration property and some obsolete spellings. Check whether your configuration contains properties with names similar to those above that start with log4j or org.apache.logging.

@shaonan666
Author

shaonan666 commented Mar 25, 2025

Hi @ppkarwasz

My program isn’t a web app, so Constants#isWebApp returns false at runtime, which sets log4j2.enableThreadlocals to true. I didn’t intend to enable GC-free mode; it happened accidentally because my program isn’t a web app, so “half” of GC-free mode ended up enabled. I now understand that to fully enable GC-free mode I also need to set log4j2.garbagefreeThreadContextMap to true.
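
For concreteness (a sketch using the property names discussed in this thread), fully enabling garbage-free mode in a non-web application would mean having both of the following effectively set; the first is already true here because of the non-web-app default, the second is the part that was missing:

log4j2.enableThreadlocals=true
log4j2.garbagefreeThreadContextMap=true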

Additionally, I’ve noticed some version differences:

  • In 2.23.1, if log4j2.garbagefreeThreadContextMap isn’t enabled, it returns CopyOnWriteSortedArrayThreadContextMap by this logic.

  • In 2.24.3, if log4j2.garbagefreeThreadContextMap isn’t enabled, the thread context map class is temporarily resolved to WEB_APP_CONTEXT_MAP and a DefaultThreadContextMap is ultimately returned (see Log4jProvider.java).

Based on these observations, I have some questions:

  • Could the difference between DefaultThreadContextMap and CopyOnWriteSortedArrayThreadContextMap be the reason I’m seeing an increase in memory usage? Could you briefly explain the differences between these two classes? I’ve noticed they affect memory usage, and I’d like to understand why.

  • Under the same configuration (not a web app), when log4j2.garbagefreeThreadContextMap isn’t enabled, memory usage in 2.24.3 seems less stable compared to 2.23.1. Is this an expected change?

@ppkarwasz
Contributor

Based on these observations, I have some questions:

  • Could the difference between DefaultThreadContextMap and CopyOnWriteSortedArrayThreadContextMap be the reason I’m seeing an increase in memory usage? Could you briefly explain the differences between these two classes? I’ve noticed they affect memory usage, and I’d like to understand why.

Up to version 2.23.x:

  • DefaultThreadContextMap was an implementation of ThreadContextMap basically equivalent to ThreadLocal<HashMap>. Since it only bound classes from the bootstrap classloader to threads, it was safe to use in web applications without memory leaks. DefaultThreadContextMap was also a copy-on-write implementation, which allowed us to transfer the maps returned by it to other threads.
  • CopyOnWriteSortedArrayThreadContextMap was a more performant copy-on-write implementation, but it bound Log4j classes to threads, so it wasn't safe to use in web applications.

In version 2.24.0 a version of DefaultThreadContextMap was introduced that was both webapp-safe and fast (see #2330). The package-private CopyOnWriteSortedArrayThreadContextMap was removed.

Since both implementations use immutable objects, RingBufferLogEvent will not keep references to them:

private void clearContextData() {
    if (contextData != null) {
        if (contextData.isFrozen()) { // came from CopyOnWrite thread context
            contextData = null;
        } else {
            contextData.clear();
        }
    }
}

Your log events retain instances of SortedArrayStringMap, which seems to indicate that GarbageFreeSortedArrayThreadContextMap is used instead. In code you can check this with ProviderUtil.getProvider().getThreadContextMapInstance() (since 2.24.0); in previous versions, use ThreadContext.getThreadContextMap() (which returns null for the default map).
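
As a minimal sketch of that check on 2.24.x, using the call chain mentioned above (the class name ThreadContextMapCheck is made up for the example):

import org.apache.logging.log4j.spi.ThreadContextMap;
import org.apache.logging.log4j.util.ProviderUtil;

public final class ThreadContextMapCheck {
    public static void main(final String[] args) {
        // Available since 2.24.0: ask the installed Provider which ThreadContextMap it created.
        final ThreadContextMap map = ProviderUtil.getProvider().getThreadContextMapInstance();
        System.out.println("ThreadContextMap in use: " + (map == null ? "null" : map.getClass().getName()));
    }
}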

  • Under the same configuration (not a web app), when log4j2.garbagefreeThreadContextMap isn’t enabled, memory usage in 2.24.3 seems less stable compared to 2.23.1. Is this an expected change?

The use of DefaultThreadContextMap will generate slightly more temporary objects than CopyOnWriteSortedArrayThreadContextMap, in particular instances of JdkMapAdapterStringMap. However, these should have no influence on the amount of memory retained in the long term.

Right now, my best guess as to why your system behaves the way it does is that GarbageFreeSortedArrayThreadContextMap is somehow being used.

@shaonan666
Author

shaonan666 commented Mar 26, 2025

Hi @ppkarwasz

I checked the dump file and code, and it seems GarbageFreeSortedArrayThreadContextMap isn’t used.

In code you can check it with ProviderUtil.getProvider().getThreadContextMapInstance()

In version 2.24.3, using ProviderUtil.getProvider().getThreadContextMapInstance(), I got DefaultThreadContextMap, because GC_FREE_THREAD_CONTEXT_PROPERTY isn’t enabled:

if (threadContextMapClass == null) {
    threadContextMapClass = props.getBooleanProperty(GC_FREE_THREAD_CONTEXT_PROPERTY)
            ? GARBAGE_FREE_CONTEXT_MAP
            : WEB_APP_CONTEXT_MAP;
}

case WEB_APP_CONTEXT_MAP:
case WEB_APP_CLASS_NAME:
    return new DefaultThreadContextMap();

(screenshot)

in previous versions with ThreadContext.getThreadContextMap() (which returns null for the default map).

In version 2.23.1, using ThreadContext.getThreadContextMap() (which returns null for the default map), I got CopyOnWriteSortedArrayThreadContextMap, based on this logic:

  • Constants.ENABLE_THREADLOCALS is true because isWebApp is false
  • GcFreeThreadContextKey is false by default

private static ThreadContextMap createDefaultThreadContextMap() {
    if (Constants.ENABLE_THREADLOCALS) {
        if (GcFreeThreadContextKey) {
            return new GarbageFreeSortedArrayThreadContextMap();
        }
        return new CopyOnWriteSortedArrayThreadContextMap();
    }
    return new DefaultThreadContextMap(true);
}

(screenshot)

Additionally, I couldn’t find GarbageFreeSortedArrayThreadContextMap in the dump files for either version.

@ppkarwasz
Contributor

@shaonan666,

I was looking at the heap dump you shared before:

(heap dump screenshot)

I am not able to reproduce this state unless I add an additional context Property to my logger configuration:

<Root level="INFO">
  <AppenderRef ref="FILE"/>
  <Property name="key" value="value"/>
</Root>

Can you share your log4j2.xml configuration file?

@shaonan666
Author

Hi @ppkarwasz
This is my log4j2.xml configuration file. Could you please review it for me?

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="INFO">

    <properties>
        <Property name="log4j2.path" value="${sys:logging.path:[email protected]@}"/>
        <Property name="log4j2.immediateFlush" value="${sys:logging.immediateFlush:[email protected]@}"/>
        <Property name="log4j2.level" value="${sys:logging.level:[email protected]@}"/>
        <Property name="log4j2.max.file.size" value="xxxMB"/>
        <Property name="log4j2.max.file.count" value="xxx"/>
        <Property name="log4j2.max.file.time" value="xxx"/>

        <Property name="PID">????</Property>
        <Property name="LOG_EXCEPTION_CONVERSION_WORD">%xwEx</Property>
        <Property name="LOG_LEVEL_PATTERN">%5p</Property>
        <Property name="LOG_DATEFORMAT_PATTERN">yyyy-MM-dd HH:mm:ss.SSS</Property>
        <Property name="CONSOLE_LOG_PATTERN">xxxx</Property>
        <Property name="FILE_LOG_PATTERN">xxx</Property>

    </properties>

    <Appenders>

        <Console name="Console" target="SYSTEM_OUT">
            <PatternLayout pattern="${CONSOLE_LOG_PATTERN}"/>
        </Console>

        <RollingRandomAccessFile name="FILE_INFO"
                                 fileName="${log4j2.path}/info.log"
                                 filePattern="${log4j2.path}/archive/history_info-%d{yyyy-MM-dd}.%i.zip"
                                 immediateFlush="${log4j2.immediateFlush}">
            <PatternLayout pattern="${FILE_LOG_PATTERN}"/>
            <Policies>
                <TimeBasedTriggeringPolicy interval="1" modulate="true"/>
                <SizeBasedTriggeringPolicy size="${log4j2.max.file.size}"/>
            </Policies>
            <DefaultRolloverStrategy max="${log4j2.max.file.count}">
                <Delete basePath="${log4j2.path}/archive/" maxDepth="2">
                    <IfFileName glob="*.zip"/>
                    <IfLastModified age="${log4j2.max.file.time}"/>
                </Delete>
            </DefaultRolloverStrategy>
        </RollingRandomAccessFile>

        <RollingRandomAccessFile name="FILE_ERROR"
                                 fileName="${log4j2.path}/error.log"
                                 filePattern="${log4j2.path}/archive/history_error-%d{yyyy-MM-dd}.%i.zip"
                                 immediateFlush="${log4j2.immediateFlush}">
            <Filters>
                <ThresholdFilter level="WARN" onMatch="ACCEPT" onMismatch="DENY"/>
            </Filters>
            <PatternLayout pattern="${FILE_LOG_PATTERN}"/>
            <Policies>
                <TimeBasedTriggeringPolicy interval="1" modulate="true"/>
                <SizeBasedTriggeringPolicy size="${log4j2.max.file.size}"/>
            </Policies>
            <DefaultRolloverStrategy max="${log4j2.max.file.count}">
                <Delete basePath="${log4j2.path}/archive/" maxDepth="2">
                    <IfFileName glob="*.zip"/>
                    <IfLastModified age="${log4j2.max.file.time}"/>
                </Delete>
            </DefaultRolloverStrategy>
        </RollingRandomAccessFile>

    </Appenders>


    <Loggers>

        <Root level="${log4j2.level}" includeLocation="false">
            <AppenderRef ref="Console"/>
            <AppenderRef ref="FILE_INFO"/>
            <AppenderRef ref="FILE_ERROR"/>
        </Root>

    </Loggers>


</Configuration>

@ppkarwasz
Contributor

@shaonan666,

Looking more carefully at your memory graphs, I think that the expansion of StringBuilder instances in the RingBufferLogEvent due to GC-free logging is enough to explain the gradual increase in retained memory:

  • A never-used RingBufferLogEvent retains 176 bytes of memory. It contains an empty instance of SortedArrayStringMap, which is removed after the first usage, and no StringBuilder instance.
  • If the event is used to log a short message, its size increases to 352 bytes, due to the instantiation of the StringBuilder.
  • For longer messages, the size increases further, but is capped at 744 bytes.

Given the huge RingBuffer you are using (our default in GC-free mode is merely 4096), you should expect an increase of about 176 MiB in retained memory once all the RingBufferLogEvents have been used at least once.
Since only about 40% of your log events have been used so far, you have seen an increase of around 70 MiB.
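
A rough back-of-the-envelope check of these figures, assuming each used slot grows by roughly 176 bytes (from the 176-byte empty state to the 352-byte short-message state described above):

    1,048,576 slots × 176 bytes/slot = 184,549,376 bytes ≈ 176 MiB
    0.40 × 176 MiB ≈ 70 MiB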

@shaonan666
Author

Hi @ppkarwasz

StringBuilder instances in the RingBufferLogEvent due to GC-free logging

Does this refer to the fact that I only configured log4j2.enableThreadlocals=true (due to isWebApp=false)?

Given the huge RingBuffer you are using (our default in GC-free mode is merely 4096), you should expect an increase of about 176 MiB in retained memory once all the RingBufferLogEvents have been used at least once.

However, with the same configuration and code in version 2.23.1, why didn’t I observe this memory increase? What specific code changes from 2.23.1 to 2.24.3 might have caused this memory behavior change?

Thank you for your patient explanations.

@ppkarwasz
Contributor

However, with the same configuration and code in version 2.23.1, why didn’t I observe this memory increase? What specific code changes from 2.23.1 to 2.24.3 might have caused this memory behavior change?

Apparently the behavior in 2.23.1 was due to a bug (see #2379), which disabled part of the GC-free behavior.

To draw some conclusions from this thread, what are your expectations regarding GC-free mode?

  • Do you expect Log4j Core to pre-allocate StringBuilders of size log4j2.initialReusableMsgSize instead of doing it on-demand?
  • If the ThreadContextMap is in GC-free mode, the maps that contain context data never shrink. Should there be a configuration option to shrink them when they are cleared? cc/@vy

@vy
Member

vy commented Mar 26, 2025

If the ThreadContextMap is in GC-free mode, the maps that contain context data never shrink. Should there be a configuration option to shrink them when they are cleared?

@ppkarwasz, keeping the TCM size under a certain threshold sounds like a good thing, similar to what we do with StringBuilders via StringBuilders::trimToMaxSize and the log4j.layoutStringBuilder.maxSize property. The brutal truth is, as I have stated a couple of times before, that TCM really needs a rewrite.
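
For illustration only (not Log4j internals verbatim), the trimming pattern this refers to might look like the sketch below; the ReusableBuffer class and the 2048-character cap are made up for the example, while StringBuilders.trimToMaxSize is the existing helper mentioned above:

import org.apache.logging.log4j.util.StringBuilders;

// Illustrative reusable buffer: the StringBuilder is reused across calls, but trimmed
// back under a cap after each use so one huge message does not permanently inflate
// the retained size.
final class ReusableBuffer {
    private static final int MAX_SIZE = 2048; // hypothetical cap, in chars

    private final StringBuilder buffer = new StringBuilder(128);

    String render(final CharSequence message) {
        buffer.setLength(0); // reuse instead of allocating a new builder
        buffer.append(message);
        final String result = buffer.toString();
        StringBuilders.trimToMaxSize(buffer, MAX_SIZE); // shrink capacity if it grew past the cap
        return result;
    }
}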

@shaonan666
Author

shaonan666 commented Mar 26, 2025

Hi @ppkarwasz

I think I’ve got it. Let me describe what happened again to ensure I fully understand.

My program doesn’t specify a messageFactory.

In version 2.23.1, LoggerContext#getLogger(String name) ultimately initializes a ParameterizedMessageFactory:

public Logger getLogger(final String name) {
    return getLogger(name, null);
}

public AbstractLogger(final String name, final MessageFactory messageFactory) {
    this.name = name;
    this.messageFactory = messageFactory == null ? createDefaultMessageFactory() : narrow(messageFactory);
    this.flowMessageFactory = createDefaultFlowMessageFactory();
    // ...

In version 2.24.3, LoggerContext#getLogger(String name) uses the GC-free-compliant ReusableMessageFactory, based on the configuration log4j2.enableThreadlocals=true:

private static final MessageFactory DEFAULT_MESSAGE_FACTORY = createInstanceFromFactoryProperty(
        MessageFactory.class,
        MESSAGE_FACTORY_PROPERTY_NAME,
        Constants.ENABLE_THREADLOCALS ? ReusableMessageFactory.INSTANCE : ParameterizedMessageFactory.INSTANCE);

public Logger getLogger(final String name) {
    return getLogger(name, DEFAULT_MESSAGE_FACTORY);
}

These two different MessageFactory implementations are the main reason I observed a gradual increase in memory usage: to support GC-free logging, ReusableMessageFactory causes objects to be retained for reuse rather than released, and the retained memory keeps growing until all RingBufferLogEvent instances have been used at least once, at which point the growth stops.
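
As a side note (a sketch, not something proposed in this thread), the default selection shown above can also be bypassed per logger, since LogManager#getLogger accepts an explicit MessageFactory; the class name NonReusableLoggerExample is made up for the example:

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.logging.log4j.message.ParameterizedMessageFactory;

public final class NonReusableLoggerExample {
    // This logger always creates new ParameterizedMessage instances,
    // independent of the Constants.ENABLE_THREADLOCALS-based default shown above.
    private static final Logger LOGGER =
            LogManager.getLogger(NonReusableLoggerExample.class.getName(), ParameterizedMessageFactory.INSTANCE);

    public static void main(final String[] args) {
        LOGGER.info("Hello {}", "world");
    }
}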


Do you expect Log4j Core to pre-allocate StringBuilders of size log4j2.initialReusableMsgSize instead of doing it on-demand?

Pre-allocating a suitably sized StringBuilder based on a prediction could be a good choice for services with strict latency requirements. My main concern, however, was the gradual memory increase in version 2.24.3 that I didn’t see in 2.23.1; I initially suspected a memory leak. Based on the discussion above, it doesn’t seem to be a memory-leak risk, so I’m considering reducing the RingBuffer size to manage memory usage.
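
For a rough sense of the effect of lowering log4j2.asyncLoggerRingBufferSize to the GC-free default of 4096 mentioned above (assuming the same ~176 bytes of growth per used slot as in the earlier estimate):

    4,096 slots × 176 bytes/slot ≈ 0.7 MiB (versus ≈ 176 MiB at 1,048,576 slots)
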
Thank you for your patient explanations!
