Contributor

@schlosna schlosna commented Dec 16, 2024

When using Jackson to deserialize timestamps with a formatter of DateTimeFormatter.ISO_OFFSET_DATE_TIME or DateTimeFormatter.ISO_ZONED_DATE_TIME, a lot of time is spent in InstantDeserializer::addInColonToOffsetIfMissing allocating a regex matcher and matching against a possible timezone offset, even if the input timestamp is already in a valid ISO 8601 format with an explicit zone of Z or a colon-separated offset.

Similar to #266
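
For readers who want to see the shape of the idea, here is a minimal sketch of a fast path that skips the regex when the input already ends in Z or in a colon-separated offset. It is not the code merged in this PR: the class name, the pattern, and the exact checks are assumptions made for illustration, and it only handles an offset at the very end of the string (no trailing zone id in brackets).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ColonOffsetSketch {
    // Hypothetical pattern: a trailing "+HHmm" or "-HHmm" offset with no colon.
    private static final Pattern COLONLESS_OFFSET = Pattern.compile("[+-]\\d{4}$");

    static String addColonToOffsetIfMissing(String text) {
        int len = text.length();
        // Fast path 1: an explicit UTC designator needs no rewriting.
        if (len > 0 && text.charAt(len - 1) == 'Z') {
            return text;
        }
        // Fast path 2: "+HH:MM" or "-HH:MM" already has its colon.
        if (len >= 6 && text.charAt(len - 3) == ':') {
            char sign = text.charAt(len - 6);
            if (sign == '+' || sign == '-') {
                return text;
            }
        }
        // Slow path: only now pay for the Matcher allocation and the regex scan.
        Matcher matcher = COLONLESS_OFFSET.matcher(text);
        if (!matcher.find()) {
            return text;
        }
        int split = len - 2;
        return text.substring(0, split) + ':' + text.substring(split);
    }

    public static void main(String[] args) {
        System.out.println(addColonToOffsetIfMissing("2024-12-16T13:57:00Z"));
        System.out.println(addColonToOffsetIfMissing("2024-12-16T13:57:00+01:00"));
        System.out.println(addColonToOffsetIfMissing("2024-12-16T13:57:00+0100"));
    }
}
```

The two character checks cost a few array reads, so inputs that are already well formed never allocate a Matcher; only colon-free offsets and unrecognized inputs reach the regex.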

# 2021 MacBookPro M1 Pro
# JMH version: 1.37
# VM version: JDK 21.0.5, OpenJDK 64-Bit Server VM, 21.0.5+11-LTS

Before (2.18.2)

Benchmark                                    Mode  Cnt     Score    Error  Units
InstantDeserializerBenchmark.offsetDateTime  avgt    5   942.358 ± 21.485  ns/op
InstantDeserializerBenchmark.zonedDateTime   avgt    5  1025.040 ± 37.269  ns/op

After (2.19.0-SNAPSHOT)

Benchmark                                    Mode  Cnt    Score     Error  Units
InstantDeserializerBenchmark.offsetDateTime  avgt    5  705.542 ±  20.482  ns/op
InstantDeserializerBenchmark.zonedDateTime   avgt    5  850.149 ± 219.331  ns/op
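
As a rough reading aid, the relative change implied by the scores above can be worked out directly: roughly 25% less time per operation for offsetDateTime and roughly 17% for zonedDateTime. The sketch below is only that arithmetic plus a naive check of whether the reported "score ± error" ranges overlap; the class and method names are made up for illustration, and the overlap check is not a substitute for a proper statistical comparison.

```java
final class BenchmarkDeltaSketch {
    /** Relative reduction in average time, as a fraction of the baseline. */
    static double improvement(double before, double after) {
        return (before - after) / before;
    }

    /** True when the ranges [a - aErr, a + aErr] and [b - bErr, b + bErr] share a point. */
    static boolean intervalsOverlap(double a, double aErr, double b, double bErr) {
        return a - aErr <= b + bErr && b - bErr <= a + aErr;
    }

    public static void main(String[] args) {
        // Scores and errors copied from the Before/After tables above (ns/op).
        System.out.printf("offsetDateTime improvement: %.1f%%%n",
                100 * improvement(942.358, 705.542));
        System.out.printf("zonedDateTime improvement:  %.1f%%%n",
                100 * improvement(1025.040, 850.149));
        System.out.println("offsetDateTime ranges overlap: "
                + intervalsOverlap(942.358, 21.485, 705.542, 20.482));
        System.out.println("zonedDateTime ranges overlap:  "
                + intervalsOverlap(1025.040, 37.269, 850.149, 219.331));
    }
}
```

The offsetDateTime ranges do not overlap, while the zonedDateTime "after" run has a wide error (± 219.331) whose range overlaps the "before" range, so that second number is best treated as indicative.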

Benchmark source:

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.json.JsonMapper;
import com.fasterxml.jackson.datatype.jsr310.JavaTimeModule;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;
import java.util.stream.IntStream;
import java.util.stream.Stream;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OperationsPerInvocation;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.infra.Blackhole;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.options.OptionsBuilder;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 5, time = 3, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 3, timeUnit = TimeUnit.SECONDS)
@Fork(1)
@State(Scope.Benchmark)
@SuppressWarnings({"designforextension", "NullAway", "CheckStyle"})
public class DateTimeDeserializerBenchmark {

    private static final ObjectMapper mapper = JsonMapper.builder()
            .defaultLocale(Locale.ENGLISH)
            .addModule(new JavaTimeModule())
            .build();

    private static final List<String> timestamps = timestamps();
    // Must equal timestamps().size(): JMH divides each invocation's time by this count.
    private static final int EXPECTED_TIMESTAMPS = 515;

    @Benchmark
    @OperationsPerInvocation(EXPECTED_TIMESTAMPS)
    public void zonedDateTime(Blackhole blackhole) throws Exception {
        for (String string : timestamps) {
            blackhole.consume(mapper.readValue(string, ZonedDateTime.class));
        }
    }

    @Benchmark
    @OperationsPerInvocation(EXPECTED_TIMESTAMPS)
    public void offsetDateTime(Blackhole blackhole) throws Exception {
        for (String string : timestamps) {
            blackhole.consume(mapper.readValue(string, OffsetDateTime.class));
        }
    }

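    // Builds the benchmark inputs: every whole-hour offset from -18 to +18, plus a
    // half-hour variant of each, rendered by four ISO formatters.
    public static List<String> timestamps() {
        return Stream.of(
                        DateTimeFormatter.ISO_DATE_TIME,
                        DateTimeFormatter.ISO_INSTANT,
                        DateTimeFormatter.ISO_OFFSET_DATE_TIME,
                        DateTimeFormatter.ISO_ZONED_DATE_TIME)
                .flatMap(f -> IntStream.rangeClosed(-18, 18)
                        .mapToObj(h -> Stream.of(
                                f.format(OffsetDateTime.now(ZoneOffset.ofHours(h))),
                                // ZoneOffset is limited to +/-18:00, so the extremes get no extra minutes.
                                f.format(OffsetDateTime.now(
                                        ZoneOffset.ofHoursMinutes(h, Math.abs(h) == 18 ? 0 : h < 0 ? -30 : 30)))))
                        .flatMap(Function.identity()))
                .flatMap(ts -> {
                    // If the string ends in a colon-separated offset ("+HH:MM"), also emit a
                    // colon-free copy ("+HHMM") so both offset styles are benchmarked.
                    int lastColon = ts.lastIndexOf(':');
                    if (lastColon == -1 || lastColon != ts.length() - 3) {
                        return Stream.of(ts);
                    }
                    return Stream.of(
                            ts, new StringBuilder(ts).deleteCharAt(lastColon).toString());
                })
                // Wrap each timestamp in quotes so it is a valid JSON string.
                .map(ts -> '"' + ts + '"')
                .toList();
    }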

    public static void main(String[] _args) throws Exception {
        new Runner(new OptionsBuilder()
                        .include(DateTimeDeserializerBenchmark.class.getSimpleName())
                        .build())
                .run();
    }
}
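
The colon-stripping step in timestamps() is easier to follow on fixed inputs. The sketch below isolates just that rule in a hypothetical helper (the class and method names are not part of the benchmark) and applies it to three hand-written timestamps instead of OffsetDateTime.now().

```java
import java.util.List;

final class OffsetVariantsSketch {
    /**
     * Returns the timestamp itself and, when it ends in a colon-separated
     * offset such as "+01:00", a second copy with that colon removed.
     */
    static List<String> variants(String ts) {
        int lastColon = ts.lastIndexOf(':');
        if (lastColon == -1 || lastColon != ts.length() - 3) {
            return List.of(ts);
        }
        return List.of(ts, new StringBuilder(ts).deleteCharAt(lastColon).toString());
    }

    public static void main(String[] args) {
        System.out.println(variants("2024-12-16T13:57:00+01:00"));
        System.out.println(variants("2024-12-16T13:57:00-05:30"));
        System.out.println(variants("2024-12-16T12:57:00Z"));
    }
}
```

Assuming every formatter prints the seconds field and ISO_INSTANT always prints Z, this rule also accounts for EXPECTED_TIMESTAMPS: 4 formatters × 74 offsets give 296 base strings, and the 3 × 73 = 219 that end in a colon-separated offset each gain a colon-free twin, for 515 in total.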

@schlosna schlosna changed the title from "Ds/colon offset" to "Optimize InstantDeserializer addInColonToOffsetIfMissing" Dec 16, 2024
@schlosna schlosna marked this pull request as ready for review December 16, 2024 13:57
Member

@JooHyukKim JooHyukKim left a comment

@schlosna Super interesting findings! 👍🏼👍🏼 Just out of curiosity, may I ask how much performance improvement this change makes in your use case/production? The performance test you shared (thank you) seems to show about a 20% improvement, but I am wondering how much, or just how, it helps in production.

Thank you in advance!

Comment on lines +815 to +816
@Test
public void OffsetDateTime_with_offset_can_be_deserialized() throws Exception {
Member

@JooHyukKim JooHyukKim Dec 16, 2024

It seems we could move this test and the zonedDateTime one below into a separate test class, such as Xxx336Test.java, since they share a purpose and style, but that might be overkill for now.

Contributor Author

Sure, I can consolidate these into a separate test class

Member

In this case, I think keeping them alongside the existing tests makes sense: this is not new functionality but an optimization (plus added test coverage). So let's not create per-issue test classes.

Member

JooHyukKim commented Dec 16, 2024

Also, for a second I thought about changing addInColonToOffsetIfMissing to protected for internal use, and then adding new (currently non-existent)

  • ZonedDateTimeDeserializer
  • OffsetDateTimeDeserializer

classes that override it for their own needs. This is also potentially overkill (at least at this point).

@schlosna
Contributor Author

> @schlosna Super interesting findings! 👍🏼👍🏼 Just out of curiosity, may I ask how much performance improvement this change makes in your use case/production? The performance test you shared (thank you) seems to show about a 20% improvement, but I am wondering how much, or just how, it helps in production.
>
> Thank you in advance!

Thanks for the quick review.

I have seen profiles pointing at the regex matcher allocations and method profiles pointing at addInColonToOffsetIfMissing for a number of production systems that heavily use Jackson and OffsetDateTime for serialization/deserialization. I will try to spin up a more realistic JMH benchmark workload.

@cowtowncoder cowtowncoder added the cla-received label (Marker to denote that there is a CLA for pr) Dec 17, 2024
@cowtowncoder
Member

While more performance results could be useful and interesting, I am satisfied with the included benchmarks. True, the end-to-end effect will be more limited, but this seems like a safe change given the test coverage.

So I will go ahead and merge. Targeting 2.19(.0) makes sense: the change looks safe enough, but it is not trivial, so I prefer including it in a minor version rather than a patch.

@cowtowncoder cowtowncoder merged commit 29aa2b8 into FasterXML:2.19 Dec 17, 2024
4 checks passed
@cowtowncoder cowtowncoder changed the title from "Optimize InstantDeserializer addInColonToOffsetIfMissing" to "Optimize InstantDeserializer addInColonToOffsetIfMissing()" Dec 17, 2024
@cowtowncoder cowtowncoder modified the milestones: 2.19., 2.19.0 Dec 17, 2024
cowtowncoder added a commit that referenced this pull request Dec 17, 2024
@schlosna schlosna deleted the ds/colon-offset branch December 17, 2024 02:16
@schlosna
Contributor Author

Thanks @cowtowncoder
