Log analysis uses Billed Duration rather than Duration for calculating Lambda run time #228

@ryancormack

Because Lambda's billing model rounds up to the nearest millisecond, using Billed Duration means fractional milliseconds are ignored when calculating the average time the Lambda runs for. This is unlikely to be an issue for the vast majority of use cases (if you are sensitive to fractional-millisecond latency drift, Lambda is unlikely to be the best tool); however, it does result in cold starts not being calculated properly.
When using Managed Runtimes you are not billed for the init duration of a function. In these circumstances, depending on the number of invocations and whether they are run in parallel or not, there could be a difference between what the Power Tuner reports and what the end user experiences from your Lambda Function. When using a Custom Runtime, or the Container Image package type, you are billed for the init duration, so the Power Tuner does show a more representative summary of the end user experience.
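
To illustrate the billing difference, here is a minimal sketch of how Billed Duration appears to relate to Duration and Init Duration. It assumes the billed value is the billable time rounded up to the next whole millisecond, which is consistent with the two sample REPORT lines later in this issue; `billed_duration_ms` and `init_is_billed` are hypothetical names, not part of the Power Tuner.

```python
import math


def billed_duration_ms(duration_ms: float, init_duration_ms: float,
                       init_is_billed: bool) -> int:
    """Estimate Billed Duration under the assumptions stated above.

    init_is_billed is False for the managed runtime case described here and
    True for the custom runtime / container image case.
    """
    billable = duration_ms + (init_duration_ms if init_is_billed else 0.0)
    # Assumption: billing rounds up to the next whole millisecond.
    return math.ceil(billable)


# Values taken from the sample REPORT lines in this issue.
managed = billed_duration_ms(25.30, 400.62, init_is_billed=False)
custom = billed_duration_ms(21.50, 376.86, init_is_billed=True)
```

With the sample values, this reproduces the 26 ms and 399 ms Billed Duration figures shown in the examples.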

However, the Power Tuner also displays cost/price compared to performance. For the cost metrics, Billed Duration rather than Duration is the correct value to use, and it highlights the cost benefit of using a Managed Runtime.

I couldn't find any historical issues related to this, so I suspect it has not been a problem so far. However, given there are two open PRs (here and here) and a long-running issue for power tuning cold starts only, I think this will become problematic. I could take two identical functions, with one running on a Node 20 custom runtime and one on the Node 20 Managed Runtime. The end user experience would be very similar (lots of slow cold starts), but the Power Tuner would report that the managed runtime returns in double-digit milliseconds and the custom one returns in 600+ ms.

I see there are already parameter options to discard outliers. Should this be considered an existing bug, or is this the expected behaviour?

Should the extraction logic be updated to detect whether it's a custom or managed runtime and pick the appropriate part of the Report log line, likely using different values for the billed and time metrics?

Examples:
Managed Runtime Cold Start:
Duration: 25.30 ms Billed Duration: 26 ms Memory Size: 1024 MB Max Memory Used: 79 MB Init Duration: 400.62 ms

Custom Runtime Cold Start:
Duration: 21.50 ms Billed Duration: 399 ms Memory Size: 1024 MB Max Memory Used: 86 MB Init Duration: 376.86 ms

In this example the Power Tuner would report something like this.
I think this is very misleading when considering only cold starts, and as close to 100% accurate as necessary when considering only warm starts. The cost part is correct, but I think the time part is wrong - although that's subjective, depending on what the real goal of the time metric is.
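
To make the gap concrete, here is a rough sketch that parses the two example REPORT fragments above and separates a cost metric from a time metric. It is not the Power Tuner's actual extraction code; `summarize`, `billed_ms` and `elapsed_ms` are hypothetical names, and treating Duration plus Init Duration as the "time" value is an assumption about what the time metric should represent.

```python
import re

# The two example REPORT fragments from this issue.
MANAGED = ("Duration: 25.30 ms Billed Duration: 26 ms Memory Size: 1024 MB "
           "Max Memory Used: 79 MB Init Duration: 400.62 ms")
CUSTOM = ("Duration: 21.50 ms Billed Duration: 399 ms Memory Size: 1024 MB "
          "Max Memory Used: 86 MB Init Duration: 376.86 ms")

# Plain "Duration" must not match inside "Billed Duration" or "Init Duration".
DURATION = re.compile(r"(?<!Billed )(?<!Init )Duration: ([\d.]+) ms")
BILLED = re.compile(r"Billed Duration: ([\d.]+) ms")
INIT = re.compile(r"Init Duration: ([\d.]+) ms")


def summarize(report_line: str) -> dict:
    """Return a cost metric and a time metric for one REPORT line."""
    duration = float(DURATION.search(report_line).group(1))
    billed = float(BILLED.search(report_line).group(1))
    init_match = INIT.search(report_line)
    # Init Duration is only present on cold starts.
    init = float(init_match.group(1)) if init_match else 0.0
    return {
        "billed_ms": billed,                      # used for cost
        "elapsed_ms": round(duration + init, 2),  # assumed time metric
    }


managed_summary = summarize(MANAGED)
custom_summary = summarize(CUSTOM)
```

Under this assumption the two cold starts have similar elapsed times (roughly 426 ms and 398 ms), while their billed values stay at 26 ms and 399 ms, which is the discrepancy described above.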
