
Comparability of ISC25 ior-rnd4K-easy-read results impacted by caching-sensitive behavior #109

@johnlewars


During the SC25 Birds-of-a-Feather session, it was suggested that some storage systems may achieve higher performance on ior-rnd4K-easy-read than on ior-easy-read. Based on analysis of published IO500 results and my local reproductions, this behavior appears to be concentrated in submissions using the earlier ISC25 versions of IO500 and is likely influenced by benchmark-version-specific issues.

TL;DR: ISC25 ior-rnd4K-easy-read results can show artificially high performance because benchmark issues in that version increase client-side cache hits, and they should not be directly compared with results from corrected benchmark versions.

In general, absent caching effects, I am not aware of any real storage backend that would deliver higher throughput for 4 KiB random reads than for larger sequential reads over the same dataset and client configuration.

Across the published IO500 results I reviewed, the cases where ior-rnd4K-easy-read exceeds or closely approaches ior-easy-read performance occur in results produced with the early ISC25 benchmark versions. These submissions show unusually high random-read-to-sequential-read ratios, consistent with increased cache reuse during the random-read phase. I have also reproduced this behavior independently.
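As a rough illustration of the ratio check described above, the sketch below flags submissions whose random-read result approaches or exceeds the sequential-read result. The `Submission` type, the field names, the 0.8 threshold, and all sample values are hypothetical and not taken from any published IO500 list; it also assumes both subscores are expressed in the same unit.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    name: str                # hypothetical label, not a real IO500 entry
    easy_read: float         # ior-easy-read result
    rnd4k_easy_read: float   # ior-rnd4K-easy-read result, same unit (assumed)


def rnd_to_seq_ratio(sub: Submission) -> float:
    """Ratio of 4 KiB random-read to sequential-read throughput."""
    if sub.easy_read <= 0:
        raise ValueError("ior-easy-read must be positive")
    return sub.rnd4k_easy_read / sub.easy_read


def flag_suspicious(subs, threshold=0.8):
    """Names of submissions whose ratio meets or exceeds the threshold.

    The threshold is an arbitrary illustrative cut-off, not an IO500 rule.
    """
    return [s.name for s in subs if rnd_to_seq_ratio(s) >= threshold]


# Made-up sample data, for illustration only.
samples = [
    Submission("system-a", easy_read=100.0, rnd4k_easy_read=4.0),
    Submission("system-b", easy_read=50.0, rnd4k_easy_read=60.0),
    Submission("system-c", easy_read=80.0, rnd4k_easy_read=70.0),
]

flagged = flag_suspicious(samples)
print(flagged)
```

A flagged entry is only a candidate for closer inspection; a high ratio alone does not prove that caching was involved.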

This effect may be especially visible for systems whose write phase produces a smaller effective read working set, because that working set is more likely to fit in client-side cache.

Two IO500 fixes made in the ISC25 timeframe materially change the random-read phase’s working set and access pattern; the issues they address are likely contributors to this behavior.

  1. Incorrect segment count used during read phase
    Commit: e286dbe
    The commit message states that the read phase was using the segment count from the easy-write stonewall file, whereas the intended behavior was to use the specified segment count, which is usually higher. This issue can reduce the effective read working set and may result in the entire working set, or a large portion of it, fitting in client-side cache.

  2. IOR update including rand64 fix
    Commit: cf79a6e
    This updates the IOR version used by IO500 to include the rand64 fix. That change can affect the random-access pattern used by IOR and therefore the degree of locality observed during the benchmark.

When client-side caching is not explicitly bypassed, for example through Direct I/O or an equivalent cache-avoidance mechanism, these issues can substantially increase the cache hit rate by reducing the effective working set and increasing locality. The result is that the test may measure a mixture of storage performance and cache reuse, rather than providing a consistent measure of backend 4 KiB random-read capability.
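To make the working-set argument concrete, here is a minimal model, assuming uniformly random reads and a cache that simply holds a fixed fraction of the working set in steady state. Under that assumption the hit rate is roughly the cache size divided by the working-set size, capped at 1. The function names and every number below are hypothetical; real caches, and any extra locality in the access pattern, would behave differently.

```python
GIB = 1024 ** 3


def steady_state_hit_rate(working_set_bytes: float, cache_bytes: float) -> float:
    """Approximate hit rate for uniform random reads (simplified model)."""
    if working_set_bytes <= 0:
        raise ValueError("working set must be positive")
    return min(1.0, cache_bytes / working_set_bytes)


def observed_iops(hit_rate: float, cache_iops: float, backend_iops: float) -> float:
    """Blend cache and backend rates by averaging the time per operation."""
    time_per_op = hit_rate / cache_iops + (1.0 - hit_rate) / backend_iops
    return 1.0 / time_per_op


# Hypothetical values, for illustration only.
cache = 256 * GIB
intended_ws = 4096 * GIB   # working set with the specified segment count
reduced_ws = 512 * GIB     # smaller working set caused by the wrong segment count
cache_iops, backend_iops = 5_000_000.0, 200_000.0

hit_intended = steady_state_hit_rate(intended_ws, cache)
hit_reduced = steady_state_hit_rate(reduced_ws, cache)

iops_intended = observed_iops(hit_intended, cache_iops, backend_iops)
iops_reduced = observed_iops(hit_reduced, cache_iops, backend_iops)

print(f"intended working set: hit rate {hit_intended:.4f}, ~{iops_intended:,.0f} IOPS")
print(f"reduced working set:  hit rate {hit_reduced:.4f}, ~{iops_reduced:,.0f} IOPS")
```

In this model, shrinking the working set raises the reported rate even though the backend is unchanged, which is the comparability concern in a nutshell.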

This appears to explain why some ISC25 results report ior-rnd4K-easy-read performance that exceeds or approaches ior-easy-read, which is otherwise unexpected for real storage media under comparable conditions.

This creates a comparability concern: ISC25 ior-rnd4K-easy-read results may benefit from unintended cache effects, while results from corrected versions reflect a different effective workload. As a result, the published table mixes results that are not directly comparable for this subtest, potentially overstating random-read performance for affected ISC25 submissions.

I suggest one of the following actions.

Preferred: flag ISC25 ior-rnd4K-easy-read results as non-comparable with corrected benchmark versions, or exclude that subscore from cross-version comparisons.

Alternative: provide an official compatibility mode that allows newer submissions to reproduce the ISC25 behavior for this subtest, if the goal is to compare against ISC25-era results. I am including this option because it has been suggested as a possible way to enable apples-to-apples comparison, although benchmark runs dominated by high cache-hit ratios are less representative of backend storage performance.

Additional mitigation: consider strengthening cache-avoidance requirements for ior-rnd4K-easy-read, for example by requiring Direct I/O where supported.

Could the IO500 maintainers clarify how ISC25 ior-rnd4K-easy-read results should be interpreted in light of these fixes, and whether these results should be flagged as non-comparable for this subtest?
