Rework the XltRandom usage pattern for more stability #582

@rschwietzke

Goal

Rework the XltRandom handling and use JDK 17 random generators to reduce the synchronization overhead. Also ensure that the random generator is reseeded when the test case is executed. This will ensure that the random data stream during initialization of the class and test case and the one during execution of the test are distinct and won't interfere.

Non Goal

We cannot and will not address all possible random sequence conflicts because that is almost impossible. Some have to be addressed by the consumer. We will extend the API for that.

Introduction

Randomness is essential for realistic load tests: it avoids patterns and rhythm, and it can expose behavior one might not have thought of. Of course, randomness makes debugging harder because what happened in execution A might not happen in B.

The random data handling of XLT is unique because it allows you to reproduce locally, at any time, tests that broke during a load test. It also permits fixing the randomness for repeatable local executions. This is done by exploiting a characteristic of PRNGs: same seed, same outcome.
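The "same seed, same outcome" property can be illustrated with a plain java.util.Random. This is only a minimal sketch; the class name, the seed value, and the draw helper are made up for illustration and are not part of XLT.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical demo class, not part of XLT.
final class SameSeedSameOutcome {
    // Draws `count` values from a generator started with the given seed.
    static int[] draw(long seed, int count) {
        Random random = new Random(seed);
        int[] values = new int[count];
        for (int i = 0; i < count; i++) {
            values[i] = random.nextInt(1000);
        }
        return values;
    }

    public static void main(String[] args) {
        // Pretend the first call happened during a load test and the
        // second one is the local replay with the reported seed.
        int[] loadTestRun = draw(4711L, 5);
        int[] localReplay = draw(4711L, 5);
        System.out.println(Arrays.equals(loadTestRun, localReplay)); // true
    }
}
```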

Challenges

Synchronization

Right now, we reseed the PRNG for each iteration with a self-created new seed that takes the user number and the previous seed into account. This is mostly fine and not a big issue.

The random generator we use is per thread, hence it does not require synchronization. But because we use a standard java.util.Random, which is built to be thread-safe, we pay for synchronization that we do not need. It is unlikely that Java can optimize that away because it cannot see that we use the generator strictly per thread. The overhead might be small, but we run things at extreme scale, so any saving is appreciated.
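A rough sketch of what a per-thread generator without shared state could look like on JDK 17. The class PerThreadRandom and its methods are hypothetical and do not reflect the actual XltRandom internals; the sketch assumes that the JDK 17 algorithm implementations are not built for concurrent use, so each thread keeps its own instance.

```java
import java.util.random.RandomGenerator;
import java.util.random.RandomGeneratorFactory;

// Hypothetical holder, not the real XltRandom.
final class PerThreadRandom {
    private static final RandomGeneratorFactory<RandomGenerator> FACTORY =
            RandomGeneratorFactory.of("Xoroshiro128PlusPlus");

    // One generator per thread, so no instance is ever shared.
    private static final ThreadLocal<RandomGenerator> CURRENT = new ThreadLocal<>();

    // Replaces the calling thread's generator with a freshly seeded one.
    static void reseed(long seed) {
        CURRENT.set(FACTORY.create(seed));
    }

    static int nextInt(int bound) {
        RandomGenerator rng = CURRENT.get();
        if (rng == null) {
            throw new IllegalStateException("reseed(...) has not been called on this thread");
        }
        return rng.nextInt(bound);
    }

    public static void main(String[] args) throws InterruptedException {
        reseed(7L);
        int onMain = nextInt(1000);

        // A second thread with the same seed sees the same first value.
        int[] onWorker = new int[1];
        Thread worker = new Thread(() -> {
            reseed(7L);
            onWorker[0] = nextInt(1000);
        });
        worker.start();
        worker.join();
        System.out.println(onMain == onWorker[0]);
    }
}
```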

Stream Jitter

One can use XltRandom everywhere, including in constructors, static initializers, or singleton code. All this makes the stream of numbers highly sensitive to the order in which code runs. We want almost 100% reliable reproducibility, whether this is the first execution or the 1000th.

Proposed Changes

Generator

Suggest using the good but cheap Xoroshiro128PlusPlus from the Xoroshiro group.
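For reference, a small sketch of how this generator can be obtained by name through the JDK 17 RandomGeneratorFactory and started from a known seed. The class GeneratorChoice is hypothetical. The sketch also prints the capability flags of the algorithm; to my knowledge the JDK exposes Xoroshiro128PlusPlus as a jumpable and leapable generator rather than a splittable one, so it is worth checking these flags before relying on split().

```java
import java.util.random.RandomGenerator;
import java.util.random.RandomGeneratorFactory;

// Hypothetical helper, not part of XLT.
final class GeneratorChoice {
    static final String ALGORITHM = "Xoroshiro128PlusPlus";

    // Creates a generator of the proposed algorithm from a known seed.
    static RandomGenerator create(long seed) {
        return RandomGeneratorFactory.of(ALGORITHM).create(seed);
    }

    public static void main(String[] args) {
        RandomGeneratorFactory<RandomGenerator> factory = RandomGeneratorFactory.of(ALGORITHM);

        // Properties reported by the JDK for this algorithm.
        System.out.println("group:      " + factory.group());
        System.out.println("state bits: " + factory.stateBits());
        System.out.println("splittable: " + factory.isSplittable());
        System.out.println("leapable:   " + factory.isLeapable());

        // Same seed, same first value.
        System.out.println(create(99L).nextLong() == create(99L).nextLong());
    }
}
```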

Prestart XltRandom

Don't start XltRandom when it is first requested but rather before we start the test case.

Fixed Substreams

We split the stream at fixed points so that each execution phase has its own random state. The original generator is never handed to the test case, so test code cannot disturb it.

Ideas for split points follow. We have to perform every split on every execution, even if the corresponding action does not take place.

  • Before we load the test case class
  • When the test case is constructed
  • When the test case is set up (beforeClass, before, and so on)
  • When the test case is executed, aka the run
  • When shutdown and teardown happen, analogous to setup

Or in English: We determine the lifecycle of a test case and split off a new stream at each point, independent of whether this stream is needed or not.
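The lifecycle idea above could be sketched as follows. The Phase enum and the LifecycleStreams class are hypothetical names chosen for this example. Because split() needs a RandomGenerator.SplittableGenerator, the sketch uses L64X128MixRandom as a stand-in; which algorithm XLT ends up using is an open assumption here. Every phase gets its stream up front, whether it is used or not, so consuming random data in one phase does not shift the stream of another.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.random.RandomGenerator;
import java.util.random.RandomGenerator.SplittableGenerator;
import java.util.random.RandomGeneratorFactory;

// Hypothetical sketch, not part of XLT.
final class LifecycleStreams {
    // Phase names mirror the split points listed above.
    enum Phase { CLASS_LOAD, CONSTRUCT, SETUP, RUN, TEARDOWN }

    // Splits one stream per phase off a root generator, always in the same order.
    static Map<Phase, RandomGenerator> forSeed(long seed) {
        RandomGeneratorFactory<SplittableGenerator> factory =
                RandomGeneratorFactory.of("L64X128MixRandom");
        SplittableGenerator root = factory.create(seed);

        Map<Phase, RandomGenerator> streams = new EnumMap<>(Phase.class);
        for (Phase phase : Phase.values()) {
            streams.put(phase, root.split()); // split even if the phase never draws
        }
        return streams; // the root itself is not handed out
    }

    public static void main(String[] args) {
        Map<Phase, RandomGenerator> first = forSeed(42L);
        Map<Phase, RandomGenerator> second = forSeed(42L);

        // Only the first execution uses random data during setup.
        first.get(Phase.SETUP).nextLong();
        first.get(Phase.SETUP).nextLong();

        // The run phase is unaffected by what happened in setup.
        System.out.println(
                first.get(Phase.RUN).nextLong() == second.get(Phase.RUN).nextLong());
    }
}
```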

Split Stream

We use the split functionality of the JDK 17 RandomGenerator.SplittableGenerator interface, where a new stream of random data can be branched off an existing one. The split also advances the original stream, but in a deterministic way.

So, when we start a new RandomGenerator, we use the known initial seed concept and start a new stream. For each test execution, we create a new generator using our known concept of seeds, so we know the seed and can save it for communication back to the developer.
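A minimal sketch of a per-execution generator that remembers its seed and hands out deterministic branches. ExecutionRandom, seed(), and branch() are hypothetical names, and L64X128MixRandom is again only an assumed splittable algorithm. The point is that a replay created from the saved seed produces the same branches in the same order.

```java
import java.util.random.RandomGenerator;
import java.util.random.RandomGenerator.SplittableGenerator;
import java.util.random.RandomGeneratorFactory;

// Hypothetical sketch, not part of XLT.
final class ExecutionRandom {
    private static final RandomGeneratorFactory<SplittableGenerator> FACTORY =
            RandomGeneratorFactory.of("L64X128MixRandom");

    private final long seed;
    private final SplittableGenerator root;

    ExecutionRandom(long seed) {
        this.seed = seed;
        this.root = FACTORY.create(seed);
    }

    // The seed to report back to the developer.
    long seed() {
        return seed;
    }

    // Branches off a new stream; the root advances deterministically.
    RandomGenerator branch() {
        return root.split();
    }

    public static void main(String[] args) {
        ExecutionRandom failed = new ExecutionRandom(20240229L);
        long a = failed.branch().nextLong();
        long b = failed.branch().nextLong();

        // Replaying with the saved seed yields the same branches in order.
        ExecutionRandom replay = new ExecutionRandom(failed.seed());
        System.out.println(a == replay.branch().nextLong());
        System.out.println(b == replay.branch().nextLong());
    }
}
```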

Migration

No changes to the scripts should be necessary to continue working. The existing XltRandom API should not break, but we can add new APIs.

Documentation

Document the changes and write up some best practices to avoid issues with the reproducibility. Just random data is easy; making it debuggable is hard.
