Skip to content

Test Genesis State Machine with quickcheck-dynamic #1413

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

geo2a
Copy link
Contributor

@geo2a geo2a commented Mar 6, 2025

Fixes #1149

"no changelog", as the changes only affect the ouroboros-consensus-diffusion:consensus-test package.

In this PR, we refactor the Test.Consensus.GSM modules, which contain the tests for the Genesis State Machine component, to use a different state machine testing framework. Specifically, we are switching the tests from quickcheck-state-machine to quickcheck-dynamic. The main motivation for the switch is being able to run the Genesis State Machine in IOSim in order to benefit from IOSims ability to manipulate the passage of time and this speed-up the tests.

While the current GSM tests do run in IOSim, this is only possible due to vendoring a copy of quickcheck-state-machine that allows using IOSim, which we would like to avoid long-term.

Specifically, this PR:

  • adds QD-based tests
  • groups definition used in both QSM and QD tests

Below, I describe some aspects of the switch that I consider worth highlighting, hopefully making the review more manageable.

Differences from quickcheck-state-machine tests

While I've tried to replicate the existing quickcheck-state-machine-based tests as closely as possible, there are some unavoidable differences between the test suites stemming from how the quickcheck-dynamic library is structured and implemented.

Different distributions of command sequences

While the user can specify the logic to generate individual commands, both quickcheck-dynamic and quickcheck-state-machine have internal hard-coded logic to generate sequences of commands:

quickcheck-dynamic quickcheck-state-machine
generateActionsWithOptions generateCommandsState

Here are the partial tests reports, which show that QSM produces more uniform distribution of commands:

quickcheck-dynamic                     |             quickcheck-state-machine
                                       |
GSM (at most 1 peer):  OK (0.19s)      |             GSM (at most 1 peer):  OK (0.42s)
  +++ OK, passed 1000 tests.           |               +++ OK, passed 1000 tests.
                                       |
  Actions (24732 in total):            |               Commands (7336 in total):
  33.232% +TimePasses                  |               13.63% NewCandidate
  14.257% +ReadMarker                  |               13.63% StartIdling
  14.152% +ReadGsmState                |               13.63% TimePasses
  13.404% +ModifyCandidate             |               13.56% ReadMarker
  12.187% +StartIdling                 |               13.54% ReadGsmState
   6.740% +NewCandidate                |               13.41% ModifyCandidate
   3.324% +Disconnect                  |               10.17% Disconnect
   2.705% +ExtendSelection             |                8.42% ExtendSelection
                                       |
  Notables (2312 in total):            |               Notables (3347 in total):
  17.82% FellBehindN                   |               20.50% FellBehindN
  17.08% PreSyncingToSyncingN          |               19.30% PreSyncingToSyncingN
  16.09% <none>                        |               13.92% TooOldN
  11.98% TooOldN                       |               13.65% CaughtUpN
  10.86% CaughtUpN                     |                9.68% FlickerN
   9.34% SyncingToPreSyncingN          |                9.47% SyncingToPreSyncingN
   9.26% BigDurN                       |                9.20% BigDurN
   7.53% FlickerN                      |                4.27% <none>
   0.04% NotThrashingN                 |

Similarly, for 5 peers:

`quickcheck-dynamic`                   |             `quickcheck-state-machine`
                                       |
GSM (at most 5 peers): OK (0.25s)      |             GSM (at most 5 peers): OK (0.53s)
  +++ OK, passed 1000 tests.           |               +++ OK, passed 1000 tests.
                                       |
  Actions (26337 in total):            |               Commands (7314 in total):
  29.157% +TimePasses                  |               13.67% NewCandidate
  16.190% +StartIdling                 |               13.67% TimePasses
  15.643% +NewCandidate                |               13.65% StartIdling
  11.098% +ReadGsmState                |               13.29% ReadGsmState
  11.072% +ReadMarker                  |               13.28% ReadMarker
  10.024% +ModifyCandidate             |               13.04% ModifyCandidate
   4.291% +ExtendSelection             |               10.39% ExtendSelection
   2.525% +Disconnect                  |                9.01% Disconnect
                                       |
  Notables (2102 in total):            |               Notables (2788 in total):
  27.26% PreSyncingToSyncingN          |               31.17% PreSyncingToSyncingN
  21.93% SyncingToPreSyncingN          |               24.82% SyncingToPreSyncingN
  16.60% FellBehindN                   |               21.34% FellBehindN
  14.65% <none>                        |                6.85% BigDurN
   5.99% TooOldN                       |                6.71% TooOldN
   5.95% BigDurN                       |                6.56% CaughtUpN
   5.80% CaughtUpN                     |                1.94% FlickerN
   1.81% FlickerN                      |                0.61% <none>

QD also generates significantly more commands than QSM, probably due to longer defaults length for command sequences.

The glue code of quickcheck-dymamic is essentially pure and thus allows to use IOSim

The type of quickcheck-state-machine's runCommands function constraints the underlying monad to IO:

runCommands :: (MonadMask m, MonadIO m)
            => ...
            -> PropertyM m (History cmd resp, model Concrete, Reason)

This is needed, because quickcheck-state-machine uses TChans internally to orchestrate the interaction between the model and the system-under-test. This in particular makes it impossible to run the system-under-test in IOSim and motivated us to vendor a copy of QSM with the types adapted to permit IOSim.

In quickcheck-dymamic, commands are called actions, and run using the
runActions function, whose type gives the user much more freedom:

runActions
  :: forall state m e
   . ( StateModel state
     , RunModel state m
     , e ~ Error state m
     , forall a. IsPerformResult e a
     )
  => Actions state
  -> PropertyM m (Annotated state, Env)

class (forall a. Show (Action state a), Monad m) => RunModel state m where
  ...

I.e. the user needs to instantiate the RunModel class, where the library only requires m to be a Monad. The implementation of runActions itself is essentially pure and only uses the facilities of Test.Quickcheck.Monadic to invoke the underlying monadic code of the system-under-test.

Slightly different test setup routine

While quickcheck-state-machine represents state machines as records of functions, and thus gives a lot of flexibility to the user in terms pasteurisation, quickcheck-dynamic is not as flexible, as it uses type classes.

Specifically, I was having a hard time to emulate the pattern of parameterising the state machine with a Context that exists in the current QSM test (see here, for example). The Context data type contains parameters that are generated with quickcheck and are fixed over the execution of one property test, i.e. they are static from the state machine's point of view.

quickcheck-dynamic does not provide a way to supply static information to the state machine out of the box, thus not allowing to configure the initial state according to some external static parameter and presenting us with an instance of the classic configuration problem. I've tried several ways to solve this particular instance, but the turned out to be using reflection as suggested by @nfrisby. Specifically, I've taken inspiration from this blogpost by John Wiegley, where he uses reflection to provide configuration to quickcheck generators --- this is almost exactly what we want here!

The essence of the solution is to constrain the StateModel instance with an constraint provided by reflection. This way, we get to use have the reflected parameters in the initialState method:

instance Reflection.Reifies params StaticParams => QD.StateModel (Model params) where
  ...

  initialState = initModel params
    where params = Reflection.reflect (Proxy @params)

initModel :: StaticParams -> Model params

We would then create a value of StaticParams in the property definition:

prop_sequential_iosim :: Maybe UpstreamPeer -> LedgerStateJudgement -> QC.Fun (Set.Set UpstreamPeer) Bool -> QC.Property
prop_sequential_iosim pUpstreamPeerBound pInitialJudgement (QC.Fun _ pIsHaaSatisfied) =
    Reflection.reify (StaticParams{ pUpstreamPeerBound
                                  , pInitialJudgement
                                  , pIsHaaSatisfied
                                  }) $ \(Proxy :: Proxy params) ->
     -- NOTE: the actions have to be generated with an explicit call, as
     -- 'generator' relies on reflection for access to the parameters
     QC.forAll QC.arbitrary $ \actions ->
       runIOSimProp $ prop_sequential_iosim1 @_ @params actions

The only tricky thing here is to explicitly invoke the generator for actions, so that the reified value of StaticParams is in scope.

Next steps

Once the reviewers are happy with the changes, we should:

  • remove the tests based on quickcheck-state-machine
  • remove our copy of quickcheck-state-machien that allows IOSim
  • restructure the GSM tests modules accordingly, i.e. move the QD module one level up and rename it to GSMTests

Independently:

  • we need to make a Hackage release of quickcheck-dynamic, as this PR relies on the version from master which is not yet released
  • once the release is uploaded, we will be able to remove the s-r-p

@geo2a geo2a force-pushed the geo2a/issue-1149-refactor-gsm-tests-reflection branch from ab71f71 to ae48bb8 Compare March 6, 2025 16:46
@geo2a geo2a self-assigned this Mar 6, 2025
@geo2a geo2a force-pushed the geo2a/issue-1149-refactor-gsm-tests-reflection branch 4 times, most recently from ad743f9 to 7d704d0 Compare March 10, 2025 16:11
@@ -1,4 +1,4 @@
module Test.QuickCheck.Extras (
module Test.Ouroboros.Consensus.QuickCheck.Extras (
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This module needs to be moved, as quickcheck-dynamic also has Test.QuickCheck.Extras.

@geo2a geo2a marked this pull request as ready for review March 10, 2025 16:24
@geo2a geo2a force-pushed the geo2a/issue-1149-refactor-gsm-tests-reflection branch 2 times, most recently from 521c87e to 3261feb Compare March 19, 2025 14:49
@geo2a geo2a force-pushed the geo2a/issue-1149-refactor-gsm-tests-reflection branch 2 times, most recently from bf22c03 to ba4c4f4 Compare March 20, 2025 12:56
github-merge-queue bot pushed a commit that referenced this pull request Mar 25, 2025
- bump the index state in `cabal.project`
- update hackage.nix, use the legacy `for-stackage` branch, as we still
support GHC 8.10. See [this
note](https://github.com/input-output-hk/hackage.nix/blob/58d6ddbb7cb41518a2f61a6b1f123308f6eece76/default.nix#L2)
for more details.
- bump tool index state

This PR is needed to remove an s-r-p on `quickcheck-dynamic` in
#1413
@geo2a geo2a force-pushed the geo2a/issue-1149-refactor-gsm-tests-reflection branch 2 times, most recently from 1ade547 to 2de5a06 Compare March 26, 2025 12:28
@geo2a geo2a moved this to 👀 In review in Consensus Team Backlog Mar 31, 2025
Copy link
Member

@amesgen amesgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great!


Thanks a lot for the thorough attention to labelling! One further observation: we set the minimum command length to 20 for QSM; for QD, I get this distribution:

        # of actions (1000 in total):
        41.5% "0 - 9"
        19.4% "10 - 19"
        12.2% "20 - 29"
         7.6% "30 - 39"
         6.0% "40 - 49"
         3.1% "50 - 59"
         3.0% "100 - 199"
         2.9% "60 - 69"
         2.2% "70 - 79"
         1.1% "80 - 89"
         0.8% "90 - 99"
         0.2% "200 - 299"

This still seems completely fine, especially as the chosen value of 20 for QSM was ad-hoc either way.


The PR description can be updated in two aspects:


Apart from that, I think we can now proceed as described in the PR description and remove the vendored QSM code and the QSM GSM test.

@geo2a geo2a force-pushed the geo2a/issue-1149-refactor-gsm-tests-reflection branch 3 times, most recently from 60ecd1f to 7d887a9 Compare April 7, 2025 11:04
@geo2a geo2a force-pushed the geo2a/issue-1149-refactor-gsm-tests-reflection branch from db77deb to 65cf44b Compare April 7, 2025 14:28
@geo2a geo2a requested a review from amesgen April 8, 2025 08:39
geo2a added 2 commits April 10, 2025 09:04
- add tests based on quickcheck-dynamic
- remove old tests based on quickcheck-state-machine, substituted by the above
- Test.QuickCheck.Extras -> Test.Ouroboros.Consensus.QuickCheck.Extras
@geo2a geo2a force-pushed the geo2a/issue-1149-refactor-gsm-tests-reflection branch from 65cf44b to d3a2163 Compare April 10, 2025 08:05
@jasagredo
Copy link
Contributor

I think the failing reference must now point to https://developers.cardano.org/docs/get-started/cardano-node/installing-cardano-node

@geo2a
Copy link
Contributor Author

geo2a commented Apr 10, 2025

I've fixed the link, thanks @jasagredo!

@geo2a
Copy link
Contributor Author

geo2a commented Apr 14, 2025

Blocked on update of quickcheck-lockstep to use quickcheck-dynamic 4.0.0, see this Slack thread.

@dnadales dnadales moved this from 👀 In review to 🚫 Help needed in Consensus Team Backlog May 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: 🚫 Help needed
Development

Successfully merging this pull request may close these issues.

switch to quickcheck-dynamic and/or upstream the io-sim-based QSM module
3 participants