simulator: allow failure assertions #554

Merged
merged 9 commits into from
Dec 31, 2024

Conversation

alpaylan
Contributor

  • Add creating the same table twice to the list of checked properties, as an example of a failure property

Comment on lines 429 to 432
(
1,
Box::new(|rng: &mut R| property_double_create_failure(rng, env)),
),
Collaborator

Why is this a different property? I imagined that under create table we would have a probability, though a small one, of trying to create the same table.

Collaborator

I think this flat frequency approach will not scale when we have a bunch of properties, because we would have to rescale the percentages and change them manually. I like a hierarchical approach, for example: write 50%, create table 2% of writes, overwrite 10% of create table. In the end this is a write and should be included in that workload.
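
To make the hierarchy concrete, here is a minimal sketch of how nested shares could be flattened into absolute probabilities. The `Node` type, the `effective` function, and the node names are hypothetical and not taken from the simulator; only the percentages come from the comment above.

```rust
/// A node in a hypothetical workload hierarchy: a name, its share of the
/// parent's probability mass, and optional sub-cases.
struct Node {
    name: &'static str,
    share: f64, // fraction of the parent's probability, in [0, 1]
    children: Vec<Node>,
}

/// Walk the tree and record the absolute probability of reaching each node.
fn effective(node: &Node, parent_prob: f64, out: &mut Vec<(&'static str, f64)>) {
    let p = parent_prob * node.share;
    out.push((node.name, p));
    for child in &node.children {
        effective(child, p, out);
    }
}

/// The example from the comment: write 50%, create table 2% of writes,
/// overwrite 10% of create table. Node names are illustrative.
fn example_tree() -> Node {
    Node {
        name: "write",
        share: 0.50,
        children: vec![Node {
            name: "create_table",
            share: 0.02,
            children: vec![Node {
                name: "overwrite_existing",
                share: 0.10,
                children: vec![],
            }],
        }],
    }
}
```

Calling `effective(&example_tree(), 1.0, &mut out)` yields three entries. Under these numbers the overwrite case is reached with absolute probability 0.5 × 0.02 × 0.10 = 0.001, and changing the share of `write` rescales everything beneath it without touching the children.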

Contributor Author

I had a different perspective, so let me explain it. I think both approaches have merit, so I'm open to using either.

One thing I was planning for the future is extensible properties: you can imagine generating statements in the middle that abide by certain constraints. For example:

CREATE TABLE t1 (id INT);
...(no drop(t1), no alter(t1))
CREATE TABLE t1 (id CHAR);

Currently, the middle portion is just empty: we generate two consecutive creates. Ideally, that middle part would use constrained generation that we know will not invalidate the property.
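
As a rough sketch of that idea, the middle portion could be built by filtering candidate statements against the property's constraint. The `Stmt` enum, `preserves_table`, and `double_create_plan` below are hypothetical names for illustration and do not reflect the simulator's actual types.

```rust
/// A deliberately tiny, hypothetical statement model; each variant carries
/// the name of the table it touches.
#[derive(Debug, Clone, PartialEq)]
enum Stmt {
    CreateTable(String),
    DropTable(String),
    AlterTable(String),
    Insert(String),
}

/// The constraint from the example: no drop and no alter of the given table.
fn preserves_table(stmt: &Stmt, table: &str) -> bool {
    match stmt {
        Stmt::DropTable(t) | Stmt::AlterTable(t) => t.as_str() != table,
        _ => true,
    }
}

/// Build create / constrained middle / create. The final create is the
/// statement the property expects to fail.
fn double_create_plan(table: &str, candidates: Vec<Stmt>) -> Vec<Stmt> {
    let mut plan = vec![Stmt::CreateTable(table.to_string())];
    // Keep only the candidates that cannot invalidate the property.
    plan.extend(candidates.into_iter().filter(|s| preserves_table(s, table)));
    plan.push(Stmt::CreateTable(table.to_string()));
    plan
}
```

With an empty candidate list this degenerates to the current behavior of two consecutive creates. Statements against other tables, including drops, pass through untouched.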

I think this approach works better with a flat property structure, because properties would be fairly standalone: a new contributor could add a new property and test it separately without breaking existing properties in any way.

We can push the rule "10% of the writes should be creates" to the configuration level, and control the frequency as we did in other examples.
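
For illustration, a flat weighted table could be sampled roughly as follows. This is a sketch under assumptions: `pick` is a hypothetical helper, the caller supplies the random `roll`, and the weights would come from configuration rather than being hard-coded.

```rust
/// Select an entry from a flat table of (weight, property name) pairs.
/// `roll` is any random integer supplied by the caller; it is reduced
/// modulo the total weight, so each entry is chosen in proportion to its weight.
fn pick<'a>(table: &[(u32, &'a str)], roll: u32) -> Option<&'a str> {
    let total: u32 = table.iter().map(|(w, _)| *w).sum();
    if total == 0 {
        return None; // empty table or all weights zero
    }
    let mut remaining = roll % total;
    for (weight, name) in table {
        if remaining < *weight {
            return Some(*name);
        }
        remaining -= *weight;
    }
    None // not reached: remaining is always below the total weight
}
```

Adding a property is then a one-line change to the table, for example `[(90, "insert_select"), (10, "double_create_failure")]`, where `insert_select` is a made-up name and the second entry mirrors the property in this PR.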

(The 1 here is just me being lazy, sorry about that. I'll change it to use the "10% of writes" probabilities.)

I think this approach would potentially scale to 100s of different properties, which could be added incrementally, especially considering existing bugs in the code. Whenever someone discovers a bug, we could add a property that would catch it, and that property would be the part of the testing setup forever.

The hierarchical approach, in my opinion, would scale to maybe 20-30 properties, but it would be hard to maintain after that, at least that's my intuition.

What do you think?

Contributor Author

Adding onto this @pereman2,

I even have a plan to turn the properties into data structures in the near future, so the probabilities wouldn't be hard-coded as they are right now; instead we would count the interactions and compute the frequency dynamically. I can write up a concept implementation to make it more concrete if you would like.
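
One possible reading of this plan, sketched under heavy assumptions: each property is plain data listing its interactions, and the generator picks whichever property brings the observed write fraction closest to a configured target. `Kind`, `Property`, `distance`, and `next_property` are hypothetical names, not the concept implementation mentioned above.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Read,
    Write,
}

/// A property described as data: a name plus the interactions it performs.
struct Property {
    name: &'static str,
    interactions: Vec<Kind>,
}

/// How far the write fraction would be from the target after running `p`.
fn distance(p: &Property, writes: usize, total: usize, target: f64) -> f64 {
    let new_total = total + p.interactions.len();
    if new_total == 0 {
        return f64::INFINITY; // nothing to measure yet
    }
    let new_writes =
        writes + p.interactions.iter().filter(|k| **k == Kind::Write).count();
    (new_writes as f64 / new_total as f64 - target).abs()
}

/// Choose the property that moves the running write fraction closest to the target.
fn next_property<'a>(
    props: &'a [Property],
    writes: usize,
    total: usize,
    target: f64,
) -> Option<&'a Property> {
    props.iter().min_by(|a, b| {
        distance(*a, writes, total, target).total_cmp(&distance(*b, writes, total, target))
    })
}
```

This greedy choice is deterministic, so a real generator would presumably still mix in randomness. The point is only that the frequencies fall out of counted interactions instead of hand-tuned constants.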

Collaborator

I really like the concept of imposing constraints on execution while we want certain things to hold, like "during t1...t2, table T exists with the schema it was created with". Excited to see where this goes. Tbh I would be in favor of just merging this into main and seeing where it goes. You seem to have a lot of ideas and experience in this area, so I wouldn't want to stop that train while it has momentum.

@alpaylan
Contributor Author

The latest changes fix a bug in the simulator and add options to set the minimum size and the maximum time for the simulator.

@penberg penberg closed this in 21a9f29 Dec 31, 2024
@penberg penberg merged commit 21a9f29 into tursodatabase:main Dec 31, 2024
36 checks passed
4 participants