Skip to content

Remove SnapshotTableType#643

Closed
jorisdral wants to merge 2 commits intomainfrom
jdral/remove-table-type
Closed

Remove SnapshotTableType#643
jorisdral wants to merge 2 commits intomainfrom
jdral/remove-table-type

Conversation

@jorisdral
Copy link
Copy Markdown
Collaborator

No description provided.

@wenkokke
Copy link
Copy Markdown
Collaborator

Are we sure we want to remove it? I'm using it on my branch to distinguish between Simple and Full tables.

@jorisdral
Copy link
Copy Markdown
Collaborator Author

jorisdral commented Mar 24, 2025

Are we sure we want to remove it? I'm using it on my branch to distinguish between Simple and Full tables.

I don't think there is much of a benefit. Sure, you could now create a table from the Simple module and use it from the Full module, but nothing would break if you do (right?)

@wenkokke
Copy link
Copy Markdown
Collaborator

Are we sure we want to remove it? I'm using it on my branch to distinguish between Simple and Full tables.

I don't think there is much of a benefit. Sure, you could now create a table from the Simple module and use it from the Full module, but nothing would break if you do (right?)

If you create a table from Full, use it in Simple, then use it in Full, you change the monoid to const and back.

@wenkokke
Copy link
Copy Markdown
Collaborator

Are we sure we want to remove it? I'm using it on my branch to distinguish between Simple and Full tables.

I don't think there is much of a benefit. Sure, you could now create a table from the Simple module and use it from the Full module, but nothing would break if you do (right?)

If you create a table from Full, use it in Simple, then use it in Full, you change the monoid to const and back.

If we want to remove this, we should either rewrite the codebase so that the monoid is only ever used in mupserts, or at very least test that switching between the full and simple APIs doesn't alter the semantics using the state machine tests.

@jorisdral
Copy link
Copy Markdown
Collaborator Author

jorisdral commented Mar 24, 2025

If you create a table from Full, use it in Simple, then use it in Full, you change the monoid to const and back.

This you could do with just the Full module too. The user can change the monoid whenever they want and break the table. That's probably a separate thing to solve, and the table type doesn't really help very much with that

If we want to remove this, we should either rewrite the codebase so that the monoid is only ever used in mupserts, ...

I'm not sure I understand what you mean here. Could you elaborate?

@jorisdral
Copy link
Copy Markdown
Collaborator Author

jorisdral commented Mar 24, 2025

FYI, one idea would be reify the monoid as a datatype, and only allow a small set of monoids. We would then store this datatype in a table/snapshot so that the monoid can't be changed.

data ResolveMethod = Sum | Const | Subtract | ...

And we could even allow custom monoids still, but then we'll have to warn the user that the implementation doesn't check whether the monoid has changed unexpectedly, e.g.:

data ResolveMethod v = Sum | Const | Subtract | Custom v -- WARNING

@wenkokke
Copy link
Copy Markdown
Collaborator

If you create a table from Full, use it in Simple, then use it in Full, you change the monoid to const and back.

This you could do with just the Full module too. The user can change the monoid whenever they want and break the table. That's probably a separate thing to solve, and the table type doesn't really help very much with that

The monoid is derived from a type class, so switching the monoid should at least require switching orphan instances, at which point you must be aware you're doing something very wrong, since you could easily break, say, Data.Map that way as well.

@wenkokke
Copy link
Copy Markdown
Collaborator

If we want to remove this, we should either rewrite the codebase so that the monoid is only ever used in mupserts, ...

I'm not sure I understand what you mean here. Could you elaborate?

  • If you think switching table types is safe, we should at least test it before allowing it.

  • We could test this using state machine tests, where we compare the real table to the real table, except that on one side we sometimes switch from one API to another at snapshot boundaries.

  • Alternatively, if we manage to purge the monoid constraint from every internal operation save for mupsert, we could statically know it's safe.

@jorisdral
Copy link
Copy Markdown
Collaborator Author

jorisdral commented Mar 25, 2025

I'm fine with keeping the snapshot table type, but I think it just does not help very much with the fundamental problem: making sure that a snapshot isn't opened at the wrong Haskell types for keys/values/blobs (and therefore monoid and serialisation functions)

The monoid is derived from a type class, so switching the monoid should at least require switching orphan instances, at which point you must be aware you're doing something very wrong, since you could easily break, say, Data.Map that way as well.

I think it's easier to break than that: you can open the same snapshot using different types (with different monoids). Or you've upgraded a dependency and something changed unexpectedly in the ResolveValue instance. You don't have to switch orphan instances

We already have SnapshotLabel, which exists so that the user can "type" their snapshots. The user picks their key/value/blob type (and therefore they pick the monoid), and it's up to the user to make sure they label their snapshots so that they don't try to open the wrong one using the wrong types. The public API should be explicit about what to consider when labelling snapshots, but some of the choices (like choice of types, Serialisekey instances, ResolveValue instances) are out of our hands so we tell the user to come up with the labels

If we want to remove this, we should either rewrite the codebase so that the monoid is only ever used in mupserts, ...

I'm not sure I understand what you mean here. Could you elaborate?

* If you think switching table types is safe, we should at least test it before allowing it.

* We could test this using state machine tests, where we compare the real table to the real table, except that on one side we sometimes switch from one API to another at snapshot boundaries.

The only difference between public APIs is which ResolveValue function is passed into the internals. The internals function the same regardless of the public API. We can add a test to verify that, but it can probably be a simple property instead of using the state machine tests (too much boilerplate)

* Alternatively, if we manage to purge the monoid constraint from every internal operation save for mupsert, we could statically know it's safe.

I'm not sure what you mean here by "to purge". The constraint only exists at the public API, inside the internals it is only a ResolveValue function that is used when combining Entrys

@wenkokke
Copy link
Copy Markdown
Collaborator

wenkokke commented Mar 25, 2025

I think it's easier to break than that: you can open the same snapshot using different types (with different monoids). Or you've upgraded a dependency and something changed unexpectedly in the ResolveValue instance. You don't have to switch orphan instances

We already have SnapshotLabel, which exists so that the user can "type" their snapshots. The user picks their key/value/blob type (and therefore they pick the monoid), and it's up to the user to make sure they label their snapshots so that they don't try to open the wrong one using the wrong types. The public API should be explicit about what to consider when labelling snapshots, but some of the choices (like choice of types, Serialisekey instances, ResolveValue instances) are out of our hands so we tell the user to come up with the labels

Any sense in replacing SnapshotLabel with a serialized TypeRep?

From @dcoutts: They're unstable across GHC versions.

@wenkokke
Copy link
Copy Markdown
Collaborator

@jorisdral: With special cases for Type, List, and TupleN, I can get consistent serializations for TypeRep across every major GHC version since 8.10.7. So at very least it might be worth providing this as a generic function snapshotLabelFromTypeRep with some minor warnings?

@jorisdral jorisdral marked this pull request as draft April 9, 2025 12:59
@jorisdral
Copy link
Copy Markdown
Collaborator Author

Superseded by #707

@jorisdral jorisdral closed this May 7, 2025
@jorisdral jorisdral deleted the jdral/remove-table-type branch June 17, 2025 11:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants