New group consistency algorithm #6401
Comments
For this protocol to work, we need to make sure all group messages that have Currently Delta Chat adds self address to the "To" field if "To" field is empty. If Alice and Bob are in the group, Alice removes Bob and sends "member removed" to "Bob", but "Bob" is not included in the "To" field so Alice adds own address to the "To" field. Proper solution for the case when we don't have any members for the |
Consider this case:
One solution would be: Never accept changes if the received timestamp (i.e. An alternative that may be a bit better for eventual consistency would be: Never accept changes if the maximum received timestamp for all members in this group (i.e. |
This is expected. This at least achieves eventual consistency, even if by re-adding group members. If the group is somewhat active, Alice will likely receive at least one message after restoring the backup and sending a message to the group. We can maybe make Alice remove whatever members have timestamp older than 60 days and not present in the Any workarounds that we add should not result in the failure to achieve consistency, such as Alice forever sending messages to some member and other members never accepting this member into the memberlist. |
Can Charlie keep his own tombstone so that Delta Chat can at least show to Charlie that it may be an outdated addition? It shouldn't blame anyone and Charlie can decide to leave the group again more quickly. |
I think I will not implement this 60 days rule for initial implementation and postpone it for another PR. |
Also it seems that Also it seems that past members (and even non-members if they know the group id) can readd themselves, maybe we should ignore self-timestamps for |
This is already the case with the
Malicious users are explicitly out of scope for https://github.com/chatmail/specs/tree/main/group-membership, the problem is difficult enough already. Such user can already do worse things like creating a group with the same ID and different members then introducing such new group to previously known member, creating a clone group with known past members etc. If you really need to prevent such user from at least joining the group, you need to generate a new ID. Any attempts to e.g. ignore adding self to the group will result in "member is not part of the group" errors due to reordering in cases without any attackers. |
Current idea to fix the "user restored old backup" case after discussion with @Hocuri:
If the user who restored a backup is a chatmail user, they will likely have only some recent group messages and will synchronize to not-so-outdated state. If the user is a "gmail user" who has the full archive on IMAP, it will take some time to process messages but should not result in problems as long as user does not chat until fully downloading everything. With the recent fix to remove Secure-Join messages after processing (#6354) no automatic messages should be sent during fetching. Maybe we should not allow sending anything until the first full sync, but this is out of scope for the issue. |
The only thing is that this "memberlist timestamp" also needs to be sent. Otherwise, if two such stale states meet, it's not clear which one to prefer. Maybe quite a rare scenario (e.g. two users restore from old backups), but still. Sending the whole state is obviously better anyway. |
Yes, we talked about it and decided to ignore it. It is always possible to come up with a scenario when the chat is not active and then two users restore backups, one from 1 year old and another from 5 years old etc., things are going to break then. |
We can add additional timestamps to messages headers, they can basically work as |
This implements the new group consistency algorithm described in <#6401>. A new `Chat-Group-Member-Timestamps` header is added to send timestamps of member additions and removals. A member is part of the chat if its addition timestamp is greater than or equal to the removal timestamp.
PR is at #6404
The existing group consistency algorithm was built in an ad-hoc way by fixing known cases where it practically failed and adding tests each time. At the moment there is one unfixed case, in the form of PR #6021.
Instead of fixing this case one more time with another patch, the current plan is to rebuild group consistency from scratch. The algorithm is currently defined at https://github.com/chatmail/specs/tree/main/group-membership with a TLA+ model that was used to reject simpler solutions by finding counter-examples for the "eventual consistency" property. There is a Python model of the algorithm at https://github.com/chatmail/specs/tree/main/gmc which shows that the #6021 case is fixed and also tests the "immediate consistency" property. We will also keep all existing tests already defined in the core codebase. They should pass and may be changed if there is a good reason, but not removed, especially when they test compatibility with older Delta Chat versions and MUAs that is not captured by the formal spec.
The idea of the new algorithm is to maintain the group member set as a Last-Write-Wins set CRDT. To determine if some operation happened earlier or later we are going to use message timestamps with second precision. Using logical clocks (e.g. vector clocks) was considered, but sending them around would increase the amount of data, and the result of having temporarily unsynchronized clocks between devices is not fatal. Nowadays most devices are mobile devices that have clocks synchronized over the network without users having to know about it. I personally don't remember having to adjust my clock on any of my devices in the last 5-10 years at least; it just works and we don't need high resolution.
Currently the `chats_contacts` table consists of two columns, `chat_id` and `contact_id`. We will also add `add_timestamp` and `remove_timestamp` columns that default to 0. If `add_timestamp >= remove_timestamp`, the member is actually in the chat, otherwise the row is a tombstone. Note that if `add_timestamp == remove_timestamp`, the member is considered to be part of the chat, as this simplifies database migration.

We also add new headers `Chat-Group-Past-Members` and `Chat-Group-Member-Timestamps`. `Chat-Group-Past-Members` contains a list of past group members in a format similar to the `To:` header. `Chat-Group-Member-Timestamps` contains space-separated (just like the `To` field, for easy wrapping) unix timestamps (in seconds, just integers) for the `To` field followed by the `Chat-Group-Past-Members` field, in exactly this order. I have not decided yet if it makes sense to include the timestamp of the sender, i.e. for the `From` field. When a member leaves the group, the self address is included in `Chat-Group-Past-Members` anyway. We also want to have sealed sender at some point. I will leave the sender timestamp out for now. If this becomes a problem, we can add the sender explicitly in the `To` field, this is probably a cleaner solution.

There is no special compression for timestamps, messages are compressed by OpenPGP anyway. There are some ideas to e.g. only send differences for timestamps other than the first one or hex-encode, but I am not going to do this for the first implementation. We can play with this later before merging.
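
As a rough illustration of the table rule and the header layout described above, here is a minimal Python sketch. `MemberRow` and `build_headers` are hypothetical names made up for this example, not part of the core. It also assumes that the timestamp sent for a current member is its `add_timestamp` and the one sent for a past member is its `remove_timestamp`, and that the sender is not among the rows passed in.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MemberRow:
    """Hypothetical in-memory mirror of one chats_contacts row."""
    addr: str
    add_timestamp: int = 0      # column default is 0
    remove_timestamp: int = 0   # column default is 0

    def is_member(self) -> bool:
        # Equal timestamps count as "in the chat", so rows migrated
        # with the 0/0 defaults stay members.
        return self.add_timestamp >= self.remove_timestamp


def build_headers(rows: List[MemberRow]) -> Dict[str, str]:
    """Build the To, Chat-Group-Past-Members and timestamp headers."""
    current = [r for r in rows if r.is_member()]
    past = [r for r in rows if not r.is_member()]
    # Timestamps for the To addresses come first, then the past members,
    # in exactly the same order as the addresses appear.
    # Assumption: add time for current members, removal time for tombstones.
    timestamps = [r.add_timestamp for r in current]
    timestamps += [r.remove_timestamp for r in past]
    return {
        "To": ", ".join(r.addr for r in current),
        "Chat-Group-Past-Members": ", ".join(r.addr for r in past),
        "Chat-Group-Member-Timestamps": " ".join(str(t) for t in timestamps),
    }
```

For example, rows for a member added at 10, a member removed at 20 and a freshly migrated 0/0 member would give two `To` addresses, one past member and the timestamps `10 0 20`.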
When a message is received, if both the `Chat-Version` header and the `Chat-Group-Member-Timestamps` header exist, do the merging. First, adjust the received timestamps so they are not in the future by taking the minimum of the current timestamp and the received timestamp. Do the same when loading timestamps from the database: if they are in the future, assume them to be the current timestamp.
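
The clamping step could look roughly like this in Python. The function names and the `now` parameter (there only to make the sketch testable) are assumptions for illustration, not names from the core.

```python
import time
from typing import List, Optional


def clamp_timestamp(timestamp: int, now: Optional[int] = None) -> int:
    """Return the timestamp, or the current time if it lies in the future."""
    if now is None:
        now = int(time.time())
    return min(timestamp, now)


def parse_member_timestamps(header_value: str,
                            now: Optional[int] = None) -> List[int]:
    """Parse a space-separated timestamp header and clamp every value."""
    if now is None:
        now = int(time.time())
    # str.split() without arguments also copes with folded (wrapped) headers
    # where the separators are newlines or several spaces.
    return [clamp_timestamp(int(field), now) for field in header_value.split()]
```

The same `clamp_timestamp` helper would be applied to values loaded from the database before comparing them with received ones.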
For every member that does not exist in the `chats_contacts` table yet, just accept the state from the received message and the new timestamp. If the member already exists in the `chats_contacts` table, accept the new state if the received timestamp is higher than the stored timestamp. In case of the same timestamp and a conflict, e.g. local state says the member is removed (`is_tombstone` is true) and received state says the member is added (it is in the `To:` field rather than the `Past-Members:` field), prefer adding the member. Adding members is preferred to removing members because in this case the added member eventually learns that it is still part of the group and can leave, while if the user is removed they stop receiving messages and may not notice that they are no longer in the group.

If the message is received with `Chat-Version` but without `Chat-Group-Member-Timestamps`, it is an old Delta Chat message. In this case, if there is a `Chat-Member-Added` or `Chat-Member-Removed` header, do the action locally if the message is newer than the timestamp stored locally.

If the message is received without a `Chat-Version` header, it is a message sent from a non-Delta Chat client. In this case treat all members in the `To` field as if they are added with timestamp 0. This allows adding members to the group with non-DC clients, but not re-adding or removing them. It is mostly to support ad-hoc groups which are actually email threads and normally don't live long enough to have several iterations of member removals and re-additions.
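
The per-member merge rule and the non-Delta Chat case above can be sketched as follows. This is only an illustration under assumptions: `MemberState`, `merge_member` and `merge_non_chat_message` are hypothetical names, each member is reduced to a single timestamp plus an `is_tombstone` flag, and the old-Delta-Chat case (`Chat-Member-Added` / `Chat-Member-Removed`) is left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class MemberState:
    """Hypothetical per-member state: last change time and whether removed."""
    timestamp: int
    is_tombstone: bool


def merge_member(local: Optional[MemberState],
                 received: MemberState) -> MemberState:
    """Merge one received member state into the locally stored one."""
    if local is None:
        # Unknown member: accept the received state and timestamp as is.
        return received
    if received.timestamp > local.timestamp:
        # Strictly newer information wins.
        return received
    if (received.timestamp == local.timestamp
            and received.is_tombstone != local.is_tombstone):
        # Same timestamp but conflicting states: prefer adding the member.
        return MemberState(local.timestamp, is_tombstone=False)
    # Older or identical information: keep the local state.
    return local


def merge_non_chat_message(local: Dict[str, MemberState],
                           to_addrs: List[str]) -> None:
    """Message without Chat-Version: To members count as added at time 0."""
    for addr in to_addrs:
        local[addr] = merge_member(local.get(addr), MemberState(0, False))
```

Because the non-Delta Chat case uses timestamp 0, it can introduce a new member but never overrides a tombstone with a later timestamp, which matches the "adding but not re-adding or removing" behaviour described above.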
Expiration for tombstones is moved into #6427