GDPR and right to be forgotten #450

@r-dunning

Problem

If a Right to be Forgotten request is submitted, we cannot currently comply because the event streams are append-only. Any attempt to update existing events with anonymised data will cause every Change Feed Processor (CFP) subscribed to that event stream to re-emit those events.

Proposed Solution

Add log compaction and tombstone events so that we can drop data but keep a record that the data existed at some point. Log compaction is a standard approach for reducing event stream sizes and can also be used for data deletion/anonymisation.
Add a mechanism for appending a tombstone event which, when processed, deletes all prior events in the event stream.
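
To make the proposal concrete, here is a minimal sketch of tombstone-driven compaction over an in-memory stream. The `Event` shape, the `"Tombstone"` kind and the `compact` function are all hypothetical names chosen for illustration; they are not part of the existing library.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

TOMBSTONE = "Tombstone"  # hypothetical event kind used to mark a forget request


@dataclass(frozen=True)
class Event:
    revision: int
    kind: str
    data: Dict[str, Any] = field(default_factory=dict)


def compact(stream: List[Event]) -> List[Event]:
    """Drop every event before the most recent tombstone, keeping the tombstone."""
    for index in range(len(stream) - 1, -1, -1):
        if stream[index].kind == TOMBSTONE:
            return stream[index:]
    # No tombstone in the stream: nothing to compact.
    return list(stream)


stream = [
    Event(1, "Created", {"name": "Ada"}),
    Event(2, "NameChanged", {"name": "Grace"}),
    Event(3, TOMBSTONE, {"name": "REDACTED"}),
]
compacted = compact(stream)
```

After compaction only the tombstone remains, so the stream still records that data existed while the personal data itself is gone.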

Implementations

The implementation could be either an interface which Events implement, or an Attribute which is applied to Events.

  • Interface

    An Interface implemented by an Event could mandate a property holding the full snapshot. It can enforce a custom Apply method which returns that property and replaces the value in the snapshot during ApplyEvents - this enforces whole snapshot replacement.

  • Attribute

    An Attribute could be attached to an Event, requiring the user to manually update the properties on the snapshot with valid values. The snapshot would need total replacement, which means saving all of its properties in the Event constructor. This could get very large, and the user might forget to set some properties.
    Interface which an event can implement, containing a property of the snapshot type. When processed by the ApplyEvents method, it will add delete commands to the batch.
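
The interface option could look roughly like the sketch below, written in Python with a `Protocol` standing in for the interface. `ReplacesSnapshot`, `NameChanged`, `Forgotten` and this `apply_events` are hypothetical stand-ins, and the `("delete", revision)` tuples are an assumed representation of the delete commands added to the batch.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Protocol, Tuple, runtime_checkable


@runtime_checkable
class ReplacesSnapshot(Protocol):
    """Hypothetical interface: the event carries the full replacement snapshot."""
    snapshot: Dict[str, Any]


@dataclass
class NameChanged:
    revision: int
    name: str

    def apply(self, snapshot: Dict[str, Any]) -> Dict[str, Any]:
        return {**snapshot, "name": self.name}


@dataclass
class Forgotten:
    revision: int
    snapshot: Dict[str, Any]


def apply_events(events: List[Any]) -> Tuple[Dict[str, Any], List[Tuple[str, int]]]:
    snapshot: Dict[str, Any] = {}
    batch: List[Tuple[str, int]] = []
    prior_revisions: List[int] = []
    for event in events:
        if isinstance(event, ReplacesSnapshot):
            # Whole snapshot replacement, plus a delete command per prior event.
            batch.extend(("delete", revision) for revision in prior_revisions)
            prior_revisions.clear()
            snapshot = dict(event.snapshot)
        else:
            snapshot = event.apply(snapshot)
        prior_revisions.append(event.revision)
    return snapshot, batch


snapshot, batch = apply_events([
    NameChanged(1, "Ada"),
    NameChanged(2, "Grace"),
    Forgotten(3, {"name": None, "forgotten": True}),
])
```

Because the event owns the complete snapshot, nothing from earlier events can leak through, which is the main argument for the interface over the attribute.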

Considerations

  • Deletion:

    Synchronous deletion could be slow for very large event streams, so an asynchronous deletion process would probably be best.

  • CFPs:

    Currently CFPs are not subscribed to delete events, so deleted events will not be reprocessed.

  • Get by Id:

    A request for the latest version will return the snapshot as updated/set by the tombstone event.

  • Get by Revision number:

    If a process (e.g. a CFP emitting a revision-stamped event) requires a stream/snapshot at a particular revision, the proposed solution will just return null, as the latest revision number will be the revision of the tombstone event.

  • Audit:

    One of the major benefits of event sourcing is the built-in auditability of the streams. Log compaction would remove these audit logs unless a secondary system was built to save the audit data when deletion was requested.
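
As an illustration of the Get by Id and Get by Revision number considerations, the sketch below shows one way the lookups might behave once a stream has been compacted. `CompactedStream` and its fields are hypothetical, and returning `None` for any revision older than the tombstone is an assumption based on the behaviour described above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class CompactedStream:
    # Snapshots keyed by revision; after compaction only the tombstone
    # revision and anything appended later remain.
    snapshots: Dict[int, Dict[str, Any]] = field(default_factory=dict)
    tombstone_revision: Optional[int] = None

    def get_latest(self) -> Optional[Dict[str, Any]]:
        if not self.snapshots:
            return None
        return self.snapshots[max(self.snapshots)]

    def get_by_revision(self, revision: int) -> Optional[Dict[str, Any]]:
        # Revisions older than the tombstone were deleted, so callers get None.
        if self.tombstone_revision is not None and revision < self.tombstone_revision:
            return None
        return self.snapshots.get(revision)


compacted_stream = CompactedStream(
    snapshots={3: {"name": None, "forgotten": True}},
    tombstone_revision=3,
)
```

Any CFP that asks for a revision below the tombstone would need to handle the `None` result rather than assume a snapshot is always available.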

Keen to hear thoughts on the above and which approach people would prefer.
