Description
Thanks again for this astonishing release. I use it heavily, nearly every day! My issue:
During anonymization, a multi-word PII entity such as “Hans Müller” is replaced with multiple identical placeholders (e.g., two instances of 1_person). In the sidebar view, the placeholder covers all occurrences of the same entity and is shown uniformly as 1_person (Hans Müller, “Will replace [1_person] → Hans Müller”). At de-anonymization, however, the duplication causes a problem: each copy of the placeholder is independently restored to the full name “Hans Müller”. As a result, the output contains the entity twice ("Hans Müller Hans Müller") where the original text mentioned it only once.
See the pull request for a possible solution: append suffixes to the placeholders that mark the individual words.
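
To make the report easier to reproduce, here is a minimal standalone sketch of the behaviour and of the suffix idea. It does not use the project's actual code: all function names, the `[1_person_1]` suffix format, and the sample sentence are assumptions made for illustration only, and the pull request may implement this differently.

```python
import re

# Hypothetical mapping, as shown in the sidebar: placeholder -> original entity.
MAPPING = {"1_person": "Hans Müller"}


def anonymize_per_word(text, entity, placeholder):
    """Reported behaviour: every word of the entity gets the same placeholder."""
    replacement = " ".join(f"[{placeholder}]" for _ in entity.split())
    return text.replace(entity, replacement)


def deanonymize_naive(text, mapping):
    """Each placeholder copy is restored to the full entity, causing duplicates."""
    for placeholder, entity in mapping.items():
        text = text.replace(f"[{placeholder}]", entity)
    return text


def anonymize_with_suffix(text, entity, placeholder):
    """Assumed fix: tag each word with its position, e.g. [1_person_1] [1_person_2]."""
    count = len(entity.split())
    replacement = " ".join(f"[{placeholder}_{i}]" for i in range(1, count + 1))
    return text.replace(entity, replacement)


def deanonymize_with_suffix(text, mapping):
    """Restore only the word that the positional suffix points to."""
    def restore(match):
        placeholder, index = match.group(1), int(match.group(2))
        words = mapping.get(placeholder, "").split()
        if 1 <= index <= len(words):
            return words[index - 1]
        return match.group(0)  # unknown placeholder or index: leave untouched

    return re.sub(r"\[(\w+)_(\d+)\]", restore, text)


original = "Hans Müller called. Later Hans Müller left."

broken = deanonymize_naive(
    anonymize_per_word(original, "Hans Müller", "1_person"), MAPPING
)
fixed = deanonymize_with_suffix(
    anonymize_with_suffix(original, "Hans Müller", "1_person"), MAPPING
)

print(broken)  # Hans Müller Hans Müller called. Later Hans Müller Hans Müller left.
print(fixed)   # Hans Müller called. Later Hans Müller left.
```

With positional suffixes the round trip returns the original text, while the sidebar could still group all suffixed placeholders under the single entity 1_person.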