|
| 1 | +# Data Classification Extension |
| 2 | + |
| 3 | +CloudEvents might contain payloads that are subject to data protection |
| 4 | +regulations such as GDPR or HIPAA. Knowing how event payloads are classified, |
| 5 | +which data protection regulations apply, and how payloads are categorized |
| 6 | +enables intermediaries and consumers to process events compliantly. |
| 7 | + |
| 8 | +This extension defines attributes that describe to |
| 9 | +[consumers](../spec.md#consumer) or [intermediaries](../spec.md#intermediary) |
| 10 | +how an event and its payload are classified, the category of the payload, and |
| 11 | +any applicable data protection regulations. |
| 12 | + |
| 13 | +These attributes are intended for classification at an event and payload level |
| 14 | +and not at a `data` field level. Classification at a field level is best defined |
| 15 | +in the schema specified via the `dataschema` attribute. |
| 16 | + |
| 17 | +## Notational Conventions |
| 18 | + |
| 19 | +As with the main [CloudEvents specification](../spec.md), the key words "MUST", |
| 20 | +"MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", |
| 21 | +"RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as |
| 22 | +described in [RFC 2119](https://tools.ietf.org/html/rfc2119). |
| 23 | + |
| 24 | +However, the scope of these key words is limited to when this extension is used. |
| 25 | +For example, an attribute being marked as "REQUIRED" does not mean it needs to |
| 26 | +be in all CloudEvents; rather, it needs to be included only when this |
| 27 | +extension is being used. |
| 28 | + |
| 29 | +## Attributes |
| 30 | + |
| 31 | +### dataclassification |
| 32 | + |
| 33 | +- Type: `String` |
| 34 | +- Description: Data classification level for the event payload within the |
| 35 | + context of a `dataregulation`. In situations where `dataregulation` is |
| 36 | + undefined or the data protection regulation does not define any labels, the |
| 37 | + RECOMMENDED labels are: `public`, `internal`, `confidential`, or |
| 38 | + `restricted`. |
| 39 | +- Constraints: |
| 40 | + - REQUIRED |
| 41 | + |
| 42 | +### dataregulation |
| 43 | + |
| 44 | +- Type: `String` |
| 45 | +- Description: A comma-delimited list of applicable data protection regulations. |
| 46 | + For example: `GDPR`, `HIPAA`, `PCI-DSS`, `ISO-27001`, `NIST-800-53`, `CCPA`. |
| 47 | +- Constraints: |
| 48 | + - OPTIONAL |
| 49 | + - If present, MUST be a non-empty string. Entries MUST NOT contain internal |
| 50 | + spaces. Leading and trailing spaces around each entry MUST be ignored. |
| 51 | + |
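The constraints on `dataregulation` can be illustrated with a small parsing sketch. The function name `parse_dataregulation` is hypothetical and not part of this extension. The sketch assumes the "no internal spaces" rule applies to each entry, and that an empty entry (for example from a trailing comma) is an error, which the text above does not state explicitly.

```python
from __future__ import annotations


def parse_dataregulation(value: str) -> list[str]:
    """Split a comma-delimited `dataregulation` value into its entries."""
    if not isinstance(value, str) or value == "":
        raise ValueError("dataregulation must be a non-empty string")

    # Leading and trailing spaces around each entry are ignored.
    entries = [entry.strip(" ") for entry in value.split(",")]

    for entry in entries:
        if entry == "":
            # Assumption: empty entries are treated as invalid.
            raise ValueError("dataregulation contains an empty entry")
        if " " in entry:
            raise ValueError(f"entry {entry!r} contains internal spaces")
    return entries
```

With this sketch, `parse_dataregulation("GDPR, HIPAA")` yields the two entries `GDPR` and `HIPAA`, while a value such as `PCI DSS` is rejected because of the internal space.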
| 52 | +### datacategory |
| 53 | + |
| 54 | +- Type: `String` |
| 55 | +- Description: Data category of the event payload within the context of a |
| 56 | + `dataregulation` and `dataclassification`. For GDPR personal data, typical |
| 57 | + labels are: `non-sensitive`, `standard`, `sensitive`, `special-category`. For |
| 58 | + US personal data, this could be: `sensitive-pii`, `non-sensitive-pii`, |
| 59 | + `non-pii`. For protected health information under HIPAA, it could be `phi`. |
| 60 | +- Constraints: |
| 61 | + - OPTIONAL |
| 62 | + - If present, MUST be a non-empty string |
| 63 | + |
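As an illustration of how a producer might attach the three attributes, the following sketch builds them as a plain dictionary and merges them into an event in the JSON event format. The helper `classification_attributes` and the example `id`, `source`, and `type` values are hypothetical. Treating an empty `dataclassification` as invalid is an assumption, since the text above only marks the attribute as REQUIRED.

```python
from __future__ import annotations


def classification_attributes(
    dataclassification: str,
    dataregulation: str | None = None,
    datacategory: str | None = None,
) -> dict[str, str]:
    """Build the extension attributes, omitting OPTIONAL ones that are unset."""
    if not dataclassification:
        # Assumption: the REQUIRED attribute must also be non-empty.
        raise ValueError("dataclassification is REQUIRED")
    attrs = {"dataclassification": dataclassification}

    # OPTIONAL attributes: if present, each MUST be a non-empty string.
    for name, value in (("dataregulation", dataregulation),
                        ("datacategory", datacategory)):
        if value is None:
            continue
        if value == "":
            raise ValueError(f"{name} must be a non-empty string if present")
        attrs[name] = value
    return attrs


# Example event; id, source and type are made-up values.
event = {
    "specversion": "1.0",
    "id": "1234",
    "source": "/example/orders",
    "type": "com.example.order.created",
    **classification_attributes(
        "confidential", dataregulation="GDPR", datacategory="sensitive"
    ),
}
```

The extension attributes sit alongside the other context attributes of the event, so intermediaries can read them without inspecting `data`.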
| 64 | +## Usage |
| 65 | + |
| 66 | +When this extension is used, producers MUST set the value of the |
| 67 | +`dataclassification` attribute. When applicable, the `dataregulation` and |
| 68 | +`datacategory` attributes MAY be set to provide additional details on the |
| 69 | +classification context. |
| 70 | + |
| 71 | +When an implementation supports this extension, intermediaries and |
| 72 | +consumers MUST take these attributes into account and act in accordance with |
| 73 | +data regulations and/or internal policies when processing the event and |
| 74 | +payload. If intermediaries or consumers cannot meet such requirements, they |
| 75 | +MUST reject the event and report an error through a protocol-level mechanism. |
| 76 | + |
| 77 | +If intermediaries or consumers are unsure how to interpret these attributes, |
| 78 | +for example when they encounter an unknown classification level or data |
| 79 | +regulation, they MUST assume they cannot meet requirements and MUST reject the |
| 80 | +event and report an error through a protocol-level mechanism. |
| 81 | + |
| 82 | +Intermediaries SHOULD NOT modify the `dataclassification`, `dataregulation`, and |
| 83 | +`datacategory` attributes. |
| 84 | + |
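The rejection rules in this section could be sketched as follows for a consumer. The names `KNOWN_CLASSIFICATIONS`, `KNOWN_REGULATIONS`, and `rejection_reason` are hypothetical, and the sets stand in for whatever policy a real deployment supports. How the error is reported (the protocol-level mechanism) is left to the caller and is not shown. As a simplification, the sketch checks classification labels independently of the regulation.

```python
from __future__ import annotations

# Hypothetical policy: the labels and regulations this consumer can handle.
KNOWN_CLASSIFICATIONS = frozenset(
    {"public", "internal", "confidential", "restricted"}
)
KNOWN_REGULATIONS = frozenset({"GDPR", "HIPAA"})


def rejection_reason(event: dict) -> str | None:
    """Return why the event must be rejected, or None if it can be processed."""
    classification = event.get("dataclassification")
    if classification is None:
        return "dataclassification is REQUIRED when the extension is used"
    if classification not in KNOWN_CLASSIFICATIONS:
        # Unknown level: assume the requirements cannot be met.
        return f"unknown classification level: {classification!r}"

    regulation = event.get("dataregulation")
    if regulation is not None:
        for entry in regulation.split(","):
            name = entry.strip()  # surrounding spaces are ignored
            if name not in KNOWN_REGULATIONS:
                return f"unknown data regulation: {name!r}"
    return None
```

A caller would reject the event and report an error whenever `rejection_reason` returns a string, and continue processing only when it returns `None`.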
| 85 | +## Use cases |
| 86 | + |
| 87 | +Examples where data classification of events can be useful are: |
| 88 | + |
| 89 | +- When an event contains PII or restricted information and therefore processing |
| 90 | + by intermediaries or consumers needs to adhere to certain policies. For |
| 91 | + example, having separate processing pipelines by sensitivity or having |
| 92 | + logging, auditing and access policies based upon classification. |
| 93 | +- When an event payload is subject to regulation and therefore retention |
| 94 | + policies apply. For example, having event retention policies based upon data |
| 95 | + classification or to enable automated data purging of durable topics. |