-
Notifications
You must be signed in to change notification settings - Fork 600
Description
Describe the Bug
Related Issue
Related to #5070
Summary
Our UUID definition is defined as "^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$")
which would correctly mark the rule ID of 7eb54028-ca72-4eb7-8185-b6864572347db
as invalid. In spite of this, this uuid was merged into main and passed unit tests (link). This is unexpected behavior and we should investigate why this occurred and address accordingly.
When verifying the code direcltly we use to enforce the UUID check, it correctly detects the issue, but unit tests are not.
❯ python
Python 3.12.11 (main, Jun 4 2025, 08:56:18) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>>
>>> UUID_PATTERN = re.compile(r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$")
>>>
>>> uuid_to_check = "7eb54028-ca72-4eb7-8185-b6864572347db"
>>>
>>> if UUID_PATTERN.match(uuid_to_check):
... print(f"{uuid_to_check} is a valid UUID.")
... else:
... print(f"{uuid_to_check} is not a valid UUID.")
...
7eb54028-ca72-4eb7-8185-b6864572347db is not a valid UUID.
Upon further investigation, it appears that the following Annotated definitions are not being enforced in rule data and metadata.
Examples
AlertSuppressionGroupBy = Annotated[
list[NonEmptyStr], fields.List(NON_EMPTY_STRING_FIELD, validate=validate.Length(min=1, max=3))
]
AlertSuppressionMissing = Annotated[str, fields.String(validate=validate.OneOf(["suppress", "doNotSuppress"]))]
AlertSuppressionValue = Annotated[int, fields.Integer(validate=validate.Range(min=1))]
BranchVer = Annotated[str, fields.String(validate=validate.Regexp(BRANCH_PATTERN))]
CardinalityFields = Annotated[
list[NonEmptyStr],
fields.List(NON_EMPTY_STRING_FIELD, validate=validate.Length(min=0, max=5)),
]
ConditionSemVer = Annotated[str, fields.String(validate=validate.Regexp(CONDITION_VERSION_PATTERN))]
Date = Annotated[str, fields.String(validate=validate.Regexp(DATE_PATTERN))]
Interval = Annotated[str, fields.String(validate=validate.Regexp(INTERVAL_PATTERN))]
MaxSignals = Annotated[int, fields.Integer(validate=validate.Range(min=1))]
NewTermsFields = Annotated[
list[NonEmptyStr], fields.List(NON_EMPTY_STRING_FIELD, validate=validate.Length(min=1, max=3))
]
PositiveInteger = Annotated[int, fields.Integer(validate=validate.Range(min=1))]
RiskScore = Annotated[int, fields.Integer(validate=validate.Range(min=1, max=100))]
RuleName = Annotated[str, fields.String(validate=elastic_rule_name_regexp(NAME_PATTERN))]
SemVer = Annotated[str, fields.String(validate=validate.Regexp(VERSION_PATTERN))]
SemVerMinorOnly = Annotated[str, fields.String(validate=validate.Regexp(MINOR_SEMVER))]
Sha256 = Annotated[str, fields.String(validate=validate.Regexp(SHA256_PATTERN))]
SubTechniqueURL = Annotated[str, fields.String(validate=validate.Regexp(SUBTECHNIQUE_URL))]
TacticURL = Annotated[str, fields.String(validate=validate.Regexp(TACTIC_URL))]
TechniqueURL = Annotated[str, fields.String(validate=validate.Regexp(TECHNIQUE_URL))]
ThresholdValue = Annotated[int, fields.Integer(validate=validate.Range(min=1))]
TimelineTemplateId = Annotated[str, fields.String(validate=elastic_timeline_template_id_validator())]
TimelineTemplateTitle = Annotated[str, fields.String(validate=elastic_timeline_template_title_validator())]
UUIDString = Annotated[str, fields.String(validate=validate.Regexp(UUID_PATTERN))]
This appears to be related to how marshmallow handled Annotated types.
Details
from marshmallow_dataclass import dataclass
from marshmallow import fields, validate
from typing import Optional
from typing import Annotated
from dataclasses import field
import re
UUID_PATTERN = re.compile(r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$")
UUIDString = Annotated[str, fields.String(validate=validate.Regexp(UUID_PATTERN))]
uuidstring = {"validate": validate.Regexp(UUID_PATTERN)}
@dataclass
class Example():
id: UUIDString
name: str
description: Optional[str] = None
@dataclass
class TestExample():
id: str = field(metadata=uuidstring)
name: str
description: Optional[str] = None
data = {"id": "123e4567-e89b-12d3-a456-426614174000213", "name": "Example"}
example = Example.Schema().load(data)
print(example)
data = {"id": "123e4567-e89b-12d3-a456-426614174000213", "name": "Example"}
example = TestExample.Schema().load(data)
print(example)
1. How Example
is Defined
- In
Example
, theid
field is typed asUUIDString
, which is anAnnotated
type. UUIDString
usesfields.String(validate=...)
as metadata, butmarshmallow
does not natively interpretAnnotated
types or their metadata during schema generation.- As a result, the
validate=validate.Regexp(UUID_PATTERN)
rule is ignored, and no validation is applied to theid
field.
2. How TestExample
is Defined
- In
TestExample
, theid
field explicitly usesfield(metadata={"validate": validate.Regexp(UUID_PATTERN)})
. marshmallow_dataclass
recognizes themetadata
dictionary and applies thevalidate.Regexp(UUID_PATTERN)
rule to theid
field during schema generation.- This ensures that the
id
field is validated against the UUID pattern when data is loaded.
However, NewType is deprecated in the most recent version of marshmallow-dataclass and Annotated is preferred method so the fix may be more involved.
DEPRECATED: Use typing.Annotated instead. NewType creates simple unique types to which you can attach custom marshmallow attributes. All the keyword arguments passed to this function will be transmitted to the marshmallow field constructor.
Upgrading to marshmallow-dataclass==8.7.1
resolves the bug as Annotated
functions as expected.
To Reproduce
- Checkout branch from [Tuning] System File Ownership Change #5051
- See that unit tests passed and UUID is invalid
Expected Behavior
Unit tests should fail as uuid is invalid.
Screenshots
No response
Desktop - OS
None
Desktop - Version
No response
Additional Context
No response