@@ -2036,32 +2036,129 @@ SHOULD use the terms defined by this document to do so.
20362036
20372037## Security Considerations {#security}
20382038
2039- Both schemas and instances are JSON values. As such, all security considerations
2040- defined in [RFC 8259][rfc8259] apply.
2041-
2042- Instances and schemas are both frequently written by untrusted third parties, to
2043- be deployed on public Internet servers. Implementations should take care that
2044- the parsing and evaluating against schemas does not consume excessive system
2045- resources. Implementations MUST NOT fall into an infinite loop.
2046-
2047- A malicious party could cause an implementation to repeatedly collect a copy of
2048- a very large value as an annotation. Implementations SHOULD guard against
2049- excessive consumption of system resources in such a scenario.
2050-
2051- Servers MUST ensure that malicious parties cannot change the functionality of
2052- existing schemas by uploading a schema with a pre-existing or very similar
2053- `$id`.
2054-
2055- Individual JSON Schema extensions are liable to also have their own security
2056- considerations. Consult the respective specifications for more information.
2057-
2058- Schema authors should take care with `$comment` contents, as a malicious
2059- implementation can display them to end-users in violation of a spec, or fail to
2060- strip them if such behavior is expected.
2061-
2062- A malicious schema author could place executable code or other dangerous
2063- material within a `$comment`. Implementations MUST NOT parse or otherwise take
2064- action based on `$comment` contents.
2039+ While schemas and instances are not always represented as JSON text, they are
2040+ defined in terms of the JSON data model. As such, the security considerations
2041+ defined in [RFC 8259][rfc8259] may still apply in environments where text-based
2042+ representations are used, particularly those considerations related to parsing,
2043+ number precision, and structural limitations.
2044+
2045+ Schemas and instances are frequently authored by untrusted parties.
2046+ Implementations that accept or evaluate such inputs may be exposed to several
2047+ classes of attack, particularly denial-of-service (DoS) by means of resource
2048+ exhaustion.
2049+
2050+ ### Nested `anyOf`/`oneOf`
2051+
2052+ One risk for resource exhaustion in JSON Schema arises from the nested use of
2053+ `anyOf` and `oneOf`. While a single combinator keyword with multiple subschemas
2054+ is typically manageable, nesting them causes the number of evaluation paths to
2055+ grow exponentially.
2056+
2057+ For example, a `oneOf` with 5 subschemas, each containing another `oneOf` with 5
2058+ options, results in 25 evaluation paths. Adding a third level increases this to
2059+ 125, and so on. Attackers can exploit this by crafting schemas that force
2060+ validators to explore a large number of branches.
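
The growth described above can be checked with a short, non-normative Python sketch. It assumes a simplified model in which an "evaluation path" is a leaf subschema reached through nested `anyOf`/`oneOf` branches; `$ref` is not followed. The helper names `count_paths`, `nested_one_of`, and `check_path_budget` are hypothetical and not part of any JSON Schema library.

```python
def count_paths(schema):
    """Count leaf subschemas reachable through nested anyOf/oneOf branches."""
    if not isinstance(schema, dict):
        return 1  # boolean schemas have no branches
    branches = []
    for keyword in ("anyOf", "oneOf"):
        value = schema.get(keyword)
        if isinstance(value, list):
            branches.extend(value)
    if not branches:
        return 1
    return sum(count_paths(branch) for branch in branches)


def nested_one_of(breadth, depth):
    """Build a oneOf tree with `breadth` options per level, `depth` levels deep."""
    if depth == 0:
        return {"type": "string"}
    return {"oneOf": [nested_one_of(breadth, depth - 1) for _ in range(breadth)]}


def check_path_budget(schema, max_paths):
    """Reject a schema whose combinator tree exceeds a configured budget."""
    paths = count_paths(schema)
    if paths > max_paths:
        raise ValueError(f"schema has {paths} evaluation paths, limit is {max_paths}")
    return paths


print(count_paths(nested_one_of(5, 2)))  # 25
print(count_paths(nested_one_of(5, 3)))  # 125
```

A check such as `check_path_budget(schema, max_paths=100)` could run when an untrusted schema is loaded, before any instance is evaluated against it. The budget value is an arbitrary illustration.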
2061+
2062+ This evaluation explosion is particularly dangerous when each path involves
2063+ expensive work such as collecting large annotations or evaluating complex
2064+ regular expressions. These effects multiply across paths and can result in
2065+ excessive CPU or memory consumption, leading to denial-of-service.
2066+
2067+ Implementations that evaluate untrusted schemas are encouraged to take steps to
2068+ mitigate these threats with measures such as bounding combinator keyword depth
2069+ and breadth, limiting memory used for annotation collection, and guarding
2070+ against resource-intensive validations such as pathological regexes.
2071+
2072+ ### Dynamic References
2073+
2074+ The paper ["Validation of Modern JSON Schema: Formalization and Complexity"
2075+ (Attouche et al., 2024)](https://doi.org/10.1145/3632891) has shown
2076+ that validation in the presence of dynamic references is PSPACE-complete. The
2077+ paper describes a method for replacing dynamic references with static ones, but
2078+ doing so can cause the size of the schema to grow exponentially. Implementations
2079+ should be aware of this risk and may wish to implement the method described in
2080+ the paper or impose limits on dynamic reference resolution.
2081+
2082+ ### Infinite Loops and Cycles
2083+
2084+ Infinite loops can occur when evaluating schemas that produce cycles during
2085+ reference resolution. These cycles may involve multiple schemas. Not all
2086+ recursive schemas create loops, but implementations are advised to detect these
2087+ cycles and terminate evaluation when they are encountered.
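
As a non-normative illustration, the sketch below detects the simplest kind of cycle: a chain of `$ref` keywords that returns to a reference already visited without passing through any keyword that descends into the instance. It assumes same-document references of the form `#/...` only, ignores JSON Pointer escaping, and does not look inside in-place applicators such as `allOf`. The function names are hypothetical. A fuller implementation would more likely track pairs of schema location and instance location on the evaluation stack.

```python
def resolve_local_ref(root, ref):
    """Resolve a same-document reference such as '#/$defs/a' (no escaping)."""
    if ref == "#":
        return root
    if not ref.startswith("#/"):
        raise ValueError(f"only same-document references are supported: {ref}")
    node = root
    for token in ref[2:].split("/"):
        node = node[token]
    return node


def has_ref_loop(root, ref):
    """Return True if following `$ref` from `ref` revisits a reference."""
    seen = set()
    while True:
        if ref in seen:
            return True  # the chain came back to a reference already followed
        seen.add(ref)
        target = resolve_local_ref(root, ref)
        if not isinstance(target, dict) or "$ref" not in target:
            return False  # the chain ends at a schema that does real work
        ref = target["$ref"]


looping = {
    "$defs": {
        "a": {"$ref": "#/$defs/b"},
        "b": {"$ref": "#/$defs/a"},
    }
}
recursive_tree = {
    "$defs": {
        "node": {
            "type": "object",
            "properties": {"child": {"$ref": "#/$defs/node"}},
        }
    }
}
```

Here `looping` is a true cycle, while `recursive_tree` is recursive but harmless, because each pass through `properties` moves to a deeper location in the instance.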
2088+
2089+ ### Schema Identity and Collisions
2090+
2091+ Schemas may declare an `$id` to identify themselves or have embedded schemas
2092+ that declare an `$id`. An attacker may attempt to register a schema with an
2093+ `$id` that collides with a previously registered schema, or that differs only by
2094+ case, encoding, or other URI normalization quirks. Such collisions could result
2095+ in overwriting or shadowing of trusted schemas.
2096+
2097+ Implementations should consider rejecting schemas that have identifiers
2098+ (including embedded schema identifiers) that conflict with registered schemas
2099+ and should apply any URI normalization and comparison logic consistently to
2100+ detect and prevent conflicts.
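
One way to apply comparison logic consistently is to normalize every `$id` before using it as a registry key. The Python sketch below is a minimal, non-normative example. `normalize_id` and `SchemaRegistry` are hypothetical names, and the normalization covers only scheme and host case, default ports, percent-encoding case, and empty fragments. It ignores userinfo, IPv6 literals, dot-segment removal, and embedded `$id` values, all of which a real registry would need to consider.

```python
import re
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_id(uri):
    """Return a comparison key for a schema identifier (partial normalization)."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    netloc = host
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"
    # Upper-case the hex digits of percent-encoded octets.
    path = re.sub(r"%[0-9a-fA-F]{2}", lambda m: m.group(0).upper(), parts.path)
    if netloc and not path:
        path = "/"
    return urlunsplit((scheme, netloc, path, parts.query, ""))


class SchemaRegistry:
    """Registry that refuses identifiers colliding after normalization."""

    def __init__(self):
        self._schemas = {}

    def register(self, schema):
        key = normalize_id(schema["$id"])
        if key in self._schemas:
            raise ValueError(f"schema identifier conflicts with a registered schema: {key}")
        self._schemas[key] = schema
        return key
```

With this sketch, registering `https://example.com/schemas/%7Euser.json` and then `HTTPS://Example.COM:443/schemas/%7euser.json#` raises `ValueError` on the second call, because both normalize to the same key.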
2101+
2102+ ### External Schema Resolution
2103+
2104+ JSON Schema implementations are expected to resolve external references using a
2105+ local registry. Although the specification allows for dynamic retrieval
2106+ (`https:` to fetch schemas over HTTP, or `file:` to read schemas from disk),
2107+ this behavior is discouraged unless it's intrinsic to the use case, such as with
2108+ JSON Hyper-Schema.
2109+
2110+ Resolving schemas dynamically introduces several security concerns, each of
2111+ which can be mitigated by limiting or controlling resolution behavior. A tightly
2112+ scoped schema resolution policy significantly reduces the attack surface,
2113+ especially when validating untrusted data.
2114+
2115+ Implementations are advised to disable dynamic retrieval by default and limit
2116+ external schema resolution to the local registry unless dynamic retrieval is
2117+ explicitly enabled. If enabled, they should consider limiting the number of
2118+ dynamic retrievals a validation can perform and defining timeouts on dynamic
2119+ retrievals to reduce the risk of resource exhaustion.
2120+
2121+ #### HTTP(S) Specific Threats
2122+
2123+ Allowing schema references to resolve over HTTP or HTTPS introduces several
2124+ threats:
2125+
2126+ * **Denial of Service (DoS)**: Validation may hang or become slow if a
2127+ referenced schema URL is slow to respond or never returns.
2128+ * **Server-Side Request Forgery (SSRF)**: Malicious schemas can reference
2129+ internal-only services using hostnames like localhost or private IPs.
2130+ Implementations are advised to restrict HTTP schema retrieval to a
2131+ configurable allowlist of trusted domains.
2132+ * **Lack of Integrity Guarantees**: Retrieved schemas may be altered in transit
2133+ or change between validations. If network retrieval is allowed,
2134+ implementations are advised to only allow retrieval over HTTPS unless
2135+ specifically configured to allow unsecured transport.
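
The allowlist and HTTPS-only advice in the list above can be sketched as a pre-flight check that runs before any network request is made. This is a non-normative Python example. `is_retrieval_allowed` and its parameters are hypothetical, and the check does not address redirects or DNS-based bypasses, which would need the same check applied again at connection time.

```python
from urllib.parse import urlsplit


def is_retrieval_allowed(uri, allowed_hosts, allow_insecure=False):
    """Decide whether a schema URI may be fetched over the network."""
    parts = urlsplit(uri)
    if parts.scheme not in ("http", "https"):
        return False  # only HTTP(S) retrieval is considered here
    if parts.scheme == "http" and not allow_insecure:
        return False  # unsecured transport must be enabled explicitly
    host = parts.hostname
    # Exact-match allowlist: localhost, private IPs, and unknown hosts all fail.
    return host is not None and host in allowed_hosts
```

An exact-match allowlist is stricter than a blocklist of private address ranges. A host that is not explicitly trusted is refused, whether it is named by hostname or by IP address.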
2136+
2137+ #### File System Specific Threats
2138+
2139+ Allowing resolution from the local filesystem (`file:` URIs) raises different
2140+ issues:
2141+
2142+ * **Information Disclosure**: Malicious schemas may access sensitive files on
2143+ the system. Implementations should consider restricting filesystem access to
2144+ a specific schema directory tree.
2145+ * **Cross-Context Access**: A schema fetched from HTTP may try to reference a
2146+ schema on the filesystem. Implementations are advised to allow resolving
2147+ `file:` references only when the referencing schema was itself loaded from the
2148+ filesystem, similar to same-origin policies in web browsers.
2149+ * **Exposing Internal Paths**: Schemas that use `file:` URIs may reveal
2150+ host-specific filesystem details in two ways: through the `$id` itself or
2151+ through schema locations in validation output. Implementations are advised to
2152+ reject `$id` values that use the `file:` scheme. If `file:` URIs are permitted
2153+ internally, implementations are advised to sanitize them (for example, by
2154+ converting them to relative URIs) to avoid exposing host filesystem structure
2155+ to users.
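
The directory restriction and the same-origin style rule from the list above can be combined in a single resolver check. The Python sketch below is non-normative and assumes a POSIX filesystem and Python 3.9 or later (for `Path.is_relative_to`). The function name, its parameters, and the example root `/srv/schemas` are hypothetical.

```python
from pathlib import Path
from urllib.parse import unquote, urlsplit


def resolve_schema_file(uri, schema_root, referrer_scheme):
    """Map a file: URI to a path, or raise if the policy forbids access."""
    if referrer_scheme != "file":
        # Cross-context access: only file-loaded schemas may reference files.
        raise PermissionError("file: references are only allowed from file: schemas")
    parts = urlsplit(uri)
    if parts.scheme != "file" or parts.netloc not in ("", "localhost"):
        raise ValueError(f"not a local file URI: {uri}")
    root = Path(schema_root).resolve()
    # resolve() collapses '..' segments and follows symlinks before the check.
    candidate = Path(unquote(parts.path)).resolve()
    if not candidate.is_relative_to(root):
        raise PermissionError("path is outside the schema directory tree")
    return candidate
```

Resolving the path before comparing it to the root is what prevents `..` segments or symlinks from escaping the schema directory. Comparing the raw URI string as a prefix would not be sufficient.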
2156+
2157+ ### Extension-Specific Risks
2158+
2159+ Third-party JSON Schema extensions may introduce additional risks. Implementers
2160+ are advised to consult the specifications of any extensions they support and
2161+ take into account their security considerations as well.
20652162
20662163## IANA Considerations
20672164