fix(core): correctly resolve list indices in JSON Pointer refs and improve docs #32041

Open
wants to merge 4 commits into
base: master
75 changes: 58 additions & 17 deletions libs/core/langchain_core/utils/json_schema.py
@@ -8,25 +8,66 @@
if TYPE_CHECKING:
from collections.abc import Sequence


def _retrieve_ref(path: str, schema: dict) -> dict:
components = path.split("/")
if components[0] != "#":
msg = (
"ref paths are expected to be URI fragments, meaning they should start "
"with #."
"""Return the fragment referenced by an internal ``$ref``.

The subset of *JSON Pointer* used by JSON-Schema requires every reference
to be a URI-fragment (it **must** start with ``#``). Each “/”-separated
token then selects either:

* a **mapping key** when the current node is a ``dict``; or

Check failure on line 18 in libs/core/langchain_core/utils/json_schema.py

View workflow job for this annotation

GitHub Actions / cd libs/core / make lint #3.12

Ruff (W291)

langchain_core/utils/json_schema.py:18:64: W291 Trailing whitespace

* a **zero-based list index** when the current node is a ``list``.

Args:
path: The reference exactly as it appears in the schema
(for example ``"#/properties/name"``). Must start with ``#``.
schema: The document-root schema object inside which *path* is resolved.
This object is **never** mutated.

Returns:
dict: A **deep copy** of the schema fragment located at *path*.

Raises:
ValueError: If *path* does **not** start with ``#``.
KeyError: If any token cannot be resolved.
"""
tokens = path.split("/")

# All internal JSON-Schema references must be URI fragments.
if tokens[0] != "#":
raise ValueError(
"ref paths are expected to be URI fragments, meaning they should "
"start with '#'.",

Check failure on line 40 in libs/core/langchain_core/utils/json_schema.py

View workflow job for this annotation

GitHub Actions / cd libs/core / make lint #3.12

Ruff (EM101)

langchain_core/utils/json_schema.py:39:13: EM101 Exception must not use a string literal, assign to variable first

)

Check failure on line 41 in libs/core/langchain_core/utils/json_schema.py

View workflow job for this annotation

GitHub Actions / cd libs/core / make lint #3.12

Ruff (TRY003)

langchain_core/utils/json_schema.py:38:15: TRY003 Avoid specifying long messages outside the exception class

raise ValueError(msg)
out = schema
for component in components[1:]:
if component in out:
out = out[component]
elif component.isdigit() and int(component) in out:
out = out[int(component)]
else:
msg = f"Reference '{path}' not found."
raise KeyError(msg)
return deepcopy(out)
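
The removed branch `component.isdigit() and int(component) in out` looks like the source of the list-index problem named in the PR title: on a list, `in` tests whether a value is an element, not whether an index exists. A minimal sketch of the difference, using a hypothetical schema fragment that does not come from the PR:

```python
# Hypothetical schema fragment, used only for illustration.
schema = {"anyOf": [{"type": "string"}, {"type": "integer"}]}

node = schema["anyOf"]  # a list holding two sub-schemas
token = "1"             # last token of the pointer "#/anyOf/1"

# Old-style check: `in` on a list tests element membership, so this asks
# whether the integer 1 is one of the list's elements.
old_check = token.isdigit() and int(token) in node

# Index-style check: treat the token as a position and bounds-check it.
new_check = (
    token.isdigit() and isinstance(node, list) and int(token) < len(node)
)

print(old_check, new_check)
```

Under the old check the loop would fall through to the `KeyError` branch even though index 1 exists.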

node: Any = schema # start at the document root

for token in tokens[1:]:
# ----- Mapping lookup -------------------------------------------------- #
if isinstance(node, dict):
if token in node:
node = node[token]
continue
# Numeric token may reference an int key stored in the mapping.
if token.isdigit() and (int_token := int(token)) in node:
node = node[int_token]
continue

# ----- Sequence index -------------------------------------------------- #
if token.isdigit() and isinstance(node, list):
idx = int(token)
if idx >= len(node):
msg = "Index " + str(idx) + " out of range while resolving " + str(path)
raise KeyError(msg)
node = node[idx]
continue

# ---------------------------------------------------------------------- #
msg = "Unable to resolve token " + str(token) + " in " + str(path)
raise KeyError(msg)

# Hand back a deep copy so callers can mutate safely.
return deepcopy(node)
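
The lookup described in the docstring above can be sketched as a standalone function. This is an illustration under assumptions, not the PR's final code: `resolve_ref` and the sample schema are hypothetical names, and each message is assigned to `msg` before raising, which is the pattern the EM101 annotation asks for.

```python
from copy import deepcopy
from typing import Any


def resolve_ref(path: str, schema: dict) -> Any:
    """Resolve an internal ``$ref`` such as ``#/properties/tags/anyOf/1``."""
    tokens = path.split("/")
    if tokens[0] != "#":
        msg = (
            "ref paths are expected to be URI fragments, meaning they should "
            "start with '#'."
        )
        raise ValueError(msg)

    node: Any = schema  # start at the document root
    for token in tokens[1:]:
        if isinstance(node, dict):
            if token in node:
                node = node[token]
                continue
            # A numeric token may match an int key stored in the mapping.
            if token.isdigit() and int(token) in node:
                node = node[int(token)]
                continue
        elif isinstance(node, list) and token.isdigit():
            idx = int(token)
            if idx < len(node):  # zero-based index with a bounds check
                node = node[idx]
                continue
        msg = f"Unable to resolve token {token!r} in {path!r}"
        raise KeyError(msg)

    # Return a deep copy so callers cannot mutate the source schema.
    return deepcopy(node)


# Hypothetical schema used only to exercise the list-index branch.
sample = {
    "properties": {"tags": {"anyOf": [{"type": "string"}, {"type": "null"}]}}
}
fragment = resolve_ref("#/properties/tags/anyOf/1", sample)
fragment["type"] = "changed"  # mutating the copy leaves `sample` untouched
```

Because the result is a deep copy, editing `fragment` does not alter `sample`; an out-of-range index such as `#/properties/tags/anyOf/5` raises `KeyError` in this sketch.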


def _dereference_refs_helper(