Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix optionally_at to handle Nil mid-path #809

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

jhillyerd
Copy link
Contributor

This is a fix for the decoding error mentioned in gleam-lang/gleam#4202 (reply in thread)

It modifies optionally_at to treat a mid-path Nil as a missing value, instead of a decode error. This PR does not change the behavior of Nil being present at the target of the path.

Example problem, given a decoder containing:

use str_val <- decode.then(decode.optionally_at(
  ["a", "b", "c"],
  None,
  optional(string),
))

And the following 3 JSON inputs:

{"a": {}}
{"a": {"b": {}}}
{"a": {"b": null}}

The first two will decode to None as expected, but the third case will fail:

runtime error: let assert

Pattern match failed, no pattern matched the value.

unmatched value:
  Error(UnableToDecode([DecodeError("Dict", "Atom", ["a", "b"])]))

Copy link
Member

@lpil lpil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello! Doesn't this patch change the behaviour of all indexing, not just optional indexing?

@jhillyerd
Copy link
Contributor Author

jhillyerd commented Feb 19, 2025

Oh, index wasn't where I intended to make the change. Will take another look later.

@jhillyerd jhillyerd closed this Feb 19, 2025
Signed-off-by: James Hillyerd <[email protected]>
Signed-off-by: James Hillyerd <[email protected]>
@jhillyerd
Copy link
Contributor Author

After looking at it, it seemed like adding a second "optional" index implementation was the best option to preserve existing functionality.

In 08cabcc I also flipped the is_null check to happen before calling bare_index, as is_null looks cheaper to call. I haven't benchmarked the change, and am happy to revert it.

Copy link
Member

@lpil lpil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thank you!

It looks like you've removed the need for the handle_miss parameter from index and it's not used in optional_index, so it could be removed in both.

Could you update the changelog and update the optionally_at documentation to detail the behaviour also please 🙏

Signed-off-by: James Hillyerd <[email protected]>
Signed-off-by: James Hillyerd <[email protected]>
@jhillyerd
Copy link
Contributor Author

  • 5246c92 removes handle_miss from both index and optional_index
  • 727f15b updates the optionally_at docs and changelog

Let me know if you think including the use of optional in the example I added is too much. It seems like the question comes up often enough in discord that an example of their use together would help people.

Copy link
Member

@lpil lpil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm struggling to come to a firm conclusion as to whether this API is right.

The change here is to go from being able to decode optional fields, to being able to decode optional and also nullable fields, however it doesn't handle the final value for you. This seems to mix two related things (which could be convenient, but removes some control from the programmer), and there's an inconsistency with how null values are handled depending on where they are in the path.

How would someone write a decoder where it is optional only?

Do we need a different name?

How do we resolve the difference between the nulls at different places on the path?

Could you share you use concrete case for this decoder please, that'll help me understand the requirements.

Sorry I'm being difficult here, but it's very important to get this right!

@jhillyerd
Copy link
Contributor Author

jhillyerd commented Feb 25, 2025

You are right that we would be removing some control from the programmer in terms of handling a nil-dict-value differently from a missing-dict-value. My initial assumption was this was not an important distinction for decoding JSON, but given that decoders aren't specific to JSON, that's probably a bad assumption.

Giving it a new name would be nice, and that is what I originally proposed in #4202. Personally I'd prefer it use the use style decoder function signature like subfield, but I think that was what sidetracked our original discussion over to using then+optionally_at instead.

@jhillyerd
Copy link
Contributor Author

jhillyerd commented Feb 25, 2025

My use case is parsing nested JSON messages from Home Assistant, which does not have a spec for me to follow.

What tripped me up in the message below was trying to extract event.variables.trigger.from_state.entity_id, for which the parent from_state is null in this example, but other times all of trigger is missing entirely (not null).

{
  "id": 1,
  "type": "event",
  "event": {
    "variables": {
      "trigger": {
        "id": "0",
        "idx": "0",
        "alias ": null,
        "platform": "state",
        "entity_id": "binary_sensor.presence_sensor_occupancy",
        "from_state": null,
        "to_state": {
          "entity_id": "binary_sensor.presence_sensor_occupancy",
          "state ": "unavailable",
          "attributes": {
            "device_class": "occupancy",
            "friendly_name": "Office Presence Sensor Occupancy"
          },
          "last_changed": "2025-01-13T03:01:42.730040+00:00",
          "last_reported": "2025-01-13T03:01:42. 730040+00:00",
          "last_updated": "2025-01-13T03:01:42.730040+00:00",
          "context": {
            "id": "01JHESRA6A8V3TMDZ HN29D3X5D",
            "parent_id": null,
            "user_id": null
          }
        },
        "for": null,
        "attribute": null,
        "description": "state of binary_sensor.presence_sensor_occupancy"
      }
    },
    "context": {
      "id": "01JHESRA6A8V3TMDZHN29D3X5D ",
      "parent_id": null,
      "user_id": null
    }
  }
}

Basically, I'd like to follow the robustness principle: "be conservative in what you do, be liberal in what you accept from others". If Home Assistant decides to make trigger null instead of missing, or from_state missing instead of null, it should not break my program, and I do not want to have to special-case every single level of the decode.

@jhillyerd
Copy link
Contributor Author

I spent some time this weekend seeing if I could implement this as a stand-alone library, but with Decoder being an opaque type, and new_primitive_decoder obviously not being designed to support nested decoders or paths in errors, it just didn't seem feasible.

I'm thinking I will cherrypick the optional_index fn we arrived at into a new PR, and implement one or both of nullable_at and nullable_subfield and see how that looks, that way we don't alter the behavior of optional_at.

Alternately, perhaps there could be a new_indexable_decoder that allows more advanced functionality.

@jhillyerd jhillyerd marked this pull request as draft March 3, 2025 19:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants