Fix key deserialization propagation in windows#848
Conversation
dd3bf22 to
62d6b4e
Compare
| When retrieving a prefix during partition-level windows expiration, we | ||
| don't know its original type due to this conditional serialization. | ||
| Therefore, we must first *try* to deserialize it using the configured | ||
| `loads` function. |
There was a problem hiding this comment.
Could we store an indicator about the prefix type, if it's a byte or something else ?
For migration, existing windows missing that information can try to deserialize
There was a problem hiding this comment.
Yes, but it's considerably more code, not sure if it's worth it.
Let's see what @daniil-quix thinks. But even if we decide to do this, it may be done as a separate optimisation; in the meantime, this will fix the bug.
There was a problem hiding this comment.
I'm mostly worried about the performance hit of trying to deserialise every keys
There was a problem hiding this comment.
I did some quick tests locally, and the performance hit actually depends on the value itself.
Two almost identical values b'123123' and b'{123123 take very different amounts of time:
### Successful deserialization
%%timeit
try:
orjson.loads(b'123123')
except Exception:
...
34.4 ns ± 0.182 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
### Deserialization failed
%%timeit
try:
orjson.loads(b'{123123')
except Exception:
...
765 ns ± 4.51 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
67f0bf5 to
fb1045b
Compare
| assert expired[0] == ((0, 10), 1, [], prefix) | ||
| assert expired[1] == ((10, 20), 1, [], prefix) | ||
| assert expired[2] == ((20, 30), 1, [], prefix) |
There was a problem hiding this comment.
This was kinda testing that the same prefix is returned :)
Let's return the key from expire_windows() to make the tests legit, even though we don't use it in the actual code.
Despite
key_deserializerbeing set tojson, after windowing, the deserialization is gone, and the key is served back as bytes. Affected windows:Tumbling and hopping windows with the closing strategy "key" are unaffected.
process_windowinstead of the serialized one coming from the store.state.expire_windows, so it doesn't need to return it._deserialize_prefixfor a detailed explanation.