Conversation


@yairgott yairgott commented Jul 2, 2025

Overview

Sharing memory between the module and engine reduces memory overhead by eliminating redundant copies of stored records in the module. This is particularly beneficial for search workloads that require indexing large volumes of documents.

Vectors

Vector similarity search requires storing large volumes of high-cardinality vectors. For example, a single vector with 512 dimensions (at 4 bytes per dimension) consumes 2048 bytes, and typical workloads often involve millions of vectors. Because there is no memory-sharing mechanism between the module and the engine, valkey-search currently doubles memory consumption when indexing vectors, significantly increasing operational costs. This limitation creates adoption friction and reduces valkey-search's competitiveness.

Memory Allocation Strategy

At a fundamental level, there are two primary allocation strategies:

  • [Chosen] Module-allocated memory shared with the engine.
  • Engine-allocated memory shared with the module.

For valkey-search, it is crucial that vectors reside in cache-aligned memory to maximize SIMD optimizations. Allowing the module to allocate memory provides greater flexibility for different use cases, though it introduces slightly higher implementation complexity.
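
For illustration only, a minimal sketch of how the module might obtain a cache-aligned copy of a vector before sharing it with the engine; the 64-byte alignment value and the helper name are assumptions, not part of this PR:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: deep-copy a vector into cache-line-aligned memory so
 * SIMD loads operate on aligned data. 64 is an assumed cache-line size. */
static void *alloc_aligned_vector_copy(const void *src, size_t len) {
    void *dst = NULL;
    /* Alignment must be a power of two and a multiple of sizeof(void *). */
    if (posix_memalign(&dst, 64, len) != 0) return NULL;
    memcpy(dst, src, len);
    return dst;
}
```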

Old Implementation

The old implementation was based on ref-counting and introduced a new SDS type. After further discussion, we agreed to simplify the design by removing ref-counting and avoiding the introduction of a new SDS type.

New Implementation - Key Points

  1. The engine exposes a new interface, VM_HashExternalize, which externalizes a buffer as a hash field. The externalized buffer is owned by the module. The function accepts the hash key, hash field, and a buffer along with its length.
  2. ExternalizedValue is a new data type that captures the externalized buffer and its length.
  3. FIELD_SDS_AUX_BIT_EXTERNALIZED is a new SDS AUX flag used to distinguish between regular/embedded SDS values and ExternalizedValue.
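
A rough sketch of the shape of the pieces described in points 1–3 above; the exact signature, types, field names, and flag value are illustrative assumptions, not the PR's actual code:

```c
#include <stddef.h>

/* Opaque module types provided by the engine (declared here only so the
 * sketch is self-contained). */
typedef struct ValkeyModuleCtx ValkeyModuleCtx;
typedef struct ValkeyModuleString ValkeyModuleString;

/* ExternalizedValue: captures a module-owned buffer and its length. */
typedef struct ExternalizedValue {
    void *buf;  /* module-owned, cache-aligned buffer */
    size_t len; /* buffer length in bytes */
} ExternalizedValue;

/* Assumed aux-flag value distinguishing externalized values from
 * regular/embedded SDS values. */
#define FIELD_SDS_AUX_BIT_EXTERNALIZED (1 << 0)

/* VM_HashExternalize: externalize `buf` (owned by the module) as the value
 * of `field` in hash `key`. Argument order and return convention are
 * assumptions. Returns 0 on success. */
int VM_HashExternalize(ValkeyModuleCtx *ctx, ValkeyModuleString *key,
                       ValkeyModuleString *field, void *buf, size_t len);
```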

valkey-search Usage

Insertion

  1. Upon receiving a keyspace notification for a new hash or JSON key with an indexed vector attribute, valkey-search allocates cache-aligned memory and deep-copies the vector value into it.
  2. valkey-search then calls VM_HashExternalize on this buffer to avoid keeping two copies of the vector; see the sketch below.
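
Putting the two steps together, a sketch of the module-side insertion path, reusing the hypothetical declarations from the sketches above; everything except the names quoted from this PR is assumed:

```c
/* Sketch: handle a keyspace notification for a hash key whose indexed
 * vector field was just set (all names are illustrative). */
static void on_vector_field_set(ValkeyModuleCtx *ctx, ValkeyModuleString *key,
                                ValkeyModuleString *field,
                                const void *vec, size_t vec_len) {
    /* Step 1: deep-copy the vector into cache-aligned, module-owned memory. */
    void *aligned = alloc_aligned_vector_copy(vec, vec_len);
    if (aligned == NULL) return; /* allocation failed; skip indexing */

    /* Step 2: externalize the buffer as the hash field's value so the engine
     * and the vector index share one copy instead of two. */
    if (VM_HashExternalize(ctx, key, field, aligned, vec_len) != 0) {
        free(aligned); /* engine did not take the buffer; clean up */
        return;
    }

    /* Step 3: insert `aligned` into the vector index (omitted). */
}
```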

Deletion

When receiving a keyspace notification for a deleted hash key or hash field that was indexed as a vector, valkey-search deletes the corresponding entry from the index.

Update

Handled similarly to insertion.

yairgott and others added 24 commits July 2, 2025 23:56
Signed-off-by: yairgott <[email protected]>
Signed-off-by: yairgott <[email protected]>
Signed-off-by: yairgott <[email protected]>
Signed-off-by: yairgott <[email protected]>
Signed-off-by: yairgott <[email protected]>
Signed-off-by: yairgott <[email protected]>
Signed-off-by: yairgott <[email protected]>
Resolves valkey-io#2267

Timed-out tests get logged at the end of the test run.

```
!!! WARNING The following tests failed:

*** [TIMEOUT]: WAIT should not acknowledge 2 additional copies of the data in tests/unit/wait.tcl
Cleanup: may take some time... OK
```

Signed-off-by: Sarthak Aggarwal <[email protected]>
…alkey-io#2283)

When a replica does a CLUSTER RESET SOFT, we should generate a
new shard_id for it. If the node was a replica, it inherited the
shard_id of its primary; after the CLUSTER RESET SOFT the replica
becomes a new empty primary, so it should have its own shard_id.

Otherwise, when it joins back into the cluster or responds to PONG,
we will have two primaries in the same shard.

---------

Signed-off-by: Binbin <[email protected]>
In the unit test framework, there is a check that `zmalloc_used_memory()
== 0` after each test suite. If an assertion fails in any test case, the
test case is aborted and often it means some memory is not freed. The
check for used memory == 0 then fails for all test suites that run after
the failed one.

This PR makes sure a memory leak is only reported once, for the test
suite that didn't free its memory.

Two test cases that explicitly checked that all memory had been freed
are deleted. This is handled by the test framework.

Signed-off-by: Viktor Söderqvist <[email protected]>
… and fix zadd --sequential to create expected data (valkey-io#2102)

This extends the `__rand_int__` pattern to add `__rand_1st__` through
`__rand_9th__`.

For the new placeholders, multiple occurrences within a command will
have the same value.

With the `--sequential` option, each placeholder will use separate
counters.

For `__rand_int__`, the behavior is unchanged - multiple occurrences of
the pattern within a command will have different values.

Examples of using the `__rand_int__` with `--sequential`:

```
$ ./valkey-benchmark -r 300 -n 300 --sequential -q -- set key__rand_int__ val__rand_int__
set key__rand_int__ val__rand_int__: 60000.00 requests per second, p50=0.391 msec
$ ./valkey-cli info keyspace
# Keyspace
db0:keys=150,expires=0,avg_ttl=0
$ ./valkey-cli get key000000000050
"val000000000051"
```

For the new patterns, multiple occurrences within the command will have
the same value.

```
$ ./valkey-benchmark -r 300 -n 300 --sequential -q -- set key__rand_1st__ val__rand_1st__
set key__rand_int__ val__rand_int1_: 60000.00 requests per second, p50=0.383 msec
$ ./valkey-cli info keyspace
# Keyspace
db0:keys=300,expires=0,avg_ttl=0
$ ./valkey-cli get key000000000050
"val000000000050"
```

I also fixed the zadd benchmark so it produces the expected number of
keys when used with `--sequential`. (By using the same counter twice per
command, it was effectively counting by twos.)

I made this for myself but raised a PR in case y'all like it. 🙂

---------

Signed-off-by: Rain Valentine <[email protected]>
During memory eviction, we may end up with a primary kv hash table
that's below the 13% `MIN_FILL_PERCENT_SOFT`, but inside `resize()` for
the kv table we check with `ht->type->resizeAllowed` whether allocating
a new table crosses the configured `maxmemory` limit without taking into
account that shrinking, while temporarily increasing the memory usage,
will ultimately reduce the memory usage in all situations.

Blocking hash tables from shrinking in situations where we would only
temporarily cross the maxmemory limit has at least three negative side
effects:
* The performance of valkey is negatively impacted if, for example,
we're moving items from table A to table B since table A would
eventually end up being a very sparse hash table that never shrinks.
* In extreme situations, the performance of valkey slows down
significantly when trying to find more keys to evict since we're trying
to find keys in a sparse table that only becomes more sparse over time.
* In more extreme situations, such as when reducing `maxmemory` by
around 90% or writing large data to a single key, valkey can end up in a
situation where, no matter how many keys are evicted from the kv
table, the table's bucket struct overhead uses more memory than the data
stored in the table, causing all keys to be evicted from the table and
leaving valkey unusable until maxmemory is raised above the current
memory usage:

```
127.0.0.1:6379> config set maxmemory 2gb
OK
127.0.0.1:6379> debug populate 20000000
OK
(8.94s)
127.0.0.1:6379> debug htstats 0
[Dictionary HT]
Hash table 0 stats (main hash table):
 table size: 29360128
 number of entries: 20000000
[Expires HT]
127.0.0.1:6379> config set maxmemory 200mb
OK
127.0.0.1:6379> dbsize
(integer) 18725794
[...]
127.0.0.1:6379> dbsize
(integer) 0
127.0.0.1:6379> keys *
(empty array)
127.0.0.1:6379> set foo bar
(error) OOM command not allowed when used memory > 'maxmemory'.
127.0.0.1:6379> config get maxmemory
1) "maxmemory"
2) "209715200"
127.0.0.1:6379> info memory
# Memory
used_memory:269890904
used_memory_human:257.39M
[...]
127.0.0.1:6379> config set maxmemory 300mb
OK
127.0.0.1:6379> info memory
# Memory
used_memory:1455512
used_memory_human:1.39M
[...]
127.0.0.1:6379> set foo bar
OK
```

This PR addresses these issues by allowing hash tables to allocate a new
table for shrinking, even if allocating a new table causes the maxmemory
limit to be crossed during resize.

Additionally, the check for whether we should fast-forward the resize
operation has been moved to before checking whether we are allowed to
allocate a new resize table. Because we don't start a resize in this
section of the function yet, a fast-forward usually drops the memory
usage enough to allow another resize operation to start immediately
after the fast-forward resize completes.

As a side effect of this PR, we now also refuse to start a hashtable
resize if the policy is set to `HASHTABLE_RESIZE_FORBID`.

---------

Signed-off-by: Fusl <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Co-authored-by: Viktor Söderqvist <[email protected]>
…el (valkey-io#2308)

Sentinel sends the switch-master message more quickly, ref
[valkey-io#2282](valkey-io#2282).

Signed-off-by: youngmore1024 <[email protected]>
- Subset of valkey-io#2183

---------

Signed-off-by: Josh Soref <[email protected]>
- Subset of valkey-io#2183

---------

Signed-off-by: Josh Soref <[email protected]>
Signed-off-by: Madelyn Olson <[email protected]>
Co-authored-by: Madelyn Olson <[email protected]>
…rectory as path (valkey-io#630)

Helps testing specific functionality like `cluster` rather than running the entire test suite.

### Usage example
#### Run all the tests under tests/unit/cluster directory
```
./runtest --single tests/unit/cluster
```

#### Run all the tests under test/unit but skip the file `memefficiency`
```
./runtest --single tests/unit --skipunit tests/unit/memefficiency.tcl
```

#### Run all the tests under tests/integration
```
./runtest --single tests/integration
```

Note: this doesn't work with tests under `tests/cluster`. We should
migrate all of the tests under `tests/cluster` to `tests/unit/cluster`.

---------

Signed-off-by: Harkrishn Patro <[email protected]>
)

Needs backporting to 8.1 to fix
valkey-io#2320

---------

Signed-off-by: Harkrishn Patro <[email protected]>
Resolves test failures like
https://github.com/valkey-io/valkey/actions/runs/16093360514/job/45412856532.
Valgrind tests can run slowly, so the 100 ms expire time may have
already begun to expire items during the ingestion phase.

Signed-off-by: Madelyn Olson <[email protected]>
Signed-off-by: yairgott <[email protected]>
Signed-off-by: yairgott <[email protected]>
Signed-off-by: yairgott <[email protected]>
@yairgott yairgott marked this pull request as ready for review July 8, 2025 06:31
@yairgott yairgott changed the title from "Hash externalizer" to "Adding support for sharing memory between the module and the engine" Jul 8, 2025

codecov bot commented Jul 8, 2025

Codecov Report

Attention: Patch coverage is 63.25301% with 61 lines in your changes missing coverage. Please review.

Project coverage is 71.41%. Comparing base (4827720) to head (61052cd).
Report is 40 commits behind head on unstable.

Files with missing lines Patch % Lines
src/t_hash.c 57.93% 53 Missing ⚠️
src/module.c 12.50% 7 Missing ⚠️
src/ziplist.c 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #2299      +/-   ##
============================================
- Coverage     71.43%   71.41%   -0.02%     
============================================
  Files           123      123              
  Lines         66908    67163     +255     
============================================
+ Hits          47795    47964     +169     
- Misses        19113    19199      +86     
Files with missing lines Coverage Δ
src/aof.c 80.54% <100.00%> (+0.13%) ⬆️
src/db.c 90.06% <100.00%> (-0.04%) ⬇️
src/rdb.c 76.90% <100.00%> (+0.28%) ⬆️
src/server.h 100.00% <ø> (ø)
src/ziplist.c 15.18% <0.00%> (ø)
src/module.c 9.56% <12.50%> (+<0.01%) ⬆️
src/t_hash.c 89.80% <57.93%> (-6.40%) ⬇️

... and 24 files with indirect coverage changes


Signed-off-by: yairgott <[email protected]>

yairgott commented Jul 8, 2025

cc @zuiderkwast

Signed-off-by: yairgott <[email protected]>
ranshid and others added 5 commits August 6, 2025 12:33
test_vest uses a mock_defrag function to simulate the defrag flow.
In the scenario where there is no system malloc_size, we need to use
zmalloc_usable_size in order to prevent stepping over the
"to-be-replaced" buffer.

will partially fix: valkey-io#2435

Signed-off-by: Ran Shidlansik <[email protected]>
Fix missing check for executing client in `lookupKey` function

### Issue
The `lookupKey` function in db.c accesses
`server.executing_client->cmd->proc` without first verifying that
`server.executing_client` is not NULL. This was introduced in valkey-io#1499
where the check for executing client was added without verifying it
could be null.

The server crashes with a null pointer dereference when the
current_client's flag.no_touch is set.
```
27719 valkey-server *
/lib64/libpthread.so.0(+0x118e0)[0x7f34cb96a8e0]
src/valkey-server 127.0.0.1:21113(lookupKey+0xf5)[0x4a14b7]
src/valkey-server 127.0.0.1:21113(lookupKeyReadWithFlags+0x50)[0x4a15fc]
src/valkey-server 127.0.0.1:21113[0x52b8f1]
src/valkey-server 127.0.0.1:21113(handleClientsBlockedOnKeys+0xa5)[0x52b16f]
src/valkey-server 127.0.0.1:21113(processCommand+0xf1e)[0x4712c9]
src/valkey-server 127.0.0.1:21113(processCommandAndResetClient+0x35)[0x490fd5]
src/valkey-server 127.0.0.1:21113(processInputBuffer+0xe1)[0x4912e5]
src/valkey-server 127.0.0.1:21113(readQueryFromClient+0x8c)[0x49177b]
src/valkey-server 127.0.0.1:21113[0x57daa6]
src/valkey-server 127.0.0.1:21113[0x57e280]
src/valkey-server 127.0.0.1:21113(aeProcessEvents+0x261)[0x45b259]
src/valkey-server 127.0.0.1:21113(aeMain+0x2a)[0x45b450]
src/valkey-server 127.0.0.1:21113(main+0xd43)[0x479bf6]
/lib64/libc.so.6(__libc_start_main+0xea)[0x7f34cb5cd13a]
src/valkey-server 127.0.0.1:21113(_start+0x2a)[0x454e3a]
```

### Fix
Added a null check for `server.executing_client` before attempting to
dereference it:
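
A minimal sketch of the shape of such a guard (the actual condition in db.c is more involved and is not reproduced here):

```c
/* Only inspect the executing command if there is an executing client;
 * lookupKey can be reached (e.g. via handleClientsBlockedOnKeys) with
 * server.executing_client == NULL. */
if (server.executing_client != NULL) {
    struct serverCommand *cmd = server.executing_client->cmd;
    /* ... existing logic that dereferences cmd->proc ... */
    (void)cmd;
}
```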

### Tests
Added a regression test in tests/unit/type/list.tcl.

---------

Signed-off-by: Uri Yagelnik <[email protected]>
Signed-off-by: Ran Shidlansik <[email protected]>
Co-authored-by: Ran Shidlansik <[email protected]>
We used to print the context, but after valkey-io#2276 we lost it.

unstable:
```
*** Extract version and sha1 details from info command and print in tests/unit/info-command.tcl
```

now:
```
*** [err]: Extract version and sha1 details from info command and print in tests/unit/info-command.tcl
Expected '0' to be equal to '1' (context: type source line 7 file /xxx/info-command.tcl cmd {assert_equal 0 1} proc ::test)
```

We can see the difference: we now provide enough context when an
assertion fails. Otherwise we would need to scroll back (usually through
a lot of server logs) to find the context.

Signed-off-by: Binbin <[email protected]>
Remove unused includes

Signed-off-by: Xiaolong Chen <[email protected]>
ranshid and others added 7 commits August 7, 2025 11:54
1. Fix the way we toggle active expire.
2. Reduce the number of spawned servers during the test.

Just merged some start_server blocks to reduce the time spent on
starting and shutting down the server.

---------

Signed-off-by: Ran Shidlansik <[email protected]>
Across multiple runs of the big list test in defrag, the latency check
is tripping because the maximum observed defrag cycle latency
occasionally spikes above our 5 ms limit. While most cycles complete in
just a few milliseconds, rare slowdowns push some cycles into the double
digit millisecond range, so a 5 ms hard cap is too aggressive for stable
testing.

```
/runtest --verbose --tls --single unit/memefficiency --only '/big list' --accurate --loop --fastfail
```

```
[err]: Active Defrag big list: standalone in tests/unit/memefficiency.tcl
Expected 12 <= 5 (context: type proc line 18 cmd {assert {$max_latency <= $limit_ms}} proc ::validate_latency level 1)
(Fast fail: test will exit now)

[err]: Active Defrag big list: standalone in tests/unit/memefficiency.tcl
Expected 21 <= 5 (context: type proc line 18 cmd {assert {$max_latency <= $limit_ms}} proc ::validate_latency level 1)
(Fast fail: test will exit now)

[err]: Active Defrag big list: standalone in tests/unit/memefficiency.tcl
Expected 25 <= 5 (context: type proc line 18 cmd {assert {$max_latency <= $limit_ms}} proc ::validate_latency level 1)
(Fast fail: test will exit now)

[err]: Active Defrag big list: standalone in tests/unit/memefficiency.tcl
Expected 17 <= 5 (context: type proc line 18 cmd {assert {$max_latency <= $limit_ms}} proc ::validate_latency level 1)
(Fast fail: test will exit now)

```

---------

Signed-off-by: Seungmin Lee <[email protected]>
Co-authored-by: Seungmin Lee <[email protected]>
Fixes valkey-io#626 - Skip CI tests when only documentation files are modified

---------

Signed-off-by: Hanxi Zhang <[email protected]>
…o#2455)

Fixes the GitHub Actions error where both `paths` and `paths-ignore`
were defined for the same event, which is not allowed.

Resolves the error: "you may only define one of `paths` and
`paths-ignore` for a single event"

Removed the conflicting `paths` section from the `pull_request` trigger,
keeping only `paths-ignore` to skip documentation changes while allowing
the workflow to run on all other changes.

This is a follow-up fix to address the issue identified in the previous
PR.

Signed-off-by: Hanxi Zhang <[email protected]>
Test `Instance #5 is still a slave after some time (no failover)` is
supposed to verify that command `CLUSTER FAILOVER` will not promote a
replica without quorum from the primary; later in the file (`Instance 5
is a master after some time`), we verify that `CLUSTER FAILOVER FORCE`
does promote a replica under the same conditions.

There are a couple of issues with the tests:

1. `Instance #5 is still a slave after some time (no failover)` should
verify that instance 5 is a replica (i.e. that there's no failover), but
we call `assert {[s -5 role] eq {master}}`.
2. The reason the above assert works is that we previously send
`DEBUG SLEEP 10` to the primary, which pauses the primary for longer
than the configured 3 seconds for `cluster-node-timeout`.
The primary is marked as failed from the perspective of the rest of the
cluster, so quorum can be established and instance 5 is promoted as
primary.

This commit fixes both issues by shortening the sleep to less than 3
seconds and then asserting that the role is still replica. Test `Instance #5
is a master after some time` is updated to sleep for a shorter duration
to ensure that `FAILOVER FORCE` succeeds under the exact same
conditions.

### Testing
`./runtest --single unit/cluster/manual-failover --loop --fastfail`

Signed-off-by: Tyler Amano-Smerling <[email protected]>
…-io#2452)

Adds a simple check for the backup directory and removes the TODO.

Signed-off-by: Sarthak Aggarwal <[email protected]>
@madolson madolson moved this to In Progress in Valkey 9.0 Aug 8, 2025
murphyjacob4 and others added 12 commits August 10, 2025 10:05
The bug caused many invalid HMSET entries to be added to the AOF during
rewriting.

This now properly skips emitting HMSET until we find an entry with no
expiry.

Signed-off-by: Jacob Murphy <[email protected]>
…lkey-io#2462)

In the current implementation, `HGETEX`, when applied to a non-existing
object, simply returns a null value instead of an array (as in the
`HMGET` case).

---------

Signed-off-by: Ran Shidlansik <[email protected]>
closes valkey-io#2352

---------

Signed-off-by: Xiaolong Chen <[email protected]>
Signed-off-by: Ran Shidlansik <[email protected]>
Co-authored-by: Ran Shidlansik <[email protected]>
…y-io#2464)

Following the new API presented in
valkey-io#2089, we might access out-of-bounds memory in the case of some
illegal command input.

Signed-off-by: Ran Shidlansik <[email protected]>
Based on the review suggestion
valkey-io#2432 (comment)

---------

Signed-off-by: Xiaolong Chen <[email protected]>
### Description
User data is logged when a crash is caused by moduleRDBLoadError.

### Change
Redact user data when hide-user-data-from-log is enabled.

Signed-off-by: VanessaTang <[email protected]>
…ey-io#2469)

Right now, if a TLS connect fails, you get an unhelpful error message in
the log since it prints out NULL. This change makes sure that the reported
error is always a string (never NULL) and also tries to print out the
underlying errors.

Signed-off-by: Madelyn Olson <[email protected]>
Introduces a new family of commands for migrating slots via replication.
The procedure is driven by the source node which pushes an AOF formatted
snapshot of the slots to the target, followed by a replication stream of
changes on that slot (a la manual failover).

This solution is an adaptation of the solution provided by
@enjoy-binbin, combined with the solution I previously posted at valkey-io#1591,
modified to meet the designs we had outlined in valkey-io#23.

## New commands

* `CLUSTER MIGRATESLOTS SLOTSRANGE start end [start end]... NODE
node-id`: Begin sending the slot via replication to the target. Multiple
targets can be specified by repeating `SLOTSRANGE ... NODE ...`
*  `CLUSTER CANCELMIGRATION ALL`: Cancel all slot migrations
* `CLUSTER GETSLOTMIGRATIONS`: See a recent log of migrations

This PR only implements "one shot" semantics with an asynchronous model.
Later, "two phase" (e.g. slot level replicate/failover commands) can be
added with the same core.

## Slot migration jobs

Introduces the concept of a slot migration job. While active, a job
tracks a connection created by the source to the target over which the
contents of the slots are sent. This connection is used for control
messages as well as replicated slot data. Each job is given a 40
character random name to help uniquely identify it.

All jobs, including those that finished recently, can be observed using
the `CLUSTER GETSLOTMIGRATIONS` command.

## Replication

* Since the snapshot uses AOF, the snapshot can be replayed verbatim to
any replicas of the target node.
* We use the same proxying mechanism used for chaining replication to
copy the content sent by the source node directly to the replica nodes.

## `CLUSTER SYNCSLOTS`

To coordinate the state machine transitions across the two nodes, a new
command is added, `CLUSTER SYNCSLOTS`, that performs this control flow.

Each end of the slot migration connection is expected to install a read
handler in order to handle `CLUSTER SYNCSLOTS` commands:

* `ESTABLISH`: Begins a slot migration. Provides slot migration
information to the target and authorizes the connection to write to
unowned slots.
* `SNAPSHOT-EOF`: appended to the end of the snapshot to signal that the
snapshot is done being written to the target.
* `PAUSE`: informs the source node to pause whenever it gets the
opportunity
* `PAUSED`: added to the end of the client output buffer when the pause
is performed. The pause is only performed after the buffer shrinks below
a configurable size
* `REQUEST-FAILOVER`: request the source to either grant or deny a
failover for the slot migration. The failover is only granted if the target
is still paused. Once a failover is granted, the pause is refreshed for
a short duration
* `FAILOVER-GRANTED`: sent to the target to inform that REQUEST-FAILOVER
is granted
* `ACK`: heartbeat command used to ensure liveness

## Interactions with other commands

* FLUSHDB on the source node (which flushes the migrating slot) will
result in the source dropping the connection, which will flush the slot
on the target and reset the state machine back to the beginning. The
subsequent retry should very quickly succeed (it is now empty)
* FLUSHDB on the target will fail the slot migration. We can iterate
with better handling, but for now it is expected that the operator would
retry.
* Generally, FLUSHDB is expected to be executed cluster-wide, so
preserving partially migrated slots doesn't make much sense
* SCAN and KEYS are filtered to avoid exposing importing slot data

## Error handling

* For any transient connection drops, the migration will be failed and
require the user to retry.
* If there is an OOM while reading from the import connection, we will
fail the import, which will drop the importing slot data
* If there is a client output buffer limit reached on the source node,
it will drop the connection, which will cause the migration to fail
* If at any point the export loses ownership or either node is failed
over, a callback will be triggered on both ends of the migration to fail
the import. The import will not reattempt with a new owner
* The two ends of the migration are routinely pinging each other with
SYNCSLOTS ACK messages. If at any point there is no interaction on the
connection for longer than `repl-timeout`, the connection will be
dropped, resulting in migration failure
* If a failover happens, we will drop keys in all unowned slots. The
migration does not persist through failovers and would need to be
retried on the new source/target.

## State machine

```
                                                                            
                Target/Importing Node State Machine                         
   ─────────────────────────────────────────────────────────────            
                                                                            
             ┌────────────────────┐
             │SLOT_IMPORT_WAIT_ACK┼──────┐
             └──────────┬─────────┘      │
                     ACK│                │
         ┌──────────────▼─────────────┐  │
         │SLOT_IMPORT_RECEIVE_SNAPSHOT┼──┤
         └──────────────┬─────────────┘  │
            SNAPSHOT-EOF│                │                                  
        ┌───────────────▼──────────────┐ │                                  
        │SLOT_IMPORT_WAITING_FOR_PAUSED┼─┤                                  
        └───────────────┬──────────────┘ │                                  
                  PAUSED│                │                                  
        ┌───────────────▼──────────────┐ │ Error Conditions:                
        │SLOT_IMPORT_FAILOVER_REQUESTED┼─┤  1. OOM                          
        └───────────────┬──────────────┘ │  2. Slot Ownership Change        
        FAILOVER-GRANTED│                │  3. Demotion to replica          
         ┌──────────────▼─────────────┐  │  4. FLUSHDB                      
         │SLOT_IMPORT_FAILOVER_GRANTED┼──┤  5. Connection Lost              
         └──────────────┬─────────────┘  │  6. No ACK from source (timeout) 
      Takeover Performed│                │                                  
         ┌──────────────▼───────────┐    │                                  
         │SLOT_MIGRATION_JOB_SUCCESS┼────┤                                  
         └──────────────────────────┘    │                                  
                                         │                                  
   ┌─────────────────────────────────────▼─┐                                
   │SLOT_IMPORT_FINISHED_WAITING_TO_CLEANUP│                                
   └────────────────────┬──────────────────┘                                
Unowned Slots Cleaned Up│                                                   
          ┌─────────────▼───────────┐                                      
          │SLOT_MIGRATION_JOB_FAILED│                                      
          └─────────────────────────┘                                      

                                                                                           
                                                                                           
                      Source/Exporting Node State Machine                                  
         ─────────────────────────────────────────────────────────────                     
                                                                                           
               ┌──────────────────────┐                                                    
               │SLOT_EXPORT_CONNECTING├─────────┐                                          
               └───────────┬──────────┘         │                                          
                  Connected│                    │                                          
             ┌─────────────▼────────────┐       │                                          
             │SLOT_EXPORT_AUTHENTICATING┼───────┤                                          
             └─────────────┬────────────┘       │                                          
              Authenticated│                    │                                          
             ┌─────────────▼────────────┐       │                                          
             │SLOT_EXPORT_SEND_ESTABLISH┼───────┤                                          
             └─────────────┬────────────┘       │                                          
  ESTABLISH command written│                    │                                          
     ┌─────────────────────▼─────────────┐      │                                          
     │SLOT_EXPORT_READ_ESTABLISH_RESPONSE┼──────┤                                          
     └─────────────────────┬─────────────┘      │                                          
   Full response read (+OK)│                    │                                          
          ┌────────────────▼──────────────┐     │ Error Conditions:                        
          │SLOT_EXPORT_WAITING_TO_SNAPSHOT┼─────┤  1. User sends CANCELMIGRATION           
          └────────────────┬──────────────┘     │  2. Slot ownership change                
     No other child process│                    │  3. Demotion to replica                  
              ┌────────────▼───────────┐        │  4. FLUSHDB                              
              │SLOT_EXPORT_SNAPSHOTTING┼────────┤  5. Connection Lost                      
              └────────────┬───────────┘        │  6. AUTH failed                          
              Snapshot done│                    │  7. ERR from ESTABLISH command           
               ┌───────────▼─────────┐          │  8. Unpaused before failover completed   
               │SLOT_EXPORT_STREAMING┼──────────┤  9. Snapshot failed (e.g. Child OOM)     
               └───────────┬─────────┘          │  10. No ack from target (timeout)        
                      PAUSE│                    │  11. Client output buffer overrun        
            ┌──────────────▼─────────────┐      │                                          
            │SLOT_EXPORT_WAITING_TO_PAUSE┼──────┤                                          
            └──────────────┬─────────────┘      │                                          
             Buffer drained│                    │                                          
            ┌──────────────▼────────────┐       │                                          
            │SLOT_EXPORT_FAILOVER_PAUSED┼───────┤                                          
            └──────────────┬────────────┘       │                                          
   Failover request granted│                    │                                          
           ┌───────────────▼────────────┐       │                                          
           │SLOT_EXPORT_FAILOVER_GRANTED┼───────┤                                          
           └───────────────┬────────────┘       │                                          
      New topology received│                    │                                          
            ┌──────────────▼───────────┐        │                                          
            │SLOT_MIGRATION_JOB_SUCCESS│        │                                          
            └──────────────────────────┘        │                                          
                                                │                                          
            ┌─────────────────────────┐         │                                          
            │SLOT_MIGRATION_JOB_FAILED│◄────────┤                                          
            └─────────────────────────┘         │                                          
                                                │                                          
           ┌────────────────────────────┐       │                                          
           │SLOT_MIGRATION_JOB_CANCELLED│◄──────┘                                          
           └────────────────────────────┘                                                 
```

Co-authored-by: Binbin <[email protected]>

---------

Signed-off-by: Binbin <[email protected]>
Signed-off-by: Jacob Murphy <[email protected]>
Signed-off-by: Madelyn Olson <[email protected]>
Co-authored-by: Binbin <[email protected]>
Co-authored-by: Ping Xie <[email protected]>
Co-authored-by: Madelyn Olson <[email protected]>
When we added the safeguard against sub-replicas, we didn't add
many tests for it. As time has shown, sub-replicas can appear in different
scenarios. Here's a test for a case that used to happen in a failover scenario.

Signed-off-by: Binbin <[email protected]>
…alkey-io#2431)

The assert was added in valkey-io#2301, and we found that there are some
situations that would trigger the assert and crash the server.

The reason we added the assert is because, in the code:
1. sender_claimed_primary and sender are in the same shard
2. and sender is the old primary, sender_claimed_primary is the old replica
3. and now sender become a replica, sender_claimed_primary become a primary

That means a failover happened in the shard, and sender should be the primary
of sender_claimed_primary. But this assumption may be wrong: we rely
on shard_id to determine whether nodes are in the same shard, and assume
that a shard can only have one primary.

However, from valkey-io#2279 we know there is a case where we can
create two primaries in the same shard due to the untimely update of shard_id.
So we can create a test that triggers the assert in this way:
1. Precondition: two primaries in the same shard, one with slots and one empty.
2. The replica does a cluster failover.
3. The empty primary does a cluster replicate with the replica (the new primary).

We change the assert to an if condition to fix it.

Closes valkey-io#2423.

Note that the test written here also exposes the issue in valkey-io#2441, so these two
may need to be addressed together.

Signed-off-by: Binbin <[email protected]>
Signed-off-by: yairgott <[email protected]>
Signed-off-by: yairgott <[email protected]>
@yairgott yairgott marked this pull request as draft August 12, 2025 04:54
@yairgott
Copy link
Contributor Author

This PR was based on Ran's hash TTL branch. Closing this PR in favor of #2472.

@yairgott yairgott closed this Aug 12, 2025
@github-project-automation github-project-automation bot moved this from In Progress to Done in Valkey 9.0 Aug 12, 2025