Commit 6520d3a

Fix tests (#322)
* fix(tests): Use architecture-specific streams in coordinator tests

  Fix 7 failing tests in test_self_contained_coordinator_memtier.py by:
  - Adding messages to arch-specific streams instead of the base stream
  - Fixing consumer group creation parameters (arch and id)
  - Updating assertions to check arch-specific streams

  This aligns the tests with the arch-specific stream routing implemented in the
  coordinator, which reads from streams like:
  - oss:api:gh/redis/redis/builds:amd64 (for amd64)
  - oss:api:gh/redis/redis/builds:arm64 (for arm64)

  Fixes:
  - test_self_contained_coordinator_dockerhub_preload
  - test_self_contained_coordinator_dockerhub
  - test_self_contained_coordinator_dockerhub_iothreads
  - test_self_contained_coordinator_dockerhub_valkey
  - test_dockerhub_via_cli
  - test_dockerhub_via_cli_airgap
  - test_self_contained_coordinator_duplicated_ts

* style: Format test file with black

  Apply black formatting to test_self_contained_coordinator_memtier.py to comply
  with CI code style checks.

* style: Format runner.py with black

  Apply black formatting to runner.py for the FLUSHALL changes from PR #320 to
  comply with CI code style checks.

* fix(runner): Skip validation for untested operation types

  Improve metric validation to skip operation-specific metrics when those
  operations are not in the tested-commands list. For example, a SET-only test
  (--ratio 1:0) will now skip validation of the Gets.Ops/sec metric, which would
  legitimately be 0.

  This fixes CI test failures where SET-only tests were failing validation
  because Gets.Ops/sec was 0 (below the 10 QPS threshold).

  The validation now:
  1. Checks benchmark_config for the 'tested-commands' list
  2. Skips metrics for operation types (gets, sets, hgets, etc.) that are not in
     the tested-commands list
  3. Still validates metrics for operations that are actually being tested

* fix(tests): Add git_version to benchmark stream requests

  Add the git_version parameter when calling generate_benchmark_stream_request
  in tests. This ensures version information is propagated through the stream to
  the coordinator, allowing by.version keys to be created in Redis. Without
  this, the tests expected by.version keys to exist, but they were never created
  because artifact_version was None in the data export logic.

  Fixes test assertions that check for:
  - ci.benchmarks.redis/.../by.version/{version}/benchmark_end/.../memory_maxmemory

  Tests fixed:
  - test_self_contained_coordinator_dockerhub_preload
  - test_self_contained_coordinator_dockerhub
  - test_self_contained_coordinator_dockerhub_iothreads
  - test_self_contained_coordinator_dockerhub_valkey
  - test_self_contained_coordinator_duplicated_ts

* fix(cli): Extract git_version from Docker image tag

  - Add regex-based version extraction from image names like 'redis:7.4.0' or
    'valkey/valkey:7.2.6-bookworm'
  - Pass the extracted git_version to generate_benchmark_stream_request()
  - Enables by.version Redis TimeSeries keys to be created for CLI-triggered tests
  - Fixes test_dockerhub_via_cli and test_dockerhub_via_cli_airgap assertion failures

* style: Format cli.py with black

  Apply black formatting to comply with CI code style checks.

* debug: Add logging to diagnose artifact_version issue

* debug: Add print statements to track artifact_version value

* style: Format runner.py with black

* fix: Include running_platform in by.version key paths

  The export_redis_metrics function creates keys with the format:
  {prefix}/{test}/{by_variant}/benchmark_end/{running_platform}/{setup}/{metric}
  but the tests were expecting keys without running_platform:
  {prefix}/{test}/{by_variant}/benchmark_end/{setup}/{metric}

  This fixes all 6 failing tests by adding running_platform to the expected key
  format. Also removes the debug print statements that helped diagnose the issue.

* fix: Add JSON module support and correct test metadata

  This commit fixes stats validation errors:
  1. Add JSON module commands to commands.json:
     - JSON.GET: Return the value at path in JSON serialized form
     - JSON.SET: Sets or updates the JSON value at a path
  2. Add a 'json' group to groups.json with the JSON commands
  3. Fix test-suite metadata issues:
     - memtier_benchmark-playbook-rate-limiting-lua-100k-sessions.yml:
       Remove 'bitmap' from tested-groups (only scripting is tested via EVAL)
     - memtier_benchmark-playbook-realtime-analytics-membership.yml:
       Add 'sunion' to tested-commands (SUNION was used but not listed)
     - memtier_benchmark-playbook-realtime-analytics-membership-pipeline-10.yml:
       Add 'sunion' to tested-commands

  These changes resolve all stats validation errors in CI.

* fix: Add missing quote in test case memtier_benchmark-1Mkeys-string-set-with-ex-100B-pipeline-10.yml
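For context on the stream routing that the first test fix targets, here is a minimal, hypothetical sketch (redis-py, against a local Redis) of a producer adding a build request to one of the architecture-specific streams named in the message above, and a consumer group reading it back. The group/consumer names and the payload fields are placeholders, not the project's actual ones.

import redis

conn = redis.StrictRedis(decode_responses=False)  # assumes a local Redis on 6379

# Arch-specific stream for amd64 builds, as described in the commit message.
stream = "oss:api:gh/redis/redis/builds:amd64"
group = "coordinator-group"  # placeholder consumer-group name

# Producer side: tests now add entries here rather than to the base
# "oss:api:gh/redis/redis/builds" stream.
conn.xadd(stream, {b"git_version": b"7.4.0"})

# Consumer side: create the group if it does not exist, then read new entries.
try:
    conn.xgroup_create(stream, group, mkstream=True)
except redis.exceptions.ResponseError as e:
    if "BUSYGROUP" not in str(e):
        raise
print(conn.xreadgroup(group, "consumer-1", {stream: ">"}, count=1))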
1 parent 8a23ac1 commit 6520d3a

9 files changed: +19863 −19751 lines

commands.json

Lines changed: 19689 additions & 19675 deletions
Large diffs are not rendered by default.

groups.json

Lines changed: 39 additions & 35 deletions
@@ -1,70 +1,74 @@
 {
     "bitmap": {
-        "display": "Bitmap",
-        "description": "Operations on the Bitmap data type"
+        "description": "Operations on the Bitmap data type",
+        "display": "Bitmap"
     },
     "cluster": {
-        "display": "Cluster",
-        "description": "Redis Cluster management"
+        "description": "Redis Cluster management",
+        "display": "Cluster"
     },
     "connection": {
-        "display": "Connection",
-        "description": "Client connections management"
+        "description": "Client connections management",
+        "display": "Connection"
     },
     "generic": {
-        "display": "Generic",
-        "description": "Generic commands"
+        "description": "Generic commands",
+        "display": "Generic"
     },
     "geo": {
-        "display": "Geospatial indices",
-        "description": "Operations on the Geospatial Index data type"
+        "description": "Operations on the Geospatial Index data type",
+        "display": "Geospatial indices"
     },
     "hash": {
-        "display": "Hash",
-        "description": "Operations on the Hash data type"
+        "description": "Operations on the Hash data type",
+        "display": "Hash"
     },
     "hyperloglog": {
-        "display": "HyperLogLog",
-        "description": "Operations on the HyperLogLog data type"
+        "description": "Operations on the HyperLogLog data type",
+        "display": "HyperLogLog"
     },
+    "json": [
+        "JSON.GET",
+        "JSON.SET"
+    ],
     "list": {
-        "display": "List",
-        "description": "Operations on the List data type"
+        "description": "Operations on the List data type",
+        "display": "List"
     },
     "pubsub": {
-        "display": "Pub/Sub",
-        "description": "Pub/Sub commands"
+        "description": "Pub/Sub commands",
+        "display": "Pub/Sub"
     },
     "scripting": {
-        "display": "Scripting and Functions",
-        "description": "Redis server-side scripting and functions"
+        "description": "Redis server-side scripting and functions",
+        "display": "Scripting and Functions"
     },
     "sentinel": {
-        "display": "Sentinel",
-        "description": "Redis Sentinel commands"
+        "description": "Redis Sentinel commands",
+        "display": "Sentinel"
     },
     "server": {
-        "display": "Server",
-        "description": "Server management commands"
+        "description": "Server management commands",
+        "display": "Server"
     },
     "set": {
-        "display": "Set",
-        "description": "Operations on the Set data type"
+        "description": "Operations on the Set data type",
+        "display": "Set"
     },
     "sorted-set": {
-        "display": "Sorted Set",
-        "description": "Operations on the Sorted Set data type"
+        "description": "Operations on the Sorted Set data type",
+        "display": "Sorted Set"
     },
     "stream": {
-        "display": "Stream",
-        "description": "Operations on the Stream data type"
+        "description": "Operations on the Stream data type",
+        "display": "Stream"
     },
     "string": {
-        "display": "String",
-        "description": "Operations on the String data type"
+        "description": "Operations on the String data type",
+        "display": "String"
     },
     "transactions": {
-        "display": "Transactions",
-        "description": "Redis Transaction management"
+        "description": "Redis Transaction management",
+        "display": "Transactions"
     }
-}
+}
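A quick way to sanity-check the reordered file is to load it and print each entry; note that, as committed, the new "json" entry is a plain list of command names while every other group is an object with "description" and "display" fields. A minimal sketch:

import json

with open("groups.json") as fp:
    groups = json.load(fp)

for name, value in sorted(groups.items()):
    if isinstance(value, dict):
        print(f"{name}: {value['display']} - {value['description']}")
    else:
        # The "json" group added here is a list of commands, not an object.
        print(f"{name}: {value}")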

redis_benchmarks_specification/__cli__/cli.py

Lines changed: 17 additions & 1 deletion
@@ -80,6 +80,22 @@ def trigger_tests_dockerhub_cli_command_logic(args, project_name, project_versio
         decode_responses=False,
     )
     conn.ping()
+
+    # Extract version from Docker image tag if possible
+    # e.g., "redis:7.4.0" -> "7.4.0"
+    # e.g., "valkey/valkey:7.2.6-bookworm" -> "7.2.6"
+    git_version = None
+    if ":" in args.run_image:
+        tag = args.run_image.split(":")[-1]
+        # Try to extract version number from tag
+        # Common patterns: "7.4.0", "7.2.6-bookworm", "latest"
+        import re
+
+        version_match = re.match(r"^(\d+\.\d+\.\d+)", tag)
+        if version_match:
+            git_version = version_match.group(1)
+            logging.info(f"Extracted git_version '{git_version}' from image tag")
+
     testDetails = {}
     build_stream_fields, result = generate_benchmark_stream_request(
         args.id,
@@ -96,7 +112,7 @@ def trigger_tests_dockerhub_cli_command_logic(args, project_name, project_versio
         None,
         None,
         None,
-        None,
+        git_version,  # Pass extracted version
         None,
         None,
         None,
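As a stand-alone check of what the new extraction yields, the same regex can be run over representative image names; the first two tags come from the comments in the diff, and "latest" shows the no-match case where git_version stays None:

import re

for image in ["redis:7.4.0", "valkey/valkey:7.2.6-bookworm", "redis:latest"]:
    git_version = None
    if ":" in image:
        tag = image.split(":")[-1]
        match = re.match(r"^(\d+\.\d+\.\d+)", tag)
        if match:
            git_version = match.group(1)
    print(image, "->", git_version)

# redis:7.4.0 -> 7.4.0
# valkey/valkey:7.2.6-bookworm -> 7.2.6
# redis:latest -> None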

redis_benchmarks_specification/__runner__/runner.py

Lines changed: 48 additions & 6 deletions
@@ -175,13 +175,20 @@ def validate_benchmark_metrics(
     Args:
         results_dict: Dictionary containing benchmark results
         test_name: Name of the test being validated
-        benchmark_config: Benchmark configuration (unused, for compatibility)
+        benchmark_config: Benchmark configuration (optional, contains tested-commands)
         default_metrics: Default metrics configuration (unused, for compatibility)

     Returns:
         tuple: (is_valid, error_message)
     """
     try:
+        # Get tested commands from config if available
+        tested_commands = []
+        if benchmark_config and "tested-commands" in benchmark_config:
+            tested_commands = [
+                cmd.lower() for cmd in benchmark_config["tested-commands"]
+            ]
+
         # Define validation rules
         throughput_patterns = [
             "ops/sec",
@@ -219,6 +226,29 @@ def check_nested_dict(data, path=""):
                ):
                    return

+                # Skip operation-specific metrics for operations not being tested
+                # For example, skip Gets.Ops/sec if only SET commands are tested
+                if tested_commands:
+                    skip_metric = False
+                    operation_types = [
+                        "gets",
+                        "sets",
+                        "hgets",
+                        "hsets",
+                        "lpush",
+                        "rpush",
+                        "sadd",
+                    ]
+                    for op_type in operation_types:
+                        if (
+                            op_type in metric_path_lower
+                            and op_type not in tested_commands
+                        ):
+                            skip_metric = True
+                            break
+                    if skip_metric:
+                        return
+
                # Check throughput metrics
                for pattern in throughput_patterns:
                    if pattern in metric_path_lower:
@@ -2680,13 +2710,20 @@ def delete_temporary_files(
            if not success:
                logging.error(f"Memtier benchmark failed: {stderr}")
                # Clean up database after failure (timeout or error)
-                if args.flushall_on_every_test_end or args.flushall_on_every_test_start:
-                    logging.warning("Benchmark failed - cleaning up database with FLUSHALL")
+                if (
+                    args.flushall_on_every_test_end
+                    or args.flushall_on_every_test_start
+                ):
+                    logging.warning(
+                        "Benchmark failed - cleaning up database with FLUSHALL"
+                    )
                    try:
                        for r in redis_conns:
                            r.flushall()
                    except Exception as e:
-                        logging.error(f"FLUSHALL failed after benchmark failure: {e}")
+                        logging.error(
+                            f"FLUSHALL failed after benchmark failure: {e}"
+                        )
                # Continue with the test but log the failure
                client_container_stdout = f"ERROR: {stderr}"

@@ -3033,8 +3070,13 @@ def delete_temporary_files(
                test_result = False

                # Clean up database after exception to prevent contamination of next test
-                if args.flushall_on_every_test_end or args.flushall_on_every_test_start:
-                    logging.warning("Exception caught - cleaning up database with FLUSHALL")
+                if (
+                    args.flushall_on_every_test_end
+                    or args.flushall_on_every_test_start
+                ):
+                    logging.warning(
+                        "Exception caught - cleaning up database with FLUSHALL"
+                    )
                    try:
                        for r in redis_conns:
                            r.flushall()

redis_benchmarks_specification/test-suites/memtier_benchmark-1Mkeys-string-set-with-ex-100B-pipeline-10.yml

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ dbconfig:
 tool: memtier_benchmark
 arguments: '"--data-size" "100" "--ratio" "1:0" "--key-pattern" "P:P" "-c" "50"
   "-t" "2" "--hide-histogram" "--key-minimum" "1" "--key-maximum" "1000000" -n
-  allkeys" --pipeline 50'
+  "allkeys" --pipeline 50'
 resources:
   requests:
     memory: 1g
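The unbalanced quote matters because the arguments string has to be tokenised before being handed to memtier_benchmark; whether the runner uses shlex for this is an assumption here, but the sketch shows why the pre-fix string cannot be split cleanly:

import shlex

broken = '"--key-maximum" "1000000" -n allkeys" --pipeline 50'
fixed = '"--key-maximum" "1000000" -n "allkeys" --pipeline 50'

print(shlex.split(fixed))  # ['--key-maximum', '1000000', '-n', 'allkeys', '--pipeline', '50']
try:
    shlex.split(broken)
except ValueError as err:
    print("broken arguments:", err)  # "No closing quotation"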

redis_benchmarks_specification/test-suites/memtier_benchmark-playbook-rate-limiting-lua-100k-sessions.yml

Lines changed: 0 additions & 1 deletion
@@ -34,7 +34,6 @@ tested-commands:
 - bitcount
 - eval
 tested-groups:
-- bitmap
 - scripting

 redis-topologies:

redis_benchmarks_specification/test-suites/memtier_benchmark-playbook-realtime-analytics-membership-pipeline-10.yml

Lines changed: 1 addition & 0 deletions
@@ -35,6 +35,7 @@ dbconfig:
 tested-commands:
 - smembers
 - sdiff
+- sunion
 redis-topologies:
 - oss-standalone
 build-variants:

redis_benchmarks_specification/test-suites/memtier_benchmark-playbook-realtime-analytics-membership.yml

Lines changed: 1 addition & 0 deletions
@@ -35,6 +35,7 @@ dbconfig:
 tested-commands:
 - smembers
 - sdiff
+- sunion
 redis-topologies:
 - oss-standalone
 build-variants:
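Metadata slips like the ones fixed above (a stale tested-group, a missing tested-command) are easy to catch locally before CI. A possible sketch, assuming PyYAML is installed and the script runs from the repository root, that flags any tested-group not defined in groups.json; a similar pass could check tested-commands against commands.json:

import glob
import json

import yaml  # PyYAML, assumed available

with open("groups.json") as fp:
    known_groups = set(json.load(fp))

for path in sorted(glob.glob("redis_benchmarks_specification/test-suites/*.yml")):
    with open(path) as fp:
        suite = yaml.safe_load(fp)
    for group in suite.get("tested-groups", []):
        if group not in known_groups:
            print(f"{path}: unknown tested-group '{group}'")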
