OSAC-706: E2E tests for catalog item lifecycle#44
Conversation
|
Skipping CI for Draft Pull Request. |
|
@tzvatot: This pull request references OSAC-706 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: tzvatot The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
923e281 to
f0d016a
Compare
WalkthroughThis PR adds comprehensive test infrastructure and end-to-end test coverage for cluster catalog items. It includes a configurable test fixture, gRPC and CLI helper methods for catalog operations, and six test cases covering creation, visibility enforcement, cluster integration, and deletion constraints with proper resource cleanup. ChangesCatalog Item Test Infrastructure and Coverage
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested labels
Suggested reviewers
Poem
Security & Risk AssessmentLow Risk. This PR adds test infrastructure without modifying production code. The tests validate critical security and integrity controls:
No credentials, secrets, or sensitive data are introduced. The test infrastructure safely uses environment variables and CLI parsers to extract UUIDs and validate API responses. 🚥 Pre-merge checks | ✅ 9 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (9 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
This PR is in draft because it depends on fulfillment-service#595 (filter unpublished catalog items from public API). The |
f0d016a to
bc80ea1
Compare
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/catalog/test_catalog_item_lifecycle.py`:
- Around line 73-77: Replace the manual try/except pattern with pytest's context
manager: wrap the call to
cli.create_cluster_with_catalog_item(catalog_item=catalog_item_id,
name=cluster_name) inside pytest.raises(subprocess.CalledProcessError) so the
test asserts the call fails idiomatically (this removes the manual raise
AssertionError and silences the Ruff TRY003 warning); keep the same exception
type (subprocess.CalledProcessError) and the exact call to
cli.create_cluster_with_catalog_item to locate the change.
- Around line 60-63: The teardown deletes the cluster with
cli.delete_cluster(uuid=cluster_id) then immediately calls the checked
grpc.delete_cluster_catalog_item(catalog_item_id=catalog_item_id), which can
flake if the cluster still references the catalog item; update both teardown
sites (around lines shown) to poll until the cluster is removed from gRPC before
deleting the catalog item—use grpc.list_cluster_ids() in a small loop or reuse
the existing wait_for_cluster_grpc_removal(cluster_id) helper to wait for
removal, then call
grpc.delete_cluster_catalog_item(catalog_item_id=catalog_item_id).
In `@tests/core/grpc_client.py`:
- Around line 88-93: call_unchecked duplicates the grpcurl argument construction
from call, risking drift; extract a shared helper (e.g., build_grpcurl_args or
_grpcurl_args) that accepts token, address, service, and optional data and
returns the args list, then replace the construction in both call and
call_unchecked to use that helper so the Authorization header and transport
flags are maintained in one place (ensure the helper handles -insecure, -H
Authorization: Bearer {token}, optional -d json.dumps(data), and appends address
and service).
In `@tests/core/osac_cli.py`:
- Around line 105-109: Duplicate logic parsing a UUID from CLI stdout appears in
create_compute_instance, create_cluster, and create_cluster_with_catalog_item;
extract that into a single helper (e.g., parse_uuid_from_stdout or
_extract_uuid) on the same test class or module, implement it to run the regex
re.search(r"'([^']+)'", stdout) with the assert and return group(1), then
replace the inlined match/assert/group(1) sequences in create_compute_instance,
create_cluster, and create_cluster_with_catalog_item to call the new helper so
all parsing is centralized and consistent.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: osac-project/coderabbit/.coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 6b1916ec-5039-4b63-ba9b-74a3b598b9a4
📒 Files selected for processing (5)
tests/catalog/__init__.pytests/catalog/conftest.pytests/catalog/test_catalog_item_lifecycle.pytests/core/grpc_client.pytests/core/osac_cli.py
9dc0722 to
747b38e
Compare
Add 5 E2E tests covering catalog item CRUD, visibility filtering, cluster creation via catalog item, unpublished rejection, and delete-when-referenced protection. Extend GRPCClient with catalog item operations (create, get, list, delete) and call_unchecked for negative test cases. Add CLI helper for creating clusters with --catalog-item flag. Generated with [Claude Code](https://claude.com/claude-code)
Fix grpcurl error code matching to use case-insensitive comparison (grpcurl returns 'NotFound' not 'NOT_FOUND'). Mark test_delete_catalog_item_blocked_when_referenced as xfail since the server does not yet enforce referential integrity on catalog item deletion. Generated with [Claude Code](https://claude.com/claude-code)
- Extract _build_args in GRPCClient to prevent checked/unchecked drift - Extract _parse_uuid in OsacCLI to deduplicate UUID parsing - Use pytest.raises for expected-failure assertion - Wait for cluster removal before deleting catalog item in teardown Generated with [Claude Code](https://claude.com/claude-code)
747b38e to
8c40aa0
Compare
|
The test:
Now it is expected to work as I can see from my tests in my local OSAC env. Could you remove the expected result? |
|
/lgtm |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/catalog/test_catalog_item_lifecycle.py`:
- Around line 52-54: The assertion on line 54 relies on fragile string matching
of the error message output from the grpc.call_unchecked call, checking for "not
published" or "not found" substrings. This test will break if error message
wording changes without any actual functional regression. Instead of matching
against specific error message strings, check for a specific gRPC status code or
structured error code field in the response if the grpc.call_unchecked response
supports it. If the response structure does not expose error codes, document the
string matching limitation; otherwise refactor the assertion to verify the error
code rather than the message text.
- Around line 17-22: The `_wait_cluster_removed` function silently returns when
the deadline is exceeded without raising an exception, allowing test teardown to
proceed even when the cluster is still registered. After the while loop exits,
add a check to determine if the cluster_id is still present in
grpc.list_cluster_ids(). If the cluster remains registered after the deadline
has passed, raise a TimeoutError with a descriptive message instead of returning
silently. This ensures that cluster removal failures are properly surfaced
rather than masked by downstream referential integrity errors during catalog
item deletion.
- Around line 37-40: The test in test_catalog_item_lifecycle.py has a critical
logic error where catalog_item_id is set to an empty string on line 38
immediately after calling grpc.delete_cluster_catalog_item(), and then line 40
asserts that this empty string is not in the catalog item list. This assertion
always passes regardless of whether the deletion succeeded, because an empty
string will never be in the ID list. To fix this, move the assertion on line 40
to execute before line 38 reassigns catalog_item_id to an empty string, so the
assertion actually verifies that the original catalog_item_id value has been
removed from the list returned by grpc.list_cluster_catalog_item_ids().
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: osac-project/coderabbit/.coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 399d15cf-017d-4dab-9f84-e56c7437a573
📒 Files selected for processing (5)
tests/catalog/__init__.pytests/catalog/conftest.pytests/catalog/test_catalog_item_lifecycle.pytests/core/grpc_client.pytests/core/osac_cli.py
| def _wait_cluster_removed(grpc: GRPCClient, cluster_id: str, timeout: int = 30) -> None: | ||
| deadline = time.monotonic() + timeout | ||
| while cluster_id in grpc.list_cluster_ids(): | ||
| if time.monotonic() > deadline: | ||
| break | ||
| time.sleep(2) |
There was a problem hiding this comment.
Missing timeout exception masks cluster removal failures.
_wait_cluster_removed breaks silently when the deadline is exceeded without raising an exception. This allows teardown to proceed to catalog item deletion even when the cluster is still registered, causing downstream gRPC referential integrity failures that mask the actual removal problem.
Risk severity: Major — Silent timeouts produce misleading test failures and can leave orphaned catalog items when the real issue is cluster deletion/deregistration lag.
🛡️ Proposed fix to raise TimeoutError on deadline breach
def _wait_cluster_removed(grpc: GRPCClient, cluster_id: str, timeout: int = 30) -> None:
deadline = time.monotonic() + timeout
while cluster_id in grpc.list_cluster_ids():
if time.monotonic() > deadline:
- break
+ raise TimeoutError(f"Cluster {cluster_id} still present after {timeout}s")
time.sleep(2)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/catalog/test_catalog_item_lifecycle.py` around lines 17 - 22, The
`_wait_cluster_removed` function silently returns when the deadline is exceeded
without raising an exception, allowing test teardown to proceed even when the
cluster is still registered. After the while loop exits, add a check to
determine if the cluster_id is still present in grpc.list_cluster_ids(). If the
cluster remains registered after the deadline has passed, raise a TimeoutError
with a descriptive message instead of returning silently. This ensures that
cluster removal failures are properly surfaced rather than masked by downstream
referential integrity errors during catalog item deletion.
There was a problem hiding this comment.
@alosadagrande, thanks for tracking it in #70. The silent break will stay a potential source of confusing failures until that's addressed — worth keeping an eye on it.
🧠 Learnings used
Learnt from: tzvatot
Repo: osac-project/osac-test-infra PR: 44
File: tests/catalog/test_catalog_item_lifecycle.py:52-54
Timestamp: 2026-06-16T11:30:30.149Z
Learning: When reviewing Python E2E tests that use `GRPCClient.call_unchecked()` (from `tests/core/grpc_client.py` in this repo), allow assertions that match on the returned gRPC `output` string (e.g., substring/regex checks) and do not flag them as “fragile” solely on that basis. `call_unchecked()` only returns `(output: str, exit_code: int)` and does not provide structured gRPC status/error codes from the response, so string matching is the only viable assertion strategy for these tests. (You may still flag other issues, but not the string-based assertion fragility claim without acknowledging this constraint.)
| grpc.delete_cluster_catalog_item(catalog_item_id=catalog_item_id) | ||
| catalog_item_id = "" | ||
|
|
||
| assert catalog_item_id not in grpc.list_cluster_catalog_item_ids() |
There was a problem hiding this comment.
Deletion verification checks wrong value (always passes).
Line 38 sets catalog_item_id = "" immediately after deletion, then line 40 asserts that this empty string is not in the catalog item list. This assertion always passes regardless of whether deletion succeeded, because an empty string will never appear in the ID list.
Risk severity: Critical — The test cannot detect catalog item deletion failures, allowing broken delete operations to pass silently and accumulate leaked resources.
🐛 Proposed fix to verify actual catalog item ID
grpc.delete_cluster_catalog_item(catalog_item_id=catalog_item_id)
+ deleted_id = catalog_item_id
catalog_item_id = ""
- assert catalog_item_id not in grpc.list_cluster_catalog_item_ids()
+ assert deleted_id not in grpc.list_cluster_catalog_item_ids()🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/catalog/test_catalog_item_lifecycle.py` around lines 37 - 40, The test
in test_catalog_item_lifecycle.py has a critical logic error where
catalog_item_id is set to an empty string on line 38 immediately after calling
grpc.delete_cluster_catalog_item(), and then line 40 asserts that this empty
string is not in the catalog item list. This assertion always passes regardless
of whether the deletion succeeded, because an empty string will never be in the
ID list. To fix this, move the assertion on line 40 to execute before line 38
reassigns catalog_item_id to an empty string, so the assertion actually verifies
that the original catalog_item_id value has been removed from the list returned
by grpc.list_cluster_catalog_item_ids().
|
/retest |
|
@tzvatot: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary
Add E2E tests for the catalog items feature (OSAC-58 epic, Phase 6).
Tests:
test_catalog_item_crud— Create, list, get, delete via gRPC (private + public APIs)test_unpublished_catalog_item_not_visible_in_public_api— Unpublished items hidden from public List/Gettest_create_cluster_with_catalog_item— CLI creates cluster with--catalog-item, verifies spectest_create_cluster_with_unpublished_catalog_item_fails— Server rejects unpublished catalog itemtest_delete_catalog_item_blocked_when_referenced— Delete blocked when cluster references itemInfrastructure changes:
GRPCClient: Addcall_unchecked,get_cluster, and 4 catalog item methodsOsacCLI: Addcreate_cluster_with_catalog_itemTest plan
Jira: OSAC-706
Generated with Claude Code
Summary by CodeRabbit