Skip to content

OSAC-1412: add DB triggers for catalog item referential integrity#686

Open
ygalblum wants to merge 1 commit into
osac-project:mainfrom
ygalblum:fix/catalog-item-fix-toctou-race
Open

OSAC-1412: add DB triggers for catalog item referential integrity#686
ygalblum wants to merge 1 commit into
osac-project:mainfrom
ygalblum:fix/catalog-item-fix-toctou-race

Conversation

@ygalblum

@ygalblum ygalblum commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Replace the application-level delete-time reference check with bidirectional PostgreSQL triggers for catalog item referential integrity, while preserving the shared reference checker for unpublished-item visibility in public servers.

The previous query-then-delete pattern in the private catalog item servers had a TOCTOU race condition: a new referencing row could be inserted between the check and the soft-delete. Database triggers with FOR SHARE locking eliminate this race window entirely.

  • Added migration 54 with bidirectional triggers enforcing referential integrity between catalog items and their consumers (clusters, compute instances)
  • Parent-side triggers prevent soft-deleting a catalog item that is still referenced by active children (raises Z0003 / ErrInUse)
  • Child-side triggers prevent creating or updating a resource with a reference to a non-existent or soft-deleted catalog item (raises Z0002 / ErrReference)
  • Added btree indexes on the catalog_item JSONB field for both child tables
  • Removed the application-level reference checker from private catalog item servers — delete integrity is now enforced by DB triggers
  • Kept the shared catalogItemReferenceChecker interface in public servers for unpublished catalog item visibility checks
  • Deleted the mock reference checker (tests now use real DAO queries)

Migration tests covering trigger and index existence, parent-side enforcement (Z0003), and child-side enforcement (Z0002) for both catalog item types. Server tests updated to use real DAO queries for unpublished-item visibility and to assert trigger-based FailedPrecondition errors on delete.

Assisted-By: Claude Opus 4.6 noreply@anthropic.com

Summary by CodeRabbit

  • Bug Fixes

    • Deleting a catalog item is now blocked when active clusters or compute instances reference it, preventing orphaned references.
    • Creating or updating clusters and compute instances now rejects references to missing or soft-deleted catalog items with clear errors.
    • Referential validation is now enforced consistently at the database level for stronger integrity and more reliable behavior.
  • Tests

    • Migration and server tests updated to exercise the new DB-backed reference checks end-to-end.

@openshift-ci-robot

openshift-ci-robot commented Jun 12, 2026

Copy link
Copy Markdown

@ygalblum: This pull request references OSAC-1412 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Replace the application-level delete-time reference check with bidirectional PostgreSQL triggers for catalog item referential integrity, while preserving the shared reference checker for unpublished-item visibility in public servers.

The previous query-then-delete pattern in the private catalog item servers had a TOCTOU race condition: a new referencing row could be inserted between the check and the soft-delete. Database triggers with FOR SHARE locking eliminate this race window entirely.

  • Added migration 54 with bidirectional triggers enforcing referential integrity between catalog items and their consumers (clusters, compute instances)
  • Parent-side triggers prevent soft-deleting a catalog item that is still referenced by active children (raises Z0003 / ErrInUse)
  • Child-side triggers prevent creating or updating a resource with a reference to a non-existent or soft-deleted catalog item (raises Z0002 / ErrReference)
  • Added btree indexes on the catalog_item JSONB field for both child tables
  • Removed the application-level reference checker from private catalog item servers — delete integrity is now enforced by DB triggers
  • Kept the shared catalogItemReferenceChecker interface in public servers for unpublished catalog item visibility checks
  • Deleted the mock reference checker (tests now use real DAO queries)

Migration tests covering trigger and index existence, parent-side enforcement (Z0003), and child-side enforcement (Z0002) for both catalog item types. Server tests updated to use real DAO queries for unpublished-item visibility and to assert trigger-based FailedPrecondition errors on delete.

Assisted-By: Claude Opus 4.6 noreply@anthropic.com

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci Bot requested review from eranco74 and rgolangh June 12, 2026 15:55
@openshift-ci

openshift-ci Bot commented Jun 12, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ygalblum

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository: osac-project/coderabbit/.coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: ab1b7af2-baba-4bd1-a4a1-1b60fb5acf3c

📥 Commits

Reviewing files that changed from the base of the PR and between b3f5626 and 4d4c5a3.

📒 Files selected for processing (11)
  • internal/database/migrations/54_add_catalog_item_ref_triggers.up.sql
  • internal/database/migrations/54_add_catalog_item_ref_triggers_test.go
  • internal/servers/catalog_item_reference_checker_mock.go
  • internal/servers/cluster_catalog_items_server.go
  • internal/servers/cluster_catalog_items_server_test.go
  • internal/servers/compute_instance_catalog_items_server.go
  • internal/servers/compute_instance_catalog_items_server_test.go
  • internal/servers/private_cluster_catalog_items_server.go
  • internal/servers/private_cluster_catalog_items_server_test.go
  • internal/servers/private_compute_instance_catalog_items_server.go
  • internal/servers/private_compute_instance_catalog_items_server_test.go
💤 Files with no reviewable changes (3)
  • internal/servers/cluster_catalog_items_server.go
  • internal/servers/compute_instance_catalog_items_server.go
  • internal/servers/catalog_item_reference_checker_mock.go

Walkthrough

Adds migration 54: DB triggers and indexes enforce catalog-item references and block soft-deletes while active resources reference an item. Removes DAO-backed reference-checker wiring from private servers; tests switch from gomock to persisting real Cluster/ComputeInstance records exercising DB triggers. Risk: High — incorrect trigger logic may block valid deletes or reject valid writes.

Changes

Referential Integrity Validation Refactoring

Layer / File(s) Summary
SQL migration and trigger tests
internal/database/migrations/54_add_catalog_item_ref_triggers.up.sql, internal/database/migrations/54_add_catalog_item_ref_triggers_test.go
Adds btree indexes on non-null spec->>catalog_item JSONB paths; implements check_cluster_catalog_item_not_in_use() and check_ci_catalog_item_not_in_use() to block active->soft-delete transitions (Z0003); implements check_cluster_catalog_item_ref() and check_ci_catalog_item_ref() to validate references on INSERT/UPDATE using FOR SHARE (Z0002); comprehensive tests cover behaviors and edge cases.
Public server builder cleanup
internal/servers/cluster_catalog_items_server.go, internal/servers/compute_instance_catalog_items_server.go
Stop wiring the DAO-backed referenceChecker into private delegates; public servers still hold referenceChecker for unpublished-item Get checks.
Server tests — integration-style
internal/servers/cluster_catalog_items_server_test.go, internal/servers/compute_instance_catalog_items_server_test.go
Replace gomock expectations with real datastore inserts (private Cluster/ComputeInstance) via dao.NewGenericDAO and privatev1 builders to validate unpublished-item access.
Private server simplification
internal/servers/private_cluster_catalog_items_server.go, internal/servers/private_compute_instance_catalog_items_server.go
Remove referenceChecker field and SetReferenceChecker() builder methods; Delete() now delegates to generic.Delete() without pre-delete reference queries.
Private server tests adjustments
internal/servers/private_cluster_catalog_items_server_test.go, internal/servers/private_compute_instance_catalog_items_server_test.go
Drop gomock import and related creation test; update expected delete-block message substring from "still referenced" to "in use".

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant Resource as Cluster/ComputeInstance(table)
  participant CatalogItem as CatalogItem(table)
  Client->>Resource: INSERT/UPDATE spec.catalog_item
  Resource->>CatalogItem: call check_*_catalog_item_ref() (FOR SHARE)
  alt referenced exists & active
    CatalogItem-->>Resource: allow
  else missing or soft-deleted
    CatalogItem-->>Resource: raise Z0002 (abort)
  end
  Note over CatalogItem: On catalog-item soft-delete
  Client->>CatalogItem: UPDATE deletion_timestamp != 'epoch'
  CatalogItem->>Resource: call check_*_catalog_item_not_in_use()
  alt active referencing rows exist
    CatalogItem-->>Client: raise Z0003 (abort)
  else no active references
    CatalogItem-->>Client: allow soft-delete
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

lgtm

Suggested reviewers

  • jhernand
  • adriengentil

"Triggers stand guard where mocks once played,
DB locks hold truth while prechecks fade,
Z0002 and Z0003 mark the guarded gate,
Tests now persist to verify the state,
Review the SQL locks — subtle risks to braid."


✅ Pre-merge checks override applied

The pre-merge checks have been overridden successfully. You can now proceed with the merge.

Overridden by @ygalblum via checkbox on 2026-06-12T17:15:37.171Z.

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (10 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main change: adding database triggers to enforce catalog item referential integrity, which directly addresses the TOCTOU race condition elimination objective.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
No-Hardcoded-Secrets ✅ Passed Scanned PR-related migration + server/test files for hardcoded secrets (api_key/secret/token/password literals, embedded-credential URLs, and private key blocks) and found none.
No-Weak-Crypto ✅ Passed Searched PR files for MD5/SHA1/DES/RC4/3DES/Blowfish/ECB and for ConstantTimeCompare/subtle/bytes.Equal patterns; none found in migration 54 or related server/tests.
No-Injection-Vectors ✅ Passed Reviewed changed SQL/Go files for forbidden injection patterns (SQL concat/EXECUTE concat, shell=True, eval/exec, pickle.loads, yaml.load, os.system, dangerouslySetInnerHTML); none detected. Securi...
Container-Privileges ✅ Passed PR commit changed only console-proxy/grpc-server deployments; they set runAsNonRoot: true and allowPrivilegeEscalation: false, with no hostPID/hostNetwork/hostIPC or SYS_ADMIN/privileged: true found.
No-Sensitive-Data-In-Logs ✅ Passed No sensitive-data logging found in the PR’s modified SQL/Go files: only generic slog.ErrorContext(..., slog.Any("error", err)) appears, and the migration test has no log/print statements.
Ai-Attribution ✅ Passed AI-assisted commit OSAC-1412 includes an Assisted-By trailer and no Co-Authored-By trailer was found; risk low (attribution/compliance only).
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/database/migrations/54_add_catalog_item_ref_triggers_test.go`:
- Around line 190-208: The test currently only asserts the BEFORE INSERT trigger
'check_cluster_catalog_item_ref' on table 'clusters'—add assertions that the
same trigger exists for UPDATE (event_manipulation = 'UPDATE') and repeat
analogous checks for the compute instance trigger(s) referenced elsewhere; then
add behavior tests that exercise the UPDATE path: insert a valid cluster/compute
row, soft-delete or remove its referenced catalog_item, perform an UPDATE that
would keep the same catalog_item_id (or otherwise touch the row) and assert the
update fails or the constraint fires as expected (use the same test helpers and
DB connection used in the file to perform the UPDATE and validate
error/rollback). Ensure you modify the test cases that currently cover only
INSERT (the blocks around the 'check_cluster_catalog_item_ref' and the
compute-instance equivalent) to include both the registration assertion for
UPDATE and at least one failing UPDATE behavior case per resource.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: osac-project/coderabbit/.coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 29954fae-77fd-4af7-a2a1-8d614381d886

📥 Commits

Reviewing files that changed from the base of the PR and between c385789 and af3d8ca.

📒 Files selected for processing (11)
  • internal/database/migrations/54_add_catalog_item_ref_triggers.up.sql
  • internal/database/migrations/54_add_catalog_item_ref_triggers_test.go
  • internal/servers/catalog_item_reference_checker_mock.go
  • internal/servers/cluster_catalog_items_server.go
  • internal/servers/cluster_catalog_items_server_test.go
  • internal/servers/compute_instance_catalog_items_server.go
  • internal/servers/compute_instance_catalog_items_server_test.go
  • internal/servers/private_cluster_catalog_items_server.go
  • internal/servers/private_cluster_catalog_items_server_test.go
  • internal/servers/private_compute_instance_catalog_items_server.go
  • internal/servers/private_compute_instance_catalog_items_server_test.go
💤 Files with no reviewable changes (3)
  • internal/servers/catalog_item_reference_checker_mock.go
  • internal/servers/compute_instance_catalog_items_server.go
  • internal/servers/cluster_catalog_items_server.go

Comment thread internal/database/migrations/54_add_catalog_item_ref_triggers_test.go Outdated
@ygalblum ygalblum force-pushed the fix/catalog-item-fix-toctou-race branch from af3d8ca to b3f5626 Compare June 12, 2026 17:15

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/database/migrations/54_add_catalog_item_ref_triggers.up.sql`:
- Around line 38-42: The trigger SQL is matching catalog items only by id/name,
allowing cross-tenant matches; update the WHERE clauses in the SELECTs (the ones
using data->'spec'->>'catalog_item' = old.id or old.name) to also constrain by
tenant scope (e.g., compare the cluster's tenant column or the tenant field
inside data->spec to old.tenant_id or old.tenant) so only rows from the same
tenant are considered; apply this tenant-scoping change to every occurrence (the
blocks around the current select into child_id and the similar selects at the
other mentioned locations).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: osac-project/coderabbit/.coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 001a58e2-51d8-4dc3-b4be-dfcc791c26c0

📥 Commits

Reviewing files that changed from the base of the PR and between af3d8ca and b3f5626.

📒 Files selected for processing (11)
  • internal/database/migrations/54_add_catalog_item_ref_triggers.up.sql
  • internal/database/migrations/54_add_catalog_item_ref_triggers_test.go
  • internal/servers/catalog_item_reference_checker_mock.go
  • internal/servers/cluster_catalog_items_server.go
  • internal/servers/cluster_catalog_items_server_test.go
  • internal/servers/compute_instance_catalog_items_server.go
  • internal/servers/compute_instance_catalog_items_server_test.go
  • internal/servers/private_cluster_catalog_items_server.go
  • internal/servers/private_cluster_catalog_items_server_test.go
  • internal/servers/private_compute_instance_catalog_items_server.go
  • internal/servers/private_compute_instance_catalog_items_server_test.go
💤 Files with no reviewable changes (3)
  • internal/servers/cluster_catalog_items_server.go
  • internal/servers/catalog_item_reference_checker_mock.go
  • internal/servers/compute_instance_catalog_items_server.go

@ygalblum ygalblum force-pushed the fix/catalog-item-fix-toctou-race branch from b3f5626 to 5120d60 Compare June 12, 2026 17:44
Replace the application-level delete-time reference check with bidirectional
PostgreSQL triggers for catalog item referential integrity, while preserving
the shared reference checker for unpublished-item visibility in public servers.

The previous query-then-delete pattern in the private catalog item servers had
a TOCTOU race condition: a new referencing row could be inserted between the
check and the soft-delete. Database triggers with FOR SHARE locking eliminate
this race window entirely.

- Added migration 54 with bidirectional triggers enforcing referential integrity
  between catalog items and their consumers (clusters, compute instances)
- Parent-side triggers prevent soft-deleting a catalog item that is still
  referenced by active children (raises Z0003 / ErrInUse)
- Child-side triggers prevent creating or updating a resource with a reference
  to a non-existent or soft-deleted catalog item (raises Z0002 / ErrReference)
- Triggers resolve catalog item references by both id and name, matching the
  server-side lookup behavior that allows users to specify either
- Added btree indexes on the catalog_item JSONB field for both child tables
- Removed the application-level reference checker from private catalog item
  servers — delete integrity is now enforced by DB triggers
- Kept the shared catalogItemReferenceChecker interface in public servers for
  unpublished catalog item visibility checks
- Deleted the mock reference checker (tests now use real DAO queries)

Migration tests covering trigger and index existence, parent-side enforcement
(Z0003), and child-side enforcement (Z0002) for both catalog item types.
Server tests updated to use real DAO queries for unpublished-item visibility
and to assert trigger-based FailedPrecondition errors on delete.

Assisted-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ygal Blum <ygal.blum@gmail.com>
@ygalblum ygalblum force-pushed the fix/catalog-item-fix-toctou-race branch from 5120d60 to 4d4c5a3 Compare June 12, 2026 19:33
@ygalblum

Copy link
Copy Markdown
Contributor Author

/retest

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants