fix: improve error handling and logging in system mapping and unmapping #171
OlegGitH wants to merge 11 commits into
Pull request overview
This PR updates the mapping service’s map/unmap paths for system-to-tenant relationships, primarily improving error/log visibility around validation, repository operations, and successful completion.
Changes:
- Renames local error variables in map/unmap transaction callbacks.
- Adds contextual error, warning, debug, and success logs.
- Updates helper comments for mapping validation functions.
Comments suppressed due to low confidence (2)
internal/service/mapping.go:99
- Renaming the local error here does not change the behavior described in the PR: the previous code returned immediately from the transaction when map validation failed, so createSystem/Patch could not overwrite that validation error. Please add a regression test that reproduces the claimed success-on-validation-failure path or update the PR description to match the logging-only behavior change.
system, found, validateErr := isSystemTenantMapAllowed(ctx, r, in)
if validateErr != nil {
slogctx.Error(ctx, "isSystemTenantMapAllowed failed", "error", validateErr)
return validateErr
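The point about the early return can be checked with a minimal sketch. Everything below is hypothetical: `validate` and `patch` stand in for `isSystemTenantMapAllowed` and `r.Patch`, and the sentinel errors are invented for illustration. Assuming the old callback had the shape the comment describes (validate, return on error, then patch), both variants return the validation error and never reach the patch step, so the rename alone is behavior-neutral.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors; the real service defines its own.
var (
	errValidation = errors.New("validation failed")
	errPatch      = errors.New("patch failed")
)

// validate is a stand-in for isSystemTenantMapAllowed.
func validate(ok bool) error {
	if !ok {
		return errValidation
	}
	return nil
}

// patch is a stand-in for r.Patch; it counts how often it is called.
func patch(calls *int) error {
	*calls++
	return nil
}

// mapOld reuses a single name, err, for every step.
func mapOld(valid bool, patchCalls *int) error {
	err := validate(valid)
	if err != nil {
		return err // early return: patch never runs, nothing can overwrite err
	}
	err = patch(patchCalls)
	if err != nil {
		return errPatch
	}
	return nil
}

// mapNew uses a distinct name per step, as in the renamed code.
func mapNew(valid bool, patchCalls *int) error {
	validateErr := validate(valid)
	if validateErr != nil {
		return validateErr
	}
	patchErr := patch(patchCalls)
	if patchErr != nil {
		return errPatch
	}
	return nil
}

func main() {
	var oldCalls, newCalls int
	fmt.Println(mapOld(false, &oldCalls), oldCalls)
	fmt.Println(mapNew(false, &newCalls), newCalls)
}
```

A regression test for the claimed bug would need an input for which the old shape returns nil despite a validation failure; under the assumption above, no such input exists.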
internal/service/mapping.go:226
- Correct the grammar in the function comment: “Tenant exist” should be “Tenant exists”.
// It returns nil if the provided Tenant exist, the System is found and not linked, and HasL1KeyClaim is false.
Co-authored-by: Copilot Autofix powered by AI <[email protected]>
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (2)
internal/service/mapping.go:226
- This comment is still misleading: the function also returns a nil error when the system is not found so the caller can create it, not only when the system is found and not linked.
// It returns nil if the provided Tenant exist, the System is found and not linked, and HasL1KeyClaim is false.
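The contract described in this comment (a nil error covers both "found and mappable" and "not found, caller should create") can be sketched as follows. The `system` type, the in-memory store, and the error values are assumptions made for illustration; only the three-value return shape is taken from the quoted call site.

```go
package main

import (
	"errors"
	"fmt"
)

// system is a hypothetical, trimmed-down model.
type system struct {
	ID            string
	TenantID      string
	HasL1KeyClaim bool
}

// Hypothetical sentinel errors.
var (
	errAlreadyLinked = errors.New("system already linked to a tenant")
	errL1KeyClaim    = errors.New("system has an active L1 key claim")
)

// lookupForMap returns (system, found, err).
// A nil error has two meanings the doc comment should spell out:
//   - found == true:  the system exists, is unlinked, and has no L1 key claim
//   - found == false: the system does not exist and the caller may create it
func lookupForMap(store map[string]*system, id string) (*system, bool, error) {
	s, ok := store[id]
	if !ok {
		return nil, false, nil // not found is not an error here
	}
	if s.TenantID != "" {
		return nil, true, errAlreadyLinked
	}
	if s.HasL1KeyClaim {
		return nil, true, errL1KeyClaim
	}
	return s, true, nil
}

func main() {
	store := map[string]*system{
		"free":   {ID: "free"},
		"linked": {ID: "linked", TenantID: "t1"},
	}
	for _, id := range []string{"free", "linked", "missing"} {
		_, found, err := lookupForMap(store, id)
		fmt.Printf("%s: found=%t err=%v\n", id, found, err)
	}
}
```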
internal/service/mapping.go:117
- This success log is emitted before the transaction has committed, so a later commit/rollback failure would still leave a misleading "successfully mapped" debug entry. Reserve success wording for after Transaction returns nil, or make this message explicitly about the patch step only.
slogctx.Debug(ctx, "system successfully mapped in transaction")
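A sketch of the suggested ordering, with the success message emitted only after the transaction wrapper returns nil. The `transaction` helper here is a hypothetical stand-in for `repo.Transaction` (run the callback, then commit), and `logf` replaces `slogctx` so the log sequence can be inspected.

```go
package main

import (
	"errors"
	"fmt"
)

// errCommit is a hypothetical commit failure.
var errCommit = errors.New("commit failed")

// transaction is a stand-in for repo.Transaction: it runs fn and,
// if fn succeeds, attempts the commit.
func transaction(fn func() error, commit func() error) error {
	if err := fn(); err != nil {
		return err
	}
	return commit()
}

// mapSystem logs a step-scoped message inside the callback and reserves
// the success wording for after the transaction has returned nil.
func mapSystem(commit func() error, logf func(string)) error {
	err := transaction(func() error {
		logf("patch step completed")
		return nil
	}, commit)
	if err != nil {
		logf("map failed: " + err.Error())
		return err
	}
	logf("system successfully mapped")
	return nil
}

func main() {
	var logs []string
	record := func(msg string) { logs = append(logs, msg) }
	_ = mapSystem(func() error { return errCommit }, record)
	fmt.Println(logs)
}
```

With a failing commit, the recorded messages contain the patch-step entry and the failure, but no success entry.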
…uses-Silent-Failure' into task/Error-Variable-Shadowing-Causes-Silent-Failure
…uses-Silent-Failure' into task/Error-Variable-Shadowing-Causes-Silent-Failure
// Register system without tenant, then unmap, then set L1 key claim and try to map
systemID, systemType, region := registerRegionalSystem(t, ctx, sSubj, "", true, allowedSystemType, nil, nil)
defer cleanupSystem(t, ctx, sSubj, mSubj, systemID, "", systemType, region, true)

systemID, systemType, region := registerRegionalSystem(t, ctx, sSubj, inactiveTenant.ID, false, allowedSystemType, nil, nil)
defer cleanupSystem(t, ctx, sSubj, mSubj, systemID, inactiveTenant.ID, systemType, region, false)

    assert.Equal(t, status.Code(err), status.Code(service.ErrTenantUnavailable))
})
t.Run("system has active L1 key claim during map", func(t *testing.T) {
    // Register system without tenant, then unmap, then set L1 key claim and try to map

_, patchErr := r.Patch(ctx, system)
if patchErr != nil {
    slogctx.Error(ctx, "failed to patch system during map", "error", patchErr)
    return ErrSystemUpdate
}
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
internal/service/mapping.go:100
- The helper already logs at each failure point, but this outer log will fire for every validation failure as an ERROR (including expected cases like not-found / failed-precondition). That produces duplicate log entries and may inflate error-level noise; consider removing this outer error log or logging only unexpected errors here and letting the helper logs stand on their own.
system, found, validateErr := isSystemTenantMapAllowed(ctx, r, in)
if validateErr != nil {
slogctx.Error(ctx, "isSystemTenantMapAllowed failed", "error", validateErr)
return validateErr
}
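One way to act on this suggestion is to classify the error before choosing a log level at the outer call site. The sketch below uses the standard library's `log/slog` rather than `slogctx`, and the sentinel errors and the `isExpected` helper are hypothetical; the real service may classify by gRPC status code instead.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"log/slog"
)

// Hypothetical sentinel errors for illustration.
var (
	errNotFound     = errors.New("system not found")
	errPrecondition = errors.New("system already linked")
	errDatabase     = errors.New("database unavailable")
)

// isExpected reports whether err is a normal validation outcome
// (not-found or failed-precondition) rather than an operational fault.
func isExpected(err error) bool {
	return errors.Is(err, errNotFound) || errors.Is(err, errPrecondition)
}

// wrap mimics a helper that adds context while keeping the cause matchable.
func wrap(err error) error {
	return fmt.Errorf("validate: %w", err)
}

// logValidationFailure keeps expected rejections out of the error level,
// since the helper has already logged them at its own failure point.
func logValidationFailure(ctx context.Context, err error) {
	if isExpected(err) {
		slog.DebugContext(ctx, "map validation rejected request", "error", err)
		return
	}
	slog.ErrorContext(ctx, "map validation failed unexpectedly", "error", err)
}

func main() {
	ctx := context.Background()
	logValidationFailure(ctx, wrap(errNotFound))
	logValidationFailure(ctx, wrap(errDatabase))
}
```

Dropping the outer log entirely, as the comment also proposes, is the simpler option when the helper's own logs already carry enough context.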
  err := m.repo.Transaction(ctxTimeout, func(ctx context.Context, r repository.Repository) error {
-     system, err := validateAndGetSystemForUnmap(ctx, r, in)
-     if err != nil {
-         return err
+     system, validateErr := validateAndGetSystemForUnmap(ctx, r, in)
+     if validateErr != nil {
+         slogctx.Error(ctx, "validateAndGetSystemForUnmap failed", "error", validateErr)
+         return validateErr

-         return err
+     system, validateErr := validateAndGetSystemForUnmap(ctx, r, in)
+     if validateErr != nil {
+         slogctx.Error(ctx, "validateAndGetSystemForUnmap failed", "error", validateErr)

defer func() {
    // Change tenant to active so cleanup can proceed
    inactiveTenant.Status = model.TenantStatus(tenantgrpc.Status_STATUS_ACTIVE.String())
    db.WithContext(ctx).Save(inactiveTenant)
Fixes a critical bug in UnmapSystemFromTenant and MapSystemToTenant where error variables were being shadowed/reused inside transaction callbacks. This caused validation errors from validateAndGetSystemForUnmap and isSystemTenantMapAllowed to be silently overwritten by subsequent r.Patch() or createSystem() calls, resulting in failed operations returning success responses to gRPC callers.
Additionally, both functions and their helper methods lacked any logging beyond the initial entry debug log. This made it impossible to diagnose failures in production — only "UnmapSystemFromTenant called" was visible, with no trace of what happened next.
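For readers unfamiliar with the failure mode named above, the following is a generic illustration of how error shadowing can turn a failure into a nil return in Go. It is not the code from `mapping.go`: the function names and the sentinel error are hypothetical, and whether the original callbacks actually had this shape is the open question raised in review.

```go
package main

import (
	"errors"
	"fmt"
)

// errValidation is a hypothetical validation failure.
var errValidation = errors.New("validation failed")

func validate() error { return errValidation }

// runShadowed loses the error: := inside the block declares a new err
// that is scoped to the block, so the outer err is never assigned.
func runShadowed(check bool) error {
	var err error
	if check {
		err := validate() // shadows the outer err
		if err != nil {
			fmt.Println("inner err:", err)
		}
	}
	return err // outer err, still nil
}

// runFixed assigns to the outer variable with = instead of :=.
func runFixed(check bool) error {
	var err error
	if check {
		err = validate()
	}
	return err
}

func main() {
	fmt.Println("shadowed:", runShadowed(true))
	fmt.Println("fixed:", runFixed(true))
}
```

Distinct names such as `validateErr` and `patchErr` make this class of mistake harder to write, which is a reasonable motivation for the rename even where behavior is unchanged.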
Changes: