-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Adding version checks to remote entities using bytestream ser/de #20080
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Adding version checks to remote entities using bytestream ser/de #20080
Conversation
|
❌ Gradle check result for fcd9fc2: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
fcd9fc2 to
3ab843d
Compare
|
❌ Gradle check result for 3ab843d: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
server/src/main/java/org/opensearch/gateway/remote/model/RemoteDiscoveryNodes.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/gateway/remote/RemoteClusterStateService.java
Show resolved
Hide resolved
server/src/main/java/org/opensearch/gateway/remote/model/RemoteClusterStateCustoms.java
Outdated
Show resolved
Hide resolved
WalkthroughPropagates OpenSearch version into remote cluster-state serialization/deserialization: adds version parameters to async routing read methods, threads manifest.getOpensearchVersion() into remote model constructors and service calls, and makes blob-store formats version-aware by storing and applying a Version when reading streams. Changes
Sequence Diagram(s)(omitted) Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Suggested labels
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (9)
server/src/main/java/org/opensearch/repositories/blobstore/ChecksumWritableBlobStoreFormat.java (1)
88-89: Consider removing the stale TODO comment.The TODO "The stream version should be configurable" appears to be addressed by this PR for the deserialization path. For serialization, using
Version.CURRENTis correct since newly written blobs should always be at the current version. Consider either removing this TODO or clarifying that it specifically applies to the serialize path if there's a future use case.server/src/main/java/org/opensearch/gateway/remote/model/RemoteClusterBlocks.java (1)
36-36: Consider reducing visibility of the format field.The
clusterBlocksFormatfield ispublic final. Since it's only accessed internally byserialize()anddeserialize(), consider making itprivate finalto encapsulate the implementation detail.- public final ChecksumWritableBlobStoreFormat<ClusterBlocks> clusterBlocksFormat; + private final ChecksumWritableBlobStoreFormat<ClusterBlocks> clusterBlocksFormat;server/src/test/java/org/opensearch/gateway/remote/model/RemoteClusterBlocksTests.java (1)
124-138: Consider adding a cross-version deserialization test.The current
testSerDevalidates round-trip serialization with the same version. To fully validate the PR's goal of backwards compatibility, consider adding a test that deserializes data with a differentVersionparameter to ensure version-aware deserialization paths are exercised.server/src/main/java/org/opensearch/gateway/remote/model/RemoteHashesOfConsistentSettings.java (1)
34-34: Consider reducing visibility of the format field.Similar to
RemoteClusterBlocks, thehashesOfConsistentSettingsFormatfield ispublic finalbut only used internally. Consider making itprivate final.- public final ChecksumWritableBlobStoreFormat<DiffableStringMap> hashesOfConsistentSettingsFormat; + private final ChecksumWritableBlobStoreFormat<DiffableStringMap> hashesOfConsistentSettingsFormat;server/src/main/java/org/opensearch/gateway/remote/routingtable/RemoteRoutingTableDiff.java (2)
49-51: Clarify naming:VERSIONvsversionparameter.There are two version-related concepts here:
- Line 49:
VERSION = 1- the codec/format version for the blob structure- Constructor parameter
version(line 83) - the OpenSearch version for compatibilityConsider adding a brief comment to distinguish these, or renaming
VERSIONtoCODEC_VERSIONfor clarity.- public static final int VERSION = 1; + /** Codec version for the blob format structure (distinct from OpenSearch version used for compatibility). */ + public static final int CODEC_VERSION = 1;
51-51: Consider reducing visibility of the format field.Consistent with other remote entity classes, the
remoteRoutingTableDiffFormatfield could beprivate finalsince it's only used internally.- public final ChecksumWritableBlobStoreFormat<RoutingTableIncrementalDiff> remoteRoutingTableDiffFormat; + private final ChecksumWritableBlobStoreFormat<RoutingTableIncrementalDiff> remoteRoutingTableDiffFormat;server/src/main/java/org/opensearch/cluster/routing/remote/RemoteRoutingTableService.java (1)
32-44: ClarifyVersionparameter naming to avoid confusion with existinglong versionThe new
Version versionparameter is conceptually different from the existinglong versionarguments in the write APIs (term/version identifiers). Using the same name (version) for both can be confusing at call sites and in implementations.Consider renaming this parameter to something more explicit like
opensearchVersionorwriterVersion(and mirror that in implementations) so it’s obvious this is the OpenSearch wire/version used for (de)serialization, not the routing-table version number.server/src/main/java/org/opensearch/gateway/remote/routingtable/RemoteIndexRoutingTable.java (1)
33-75: Version-aware format wiring looks good; guard against read-only constructor misuseThe move to an instance-level
ChecksumWritableBlobStoreFormatand the new read-side constructor that passesopensearchVersioninto the 3-argChecksumWritableBlobStoreFormatctor aligns well with the version-aware (de)serialization goal. Using the instance field consistently inserialize()/deserialize()is clean.Two points worth tightening up:
Read-only ctor & nullable fields
In theRemoteIndexRoutingTable(String blobName, String clusterUUID, Compressor compressor, Version opensearchVersion)ctor, bothindexRoutingTableandindexare leftnull, but:
getBlobPathParameters()dereferencesindexRoutingTable.getIndex().getUUID().getUploadedMetadata()assertsindex != nulland uses it.If a read-path caller (now or in the future) accidentally invokes either of these on a read-only instance, you’ll get an assertion failure or NPE. It would be safer to:
- Either document clearly that these methods must not be called on read-only instances, or
- Add defensive checks that throw a clearer
IllegalStateExceptionwhenindexRoutingTable/indexare null, indicating misuse, or- Derive the necessary information for blob path / metadata from manifest data passed into the read-side ctor (if available) instead of relying on
indexRoutingTable.Assumption about 2-arg format ctor behavior
For the write-side ctor, you still use the 2-argChecksumWritableBlobStoreFormatconstructor. This is likely fine if that ctor internally defaults toVersion.CURRENT. If instead the version must be explicit for the on-disk format to be tagged correctly, you might want to switch this ctor to the 3-arg form withVersion.CURRENT(or the cluster’s target version) for symmetry.server/src/main/java/org/opensearch/cluster/routing/remote/InternalRemoteRoutingTableService.java (1)
182-213: Read-path wiring withVersionlooks correct; consider stricter typing and null-safetyThe updated read methods correctly:
- Accept a
Versionparameter, and- Thread it through to
RemoteIndexRoutingTable/RemoteRoutingTableDiffvia the new constructors, ensuring version-aware deserialization.A couple of small improvements to make this less error-prone:
Disambiguate the
versionname
In this class you already havelong version(routing-table version) in the write APIs. UsingVersion versionhere can be confusing. Renaming the parameter to something likeopensearchVersionorremoteNodeVersionwould make call sites and logs clearer.Fail fast on null
Version
Since this value is critical for compatibility, consider enforcing non-null upfront, e.g.:public void getAsyncIndexRoutingReadAction(..., Version opensearchVersion) { Objects.requireNonNull(opensearchVersion, "opensearchVersion must not be null"); ... RemoteIndexRoutingTable remoteIndexRoutingTable = new RemoteIndexRoutingTable(uploadedFilename, clusterUUID, compressor, opensearchVersion); ... }Same comment applies to
getAsyncIndexRoutingTableDiffReadAction.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (23)
CHANGELOG.md(1 hunks)server/src/main/java/org/opensearch/cluster/routing/remote/InternalRemoteRoutingTableService.java(3 hunks)server/src/main/java/org/opensearch/cluster/routing/remote/NoopRemoteRoutingTableService.java(3 hunks)server/src/main/java/org/opensearch/cluster/routing/remote/RemoteRoutingTableService.java(2 hunks)server/src/main/java/org/opensearch/gateway/remote/RemoteClusterStateService.java(8 hunks)server/src/main/java/org/opensearch/gateway/remote/model/RemoteClusterBlocks.java(4 hunks)server/src/main/java/org/opensearch/gateway/remote/model/RemoteClusterStateCustoms.java(2 hunks)server/src/main/java/org/opensearch/gateway/remote/model/RemoteCustomMetadata.java(1 hunks)server/src/main/java/org/opensearch/gateway/remote/model/RemoteDiscoveryNodes.java(5 hunks)server/src/main/java/org/opensearch/gateway/remote/model/RemoteHashesOfConsistentSettings.java(4 hunks)server/src/main/java/org/opensearch/gateway/remote/routingtable/RemoteIndexRoutingTable.java(5 hunks)server/src/main/java/org/opensearch/gateway/remote/routingtable/RemoteRoutingTableDiff.java(5 hunks)server/src/main/java/org/opensearch/repositories/blobstore/ChecksumWritableBlobStoreFormat.java(2 hunks)server/src/test/java/org/opensearch/cluster/routing/remote/RemoteRoutingTableServiceTests.java(5 hunks)server/src/test/java/org/opensearch/gateway/remote/RemoteClusterStateAttributesManagerTests.java(5 hunks)server/src/test/java/org/opensearch/gateway/remote/RemoteClusterStateServiceTests.java(5 hunks)server/src/test/java/org/opensearch/gateway/remote/RemoteGlobalMetadataManagerTests.java(1 hunks)server/src/test/java/org/opensearch/gateway/remote/model/RemoteClusterBlocksTests.java(4 hunks)server/src/test/java/org/opensearch/gateway/remote/model/RemoteClusterStateCustomsTests.java(4 hunks)server/src/test/java/org/opensearch/gateway/remote/model/RemoteDiscoveryNodesTests.java(4 hunks)server/src/test/java/org/opensearch/gateway/remote/model/RemoteHashesOfConsistentSettingsTests.java(5 hunks)server/src/test/java/org/opensearch/gateway/remote/routingtable/RemoteIndexRoutingTableDiffTests.java(5 hunks)server/src/test/java/org/opensearch/gateway/remote/routingtable/RemoteIndexRoutingTableTests.java(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
server/src/test/java/org/opensearch/gateway/remote/RemoteClusterStateServiceTests.java (3)
server/src/main/java/org/opensearch/gateway/remote/model/RemoteClusterBlocks.java (1)
RemoteClusterBlocks(33-93)server/src/main/java/org/opensearch/gateway/remote/model/RemoteDiscoveryNodes.java (1)
RemoteDiscoveryNodes(33-103)server/src/main/java/org/opensearch/gateway/remote/model/RemoteHashesOfConsistentSettings.java (1)
RemoteHashesOfConsistentSettings(32-108)
server/src/test/java/org/opensearch/cluster/routing/remote/RemoteRoutingTableServiceTests.java (2)
server/src/main/java/org/opensearch/gateway/remote/routingtable/RemoteIndexRoutingTable.java (1)
RemoteIndexRoutingTable(31-120)server/src/main/java/org/opensearch/gateway/remote/routingtable/RemoteRoutingTableDiff.java (1)
RemoteRoutingTableDiff(34-139)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: gradle-check
🔇 Additional comments (39)
CHANGELOG.md (1)
72-72: LGTM! The changelog entry is well-formatted, correctly placed in the "Fixed" section, and accurately describes the backward compatibility fix for remote state entities using bytestream serialization. The entry follows project conventions and clearly references the PR.server/src/main/java/org/opensearch/repositories/blobstore/ChecksumWritableBlobStoreFormat.java (2)
53-62: Version-aware deserialization pattern looks correct.The constructor overload with
Version opensearchVersionenables version-aware deserialization, while the default constructor usesVersion.CURRENT. This asymmetry is intentional: serialization always writes at the current version (line 89), while deserialization reads using the version from the source manifest.
110-112: LGTM!Setting the version on
StreamInputbefore applying the reader enables proper version-aware deserialization, allowing readers to handle differences between versions during parsing.server/src/main/java/org/opensearch/gateway/remote/model/RemoteClusterBlocks.java (2)
41-52: LGTM!The constructor pattern correctly distinguishes between upload (writing at current version) and download (reading with source version) paths. This enables proper backwards compatibility during version upgrades.
84-92: LGTM!The serialize and deserialize methods correctly use the instance-level format, enabling version-aware deserialization.
server/src/test/java/org/opensearch/gateway/remote/model/RemoteClusterBlocksTests.java (1)
66-67: LGTM!Test correctly updated to pass
Version.CURRENTto the new constructor signature.server/src/main/java/org/opensearch/gateway/remote/model/RemoteHashesOfConsistentSettings.java (2)
39-67: LGTM!The constructor pattern correctly separates upload (current version) and download (source version) paths, consistent with
RemoteClusterBlocks.
98-107: LGTM!Serialization and deserialization correctly use the instance-level format.
server/src/main/java/org/opensearch/gateway/remote/routingtable/RemoteRoutingTableDiff.java (2)
62-88: LGTM!The constructor pattern correctly handles upload (new data at current version) and download (reading with source version) paths, consistent with other remote entity classes in this PR.
129-138: LGTM!Serialization includes a defensive null check, and both methods correctly use the instance-level format for version-aware processing.
server/src/main/java/org/opensearch/gateway/remote/model/RemoteDiscoveryNodes.java (2)
36-36: LGTM! Version-aware format initialization pattern is correct.The dual-constructor approach properly handles both serialization and deserialization paths:
- Line 50: Upload constructor initializes
discoveryNodesFormatwithout version (will default toVersion.CURRENTfor serialization)- Line 56: Download constructor accepts
Versionparameter for backward-compatible deserializationThe public field exposure aligns with the past review discussion where getters/setters were deemed unnecessary.
Also applies to: 50-50, 56-56
91-96: LGTM! Correct usage of instance-based format.Serialization and deserialization correctly use the instance field
discoveryNodesFormatinstead of a static format, enabling version-aware operations.Also applies to: 100-102
server/src/test/java/org/opensearch/gateway/remote/model/RemoteDiscoveryNodesTests.java (1)
73-74: LGTM! Test updates correctly reflect new constructor signature.All test instantiations now pass
Version.CURRENTto the download-path constructor, aligning with the version-aware deserialization pattern.Also applies to: 82-83, 91-92, 97-98, 155-156
server/src/test/java/org/opensearch/gateway/remote/model/RemoteHashesOfConsistentSettingsTests.java (1)
11-11: LGTM! Test constructor calls updated correctly.Added
Version.CURRENTparameter to allRemoteHashesOfConsistentSettingsdownload-path constructor calls, consistent with the version-aware pattern.Also applies to: 75-77, 94-96, 113-115, 124-126
server/src/main/java/org/opensearch/gateway/remote/model/RemoteCustomMetadata.java (1)
78-82: LGTM! Version parameter correctly propagated.The download-path constructor now passes the
versionparameter toChecksumWritableBlobStoreFormat, enabling version-aware deserialization while the upload-path constructor (lines 60-63) continues to use the default version.server/src/test/java/org/opensearch/gateway/remote/model/RemoteClusterStateCustomsTests.java (1)
104-106: LGTM! Test updates align with new constructor signature.All
RemoteClusterStateCustomsdownload-path constructor calls correctly includeCURRENT(Version.CURRENT) as the final parameter.Also applies to: 127-129, 150-152, 163-165
server/src/test/java/org/opensearch/gateway/remote/routingtable/RemoteIndexRoutingTableDiffTests.java (1)
11-11: LGTM! Test constructor calls updated correctly.Added
Version.CURRENTparameter to allRemoteRoutingTableDiffdownload-path instantiations, aligning with the version-aware deserialization pattern.Also applies to: 104-105, 127-128, 150-151, 253-253
server/src/test/java/org/opensearch/gateway/remote/routingtable/RemoteIndexRoutingTableTests.java (1)
110-116: LGTM! Test constructor calls updated correctly.All
RemoteIndexRoutingTabledownload-path constructor calls now includeVersion.CURRENT, properly exercising the version-aware deserialization path.Also applies to: 145-152, 180-187, 192-198
server/src/test/java/org/opensearch/cluster/routing/remote/RemoteRoutingTableServiceTests.java (4)
570-582: LGTM! Correctly adapted to instance-based format pattern.The test now creates a
RemoteIndexRoutingTableinstance to access itsindexRoutingTableFormatfield (line 577) instead of using a static format. This aligns with the shift to per-instance, version-aware formatting while maintaining test coverage.
586-591: LGTM! Version parameter correctly propagated.The
getAsyncIndexRoutingReadActioncall now includesVersion.CURRENT(line 590), enabling version-aware deserialization in the async read path.
608-616: LGTM! Consistent instance-based format pattern.Similar to the index routing table test, this correctly creates a
RemoteRoutingTableDiffinstance to access itsremoteRoutingTableDiffFormatfield (line 615) for serialization.
621-626: LGTM! Version parameter correctly added.The
getAsyncIndexRoutingTableDiffReadActioncall now includesVersion.CURRENT(line 625), consistent with the version-aware deserialization pattern.server/src/test/java/org/opensearch/gateway/remote/RemoteClusterStateServiceTests.java (4)
1176-1188: LGTM! Version-aware deserialization pattern correctly implemented.The test properly creates a
RemoteHashesOfConsistentSettingsinstance withVersion.CURRENTfor deserialization and uses the instance-based format for serialization. This aligns with the PR's objective of adding version checks to remote entities for backward compatibility.
1224-1237: LGTM! Consistent version-aware pattern for RemoteDiscoveryNodes.The test correctly instantiates
RemoteDiscoveryNodeswith version parameter and uses the instance'sdiscoveryNodesFormatfor serialization, consistent with the approach used for other remote entities.
1246-1258: LGTM! RemoteClusterBlocks follows the same version-aware pattern.The implementation correctly creates
RemoteClusterBlockswithVersion.CURRENTand uses the instance-based format. The pattern is consistently applied across all three remote state entities (RemoteHashesOfConsistentSettings, RemoteDiscoveryNodes, RemoteClusterBlocks).
1268-1268: LGTM! Critical addition for version-aware remote publication.Adding
opensearchVersion(Version.CURRENT)to the manifest is essential for the backward compatibility fix. This embeds the OpenSearch version in the manifest, enabling proper version-aware deserialization when new nodes join older-version nodes.server/src/main/java/org/opensearch/gateway/remote/RemoteClusterStateService.java (6)
1264-1270: Version mismatch logging looks good.The informational log when reading cluster state from a manifest uploaded with a different version is helpful for debugging cross-version compatibility issues. This addresses the previous review feedback.
1304-1310: Version propagation for index routing reads is correct.The version from the manifest is properly passed through to the async read action, enabling version-aware deserialization.
1324-1330: Version propagation for routing table diff reads is correct.Consistent with the index routing read pattern.
1333-1346: Version propagation for custom metadata reads is correct.The
manifest.getOpensearchVersion()is correctly passed to theRemoteCustomMetadataconstructor for version-aware deserialization.
1400-1424: Version propagation for discovery nodes, cluster blocks, and hashes of consistent settings is correct.All three components correctly receive the manifest's OpenSearch version for version-aware deserialization.
1439-1453: Version propagation for cluster state customs is correct.The pattern is consistent with other remote entity reads.
server/src/test/java/org/opensearch/gateway/remote/RemoteGlobalMetadataManagerTests.java (1)
353-383: Test correctly updated for version-aware format.The test properly:
- Constructs
RemoteHashesOfConsistentSettingswithVersion.CURRENT- Uses the instance's
hashesOfConsistentSettingsFormatfor serialization instead of a static formatThis aligns with the broader migration to per-instance version-aware formats.
server/src/main/java/org/opensearch/gateway/remote/model/RemoteClusterStateCustoms.java (1)
64-81: Version-aware constructor for read path is correctly implemented.The design correctly differentiates between:
- Write path (first constructor): Uses default format without version (implicitly current version)
- Read path (second constructor): Accepts version from manifest for backward-compatible deserialization
The version is properly passed to
ChecksumWritableBlobStoreFormatto enable version-aware stream handling.server/src/main/java/org/opensearch/cluster/routing/remote/NoopRemoteRoutingTableService.java (1)
71-89: Noop service correctly updated with Version parameter.The method signatures are updated to match the interface contract. The noop implementation appropriately ignores the version parameter since no actual read operation occurs.
server/src/test/java/org/opensearch/gateway/remote/RemoteClusterStateAttributesManagerTests.java (4)
138-162: Test correctly updated for version-aware DiscoveryNodes format.The test properly constructs
RemoteDiscoveryNodeswithVersion.CURRENTand uses the instance'sdiscoveryNodesFormatfor serialization.
193-215: Test correctly updated for version-aware ClusterBlocks format.The test properly constructs
RemoteClusterBlockswithVersion.CURRENTand uses the instance'sclusterBlocksFormatfor serialization.
257-284: Test correctly updated for version-aware ClusterStateCustoms format.The test properly constructs
RemoteClusterStateCustomswith the newVersion.CURRENTparameter.
310-326: Exception test correctly updated with Version parameter.Consistency maintained in error path testing.
|
❌ Gradle check result for 8fdee5c: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Harsh Garg <[email protected]>
Signed-off-by: Harsh Garg <[email protected]>
8fdee5c to
c0a6830
Compare
|
❌ Gradle check result for c0a6830: null Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Harsh Garg <[email protected]>
|
❌ Gradle check result for fbbd7bc: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Description
RemotePublication entities which use bytestream for ser/de of clusterState stored in remote were not setting Opensearch version in the bytestream. This was leading to failures during version upgrades whenever a new attribute was getting added to any of those remoteState entities as the new nodes in the latest version were failing to join the existing older version nodes while trying to deserialize the state which did not have new attributes.
Related Issues
Resolves #19843
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.