Member

@snazy snazy commented Sep 30, 2025

This PR provides a mechanism to assign a Polaris-cluster-wide unique node-ID to each Polaris instance, which is then used when generating Polaris-cluster-wide unique Snowflake-IDs.

The change is fundamental for the NoSQL work, but is also needed for the existing relational JDBC persistence.

This PR does not include any persistence-specific implementation.

return 0;
}
};
while (true) {
Contributor

I'd agree that permanent store/load failures are unlikely, but would it be worth adding a limit here just for general robustness?

Member Author

You mean to fail startup in case we hit a limit?

Contributor

here - yes

Contributor

Do you think it's worth breaking this loop on timeout (with an exception)?

Member Author

@snazy snazy Oct 2, 2025

Not really. It takes a bit. Worst that can happen is that it reads 1024 (default) rows in batches of 16 and figures out that none can be leased.

Renewals are safe. Unless you stall your system for 15+ minutes during the "renewal time window" (that would fall back to a new lease attempt).
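
To make the worst case above concrete, here is a minimal sketch of a batched scan for a free node ID. It is an illustration only: `LeaseScanSketch`, `findFreeNodeId`, and the `fetchLeased` callback are hypothetical names, and the sequential scan order is an assumption, not the PR's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalInt;
import java.util.Set;
import java.util.function.Function;

final class LeaseScanSketch {
  /**
   * Scans node IDs 0..numNodes-1 in batches and returns the first ID that is not leased.
   * `fetchLeased` is a hypothetical stand-in for one batched store read: given the
   * candidate IDs of a batch, it returns the subset that is currently leased.
   */
  static OptionalInt findFreeNodeId(
      int numNodes, int batchSize, Function<List<Integer>, Set<Integer>> fetchLeased) {
    for (int start = 0; start < numNodes; start += batchSize) {
      int end = Math.min(start + batchSize, numNodes);
      List<Integer> batch = new ArrayList<>();
      for (int id = start; id < end; id++) {
        batch.add(id);
      }
      // One store round trip per batch.
      Set<Integer> leased = fetchLeased.apply(batch);
      for (int id : batch) {
        if (!leased.contains(id)) {
          return OptionalInt.of(id);
        }
      }
    }
    // Every node ID is leased: the caller has to give up or retry later.
    return OptionalInt.empty();
  }
}
```

With 1024 node IDs and a batch size of 16, the scan performs at most 1024 / 16 = 64 batched reads before concluding that nothing can be leased, so the loop is bounded by construction.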

Contributor

I'm not sure... `storeManagementState()` (line 153) could keep returning false (for whatever reason, we do not know what the impl. will do 🤷) and stall startup... WDYT?

Member Author

Ah, that's a retry-loop for having one state (think: the id-generator configuration) persisted. That configuration (for the snowflake id generator) must be immutable for the lifetime of the "repository", because changing the id-generator configuration will lead to ID collisions. Once that config is there, the loop's finished. That's orthogonal to the node-lease mechanism.

This loop is literally only for the case when concurrently starting nodes read "no state" and then attempt to persist "their" config.

A "legit DB error" bubbles up and aborts the startup anyway.

Contributor

It's a bit too fuzzy at this level (what the related components are and do depends on too many factors: build, config, etc.). I'd prefer to avoid the unlimited loop just for "general robustness".

Member Author

Added a hard-coded timeout. But there's not much more we can do. It has to eventually succeed or fail.
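
For readers following the thread, a minimal sketch of what a time-bounded "persist once, then adopt" loop could look like. Everything here is an assumption for illustration: `BoundedInitSketch`, `initState`, the `load` and `storeIfAbsent` callbacks, and the timeout value are hypothetical and do not reflect the code in this PR.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.function.Supplier;

final class BoundedInitSketch {
  /**
   * Returns the persisted state, storing `desired` only if nothing is persisted yet.
   * `load` and `storeIfAbsent` are hypothetical stand-ins for the persistence layer;
   * any exception they throw (a real database error) propagates and aborts startup.
   */
  static String initState(
      Supplier<Optional<String>> load,
      Predicate<String> storeIfAbsent,
      String desired,
      Duration timeout) {
    long deadline = System.nanoTime() + timeout.toNanos();
    while (true) {
      Optional<String> existing = load.get();
      if (existing.isPresent()) {
        // Another node persisted its configuration first: adopt it unchanged.
        return existing.get();
      }
      if (storeIfAbsent.test(desired)) {
        return desired;
      }
      // Lost the race or the store refused: retry until the deadline passes.
      if (System.nanoTime() - deadline >= 0) {
        throw new IllegalStateException("Timed out persisting initial state");
      }
    }
  }
}
```

The sketch retries without a pause to stay short; a real implementation would likely back off between attempts. The point is only that concurrently starting nodes converge on a single persisted configuration, while a store that keeps refusing fails startup instead of stalling it.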

Contributor

thx 👍

snazy added 2 commits October 2, 2025 15:32
Also move the expensive part to a `@PostConstruct` to not block CDI entirely from initializing.
@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Oct 2, 2025
Contributor

@flyrain flyrain left a comment

Thanks for working on it! Left some comments. Given this is a big change (23 new files and 3 new modules), is it worth having a dev list discussion? That way people are aware of the changes and can contribute their ideas.

Some ID generation mechanisms,
like [Snowflake-IDs](https://medium.com/@jitenderkmr/demystifying-snowflake-ids-a-unique-identifier-in-distributed-computing-72796a827c9d),
require unique integer IDs for each running node. This framework provides a mechanism to assign each running node a
unique integer ID.
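
As background for the discussion, a minimal sketch of why a unique node ID matters for Snowflake-style IDs. The bit widths below (a 10-bit node ID and a 12-bit sequence under the timestamp) follow the commonly described layout and are assumptions, not necessarily what Polaris uses; `SnowflakeSketch`, `compose`, and `nodeIdOf` are hypothetical names.

```java
final class SnowflakeSketch {
  // Assumed layout (not confirmed by this PR): timestamp | 10-bit node ID | 12-bit sequence.
  static final int NODE_BITS = 10;
  static final int SEQUENCE_BITS = 12;
  static final long MAX_NODE_ID = (1L << NODE_BITS) - 1;
  static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;

  /** Packs timestamp, node ID, and per-millisecond sequence into one 64-bit ID. */
  static long compose(long timestampMillis, long nodeId, long sequence) {
    if (nodeId < 0 || nodeId > MAX_NODE_ID) {
      throw new IllegalArgumentException("node ID out of range: " + nodeId);
    }
    if (sequence < 0 || sequence > MAX_SEQUENCE) {
      throw new IllegalArgumentException("sequence out of range: " + sequence);
    }
    return (timestampMillis << (NODE_BITS + SEQUENCE_BITS))
        | (nodeId << SEQUENCE_BITS)
        | sequence;
  }

  /** Extracts the node ID bits from a composed ID. */
  static long nodeIdOf(long id) {
    return (id >>> SEQUENCE_BITS) & MAX_NODE_ID;
  }
}
```

Two nodes that generate an ID in the same millisecond with the same sequence value still produce different IDs as long as their node IDs differ. If two running nodes shared a node ID, that guarantee would be lost, which is why each running node needs its own.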
Contributor

If the Snowflake ID generator requires such a complex node ID generator, maybe we should consider other options. Would it be possible to use other ID generators? Since we are in the persistence module already, why can't we use something like ObjectID in MongoDB, or Java UUID?

Comment on lines +31 to +34
* `polaris-nodes-api` provides the necessary Java interfaces and immutable types.
* `polaris-nodes-impl` provides the storage agnostic implementation.
* `polaris-nodes-spi` provides the necessary interfaces to provide a storage specific implementation.
* `polaris-nodes-store-nosql` provides the storage implementation based on `polaris-persistence-nosql-api`.
Contributor

Where is the module?

Contributor

Currently it's in the end-to-end NoSQL PR: #1189 ... to be made available for review later (to allow for smaller, easier-to-review PRs, as discussed)

* specific language governing permissions and limitations
* under the License.
*/
package org.apache.polaris.nodes.api;
Contributor

I don't think anywhere else in Polaris needs this. Can we rename it to org.apache.polaris.nosql.nodes.api or org.apache.polaris.nosql.snowflakeid.nodes.api?

Comment on lines +31 to +33
* `polaris-nodes-api` provides the necessary Java interfaces and immutable types.
* `polaris-nodes-impl` provides the storage agnostic implementation.
* `polaris-nodes-spi` provides the necessary interfaces to provide a storage specific implementation.
Contributor

These modules are used by the Snowflake ID generator only. Can we merge them into the modules holding the Snowflake ID generators, so that the Snowflake ID generator is more consistent and self-contained?
