Add loadEntities batch call and rename listFullEntities #2508

collado-mike · 2025-09-04T18:28:55Z

#2290 introduced a new loadEntities variant, which is really a listEntities call that returns the complete PolarisBaseEntity rather than the EntityNameLookupRecord. A batch loadEntities call that works like loadEntity (given an id, it returns the identified entity) is also useful, notably when you don't want to list all entities of a particular type (e.g., loading a set of Principal Roles or refreshing specific entities for the EntityCache).

This introduces a new loadResolvedEntities API and renames the previous loadEntities to listFullEntities to avoid ambiguity. The new API mirrors the loadResolvedEntity... APIs, accepting either a list of PolarisEntityIds or a list of EntityNameLookupRecords. I used EntityNameLookupRecord because it is the return type of the original listEntities API, and also because PolarisEntityCore requires a grantVersion, which may not be present, e.g., if the caller only has the results of a listEntities call. I also wanted to mirror the existing loadEntity API, which requires a PolarisEntityType argument, since PolarisEntityId doesn't contain a type field.

I used the ResolvedPolarisEntity type and terminology in the API name in order to make the EntityCache API and the raw PolarisMetaStoreManager API the same. In part, this aims to start bringing the two APIs closer together so that the concept of the cache can one day be just an implementation detail, rather than part of the core business logic. The bulk load implementation in the cache mirrors the logic in the Resolver, in that it ensures that it always returns a snapshot consistent with the state of the persistence layer as it exists at a single point in time. This means that it validates that the entire batch of entities returned matches the entity versions and grant versions returned by a call to the loadEntitiesChangeTracking API.
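The consistency check described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: `Versions`, `Entity`, `Store`, and `loadConsistent` are hypothetical names, the in-memory map stands in for the persistence layer, and `loadChangeTracking` only imitates what loadEntitiesChangeTracking is described as returning. It assumes Java 16+ for records.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: a batch load that is only returned once it matches the tracked versions. */
final class SnapshotBatchLoadSketch {
  /** Entity version and grant version, as a change-tracking call might report them. */
  record Versions(int entityVersion, int grantVersion) {}

  /** Stand-in for a fully loaded entity together with its versions. */
  record Entity(long id, String name, Versions versions) {}

  /** Stand-in for the persistence layer; not a real Polaris interface. */
  static final class Store {
    final Map<Long, Entity> rows = new ConcurrentHashMap<>();

    /** Loads full entities; a null slot means the entity does not exist (e.g. dropped). */
    List<Entity> loadEntities(List<Long> ids) {
      List<Entity> out = new ArrayList<>();
      for (Long id : ids) {
        out.add(rows.get(id));
      }
      return out;
    }

    /** Loads only the versions, which is assumed to be cheaper than a full load. */
    List<Versions> loadChangeTracking(List<Long> ids) {
      List<Versions> out = new ArrayList<>();
      for (Long id : ids) {
        Entity e = rows.get(id);
        out.add(e == null ? null : e.versions());
      }
      return out;
    }
  }

  /** Retries until every loaded entity matches the versions seen by one change-tracking call. */
  static List<Entity> loadConsistent(Store store, List<Long> ids, int maxAttempts) {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      List<Entity> loaded = store.loadEntities(ids);
      List<Versions> tracked = store.loadChangeTracking(ids);
      boolean consistent = true;
      for (int i = 0; i < ids.size(); i++) {
        Versions seen = loaded.get(i) == null ? null : loaded.get(i).versions();
        if (!Objects.equals(seen, tracked.get(i))) {
          consistent = false; // something changed between the two calls; reload the batch
          break;
        }
      }
      if (consistent) {
        return loaded;
      }
    }
    throw new IllegalStateException("no consistent snapshot after " + maxAttempts + " attempts");
  }
}
```

Under the assumption that versions only ever increase, a batch that passes this check reflects the state of the store at the moment of the change-tracking call. A real implementation would likely reload only the stale entries rather than the whole batch.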

...bc/src/main/java/org/apache/polaris/persistence/relational/jdbc/JdbcBasePersistenceImpl.java

polaris-core/src/main/java/org/apache/polaris/core/entity/PolarisBaseEntity.java

...-core/src/main/java/org/apache/polaris/core/persistence/AtomicOperationMetaStoreManager.java

polaris-core/src/main/java/org/apache/polaris/core/persistence/PolarisMetaStoreManager.java

dennishuo · 2025-09-29T18:42:50Z

polaris-core/src/main/java/org/apache/polaris/core/persistence/PolarisMetaStoreManager.java

+   *     NULL if the entity has been dropped.
+   */
+  @Nonnull
+  ResolvedEntitiesResult loadResolvedEntities(


The existence of this method may cause confusion since the single-lookup methods have both ByName and ById variations, and an EntityNameLookupRecord happens to have the name in it, and yet it looks like the actual key difference between the two methods is whether we have EntityType per item.

I'm not sure if we already use EntityNameLookupRecord as an input argument anywhere else, but generally since it was kind of structured as an output argument before it seems it becomes ambiguous as an input argument.

And at first glance the unit tests seem to imply that the lookup would be "by name":

@ParameterizedTest
@ValueSource(strings = {"id", "name"})
....
if (loadType.equals("id")) {
  // Create entity ID list with the updated entity
  List<PolarisEntityId> entityIds = List.of(getPolarisEntityId(T6v2));
  // Call batch load - this should detect the stale version and reload
  results = cache.getOrLoadResolvedEntities(this.callCtx, PolarisEntityType.TABLE_LIKE, entityIds);
} else {
  results = cache.getOrLoadResolvedEntities(this.callCtx, List.of(new EntityNameLookupRecord(T6v2)));
}

To match convention with the single-item lookups can we rename these methods to say loadResolvedEntitiesById?

And if the difference is really just whether we pass in a per-entity EntityType, I think even parallel Lists (List<EntityId>, List<EntityType>) would be better than reusing EntityNameLookupRecord just for its catalogId, entityId, entityType.

Alternatively, cleanest would be just having one interface entirely with List<EntityIdAndType> as the input argument.
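For illustration only, an `EntityIdAndType` input key could look something like the sketch below. None of these names exist in the PR: the record, the two-value enum, and the `groupByType` helper are hypothetical, and the helper only shows one way an implementation might split a mixed batch into same-type groups. It assumes Java 16+ for records.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical sketch of a dedicated input key carrying only what a lookup needs. */
final class EntityIdAndTypeSketch {
  /** Reduced stand-in for the real entity type enum. */
  enum EntityType { PRINCIPAL_ROLE, TABLE_LIKE }

  /** Catalog id, entity id and type; no name and no versions. */
  record EntityIdAndType(long catalogId, long id, EntityType type) {}

  /** Groups keys by type so each group could be loaded with a single same-type query. */
  static Map<EntityType, List<EntityIdAndType>> groupByType(List<EntityIdAndType> keys) {
    return keys.stream().collect(Collectors.groupingBy(EntityIdAndType::type));
  }
}
```

Compared with reusing an output record as input, a key like this makes it explicit that the lookup is by id and that the name plays no part in it.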

Another thing to consider is whether we actually want to allow different entity type lookups within a single batch. It may change the atomicity semantics for persistence implementations where different types are in different atomicity domains.

Are there any callsites that actually rely on using this form of the method instead of the one with a single entityType across the whole list of EntityIds?

collado-mike · 2025-10-06

TBH, I can't think of a use case where we would mix EntityTypes in a single call. The two immediate use cases I have in mind are:

Batch loading the principal roles during the authentication step

Support for loading TableMetadata from persistence rather than from cloud storage (this is in concert with "Add properties from TableMetadata into Table entity internalProperties" #2735 and other future PRs).

In both cases, only a single EntityType is loaded. I used the EntityNameLookupRecord type largely because it is the return type for the listEntities API, and because I wanted to avoid fetching all entities in full when many or most of them are already in the cache. Personally, I don't like the pattern of using parallel list parameters for an API, so I would oppose the List<EntityId>, List<EntityType> option. I am ok with a new EntityIdAndType argument, but I'd also be ok with just supporting the one API that takes the EntityType and List<EntityId> arguments and getting rid of the other option until a need arises.

dennishuo · 2025-10-06

Yeah, sounds good. I think it's good to have the more opinionated "all entities in the batch are the same EntityType" method signature for now, as it's easier to ensure different impls can fulfill the interface. Signature (EntityType, List<EntityId>) looks good to me, and let's remove the overloaded method that takes EntityNameLookupRecord.
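As a rough picture of the agreed shape, the sketch below models a single-type batch load with one result slot per requested id. It is a simplified illustration under stated assumptions, not the PolarisMetaStoreManager signature: the real method also takes a call context and returns a ResolvedEntitiesResult, and `BatchLoader`, `InMemoryLoader`, `EntityId`, and `ResolvedEntity` here are hypothetical stand-ins. The null slot follows the "NULL if the entity has been dropped" wording in the quoted Javadoc. It assumes Java 16+ for records.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a batch load where every id in the batch shares one entity type. */
final class SingleTypeBatchLoadSketch {
  /** Reduced stand-in for the real entity type enum. */
  enum EntityType { PRINCIPAL_ROLE, TABLE_LIKE }

  record EntityId(long catalogId, long id) {}

  record ResolvedEntity(EntityId id, EntityType type, String name) {}

  /** Simplified interface in the (EntityType, List<EntityId>) shape discussed above. */
  interface BatchLoader {
    /** Returns one slot per requested id, in request order; null marks a missing entity. */
    List<ResolvedEntity> loadResolvedEntities(EntityType type, List<EntityId> ids);
  }

  /** In-memory implementation used only to make the sketch runnable. */
  static final class InMemoryLoader implements BatchLoader {
    private final Map<EntityId, ResolvedEntity> rows = new HashMap<>();

    void put(ResolvedEntity entity) {
      rows.put(entity.id(), entity);
    }

    @Override
    public List<ResolvedEntity> loadResolvedEntities(EntityType type, List<EntityId> ids) {
      List<ResolvedEntity> out = new ArrayList<>(ids.size());
      for (EntityId id : ids) {
        ResolvedEntity e = rows.get(id);
        // An entity of a different type is treated as not found for this batch.
        out.add(e != null && e.type() == type ? e : null);
      }
      return out;
    }
  }
}
```

Keeping the result aligned with the request order lets a caller pair each id with its slot without needing the type or name on the input side.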

dimas-b

I do not have any concerns with the current state of this PR, but I'd be interested in reviewing again after comments from @dennishuo are resolved :)

dimas-b · 2025-09-30T02:35:20Z

...ava/org/apache/polaris/core/persistence/transactional/TransactionalMetaStoreManagerImpl.java

+                  if (e == null) {
+                    return null;
+                  } else {
+                    // load the grant records


nit: maybe add toResolvedPolarisEntity() as in AtomicOperationMetaStoreManager?
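The kind of extraction suggested here might look like the sketch below: the null check and grant loading move into one small helper that each batch path can map over. This is only an assumption about the shape of such a helper; `BaseEntity`, `GrantRecord`, `ResolvedEntity`, and the `grantLoader` parameter are hypothetical and do not reflect the actual toResolvedPolarisEntity() in AtomicOperationMetaStoreManager. It assumes Java 16+ for records.

```java
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch of pulling the null-check-and-resolve step into a helper. */
final class ToResolvedHelperSketch {
  record BaseEntity(long id, String name) {}

  record GrantRecord(long securableId, String privilege) {}

  record ResolvedEntity(BaseEntity entity, List<GrantRecord> grants) {}

  /** Maps a possibly-null base entity to a resolved one; null passes through unchanged. */
  static ResolvedEntity toResolvedEntity(
      BaseEntity e, Function<BaseEntity, List<GrantRecord>> grantLoader) {
    if (e == null) {
      return null; // entity was not found, so there is nothing to resolve
    }
    // Load the grant records and bundle them with the entity.
    return new ResolvedEntity(e, List.copyOf(grantLoader.apply(e)));
  }
}
```

With a helper like this, the batch code reduces to mapping each loaded entity through it, and the null handling lives in one place.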

dimas-b · 2025-10-09T21:18:46Z

polaris-core/src/main/java/org/apache/polaris/core/persistence/PolarisMetaStoreManager.java

+   *     NULL if the entity has been dropped.
+   */
+  @Nonnull
+  ResolvedEntitiesResult loadResolvedEntities(


The name LGTM, but I guess it does not match the PR description anymore 🤔

dimas-b

Changes LGTM 👍 but I'll defer approval to @dennishuo .

collado-mike requested review from clambertus and dennishuo September 4, 2025 18:28

github-project-automation bot added this to Basic Kanban Board Sep 4, 2025

github-project-automation bot moved this to PRs In Progress in Basic Kanban Board Sep 4, 2025

dimas-b reviewed Sep 4, 2025

collado-mike marked this pull request as draft September 5, 2025 22:18

dennishuo reviewed Sep 5, 2025

polaris-core/src/main/java/org/apache/polaris/core/persistence/PolarisMetaStoreManager.java (outdated)

polaris-core/src/main/java/org/apache/polaris/core/persistence/PolarisMetaStoreManager.java (outdated)

collado-mike force-pushed the mcollado-loadentities-batch branch 2 times, most recently from 1e05a56 to 05d7f49 September 23, 2025 22:29

collado-mike marked this pull request as ready for review September 24, 2025 00:01

collado-mike requested review from dennishuo and dimas-b September 24, 2025 00:01

dennishuo reviewed Sep 29, 2025

dimas-b reviewed Sep 30, 2025

collado-mike added 9 commits October 8, 2025 15:38

e8fee9b Add loadEntities batch call and rename listFullEntities

072bac8 Changed batch call to implement loadResolvedEntities instead

d693e4e Add loadResolvedEntities by id and entity cache support

18fd74e Add additional test for loadResolvedEntities by id

a5e779d Added additional test and updated comments in EntityCache interface

b0a6c38 Add additional constructor to ResolvedEntitiesResult

4fe92d3 Fixed unused method reference

827ec33 Removed loadResolvedEntities method with lookup record param

2289355 Pulled out toResolvedPolarisEntity method per PR comment

collado-mike force-pushed the mcollado-loadentities-batch branch from 6804058 to 2289355 October 9, 2025 21:10

dimas-b reviewed Oct 9, 2025

collado-mike requested a review from dennishuo October 14, 2025 03:14
