Skip to content

Conversation

@platinumhamburg
Copy link
Contributor

Purpose

Linked issue: close #1914

Brief change log

Tests

API and Format

Documentation


@Override
public CompletableFuture<LookupResult> lookup(InternalRow prefixKey) {
public synchronized CompletableFuture<LookupResult> lookup(InternalRow prefixKey) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using synchronized on the entire lookup method might affect client-side lookup efficiency. May be we could narrow the lock scope to only cover thread-unsafe components instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @beryllw I'll try to optimize that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xx789633 What's your solution to optimize it? IIUC, Flink lookup join operator should call lookup synchronously.. The mutiple thread calling should happen when retry happen which is call in a callback.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @luoyuxia @platinumhamburg @beryllw some members in lookuper are stateful, so I agree that there should be some lock protection for those members.

But what I'm more curious about is: in what scenarios would Flink perform concurrent lookups using the same lookuper instance? Are there any more detailed logs available for that particular bug?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @luoyuxia , I don't think the retry logic would cause concurrent lookups because the result future defined in asyncLookup will only be completed when the retry fails (with resultFuture.completeExceptionally) or successes (resultFuture.complete). In either case, Flink will wait for the result future to finish before its own retry logic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @luoyuxia @platinumhamburg @beryllw some members in lookuper are stateful, so I agree that there should be some lock protection for those members.

But what I'm more curious about is: in what scenarios would Flink perform concurrent lookups using the same lookuper instance? Are there any more detailed logs available for that particular bug?

Hi, @xx789633. This case occurs during the tabletServer restart process (cluster upgrade). Currently, what I have observed is that it can be stably reproduced when restarting the Flink lookup job during the upgrade. I have tried to reproduce it in other scenarios, but without success.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Client] lookup() function in PrimaryKeyLookuper and PrefixKeyLookuper is not thread safety

5 participants