[hotfix][s3] Improve interruption handling for s5cmd #26576

Open
wants to merge 1 commit into base: master

pnowojski
Contributor

Small hotfix: Flink was not waiting for the s5cmd process to finish after an interruption. s5cmd was being cancelled, but Flink did not wait for the cancellation to complete and incorrectly logged a warning.
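
To make the description concrete, here is a minimal sketch of the general pattern, not the actual Flink change: when the thread waiting on a child process is interrupted, it destroys the process, waits a bounded time for it to exit, and only then restores the interrupt flag. The names `InterruptibleProcessWaiter` and `waitOrKill` are hypothetical, the Unix `sleep` command stands in for `s5cmd`, and the 1000 ms wait is just an example value.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; not part of Flink.
class InterruptibleProcessWaiter {

    /**
     * Waits for the process to exit. If the calling thread is interrupted,
     * destroys the process and waits up to killWaitMs for it to terminate.
     *
     * @return true if the process was observed to exit within the allowed time
     */
    static boolean waitOrKill(Process process, long killWaitMs) {
        try {
            process.waitFor();
            return true;
        } catch (InterruptedException e) {
            // Ask the child to stop, then give it a bounded time to do so.
            process.destroy();
            boolean exited = false;
            try {
                exited = process.waitFor(killWaitMs, TimeUnit.MILLISECONDS);
            } catch (InterruptedException ignored) {
                // Interrupted again while waiting; fall through to the forced kill.
            }
            if (!exited) {
                process.destroyForcibly();
            }
            // Restore the interrupt flag so callers can still observe the cancellation.
            Thread.currentThread().interrupt();
            return exited;
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumes a Unix-like system where `sleep` is on the PATH.
        Process p = new ProcessBuilder("sleep", "30").start();
        Thread.currentThread().interrupt(); // simulate a cancellation
        boolean exited = waitOrKill(p, 1000L);
        System.out.println("exited=" + exited + ", interrupted=" + Thread.interrupted());
    }
}
```

A caller would log a warning only when `waitOrKill` returns false, that is, when the process did not stop within the allowed time.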

Verifying this change

N/A

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@pnowojski pnowojski requested a review from rkhachatryan May 19, 2025 12:50
Contributor

@rkhachatryan rkhachatryan left a comment

LGTM

@flinkbot
Collaborator

flinkbot commented May 19, 2025

CI report:

Bot commands
The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

@@ -81,7 +81,7 @@ public class FlinkS3FileSystem extends HadoopFileSystem
         implements EntropyInjectingFileSystem, PathsCopyingFileSystem {
     private static final Logger LOG = LoggerFactory.getLogger(FlinkS3FileSystem.class);
 
-    private static final long PROCESS_KILL_SLEEP_TIME_MS = 50L;
+    private static final long PROCESS_KILL_SLEEP_TIME_MS = 1000L;
Contributor

@davidradl davidradl May 19, 2025

The contribution guide at https://flink.apache.org/how-to-contribute/contribute-code/ says about hotfixes: "Note: trivial hot fixes such as typos or syntax errors can be opened as a [hotfix] pull request, without a Jira ticket."

I do not think this change is that trivial, because it changes how the thread's interrupt status interacts with thread sleeps. I suggest raising a Jira ticket that describes the problem you are trying to fix.

Please could you also add a comment explaining why you are increasing this sleep time 20-fold (from 50 ms to 1000 ms).
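
For readers unfamiliar with the interaction mentioned here, the following standalone sketch shows standard Java behaviour, independent of this pull request: `Thread.sleep` throws `InterruptedException` immediately when the thread's interrupt flag is already set, and the flag is cleared when the exception is thrown. The class and method names are hypothetical.

```java
// Standalone illustration of Java interrupt semantics; not taken from the pull request.
class SleepInterruptDemo {

    /** Returns true if Thread.sleep was cut short by an InterruptedException. */
    static boolean sleepWasInterrupted(long millis) {
        try {
            Thread.sleep(millis);
            return false;
        } catch (InterruptedException e) {
            // The interrupt flag has already been cleared at this point.
            return true;
        }
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt(); // set the flag before sleeping
        boolean cutShort = sleepWasInterrupted(1000L);
        boolean flagAfter = Thread.currentThread().isInterrupted();
        // prints "cutShort=true, flagAfter=false"
        System.out.println("cutShort=" + cutShort + ", flagAfter=" + flagAfter);
    }
}
```

Because the flag is cleared, code that sleeps while cleaning up after an interruption has to decide explicitly whether and when to call `Thread.currentThread().interrupt()` again, which is why a change to such a sleep is not purely cosmetic.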


4 participants