Add note to use s5cmd for large transfers to LTS #948

Closed
mdefende wants to merge 1 commit into uabrc:main from mdefende:enh-add-note-to-use-s5cmd-for-large-transfers

Conversation

@mdefende
Member

Added a note at the top of the Globus section on LTS's interfaces explaining that large data should be transferred using s5cmd instead of Globus. We've received a number of tickets asking about slow transfers when using Globus. Many people probably use Globus because it's the first option listed on the page and is generally the easiest to use, so the note directs them to s5cmd for larger data transfers.

mdefende added the "pr: review" (PR is ready for review) label Mar 27, 2025
Contributor

wwarriner left a comment

This is true, but should be temporary. The issue should be resolved in a couple of weeks or so when Ceph backports a bugfix to our version. Please note the temporary nature and we'll get this merged.

Additionally, this affects all data going in or out of LTS right now, to any other endpoint.

Thank you for this.

wwarriner added the "pr: changes requested" (Review complete, needs changes) label and removed the "pr: review" (PR is ready for review) label Mar 28, 2025
@mdefende
Member Author

I'm not sure this is true. I know there's a slowdown due to that bug, but Globus has always been slower than the CLI at transferring data to LTS. For instance, on Friday I tested a transfer of 390 GB of data using both s5cmd and Globus. s5cmd with a modest worker setup (12 cores split into 4 workers with 3 concurrent transfers) transferred that amount of data in around 30-40 minutes, while Globus took 3 hours. Maybe Globus had access to much less compute for copying those files than the s5cmd task did, but I'm not confident that Globus will immediately become as performant as s5cmd once the bug is patched.
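
For a rough sense of scale, the timings above can be converted to average throughput. This is only a back-of-the-envelope sketch: it assumes decimal units (1 GB = 1000 MB), treats "around 30-40 minutes" as exact bounds, and the helper name `throughput_mb_per_s` is made up for illustration.

```python
# Figures quoted in the comment above.
# Assumption: decimal units, 1 GB = 1000 MB.
DATA_GB = 390


def throughput_mb_per_s(size_gb: float, minutes: float) -> float:
    """Average throughput in MB/s for moving size_gb in the given number of minutes."""
    return size_gb * 1000 / (minutes * 60)


s5cmd_fast = throughput_mb_per_s(DATA_GB, 30)   # s5cmd, shortest reported time
s5cmd_slow = throughput_mb_per_s(DATA_GB, 40)   # s5cmd, longest reported time
globus = throughput_mb_per_s(DATA_GB, 180)      # Globus, 3 hours

print(f"s5cmd:  {s5cmd_slow:.1f} to {s5cmd_fast:.1f} MB/s")
print(f"Globus: {globus:.1f} MB/s")
print(f"Ratio:  {s5cmd_slow / globus:.1f}x to {s5cmd_fast / globus:.1f}x")
```

Under those assumptions, s5cmd averaged roughly 160-220 MB/s against about 36 MB/s for Globus, a 4.5x to 6x difference for this single test.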

@wwarriner
Contributor

I'm going to change the wording for this in the PR to something like:

Globus makes data transfer simple and robust, with an easy-to-use graphical interface. s5cmd has the potential to transfer data more rapidly than Globus, but requires use of the command line. If you need large amounts of data transferred quickly, consider using s5cmd.

With appropriate URLs. We'll need to get the s5cmd documentation in before we recommend its use, so I'm going to pause this one until we can get #548 in.

I'll go ahead and move #956 forward, or something like it, in the meantime.

@wwarriner
Contributor

Closing in favor of #965

wwarriner closed this Apr 16, 2025

Labels

pr: changes requested (Review complete, needs changes)

2 participants