@l45k l45k commented Jan 3, 2026

If an error occurs while a worker is sending gradients to the parameter server (ps), the update is not added to the list of updates, and the corrupted file is removed.
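The behaviour described above can be sketched roughly as follows. This is a minimal, synchronous illustration using only the standard library, not the code from this PR: the actual parameter server is asynchronous, and `store_update`, its parameters, and the `Vec<PathBuf>` list of updates are hypothetical names chosen for the example.

```rust
use std::fs::{self, File};
use std::io::{self, Read};
use std::path::{Path, PathBuf};

/// Hypothetical helper: stream `reader` into `file_path` and register the
/// path in `updates` only if the whole transfer succeeded.
fn store_update<R: Read>(
    mut reader: R,
    file_path: &Path,
    updates: &mut Vec<PathBuf>,
) -> io::Result<()> {
    // Create the file and copy the incoming bytes into it.
    let result = File::create(file_path).and_then(|mut file| io::copy(&mut reader, &mut file));

    match result {
        Ok(_) => {
            // Only a fully written file is added to the list of updates.
            updates.push(file_path.to_path_buf());
            Ok(())
        }
        Err(err) => {
            // Best-effort cleanup of the partially written (corrupted) file;
            // a failure to remove it is deliberately ignored here.
            let _ = fs::remove_file(file_path);
            Err(err)
        }
    }
}
```

The point of the sketch is the ordering: the path is pushed onto the list only in the success branch, so a failed transfer can never be picked up later as a valid update.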

@l45k l45k requested a review from orlandohohmeier January 3, 2026 13:33

codecov bot commented Jan 3, 2026

Codecov Report

❌ Patch coverage is 0% with 21 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| crates/worker/src/executor/parameter_server.rs | 0.00% | 21 Missing ⚠️ |


updates_notify.notify_one();
}
Err(err) => {
tracing::error!(error = %err, file = %file_path.display(), "Failed to write received update");

Can you adjust the error message? This branch is hit both when the write fails and when the receive fails, because the fixed-size asynchronous reader of the streaming protocol checks the number of bytes received against the expected number and returns an error if not everything was sent.
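To illustrate why a short receive ends up in the same error branch as a failed write, here is a minimal synchronous sketch of a fixed-size copy. It is an assumption-laden stand-in for the asynchronous reader mentioned above, not its actual implementation; `copy_exact` and its parameters are hypothetical.

```rust
use std::io::{self, Read, Write};

/// Hypothetical helper: copy at most `expected` bytes from `reader` into
/// `writer` and fail if fewer bytes arrived than were announced.
fn copy_exact<R: Read, W: Write>(reader: R, writer: &mut W, expected: u64) -> io::Result<u64> {
    // `take` caps the read at `expected` bytes; `io::copy` stops at EOF.
    let received = io::copy(&mut reader.take(expected), writer)?;

    if received != expected {
        // The sender stopped early: report it as an I/O error so the caller
        // sees it through the same `Err` path as a failed write.
        return Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            format!("received {received} of {expected} expected bytes"),
        ));
    }
    Ok(received)
}
```

Since both failure modes surface as the same `Err`, a log message along the lines of "Failed to receive or write update" would cover either case; the exact wording is only a suggestion.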

orlandohohmeier previously approved these changes Jan 3, 2026
@l45k l45k merged commit e75a116 into alpha Jan 3, 2026
7 of 8 checks passed
@l45k l45k deleted the leo/fix_ps branch January 3, 2026 16:05

github-actions bot commented Jan 3, 2026

🎉 This PR is included in version 1.0.0-alpha.72 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

3 participants