Contributor

@ShahabT ShahabT commented Dec 11, 2025

What changed?

Add concurrency test and refine some rough edges.

Why?

testing is good!

How did you test it?

  • built
  • run locally and tested manually
  • covered by existing tests
  • added new unit test(s)
  • added new functional test(s)

Potential risks

None

@ShahabT ShahabT requested review from a team as code owners December 11, 2025 07:45
Comment on lines -207 to -215
// First ensure deployment workflow is running
if !d.VersionState.StartedDeploymentWorkflow {
	activityCtx := workflow.WithActivityOptions(ctx, defaultActivityOptions)
	err := workflow.ExecuteActivity(activityCtx, d.a.StartWorkerDeploymentWorkflow, &deploymentspb.StartWorkerDeploymentRequest{
		DeploymentName: d.VersionState.Version.DeploymentName,
		RequestId:      d.newUUID(ctx),
	}).Get(ctx, nil)
	if err != nil {
		return err
Member

I wonder: could the deployment workflow ever continue-as-new (CaN) while the version workflow is processing an update?

In theory, this could happen. Imagine this scenario:

  1. The user sends a setCurrent update.
  2. The deployment workflow starts the async operation that syncs the change to the task queues by calling the version workflow.
  3. When the version workflow reaches this point, it sends a signal.
  4. The deployment workflow continues-as-new because, from its own point of view, the state has changed.

I am not fully familiar with the signal-sending semantics while a workflow is continuing-as-new, but this could be important to consider, so I'm noting it here.

Contributor Author

Yes, it's possible, but CaN is atomic: there is no gap between closing the old execution and creating the new one. Both happen at the same time.

So the signal will go to either the old or the new execution; it can never see "not found".

Even if that were not the case, this check would not protect against the scenario you describe, because the CaN can happen right after the check and before the signal.
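
To make the atomicity argument above concrete, here is a minimal stdlib-only sketch. It is a toy model, not Temporal's implementation: `registry`, `run`, `continueAsNew`, `signal`, and `raceSignalsAgainstCaN` are all hypothetical names invented for illustration. The only assumption carried over from the thread is that closing the old execution and creating the new one happen as a single step, which the model represents with one mutex-guarded swap.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// run is a hypothetical stand-in for one workflow execution.
type run struct {
	id      int
	signals int
}

// registry is a hypothetical stand-in for the record of which run
// is current for a single workflow ID.
type registry struct {
	mu      sync.Mutex
	current *run
}

var errNotFound = errors.New("workflow not found")

// continueAsNew replaces the current run with a fresh one under a single
// lock, so no caller can observe a moment with no current run.
func (r *registry) continueAsNew() {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.current = &run{id: r.current.id + 1}
}

// signal delivers to whichever run is current and returns that run's ID.
func (r *registry) signal() (int, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.current == nil {
		return 0, errNotFound
	}
	r.current.signals++
	return r.current.id, nil
}

// raceSignalsAgainstCaN fires n signals and n continue-as-new calls
// concurrently and reports how many signals failed versus landed.
func raceSignalsAgainstCaN(n int) (notFound, delivered int) {
	r := &registry{current: &run{id: 1}}
	var wg sync.WaitGroup
	var mu sync.Mutex // guards the two counters
	for i := 0; i < n; i++ {
		wg.Add(2)
		go func() {
			defer wg.Done()
			r.continueAsNew()
		}()
		go func() {
			defer wg.Done()
			_, err := r.signal()
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				notFound++
			} else {
				delivered++
			}
		}()
	}
	wg.Wait()
	return notFound, delivered
}

func main() {
	notFound, delivered := raceSignalsAgainstCaN(1000)
	fmt.Println("not found:", notFound, "delivered:", delivered)
	// prints "not found: 0 delivered: 1000"
}
```

Which run receives a given signal varies from one execution to the next, but in this model every signal lands on some run. That is also why a pre-check before the signal adds nothing here: the swap can still occur between the check and the send.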

Comment on lines +352 to +353
d.logger.Error("Update canceled before worker deployment workflow started")
return serviceerror.NewDeadlineExceeded("Update canceled before worker deployment workflow started")
Member

nit: Can improve error message by saying something like:

Update canceled since the corresponding worker deployment workflow for this version did not start.

// workflowVersion is set at workflow start based on the dynamic config of the worker
// that completes the first task. It remains constant for the lifetime of the run and
// only updates when the workflow performs continue-as-new.
// Tracks the version of the deployment workflow when a particular run of a workflow starts base on the dynamic config of the
Member

run of a workflow starts based on the dynamic config of the
worker that completes the first task of the workflow. workflowVersion remains the same until the workflow CaNs, after which it will get another chance to pick the latest manager version.

Contributor Author

This part was undone unintentionally. I will revert to the old comment, which was your suggestion.
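
The behavior the comment describes (a version fixed when a run starts and re-read only on continue-as-new) can be sketched as follows. This is an illustrative model only: `dynamicConfig`, `workflowRun`, `startRun`, and `continueAsNew` are hypothetical names, and the sketch does not reproduce how the real workflow reads dynamic config.

```go
package main

import "fmt"

// dynamicConfig is a hypothetical holder for the manager version
// that workers currently read.
type dynamicConfig struct {
	managerVersion int
}

// workflowRun models a single run; workflowVersion is captured once at
// start and is not updated for the rest of the run.
type workflowRun struct {
	workflowVersion int
}

// startRun captures the version visible in config at the moment the run starts.
func startRun(cfg *dynamicConfig) workflowRun {
	return workflowRun{workflowVersion: cfg.managerVersion}
}

// continueAsNew begins a fresh run, which gets another chance to pick up
// the latest version from config.
func (w workflowRun) continueAsNew(cfg *dynamicConfig) workflowRun {
	return startRun(cfg)
}

func main() {
	cfg := &dynamicConfig{managerVersion: 1}
	run := startRun(cfg)

	cfg.managerVersion = 2           // config changes mid-run
	fmt.Println(run.workflowVersion) // prints 1: the running run keeps its version

	run = run.continueAsNew(cfg)
	fmt.Println(run.workflowVersion) // prints 2: the new run picks up the change
}
```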

Comment on lines +779 to +782
if !asyncMode {
	// Erase summary drainage status immediately, so it is not draining/drained.
	d.setDrainageStatus(newRampingVersion, enumspb.VERSION_DRAINAGE_STATUS_UNSPECIFIED, routingUpdateTime)
}
Member

Slightly confused: I think we used to change the version's drainage status almost immediately once we realized it was being promoted. Did we delete that by mistake?

Contributor Author

No need for this, because syncVersion already takes care of updating the summary.

Comment on lines 43 to 48
//t.Run("sync", func(t *testing.T) {
// suite.Run(t, &WorkerDeploymentSuite{workflowVersion: workerdeployment.InitialVersion})
//})
//t.Run("async", func(t *testing.T) {
// suite.Run(t, &WorkerDeploymentSuite{workflowVersion: workerdeployment.AsyncSetCurrentAndRamping})
//})
Member

Reminder: remove the commented-out code here.

3 participants