Consider starting the renewal of messages in DTFx.Core as soon as they are fetched

In DTFx.Core, the method `RenewTaskOrchestrationWorkItemLockAsync` is used to ensure that a given worker maintains exclusivity over a given partition message. For example, in the Azure Storage backend, this method renews the message's visibility timeout so that the message does not get dequeued again, at least until the renewed timeout expires.

This renewal flow is invoked when the message is being **processed**, which has a very specific meaning: we have not exceeded the "maxConcurrentOrchestrations" / "maxConcurrentActivities" limit, and therefore have enough capacity to process more messages.

This means that a message may be received by a given worker but not become processable for a long time if the active orchestrators/activities have reached their "max concurrent" limits and are long-running. During that time we are not extending the message's visibility timeout, so the message can become visible again and be dequeued a second time (possibly by the same worker that already holds it!). The second dequeue changes the message's `popReceipt`, which in turn prevents us from successfully processing the copy of the message that carries the old `popReceipt`. This can lead to a cascade of errors.
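To make the failure mode concrete, here is a minimal sketch of it in Python. It is a toy in-memory model, not DTFx or Azure Storage SDK code: `ToyQueue`, its methods, and the 30-tick timeout are all hypothetical names and values chosen for illustration. It only assumes the behaviour described above, namely that each dequeue issues a fresh pop receipt and that operations with a stale receipt are rejected.

```python
import itertools


class ToyQueue:
    """In-memory stand-in for a queue with visibility timeouts (hypothetical, not a real SDK)."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self.now = 0  # logical clock, advanced manually by the caller
        self._receipts = itertools.count(1)
        self._messages = {}  # message id -> [body, visible_at, current pop receipt]

    def enqueue(self, msg_id, body):
        self._messages[msg_id] = [body, 0, None]

    def dequeue(self):
        """Return (msg_id, pop_receipt) for the first visible message, or None."""
        for msg_id, entry in self._messages.items():
            if entry[1] <= self.now:
                entry[1] = self.now + self.visibility_timeout  # hide the message again
                entry[2] = next(self._receipts)  # every dequeue issues a new receipt
                return msg_id, entry[2]
        return None

    def delete(self, msg_id, pop_receipt):
        """Delete succeeds only when the caller holds the latest pop receipt."""
        entry = self._messages.get(msg_id)
        if entry is None or entry[2] != pop_receipt:
            return False
        del self._messages[msg_id]
        return True


def simulate_stale_receipt():
    queue = ToyQueue(visibility_timeout=30)
    queue.enqueue("m1", "start orchestration")
    msg_id, first_receipt = queue.dequeue()  # fetched, but no processing slot is free
    queue.now += 31  # waits behind the concurrency limit with no renewal
    _, second_receipt = queue.dequeue()  # message became visible and is dequeued again
    stale_ok = queue.delete(msg_id, first_receipt)  # the first copy can no longer complete
    fresh_ok = queue.delete(msg_id, second_receipt)
    return stale_ok, fresh_ok
```

In this model, `simulate_stale_receipt()` reports that the delete with the first receipt is rejected while the delete with the second receipt succeeds, which mirrors the situation where the originally fetched copy can no longer be completed.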

I believe the framework-level fix for this is to start renewing messages as soon as they're fetched/received, not just when they're being processed. This may require some refactoring in DTFx.Core's `WorkItemDispatcher` class, so it needs to be done with care.
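The effect of the proposed change can be sketched with a small tick-based model. This is an illustration under stated assumptions, not a description of how `WorkItemDispatcher` works: the function name, the tick units, and the fixed renewal interval are all hypothetical.

```python
def lock_held_when_processing_starts(wait_ticks, visibility_timeout,
                                     renew_every, renew_on_fetch):
    """Return True if a message fetched at tick 0 is still invisible to other
    consumers when a processing slot frees up after `wait_ticks` ticks."""
    visible_at = visibility_timeout  # deadline set by the initial fetch
    for tick in range(1, wait_ticks + 1):
        if tick >= visible_at:
            # The timeout lapsed while waiting; another dequeue could now
            # change the pop receipt.
            return False
        if renew_on_fetch and tick % renew_every == 0:
            # Renewal started at fetch time keeps pushing the deadline out.
            visible_at = tick + visibility_timeout
    return True


# Current behaviour in this model: no renewal while waiting for a slot.
without_renewal = lock_held_when_processing_starts(
    wait_ticks=100, visibility_timeout=30, renew_every=10, renew_on_fetch=False)

# Proposed behaviour: renewal begins as soon as the message is fetched.
with_renewal = lock_held_when_processing_starts(
    wait_ticks=100, visibility_timeout=30, renew_every=10, renew_on_fetch=True)
```

With these example numbers the lock is lost in the first case and kept in the second. The model only holds when the renewal interval is shorter than the visibility timeout, which any real implementation would also need to guarantee.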

Consider starting the renewal of messages in DTFx.Core as soon as they are fetched #1150