
Add other_doi_by_id_job queue to queue workers autoscaling policy#271

Open
wendelfabianchinsamy wants to merge 1 commit into master from add-other-doi-job-queue-to-queue-workers-auto-scaling-policy

Conversation

@wendelfabianchinsamy
Contributor

Purpose

closes: datacite/datacite#2453

Approach

Open Questions and Pre-Merge TODOs

Learning

Types of changes

  • Bug fix (non-breaking change which fixes an issue)

  • New feature (non-breaking change which adds functionality)

  • Breaking change (fix or feature that would cause existing functionality to change)

Reviewer, please remember our guidelines:

  • Be humble in the language and feedback you give; ask, don't tell.
  • Prefer positive language over neutral when offering feedback, since neutral language can come across as negative.
  • Offer suggestions on how to improve the code, e.g. simplification or improved clarity.
  • Give reasons for the changes you are proposing.

@wendelfabianchinsamy
Contributor Author

@richardhallett this is my initial implementation. I will keep monitoring to see whether the thresholds need adjusting. Adding a new alarm seems like the route to take. I am open to suggestions or opinions on this approach.

alarm_description = "Scale up when the new queue gets large"
alarm_actions = [aws_appautoscaling_policy.queue-worker_scale_up.arn]

threshold = "3000000"
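Read together, these fragments imply a CloudWatch alarm on the new queue's depth, wired to the existing scale-up policy. A minimal sketch of what the full resource might look like (the alarm name, queue name, period, statistic, and evaluation settings are assumptions, not taken from the diff):

```hcl
# Sketch only: names and evaluation settings below are assumed, not from the PR.
resource "aws_cloudwatch_metric_alarm" "other_doi_queue_depth" {
  alarm_name          = "other-doi-by-id-job-queue-depth"
  alarm_description   = "Scale up when the new queue gets large"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  evaluation_periods  = 1
  namespace           = "AWS/SQS"
  metric_name         = "ApproximateNumberOfMessagesVisible"
  period              = 300
  statistic           = "Sum"
  threshold           = "3000000"

  dimensions = {
    # Hypothetical queue name, for illustration.
    QueueName = "other_doi_by_id_job"
  }

  alarm_actions = [aws_appautoscaling_policy.queue-worker_scale_up.arn]
}
```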
Contributor

We're expecting 3 million messages as a baseline? That's a lot.

Contributor Author

There are 4 million messages in the queue at the moment. I've set the message lifetime to ten days, since these operations are not time sensitive.
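For context, a ten-day message lifetime on an SQS queue is set via `message_retention_seconds`. A sketch (the queue name is an assumption):

```hcl
# Sketch only: queue name is assumed.
resource "aws_sqs_queue" "other_doi_by_id_job" {
  name = "other_doi_by_id_job"

  # 10 days, in seconds (SQS allows up to 14 days).
  message_retention_seconds = 864000
}
```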

@richardhallett (Contributor) left a comment

The problem with this is that I think it's going to conflict with the existing scaling.

Let's say this triggers and scales up; the existing alarm, which checks the counts of the other queues, will then trigger a scale down, I'd imagine.

@wendelfabianchinsamy
Contributor Author

> The problem with this is that I think it's going to conflict with the existing scaling.
>
> Let's say this triggers and scales up; the existing alarm, which checks the counts of the other queues, will then trigger a scale down, I'd imagine.

Yes, I've been thinking about this as well. I wonder if we should set the desired count for queue workers to two, or maybe have a separate deployment of queue workers for this process. The latter seems wasteful.

@richardhallett
Contributor

If it's not time sensitive, maybe we don't scale at all; as you say, just increase the minimum to 2 to handle the slightly larger load.

Another possibility is to extend the existing scale_down metric to include the new queue. That way it will scale up based on your custom rule, but only scale down if the total across all queues is less than X (5000 currently).

Without thinking more about bigger architecture changes, I'd be reluctant to have a separate deployment of queue-workers. To me, the only way that makes sense is if it's an isolated service (which is a possibility, but more work).
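The second option can be expressed with CloudWatch metric math: a single scale-down alarm whose expression sums the visible messages across all queues, including the new one. A sketch (queue names, query ids, periods, and evaluation settings are assumptions; only the 5000 threshold comes from the discussion):

```hcl
# Sketch only: queue names and evaluation settings are assumed.
resource "aws_cloudwatch_metric_alarm" "queue_worker_scale_down" {
  alarm_name          = "queue-worker-scale-down"
  comparison_operator = "LessThanThreshold"
  evaluation_periods  = 2
  threshold           = 5000

  # e1 sums the depth of an existing queue plus the new one.
  metric_query {
    id          = "e1"
    expression  = "m1 + m2"
    label       = "Total visible messages across queues"
    return_data = true
  }

  metric_query {
    id = "m1"
    metric {
      namespace   = "AWS/SQS"
      metric_name = "ApproximateNumberOfMessagesVisible"
      period      = 300
      stat        = "Sum"
      dimensions = {
        QueueName = "existing_job_queue" # hypothetical
      }
    }
  }

  metric_query {
    id = "m2"
    metric {
      namespace   = "AWS/SQS"
      metric_name = "ApproximateNumberOfMessagesVisible"
      period      = 300
      stat        = "Sum"
      dimensions = {
        QueueName = "other_doi_by_id_job" # hypothetical
      }
    }
  }

  alarm_actions = [aws_appautoscaling_policy.queue-worker_scale_down.arn]
}
```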

@wendelfabianchinsamy
Contributor Author

> If it's not time sensitive, maybe we don't scale at all; as you say, just increase the minimum to 2 to handle the slightly larger load.
>
> Another possibility is to extend the existing scale_down metric to include the new queue. That way it will scale up based on your custom rule, but only scale down if the total across all queues is less than X (5000 currently).
>
> Without thinking more about bigger architecture changes, I'd be reluctant to have a separate deployment of queue-workers. To me, the only way that makes sense is if it's an isolated service (which is a possibility, but more work).

Yes, let's go ahead with setting the desired count to two for now. We can continue monitoring and make a call at a later date. I will keep this PR open for future reference.
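The agreed interim change is small: raise the floor of the autoscaling target so at least two queue workers are always running. A sketch (the ECS cluster/service names and max_capacity are assumptions):

```hcl
# Sketch only: resource_id and max_capacity are assumed.
resource "aws_appautoscaling_target" "queue_worker" {
  service_namespace  = "ecs"
  resource_id        = "service/default/queue-worker" # hypothetical cluster/service
  scalable_dimension = "ecs:service:DesiredCount"
  min_capacity       = 2
  max_capacity       = 10
}
```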


Development

Successfully merging this pull request may close these issues.

Creating queue worker autoscaling policy for new other doi job queues

2 participants