Add other_doi_by_id_job queue to queue workers autoscaling policy#271
Add other_doi_by_id_job queue to queue workers autoscaling policy#271wendelfabianchinsamy wants to merge 1 commit intomasterfrom
Conversation
|
@richardhallett this is my initial implementation. I will monitor further to check if we can change thresholds. It seems that adding a new alarm is the route to attempt. I am open to suggestions or opinions on this current approach. |
| alarm_description = "Scale up when the new queue gets large" | ||
| alarm_actions = [aws_appautoscaling_policy.queue-worker_scale_up.arn] | ||
|
|
||
| threshold = "3000000" |
There was a problem hiding this comment.
We're expecting 3 million messages as a baseline? that's a lot
There was a problem hiding this comment.
There are 4 million messages in the queue at the moment. I've set the message lifetime to ten days since these operations are not time sensitive.
richardhallett
left a comment
There was a problem hiding this comment.
The problem with this is I think it's going to conflict with the existing scaling.
Because lets say this triggers and scales up, the existing alarm which checks the counts of the other queues will trigger a scale down I'd imagine.
Yes I've been thinking about this as well. I wonder if we should consider setting the desired count for queue workers to two or maybe having a different deployment of queue workers for this process. The latter seems wasteful. |
|
If it's not time sensitive maybe we don't scale at all, just as you say increase the minimum to be 2 to handle the slightly larger load. Another possibility is you use the existing scale_down metric to include the new queue, that way it'll scale up based on your custom rule, but only scale down if the total across all queues is less than X (5000 currently) Without thinking more about bigger architecture changes I'd be reluctant to have a different deployment of queue-workers. To me the only way that makes sense if it's an isolated service (which is a possibility but more work) |
Yes let's go ahead with setting the desired count to two for now. We can continue monitoring and make a call at a later date. I will keep this pr open for future reference. |
Purpose
closes: datacite/datacite#2453
Approach
Open Questions and Pre-Merge TODOs
Learning
Types of changes
Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Reviewer, please remember our guidelines: