fix(queue): share one ioredis connection across BullMQ queues and workers #1009
Open
LorenzoGalassi wants to merge 1 commit into
BullMQ instantiates a fresh ioredis client per Queue/Worker when handed a
plain {host, port} config object, and under sustained ZIM ingestion the
embed pipeline leaked ~1 client/sec until Redis maxclients was exhausted.
Pass a single shared ioredis instance (maxRetriesPerRequest: null, as
required by BullMQ) so all queues and workers reuse one client pool.
Workers still duplicate the connection once for their blocking client,
which is expected and bounded.
Closes Crosstalk-Solutions#885
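The shared-connection setup described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the names `QueueRedisOptions` and `buildRedisOptions`, the `REDIS_HOST` and `REDIS_PORT` variables, the defaults, and the `embed` queue name are all assumptions. Only `REDIS_DB` and `maxRetriesPerRequest: null` come from the PR text.

```typescript
// Hypothetical sketch: the type and function names here are illustrative
// and are not taken from the PR's diff.
interface QueueRedisOptions {
  host: string
  port: number
  db: number
  // Per the PR text, BullMQ requires null here for the connection it is handed.
  maxRetriesPerRequest: null
}

// Build the option set for the single shared client from environment variables.
// REDIS_HOST and REDIS_PORT are assumed names; REDIS_DB is mentioned in the PR.
function buildRedisOptions(env: Record<string, string | undefined>): QueueRedisOptions {
  const port = Number.parseInt(env.REDIS_PORT ?? '6379', 10)
  const db = Number.parseInt(env.REDIS_DB ?? '0', 10)
  return {
    host: env.REDIS_HOST ?? '127.0.0.1',
    // Fall back to the defaults if a variable is set but not numeric.
    port: Number.isNaN(port) ? 6379 : port,
    db: Number.isNaN(db) ? 0 : db,
    maxRetriesPerRequest: null,
  }
}

// With ioredis and bullmq installed, the options would be used along these lines:
//   import { Redis } from 'ioredis'
//   import { Queue, Worker } from 'bullmq'
//   const connection = new Redis(buildRedisOptions(process.env))
//   export const queueConfig = { connection }
//   new Queue('embed', queueConfig)            // reuses the shared client
//   new Worker('embed', handler, queueConfig)  // duplicates it once for blocking calls
```

The point of the sketch is that the `Redis` instance is constructed once and the same object is handed to every `Queue` and `Worker`, rather than each of them receiving a `{host, port}` object and building its own client.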
Summary
Closes #885
`config/queue.ts` hands BullMQ a plain `{host, port}` connection object. BullMQ treats that as a recipe: every `Queue` and `Worker` instantiates its own ioredis client from it, and script commands executed against those clients can spawn further short-lived connections. As documented in #885, under sustained ZIM ingestion this leaks ~1 client/sec on the admin process until Redis `maxclients` (10k) is exhausted in ~2–3 hours, forcing periodic admin restarts.

This PR implements the fix direction suggested in the issue, which is the BullMQ-documented production pattern of passing a single shared ioredis instance so all Queues and Workers reuse one client pool:

- `admin/config/queue.ts`: construct one shared `Redis` instance (preserving the `REDIS_DB` support from #939, "feat(config): respect REDIS_DB env var for queue and transmit") and export it as `queueConfig.connection`. `maxRetriesPerRequest: null` is set, as BullMQ requires for shared connections.
- `admin/package.json` / lockfile: add `ioredis@5.10.1` as a direct dependency (previously only transitive via bullmq; pinned to the version already in the lockfile).
- `admin/app/services/queue_service.ts`: comment-only update. The singleton rationale from #877 ("fix(queue): singleton QueueService to stop ioredis connection leak") still applies, but the per-Queue connection-count claim is no longer accurate with a shared instance.

No call sites change: `QueueService` and `commands/queue/work.ts` already read `queueConfig.connection`, so both the web process and the worker process pick up the shared instance automatically. Workers duplicate the connection once for their blocking client, which is expected and bounded (1 per worker).

Testing

`npm run typecheck` and `eslint` pass clean.

Synthetic steady-state test against a throwaway `redis:7-alpine`, simulating NOMAD's topology (7 queues + 7 workers at the same concurrency settings, sustained `add` + `getJobCounts` + `getJobs` dispatch for 45 s, sampling `CLIENT LIST` every 5 s):

[Results table not recovered; the only surviving row label is "`{host, port}` object (current)".]

The post-fix steady state matches the ~20–30 client expectation from the issue once scaled to the full app. Caveat, in the interest of full transparency: the synthetic run did not reproduce the unbounded ~1 client/sec growth; that appears to need the real embed pipeline's dispatch pattern. Verification on a NOMAD instance under sustained multi-batch ZIM ingestion (the #885 repro: watch `docker exec nomad_redis redis-cli CLIENT LIST | wc -l` over ~10 min of embedding) would therefore be a valuable confirmation before merge.

Suggested release note