Refactor ReindexByDoiJob to include Shoryuken::Worker and update perform method#1495
Refactor ReindexByDoiJob to include Shoryuken::Worker and update perform method#1495codycooperross merged 3 commits intomasterfrom
Conversation
📝 WalkthroughWalkthroughReindexByDoiJob now includes Shoryuken worker integration, adds Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/jobs/reindex_by_doi_job.rb`:
- Around line 8-10: The perform signature currently def perform(sqs_message =
nil, doi_id = nil) causes single-argument calls (perform_later/perform_now) to
place the DOI into sqs_message and leave doi_id nil; update perform in
ReindexByDoiJob so it normalizes inputs: if doi_id is nil then detect whether
sqs_message is the DOI string (e.g., String) or a Shoryuken/SQS payload (e.g.,
Hash with "body" or :body) and set doi_id accordingly before calling
Doi.find_by(doi: doi_id); this preserves support for perform_later/perform_now
and direct Shoryuken invocation while keeping calls to Doi.find_by unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 67b5ecd6-357c-47e1-8bf1-9cb52e7f4d67
📒 Files selected for processing (1)
app/jobs/reindex_by_doi_job.rb
app/jobs/reindex_by_doi_job.rb
Outdated
| def perform(sqs_message = nil, doi_id = nil) | ||
| doi = Doi.find_by(doi: doi_id) | ||
| return unless doi.present? |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Find all places where ReindexByDoiJob is invoked to verify argument compatibility
echo "=== Direct invocations of ReindexByDoiJob ==="
rg -n -C3 'ReindexByDoiJob\.(perform_later|perform_now|perform_async|set)' --type ruby
echo ""
echo "=== Test file contents ==="
cat spec/jobs/reindex_by_doi_job_spec.rb 2>/dev/null || echo "Test file not found"Repository: datacite/lupo
Length of output: 2860
🏁 Script executed:
cat -n app/jobs/reindex_by_doi_job.rbRepository: datacite/lupo
Length of output: 720
🏁 Script executed:
rg -n 'ReindexByDoiJob' --type ruby -B2 -A2Repository: datacite/lupo
Length of output: 1931
Critical: Method signature breaks all existing perform_later and perform_now invocations.
The current signature def perform(sqs_message = nil, doi_id = nil) causes all single-argument invocations to fail. When called with perform_later(doi_string) or perform_now(doi_string), the DOI value is assigned to the sqs_message parameter, leaving doi_id as nil. This causes Doi.find_by(doi: nil) to return nil, and the job silently exits without processing.
All three test invocations in spec/jobs/reindex_by_doi_job_spec.rb are broken:
- Line 8:
ReindexByDoiJob.perform_later(datacite_doi.doi) - Line 17:
ReindexByDoiJob.perform_now(datacite_doi.doi) - Line 25:
ReindexByDoiJob.perform_now(other_doi.doi)
Proposed fix to support both ActiveJob and Shoryuken
If this job needs to support both ActiveJob (perform_later) and direct Shoryuken invocation, detect how it was invoked:
def perform(sqs_message = nil, doi_id = nil)
+ # When called via perform_later, sqs_message contains the doi_id
+ # When called directly by Shoryuken, doi_id is the second argument
+ resolved_doi_id = doi_id || sqs_message
- doi = Doi.find_by(doi: doi_id)
+ doi = Doi.find_by(doi: resolved_doi_id)Alternatively, if this job is only invoked via Shoryuken, update all callers to pass the second argument:
- subject(:job) { ReindexByDoiJob.perform_later(datacite_doi.doi) }
+ subject(:job) { ReindexByDoiJob.perform_later(nil, datacite_doi.doi) }and
- ReindexByDoiJob.perform_now(datacite_doi.doi)
+ ReindexByDoiJob.perform_now(nil, datacite_doi.doi)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def perform(sqs_message = nil, doi_id = nil) | |
| doi = Doi.find_by(doi: doi_id) | |
| return unless doi.present? | |
| def perform(sqs_message = nil, doi_id = nil) | |
| # When called via perform_later, sqs_message contains the doi_id | |
| # When called directly by Shoryuken, doi_id is the second argument | |
| resolved_doi_id = doi_id || sqs_message | |
| doi = Doi.find_by(doi: resolved_doi_id) | |
| return unless doi.present? |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/jobs/reindex_by_doi_job.rb` around lines 8 - 10, The perform signature
currently def perform(sqs_message = nil, doi_id = nil) causes single-argument
calls (perform_later/perform_now) to place the DOI into sqs_message and leave
doi_id nil; update perform in ReindexByDoiJob so it normalizes inputs: if doi_id
is nil then detect whether sqs_message is the DOI string (e.g., String) or a
Shoryuken/SQS payload (e.g., Hash with "body" or :body) and set doi_id
accordingly before calling Doi.find_by(doi: doi_id); this preserves support for
perform_later/perform_now and direct Shoryuken invocation while keeping calls to
Doi.find_by unchanged.
There was a problem hiding this comment.
🧹 Nitpick comments (2)
app/jobs/reindex_by_doi_job.rb (2)
6-7: Redundant/conflicting queue declarations may cause confusion.You have both
queue_as :lupo_background(ActiveJob) andshoryuken_options queue:(Shoryuken). Looking atOtherDoiJobin the codebase, it only usesshoryuken_optionswithoutqueue_as. WithShoryuken.active_job_queue_name_prefixing = trueset in the initializer, ActiveJob will prefix thequeue_asvalue, whileshoryuken_optionshas its own explicit prefix logic.Consider removing
queue_as :lupo_backgroundto match the pattern used byOtherDoiJoband avoid potential queue routing ambiguity.♻️ Suggested fix
class ReindexByDoiJob < ApplicationJob include Shoryuken::Worker - queue_as :lupo_background shoryuken_options queue: -> { "#{ENV["RAILS_ENV"]}_lupo_background" }, auto_delete: true🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/jobs/reindex_by_doi_job.rb` around lines 6 - 7, Remove the redundant ActiveJob queue declaration to avoid conflicting queue naming: delete the call to queue_as :lupo_background from ReindexByDoiJob so it only uses the Shoryuken-specific shoryuken_options queue: -> { "#{ENV["RAILS_ENV"]}_lupo_background" } configuration (matching OtherDoiJob) and rely on the existing Shoryuken.active_job_queue_name_prefixing logic instead.
9-11: Remove unusedsqs_messageparameter or document its purpose.The
sqs_messageparameter is declared but never used in the method body. All current callers passnilfor this argument, and there are no production code invocations of this job outside the test suite.If this job is strictly for Shoryuken-only usage, remove the unused parameter. If
sqs_messageis intentionally reserved for future queue-specific logic, document this clearly.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/jobs/reindex_by_doi_job.rb` around lines 9 - 11, The perform method in ReindexByDoiJob declares an unused sqs_message parameter (perform(sqs_message = nil, doi_id = nil)) — either remove sqs_message from the signature and update tests/call sites to call perform(doi_id) (or perform_async with only doi_id), or keep it but mark it as intentionally unused by renaming to _sqs_message and adding a short comment documenting that it’s reserved for Shoryuken/SQS metadata; make the change in perform and ensure the Doi.find_by(doi: doi_id) usage remains unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@app/jobs/reindex_by_doi_job.rb`:
- Around line 6-7: Remove the redundant ActiveJob queue declaration to avoid
conflicting queue naming: delete the call to queue_as :lupo_background from
ReindexByDoiJob so it only uses the Shoryuken-specific shoryuken_options queue:
-> { "#{ENV["RAILS_ENV"]}_lupo_background" } configuration (matching
OtherDoiJob) and rely on the existing Shoryuken.active_job_queue_name_prefixing
logic instead.
- Around line 9-11: The perform method in ReindexByDoiJob declares an unused
sqs_message parameter (perform(sqs_message = nil, doi_id = nil)) — either remove
sqs_message from the signature and update tests/call sites to call
perform(doi_id) (or perform_async with only doi_id), or keep it but mark it as
intentionally unused by renaming to _sqs_message and adding a short comment
documenting that it’s reserved for Shoryuken/SQS metadata; make the change in
perform and ensure the Doi.find_by(doi: doi_id) usage remains unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 80515640-eed9-4f63-8c00-20a05fc395f7
📒 Files selected for processing (2)
app/jobs/reindex_by_doi_job.rbspec/jobs/reindex_by_doi_job_spec.rb
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
spec/jobs/reindex_by_doi_job_spec.rb (1)
10-14: Add a regression for the Shoryuken queue binding.This assertion only covers
queue_as; it will still pass if the newshoryuken_optionsqueue points at the wrong SQS queue. Since the production change is the Shoryuken integration, a small spec around the resolved worker queue name would make this much safer.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@spec/jobs/reindex_by_doi_job_spec.rb` around lines 10 - 14, The spec currently only checks ActiveJob's queue_as but misses the Shoryuken binding; add an assertion that the worker's Shoryuken queue is set to the same resolved queue name. In the spec for ReindexByDoiJob, after the existing enqueue assertion, add a line that inspects the job class's Shoryuken options (e.g. ReindexByDoiJob.shoryuken_options[:queue] or ReindexByDoiJob.get_shoryuken_options[:queue]) and expect it to equal the intended queue name ("test_lupo_background" or the resolved production queue name used by the app) so the spec fails if shoryuken_options points at the wrong SQS queue.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/jobs/reindex_by_doi_job.rb`:
- Line 7: The shoryuken queue name is hard-coded to ENV["RAILS_ENV"] in the
shoryuken_options call; change it to use the same prefix logic as the publisher
(ENV["SQS_PREFIX"].presence || Rails.env) so the worker listens on the same
"<prefix>_lupo_background" queue as producers; update the lambda passed to
shoryuken_options (the shoryuken_options call) to build the queue name from
ENV["SQS_PREFIX"].presence || Rails.env instead of ENV["RAILS_ENV"].
- Around line 9-14: The perform method calls JSON.parse(data) directly which
will raise for nil or malformed JSON; change perform to first guard against
nil/empty payload (e.g. return unless data.present?) and then safely parse
inside a begin/rescue that catches JSON::ParserError and TypeError, returning
(and optionally logging) when parsing fails; after the safe parse ensure
parsed.is_a?(Hash) before accessing parsed["doi"] so invalid payloads do not
cause exceptions in perform.
---
Nitpick comments:
In `@spec/jobs/reindex_by_doi_job_spec.rb`:
- Around line 10-14: The spec currently only checks ActiveJob's queue_as but
misses the Shoryuken binding; add an assertion that the worker's Shoryuken queue
is set to the same resolved queue name. In the spec for ReindexByDoiJob, after
the existing enqueue assertion, add a line that inspects the job class's
Shoryuken options (e.g. ReindexByDoiJob.shoryuken_options[:queue] or
ReindexByDoiJob.get_shoryuken_options[:queue]) and expect it to equal the
intended queue name ("test_lupo_background" or the resolved production queue
name used by the app) so the spec fails if shoryuken_options points at the wrong
SQS queue.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e25e49a5-5be5-465c-9541-8bc6fc65e30e
📒 Files selected for processing (2)
app/jobs/reindex_by_doi_job.rbspec/jobs/reindex_by_doi_job_spec.rb
| include Shoryuken::Worker | ||
|
|
||
| queue_as :lupo_background | ||
| shoryuken_options queue: -> { "#{ENV["RAILS_ENV"]}_lupo_background" }, auto_delete: true |
There was a problem hiding this comment.
Keep the Shoryuken queue name aligned with the publisher.
Line 7 hard-codes ENV["RAILS_ENV"], but app/services/sqs_message_service.rb:6-7 publishes to ENV["SQS_PREFIX"].presence || Rails.env. In prefixed environments, producers will send to <prefix>_lupo_background while this worker listens on <RAILS_ENV>_lupo_background, so it will never see those messages.
💡 Proposed fix
- shoryuken_options queue: -> { "#{ENV["RAILS_ENV"]}_lupo_background" }, auto_delete: true
+ shoryuken_options queue: -> { "#{ENV["SQS_PREFIX"].presence || Rails.env}_lupo_background" }, auto_delete: true📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| shoryuken_options queue: -> { "#{ENV["RAILS_ENV"]}_lupo_background" }, auto_delete: true | |
| shoryuken_options queue: -> { "#{ENV["SQS_PREFIX"].presence || Rails.env}_lupo_background" }, auto_delete: true |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/jobs/reindex_by_doi_job.rb` at line 7, The shoryuken queue name is
hard-coded to ENV["RAILS_ENV"] in the shoryuken_options call; change it to use
the same prefix logic as the publisher (ENV["SQS_PREFIX"].presence || Rails.env)
so the worker listens on the same "<prefix>_lupo_background" queue as producers;
update the lambda passed to shoryuken_options (the shoryuken_options call) to
build the queue name from ENV["SQS_PREFIX"].presence || Rails.env instead of
ENV["RAILS_ENV"].
| def perform(sqs_message = nil, data = nil) | ||
| parsed = JSON.parse(data) | ||
| return unless parsed.is_a?(Hash) | ||
|
|
||
| doi_id = parsed["doi"] | ||
| return if doi_id.blank? |
There was a problem hiding this comment.
Handle invalid payloads before parsing.
JSON.parse(data) can raise on nil, malformed JSON, or a Hash, so Line 11 never runs. On a queue consumer, that turns one bad payload into an exception path before the worker can decide how to handle it.
💡 Proposed fix
def perform(sqs_message = nil, data = nil)
- parsed = JSON.parse(data)
+ parsed =
+ case data
+ when Hash
+ data
+ when String
+ JSON.parse(data)
+ else
+ return
+ end
return unless parsed.is_a?(Hash)
- doi_id = parsed["doi"]
+ doi_id = parsed["doi"] || parsed[:doi]
return if doi_id.blank?
doi = Doi.find_by(doi: doi_id)
return unless doi.present?
if doi.agency == "datacite"
DataciteDoiImportInBulkJob.perform_later([doi.id], index: DataciteDoi.active_index)
else
OtherDoiImportInBulkJob.perform_later([doi.id], index: OtherDoi.active_index)
end
+ rescue JSON::ParserError
+ return
end📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def perform(sqs_message = nil, data = nil) | |
| parsed = JSON.parse(data) | |
| return unless parsed.is_a?(Hash) | |
| doi_id = parsed["doi"] | |
| return if doi_id.blank? | |
| def perform(sqs_message = nil, data = nil) | |
| parsed = | |
| case data | |
| when Hash | |
| data | |
| when String | |
| JSON.parse(data) | |
| else | |
| return | |
| end | |
| return unless parsed.is_a?(Hash) | |
| doi_id = parsed["doi"] || parsed[:doi] | |
| return if doi_id.blank? | |
| doi = Doi.find_by(doi: doi_id) | |
| return unless doi.present? | |
| if doi.agency == "datacite" | |
| DataciteDoiImportInBulkJob.perform_later([doi.id], index: DataciteDoi.active_index) | |
| else | |
| OtherDoiImportInBulkJob.perform_later([doi.id], index: OtherDoi.active_index) | |
| end | |
| rescue JSON::ParserError | |
| return | |
| end |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/jobs/reindex_by_doi_job.rb` around lines 9 - 14, The perform method calls
JSON.parse(data) directly which will raise for nil or malformed JSON; change
perform to first guard against nil/empty payload (e.g. return unless
data.present?) and then safely parse inside a begin/rescue that catches
JSON::ParserError and TypeError, returning (and optionally logging) when parsing
fails; after the safe parse ensure parsed.is_a?(Hash) before accessing
parsed["doi"] so invalid payloads do not cause exceptions in perform.
Purpose
closes: Add github issue that originated this PR
Approach
Open Questions and Pre-Merge TODOs
Learning
Types of changes
Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Reviewer, please remember our guidelines:
Summary by CodeRabbit
Chores
Tests