-
Notifications
You must be signed in to change notification settings - Fork 8
Optimize DOI event performance with bulk-preloading #1449
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from 21 commits
b8d83e8
d92a4dd
3ef1f1a
d6f28ae
e62e4e2
7568389
036fd66
7243b85
10b163e
f509b75
5e253ae
55b93dc
9c6e1f6
7cf9f31
3b67b1e
45eb43e
8ceaecf
ab3dee0
cf567e8
a54ac9d
6e5bdee
ac5fc9c
671d5e7
5d230c6
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,97 @@ | ||
| # frozen_string_literal: true | ||
|
|
||
| # Wrapper class to make preloaded event arrays compatible with ActiveRecord::Relation API | ||
| # This allows existing code that calls methods like `pluck`, `map`, `select` to work | ||
| # with in-memory arrays without modification. | ||
| class PreloadedEventRelation | ||
| include Enumerable | ||
|
|
||
| def initialize(events) | ||
| @events = Array(events) | ||
| end | ||
|
|
||
| # Delegate Enumerable methods to the underlying array | ||
| def each(&block) | ||
| @events.each(&block) | ||
| end | ||
|
|
||
| # Delegate common Enumerable methods explicitly | ||
| def first | ||
| @events.first | ||
| end | ||
|
|
||
| def last | ||
| @events.last | ||
| end | ||
|
|
||
| def count | ||
| @events.count | ||
| end | ||
|
|
||
| # Implement pluck to match ActiveRecord::Relation#pluck behavior | ||
| # Supports single or multiple column names | ||
| def pluck(*column_names) | ||
| if column_names.length == 1 | ||
| column_name = column_names.first | ||
| @events.map { |event| event.public_send(column_name) } | ||
| else | ||
| @events.map { |event| column_names.map { |col| event.public_send(col) } } | ||
| end | ||
| end | ||
|
|
||
| # Delegate map to the underlying array | ||
| def map(&block) | ||
| @events.map(&block) | ||
| end | ||
|
|
||
| # Delegate select to the underlying array | ||
| def select(&block) | ||
| PreloadedEventRelation.new(@events.select(&block)) | ||
| end | ||
|
|
||
| # Delegate other common Enumerable methods | ||
| def compact | ||
| PreloadedEventRelation.new(@events.compact) | ||
| end | ||
|
|
||
| def uniq | ||
| PreloadedEventRelation.new(@events.uniq) | ||
| end | ||
|
|
||
| def sort_by(&block) | ||
| PreloadedEventRelation.new(@events.sort_by(&block)) | ||
| end | ||
|
|
||
| def group_by(&block) | ||
| @events.group_by(&block) | ||
| end | ||
|
|
||
| def inject(*args, &block) | ||
| @events.inject(*args, &block) | ||
| end | ||
|
|
||
| def length | ||
| @events.length | ||
| end | ||
|
|
||
| def empty? | ||
| @events.empty? | ||
| end | ||
|
|
||
| def present? | ||
| @events.present? | ||
| end | ||
|
|
||
| def blank? | ||
| @events.blank? | ||
| end | ||
|
|
||
| # Allow direct access to the underlying array | ||
| def to_a | ||
| @events | ||
| end | ||
|
|
||
| def to_ary | ||
| @events | ||
| end | ||
| end | ||
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,62 @@ | ||||||||||||||||||||||||
| # frozen_string_literal: true | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| # Service class to preload events for a batch of DOIs in a single query | ||||||||||||||||||||||||
| # This dramatically reduces database queries from N*M (DOIs * Relationship Types) to 1-2 queries total | ||||||||||||||||||||||||
| class EventsPreloader | ||||||||||||||||||||||||
| # Maximum number of DOIs to query at once to avoid database parameter limits | ||||||||||||||||||||||||
| CHUNK_SIZE = 1000 | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| def initialize(dois) | ||||||||||||||||||||||||
| @dois = Array(dois) | ||||||||||||||||||||||||
| @doi_map = {} | ||||||||||||||||||||||||
| @dois.each do |doi| | ||||||||||||||||||||||||
| next if doi.doi.blank? | ||||||||||||||||||||||||
| @doi_map[doi.doi.upcase] = doi | ||||||||||||||||||||||||
| # Initialize preloaded_events array if not already set | ||||||||||||||||||||||||
| doi.preloaded_events ||= [] | ||||||||||||||||||||||||
| end | ||||||||||||||||||||||||
coderabbitai[bot] marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||||||||||||||||||||
| end | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| # Preload all events for the batch of DOIs | ||||||||||||||||||||||||
| def preload! | ||||||||||||||||||||||||
| return if @dois.empty? | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| doi_identifiers = @dois.filter_map { |doi| doi.doi&.upcase }.uniq | ||||||||||||||||||||||||
| return if doi_identifiers.empty? | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
|
Comment on lines
+22
to
+26
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Keep Line 22 and Line 25 return 🛠️ Proposed fix def preload!
- return if `@dois.empty`?
+ return self if `@dois.empty`?
doi_identifiers = `@dois.filter_map` { |doi| doi.doi&.upcase }.uniq
- return if doi_identifiers.empty?
+ if doi_identifiers.empty?
+ `@dois.each` { |doi| doi.preloaded_events ||= [] }
+ return self
+ end📝 Committable suggestion
Suggested change
🤖 Prompt for AI Agents |
||||||||||||||||||||||||
| # Fetch events in chunks to avoid database parameter limits | ||||||||||||||||||||||||
| all_events = [] | ||||||||||||||||||||||||
| doi_identifiers.each_slice(CHUNK_SIZE) do |chunk| | ||||||||||||||||||||||||
| events = Event.where( | ||||||||||||||||||||||||
| "source_doi IN (?) OR target_doi IN (?)", | ||||||||||||||||||||||||
| chunk, chunk | ||||||||||||||||||||||||
| ).order(updated_at: :desc).to_a | ||||||||||||||||||||||||
| all_events.concat(events) | ||||||||||||||||||||||||
| end | ||||||||||||||||||||||||
coderabbitai[bot] marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||||||||||||||||||||
| all_events.uniq! | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| # Group events by DOI and assign to each Doi object | ||||||||||||||||||||||||
| all_events.each do |event| | ||||||||||||||||||||||||
| # Add event to source DOI's preloaded_events if it matches | ||||||||||||||||||||||||
| if event.source_doi.present? | ||||||||||||||||||||||||
| source_doi_obj = @doi_map[event.source_doi.upcase] | ||||||||||||||||||||||||
| source_doi_obj.preloaded_events << event if source_doi_obj | ||||||||||||||||||||||||
| end | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| # Add event to target DOI's preloaded_events if it matches | ||||||||||||||||||||||||
| if event.target_doi.present? | ||||||||||||||||||||||||
| target_doi_obj = @doi_map[event.target_doi.upcase] | ||||||||||||||||||||||||
| if target_doi_obj && target_doi_obj != source_doi_obj | ||||||||||||||||||||||||
| target_doi_obj.preloaded_events << event | ||||||||||||||||||||||||
| end | ||||||||||||||||||||||||
| end | ||||||||||||||||||||||||
| end | ||||||||||||||||||||||||
coderabbitai[bot] marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| # Ensure all DOIs have an array (even if empty) | ||||||||||||||||||||||||
| @dois.each do |doi| | ||||||||||||||||||||||||
| doi.preloaded_events ||= [] | ||||||||||||||||||||||||
| end | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| self | ||||||||||||||||||||||||
| end | ||||||||||||||||||||||||
| end | ||||||||||||||||||||||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
Repository: datacite/lupo
Length of output: 39
🏁 Script executed:
# Read the full implementation file cat -n app/models/concerns/preloaded_event_relation.rbRepository: datacite/lupo
Length of output: 2619
🏁 Script executed:
Repository: datacite/lupo
Length of output: 313
🏁 Script executed:
Repository: datacite/lupo
Length of output: 377
🏁 Script executed:
Repository: datacite/lupo
Length of output: 1875
🏁 Script executed:
Repository: datacite/lupo
Length of output: 802
🏁 Script executed:
# Check the test file to see if args are tested cat -n spec/models/preloaded_event_relation_spec.rbRepository: datacite/lupo
Length of output: 5202
🏁 Script executed:
Repository: datacite/lupo
Length of output: 165
Match
first/last/countarity with Enumerable and Array behavior.The current zero-arity signatures reject valid arguments and blocks that Array and Enumerable support. Ruby's Array methods accept optional parameters:
first(n),last(n), andcount { |item| ... }. As a drop-in replacement for ActiveRecord::Relation, this wrapper must support the same signatures to maintain compatibility.🐛 Fix
📝 Committable suggestion
🤖 Prompt for AI Agents