Fix Exporters (Bulkrax upgrade) #347

@bkiahstroud

Follow up to:

Story

The version of Bulkrax currently in use has a bug that makes Exporters run very slowly: the export tries to load the file sets of every exported work into memory at once, which is slow and can cause crashes. Upgrading Bulkrax to a newer version should resolve this.
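
To illustrate the difference between loading everything at once and loading in bounded batches, here is a minimal, self-contained sketch. It is not how Bulkrax implements the export or the fix. `FileSetStub`, `load_file_sets`, `export_all_at_once`, and `export_in_batches` are hypothetical names invented for this example.

```ruby
# Hypothetical stand-in for a file set record; not a Bulkrax or Hyrax class.
FileSetStub = Struct.new(:id)

# Hypothetical loader: pretend each call pulls the given records into memory.
def load_file_sets(ids)
  ids.map { |id| FileSetStub.new(id) }
end

# Eager approach: every file set is held in memory at the same time.
def export_all_at_once(ids)
  load_file_sets(ids).map(&:id)
end

# Batched approach: at most batch_size file sets are loaded at any moment.
def export_in_batches(ids, batch_size: 100)
  exported = []
  peak = 0
  ids.each_slice(batch_size) do |slice|
    batch = load_file_sets(slice)
    peak = [peak, batch.size].max
    batch.each { |file_set| exported << file_set.id }
  end
  { exported: exported, peak_loaded: peak }
end
```

Both approaches export the same ids, but the batched version caps the number of records loaded at once at `batch_size`, which is the property that keeps memory use flat as the export grows.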

Acceptance Criteria

  • Upgrade Bulkrax to v5.2.1 or newer (diff)
  • Merge important/necessary customizations from the v4.4-patch1

Notes

1. Atla Bulkrax patches/overrides:

diff --git a/app/models/bulkrax/csv_entry.rb b/app/models/bulkrax/csv_entry.rb
index 9e856f8..baf9e47 100644
--- a/app/models/bulkrax/csv_entry.rb
+++ b/app/models/bulkrax/csv_entry.rb
@@ -107,7 +107,9 @@ module Bulkrax
     # Metadata required by Bulkrax for round-tripping
     def build_system_metadata
       self.parsed_metadata['id'] = hyrax_record.id
-      self.parsed_metadata[source_identifier] = hyrax_record.send(work_identifier)
+      source_id = hyrax_record.send(work_identifier)
+      source_id = source_id.to_a.first if source_id.is_a?(ActiveTriples::Relation)
+      self.parsed_metadata[source_identifier] = source_id
       self.parsed_metadata[key_for_export('model')] = hyrax_record.has_model.first
     end
 
@@ -149,7 +151,7 @@ module Bulkrax
       mapping = fetch_field_mapping
       mapping.each do |key, value|
         # these keys are handled by other methods
-        next if ['model', 'file', related_parents_parsed_mapping, related_children_parsed_mapping].include?(key)
+        next if ['model', 'file', related_parents_parsed_mapping, related_children_parsed_mapping, source_identifier].include?(key)
         next if value['excluded']
         next if Bulkrax.reserved_properties.include?(key) && !field_supported?(key)
 
diff --git a/app/parsers/bulkrax/oai_dc_parser.rb b/app/parsers/bulkrax/oai_dc_parser.rb
index f16c467..8cc3ea8 100644
--- a/app/parsers/bulkrax/oai_dc_parser.rb
+++ b/app/parsers/bulkrax/oai_dc_parser.rb
@@ -32,10 +32,6 @@ module Bulkrax
 
     def file_set_entry_class; end
 
-    def create_relationships; end
-
-    def create_file_sets; end
-
     def records(opts = {})
       opts[:metadata_prefix] ||= importerexporter.parser_fields['metadata_prefix']
       opts[:set] = collection_name unless collection_name == 'all'
@@ -113,6 +109,12 @@ module Bulkrax
       importer.record_status
     end
 
+    def create_relationships
+      ScheduleRelationshipsJob.set(wait: 5.minutes).perform_later(importer_id: importerexporter.id)
+    end
+
+    def create_file_sets; end
+
     def collections
       @collections ||= list_sets
     end
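
The `build_system_metadata` change in the patch above unwraps a multi-valued relation to its first value before writing the source identifier. A rough standalone sketch of that normalization follows, assuming a stand-in class so it runs without the ActiveTriples gem. `RelationStub` and `normalize_source_id` are hypothetical names for illustration only.

```ruby
# Hypothetical stand-in for ActiveTriples::Relation so this runs without the gem.
class RelationStub
  include Enumerable

  def initialize(*values)
    @values = values
  end

  def each(&block)
    @values.each(&block)
  end
end

# Return the first value when given a relation; pass anything else through.
# relation_class is a parameter only so the sketch can use the stub above.
def normalize_source_id(value, relation_class: RelationStub)
  value.is_a?(relation_class) ? value.to_a.first : value
end

single = normalize_source_id('abc123')
multi  = normalize_source_id(RelationStub.new('abc123', 'def456'))
```

Here `single` and `multi` both end up as the plain string `'abc123'`, so the exported column holds a scalar either way. An empty relation would yield `nil`.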
