Commit ab4a81f

Add version metadata to objects (#745)
* add workflow metadata to objects
* rubocop fixes
* fix rabbitmq test
* switch to a new VersionMetadata model to make associations clearer
* allow workflow metadata to be deleted by passing in an empty hash
* remove dual primary key, and use index and validation instead
* move druid validation to a validator
* store all metadata in the database as json
* remove the parsing back and forth from json
* update comments
* parse JSON in controller; version now accepts hash
* rerun rubocop todo
* add info to README
* rename VersionMetadata to VersionContext; use POST for context instead of GET params
* serialize json in table; update tests
* client response should have json still for easier parsing by client
* remove unneeded code; make tests match how client works
1 parent ee42504 commit ab4a81f

19 files changed, +428 -77 lines changed

.rubocop_todo.yml

+7 -7
@@ -1,12 +1,12 @@
 # This configuration was generated by
 # `rubocop --auto-gen-config`
-# on 2023-09-11 17:19:21 UTC using RuboCop version 1.56.3.
+# on 2024-04-18 19:41:40 UTC using RuboCop version 1.63.1.
 # The point is for the user to remove these configuration records
 # one by one as the offenses are removed from the code base.
 # Note that changes in the inspected code, or installation of new
 # versions of RuboCop, may require this file to be generated again.

-# Offense count: 2
+# Offense count: 3
 # Configuration parameters: CountComments, CountAsOne, AllowedMethods, AllowedPatterns.
 Metrics/MethodLength:
   Max: 11
@@ -16,7 +16,7 @@ RSpec/AnyInstance:
   Exclude:
     - 'spec/requests/workflows/update_step_spec.rb'

-# Offense count: 19
+# Offense count: 28
 # Configuration parameters: CountAsOne.
 RSpec/ExampleLength:
   Max: 33
@@ -28,16 +28,16 @@ RSpec/LetSetup:
     - 'spec/services/sweeper_spec.rb'
     - 'spec/services/workflow_monitor_spec.rb'

-# Offense count: 46
+# Offense count: 55
 RSpec/MultipleExpectations:
   Max: 6

-# Offense count: 9
+# Offense count: 16
 # Configuration parameters: AllowSubject.
 RSpec/MultipleMemoizedHelpers:
   Max: 8

-# Offense count: 5
+# Offense count: 6
 # Configuration parameters: AllowedGroups.
 RSpec/NestedGroups:
   Max: 4
@@ -79,7 +79,7 @@ Style/SlicingWithRange:
   Exclude:
     - 'app/services/intersect_query.rb'

-# Offense count: 11
+# Offense count: 20
 # This cop supports safe autocorrection (--autocorrect).
 # Configuration parameters: AllowHeredoc, AllowURI, URISchemes, IgnoreCopDirectives, AllowedPatterns.
 # URISchemes: http, https

README.md

+18
@@ -112,6 +112,24 @@ GET /workflow_queue/all_queued
 GET /workflow_queue
 ```

+### Workflow Variables
+
+If a workflow or workflows for a particular object require data to be persisted and available between steps, workflow variables can be set.
+These are per object/version pair and thus available to any step in any workflow for a given version of an object once set.
+
+These data are not persisted in Cocina, and are not preserved or available outside of the workflow-service, so they should only be used to persist information used during workflow processing.
+
+To use them, pass a "context" parameter as JSON in the body of the request when creating a workflow (and set the content type to application/json). The JSON can contain any number of key/value pairs of context:
+
+```
+POST /objects/:druid/workflows/:workflow?version=Y
+```
+
+This context will then be returned as JSON in each `process` block of the XML response containing workflow data (e.g. `GET /objects/:druid/workflows`), for use in processing.
+
+This can be used if a user selects an option in Pre-assembly or Argo that needs to be passed through the accessioning pipeline, such as whether OCR or captioning is required. The value is set when creating the workflow, and is then available to each robot that needs it.
+
+
 ## Deploy
 ### Logs
 Logs are located in `/var/log/httpd`.
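
As an illustration of the request described in the new "Workflow Variables" README section, the sketch below builds (but does not send) the `POST` with a JSON body that wraps the variables in a `context` key. The host name, the helper name `build_create_workflow_request`, and the sample druid and values are assumptions made for this example. Only the path shape, the `version` query parameter, the `context` key, and the `application/json` content type come from the commit.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Builds (without sending) the POST that creates a workflow with context.
# base_url and this helper's name are hypothetical; the path, the version
# query parameter, and the "context" wrapper key follow the README text.
def build_create_workflow_request(base_url:, druid:, workflow:, version:, context:)
  uri = URI("#{base_url}/objects/#{druid}/workflows/#{workflow}")
  uri.query = URI.encode_www_form(version: version)

  request = Net::HTTP::Post.new(uri)
  request['Content-Type'] = 'application/json'
  request.body = JSON.generate({ context: context })
  request
end

request = build_create_workflow_request(
  base_url: 'https://workflow.example.edu', # placeholder host
  druid: 'druid:bb123bc0001',
  workflow: 'accessionWF',
  version: 2,
  context: { requireOCR: true }
)

puts request.path
puts request.body
```

Sending the request would be a matter of passing it to a `Net::HTTP` connection for the real service host, which is left out here on purpose.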

app/controllers/workflows_controller.rb

+4 -1
@@ -40,6 +40,7 @@ def show
     @workflow = Workflow.new(name: params[:workflow], druid: params[:druid], steps: workflow_steps)
   end

+  # rubocop:disable Metrics/AbcSize
   def create
     return render(plain: 'Unknown workflow', status: :bad_request) if template.nil?

@@ -48,12 +49,14 @@ def create
       processes: initial_parser.processes,
       version: Version.new(
         druid: params[:druid],
-        version: params[:version]
+        version: params[:version],
+        context: params[:context] # any context in the body of the request as JSON; wrapped in "context" key to allow for future body values
       )
     ).create_workflow_steps

     head :created
   end
+  # rubocop:enable Metrics/AbcSize

   def destroy
     obj = Version.new(
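
The `create` action reads `params[:context]`, relying on Rails to parse a JSON request body into `params`. The plain-Ruby sketch below mimics that extraction so the three possible inputs are easy to see: no `context` key at all, an empty `context`, and a populated one. The helper name `context_from_body` is hypothetical, and the sketch assumes the body, when present, is a JSON object.

```ruby
require 'json'

# Returns the value under the "context" key of a JSON request body,
# or nil when the body is absent/blank or has no such key.
# Hypothetical helper: in the controller, Rails param parsing does this work.
def context_from_body(raw_body)
  return nil if raw_body.nil? || raw_body.strip.empty?

  JSON.parse(raw_body)['context']
end

p context_from_body(nil)
p context_from_body('{"context":{}}')
p context_from_body('{"context":{"requireOCR":true}}')
```

Keeping the values under a dedicated `context` key, as the inline comment in the diff notes, leaves room for other top-level body values later.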

app/models/version.rb

+16 -3
@@ -1,14 +1,27 @@
 # frozen_string_literal: true

-# Represents a version of a digital object.
+# Represents a version of a digital object with associated context.
 # All workflow steps occur on a particular version.
 class Version
-  def initialize(druid:, version:)
+  def initialize(druid:, version:, context: nil)
     @druid = druid
     @version_id = version
+    @context = context # this is context as a hash to be stored in the VersionContext table
   end

-  attr_reader :druid, :version_id
+  attr_reader :druid, :version_id, :context
+
+  def update_context
+    # if no context is passed in (nil), do nothing
+    return unless context
+
+    # if context is passed in but is empty, delete the version context record to clear all context
+    if context.blank?
+      VersionContext.find_by(druid:, version: version_id)&.destroy
+    else # otherwise, create/update the version context record as json in the database
+      VersionContext.find_or_create_by(druid:, version: version_id).update!(values: context)
+    end
+  end

   # @return [ActiveRecord::Relation] an ActiveRecord scope that has the WorkflowSteps for this version
   def workflow_steps(workflow)
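
The three branches of `update_context` (leave alone on `nil`, delete on an empty hash, create or update otherwise) can be illustrated without a database. The sketch below is a stand-in, not the model's code: `ContextStore` and its return symbols are hypothetical, an in-memory hash keyed by druid and version replaces the `version_contexts` table, and plain `empty?` replaces Rails' `blank?`.

```ruby
# In-memory stand-in for the version_contexts table, keyed by [druid, version].
# Hypothetical class used only to illustrate the nil / empty / populated cases.
class ContextStore
  def initialize
    @rows = {}
  end

  # Mirrors the branching in Version#update_context.
  def update(druid:, version:, context:)
    return :unchanged if context.nil? # nothing passed in: do nothing

    key = [druid, version]
    if context.empty?
      @rows.delete(key) # empty hash: clear any stored context
      :cleared
    else
      @rows[key] = context # otherwise create or overwrite
      :saved
    end
  end

  def fetch(druid:, version:)
    @rows[[druid, version]]
  end
end

store = ContextStore.new
store.update(druid: 'druid:bb123bc0001', version: 1, context: { 'requireOCR' => true })
store.update(druid: 'druid:bb123bc0001', version: 1, context: nil) # keeps the stored value
p store.fetch(druid: 'druid:bb123bc0001', version: 1)
store.update(druid: 'druid:bb123bc0001', version: 1, context: {}) # removes it
p store.fetch(druid: 'druid:bb123bc0001', version: 1)
```

Distinguishing `nil` from an empty hash is what lets a caller recreate a workflow without touching existing context, while still having an explicit way to clear it.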

app/models/version_context.rb

+7
@@ -0,0 +1,7 @@
+# frozen_string_literal: true
+
+# Models optional context that is associated with a druid/version pair for any workflow
+class VersionContext < ApplicationRecord
+  validates :druid, uniqueness: { scope: :version }
+  validates_with DruidValidator
+end

app/models/workflow_step.rb

+9 -13
@@ -2,7 +2,7 @@

 # Models a process that occurred for a digital object. Basically a log entry.
 class WorkflowStep < ApplicationRecord
-  validate :druid_is_valid
+  validates_with DruidValidator
   validates :workflow, presence: true
   validates :process, presence: true
   validates :version, numericality: { only_integer: true }
@@ -42,6 +42,11 @@ def maybe_set_completed
     self.completed_at ||= Time.now
   end

+  # any associated version context for this step (if it exists) -- note: same for any workflow/step for a given druid/version combination
+  def context
+    VersionContext.find_by(druid:, version:)&.values
+  end
+
   ##
   # indicate if this step is marked as completed
   # @return [boolean]
@@ -59,13 +64,6 @@ def milestone_date
     end
   end

-  ##
-  # check if we have a valid druid with prefix
-  # @return [boolean]
-  def valid_druid?
-    DruidTools::Druid.valid?(druid, true) && druid.starts_with?('druid:')
-  end
-
   ##
   # check if the named workflow has a current definition
   # @return [boolean]
@@ -83,11 +81,6 @@ def valid_process_for_workflow?
     wtp.processes.map(&:name).include? process
   end

-  # ensure we have a valid druid with prefix
-  def druid_is_valid
-    errors.add(:druid, 'is not valid') unless valid_druid?
-  end
-
   # ensure we have a valid workflow before creating a new step
   def workflow_exists
     errors.add(:workflow, 'is not valid') unless valid_workflow?
@@ -99,6 +92,7 @@ def process_exists_for_workflow
   end

   # rubocop:disable Metrics/MethodLength
+  # rubocop:disable Metrics/AbcSize
   def attributes_for_process
     {
       version:,
@@ -108,11 +102,13 @@ def attributes_for_process
       elapsed:,
       attempts:,
       datetime: updated_at.to_time.iso8601,
+      context: context&.to_json, # context (which is deserialized as a hash by activerecord) as json so it can be deserialized by client
       status:,
       name: process
     }.tap do |attr|
       attr[:errorMessage] = error_msg if error_msg
     end
   end
   # rubocop:enable Metrics/MethodLength
+  # rubocop:enable Metrics/AbcSize
 end
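
In `attributes_for_process`, the stored hash is turned back into a JSON string with `context&.to_json` so that a client can parse it out of the `process` attributes. The sketch below walks through that round trip in plain Ruby. The helper name `context_attribute` is hypothetical, and the JSON generate/parse pair merely simulates how a value written with symbol keys comes back from a JSON column with string keys (which is what the model specs in this commit expect).

```ruby
require 'json'

# Serializes a context hash for the response, passing nil through untouched.
# Hypothetical helper mirroring `context&.to_json` in attributes_for_process.
def context_attribute(context)
  context&.to_json
end

# Simulate storing { requireOCR: true, ... } in a JSON column and reading it back:
# keys come back as strings.
stored = JSON.parse(JSON.generate({ requireOCR: true, requireTranscript: true }))

attribute = context_attribute(stored) # what the service would emit
client_view = JSON.parse(attribute)   # what a client would recover

p attribute
p client_view
p context_attribute(nil)
```

Because of the safe-navigation operator, a step with no stored context yields `nil` rather than the string `"null"`.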

app/services/workflow_creator.rb

+5
@@ -18,6 +18,7 @@ def initialize(processes:, workflow_id:, version:)
   ##
   # Delete all the rows for this druid/version/workflow, and replace with new rows.
   # @return [Array]
+  # rubocop:disable Metrics/AbcSize
   def create_workflow_steps
     ActiveRecord::Base.transaction do
       version.workflow_steps(workflow_id).destroy_all
@@ -28,9 +29,13 @@ def create_workflow_steps
       processes.map do |process|
         WorkflowStep.create!(workflow_attributes(process))
       end
+
+      # Create/update version context
+      version.update_context
     end
     enqueue
   end
+  # rubocop:enable Metrics/AbcSize

   private

app/validators/druid_validator.rb

+10
@@ -0,0 +1,10 @@
+# frozen_string_literal: true
+
+# Validates that a druid is valid and starts with 'druid:'
+class DruidValidator < ActiveModel::Validator
+  def validate(record)
+    return if DruidTools::Druid.valid?(record.druid, true) && record.druid.starts_with?('druid:')
+
+    record.errors.add(:druid, 'is not valid')
+  end
+end
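
`DruidValidator` combines two checks: the strict druid format and the mandatory `druid:` prefix. The standalone sketch below approximates that rule with a single regular expression so the accepted shape is visible at a glance. It does not call `DruidTools`; the character class is an assumption about the restricted alphabet that strict mode enforces, and the method name `prefixed_druid?` is hypothetical.

```ruby
# Approximation of "strict druid with a mandatory druid: prefix".
# ASSUMPTION: strict druids are two letters, three digits, two letters, and
# four digits, with letters drawn from a restricted alphabet (no vowels or 'l').
PREFIXED_DRUID = /\Adruid:[b-df-hjkmnp-tv-z]{2}\d{3}[b-df-hjkmnp-tv-z]{2}\d{4}\z/

def prefixed_druid?(value)
  value.is_a?(String) && PREFIXED_DRUID.match?(value)
end

p prefixed_druid?('druid:bb123bc0001') # prefixed and well formed
p prefixed_druid?('bb123bc0001')       # missing the prefix
p prefixed_druid?('foo')               # not a druid at all
```

Moving the check into a validator class lets both `WorkflowStep` and `VersionContext` share it through `validates_with`.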

@@ -0,0 +1,14 @@
+# frozen_string_literal: true
+
+class CreateVersionContexts < ActiveRecord::Migration[7.0]
+  def change
+    create_table :version_contexts do |t|
+      t.string :druid, null: false
+      t.integer :version, null: false, default: 1
+      t.jsonb :values, default: {}
+      t.timestamps
+    end
+
+    add_index :version_contexts, %i[druid version], unique: true
+  end
+end

db/schema.rb

+13 -5
@@ -10,11 +10,19 @@
 #
 # It's strongly recommended that you check this file into your version control system.

-ActiveRecord::Schema.define(version: 2020_08_11_212454) do
-
+ActiveRecord::Schema[7.0].define(version: 2024_04_02_194159) do
   # These are extensions that must be enabled in order to support this database
   enable_extension "plpgsql"

+  create_table "version_contexts", force: :cascade do |t|
+    t.string "druid", null: false
+    t.integer "version", default: 1, null: false
+    t.jsonb "values", default: {}
+    t.datetime "created_at", null: false
+    t.datetime "updated_at", null: false
+    t.index ["druid", "version"], name: "index_version_contexts_on_druid_and_version", unique: true
+  end
+
   create_table "workflow_steps", id: :serial, force: :cascade do |t|
     t.string "druid", null: false
     t.string "workflow", null: false
@@ -28,10 +36,10 @@
     t.integer "version"
     t.text "note"
     t.string "lane_id", default: "default", null: false
-    t.datetime "created_at", null: false
-    t.datetime "updated_at", null: false
+    t.datetime "created_at", precision: nil, null: false
+    t.datetime "updated_at", precision: nil, null: false
     t.boolean "active_version", default: false
-    t.datetime "completed_at"
+    t.datetime "completed_at", precision: nil
     t.index ["active_version", "status", "workflow", "process"], name: "active_version_step_name_workflow2_idx"
     t.index ["druid", "version"], name: "index_workflow_steps_on_druid_and_version"
     t.index ["druid"], name: "index_workflow_steps_on_druid"

spec/factories/version_contexts.rb

+11
@@ -0,0 +1,11 @@
+# frozen_string_literal: true
+
+FactoryBot.define do
+  factory :version_context do
+    sequence :druid do |n|
+      "druid:bb123bc#{format('%04d', n)}" # ensure we always have a valid druid format
+    end
+    version { 1 }
+    values { { requireOCR: true, requireTranscript: true } }
+  end
+end

spec/factories/workflow_steps.rb

+7 -1
@@ -3,7 +3,7 @@
 FactoryBot.define do
   factory :workflow_step do
     sequence :druid do |n|
-      "druid:bb123bc#{n.to_s.rjust(4, '0')}" # ensure we always have a valid druid format
+      "druid:bb123bc#{format('%04d', n)}" # ensure we always have a valid druid format
     end
     workflow { 'accessionWF' }
     process { 'start-accession' }
@@ -15,5 +15,11 @@
       status { 'completed' }
       completed_at { Time.now }
     end
+
+    trait :with_ocr_context do
+      after(:create) do |step|
+        create(:version_context, druid: step.druid, version: step.version)
+      end
+    end
   end
 end
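
The factory change swaps `n.to_s.rjust(4, '0')` for `format('%04d', n)` when building sequence druids. A quick plain-Ruby check, using a hypothetical helper named `factory_druid`, suggests the two spellings produce the same zero-padded suffix for the positive integers a sequence generates:

```ruby
# Hypothetical helper reproducing the druid string built inside the factories.
def factory_druid(n)
  "druid:bb123bc#{format('%04d', n)}"
end

# Compare the old and new padding styles over a range of sequence values.
same = (1..9999).all? { |n| format('%04d', n) == n.to_s.rjust(4, '0') }

puts factory_druid(7)
puts same
```

Both forms stop padding once `n` needs more than four digits, so neither yields an 11-character druid identifier beyond 9999 generated records.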

spec/models/version_context_spec.rb

+19
@@ -0,0 +1,19 @@
+# frozen_string_literal: true
+
+require 'rails_helper'
+
+RSpec.describe VersionContext do
+  let(:version_context) { FactoryBot.create(:version_context) }
+
+  it 'includes the context as a hash' do
+    expect(version_context.values).to eq({ 'requireOCR' => true, 'requireTranscript' => true })
+  end
+
+  it 'validates the uniqueness of druid and version combination' do
+    expect(described_class.new(druid: version_context.druid, version: version_context.version)).not_to be_valid
+  end
+
+  it 'validates the druid' do
+    expect(described_class.new(druid: 'foo', version: '1')).not_to be_valid
+  end
+end

spec/models/workflow_step_spec.rb

+14
@@ -135,6 +135,20 @@
     end
   end

+  context 'with workflow context' do
+    let(:step_with_context) { FactoryBot.create(:workflow_step, :with_ocr_context) }
+
+    it 'includes the context as json' do
+      expect(step_with_context.context).to eq({ 'requireOCR' => true, 'requireTranscript' => true })
+    end
+  end
+
+  context 'without workflow context' do
+    it 'includes the context as nil' do
+      expect(step.context).to be_nil
+    end
+  end
+
   describe '#completed?' do
     it 'indicates if the step is not completed' do
       expect(step).not_to be_completed
