feat(imports): add SureImport session batches#1785
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds ordered multi-chunk import sessions: migrations and schema, ImportSession/ImportSourceMapping models with chunk lifecycle and idempotency, Family::DataImporter refactor for session-aware ID mapping and summaries, API endpoints for create/chunk/publish with NDJSON validation, background job, OpenAPI docs, and tests. ChangesChunked Import Sessions Infrastructure
Sequence DiagramsequenceDiagram
participant Client
participant Api as Api::V1::ImportSessionsController
participant Session as ImportSession
participant Job as ImportSessionJob
participant Importer as Family::DataImporter
Client->>Api: POST /api/v1/import_sessions (create)
Client->>Api: POST /api/v1/import_sessions/:id/chunks (attach NDJSON)
Api->>Session: attach_chunk!(sequence, content, client_chunk_id)
Api-->>Client: 201 Created (chunk)
Client->>Api: POST /api/v1/import_sessions/:id/publish
Api->>Session: publish_later
Session->>Job: enqueue ImportSessionJob
Job->>Session: publish()
Session->>Importer: import!(import_session: self) per chunk
Importer-->>Session: per-import summary
Session->>Session: aggregate_chunk_summaries -> set status complete/failed
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Brin PR Security ScanThis PR has findings that should block merge.
Findings:
Analyzed by Brin |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: acda613e8f
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
acda613 to
b31e25a
Compare
There was a problem hiding this comment.
Actionable comments posted: 7
🧹 Nitpick comments (1)
app/jobs/import_session_job.rb (1)
4-8: ⚡ Quick winConsider raising an error instead of silently skipping.
Silently returning when
import_sessionisnilcan mask bugs in the enqueue logic. If the session is expected to exist when the job runs, this should fail loudly.If
nilis a valid edge case (e.g., cascading delete), consider logging aterrorlevel and including context about why the job was enqueued without a valid session.💡 Suggested improvement
def perform(import_session) if import_session.nil? - Rails.logger.info("ImportSessionJob skipped missing import_session") + Rails.logger.error("ImportSessionJob received nil import_session - this indicates a bug in enqueue logic") + raise ArgumentError, "import_session cannot be nil" - return end Rails.logger.info("ImportSessionJob started import_session_id=#{import_session.id}")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/jobs/import_session_job.rb` around lines 4 - 8, The perform method in ImportSessionJob currently returns silently when import_session is nil which can hide enqueue bugs; change this to either raise a descriptive error (e.g., raise ArgumentError or a custom ImportSessionMissingError) from ImportSessionJob#perform so failures surface, or if nil is an acceptable edge case, replace the info log with Rails.logger.error and include context (job id, args, and reason) so the condition is visible; update tests for ImportSessionJob.perform to expect the new behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@app/controllers/api/v1/import_sessions_controller.rb`:
- Line 10: Replace uses of current_resource_owner.family with Current.family in
this controller to follow repo standards: update the session creation payload
(the hash containing family: current_resource_owner.family) and any lookup calls
(the usage around line ~70 where current_resource_owner.family is used) to
reference Current.family instead; ensure you only change the family accessor and
leave authentication/authorization logic intact so the controller methods (e.g.,
create and any lookup/find method) continue to use the same semantics but via
Current.family.
In `@app/models/family/data_importer.rb`:
- Around line 326-345: The import_balances method is double-counting skipped
Balance rows because mapped_id(:accounts, ...) already calls missing_reference
which increments_summary("Balance", :skipped) on blank account_id; remove the
duplicate increment there: replace the explicit post-mapped_id increment+next
with a simple guard that skips when new_account_id.blank? (e.g. next unless
new_account_id) or otherwise only call increment_summary once, keeping
mapped_id/missing_reference as the single place that logs skipped references;
ensure references to mapped_id, missing_reference, increment_summary and
import_balances are used to locate and update the logic.
- Around line 904-911: The import_rules loop misses the required check for blank
source ids; add a call to require_source_id!("Rule", old_id) near the top of
import_rules (before calling mapped_record(:rules, old_id, ...)) so records
exported without an "id" are rejected in strict/session mode, matching other
importers and preventing creation of new Rule records on chunk replays where
map_source! would otherwise early-return on a blank source_id.
In `@app/models/import_session.rb`:
- Around line 29-54: The create_or_find_for! method has a race between
family.import_sessions.find_or_initialize_by and session.save! that can raise
RecordNotUnique instead of the intended ConflictError; wrap the save! in a
rescue for ActiveRecord::RecordNotUnique, then re-load the existing session via
family.import_sessions.find_by(client_session_id: client_session_id), compare
its expected_chunks against the requested expected_chunks and raise
ConflictError if they differ, otherwise return the existing session (or retry
the save if appropriate); reference create_or_find_for!,
family.import_sessions.find_or_initialize_by, session.save!, and ConflictError
when making the change.
In `@app/views/api/v1/import_sessions/show.json.jbuilder`:
- Line 14: The JSON view currently iterates over `@import_session.imports` which
is not guaranteed to be ordered; change the rendering to iterate over the
imports ordered by their sequence value (e.g. use
`@import_session.imports.order`(:sequence) or equivalent sorted collection) so
that the chunks are returned in deterministic sequence order when calling
json.array! in show.json.jbuilder.
In `@docs/api/openapi.yaml`:
- Around line 4773-4802: Add the missing 422 response for the POST
/api/v1/import_sessions/{id}/publish OpenAPI operation: declare a '422' response
with a description like "max_row_count_exceeded" and set content
application/json to use the existing ErrorResponse schema (same pattern as
401/403/404/409). Update the OpenAPI fragment that includes
ImportSessionResponse to mirror the other error entries so clients can see the
controller can return 422 for publish failures.
In `@test/controllers/api/v1/import_sessions_controller_test.rb`:
- Around line 33-180: Add two tests: one that omits the Authorization header and
posts to api_v1_import_sessions_url (or any other mutate endpoint like
publish_api_v1_import_session_url) asserting response :unauthorized and error
"missing_auth" (401); and one that sends a request with an invalid date payload
(e.g. bad date/string param) to the relevant action (api_v1_import_sessions_url
or publish_api_v1_import_session_url) asserting response :unprocessable_entity
and the "invalid_date" (422) error; name them clearly (e.g. test "requires
authentication returns 401" and test "rejects invalid date returns 422") and
follow the existing pattern using api_headers and JSON.parse assertions.
---
Nitpick comments:
In `@app/jobs/import_session_job.rb`:
- Around line 4-8: The perform method in ImportSessionJob currently returns
silently when import_session is nil which can hide enqueue bugs; change this to
either raise a descriptive error (e.g., raise ArgumentError or a custom
ImportSessionMissingError) from ImportSessionJob#perform so failures surface, or
if nil is an acceptable edge case, replace the info log with Rails.logger.error
and include context (job id, args, and reason) so the condition is visible;
update tests for ImportSessionJob.perform to expect the new behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fc5067a8-cba4-4e36-86fe-10fd2a48921d
📒 Files selected for processing (19)
.devcontainer/Dockerfile.rubocop.ymlapp/controllers/api/v1/import_sessions_controller.rbapp/jobs/import_session_job.rbapp/models/family.rbapp/models/family/data_importer.rbapp/models/import.rbapp/models/import_session.rbapp/models/import_source_mapping.rbapp/models/sure_import.rbapp/views/api/v1/import_sessions/show.json.jbuilderconfig/routes.rbdb/migrate/20260513013000_create_import_sessions.rbdb/schema.rbdocs/api/openapi.yamlspec/requests/api/v1/import_sessions_spec.rbspec/swagger_helper.rbtest/controllers/api/v1/import_sessions_controller_test.rbtest/models/import_session_test.rb
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b31e25a590
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
b31e25a to
a787c17
Compare
a787c17 to
8cadec4
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/models/family/data_importer_test.rb`:
- Around line 128-145: Add an assertion after calling
Family::DataImporter.new(`@family`, ndjson).import! to confirm the Balance record
was not persisted: query the Balance model (e.g., find_by(id: "balance-1") or by
attributes) and assert it is nil or that no record exists for that
id/attributes; keep this check alongside the existing summary assertion to
ensure import! both counted and skipped saving the invalid Balance.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b4e0a8fa-c742-40aa-b92a-6a653b0d7d0e
📒 Files selected for processing (21)
.devcontainer/Dockerfile.rubocop.ymlapp/controllers/api/v1/import_sessions_controller.rbapp/jobs/import_session_job.rbapp/models/family.rbapp/models/family/data_importer.rbapp/models/import.rbapp/models/import_session.rbapp/models/import_source_mapping.rbapp/models/sure_import.rbapp/views/api/v1/import_sessions/show.json.jbuilderconfig/routes.rbdb/migrate/20260513013000_create_import_sessions.rbdb/schema.rbdocs/api/openapi.yamlspec/requests/api/v1/import_sessions_spec.rbspec/swagger_helper.rbtest/controllers/api/v1/import_sessions_controller_test.rbtest/jobs/import_session_job_test.rbtest/models/family/data_importer_test.rbtest/models/import_session_test.rb
🚧 Files skipped from review as they are similar to previous changes (14)
- app/models/family.rb
- .rubocop.yml
- .devcontainer/Dockerfile
- config/routes.rb
- app/views/api/v1/import_sessions/show.json.jbuilder
- app/models/import_source_mapping.rb
- app/models/import.rb
- spec/swagger_helper.rb
- docs/api/openapi.yaml
- app/models/sure_import.rb
- db/migrate/20260513013000_create_import_sessions.rb
- spec/requests/api/v1/import_sessions_spec.rb
- app/controllers/api/v1/import_sessions_controller.rb
- app/models/family/data_importer.rb
1776e30 to
e24999b
Compare
Add first-class SureImport sessions for ordered multi-file NDJSON imports. Persist source mappings across chunks, make session/chunk processing idempotent, expose progress readback, and keep existing single-file import behavior compatible. Includes the devcontainer libvips runtime dependency needed by ActiveStorage variant tests. Addresses we-promise#1610. Related to we-promise#1458.
e10ed45 to
c128c06
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@app/models/import_session.rb`:
- Around line 153-155: The current guard in ImportSession (the block using
expected_chunks and imports.count) only checks counts; instead enforce that the
exact set of chunk sequences 1..expected_chunks is present: collect the imports'
sequence numbers (e.g., imports.map(&:sequence) or imports.pluck(:sequence)),
compare to the expected set (Set.new(1..expected_chunks) or
(1..expected_chunks).to_a), and raise ConflictError (same class) if they differ;
update the error message to indicate which sequences are missing or unexpected
for easier debugging.
- Around line 157-161: Wrap the status transition and job enqueue so a failed
ImportSessionJob.perform_later does not leave the record stuck in :importing;
specifically, use a DB transaction around the update!(status: :importing,
error_details: {}) and the ImportSessionJob.perform_later(self) call, attempt
perform_later inside a begin/rescue (or rescue the specific enqueue error), and
on failure revert the status (e.g. back to :pending or previous state) and
persist error_details with the exception before re-raising or returning a
failure, ensuring the update! to :importing is only committed if the job enqueue
succeeds.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c0a1fd32-bd38-452c-96e9-6436da790752
📒 Files selected for processing (21)
.devcontainer/Dockerfile.rubocop.ymlapp/controllers/api/v1/import_sessions_controller.rbapp/jobs/import_session_job.rbapp/models/family.rbapp/models/family/data_importer.rbapp/models/import.rbapp/models/import_session.rbapp/models/import_source_mapping.rbapp/models/sure_import.rbapp/views/api/v1/import_sessions/show.json.jbuilderconfig/routes.rbdb/migrate/20260513013000_create_import_sessions.rbdb/schema.rbdocs/api/openapi.yamlspec/requests/api/v1/import_sessions_spec.rbspec/swagger_helper.rbtest/controllers/api/v1/import_sessions_controller_test.rbtest/jobs/import_session_job_test.rbtest/models/family/data_importer_test.rbtest/models/import_session_test.rb
✅ Files skipped from review due to trivial changes (2)
- .devcontainer/Dockerfile
- .rubocop.yml
🚧 Files skipped from review as they are similar to previous changes (16)
- app/jobs/import_session_job.rb
- app/models/import_source_mapping.rb
- test/jobs/import_session_job_test.rb
- app/models/family.rb
- spec/swagger_helper.rb
- test/models/family/data_importer_test.rb
- app/models/import.rb
- spec/requests/api/v1/import_sessions_spec.rb
- db/migrate/20260513013000_create_import_sessions.rb
- db/schema.rb
- app/models/sure_import.rb
- test/controllers/api/v1/import_sessions_controller_test.rb
- test/models/import_session_test.rb
- app/controllers/api/v1/import_sessions_controller.rb
- docs/api/openapi.yaml
- app/models/family/data_importer.rb
Validate expected import chunk sequences exactly before publish, and restore session state with error details when enqueueing the publish job fails.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@app/models/import_session.rb`:
- Around line 54-66: Rescue path for ActiveRecord::RecordNotUnique can return a
row with expected_chunks == nil even when the current caller provided
expected_chunks; update the found existing import session to backfill the
caller's expected_chunks when existing.expected_chunks is nil and the caller
supplied a value. In the rescue block around client_session_id lookup (the code
using family.import_sessions.find_by(client_session_id: client_session_id)),
after the existing row is found and after the mismatch check, set
existing.expected_chunks = expected_chunks (or call
existing.update!(expected_chunks: expected_chunks)) when
expected_chunks.present? && existing.expected_chunks.nil? and persist the change
before returning existing; preserve raising ConflictError when values disagree.
- Around line 145-169: publish_later can leave the session stuck in :importing
because ImportSessionJob uses Rails 7.2's deferred enqueue and perform_later is
called inside the inner transaction savepoint; fix by disabling deferred enqueue
in the job or ensuring enqueue errors are handled outside the savepoint: add
self.enqueue_after_transaction_commit = :never to the ImportSessionJob class
(same pattern as DestroyJob) so ImportSessionJob.perform_later will enqueue
immediately, or alternatively move the ImportSessionJob.perform_later call out
of the self.class.transaction(requires_new: true) block in
ImportSession#publish_later and add an outer rescue around the enqueue that
restores status (use with_lock/previous_status/error_details_for to revert on
enqueue errors).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 026075df-a7a7-4238-99a8-8f354887dbc1
📒 Files selected for processing (2)
app/models/import_session.rbtest/models/import_session_test.rb
🚧 Files skipped from review as they are similar to previous changes (1)
- test/models/import_session_test.rb
Backfill expected chunk counts after client-session insert races and enqueue import-session jobs after the status transition commits. Persist a safe enqueue failure body so API readback does not expose raw queue errors.
|
Solid architectural addition — the cross-chunk Dead code in if @import_session && transaction.respond_to?(:external_id=) && old_id.present?
transaction.external_id = old_id
end
The happy path enqueues a sync and then calls Stale status values in the OpenAPI schema
Silent behavioral change: The old code picked a random color for new tags; the new code always picks the first. More importantly, the expression Missing test: partial-failure retry There's a test verifying that a Generated by Claude Code |
Summary
POST /api/v1/importsbehavior.What changed
POST /api/v1/import_sessions,GET /api/v1/import_sessions/:id,POST /api/v1/import_sessions/:id/chunks, andPOST /api/v1/import_sessions/:id/publish.Why
This addresses the import/export failure surface reported in #1610 and follows the full round-trip import/export thread in #1458.
The existing single-file import path can prove the data route works, but it is weak for full-history, user-facing imports when files must be chunked and clients have to resolve target IDs manually. This keeps the existing file-size constraint while making ordered chunk uploads full-fidelity and idempotent.
Relationship to other PRs
maindirectly and is not stacked on another PR.Validation
bin/rails testRAILS_ENV=test bundle exec rake rswag:specs:swaggerizebin/rubocopgit diff --checkbin/brakemannpm run lintupstream/mainNotes
Opened as draft because this is a broad API/import behavior change and should get maintainer review before being marked ready.
Summary by CodeRabbit