feat(imports): add SureImport session batches#1785

Open
JSONbored wants to merge 6 commits into
we-promise:mainfrom
JSONbored:codex/feat-sure-import-sessions

Conversation

@JSONbored
Contributor

@JSONbored JSONbored commented May 13, 2026

Summary

  • Add first-class SureImport sessions for ordered multi-file NDJSON imports.
  • Persist source-to-target mappings across chunks so transaction chunks can reference prior source IDs.
  • Add idempotent session, chunk, mapping, transaction, and balance-like record handling.
  • Add session progress/readback and strict missing-reference failures.
  • Preserve existing single-file POST /api/v1/imports behavior.
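
The cross-chunk mapping idea in the summary can be illustrated with a minimal in-memory sketch. This is not the PR's ImportSourceMapping model: the names `SourceMappingStore`, `MissingReferenceError`, `map!`, and `resolve!` are hypothetical, and a Hash stands in for the persisted table.

```ruby
# Hypothetical in-memory stand-in for persisted source-to-target mappings.
class MissingReferenceError < StandardError; end

class SourceMappingStore
  def initialize
    @mappings = {} # { [type, source_id] => target_id }
  end

  # Record a mapping. Replaying the same source ID keeps the first target,
  # so re-sending a chunk does not remap existing records.
  def map!(type, source_id, target_id)
    @mappings[[type, source_id]] ||= target_id
  end

  # Strict lookup: fail loudly when a chunk references an unknown source ID.
  def resolve!(type, source_id)
    @mappings.fetch([type, source_id]) do
      raise MissingReferenceError, "no #{type} mapping for source id #{source_id.inspect}"
    end
  end
end

store = SourceMappingStore.new
# Chunk 1 imports an account and records where it landed.
store.map!(:accounts, "src-acct-1", "target-acct-42")
# Chunk 2 imports a transaction that references the account by its source ID.
account_id = store.resolve!(:accounts, "src-acct-1")
```

Raising on an unknown reference, rather than skipping the record, mirrors the "strict missing-reference failures" bullet: a transaction chunk sent before its account chunk fails instead of importing orphaned rows.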

What changed

  • Added POST /api/v1/import_sessions, GET /api/v1/import_sessions/:id, POST /api/v1/import_sessions/:id/chunks, and POST /api/v1/import_sessions/:id/publish.
  • Added persisted session/chunk/mapping state, per-type summary counts, and safe error bodies.
  • Updated SureImport/DataImporter session mode to resolve persisted source IDs across ordered chunks.
  • Regenerated OpenAPI docs and added Minitest coverage for importer/session/API behavior.
  • Added the devcontainer libvips runtime dependency needed by ActiveStorage variant tests.

Why

This addresses the import/export failure surface reported in #1610 and follows the full round-trip import/export thread in #1458.

The existing single-file import path can prove the data route works, but it is weak for full-history, user-facing imports when files must be chunked and clients have to resolve target IDs manually. This keeps the existing file-size constraint while making ordered chunk uploads full-fidelity and idempotent.
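
A client-side sketch of ordered chunking, under stated assumptions: the `{"type", "data"}` record shape, the `build_chunks` helper, and the per-chunk record limit are illustrative only. The PR constrains chunks by file size, which this sketch replaces with a simple record count.

```ruby
require "json"
require "digest"

# Split records into ordered NDJSON chunks, each with a 1-based sequence
# number and a SHA-256 checksum of its content. All names are illustrative.
def build_chunks(records, max_records_per_chunk:)
  records.each_slice(max_records_per_chunk).each_with_index.map do |slice, index|
    # One JSON document per line, newline-terminated.
    content = slice.map { |record| JSON.generate(record) }.join("\n") + "\n"
    {
      sequence: index + 1,
      checksum: Digest::SHA256.hexdigest(content),
      content: content
    }
  end
end

records = [
  { "type" => "Account", "data" => { "id" => "a1", "name" => "Checking" } },
  { "type" => "Transaction", "data" => { "id" => "t1", "account_id" => "a1" } },
  { "type" => "Transaction", "data" => { "id" => "t2", "account_id" => "a1" } }
]
chunks = build_chunks(records, max_records_per_chunk: 2)
```

Keeping referenced records (accounts) ahead of the records that reference them (transactions) is what lets later chunks resolve source IDs recorded by earlier ones.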

Relationship to other PRs

Validation

  • bin/rails test
  • RAILS_ENV=test bundle exec rake rswag:specs:swaggerize
  • bin/rubocop
  • git diff --check
  • bin/brakeman
  • npm run lint
  • API endpoint consistency checker
  • Codex Security diff-scoped scan
  • CodeRabbit diff review against upstream/main

Notes

Opened as draft because this is a broad API/import behavior change and should get maintainer review before being marked ready.

Summary by CodeRabbit

  • New Features
    • Multi-chunk import sessions: create/retrieve sessions, upload ordered chunks (file or raw), per-chunk idempotency keys, publish queued processing, session/chunk status, aggregated summaries, and structured error reporting.
  • Documentation
    • API/OpenAPI updated to document new import-session endpoints and schemas.
  • Tests
    • Extensive request, controller, model, and job tests covering flows, idempotency, edge cases and error handling.
  • Style
    • Lint configuration whitespace/spacing adjustments.
  • Chores
    • Development container updated to install libvips for image processing.

@coderabbitai
Contributor

coderabbitai Bot commented May 13, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds ordered multi-chunk import sessions: migrations and schema, ImportSession/ImportSourceMapping models with chunk lifecycle and idempotency, Family::DataImporter refactor for session-aware ID mapping and summaries, API endpoints for create/chunk/publish with NDJSON validation, background job, OpenAPI docs, and tests.
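
The chunk idempotency mentioned in the walkthrough could look roughly like the following. This is a sketch under assumptions, not the PR's `attach_chunk!`: the class names are hypothetical, and the rule "same key with the same content replays, same key with different content conflicts" is one plausible reading of the checksum/idempotency behavior.

```ruby
require "digest"

class ChunkConflictError < StandardError; end

# Hypothetical in-memory chunk registry keyed by the client's idempotency key.
class SketchChunkStore
  def initialize
    @chunks = {} # { client_chunk_id => { sequence:, checksum: } }
  end

  def attach!(client_chunk_id:, sequence:, content:)
    checksum = Digest::SHA256.hexdigest(content)
    existing = @chunks[client_chunk_id]

    if existing
      same = existing[:checksum] == checksum && existing[:sequence] == sequence
      unless same
        raise ChunkConflictError, "chunk #{client_chunk_id} was already uploaded with different content"
      end
      return existing # idempotent replay: nothing new is stored
    end

    @chunks[client_chunk_id] = { sequence: sequence, checksum: checksum }
  end

  def size
    @chunks.size
  end
end

chunk_store = SketchChunkStore.new
first_upload = chunk_store.attach!(client_chunk_id: "c-1", sequence: 1, content: "{\"type\":\"Account\"}\n")
replayed = chunk_store.attach!(client_chunk_id: "c-1", sequence: 1, content: "{\"type\":\"Account\"}\n")
```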

Changes

Chunked Import Sessions Infrastructure

| Layer | File(s) | Summary |
| --- | --- | --- |
| Devcontainer & RuboCop tweaks | .devcontainer/Dockerfile, .rubocop.yml | Adds libvips42 to the devcontainer and adjusts RuboCop layout rules/whitespace. |
| Migrations, schema, and core models | db/migrate/*, db/schema.rb, app/models/import_source_mapping.rb, app/models/import.rb, app/models/family.rb | Creates import_sessions and import_source_mappings tables, alters imports with chunk metadata and session FK, updates schema, adds ImportSourceMapping model and Family/import associations/validations. |
| ImportSession model & lifecycle | app/models/import_session.rb | Adds ImportSession class with statuses/enum, create_or_find_for!, attach_chunk! with checksum/idempotency and retry/finalize logic, publish/publish_later, per-chunk processing, summary aggregation, and error normalization. |
| Family::DataImporter refactor | app/models/family/data_importer.rb | Introduces session-aware importer (error types, mapping helpers, strict-reference validation), expands mapping types (securities, rules), tracks per-entity created/updated/skipped/failed summary counts, and rewrites entity import methods to use mapped lookups and persist source mappings. |
| SureImport integration & background job | app/models/sure_import.rb, app/jobs/import_session_job.rb | SureImport#import! now accepts import_session: and uses Family::DataImporter within a transaction; ImportSessionJob enqueues on :high_priority and invokes import_session.publish (requires non-nil session). |
| API routes, controller, views, and OpenAPI | config/routes.rb, app/controllers/api/v1/import_sessions_controller.rb, app/views/api/v1/import_sessions/show.json.jbuilder, docs/api/openapi.yaml, spec/swagger_helper.rb | Adds API v1 import_sessions routes (create/show + member chunks/publish), implements controller actions with NDJSON validation and scoped authorization, JSON view renders session+chunks metadata, and OpenAPI docs/schemas updated. |
| Tests: request, model, job | spec/requests/api/v1/import_sessions_spec.rb, test/models/import_session_test.rb, test/controllers/api/v1/import_sessions_controller_test.rb, test/jobs/import_session_job_test.rb, test/models/family/data_importer_test.rb | Adds comprehensive request specs and unit tests covering idempotency, chunk upload/publish flows, conflict/retry scenarios, strict session-mode validations, and importer summary behavior. |

Sequence Diagram

sequenceDiagram
  participant Client
  participant Api as Api::V1::ImportSessionsController
  participant Session as ImportSession
  participant Job as ImportSessionJob
  participant Importer as Family::DataImporter

  Client->>Api: POST /api/v1/import_sessions (create)
  Client->>Api: POST /api/v1/import_sessions/:id/chunks (attach NDJSON)
  Api->>Session: attach_chunk!(sequence, content, client_chunk_id)
  Api-->>Client: 201 Created (chunk)
  Client->>Api: POST /api/v1/import_sessions/:id/publish
  Api->>Session: publish_later
  Session->>Job: enqueue ImportSessionJob
  Job->>Session: publish()
  Session->>Importer: import!(import_session: self) per chunk
  Importer-->>Session: per-import summary
  Session->>Session: aggregate_chunk_summaries -> set status complete/failed

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • we-promise/sure#1643: Related Family::DataImporter changes around security resolution and holdings.
  • we-promise/sure#722: OpenAPI/security scheme changes that may affect the new import-session endpoints.

Suggested labels

pr:verified

Suggested reviewers

  • jjmata

Poem

🐰 I nibble chunks of NDJSON bright,

checksummed hops beneath the API light,
queued to publish, mappings sewn,
summaries tally seeds we've sown,
the rabbit applauds: imports take flight!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat(imports): add SureImport session batches' clearly and concisely summarizes the main change: adding session-based batched import functionality for SureImport. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@JSONbored JSONbored marked this pull request as ready for review May 13, 2026 09:05
@superagent-security superagent-security Bot added the contributor:verified (Contributor passed trust analysis) and pr:flagged (PR flagged for review by security analysis) labels May 13, 2026
@superagent-security

Brin PR Security Scan

This PR has findings that should block merge.

  • Score: 26/100
  • Verdict: suspicious

Findings:

  • credential_exposure: Found 1 secret/credential pattern(s) in added lines

Analyzed by Brin


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: acda613e8f

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread app/models/import_session.rb
Comment thread app/models/import_session.rb Outdated
@JSONbored JSONbored force-pushed the codex/feat-sure-import-sessions branch from acda613 to b31e25a Compare May 13, 2026 09:12
@JSONbored JSONbored marked this pull request as draft May 13, 2026 09:12
@JSONbored JSONbored marked this pull request as ready for review May 13, 2026 09:14
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 7

🧹 Nitpick comments (1)
app/jobs/import_session_job.rb (1)

4-8: ⚡ Quick win

Consider raising an error instead of silently skipping.

Silently returning when import_session is nil can mask bugs in the enqueue logic. If the session is expected to exist when the job runs, this should fail loudly.

If nil is a valid edge case (e.g., cascading delete), consider logging at error level and including context about why the job was enqueued without a valid session.

💡 Suggested improvement
   def perform(import_session)
     if import_session.nil?
-      Rails.logger.info("ImportSessionJob skipped missing import_session")
+      Rails.logger.error("ImportSessionJob received nil import_session - this indicates a bug in enqueue logic")
+      raise ArgumentError, "import_session cannot be nil"
-      return
     end

     Rails.logger.info("ImportSessionJob started import_session_id=#{import_session.id}")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/jobs/import_session_job.rb` around lines 4 - 8, The perform method in
ImportSessionJob currently returns silently when import_session is nil which can
hide enqueue bugs; change this to either raise a descriptive error (e.g., raise
ArgumentError or a custom ImportSessionMissingError) from
ImportSessionJob#perform so failures surface, or if nil is an acceptable edge
case, replace the info log with Rails.logger.error and include context (job id,
args, and reason) so the condition is visible; update tests for
ImportSessionJob.perform to expect the new behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/controllers/api/v1/import_sessions_controller.rb`:
- Line 10: Replace uses of current_resource_owner.family with Current.family in
this controller to follow repo standards: update the session creation payload
(the hash containing family: current_resource_owner.family) and any lookup calls
(the usage around line ~70 where current_resource_owner.family is used) to
reference Current.family instead; ensure you only change the family accessor and
leave authentication/authorization logic intact so the controller methods (e.g.,
create and any lookup/find method) continue to use the same semantics but via
Current.family.

In `@app/models/family/data_importer.rb`:
- Around line 326-345: The import_balances method is double-counting skipped
Balance rows because mapped_id(:accounts, ...) already calls missing_reference
which increments_summary("Balance", :skipped) on blank account_id; remove the
duplicate increment there: replace the explicit post-mapped_id increment+next
with a simple guard that skips when new_account_id.blank? (e.g. next unless
new_account_id) or otherwise only call increment_summary once, keeping
mapped_id/missing_reference as the single place that logs skipped references;
ensure references to mapped_id, missing_reference, increment_summary and
import_balances are used to locate and update the logic.
- Around line 904-911: The import_rules loop misses the required check for blank
source ids; add a call to require_source_id!("Rule", old_id) near the top of
import_rules (before calling mapped_record(:rules, old_id, ...)) so records
exported without an "id" are rejected in strict/session mode, matching other
importers and preventing creation of new Rule records on chunk replays where
map_source! would otherwise early-return on a blank source_id.

In `@app/models/import_session.rb`:
- Around line 29-54: The create_or_find_for! method has a race between
family.import_sessions.find_or_initialize_by and session.save! that can raise
RecordNotUnique instead of the intended ConflictError; wrap the save! in a
rescue for ActiveRecord::RecordNotUnique, then re-load the existing session via
family.import_sessions.find_by(client_session_id: client_session_id), compare
its expected_chunks against the requested expected_chunks and raise
ConflictError if they differ, otherwise return the existing session (or retry
the save if appropriate); reference create_or_find_for!,
family.import_sessions.find_or_initialize_by, session.save!, and ConflictError
when making the change.

In `@app/views/api/v1/import_sessions/show.json.jbuilder`:
- Line 14: The JSON view currently iterates over `@import_session.imports` which
is not guaranteed to be ordered; change the rendering to iterate over the
imports ordered by their sequence value (e.g. use
`@import_session.imports.order`(:sequence) or equivalent sorted collection) so
that the chunks are returned in deterministic sequence order when calling
json.array! in show.json.jbuilder.

In `@docs/api/openapi.yaml`:
- Around line 4773-4802: Add the missing 422 response for the POST
/api/v1/import_sessions/{id}/publish OpenAPI operation: declare a '422' response
with a description like "max_row_count_exceeded" and set content
application/json to use the existing ErrorResponse schema (same pattern as
401/403/404/409). Update the OpenAPI fragment that includes
ImportSessionResponse to mirror the other error entries so clients can see the
controller can return 422 for publish failures.

In `@test/controllers/api/v1/import_sessions_controller_test.rb`:
- Around line 33-180: Add two tests: one that omits the Authorization header and
posts to api_v1_import_sessions_url (or any other mutate endpoint like
publish_api_v1_import_session_url) asserting response :unauthorized and error
"missing_auth" (401); and one that sends a request with an invalid date payload
(e.g. bad date/string param) to the relevant action (api_v1_import_sessions_url
or publish_api_v1_import_session_url) asserting response :unprocessable_entity
and the "invalid_date" (422) error; name them clearly (e.g. test "requires
authentication returns 401" and test "rejects invalid date returns 422") and
follow the existing pattern using api_headers and JSON.parse assertions.

---

Nitpick comments:
In `@app/jobs/import_session_job.rb`:
- Around line 4-8: The perform method in ImportSessionJob currently returns
silently when import_session is nil which can hide enqueue bugs; change this to
either raise a descriptive error (e.g., raise ArgumentError or a custom
ImportSessionMissingError) from ImportSessionJob#perform so failures surface, or
if nil is an acceptable edge case, replace the info log with Rails.logger.error
and include context (job id, args, and reason) so the condition is visible;
update tests for ImportSessionJob.perform to expect the new behavior.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fc5067a8-cba4-4e36-86fe-10fd2a48921d

📥 Commits

Reviewing files that changed from the base of the PR and between ce5d7dd and b31e25a.

📒 Files selected for processing (19)
  • .devcontainer/Dockerfile
  • .rubocop.yml
  • app/controllers/api/v1/import_sessions_controller.rb
  • app/jobs/import_session_job.rb
  • app/models/family.rb
  • app/models/family/data_importer.rb
  • app/models/import.rb
  • app/models/import_session.rb
  • app/models/import_source_mapping.rb
  • app/models/sure_import.rb
  • app/views/api/v1/import_sessions/show.json.jbuilder
  • config/routes.rb
  • db/migrate/20260513013000_create_import_sessions.rb
  • db/schema.rb
  • docs/api/openapi.yaml
  • spec/requests/api/v1/import_sessions_spec.rb
  • spec/swagger_helper.rb
  • test/controllers/api/v1/import_sessions_controller_test.rb
  • test/models/import_session_test.rb

Comment thread app/controllers/api/v1/import_sessions_controller.rb Outdated
Comment thread app/models/family/data_importer.rb
Comment thread app/models/family/data_importer.rb
Comment thread app/models/import_session.rb
Comment thread app/views/api/v1/import_sessions/show.json.jbuilder Outdated
Comment thread docs/api/openapi.yaml
Comment thread test/controllers/api/v1/import_sessions_controller_test.rb

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b31e25a590

Comment thread app/models/import_session.rb
Comment thread app/models/import_session.rb Outdated
@JSONbored JSONbored force-pushed the codex/feat-sure-import-sessions branch from b31e25a to a787c17 Compare May 13, 2026 10:20
@JSONbored JSONbored marked this pull request as draft May 13, 2026 10:21
@JSONbored JSONbored force-pushed the codex/feat-sure-import-sessions branch from a787c17 to 8cadec4 Compare May 13, 2026 10:23
@JSONbored JSONbored marked this pull request as ready for review May 13, 2026 10:44
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/models/family/data_importer_test.rb`:
- Around line 128-145: Add an assertion after calling
Family::DataImporter.new(`@family`, ndjson).import! to confirm the Balance record
was not persisted: query the Balance model (e.g., find_by(id: "balance-1") or by
attributes) and assert it is nil or that no record exists for that
id/attributes; keep this check alongside the existing summary assertion to
ensure import! both counted and skipped saving the invalid Balance.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b4e0a8fa-c742-40aa-b92a-6a653b0d7d0e

📥 Commits

Reviewing files that changed from the base of the PR and between b31e25a and 8cadec4.

📒 Files selected for processing (21)
  • .devcontainer/Dockerfile
  • .rubocop.yml
  • app/controllers/api/v1/import_sessions_controller.rb
  • app/jobs/import_session_job.rb
  • app/models/family.rb
  • app/models/family/data_importer.rb
  • app/models/import.rb
  • app/models/import_session.rb
  • app/models/import_source_mapping.rb
  • app/models/sure_import.rb
  • app/views/api/v1/import_sessions/show.json.jbuilder
  • config/routes.rb
  • db/migrate/20260513013000_create_import_sessions.rb
  • db/schema.rb
  • docs/api/openapi.yaml
  • spec/requests/api/v1/import_sessions_spec.rb
  • spec/swagger_helper.rb
  • test/controllers/api/v1/import_sessions_controller_test.rb
  • test/jobs/import_session_job_test.rb
  • test/models/family/data_importer_test.rb
  • test/models/import_session_test.rb
🚧 Files skipped from review as they are similar to previous changes (14)
  • app/models/family.rb
  • .rubocop.yml
  • .devcontainer/Dockerfile
  • config/routes.rb
  • app/views/api/v1/import_sessions/show.json.jbuilder
  • app/models/import_source_mapping.rb
  • app/models/import.rb
  • spec/swagger_helper.rb
  • docs/api/openapi.yaml
  • app/models/sure_import.rb
  • db/migrate/20260513013000_create_import_sessions.rb
  • spec/requests/api/v1/import_sessions_spec.rb
  • app/controllers/api/v1/import_sessions_controller.rb
  • app/models/family/data_importer.rb

Comment thread test/models/family/data_importer_test.rb
@JSONbored JSONbored force-pushed the codex/feat-sure-import-sessions branch 3 times, most recently from 1776e30 to e24999b Compare May 13, 2026 11:00
JSONbored added 3 commits May 13, 2026 14:00
Add first-class SureImport sessions for ordered multi-file NDJSON imports.

Persist source mappings across chunks, make session/chunk processing idempotent, expose progress readback, and keep existing single-file import behavior compatible.

Includes the devcontainer libvips runtime dependency needed by ActiveStorage variant tests.

Addresses we-promise#1610.

Related to we-promise#1458.
@jjmata jjmata force-pushed the codex/feat-sure-import-sessions branch from e10ed45 to c128c06 Compare May 13, 2026 12:00
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/models/import_session.rb`:
- Around line 153-155: The current guard in ImportSession (the block using
expected_chunks and imports.count) only checks counts; instead enforce that the
exact set of chunk sequences 1..expected_chunks is present: collect the imports'
sequence numbers (e.g., imports.map(&:sequence) or imports.pluck(:sequence)),
compare to the expected set (Set.new(1..expected_chunks) or
(1..expected_chunks).to_a), and raise ConflictError (same class) if they differ;
update the error message to indicate which sequences are missing or unexpected
for easier debugging.
- Around line 157-161: Wrap the status transition and job enqueue so a failed
ImportSessionJob.perform_later does not leave the record stuck in :importing;
specifically, use a DB transaction around the update!(status: :importing,
error_details: {}) and the ImportSessionJob.perform_later(self) call, attempt
perform_later inside a begin/rescue (or rescue the specific enqueue error), and
on failure revert the status (e.g. back to :pending or previous state) and
persist error_details with the exception before re-raising or returning a
failure, ensuring the update! to :importing is only committed if the job enqueue
succeeds.
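
The exact-sequence check requested in the first comment above can be sketched in plain Ruby. The helper name `validate_chunk_sequences!` and the standalone `ConflictError` class are assumptions for illustration; in the PR the check would live inside ImportSession and read sequences from its imports.

```ruby
require "set"

class ConflictError < StandardError; end

# Raise unless the uploaded sequences are exactly 1..expected_chunks.
# A count-only check would wrongly accept [1, 2, 4] when 3 chunks are expected.
def validate_chunk_sequences!(sequences, expected_chunks)
  expected = Set.new(1..expected_chunks)
  actual = Set.new(sequences)
  return if actual == expected

  missing = (expected - actual).to_a.sort
  unexpected = (actual - expected).to_a.sort
  raise ConflictError,
        "chunk sequences mismatch: missing=#{missing.inspect} unexpected=#{unexpected.inspect}"
end

validate_chunk_sequences!([1, 2, 3], 3) # passes silently
```

Because a Set discards duplicates, this sketch relies on something else (for example a unique index on the sequence column) to prevent the same sequence being stored twice.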
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c0a1fd32-bd38-452c-96e9-6436da790752

📥 Commits

Reviewing files that changed from the base of the PR and between 2847897 and c128c06.

📒 Files selected for processing (21)
  • .devcontainer/Dockerfile
  • .rubocop.yml
  • app/controllers/api/v1/import_sessions_controller.rb
  • app/jobs/import_session_job.rb
  • app/models/family.rb
  • app/models/family/data_importer.rb
  • app/models/import.rb
  • app/models/import_session.rb
  • app/models/import_source_mapping.rb
  • app/models/sure_import.rb
  • app/views/api/v1/import_sessions/show.json.jbuilder
  • config/routes.rb
  • db/migrate/20260513013000_create_import_sessions.rb
  • db/schema.rb
  • docs/api/openapi.yaml
  • spec/requests/api/v1/import_sessions_spec.rb
  • spec/swagger_helper.rb
  • test/controllers/api/v1/import_sessions_controller_test.rb
  • test/jobs/import_session_job_test.rb
  • test/models/family/data_importer_test.rb
  • test/models/import_session_test.rb
✅ Files skipped from review due to trivial changes (2)
  • .devcontainer/Dockerfile
  • .rubocop.yml
🚧 Files skipped from review as they are similar to previous changes (16)
  • app/jobs/import_session_job.rb
  • app/models/import_source_mapping.rb
  • test/jobs/import_session_job_test.rb
  • app/models/family.rb
  • spec/swagger_helper.rb
  • test/models/family/data_importer_test.rb
  • app/models/import.rb
  • spec/requests/api/v1/import_sessions_spec.rb
  • db/migrate/20260513013000_create_import_sessions.rb
  • db/schema.rb
  • app/models/sure_import.rb
  • test/controllers/api/v1/import_sessions_controller_test.rb
  • test/models/import_session_test.rb
  • app/controllers/api/v1/import_sessions_controller.rb
  • docs/api/openapi.yaml
  • app/models/family/data_importer.rb

Comment thread app/models/import_session.rb Outdated
Comment thread app/models/import_session.rb Outdated
JSONbored added 2 commits May 13, 2026 05:16
Validate expected import chunk sequences exactly before publish, and restore session state with error details when enqueueing the publish job fails.
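
A minimal sketch of the "restore session state when enqueueing fails" behavior described in this commit message, assuming a plain Ruby object in place of the ActiveRecord model. `SketchSession`, the injected `enqueuer` callable, and the `enqueue_failed` error code are all hypothetical.

```ruby
# Hypothetical session with a status and error details; not the PR's model.
class SketchSession
  attr_reader :status, :error_details

  def initialize
    @status = :pending
    @error_details = {}
  end

  # Move to :importing, then enqueue. If enqueueing raises, restore the
  # previous status and store a safe error body instead of the raw exception.
  def publish_later(enqueuer)
    previous_status = @status
    @status = :importing
    @error_details = {}
    enqueuer.call(self)
    true
  rescue StandardError
    @status = previous_status
    @error_details = { "code" => "enqueue_failed", "message" => "Import could not be queued." }
    false
  end
end

ok = SketchSession.new
ok.publish_later(->(_session) { :queued })

broken = SketchSession.new
broken.publish_later(->(_session) { raise IOError, "redis down at 10.0.0.5" })
```

The raw exception message never reaches `error_details`, so reading the session back through the API would not leak queue internals.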
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/models/import_session.rb`:
- Around line 54-66: Rescue path for ActiveRecord::RecordNotUnique can return a
row with expected_chunks == nil even when the current caller provided
expected_chunks; update the found existing import session to backfill the
caller's expected_chunks when existing.expected_chunks is nil and the caller
supplied a value. In the rescue block around client_session_id lookup (the code
using family.import_sessions.find_by(client_session_id: client_session_id)),
after the existing row is found and after the mismatch check, set
existing.expected_chunks = expected_chunks (or call
existing.update!(expected_chunks: expected_chunks)) when
expected_chunks.present? && existing.expected_chunks.nil? and persist the change
before returning existing; preserve raising ConflictError when values disagree.
- Around line 145-169: publish_later can leave the session stuck in :importing
because ImportSessionJob uses Rails 7.2's deferred enqueue and perform_later is
called inside the inner transaction savepoint; fix by disabling deferred enqueue
in the job or ensuring enqueue errors are handled outside the savepoint: add
self.enqueue_after_transaction_commit = :never to the ImportSessionJob class
(same pattern as DestroyJob) so ImportSessionJob.perform_later will enqueue
immediately, or alternatively move the ImportSessionJob.perform_later call out
of the self.class.transaction(requires_new: true) block in
ImportSession#publish_later and add an outer rescue around the enqueue that
restores status (use with_lock/previous_status/error_details_for to revert on
enqueue errors).
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 026075df-a7a7-4238-99a8-8f354887dbc1

📥 Commits

Reviewing files that changed from the base of the PR and between c128c06 and 18ab9a2.

📒 Files selected for processing (2)
  • app/models/import_session.rb
  • test/models/import_session_test.rb
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/models/import_session_test.rb

Comment thread app/models/import_session.rb
Comment thread app/models/import_session.rb
Backfill expected chunk counts after client-session insert races and enqueue import-session jobs after the status transition commits. Persist a safe enqueue failure body so API readback does not expose raw queue errors.
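
The insert race and the expected_chunks backfill can be sketched without a database. Everything here is illustrative: `SessionTable` stands in for a table with a unique index, `NotUniqueError` stands in for ActiveRecord::RecordNotUnique, and the `stale_read:` flag simulates a caller that looked up the session before a competing request inserted it.

```ruby
class ConflictError < StandardError; end
class NotUniqueError < StandardError; end # stands in for ActiveRecord::RecordNotUnique

SketchSessionRow = Struct.new(:client_session_id, :expected_chunks)

# In-memory table that enforces uniqueness on client_session_id.
class SessionTable
  def initialize
    @rows = {}
  end

  def find(client_session_id)
    @rows[client_session_id]
  end

  def insert!(row)
    raise NotUniqueError if @rows.key?(row.client_session_id)
    @rows[row.client_session_id] = row
  end
end

# Backfill a missing expected_chunks value; reject a conflicting one.
def reconcile(existing, expected_chunks)
  if existing.expected_chunks.nil?
    existing.expected_chunks = expected_chunks
  elsif expected_chunks && existing.expected_chunks != expected_chunks
    raise ConflictError, "expected_chunks does not match the existing session"
  end
  existing
end

def create_or_find(table, client_session_id, expected_chunks, stale_read: false)
  existing = stale_read ? nil : table.find(client_session_id)
  return reconcile(existing, expected_chunks) if existing

  table.insert!(SketchSessionRow.new(client_session_id, expected_chunks))
rescue NotUniqueError
  # Lost the race: another request inserted first, so reload and reconcile.
  reconcile(table.find(client_session_id), expected_chunks)
end

table = SessionTable.new
first = create_or_find(table, "s1", nil)
raced = create_or_find(table, "s1", 3, stale_read: true)
```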
Collaborator

jjmata commented May 14, 2026

Solid architectural addition — the cross-chunk ImportSourceMapping table is a clean way to solve the source-ID persistence problem, and the summary counts per record type are a nice operational touch. A few things worth looking at before merge:


Dead code in import_transactions

if @import_session && transaction.respond_to?(:external_id=) && old_id.present?
  transaction.external_id = old_id
end

external_id lives on Entry, not on the Transaction entryable, so transaction.respond_to?(:external_id=) will always be false. This entire block is silently skipped every time. The external_id is set correctly on entry a few lines later, so the mapping still works — but this guard is misleading and should be removed.
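
A small illustration of why such a guard never fires, using Struct stand-ins rather than the app's ActiveRecord models. The Struct names and fields are hypothetical; the only point carried over from the comment is that the setter exists on the entry and not on the entryable.

```ruby
# Illustrative stand-ins only: Struct-based models, not the app's classes.
SketchEntry = Struct.new(:external_id, :entryable)
SketchTransaction = Struct.new(:category)

transaction = SketchTransaction.new("groceries")
entry = SketchEntry.new(nil, transaction)

# The guard from the snippet above: false here, because the setter is not
# defined on the entryable, so the guarded assignment would never run.
guard_passes = transaction.respond_to?(:external_id=)

# Setting the ID on the entry is what actually takes effect.
entry.external_id = "src-txn-1" if entry.respond_to?(:external_id=)
```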


publish is unguarded against re-entry from a retried job

publish opens with update!(status: :importing, error_details: {}) unconditionally. publish_later has a with_lock guard, but the job calls publish directly. If Sidekiq retries a failed job mid-session, the status reset clears the error state even for chunks that already completed in the first attempt. The per-chunk import.complete? short-circuit is correct, but a guard at the publish level (e.g., return if complete?, mirroring publish_later) would make the semantics explicit and prevent status thrash.


family.sync_later is enqueued before session is marked complete

The happy path enqueues a sync and then calls update!(status: :complete). If that final update fails (DB error, validation, etc.), the sync runs against data the session considers uncommitted. Swapping the order — mark complete first, then enqueue — makes the intent clear and avoids the race.
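
The suggested ordering can be made concrete with a sketch that records side effects in the order they happen. `SketchFinalizer` and its methods are hypothetical; `enqueue_sync` stands in for family.sync_later and `mark_complete!` for the final update!.

```ruby
# Records side effects in order so the sequencing is visible.
class SketchFinalizer
  attr_reader :events, :status

  def initialize(fail_update: false)
    @fail_update = fail_update
    @events = []
    @status = :importing
  end

  def mark_complete!
    raise "update failed" if @fail_update
    @status = :complete
    @events << :marked_complete
  end

  def enqueue_sync
    @events << :sync_enqueued
  end

  # Mark complete first; only enqueue the sync once that has succeeded.
  def finalize!
    mark_complete!
    enqueue_sync
  end
end

happy = SketchFinalizer.new
happy.finalize!

failing = SketchFinalizer.new(fail_update: true)
begin
  failing.finalize!
rescue RuntimeError
  # The sync was never enqueued because the status update failed first.
end
```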


Stale status values in the OpenAPI schema

ImportSessionChunk.status in spec/swagger_helper.rb and docs/api/openapi.yaml includes reverting and revert_failed, which are legacy Import statuses from the revert feature. ImportSession::STATUSES is only %w[pending importing complete failed]. Clients reading the generated docs will think those transitions are valid for session-backed chunks when they aren't.


Silent behavioral change: Tag::COLORS.sample → Tag::COLORS.first

The old code picked a random color for new tags; the new code always picks the first. More importantly, the expression data["color"] || tag.color || Tag::COLORS.first means existing tags with no stored color will be assigned the first color on any session-mode update pass, which could be surprising. If deterministic defaults are intentional, a brief comment explaining why would help.
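
How the fallback chain resolves can be checked in isolation. The `COLORS` palette and the `resolved_color` helper below are illustrative stand-ins, not the app's Tag::COLORS or importer code; only the `a || b || c` shape comes from the expression quoted above.

```ruby
# Illustrative palette; the real values live in Tag::COLORS.
COLORS = ["#e99537", "#4da568", "#6471eb"].freeze

# Same shape as the expression under review: payload, then stored value, then default.
def resolved_color(data, stored_color)
  data["color"] || stored_color || COLORS.first
end

from_payload = resolved_color({ "color" => "#000000" }, "#ffffff")
from_record  = resolved_color({}, "#ffffff")
from_default = resolved_color({}, nil)
# In Ruby only nil and false are falsy, so an empty string is kept as-is.
from_blank   = resolved_color({ "color" => "" }, nil)
```

The last case is worth noting when reviewing the change: a blank color string in the payload would not fall through to the stored color or the default.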


Missing test: partial-failure retry

There's a test verifying that a complete session is a no-op on re-publish, but no test for the scenario where a session is failed after chunk 1 succeeded and chunk 2 failed. Verifying that a re-run of publish on that session skips the already-complete chunk and re-processes only the failed one would give important confidence for a production import tool.
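
The scenario described above can be prototyped with plain Ruby objects before writing the Minitest case. This is a sketch with hypothetical classes (`SketchChunk`, `SketchImportSession`), not the PR's models; it combines the publish-level guard and the per-chunk short-circuit discussed earlier in this review.

```ruby
class SketchChunk
  attr_reader :sequence, :status, :attempts

  def initialize(sequence, failures_before_success: 0)
    @sequence = sequence
    @failures_left = failures_before_success
    @status = :pending
    @attempts = 0
  end

  def complete?
    @status == :complete
  end

  def process!
    @attempts += 1
    if @failures_left > 0
      @failures_left -= 1
      @status = :failed
      raise "chunk #{sequence} failed"
    end
    @status = :complete
  end
end

class SketchImportSession
  attr_reader :status

  def initialize(chunks)
    @chunks = chunks.sort_by(&:sequence)
    @status = :pending
  end

  def publish
    return if @status == :complete # re-entry guard for retried jobs

    @status = :importing
    @chunks.each do |chunk|
      next if chunk.complete? # already-imported chunks are skipped
      chunk.process!
    end
    @status = :complete
  rescue StandardError
    @status = :failed
  end
end

chunk1 = SketchChunk.new(1)
chunk2 = SketchChunk.new(2, failures_before_success: 1)
session = SketchImportSession.new([chunk1, chunk2])

session.publish # chunk 1 completes, chunk 2 fails
status_after_first_run = session.status
session.publish # retry: only chunk 2 is processed again
```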


Generated by Claude Code

Labels

contributor:verified Contributor passed trust analysis. pr:flagged PR flagged for review by security analysis.

2 participants