Skip to content

refactor(imports): back PDF imports with statements#1786

Open
JSONbored wants to merge 27 commits into
we-promise:mainfrom
JSONbored:codex/refactor-pdf-import-account-statements
Open

refactor(imports): back PDF imports with statements#1786
JSONbored wants to merge 27 commits into
we-promise:mainfrom
JSONbored:codex/refactor-pdf-import-account-statements

Conversation

@JSONbored
Copy link
Copy Markdown
Contributor

@JSONbored JSONbored commented May 13, 2026

Summary

What changed

  • Adds imports.account_statement_id so PdfImport can reference the canonical uploaded statement.
  • Routes PDF imports through AccountStatement.prepare_upload! / duplicate detection, then creates or reuses a statement-backed PdfImport.
  • Adds extraction from stored PDF statements.
  • Links selected import accounts back to the source statement when authorized.
  • Carries statement-backed file content and filenames through the existing PDF extraction job.
  • Adds statement-backed import authorization checks and regression coverage.

Why

This avoids keeping a separate PDF statement upload path next to Statement Vault. AccountStatement now owns storage, metadata, deduplication, and account association, while PdfImport remains the reviewable import draft workflow for extracted rows.

Validation

  • bin/rails test - 4098 runs, 16602 assertions, 0 failures, 0 errors, 24 skips
  • bin/rails test test/controllers/imports_controller_test.rb test/models/pdf_import_test.rb test/controllers/account_statements_controller_test.rb test/jobs/process_pdf_job_test.rb - 77 runs, 282 assertions, 0 failures
  • bin/rails test test/system/imports_test.rb - 4 runs, 16 assertions, 0 failures
  • bin/rubocop - 1726 files inspected, no offenses
  • bin/brakeman --no-pager - 0 security warnings
  • bin/rails zeitwerk:check
  • npm run lint
  • bundle exec erb_lint app/views/account_statements/show.html.erb app/views/imports/_pdf_import.html.erb
  • git diff --check
  • /Users/shadowbook/.codex/skills/sure-upstream-contributions/scripts/sure_quality_gate .

Notes

  • feat(statements): add account statement vault #1753 intentionally kept PdfImport, FamilyDocument, and document search separate from Statement Vault V1. This follow-up wires PDF statement extraction to AccountStatement without expanding generic document behavior.
  • A full all-view ERB lint sweep still reports unrelated existing offenses outside this diff; the touched ERB files pass.
  • A broad system-test sweep hit unrelated transaction/account/category/property failures after a database deadlock; the import-facing system file passes in isolation.

Summary by CodeRabbit

  • New Features

    • Account statement vault: upload/manage PDF/CSV/XLSX statements with inbox (unmatched vs linked), detailed statement pages, extraction/import workflow, and link/unlink/reject/extract actions.
    • Month-by-month coverage UI with year selector, reconciliation badges, and account-match suggestions.
  • Improvements

    • Lazy-loading of statements via Turbo frames and conditional statements tab/menu link.
    • Tighter attachment authorization, PDF import/processing improvements, and enhanced upload validation/duplicate handling.
    • Devcontainer updates for image processing and Selenium session tuning; new admin settings nav entry for Statement Vault.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 13, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a statement vault: AccountStatement domain, upload ingestion, metadata detection, matching/coverage, UI/routes/controllers, PdfImport integration, Active Storage authorization, DB migrations/schema, devcontainer tweaks, and extensive tests.

Changes

Account Statement Vault (core feature + integrations)

Layer / File(s) Summary
Domain model & ingestion
app/models/account_statement.rb, app/models/account_statement/*, db/migrate/20260505120000_create_account_statements.rb
New AccountStatement model with upload preparation/ingestion, chunked reads, MIME/format checks, MD5+SHA256 dedupe, metadata detection, enums/validations, reconciliation helpers, and DB migration with constraints/indexes.
Metadata detection & CSV parsing
app/models/account_statement/metadata_detector.rb
Filename and CSV-based metadata extraction (period dates, hints, parser confidence), sanitized parser output, CSV sampling, and format inference.
Account matching & coverage
app/models/account_statement/account_matcher.rb, app/models/account_statement/coverage.rb
New matcher scoring, best-match selection, and Coverage computing per-month expected vs actual months and summary counts.
Controllers, views, routes, component wiring
app/controllers/account_statements_controller.rb, config/routes.rb, app/controllers/accounts_controller.rb, app/components/UI/account_page.*, app/views/account_statements/*, app/views/accounts/show/*
New controller (index/show/create/update/link/unlink/reject/destroy/extract), routes, account show turbo-frame wiring, inbox and show templates, and component updates for statements tab.
PdfImport & Imports integration
app/controllers/imports_controller.rb, app/models/pdf_import.rb, app/models/import.rb, app/views/imports/_pdf_import.html.erb
PdfImport adapts to statement-backed imports; Import gains account_statement association and assign_account!; ImportsController delegates PDF uploads to PdfImport and enforces permissions; views surface source-statement.
Active Storage authorization & background job
config/initializers/active_storage_authorization.rb, app/jobs/process_pdf_job.rb
Unified Active Storage attachment authorization for Transaction and AccountStatement; ProcessPdfJob resets processing claims and sources filenames via PdfImport.
DB wiring & account/family changes
app/models/account.rb, app/models/family.rb, db/migrate/20260513120000_add_account_statement_to_imports.rb
Account/Family associations for account_statements; account destroy hooks move statements to inbox post-commit; imports table gets account_statement_id FK; schema updated.
Helpers, i18n, settings nav
app/helpers/account_statements_helper.rb, app/helpers/settings_helper.rb, config/locales/**, app/views/settings/_settings_nav.html.erb
Helpers for status badges/coverage labels/balance formatting, settings helper evaluates callable names, locale keys added, and settings nav includes admin-only Statement Vault link.
Devcontainer & compose, Brakeman
.devcontainer/Dockerfile, .devcontainer/docker-compose.yml, config/brakeman.ignore
Devcontainer installs libvips42; selenium node env vars tweaked; removed a Brakeman ignore entry.
Tests & test helpers
test/**/*, test/test_helper.rb
Extensive tests (models, controllers, integration, system, helpers, jobs), uploaded_file and family_guest test helpers, fixture cleanup, and ActiveJob test adjustments.

Sequence Diagram

sequenceDiagram
  participant User as Client (browser)
  participant AC as AccountStatementsController
  participant AS as AccountStatement (model / DB)
  participant ASBlob as ActiveStorage
  participant PI as PdfImport
  participant Job as ProcessPdfJob
  participant VS as VectorStore

  User->>AC: POST /account_statements (upload)
  AC->>AS: AccountStatement.create_from_upload!(...)
  AS->>ASBlob: attach original_file
  AS-->>AC: create result (linked/suggested)
  AC-->>User: redirect + flash

  alt PDF extract requested
    User->>AC: POST /account_statements/:id/extract
    AC->>PI: PdfImport.create_from_statement!(statement)
    PI-->>AC: created import (pending/ready)
    AC->>Job: enqueue ProcessPdfJob.perform_later(pdf_import)
    Job->>PI: download pdf (statement-backed)
    Job->>VS: upload vectors / indexing
    Job-->>PI: update status
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

codex

Suggested reviewers

  • jjmata

Poem

"I hopped in with a tiny sack of files,
Filenames, hashes, and clever little wiles.
I nudged PDFs into a tidy vault — hooray!
Now statements sleep safe until review day. 🐇"

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

@JSONbored JSONbored marked this pull request as ready for review May 13, 2026 11:00
@superagent-security superagent-security Bot added contributor:verified Contributor passed trust analysis. pr:verified PR passed security analysis. labels May 13, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ed35ec36d4

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread app/controllers/account_statements_controller.rb Outdated
Comment thread app/models/pdf_import.rb
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (10)
app/models/pdf_import.rb (3)

64-71: 💤 Low value

Concurrency: account_statement.account_id read outside the statement's lock.

assign_account! opens a transaction, but the comparison account_statement.account_id != account.id reads the in-memory association without locking the statement row. If another process flips the statement linkage concurrently, you could either skip a needed link_to_account! or call it redundantly (also overwriting match_confidence to 1.0). Given the rest of the upload flow is single-user driven and link_to_account! is idempotent, this is mostly cosmetic, but consider account_statement.lock! (or reload) inside the transaction if you want a strict invariant.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/models/pdf_import.rb` around lines 64 - 71, The transaction in
assign_account! reads account_statement.account_id without locking/reloading the
associated AccountStatement, so concurrent changes can cause stale comparisons
and unnecessary or missed link_to_account! calls; inside assign_account! before
comparing, obtain a DB lock or fresh state for the association (e.g., call
account_statement.lock! or account_statement.reload within the transaction) and
then perform the equality check and conditional call to link_to_account! to
ensure the check reflects the latest DB state.

213-218: 💤 Low value

pdf_file_content loads the entire PDF into memory.

account_statement.original_file.download materializes the full blob in RAM. With MAX_FILE_SIZE = 25.megabytes, repeated calls (e.g., process_with_ai then extract_transactions) will allocate ~50 MB per import. Consider caching the download in a memoized ivar or streaming to a Tempfile when the caller needs to pass it to the provider client, to reduce peak memory under concurrent jobs.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/models/pdf_import.rb` around lines 213 - 218, The pdf_file_content method
currently downloads the entire blob each time (via
account_statement.original_file.download and pdf_file.download), causing
repeated large allocations; change pdf_file_content to memoize the downloaded
content (e.g., store result in an instance variable like `@pdf_file_content`) so
subsequent calls reuse the same buffer, or alternatively stream the blob to a
Tempfile when size approaches MAX_FILE_SIZE before returning so callers like
process_with_ai and extract_transactions can pass a file without duplicating
memory; update pdf_file_content (and any callers) to use the memoized value or
Tempfile approach and ensure Tempfile is cleaned up where appropriate.

22-33: 💤 Low value

Reused PDF imports may have a stale account/date_format from the statement's prior state.

create_from_statement! returns statement.latest_reusable_pdf_import whenever one exists, but never reconciles account / date_format against the (possibly updated) statement.account / statement.family.date_format. If a statement was previously linked to one account, generated a non-failed PDF import, then unlinked/relinked to a different account, the reused import would still point at the old account and silently bypass the new linkage. Consider reconciling on reuse (or scoping latest_reusable_pdf_import by account when one is present).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/models/pdf_import.rb` around lines 22 - 33, The method
create_from_statement! currently returns statement.latest_reusable_pdf_import
without checking that the reusable_import's account and date_format still match
the statement's current state; modify create_from_statement! to reconcile before
returning by either (a) scoping/looking up latest_reusable_pdf_import to include
account (e.g. use statement.account when present) and/or date_format
(statement.family.date_format), or (b) if you still load
latest_reusable_pdf_import, compare reusable_import.account and
reusable_import.date_format to statement.account and
statement.family.date_format and only return it when they match (otherwise fall
through to create!); update the logic around latest_reusable_pdf_import and
create_from_statement! accordingly so reused imports can’t carry stale
account/date_format.
app/models/account_statement.rb (4)

395-397: 💤 Low value

normalize_currency doesn't strip whitespace before upcasing.

If currency arrives as " usd " (e.g., copy-paste or trim mishap in form input), upcase.presence yields " USD ", which then fails currency_is_valid with Money::Currency::UnknownCurrencyError. A strip before upcase would tolerate the common leading/trailing whitespace case and let the rest of the validation chain do its job. Minor UX nicety.

♻️ Suggested change
     def normalize_currency
-      self.currency = currency.to_s.upcase.presence if currency.present?
+      self.currency = currency.to_s.strip.upcase.presence if currency.present?
     end
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/models/account_statement.rb` around lines 395 - 397, normalize_currency
currently upcases but doesn't strip whitespace, so values like " usd " become "
USD " and later raise Money::Currency::UnknownCurrencyError in
currency_is_valid; update normalize_currency to strip whitespace before upcase
and presence checking (e.g., use currency.to_s.strip.upcase.presence) so
leading/trailing spaces are removed while preserving the rest of the validation
flow.

84-124: 💤 Low value

Possibly leaking statement failure path: blob is purged but state remains attached.

In the rescue StandardError branch (line 121), purge_original_file(statement) will silently swallow purge errors via the rescue StandardError inside purge_original_file. That's the right call to avoid masking the original error, but consider one small refinement: when statement is nil (the duplicate check raised before the build), purge_original_file(nil) is a no-op — good — but the explicit re-raise of ActiveRecord::RecordNotUnique from the previous rescue block also funnels into the StandardError rescue. That second rescue is unreachable because the raise re-enters the same begin-rescue — actually it's reachable since raise propagates out of the method. The control flow is correct, just dense. Consider folding both rescues into a single ensure-style cleanup for clarity. Non-blocking.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/models/account_statement.rb` around lines 84 - 124, The method
create_from_prepared_upload! currently has two rescue blocks that call
purge_original_file(statement) and re-raise, which is hard to follow; refactor
by combining error handling and moving cleanup into an ensure block: in
create_from_prepared_upload! keep the duplicate check using duplicate_for and
raise DuplicateUploadError as before, remove the separate rescue StandardError
and the duplicate-specific rescue that calls purge_original_file, and instead
add an ensure block that calls purge_original_file(statement) (so cleanup always
runs but relies on purge_original_file to swallow its own errors); if you still
need to map ActiveRecord::RecordNotUnique to DuplicateUploadError, rescue
ActiveRecord::RecordNotUnique => e, compute duplicate = duplicate_for(family,
prepared_upload) and raise DuplicateUploadError, otherwise re-raise e; ensure
the method otherwise re-raises unexpected errors so behavior is preserved.

364-366: 💤 Low value

latest_reusable_pdf_import depends on the association's ordered scope; consider an explicit order.

pdf_imports is declared as has_many :pdf_imports, -> { where(type: "PdfImport").ordered }, so .first returns the latest only as long as PdfImport.ordered (inherited from Import.ordered) keeps a created_at DESC ordering. If anyone tweaks Import.ordered later, this method silently changes meaning. Pinning the order locally (pdf_imports.where.not(status: :failed).order(created_at: :desc).first) would make the intent self-contained.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/models/account_statement.rb` around lines 364 - 366, The method
latest_reusable_pdf_import relies implicitly on the association pdf_imports and
its inherited ordered scope (PdfImport.ordered); make the intent explicit by
applying a local order to guarantee newest-first behavior regardless of changes
to Import.ordered — update latest_reusable_pdf_import to filter out failed
statuses (pdf_imports.where.not(status: :failed)) and then apply
.order(created_at: :desc).first so it deterministically returns the most recent
reusable PdfImport.

256-266: 💤 Low value

unlink! performs two writes per call; consider folding into one save.

unlink! does an update! (which clears account and persists), then calls assign_account_match (which mutates suggested_account/match_confidence in memory), then save!. That's two SQL UPDATEs within the transaction. You could assign_attributes the unlink fields, run assign_account_match, then a single save! to halve the writes. Not a correctness issue, just a minor optimization in a low-traffic path.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/models/account_statement.rb` around lines 256 - 266, The unlink! method
currently calls update! (persisting account:nil, review_status: :unmatched,
match_confidence:nil) and then assign_account_match which mutates
suggested_account/match_confidence in memory and triggers a second save!; change
unlink! to use assign_attributes to set account: nil, review_status: :unmatched,
match_confidence: nil, then call assign_account_match, and finally call save!
once inside the transaction so only a single UPDATE occurs; keep the transaction
block and ensure the same attribute names (account, review_status,
match_confidence, suggested_account) and method names (unlink!,
assign_account_match, save!) are used.
config/initializers/active_storage_authorization.rb (1)

58-78: 💤 Low value

Minor: authorized_attachments returns nil on the nil-blob branch, but the caller asserts .empty?.

authorized_attachments early-returns nil when authorized_blob is nil, while the caller in authorize_protected_attachment does raise ActiveRecord::RecordNotFound if attachments.empty?. Today this is unreachable because the caller already guards return unless authorized_blob first, but if that guard is ever changed/removed, the next call would raise NoMethodError: undefined method 'empty?' for nil. Returning [] (or removing the early return entirely now that the guard exists) would be safer.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@config/initializers/active_storage_authorization.rb` around lines 58 - 78,
The method authorized_attachments currently returns nil when authorized_blob is
nil which can cause a NoMethodError if a caller expects an array (calls
.empty?); change authorized_attachments so it returns an empty array instead of
nil when authorized_blob is nil (or remove the early return) so callers like
authorize_protected_attachment can safely call attachments.empty?; update the
authorized_attachments method (referencing authorized_attachments and
authorized_blob) to ensure it always returns an array.
app/controllers/account_statements_controller.rb (2)

57-66: 💤 Low value

InvalidUploadError collapses distinct failure modes into one generic message.

The prepare_upload! path raises InvalidUploadError for both oversize uploads (declared or streamed beyond MAX_FILE_SIZE), unsupported extension/MIME mismatches, empty content, and PDFs missing the %PDF- header. All of these surface to the user as t("account_statements.create.invalid_file_type"), which can be misleading (e.g., a 30 MB PDF is reported as an invalid file type). Consider either carrying a reason through the exception or splitting into a few error subclasses so the flash can disambiguate.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/controllers/account_statements_controller.rb` around lines 57 - 66, The
InvalidUploadError raised by AccountStatement.prepare_upload! conflates multiple
failure modes (oversize, unsupported extension/MIME, empty file, missing PDF
header) so update the prepare_upload! flow to either (A) add a reason attribute
to AccountStatement::InvalidUploadError (e.g., :oversize, :unsupported_type,
:empty, :invalid_pdf_header) and populate it where MAX_FILE_SIZE, extension/MIME
checks, empty content, and PDF header checks occur, or (B) introduce distinct
subclasses (e.g., AccountStatement::OversizeUploadError,
::UnsupportedFileTypeError, ::EmptyUploadError, ::InvalidPdfHeaderError) and
raise them appropriately; then change the controller rescue block in
account_statements_controller (the loop around AccountStatement.prepare_upload!
/ AccountStatement.create_from_prepared_upload!) to handle these specific errors
(or inspect error.reason) and map each to a distinct translation key/message
instead of always using t("account_statements.create.invalid_file_type").

72-93: 💤 Low value

update raises ActiveRecord::RecordNotFound on a bad account_id, but other paths redirect with a flash.

Current.user.accessible_accounts.find(statement_account_id) (line 75) and the parallel call in target_account raise ActiveRecord::RecordNotFound (→ generic 404) for an unknown/inaccessible account id, while sibling actions (link, unlink, reject, destroy) all degrade gracefully via require_account_permission! and friendly redirects. Consider rescuing or using find_by + require_account_permission! so a stale or hand-crafted account_id returns the same UX as the other actions.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/controllers/account_statements_controller.rb` around lines 72 - 93, The
update action currently uses
Current.user.accessible_accounts.find(statement_account_id) which raises
ActiveRecord::RecordNotFound for an invalid/inaccessible id and yields a 404
instead of the friendly redirect used elsewhere; change this to look up the
target with a non-raising lookup (e.g. accessible_accounts.find_by(id:
statement_account_id)) or nil when not present, and then call
require_account_permission!(target) as currently done so the stale/invalid
statement_account_id path follows the same redirect/flash behavior; also update
the parallel lookup in target_account (or any other method that calls
Current.user.accessible_accounts.find) to use a non-raising lookup and nil-check
before calling require_account_permission!.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/models/account_statement/account_matcher.rb`:
- Around line 13-20: The iteration over visible accounts in AccountMatcher (the
candidates assignment) causes N+1 queries because later calls to
institution_name/institution_domain access account_providers; fix by preloading
the account_providers association before materializing the collection—replace
the current call that fetches accounts with one that
includes(:account_providers) (e.g., use
statement.family.accounts.visible.includes(:account_providers) before to_a), so
methods like confidence_for and the later institution_name/institution_domain
lookups (lines referenced around 56-61 and 70-72) use the preloaded association
and avoid extra queries.

In `@app/models/account.rb`:
- Around line 266-268: There are two definitions of create_from_kraken_account;
keep the delegating implementation that calls
create_from_crypto_exchange_account(kraken_account, family:
kraken_account.kraken_item.family) and remove the duplicate redefinition that
contains the original implementation so the delegating path actually runs.
Ensure only one create_from_kraken_account method remains.

In `@test/test_helper.rb`:
- Around line 109-116: The uploaded_file helper creates a Tempfile and returns a
Rack::Test::UploadedFile but currently drops the local tempfile reference,
allowing GC to delete the file before Rack reads it; update the uploaded_file
method to retain the Tempfile on the returned object (for example, set an
instance variable like `@tempfile` on the Rack::Test::UploadedFile instance or
otherwise attach the tempfile object to it) so the Tempfile remains reachable
until the request is built and the file is read.

---

Nitpick comments:
In `@app/controllers/account_statements_controller.rb`:
- Around line 57-66: The InvalidUploadError raised by
AccountStatement.prepare_upload! conflates multiple failure modes (oversize,
unsupported extension/MIME, empty file, missing PDF header) so update the
prepare_upload! flow to either (A) add a reason attribute to
AccountStatement::InvalidUploadError (e.g., :oversize, :unsupported_type,
:empty, :invalid_pdf_header) and populate it where MAX_FILE_SIZE, extension/MIME
checks, empty content, and PDF header checks occur, or (B) introduce distinct
subclasses (e.g., AccountStatement::OversizeUploadError,
::UnsupportedFileTypeError, ::EmptyUploadError, ::InvalidPdfHeaderError) and
raise them appropriately; then change the controller rescue block in
account_statements_controller (the loop around AccountStatement.prepare_upload!
/ AccountStatement.create_from_prepared_upload!) to handle these specific errors
(or inspect error.reason) and map each to a distinct translation key/message
instead of always using t("account_statements.create.invalid_file_type").
- Around line 72-93: The update action currently uses
Current.user.accessible_accounts.find(statement_account_id) which raises
ActiveRecord::RecordNotFound for an invalid/inaccessible id and yields a 404
instead of the friendly redirect used elsewhere; change this to look up the
target with a non-raising lookup (e.g. accessible_accounts.find_by(id:
statement_account_id)) or nil when not present, and then call
require_account_permission!(target) as currently done so the stale/invalid
statement_account_id path follows the same redirect/flash behavior; also update
the parallel lookup in target_account (or any other method that calls
Current.user.accessible_accounts.find) to use a non-raising lookup and nil-check
before calling require_account_permission!.

In `@app/models/account_statement.rb`:
- Around line 395-397: normalize_currency currently upcases but doesn't strip
whitespace, so values like " usd " become " USD " and later raise
Money::Currency::UnknownCurrencyError in currency_is_valid; update
normalize_currency to strip whitespace before upcase and presence checking
(e.g., use currency.to_s.strip.upcase.presence) so leading/trailing spaces are
removed while preserving the rest of the validation flow.
- Around line 84-124: The method create_from_prepared_upload! currently has two
rescue blocks that call purge_original_file(statement) and re-raise, which is
hard to follow; refactor by combining error handling and moving cleanup into an
ensure block: in create_from_prepared_upload! keep the duplicate check using
duplicate_for and raise DuplicateUploadError as before, remove the separate
rescue StandardError and the duplicate-specific rescue that calls
purge_original_file, and instead add an ensure block that calls
purge_original_file(statement) (so cleanup always runs but relies on
purge_original_file to swallow its own errors); if you still need to map
ActiveRecord::RecordNotUnique to DuplicateUploadError, rescue
ActiveRecord::RecordNotUnique => e, compute duplicate = duplicate_for(family,
prepared_upload) and raise DuplicateUploadError, otherwise re-raise e; ensure
the method otherwise re-raises unexpected errors so behavior is preserved.
- Around line 364-366: The method latest_reusable_pdf_import relies implicitly
on the association pdf_imports and its inherited ordered scope
(PdfImport.ordered); make the intent explicit by applying a local order to
guarantee newest-first behavior regardless of changes to Import.ordered — update
latest_reusable_pdf_import to filter out failed statuses
(pdf_imports.where.not(status: :failed)) and then apply .order(created_at:
:desc).first so it deterministically returns the most recent reusable PdfImport.
- Around line 256-266: The unlink! method currently calls update! (persisting
account:nil, review_status: :unmatched, match_confidence:nil) and then
assign_account_match which mutates suggested_account/match_confidence in memory
and triggers a second save!; change unlink! to use assign_attributes to set
account: nil, review_status: :unmatched, match_confidence: nil, then call
assign_account_match, and finally call save! once inside the transaction so only
a single UPDATE occurs; keep the transaction block and ensure the same attribute
names (account, review_status, match_confidence, suggested_account) and method
names (unlink!, assign_account_match, save!) are used.

In `@app/models/pdf_import.rb`:
- Around line 64-71: The transaction in assign_account! reads
account_statement.account_id without locking/reloading the associated
AccountStatement, so concurrent changes can cause stale comparisons and
unnecessary or missed link_to_account! calls; inside assign_account! before
comparing, obtain a DB lock or fresh state for the association (e.g., call
account_statement.lock! or account_statement.reload within the transaction) and
then perform the equality check and conditional call to link_to_account! to
ensure the check reflects the latest DB state.
- Around line 213-218: The pdf_file_content method currently downloads the
entire blob each time (via account_statement.original_file.download and
pdf_file.download), causing repeated large allocations; change pdf_file_content
to memoize the downloaded content (e.g., store result in an instance variable
like `@pdf_file_content`) so subsequent calls reuse the same buffer, or
alternatively stream the blob to a Tempfile when size approaches MAX_FILE_SIZE
before returning so callers like process_with_ai and extract_transactions can
pass a file without duplicating memory; update pdf_file_content (and any
callers) to use the memoized value or Tempfile approach and ensure Tempfile is
cleaned up where appropriate.
- Around line 22-33: The method create_from_statement! currently returns
statement.latest_reusable_pdf_import without checking that the reusable_import's
account and date_format still match the statement's current state; modify
create_from_statement! to reconcile before returning by either (a)
scoping/looking up latest_reusable_pdf_import to include account (e.g. use
statement.account when present) and/or date_format
(statement.family.date_format), or (b) if you still load
latest_reusable_pdf_import, compare reusable_import.account and
reusable_import.date_format to statement.account and
statement.family.date_format and only return it when they match (otherwise fall
through to create!); update the logic around latest_reusable_pdf_import and
create_from_statement! accordingly so reused imports can’t carry stale
account/date_format.

In `@config/initializers/active_storage_authorization.rb`:
- Around line 58-78: The method authorized_attachments currently returns nil
when authorized_blob is nil which can cause a NoMethodError if a caller expects
an array (calls .empty?); change authorized_attachments so it returns an empty
array instead of nil when authorized_blob is nil (or remove the early return) so
callers like authorize_protected_attachment can safely call attachments.empty?;
update the authorized_attachments method (referencing authorized_attachments and
authorized_blob) to ensure it always returns an array.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6f7bde12-db04-4b58-a8db-8ce3e98f1c9b

📥 Commits

Reviewing files that changed from the base of the PR and between ce5d7dd and ed35ec3.

📒 Files selected for processing (49)
  • .devcontainer/Dockerfile
  • .devcontainer/docker-compose.yml
  • app/components/UI/account_page.html.erb
  • app/components/UI/account_page.rb
  • app/controllers/account_statements_controller.rb
  • app/controllers/accounts_controller.rb
  • app/controllers/imports_controller.rb
  • app/helpers/account_statements_helper.rb
  • app/helpers/settings_helper.rb
  • app/jobs/process_pdf_job.rb
  • app/models/account.rb
  • app/models/account_statement.rb
  • app/models/account_statement/account_matcher.rb
  • app/models/account_statement/coverage.rb
  • app/models/account_statement/metadata_detector.rb
  • app/models/family.rb
  • app/models/import.rb
  • app/models/pdf_import.rb
  • app/views/account_statements/index.html.erb
  • app/views/account_statements/show.html.erb
  • app/views/accounts/show.html.erb
  • app/views/accounts/show/_menu.html.erb
  • app/views/accounts/show/_statements.html.erb
  • app/views/accounts/show/_statements_frame.html.erb
  • app/views/imports/_pdf_import.html.erb
  • app/views/settings/_settings_nav.html.erb
  • config/brakeman.ignore
  • config/initializers/active_storage_authorization.rb
  • config/locales/models/account_statement/en.yml
  • config/locales/views/account_statements/en.yml
  • config/locales/views/accounts/en.yml
  • config/locales/views/imports/en.yml
  • config/locales/views/settings/en.yml
  • config/locales/views/settings/fr.yml
  • config/routes.rb
  • db/migrate/20260505120000_create_account_statements.rb
  • db/migrate/20260513120000_add_account_statement_to_imports.rb
  • db/schema.rb
  • test/controllers/account_statements_controller_test.rb
  • test/controllers/accounts_controller_test.rb
  • test/controllers/imports_controller_test.rb
  • test/fixtures/account_statements.yml
  • test/helpers/account_statements_helper_test.rb
  • test/integration/active_storage_authorization_test.rb
  • test/models/account_statement_test.rb
  • test/models/account_test.rb
  • test/models/pdf_import_test.rb
  • test/system/settings_test.rb
  • test/test_helper.rb
💤 Files with no reviewable changes (1)
  • config/brakeman.ignore

Comment thread app/models/account_statement/account_matcher.rb Outdated
Comment thread app/models/account.rb
Comment thread test/test_helper.rb
@JSONbored JSONbored force-pushed the codex/refactor-pdf-import-account-statements branch from fe94ee7 to af8a8b1 Compare May 13, 2026 11:28
@JSONbored JSONbored marked this pull request as draft May 13, 2026 11:29
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
test/models/pdf_import_test.rb (1)

198-209: ⚡ Quick win

Brittle date format selection logic.

Lines 201-202 assume multiple date formats exist in Family::DATE_FORMATS. If only one format is defined, (Family::DATE_FORMATS.map(&:last) - [...]).first returns nil, which may not properly test the "requires current date format" condition.

🛡️ Proposed fix to ensure robustness
-    alternate_date_format = (Family::DATE_FORMATS.map(&:last) - [ statement.family.date_format ]).first
-    stale_import.update!(account: nil, date_format: alternate_date_format)
+    all_formats = Family::DATE_FORMATS.map(&:last)
+    alternate_date_format = all_formats.find { |fmt| fmt != statement.family.date_format } || "MM/DD/YYYY"
+    stale_import.update!(account: nil, date_format: alternate_date_format)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/models/pdf_import_test.rb` around lines 198 - 209, The test's
alternate_date_format selection can return nil when Family::DATE_FORMATS only
contains the current family format, making the test brittle; update the test
"statement backed import reuse requires current account and date format" to
ensure alternate_date_format is non-nil by computing formats =
Family::DATE_FORMATS.map(&:last) then attempting alternate = (formats -
[statement.family.date_format]).first and, if that is nil, falling back to a
safe different string (or a known alternate format) so
stale_import.update!(account: nil, date_format: alternate_date_format) always
receives a different, non-nil date_format than statement.family.date_format and
the assertions around PdfImport.create_from_statement! remain valid.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/views/accounts/show/_menu.html.erb`:
- Line 9: Wrap the Statements menu item rendering in the same permission guard
used for Edit/Import/Delete: check that permission.in?([:owner, :full_control])
and AccountStatement.statement_manager?(Current.user) before calling
menu.with_item(variant: "link", text: t(".statements"), href:
account_path(account, tab: "statements"), icon: "archive"); this mirrors the
`@can_manage_statements/controller-level` restriction and keeps the UX consistent
with other sensitive actions.

---

Nitpick comments:
In `@test/models/pdf_import_test.rb`:
- Around line 198-209: The test's alternate_date_format selection can return nil
when Family::DATE_FORMATS only contains the current family format, making the
test brittle; update the test "statement backed import reuse requires current
account and date format" to ensure alternate_date_format is non-nil by computing
formats = Family::DATE_FORMATS.map(&:last) then attempting alternate = (formats
- [statement.family.date_format]).first and, if that is nil, falling back to a
safe different string (or a known alternate format) so
stale_import.update!(account: nil, date_format: alternate_date_format) always
receives a different, non-nil date_format than statement.family.date_format and
the assertions around PdfImport.create_from_statement! remain valid.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 71dc992d-966f-4635-9954-1dd2b1d40da4

📥 Commits

Reviewing files that changed from the base of the PR and between ed35ec3 and af8a8b1.

📒 Files selected for processing (49)
  • .devcontainer/Dockerfile
  • .devcontainer/docker-compose.yml
  • app/components/UI/account_page.html.erb
  • app/components/UI/account_page.rb
  • app/controllers/account_statements_controller.rb
  • app/controllers/accounts_controller.rb
  • app/controllers/imports_controller.rb
  • app/helpers/account_statements_helper.rb
  • app/helpers/settings_helper.rb
  • app/jobs/process_pdf_job.rb
  • app/models/account.rb
  • app/models/account_statement.rb
  • app/models/account_statement/account_matcher.rb
  • app/models/account_statement/coverage.rb
  • app/models/account_statement/metadata_detector.rb
  • app/models/family.rb
  • app/models/import.rb
  • app/models/pdf_import.rb
  • app/views/account_statements/index.html.erb
  • app/views/account_statements/show.html.erb
  • app/views/accounts/show.html.erb
  • app/views/accounts/show/_menu.html.erb
  • app/views/accounts/show/_statements.html.erb
  • app/views/accounts/show/_statements_frame.html.erb
  • app/views/imports/_pdf_import.html.erb
  • app/views/settings/_settings_nav.html.erb
  • config/brakeman.ignore
  • config/initializers/active_storage_authorization.rb
  • config/locales/models/account_statement/en.yml
  • config/locales/views/account_statements/en.yml
  • config/locales/views/accounts/en.yml
  • config/locales/views/imports/en.yml
  • config/locales/views/settings/en.yml
  • config/locales/views/settings/fr.yml
  • config/routes.rb
  • db/migrate/20260505120000_create_account_statements.rb
  • db/migrate/20260513120000_add_account_statement_to_imports.rb
  • db/schema.rb
  • test/controllers/account_statements_controller_test.rb
  • test/controllers/accounts_controller_test.rb
  • test/controllers/imports_controller_test.rb
  • test/fixtures/account_statements.yml
  • test/helpers/account_statements_helper_test.rb
  • test/integration/active_storage_authorization_test.rb
  • test/models/account_statement_test.rb
  • test/models/account_test.rb
  • test/models/pdf_import_test.rb
  • test/system/settings_test.rb
  • test/test_helper.rb
💤 Files with no reviewable changes (1)
  • config/brakeman.ignore
✅ Files skipped from review due to trivial changes (6)
  • app/components/UI/account_page.html.erb
  • .devcontainer/docker-compose.yml
  • test/fixtures/account_statements.yml
  • config/locales/views/settings/en.yml
  • config/locales/views/account_statements/en.yml
  • config/locales/views/settings/fr.yml
🚧 Files skipped from review as they are similar to previous changes (37)
  • config/locales/views/imports/en.yml
  • db/migrate/20260513120000_add_account_statement_to_imports.rb
  • test/helpers/account_statements_helper_test.rb
  • config/locales/views/accounts/en.yml
  • app/views/accounts/show/_statements_frame.html.erb
  • config/locales/models/account_statement/en.yml
  • app/views/accounts/show.html.erb
  • app/models/family.rb
  • config/routes.rb
  • test/test_helper.rb
  • app/models/import.rb
  • app/views/imports/_pdf_import.html.erb
  • app/views/accounts/show/_statements.html.erb
  • app/helpers/settings_helper.rb
  • test/controllers/accounts_controller_test.rb
  • app/jobs/process_pdf_job.rb
  • app/components/UI/account_page.rb
  • app/helpers/account_statements_helper.rb
  • app/models/account_statement/account_matcher.rb
  • app/views/settings/_settings_nav.html.erb
  • test/integration/active_storage_authorization_test.rb
  • app/views/account_statements/index.html.erb
  • test/models/account_test.rb
  • test/system/settings_test.rb
  • db/schema.rb
  • app/controllers/imports_controller.rb
  • test/controllers/imports_controller_test.rb
  • db/migrate/20260505120000_create_account_statements.rb
  • app/controllers/accounts_controller.rb
  • app/models/pdf_import.rb
  • config/initializers/active_storage_authorization.rb
  • app/views/account_statements/show.html.erb
  • test/controllers/account_statements_controller_test.rb
  • test/models/account_statement_test.rb
  • app/controllers/account_statements_controller.rb
  • app/models/account_statement/metadata_detector.rb
  • app/models/account_statement.rb

Comment thread app/views/accounts/show/_menu.html.erb Outdated
@JSONbored JSONbored marked this pull request as ready for review May 13, 2026 12:16
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: aadf338c5d

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread app/controllers/imports_controller.rb Outdated
@JSONbored JSONbored force-pushed the codex/refactor-pdf-import-account-statements branch from aadf338 to ab7fc94 Compare May 13, 2026 12:33
@JSONbored JSONbored marked this pull request as draft May 13, 2026 12:34
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
test/system/settings_test.rb (2)

29-30: ⚡ Quick win

Use i18n for the new “Statement Vault” label in test expectations.

The newly added label is hard-coded in setup/assertions; using I18n.t(...) (as done at Line 78) keeps tests aligned with localization updates.

As per coding guidelines: "User-facing strings in views and components must use t() for translation" and i18n keys should be organized hierarchically by feature.

Also applies to: 98-98

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/system/settings_test.rb` around lines 29 - 30, Replace the hard-coded
"Statement Vault" label used when inserting into `@settings_links` with a
translation lookup (use I18n.t or the t() helper) so tests mirror view
localization; update the insertion call that uses merchants_index,
family_merchants_path and account_statements_path to use a namespaced i18n key
(e.g. settings.statement_vault) and adjust the corresponding assertion(s)
(including the other occurrence around line 98) to expect I18n.t("...") instead
of the literal string.

45-45: ⚡ Quick win

Scope settings-nav clicks to avoid relying on match: :first.

Line 45 uses click_link name, match: :first which is brittle when duplicate link text exists elsewhere on the page. Instead, scope the click to the settings nav container using within, following the pattern already established in the open_settings_from_sidebar helper. This ensures the correct link is clicked regardless of other page elements.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/system/settings_test.rb` at line 45, The test currently calls click_link
name, match: :first which is brittle; update the test to scope the click to the
settings navigation container by using within (like the existing
open_settings_from_sidebar helper) targeting the settings nav element and then
calling click_link name without match: :first so the correct link inside the
settings nav is clicked even if duplicate text exists elsewhere.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/models/account_statement/coverage.rb`:
- Around line 73-74: The start-of-coverage queries are including statements that
only have period_start_on and no period_end_on, which inflates
available_years/expected_start_month; update both queries used to compute
coverage start (the call chains starting with account.account_statements and
account.family.account_statements.unmatched.where(suggested_account: account))
to exclude partial-period records by requiring a non-null period_end_on (e.g.
add .where.not(period_end_on: nil) or combine into .where.not(period_start_on:
nil, period_end_on: nil)) so only statements with both period_start_on and
period_end_on are considered.

---

Nitpick comments:
In `@test/system/settings_test.rb`:
- Around line 29-30: Replace the hard-coded "Statement Vault" label used when
inserting into `@settings_links` with a translation lookup (use I18n.t or the t()
helper) so tests mirror view localization; update the insertion call that uses
merchants_index, family_merchants_path and account_statements_path to use a
namespaced i18n key (e.g. settings.statement_vault) and adjust the corresponding
assertion(s) (including the other occurrence around line 98) to expect
I18n.t("...") instead of the literal string.
- Line 45: The test currently calls click_link name, match: :first which is
brittle; update the test to scope the click to the settings navigation container
by using within (like the existing open_settings_from_sidebar helper) targeting
the settings nav element and then calling click_link name without match: :first
so the correct link inside the settings nav is clicked even if duplicate text
exists elsewhere.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 79a8c199-710f-4f79-9323-b2bb69aa0ef0

📥 Commits

Reviewing files that changed from the base of the PR and between af8a8b1 and ab7fc94.

📒 Files selected for processing (49)
  • .devcontainer/Dockerfile
  • .devcontainer/docker-compose.yml
  • app/components/UI/account_page.html.erb
  • app/components/UI/account_page.rb
  • app/controllers/account_statements_controller.rb
  • app/controllers/accounts_controller.rb
  • app/controllers/imports_controller.rb
  • app/helpers/account_statements_helper.rb
  • app/helpers/settings_helper.rb
  • app/jobs/process_pdf_job.rb
  • app/models/account.rb
  • app/models/account_statement.rb
  • app/models/account_statement/account_matcher.rb
  • app/models/account_statement/coverage.rb
  • app/models/account_statement/metadata_detector.rb
  • app/models/family.rb
  • app/models/import.rb
  • app/models/pdf_import.rb
  • app/views/account_statements/index.html.erb
  • app/views/account_statements/show.html.erb
  • app/views/accounts/show.html.erb
  • app/views/accounts/show/_menu.html.erb
  • app/views/accounts/show/_statements.html.erb
  • app/views/accounts/show/_statements_frame.html.erb
  • app/views/imports/_pdf_import.html.erb
  • app/views/settings/_settings_nav.html.erb
  • config/brakeman.ignore
  • config/initializers/active_storage_authorization.rb
  • config/locales/models/account_statement/en.yml
  • config/locales/views/account_statements/en.yml
  • config/locales/views/accounts/en.yml
  • config/locales/views/imports/en.yml
  • config/locales/views/settings/en.yml
  • config/locales/views/settings/fr.yml
  • config/routes.rb
  • db/migrate/20260505120000_create_account_statements.rb
  • db/migrate/20260513120000_add_account_statement_to_imports.rb
  • db/schema.rb
  • test/controllers/account_statements_controller_test.rb
  • test/controllers/accounts_controller_test.rb
  • test/controllers/imports_controller_test.rb
  • test/fixtures/account_statements.yml
  • test/helpers/account_statements_helper_test.rb
  • test/integration/active_storage_authorization_test.rb
  • test/models/account_statement_test.rb
  • test/models/account_test.rb
  • test/models/pdf_import_test.rb
  • test/system/settings_test.rb
  • test/test_helper.rb
💤 Files with no reviewable changes (1)
  • config/brakeman.ignore
✅ Files skipped from review due to trivial changes (8)
  • config/locales/views/settings/fr.yml
  • test/fixtures/account_statements.yml
  • test/helpers/account_statements_helper_test.rb
  • config/locales/views/accounts/en.yml
  • config/locales/models/account_statement/en.yml
  • config/locales/views/imports/en.yml
  • config/locales/views/account_statements/en.yml
  • db/schema.rb
🚧 Files skipped from review as they are similar to previous changes (37)
  • app/views/settings/_settings_nav.html.erb
  • app/views/accounts/show/_menu.html.erb
  • .devcontainer/Dockerfile
  • app/models/family.rb
  • app/components/UI/account_page.html.erb
  • app/views/accounts/show.html.erb
  • app/jobs/process_pdf_job.rb
  • config/locales/views/settings/en.yml
  • test/test_helper.rb
  • db/migrate/20260513120000_add_account_statement_to_imports.rb
  • app/helpers/settings_helper.rb
  • test/models/account_test.rb
  • app/controllers/accounts_controller.rb
  • config/routes.rb
  • app/views/accounts/show/_statements_frame.html.erb
  • .devcontainer/docker-compose.yml
  • app/models/import.rb
  • app/views/imports/_pdf_import.html.erb
  • test/models/pdf_import_test.rb
  • test/controllers/imports_controller_test.rb
  • app/views/account_statements/show.html.erb
  • app/models/account.rb
  • app/views/accounts/show/_statements.html.erb
  • test/controllers/accounts_controller_test.rb
  • config/initializers/active_storage_authorization.rb
  • app/models/account_statement/account_matcher.rb
  • test/controllers/account_statements_controller_test.rb
  • db/migrate/20260505120000_create_account_statements.rb
  • test/models/account_statement_test.rb
  • app/views/account_statements/index.html.erb
  • app/models/pdf_import.rb
  • app/controllers/imports_controller.rb
  • app/components/UI/account_page.rb
  • app/controllers/account_statements_controller.rb
  • test/integration/active_storage_authorization_test.rb
  • app/models/account_statement/metadata_detector.rb
  • app/models/account_statement.rb

Comment thread app/models/account_statement/coverage.rb Outdated
@JSONbored JSONbored marked this pull request as ready for review May 13, 2026 12:50
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ab7fc946af

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread app/controllers/imports_controller.rb
@JSONbored JSONbored marked this pull request as draft May 13, 2026 13:10
@JSONbored JSONbored marked this pull request as ready for review May 13, 2026 13:15
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
app/controllers/account_statements_controller.rb (1)

33-33: ⚡ Quick win

Use the dedicated helper method for selecting the latest PDF import.

Line 33 uses @statement.pdf_imports.first, which returns the first import from the ordered association. While the association already applies explicit ordering via the .ordered scope, the dedicated latest_reusable_pdf_import helper method should be used instead—it not only orders correctly but also filters out failed imports, ensuring only valid imports are selected.

Suggested fix
- `@latest_pdf_import` = `@statement.pdf_imports.first` if `@statement.pdf`?
+ `@latest_pdf_import` = `@statement.latest_reusable_pdf_import` if `@statement.pdf`?
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/controllers/account_statements_controller.rb` at line 33, Replace the
direct association access that sets `@latest_pdf_import` using
`@statement.pdf_imports.first` with the dedicated helper that picks a valid
import: call latest_reusable_pdf_import on the statement (e.g.
`@latest_pdf_import` = `@statement.latest_reusable_pdf_import`) and keep the
existing guard (`@statement.pdf`?) as needed so only non-failed, correctly ordered
PDF imports are selected.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/models/pdf_import.rb`:
- Around line 94-104: The code sets status to :importing inside the with_lock
block in PdfImport but calls ProcessPdfJob.perform_later(self) afterwards, which
can raise and leave the record stuck; wrap the perform_later call in a
begin-rescue that on any enqueue exception re-acquires the record (reload and
with_lock) and updates status back to :pending (use update! to persist and fail
loudly in tests), and ensure the method returns false on failure; additionally,
update ProcessPdfJob (its perform/handle method) to handle its early-return
guards by acquiring the PdfImport record (with_lock) and setting status back to
:pending when the job decides not to proceed so the record can be retried or
re-queued.

---

Nitpick comments:
In `@app/controllers/account_statements_controller.rb`:
- Line 33: Replace the direct association access that sets `@latest_pdf_import`
using `@statement.pdf_imports.first` with the dedicated helper that picks a valid
import: call latest_reusable_pdf_import on the statement (e.g.
`@latest_pdf_import` = `@statement.latest_reusable_pdf_import`) and keep the
existing guard (`@statement.pdf`?) as needed so only non-failed, correctly ordered
PDF imports are selected.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c7c1e53b-2250-44fa-8314-01a9f097070c

📥 Commits

Reviewing files that changed from the base of the PR and between ab7fc94 and 66173ae.

📒 Files selected for processing (9)
  • app/controllers/account_statements_controller.rb
  • app/controllers/imports_controller.rb
  • app/models/account_statement/coverage.rb
  • app/models/pdf_import.rb
  • app/views/settings/_settings_nav.html.erb
  • test/controllers/imports_controller_test.rb
  • test/models/account_statement_test.rb
  • test/models/pdf_import_test.rb
  • test/system/settings_test.rb
🚧 Files skipped from review as they are similar to previous changes (5)
  • app/views/settings/_settings_nav.html.erb
  • test/system/settings_test.rb
  • test/models/pdf_import_test.rb
  • app/controllers/imports_controller.rb
  • test/models/account_statement_test.rb

Comment thread app/models/pdf_import.rb Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/controllers/account_statements_controller.rb`:
- Around line 42-52: The create action can double-render when target_account
performs a redirect (e.g. invalid account_id) and then the empty-files branch
also redirects; fix by short-circuiting after calling target_account by checking
performed? immediately (i.e. after the call to target_account) or by moving the
files.empty? check below that performed? guard so that if target_account already
redirected we return without issuing the second redirect; update the create
method (references: create, target_account, statement_upload_params,
account_statements_path, require_account_permission!, performed?) to ensure no
second redirect is executed.

In `@app/jobs/process_pdf_job.rb`:
- Line 8: The reset_processing_claim call is currently gated only on
ai_processed?/importing? and can reopen an import while the original worker
still owns the claim; change the guard to release the processing claim only when
the claim is stale/expired or not owned by the current worker. Specifically,
update the early-return condition around reset_processing_claim so it checks
claim ownership/staleness (e.g., a helper like processing_claim_stale? or
comparing processing_claim.updated_at + TTL < Time.current or
processing_claim.owner_id != current_worker_id) before invoking
reset_processing_claim for pdf_import, and apply the same ownership/staleness
check to the similar block at lines 89-92.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9a52404c-8317-4081-af87-318702050ce1

📥 Commits

Reviewing files that changed from the base of the PR and between 66173ae and b77dd26.

📒 Files selected for processing (6)
  • app/controllers/account_statements_controller.rb
  • app/jobs/process_pdf_job.rb
  • app/models/pdf_import.rb
  • test/controllers/account_statements_controller_test.rb
  • test/jobs/process_pdf_job_test.rb
  • test/models/pdf_import_test.rb
🚧 Files skipped from review as they are similar to previous changes (3)
  • app/models/pdf_import.rb
  • test/models/pdf_import_test.rb
  • test/controllers/account_statements_controller_test.rb

Comment thread app/controllers/account_statements_controller.rb Outdated
Comment thread app/jobs/process_pdf_job.rb
JSONbored added 10 commits May 13, 2026 17:56
Add web-only statement uploads, account linking, duplicate detection, and per-account coverage/reconciliation checks without mutating transactions. Extend ActiveStorage authorization and targeted tests for family/account scoping.
Preserve linked statement records when an account is deleted by moving them back to the unmatched inbox, then expand coverage for upload validation, sanitized parser metadata, unavailable reconciliation, and missing-month coverage.
Address review and security findings in the statement vault by preserving sanitized parser metadata, failing closed on orphaned statement blobs, avoiding account_id mass assignment permits, and adding regression coverage for link/delete edge cases.
Prioritize SHA-256 duplicate detection while preserving MD5 fallback for legacy rows.

Remove free-form account notes from statement matching, document direct account-destroy unlinking, and add year-selectable historical coverage with muted out-of-range months.
Clarify legacy MD5 checksum use, whitelist statement balance helper dispatch, and preserve sanitized parser metadata.

Hide statement management controls from read-only viewers while keeping server-side authorization unchanged.
Allow the changelog provider lookup in the self-hosting settings system test, include Statement Vault in settings navigation coverage, and align the feature title casing. Update the devcontainer so ActiveStorage and parallel system tests can run in the documented environment.
Place Statement Vault with account settings instead of between Imports and Exports. Keep settings footer ordering and system navigation coverage aligned, including the non-admin visibility guard.
Resolve CodeRabbit review feedback for statement upload validation, duplicate race handling, account statement matching semantics, metadata detection, ActiveStorage authorization tests, and small UI/style cleanups.
JSONbored added 17 commits May 13, 2026 17:56
Squash the statement vault migration hardening into the feature migration, tighten Active Storage authorization edge cases, bound CSV metadata detection, and add real PDF fixture coverage for stored statements.

Validation: targeted statement/auth/controller/provider tests, full Rails suite, system tests, RuboCop, Biome, Brakeman, Zeitwerk, importmap audit, npm audit, ERB lint, CodeRabbit, and Codex Security all passed locally.
Move statement unlinking to after account destroy commit, keep Kraken account creation on the shared crypto helper, and add statement metadata length limits with DB checks.

Validation: fresh devcontainer with fresh DB via db:prepare, focused account/statement/Kraken/Binance tests, RuboCop, Brakeman, Zeitwerk, git diff --check, CodeRabbit, and Codex Security passed before commit.
Move statement tab data setup out of the ERB partial, harden reconciliation labels and coverage initialization, and tighten statement schema constraints.

Validation: CodeRabbit and Codex Security reviewed the current PR diff; Rails focused tests, full Rails tests, system tests, RuboCop, Brakeman, Zeitwerk, ERB lint, npm lint, importmap audit, npm audit, and git diff --check passed.
Make AccountStatement the canonical uploaded statement record for PDF imports while keeping PdfImport as the downstream extraction and review workflow. Adds statement-backed import reuse, statement-origin extraction, canonical statement permission checks, and regression coverage for the bridge.
@jjmata jjmata force-pushed the codex/refactor-pdf-import-account-statements branch from 01839f2 to ffcd9a9 Compare May 13, 2026 15:56
@jjmata
Copy link
Copy Markdown
Collaborator

jjmata commented May 13, 2026

Rebasing so you get the "quicker CI test" PR work I landed today @JSONbored ... that shold fix your tests/go green.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

contributor:verified Contributor passed trust analysis. pr:verified PR passed security analysis.

Development

Successfully merging this pull request may close these issues.

Refactor PdfImport to build on AccountStatement as the canonical statement upload model

2 participants