perf(cms): per-page video sync + timestamp fix + keyword scope fix #564
Open
Same optimization patterns from sync-video-variants.ts applied to sync-videos.ts:

- Per-page upsert for videos, images, subtitles, study questions, and bible citations (no more collect-all-then-upsert)
- Bulk origins/editions via bulkUpsertByCoreId instead of one-at-a-time Strapi document service calls
- Bulk bible books upsert
- Prefetch next page while upserting current
- Incremental videoDocMap accumulated per-page instead of full table scan
- Timing logs for performance monitoring

Keywords and parent-child linking remain post-loop (need all videos).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
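The per-page pattern described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in sync-videos.ts: `fetchPage`, `upsertPage`, `syncVideosPerPage`, and the `CoreVideo` shape are hypothetical names, and prefetching is left out to keep the loop easy to follow.

```typescript
// Hypothetical shapes: the real Core API and Strapi types are not shown in this PR.
type CoreVideo = { id: string; title: string };
type FetchPage = (offset: number, limit: number) => Promise<CoreVideo[]>;
// Upserts one page and returns core id -> DB row id for the rows it touched.
type UpsertPage = (videos: CoreVideo[]) => Promise<Map<string, number>>;

async function syncVideosPerPage(
  fetchPage: FetchPage,
  upsertPage: UpsertPage,
  pageSize: number,
): Promise<Map<string, number>> {
  // Accumulated page by page, so no full table scan is needed afterwards.
  const videoDocMap = new Map<string, number>();
  let offset = 0;
  while (true) {
    const started = Date.now();
    const page = await fetchPage(offset, pageSize);
    if (page.length === 0) break;

    // Upsert this page immediately instead of collecting every page first.
    const upserted = await upsertPage(page);
    upserted.forEach((rowId, coreId) => videoDocMap.set(coreId, rowId));

    // Timing log for performance monitoring.
    console.log(`offset=${offset} size=${page.length} took=${Date.now() - started}ms`);

    if (page.length < pageSize) break; // a short page means this was the last one
    offset += pageSize;
  }
  return videoDocMap;
}

// In-memory stand-ins for the API and the database.
const source: CoreVideo[] = Array.from({ length: 5 }, (_, i) => ({
  id: `v${i}`,
  title: `Video ${i}`,
}));
const fakeFetch: FetchPage = async (offset, limit) => source.slice(offset, offset + limit);
const fakeUpsert: UpsertPage = async (videos) =>
  new Map(videos.map((v) => [v.id, 100 + Number(v.id.slice(1))] as [string, number]));

syncVideosPerPage(fakeFetch, fakeUpsert, 2).then((map) =>
  console.log(`synced ${map.size} videos`),
);
```

Work that needs the complete set of videos, such as keywords and parent-child linking, would still run after the loop using the accumulated map.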
Key finding: upsertByCoreId (Strapi document service) is ~100x slower on Railway than local due to network latency between app and DB containers. bulkUpsertByCoreId (raw knex SQL) is fast everywhere.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
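A toy model can make the latency argument concrete. The sketch below only counts database round trips; `CountingDb`, `upsertOneAtATime`, and `upsertInChunks` are hypothetical stand-ins, not the real upsertByCoreId or bulkUpsertByCoreId, and it assumes each per-row call costs at least one round trip. The chunk size of 500 is an arbitrary choice for illustration.

```typescript
type Row = { coreId: string; title: string };

// Hypothetical stand-in for a DB connection: every call is one network round trip.
class CountingDb {
  roundTrips = 0;
  rows = new Map<string, Row>();

  upsertMany(batch: Row[]): void {
    this.roundTrips += 1;
    batch.forEach((row) => this.rows.set(row.coreId, row));
  }
}

// One call per row: round trips grow linearly with the number of rows.
function upsertOneAtATime(db: CountingDb, rows: Row[]): void {
  rows.forEach((row) => db.upsertMany([row]));
}

// One call per chunk: round trips grow with rows / chunkSize.
function upsertInChunks(db: CountingDb, rows: Row[], chunkSize: number): void {
  for (let i = 0; i < rows.length; i += chunkSize) {
    db.upsertMany(rows.slice(i, i + chunkSize));
  }
}

const rows: Row[] = Array.from({ length: 1056 }, (_, i) => ({
  coreId: `core-${i}`,
  title: `Row ${i}`,
}));

const perRowDb = new CountingDb();
upsertOneAtATime(perRowDb, rows);

const bulkDb = new CountingDb();
upsertInChunks(bulkDb, rows, 500);

console.log(perRowDb.roundTrips, bulkDb.roundTrips); // 1056 3
```

With a near-zero local round trip the two approaches look similar, but once each round trip crosses a network between containers, the per-row count dominates the total time.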
…ering

knex/pg returns timestamptz as a JS Date object. When passed as-is to the Core API's updatedAt filter, it gets toString()'d to "Mon Mar 30 2026 01:20:21 GMT+0000", which the gateway can't parse, causing it to return ALL records instead of filtering by date.

Fix: convert to ISO string in getLastSyncTime() so the filter receives "2026-03-30T01:20:21.482Z", which the gateway parses correctly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
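The difference between the two string forms is easy to reproduce. The helper below is a minimal sketch of the conversion, assuming the watermark arrives as a Date, a string, or null; `toIsoWatermark` is a hypothetical name and is not the actual body of getLastSyncTime().

```typescript
// Hypothetical helper: normalise a watermark read from the DB to ISO 8601
// before it is placed in an API filter.
function toIsoWatermark(value: Date | string | null): string | null {
  if (value === null) return null; // no previous sync recorded
  const date = value instanceof Date ? value : new Date(value);
  if (Number.isNaN(date.getTime())) {
    throw new Error(`Invalid sync watermark: ${String(value)}`);
  }
  return date.toISOString();
}

// A Date like the one a driver might return for a timestamptz column.
const fromDb = new Date(Date.UTC(2026, 2, 30, 1, 20, 21, 482));

// Implicit stringification gives a human-readable form whose exact text
// depends on the process time zone; it is not ISO 8601.
console.log(String(fromDb));

// Explicit conversion is always UTC and machine-parseable.
const since = toIsoWatermark(fromDb);
console.log(since); // 2026-03-30T01:20:21.482Z
```

Throwing on an invalid date, rather than passing it through, is a design choice in this sketch so that a bad watermark fails loudly instead of silently widening the filter.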
…dead code

Review found P0: keyword link deletion used ALL video row IDs from the DB, not just the videos in the current sync batch. During incremental sync this would strip keyword links from every non-updated video.

Fix: build syncedVideoRowIds from videoKeywordLinks.keys() (only the videos being synced) instead of videoIdMap.values() (all videos in DB).

Also removed unused seenStudyQuestionIds and seenCitationIds sets.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
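The effect of the deletion scope can be shown with in-memory data. This is a sketch under assumed shapes: videoIdMap is taken to map core ids to row ids for every video in the DB, and videoKeywordLinks to map row ids to keyword row ids for the synced videos only. `Link` and `replaceKeywordLinks` are hypothetical and stand in for the real delete-then-insert SQL.

```typescript
type Link = { videoRowId: number; keywordRowId: number };

// Delete existing links for every video row id in `scope`, then add the fresh links.
function replaceKeywordLinks(
  existing: Link[],
  scope: number[],
  fresh: Map<number, number[]>,
): Link[] {
  const scoped = new Set(scope);
  const kept = existing.filter((link) => !scoped.has(link.videoRowId));
  const added: Link[] = [];
  fresh.forEach((keywordRowIds, videoRowId) => {
    keywordRowIds.forEach((keywordRowId) => added.push({ videoRowId, keywordRowId }));
  });
  return kept.concat(added);
}

// Three videos exist in the DB; only row 2 is part of this incremental sync.
const videoIdMap = new Map<string, number>([
  ["core-a", 1],
  ["core-b", 2],
  ["core-c", 3],
]);
const videoKeywordLinks = new Map<number, number[]>([[2, [10, 11]]]);
const existingLinks: Link[] = [
  { videoRowId: 1, keywordRowId: 10 },
  { videoRowId: 2, keywordRowId: 12 },
  { videoRowId: 3, keywordRowId: 11 },
];

// Before the fix: scope is every video in the DB, so rows 1 and 3 lose their links.
const allVideoRowIds = Array.from(videoIdMap.values());
const afterBuggy = replaceKeywordLinks(existingLinks, allVideoRowIds, videoKeywordLinks);

// After the fix: scope is only the videos being synced.
const syncedVideoRowIds = Array.from(videoKeywordLinks.keys());
const afterFixed = replaceKeywordLinks(existingLinks, syncedVideoRowIds, videoKeywordLinks);

console.log(afterBuggy.length, afterFixed.length); // 2 4
```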
🚅 Deployed to the forge-pr-564 environment in forge
2 services not affected by this PR
Attach .catch() to the prefetched page promise so that if the current page's processing throws, the pending next-page fetch doesn't create an unhandled promise rejection that could crash Node.js.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
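A minimal sketch of this guard, assuming a page-indexed fetcher that returns an empty page at the end; `processWithPrefetch`, `fetchPage`, and `processPage` are hypothetical names, not the functions in sync-videos.ts. Attaching a no-op handler marks the prefetched promise as handled without swallowing its error: the original promise still rejects when the loop awaits it.

```typescript
async function processWithPrefetch<T>(
  fetchPage: (pageIndex: number) => Promise<T[]>,
  processPage: (items: T[]) => Promise<void>,
): Promise<number> {
  let pageIndex = 0;
  let current = fetchPage(pageIndex);
  let processed = 0;

  while (true) {
    const items = await current;
    if (items.length === 0) return processed; // empty page signals the end

    // Start fetching the next page before processing the current one.
    pageIndex += 1;
    const next = fetchPage(pageIndex);

    // If processPage throws below, `next` is never awaited. Without a handler,
    // a later rejection of `next` would be reported as unhandled.
    next.catch(() => undefined);

    await processPage(items);
    processed += items.length;

    // Awaiting `next` on the following iteration still surfaces a fetch error.
    current = next;
  }
}
```

In this sketch a processing failure propagates as the only error, while a fetch failure on an otherwise healthy run is still thrown when its page is awaited.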
Summary
Three critical fixes for core-sync, building on PRs #558, #560, #562:

1. Watermark timestamp format bug (affects ALL incremental syncs)

getLastSyncTime returned a JS Date object from knex/pg. When passed to the Core API's updatedAt: { gte: since } filter, it was stringified as "Mon Mar 30 2026 01:20:21 GMT+0000", which the gateway can't parse, so it silently returned ALL records instead of filtering. This is why incremental sync appeared to refetch everything.

Fix: Convert to ISO string in getLastSyncTime().

2. Per-page upsert for video sync (same pattern as #560 for variants)

Videos phase was stuck at processed=0/1056 for 20+ minutes on production due to one-at-a-time upsertByCoreId calls for origins/editions. Locally: 49s. Production: 20+ minutes.

Fix: Per-page bulk upsert, prefetch, and bulk origins/editions/bible books, the same patterns as in sync-video-variants.ts.

3. Keyword link deletion scope (P0 from code review)

During incremental sync, keyword link deletion used ALL video row IDs from the DB, not just the synced batch. This would strip keyword links from every non-updated video.

Fix: Scope deletion to only videos in videoKeywordLinks.

Performance data (production, PR #562 deploy)
Test plan
Files changed
- apps/cms/src/api/core-sync/services/strapi-helpers.ts: ISO timestamp fix
- apps/cms/src/api/core-sync/services/sync-videos.ts: per-page upsert + keyword scope fix
- docs/solutions/cms/core-sync-production-vs-local-performance-gap.md: compound doc

🤖 Generated with Claude Code