fix: persist Bird quoted tweets #76

Draft
lukaskawerau wants to merge 1 commit into steipete:main from lukaskawerau:fix/bird-quoted-tweets

Conversation

@lukaskawerau lukaskawerau commented Jun 25, 2026

Issue

Bird-backed live timeline syncs can return quoted tweets as hydrated nested payloads. Birdclaw normalized the primary tweet reference, but dropped the quoted tweet body during ingest because the quoted payload was not carried through includes.tweets.

That left local timeline items with quoted_tweet_id set, but no persisted tweet row for the quoted tweet. Result: quote cards could not render until a later hydrate fetched the quoted tweet separately.

What changed

  • Normalize hydrated Bird quotedTweet objects into includes.tweets.
  • Preserve included tweets when merging paginated live timeline payloads.
  • Persist included quoted tweets through shared tweet ingest and profile-analysis ingest.
  • Keep account timeline edges and collection rows limited to primary payload tweets, so quoted tweets do not appear as standalone home/search/profile items.
  • Add regression coverage for Bird normalization and live home timeline persistence.
  • Add a changelog entry.
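
The first bullet above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: only `quotedTweet`, `includes.tweets`, and `quoted_tweet_id` are named in the PR, and the type names, the `text` field, and `normalizeWithQuotes` are assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; the real Bird payload has more fields.
interface BirdTweet {
  id: string;
  text: string;
  quotedTweet?: BirdTweet; // hydrated nested payload returned by Bird
}

interface NormalizedTweet {
  id: string;
  text: string;
  quoted_tweet_id?: string;
}

interface NormalizedPayload {
  data: NormalizedTweet[];
  includes: { tweets: NormalizedTweet[] };
}

// Assumed helper name; it keeps the quoted reference on the primary tweet
// and also carries the hydrated quoted body into includes.tweets.
function normalizeWithQuotes(items: BirdTweet[]): NormalizedPayload {
  const included = new Map<string, NormalizedTweet>();

  const toNormalized = (item: BirdTweet): NormalizedTweet => {
    const tweet: NormalizedTweet = { id: item.id, text: item.text };
    if (item.quotedTweet) {
      tweet.quoted_tweet_id = item.quotedTweet.id;
      // Store the quoted body once per ID instead of dropping it.
      if (!included.has(item.quotedTweet.id)) {
        included.set(item.quotedTweet.id, toNormalized(item.quotedTweet));
      }
    }
    return tweet;
  };

  const data = items.map(toNormalized);
  return { data, includes: { tweets: [...included.values()] } };
}
```

With this shape, a primary tweet that quotes another one keeps its `quoted_tweet_id`, and the quoted body travels alongside it rather than being discarded before ingest.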

How

The fix treats payload.data as the primary timeline surface and payload.includes.tweets as canonical tweet records that may be needed for rendering references.
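
Since `payload.includes.tweets` holds records that rendering may need, a paginated merge has to carry them along with `payload.data`. The sketch below shows one way that could look; `mergePages`, `dedupeById`, and the reduced tweet shape are hypothetical names for illustration, not the functions in `src/lib/timeline-live.ts`.

```typescript
// Hypothetical, reduced shapes; only data and includes.tweets come from the PR.
interface PageTweet {
  id: string;
  text: string;
}

interface TimelinePage {
  data: PageTweet[];
  includes?: { tweets?: PageTweet[] };
}

// Keep the first occurrence of each tweet ID, preserving order.
function dedupeById(tweets: PageTweet[]): PageTweet[] {
  const byId = new Map<string, PageTweet>();
  for (const tweet of tweets) {
    if (!byId.has(tweet.id)) byId.set(tweet.id, tweet);
  }
  return [...byId.values()];
}

// Merge pages without losing included tweets from any page.
function mergePages(pages: TimelinePage[]): TimelinePage {
  return {
    data: dedupeById(pages.flatMap((page) => page.data)),
    includes: {
      tweets: dedupeById(pages.flatMap((page) => page.includes?.tweets ?? [])),
    },
  };
}
```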

Ingest now deduplicates primary and included tweets by ID, writes all canonical tweets into tweets and tweets_fts, but only creates tweet_account_edges, collection rows, and returned sync IDs for primary payload.data tweets.
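
The split described above can be illustrated without a database. In this sketch, `planIngest` and the `IngestPlan` shape are assumptions for the example (the real ingest writes to SQLite in `src/lib/tweet-repository.ts`), and letting the primary copy win on a duplicate ID is also an assumption rather than something the PR states.

```typescript
// Hypothetical, reduced shapes for illustration only.
interface CanonicalTweet {
  id: string;
  text: string;
}

interface IngestPayload {
  data: CanonicalTweet[];
  includes?: { tweets?: CanonicalTweet[] };
}

interface IngestPlan {
  tweetRows: CanonicalTweet[]; // would be written to tweets and tweets_fts
  edgeTweetIds: string[]; // would get tweet_account_edges, collection rows, returned sync IDs
}

function planIngest(payload: IngestPayload): IngestPlan {
  const canonical = new Map<string, CanonicalTweet>();
  // Primary tweets are visited first, so on a duplicate ID the primary copy is kept (assumption).
  for (const tweet of [...payload.data, ...(payload.includes?.tweets ?? [])]) {
    if (!canonical.has(tweet.id)) canonical.set(tweet.id, tweet);
  }
  // Only primary payload.data tweets become timeline members.
  const edgeTweetIds = [...new Set(payload.data.map((tweet) => tweet.id))];
  return { tweetRows: [...canonical.values()], edgeTweetIds };
}
```

An included-only quoted tweet therefore ends up in `tweetRows` but never in `edgeTweetIds`, which is the property the proof output below reports as zero home edges and zero collection rows.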

Behavior proof

Captured June 25, 2026 from a clean temporary Birdclaw home:

  1. Copied the local account/config database to /tmp/birdclaw-pr76-proof-home.clean.tiKXyq.
  2. Cleared tweets, tweet_account_edges, tweet_collections, tweets_fts, and sync_cache in that temp copy.
  3. Ran exactly one Bird-backed home timeline sync:
    BIRDCLAW_HOME=/tmp/birdclaw-pr76-proof-home.clean.tiKXyq BIRDCLAW_BACKUP_AUTO_SYNC=0 pnpm cli sync timeline --mode bird --limit 100 --refresh --json
  4. Opened the local web UI against the same temp home and captured the Home lane after redacting all tweet text, handles, dates, IDs, and media.

Redacted quote-card rendering artifact: quote-card-redacted.svg

Redacted Bird-backed quote card proof

Redacted terminal proof from that run:

{
  "birdSync": {
    "ok": true,
    "source": "bird",
    "kind": "timeline",
    "feed": "following",
    "count": 100,
    "primaryTweets": 100,
    "includedTweets": 24,
    "includedUsers": 65
  },
  "includedOnlyQuoteProof": {
    "includedOnlyTweetsReferencedAsQuotesByPrimaryTweets": 23,
    "persistedIncludedOnlyTweetRows": 23,
    "includedOnlyHomeEdges": 0,
    "includedOnlyCollectionRows": 0,
    "includedOnlyFtsRows": 23,
    "redactedSamples": [
      {
        "text_chars": 365,
        "home_edges": 0,
        "collection_rows": 0,
        "fts_rows": 1
      },
      {
        "text_chars": 280,
        "home_edges": 0,
        "collection_rows": 0,
        "fts_rows": 1
      },
      {
        "text_chars": 249,
        "home_edges": 0,
        "collection_rows": 0,
        "fts_rows": 1
      }
    ]
  },
  "visibleQuoteCardsInRedactedScreenshot": 13
}

No bird read, hydrate, mention-thread, or other quote-fetch command ran between the clean table reset and the screenshot capture.
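
The invariant that the proof output demonstrates can be written down as a small check. This is a hedged sketch: the field names are copied from the `includedOnlyQuoteProof` object above, while `findInvariantViolations` and the interface name are hypothetical and not part of the PR.

```typescript
// Field names mirror the includedOnlyQuoteProof object in the proof output.
interface IncludedOnlyQuoteProof {
  includedOnlyTweetsReferencedAsQuotesByPrimaryTweets: number;
  persistedIncludedOnlyTweetRows: number;
  includedOnlyHomeEdges: number;
  includedOnlyCollectionRows: number;
  includedOnlyFtsRows: number;
}

// Hypothetical helper: returns a list of violated expectations (empty when all hold).
function findInvariantViolations(proof: IncludedOnlyQuoteProof): string[] {
  const violations: string[] = [];
  const referenced = proof.includedOnlyTweetsReferencedAsQuotesByPrimaryTweets;
  if (proof.persistedIncludedOnlyTweetRows !== referenced) {
    violations.push("not every referenced quoted tweet has a persisted row");
  }
  if (proof.includedOnlyFtsRows !== referenced) {
    violations.push("not every referenced quoted tweet has an FTS row");
  }
  if (proof.includedOnlyHomeEdges !== 0) {
    violations.push("included-only tweets must not have home timeline edges");
  }
  if (proof.includedOnlyCollectionRows !== 0) {
    violations.push("included-only tweets must not have collection rows");
  }
  return violations;
}

// Values taken from the redacted terminal proof above.
const reportedProof: IncludedOnlyQuoteProof = {
  includedOnlyTweetsReferencedAsQuotesByPrimaryTweets: 23,
  persistedIncludedOnlyTweetRows: 23,
  includedOnlyHomeEdges: 0,
  includedOnlyCollectionRows: 0,
  includedOnlyFtsRows: 23,
};
const proofViolations = findInvariantViolations(reportedProof); // empty for the reported run
```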

Validation

  • pnpm run check
  • pnpm test
  • Targeted regression suite:
    pnpm exec vitest run src/lib/bird.test.ts src/lib/timeline-live.test.ts src/lib/profile-analysis.test.ts src/lib/timeline-collections-live.test.ts src/lib/tweet-search-live.test.ts src/lib/mentions-live.test.ts src/lib/mention-threads-live.test.ts

clawsweeper Bot commented Jun 25, 2026

Codex review: needs maintainer review before merge. Reviewed June 25, 2026, 9:06 AM ET / 13:06 UTC.

Summary
The PR normalizes hydrated Bird quotedTweet payloads into includes.tweets, persists primary-plus-included tweets through shared/profile ingest while limiting edges and collections to primary tweets, preserves included tweets across paginated timeline merges, and adds regression tests plus a changelog note.

Reproducibility: yes. Source inspection shows current main keeps the quoted tweet ID but drops the hydrated quoted tweet body before ingest, and the PR body includes real Bird-backed after-fix proof.

Review metrics: 2 noteworthy metrics.

  • Diff scope: 8 files, +188/-24. The patch crosses normalization, shared ingest, profile ingest, timeline merge, types, tests, and release notes, so it merits cross-ingest maintainer review.
  • Live proof invariant: 23 persisted, 0 edges, 0 collection rows. The contributor proof checks the key invariant that included quote rows are stored for rendering without becoming standalone feed or collection items.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster ✨ media proof bonus
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
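
The "weaker of the two" rule can be sketched as a minimum over an ordered list of ranks. The ordering below is inferred from the rank legend later in this comment, and `weakerRank` is a hypothetical helper, not ClawSweeper's actual implementation; the off-meta tidepool rating is left out because it means the rating does not apply.

```typescript
// Ranks from weakest to strongest, inferred from the rank legend (assumption).
const RANKS = [
  "unranked krab",
  "silver shellfish",
  "gold shrimp",
  "platinum hermit",
  "diamond lobster",
  "challenger crab",
] as const;

type Rank = (typeof RANKS)[number];

// Hypothetical helper: the overall rating follows the weaker input.
function weakerRank(a: Rank, b: Rank): Rank {
  return RANKS.indexOf(a) <= RANKS.indexOf(b) ? a : b;
}

// Proof is diamond lobster and patch quality is platinum hermit in this review.
const overallRank = weakerRank("diamond lobster", "platinum hermit"); // "platinum hermit"
```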

Rank-up moves:

  • Mark the draft ready for review when the contributor wants maintainers to consider it for merge.

Risk before merge

  • [P1] The PR is still draft, so maintainers should wait for the contributor or maintainer ready-for-review signal before merge.
  • [P1] This read-only review did not rerun the contributor's pnpm validation commands, so normal project checks should still gate the draft before landing.

Maintainer options:

  1. Decide the mitigation before merge
    Land the narrow persistence fix after the draft is ready and maintainer/project checks pass, preserving included quote rows for rendering without creating standalone timeline or collection membership for included-only tweets.
  2. Pause or close
    Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

  • No automated repair is needed because proof is sufficient and no patch defect surfaced; maintainer review plus draft-to-ready signal is the remaining action.

Security
Cleared: The diff touches TypeScript normalization/SQLite ingest behavior, tests, and changelog text only; no CI, dependency, secret, package, or code-execution surface change was found.

Review details

Best possible solution:

Land the narrow persistence fix after the draft is ready and maintainer/project checks pass, preserving included quote rows for rendering without creating standalone timeline or collection membership for included-only tweets.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection shows current main keeps the quoted tweet ID but drops the hydrated quoted tweet body before ingest, and the PR body includes real Bird-backed after-fix proof.

Is this the best way to solve the issue?

Yes. Treating payload.data as the primary timeline surface while persisting canonical included tweet records is the narrow maintainable fix without promoting quoted tweets into feeds.

AGENTS.md: not found in the target repository.

Codex review notes: model internal, reasoning high; reviewed against 10f98d3fb36a.

Label changes

Label justifications:

  • P2: This is a normal-priority Bird-backed timeline persistence bug with bounded quote-card rendering impact and a focused fix surface.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (linked_artifact): The PR body provides redacted terminal output from a real Bird-backed sync plus an inspected linked quote-card artifact showing the after-fix behavior.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body provides redacted terminal output from a real Bird-backed sync plus an inspected linked quote-card artifact showing the after-fix behavior.

Evidence reviewed

What I checked:

  • Target policy scan: No AGENTS.md exists under the target repository root, and no .agents/maintainer-notes/ directory exists for this PR's files. (10f98d3fb36a)
  • Current main drops hydrated quoted bodies: Current normalizeBirdTweets records a quoted reference from item.quotedTweet.id, but returns only includes.users, so the hydrated quoted tweet body is not carried downstream. (src/lib/bird.ts:447, 10f98d3fb36a)
  • Current main ingests primary tweets only: Current shared ingest loops over payload.data only, so tweets present solely under includes.tweets cannot be written to tweets or tweets_fts. (src/lib/tweet-repository.ts:81, 10f98d3fb36a)
  • Read model needs persisted quote rows: Timeline rendering joins quoted tweets from tweets via t.quoted_tweet_id, so a referenced quote card needs an actual persisted quoted tweet row. (src/lib/timeline-read-model.ts:740, 10f98d3fb36a)
  • PR implements primary-plus-included persistence: The PR diff adds canonical included-tweet iteration, keeps edge/collection writes and returned IDs limited to primary payload tweets, and preserves includes.tweets during live timeline payload merging. (src/lib/tweet-repository.ts:23, 216d255eccb9)
  • Regression coverage added: The patch adds Bird normalization coverage for hydrated quotedTweet objects and live home timeline coverage proving included quoted tweets are persisted without home timeline edges. (src/lib/timeline-live.test.ts:153, 216d255eccb9)

Likely related people:

  • steipete: Current-main blame and pickaxe history show Peter Steinberger on the Bird normalization, shared ingest, live timeline merge, and profile-analysis ingest surfaces involved in this fix. (role: introduced behavior and recent area contributor; confidence: high; commits: b456320a66aa, ea3db8bb5cc7, fc86e4a64192; files: src/lib/bird.ts, src/lib/tweet-repository.ts, src/lib/timeline-live.ts)
  • vyctorbrzezowski: Authored the closed, unmerged PR "fix: preserve rich bird collection data" #6, which previously explored preserving Bird quoted tweet context in collection-rich-data sync. (role: prior related contributor; confidence: medium; commits: 59a53391c687; files: src/lib/bird.ts, src/lib/timeline-collections-live.ts, src/lib/types.ts)

What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

clawsweeper Bot added labels on Jun 25, 2026:

  • rating: 🧂 unranked krab (Not merge-ready due to missing proof or serious correctness/safety concerns.)
  • status: 📣 needs proof (The PR needs real behavior proof before ClawSweeper can clear the contributor ask.)
  • P2 (Normal priority bug or improvement with limited blast radius.)

@lukaskawerau

Author

@clawsweeper re-review

clawsweeper Bot commented Jun 25, 2026

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

clawsweeper Bot changed labels on Jun 25, 2026.

Added:

  • proof: sufficient (Contributor real behavior proof is sufficient.)
  • rating: 🐚 platinum hermit (Good normal PR readiness with ordinary maintainer review expected.)
  • status: 👀 ready for maintainer look (ClawSweeper has no concrete contributor-facing blocker left for this PR.)

Removed:

  • rating: 🧂 unranked krab (Not merge-ready due to missing proof or serious correctness/safety concerns.)
  • status: 📣 needs proof (The PR needs real behavior proof before ClawSweeper can clear the contributor ask.)
Labels

  • P2 (Normal priority bug or improvement with limited blast radius.)
  • proof: sufficient (Contributor real behavior proof is sufficient.)
  • rating: 🐚 platinum hermit (Good normal PR readiness with ordinary maintainer review expected.)
  • status: 👀 ready for maintainer look (ClawSweeper has no concrete contributor-facing blocker left for this PR.)

1 participant