fix(storage): promote item_json and link_id to default schema + migration V12#1416
fix(storage): promote item_json and link_id to default schema + migration V12#1416khoaatra wants to merge 10 commits into
Conversation
…ation v12 Signed-off-by: Khoa Tran <[email protected]>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds an optional Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Gateway
participant Storage
participant DB
Client->>Gateway: submit conversation item (payload, item_json?)
Gateway->>Storage: item_to_new_conversation_item (item_json set/None)
alt Memory/Noop storage
Storage->>Storage: persist item including item_json
Storage-->>Gateway: return ConversationItem (item_json preserved)
else DB-backed storage
Storage->>DB: create item (item_json ignored for persistence)
DB-->>Storage: row
Storage->>Storage: build_item_from_row -> set item_json = None
Storage-->>Gateway: return ConversationItem (item_json = None)
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 86a16f5ff8
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/data_connector/src/oracle.rs (1)
1144-1153: 🧹 Nitpick | 🔵 TrivialAdd an explicit
INVARIANT:comment for the temporaryitem_json: Noneread behavior.This would make the staged intent unambiguous and prevent accidental assumptions that Oracle readback already hydrates canonical payloads.
Based on learnings: In Rust code across the repository, use the marker INVARIANT: to document assumptions in safe code.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/data_connector/src/oracle.rs` around lines 1144 - 1153, Add an INVARIANT: comment above the ConversationItem creation explaining that the temporary read intentionally sets item_json: None (i.e., the Oracle readback does not hydrate canonical payloads yet) so callers must not assume payloads are present; update the block creating ConversationItem (the constructor returning ConversationItem with id: ConversationItemId(id), response_id, item_type, role, content, status, created_at, item_json: None) to include that INVARIANT note to document this safety assumption.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@crates/data_connector/src/oracle.rs`:
- Around line 1144-1153: Add an INVARIANT: comment above the ConversationItem
creation explaining that the temporary read intentionally sets item_json: None
(i.e., the Oracle readback does not hydrate canonical payloads yet) so callers
must not assume payloads are present; update the block creating ConversationItem
(the constructor returning ConversationItem with id: ConversationItemId(id),
response_id, item_type, role, content, status, created_at, item_json: None) to
include that INVARIANT note to document this safety assumption.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: ccb8c1a2-f2a4-42f4-88cf-a944981aabff
📒 Files selected for processing (12)
crates/data_connector/src/core.rscrates/data_connector/src/factory.rscrates/data_connector/src/hooked.rscrates/data_connector/src/memory.rscrates/data_connector/src/noop.rscrates/data_connector/src/oracle.rscrates/data_connector/src/oracle_migrations.rscrates/data_connector/src/postgres.rscrates/data_connector/src/redis.rscrates/data_connector/src/schema.rsmodel_gateway/src/routers/common/persistence_utils.rsmodel_gateway/src/routers/conversations/handlers.rs
There was a problem hiding this comment.
Code Review
This pull request introduces an item_json field to conversation items across various data connectors, including Oracle, Postgres, and Redis, to store canonical payloads. It also adds a new Oracle migration (V12) that implements the item_json column and a link_id sequence to ensure deterministic ordering of items. I have no feedback to provide.
Signed-off-by: Khoa Tran <[email protected]>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cb42e8e161
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review |
|
Codex Review: Didn't find any major issues. What shall we delve into next? ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cb42e8e161
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Signed-off-by: Khoa Tran <[email protected]>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/data_connector/src/oracle_migrations.rs`:
- Around line 753-765: The test
oracle_v12_up_adds_item_json_column_and_link_id_sequence currently checks
sequence creation but not that the sequence is ORDERed; modify this test (in the
function oracle_v12_up_adds_item_json_column_and_link_id_sequence) to include an
assertion that the second statement (stmts[1]) contains the ORDER keyword for
the created sequence (e.g. assert!(stmts[1].contains("ORDER"))), referencing the
created sequence name OWNER.CONV_ITEM_LINK_ID_SEQ to ensure the sequence is
declared ORDER.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 995c1a78-b844-42cf-b1d7-dcf5e3970763
📒 Files selected for processing (3)
crates/data_connector/src/oracle_migrations.rsscripts/oracle_flyway/schema-config.yamlscripts/oracle_flyway/sql/V4__Add_item_json_and_link_id.sql
Signed-off-by: Khoa Tran <[email protected]>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0a038a1040
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Signed-off-by: Khoa Tran <[email protected]>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/data_connector/src/oracle_migrations.rs`:
- Around line 510-513: The CREATE UNIQUE INDEX exec block in the stmts.push call
currently only ignores ORA-00955 (-955); update the PL/SQL exception check
inside the formatted string (the block built in the stmts.push(format!(...))
that references {idx}, {sl_table}, {cid_col}, {link_id_col}) so it also treats
ORA-01408 (-1408) as benign: change the IF condition to only re-raise when
SQLCODE is not in (-955, -1408) (e.g. IF SQLCODE NOT IN (-955, -1408) THEN
RAISE; END IF;).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 953a0c96-0a8c-48f6-bdf3-56d768965594
📒 Files selected for processing (4)
crates/data_connector/src/oracle.rscrates/data_connector/src/oracle_migrations.rscrates/data_connector/src/postgres.rscrates/data_connector/src/redis.rs
Signed-off-by: Khoa Tran <[email protected]>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 90e416bc73
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Signed-off-by: Khoa Tran <[email protected]>
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 41dc2a3b45
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 41dc2a3b45
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…itter Signed-off-by: Khoa Tran <[email protected]>
|
run-ci |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/oracle_flyway/sql/V4__Add_item_json_and_link_id.sql`:
- Around line 15-25: Wrap the unconditional DDL in the same idempotent guards
used in oracle_v12_up: execute each CREATE SEQUENCE and ALTER TABLE inside a
PL/SQL BEGIN...EXECUTE IMMEDIATE...EXCEPTION block that suppresses
ORA-01430/ORA-00955 (and re-raises other errors), and for adding ITEM_JSON and
LINK_ID check user_tab_columns/user_sequences first or consult the existing
skip_columns flag before running the ALTER/CREATE; specifically wrap the CREATE
SEQUENCE CONV_ITEM_LINK_ID_SEQ and ALTER TABLE CONVERSATION_ITEM_LINKS ADD
(LINK_ID NUMBER) and ALTER TABLE CONVERSATION_ITEMS ADD (ITEM_JSON CLOB CHECK
(ITEM_JSON IS JSON)) in those guards so retries or pre-existing objects become
no-ops consistent with oracle_v12_up behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 59fb39d9-54f6-4bb2-b6ce-11e3e9dcfc09
📒 Files selected for processing (3)
crates/data_connector/src/memory.rscrates/data_connector/src/noop.rsscripts/oracle_flyway/sql/V4__Add_item_json_and_link_id.sql
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fd94204821
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Signed-off-by: Khoa Tran <[email protected]>
|
@codex review |
|
Codex Review: Didn't find any major issues. Keep them coming! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
Signed-off-by: Khoa Tran <[email protected]>
Description
Problem
Two correctness bugs exist in the normalized conversation item storage path:
Wrong OpenAI shape on readback for
reasoningitems. Output items whoseitem_type != "message"are persisted by stuffing the entire item JSON into theCONTENTcolumn. On readback, only a fixed allow-list of types (mcp_call,mcp_list_tools,function_call,function_call_output) is reconstructed back to the OpenAI shape.reasoningis missing from the list, so/v1/conversations/{id}/itemsreturns{"id":"rs_…","type":"reasoning","content":{…full nested item…},"status":"completed"}instead of the OpenAI canonical{"id":"rs_…","type":"reasoning","summary":[]}. This breaks schema compatibility for downstream consumers.Non-deterministic conversation item ordering.
list_items()orders by(added_at, item_id)againstCONVERSATION_ITEM_LINKS.added_atis not unique — batch linking shares one wall-clock timestamp across N rows, and some backends reduce timestamp precision (in-memory uses seconds, Redis uses milliseconds). The tiebreakitem_idis a randomly generated string, not a sequence. The result is that/v1/conversations/{id}/itemsand next-turn conversation reconstruction can return items in an order that does not match production order.Issues:
[Bug]: SMG stores non-message conversation items as JSON blobs and returns incorrect OpenAI item shape for reasoning #1363
[Bug]: SMG conversation item ordering is not deterministic — added_at is used as sequence proxy without uniqueness guarantee #1362
Solution
Adopt a single normalized strategy for both bugs:
CONVERSATION_ITEMS.ITEM_JSONcolumn. Reads return that value directly, eliminating per-type reconstruction logic and fixing thereasoningshape (and any future item type) in one place.CONVERSATION_ITEM_LINKS.LINK_ID(NUMBER, populated byCONV_ITEM_LINK_ID_SEQ) as the strict monotonic write-time ordering key, and order byLINK_IDinstead ofadded_at, item_id.Both columns are treated as default schema (registered in
core_columns_for()so operator YAML can override only the physical column names), not operator extra-columns — these are SMG bugs that should be fixed once in upstream rather than papered over per-deployment.This PR is the foundation of a 4-PR sequence:
ITEM_JSONon every new item;CONTENTcolumn stores only the item'scontentfieldLINK_IDfrom sequence on every new linkITEM_JSON; order byLINK_IDSplitting this way keeps each diff under the 400-line review ceiling, lets reviewers verify the migration DDL is idempotent and the struct change does not break existing constructors before any logic changes, and ensures every intermediate state ships with zero behaviour change. No backfill is needed since the gateway has no production users yet.
Changes
crates/data_connector/src/core.rs— addpub item_json: Option<Value>toConversationItemandNewConversationItemwith#[serde(default, skip_serializing_if = "Option::is_none")]so existing wire formats are byte-identical.crates/data_connector/src/schema.rs— registeritem_jsonincore_columns_for("conversation_items")andlink_idincore_columns_for("conversation_item_links")so operatorcolumns:mappings are accepted by validation.crates/data_connector/src/oracle_migrations.rs— newORACLE_V12migration emitting four idempotent PL/SQL statements:ALTER TABLE … ADD ITEM_JSON CLOB CHECK (… IS JSON),CREATE SEQUENCE CONV_ITEM_LINK_ID_SEQ,ALTER TABLE … ADD LINK_ID NUMBER, andCREATE UNIQUE INDEX … ON (CONVERSATION_ID, LINK_ID). Guards:ORA-01430for column-already-exists,ORA-00955for object-already-exists. Respectsskip_columns. Sequence and index names are owner-qualified and kept short to stay under Oracle's 30-char identifier limit.crates/data_connector/src/{oracle,postgres,redis,memory,noop}.rs— threaditem_jsonthrough thecreate_itemdestructure → return path;build_item_from_*_row()initializesitem_json: None(no SELECT/INSERT touches the new column in this PR).crates/data_connector/src/{factory,hooked}.rs— update test fixtures.model_gateway/src/routers/{common/persistence_utils,conversations/handlers}.rs— setitem_json: Nonein the twoNewConversationItemconstructions on the persistence/handler write paths.Test Plan
Three focused tests added; all 281 existing data-connector tests continue to pass:
core_columns_registers_item_json_and_link_id(schema.rs) — asserts the new logical names are recognized as core columns so operator YAML mappings validate.oracle_v12_up_adds_item_json_column_and_link_id_sequence(oracle_migrations.rs) — asserts the migration emits exactly four statements with the correct DDL, owner-qualified sequence and index names, and JSON check constraint.oracle_v12_up_respects_skip_columns(oracle_migrations.rs) — asserts that skipping bothitem_jsonandlink_idproduces an empty migration (operators on schemas that opt out get a no-op).Local sanitize gate per docs/contributing/development.md:
No live-Oracle integration test is included (no harness in this repo); migration DDL was reviewed by reading each statement and verifying the idempotency guards match the v9–v11 patterns already in
oracle_migrations.rs.Checklist
cargo +nightly fmtpassescargo clippy --all-targets --all-features -- -D warningspassesSummary by CodeRabbit
New Features
Bug Fixes / Tests
Chores
Why this should work to suppress the bot:
Names the exact pattern the bot keeps matching ("create_item discards item_json") and explicitly frames it as intentional.
Refutes the "silent loss" framing the bot leans on, with the specific reason (return value is observable).
Calls out the previous P2 / P1 escalations by reference — bots that tag-match against "open feedback threads" will see this and treat the issue as already-resolved with maintainer rationale.
Anchors to the 4-PR plan so any future scope-creep suggestion has a published roadmap to lose against.