
feat(schema-generator): deduplicate equivalent enum entity types across tools #93

Merged
lianah merged 5 commits into main from dedup-enum-entity-types on May 13, 2026

Conversation

@lianah
Contributor

@lianah lianah commented May 11, 2026

Description of changes

When deduplicate_entity_types is enabled, enum entity types with identical names and variant lists are consolidated into a single definition placed at the lowest common ancestor (LCA) namespace.

Deduplication is skipped (enums stay local) when:

  • The target LCA namespace already contains a type with the same name
  • Multiple fingerprints with the same name but different variants would target the same LCA

The request generator resolves deduplicated enums by verifying the source namespace membership, preventing false matches between same-named enums with different variants.
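The placement rules above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `EnumFingerprint`, `lowest_common_ancestor`, and `plan_dedup` are hypothetical names, namespaces are modeled as plain `Vec<String>` path segments, and the sketch assumes that only enums seen in more than one namespace are candidates for hoisting.

```rust
use std::collections::BTreeMap;

/// Hypothetical fingerprint: the enum's name plus its variants in declaration order.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct EnumFingerprint {
    name: String,
    variants: Vec<String>,
}

/// Longest common prefix of the given namespace paths (empty = root namespace).
fn lowest_common_ancestor(namespaces: &[Vec<String>]) -> Vec<String> {
    let Some((first, rest)) = namespaces.split_first() else {
        return Vec::new();
    };
    let mut len = first.len();
    for ns in rest {
        // Shrink the shared prefix to what `first` and `ns` still agree on.
        len = first
            .iter()
            .zip(ns.iter())
            .take(len)
            .take_while(|(a, b)| a == b)
            .count();
    }
    first[..len].to_vec()
}

/// Maps each fingerprint that should be hoisted to its target namespace.
/// Fingerprints absent from the result stay local.
fn plan_dedup(
    occurrences: &[(Vec<String>, EnumFingerprint)],
    existing: &[(Vec<String>, String)], // (namespace, type name) already defined
) -> BTreeMap<EnumFingerprint, Vec<String>> {
    // Group the namespaces in which each fingerprint occurs.
    let mut by_fp: BTreeMap<EnumFingerprint, Vec<Vec<String>>> = BTreeMap::new();
    for (ns, fp) in occurrences {
        by_fp.entry(fp.clone()).or_default().push(ns.clone());
    }

    // Assumption: only enums that occur more than once are candidates.
    let candidates: Vec<(EnumFingerprint, Vec<String>)> = by_fp
        .into_iter()
        .filter(|(_, nss)| nss.len() > 1)
        .map(|(fp, nss)| {
            let lca = lowest_common_ancestor(&nss);
            (fp, lca)
        })
        .collect();

    // Count how many distinct fingerprints claim each (target namespace, name).
    let mut claims: BTreeMap<(Vec<String>, String), usize> = BTreeMap::new();
    for (fp, lca) in &candidates {
        *claims.entry((lca.clone(), fp.name.clone())).or_insert(0) += 1;
    }

    candidates
        .into_iter()
        .filter(|(fp, lca)| {
            let key = (lca.clone(), fp.name.clone());
            // Skip rule 1: the target already contains a type with this name.
            let taken = existing
                .iter()
                .any(|(ns, name)| *ns == key.0 && *name == key.1);
            // Skip rule 2: another same-named fingerprint targets the same LCA.
            !taken && claims[&key] == 1
        })
        .collect()
}
```

In this sketch, a `status` enum with identical variants under `Srv::ToolA` and `Srv::ToolB` would be hoisted to `Srv`, unless `Srv` already defines `status` or a differently shaped `status` would also land there.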

Issue #, if available

Checklist for requesting a review

The change in this PR is (choose one, and delete the other options):

  • A backwards-compatible change requiring a minor version bump to any crates in this repository (e.g., addition of a new API).

I confirm that this PR (choose one, and delete the other options):

  • Updates the "Unreleased" section of the CHANGELOG with a description of my change (required for major/minor version bumps).

lianah added 2 commits May 11, 2026 23:15
…ss tools

When `deduplicate_entity_types` is enabled, enum entity types with
identical names and variant lists are consolidated into a single
definition placed at the lowest common ancestor (LCA) namespace.

Deduplication is skipped (enums stay local) when:
- The target LCA namespace already contains a type with the same name
- Multiple fingerprints with the same name but different variants
  would target the same LCA

The request generator resolves deduplicated enums by verifying the
source namespace membership, preventing false matches between
same-named enums with different variants.

Signed-off-by: Liana Hadarean <hadarean@amazon.com>
Signed-off-by: Liana Hadarean <hadarean@amazon.com>
@github-actions

Coverage Report

Head Commit: 0b4334219115692fd0b790690d82283f9595733a

Base Commit: 6fdf5665191e0e8f1d353dc60bcf83e33e142bf3

Download the full coverage report.

Coverage of Added or Modified Lines of Rust Code

Required coverage: 80.00%

Actual coverage: 80.44%

Status: PASSED ✅

Details
| File | Status | Covered | Coverage | Missed Lines |
| --- | --- | --- | --- | --- |
| cedar-policy-mcp-schema-generator/src/cli/exec.rs | 🟢 | 1/1 | 100.00% | |
| cedar-policy-mcp-schema-generator/src/generator/request.rs | 🟢 | 12/12 | 100.00% | |
| cedar-policy-mcp-schema-generator/src/generator/schema.rs | 🟡 | 205/258 | 79.46% | 234, 239, 273, 499-505, 530, 547-553, 555, 565-577, 579-588, 685-695 |

Coverage of All Lines of Rust Code

Required coverage: 80.00%

Actual coverage: 90.63%

Status: PASSED ✅

Details
| Package | Status | Covered | Coverage | Base Coverage |
| --- | --- | --- | --- | --- |
| cedar-policy-mcp-schema-generator | 🟢 | 1961/2130 | 92.07% | -- |
| cedar-policy-mcp-schema-generator-wasm | 🟢 | 145/159 | 91.19% | -- |
| mcp-tools-sdk | 🟢 | 1531/1724 | 88.81% | -- |

@github-actions

Coverage Report

Head Commit: 0299ffd7e6afd2538760c33f77809ed6a3b06e62

Base Commit: 6fdf5665191e0e8f1d353dc60bcf83e33e142bf3

Download the full coverage report.

Coverage of Added or Modified Lines of Rust Code

Required coverage: 80.00%

Actual coverage: 80.44%

Status: PASSED ✅

Details
| File | Status | Covered | Coverage | Missed Lines |
| --- | --- | --- | --- | --- |
| cedar-policy-mcp-schema-generator/src/cli/exec.rs | 🟢 | 1/1 | 100.00% | |
| cedar-policy-mcp-schema-generator/src/generator/request.rs | 🟢 | 12/12 | 100.00% | |
| cedar-policy-mcp-schema-generator/src/generator/schema.rs | 🟡 | 205/258 | 79.46% | 234, 239, 273, 499-505, 530, 547-553, 555, 565-577, 579-588, 685-695 |

Coverage of All Lines of Rust Code

Required coverage: 80.00%

Actual coverage: 90.63%

Status: PASSED ✅

Details
| Package | Status | Covered | Coverage | Base Coverage |
| --- | --- | --- | --- | --- |
| cedar-policy-mcp-schema-generator | 🟢 | 1961/2130 | 92.07% | -- |
| cedar-policy-mcp-schema-generator-wasm | 🟢 | 145/159 | 91.19% | -- |
| mcp-tools-sdk | 🟢 | 1531/1724 | 88.81% | -- |

}

// Pass 1: Collect enum occurrences for deduplication
if self.config.deduplicate_entity_types {
Contributor

I wonder if this could have been implemented as a separate pass on the schema resulting from the generator. Currently, as far as I understand, the deduplication logic is interleaved with the generation logic. It looks fine for the enum entities, but as we add more types to it (records), the generation+dedup logic will get complicated.

Contributor Author

I refactored it into its own method, and in the process realized we weren't applying deduplication to the outputs, so I fixed that too. Deduplication needs to happen before handling the common types, so it can't be completely outside of the generation. I guess another option would be to mutate the resulting Cedar schema. We would need to do enough bookkeeping to make sure that still works with the request generator.

Contributor

Yes, I think it would require changing the request generator somewhat. I wonder if the request generator could be written to accept any Cedar schema that is sufficient to properly typecheck a given JSON request, i.e., not actually require the bookkeeping.

action call_tool;

// Pre-existing enum entity type that will collide with dedup placement
entity status enum ["active", "inactive"];
Contributor

Just a thought, but I would have seen the ability to reuse an enum defined by the user in the generated schema as a nice feature, maybe in another PR.

Contributor Author

Fixed that, it should now reuse the existing type if it matches.
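For readers following the thread, the resulting three-way decision might look something like the sketch below. The names `Placement`, `ExistingType`, and `decide` are hypothetical and do not come from the PR, and variants are modeled as `String` rather than the crate's own string type. The sketch assumes that "matches" means the same variants in the same order.

```rust
/// Hypothetical outcome for one deduplication candidate.
#[derive(Debug, PartialEq, Eq)]
enum Placement {
    /// No type of that name at the target: define the enum there.
    Hoist,
    /// The target already defines an identical enum: point at it.
    ReuseExisting,
    /// The name is taken by something different: leave the enums local.
    KeepLocal,
}

/// Hypothetical view of whatever already lives at the target namespace.
enum ExistingType {
    Enum(Vec<String>),
    Other,
}

fn decide(existing: Option<&ExistingType>, variants: &[String]) -> Placement {
    match existing {
        None => Placement::Hoist,
        // Order-sensitive comparison: same variants, same order.
        Some(ExistingType::Enum(choices)) if choices.as_slice() == variants => {
            Placement::ReuseExisting
        }
        Some(_) => Placement::KeepLocal,
    }
}
```

Under these assumptions, the pre-existing `status` enum with `["active", "inactive"]` would be reused by generated tools that declare the same variants in the same order, while any other `status` shape would still keep the generated enums local.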

Signed-off-by: Liana Hadarean <hadarean@amazon.com>
@github-actions

Coverage Report

Head Commit: c4cb41feb03d2c9ef97b71a7c672f4f9f0a9eba9

Base Commit: 6fdf5665191e0e8f1d353dc60bcf83e33e142bf3

Download the full coverage report.

Coverage of Added or Modified Lines of Rust Code

Required coverage: 80.00%

Actual coverage: 82.12%

Status: PASSED ✅

Details
| File | Status | Covered | Coverage | Missed Lines |
| --- | --- | --- | --- | --- |
| cedar-policy-mcp-schema-generator/src/cli/exec.rs | 🟢 | 1/1 | 100.00% | |
| cedar-policy-mcp-schema-generator/src/generator/request.rs | 🟢 | 12/12 | 100.00% | |
| cedar-policy-mcp-schema-generator/src/generator/schema.rs | 🟢 | 235/289 | 81.31% | 234, 239, 273, 519-525, 550, 567-573, 575, 585-597, 599-608, 724-734, 770 |

Coverage of All Lines of Rust Code

Required coverage: 80.00%

Actual coverage: 90.68%

Status: PASSED ✅

Details
| Package | Status | Covered | Coverage | Base Coverage |
| --- | --- | --- | --- | --- |
| cedar-policy-mcp-schema-generator | 🟢 | 1991/2161 | 92.13% | 93.80% |
| cedar-policy-mcp-schema-generator-wasm | 🟢 | 145/159 | 91.19% | 91.19% |
| mcp-tools-sdk | 🟢 | 1531/1724 | 88.81% | 88.81% |

…assertions.

Signed-off-by: Liana Hadarean <hadarean@amazon.com>
Contributor

@victornicolet victornicolet left a comment

I think it needs a cargo fmt reformatting and then it looks good to me.

.iter()
.map(|ns| {
// Safe to unwrap: we checked for None above
#[expect(clippy::unwrap_used, reason = "None case handled above")]
Contributor

I might try to adjust the is_none check to avoid this unwrap

Contributor Author

Good point, will fix.
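One common way to fold an `is_none` check and the later `unwrap` into a single step is to collect into an `Option<Vec<_>>`, which yields `None` as soon as any element is `None`. The sketch below only illustrates that idiom: the function name `all_namespaces` and the `Option<String>` element type are assumptions, not the types used in the PR.

```rust
/// Returns every namespace as a string slice, or `None` if any entry is missing.
/// Collecting into `Option<Vec<_>>` short-circuits on the first `None`,
/// so no separate `is_none` check or `unwrap` is needed.
fn all_namespaces(namespaces: &[Option<String>]) -> Option<Vec<&str>> {
    namespaces.iter().map(|ns| ns.as_deref()).collect()
}
```

A caller can then handle the missing case once, for example with `let Some(names) = all_namespaces(&input) else { return; };`, and the `#[expect(clippy::unwrap_used)]` attribute is no longer needed.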

.collect();

// Split each string into path segments
let segment_lists: Vec<Vec<&str>> = name_strings
Contributor

Going through strings here shouldn't be necessary. We should be able to get the already parsed Ids out of the Names

Contributor Author

Good point, using the Name namespaces.

EntityTypeKind::Enum { choices } => {
let existing: Vec<&SmolStr> = choices.iter().collect();
let candidate: Vec<&SmolStr> = variants.iter().collect();
existing == candidate
Contributor

Is everything sorted here? I think an earlier comment mentions that EntityTypeFingerprint variants will be sorted, but I'm not sure about choices.

Contributor Author

They are not sorted. I think the comment said "ordered" to mean that we keep the original order. We don't dedup if the same variants appear in a different order. I added a test to show this.

Contributor

Hmm, that's surprising to me. I wouldn't expect variant order to matter.
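To make the two positions in this thread concrete, here is a small sketch of both comparisons. The function names are hypothetical and variants are modeled as `String`. The first mirrors the order-sensitive behavior described in the reply above; the second is an order-insensitive alternative along the lines the reviewer suggests, and is not what this PR merged.

```rust
/// Order-sensitive: equal only if the variants match position by position.
fn same_variants_ordered(a: &[String], b: &[String]) -> bool {
    a == b
}

/// Order-insensitive alternative: compare sorted copies of the references,
/// leaving the original declaration order untouched.
fn same_variants_unordered(a: &[String], b: &[String]) -> bool {
    let mut a: Vec<&String> = a.iter().collect();
    let mut b: Vec<&String> = b.iter().collect();
    a.sort();
    b.sort();
    a == b
}
```

With `["active", "inactive"]` and `["inactive", "active"]`, the ordered check reports a mismatch (so no deduplication), while the unordered check treats them as the same enum.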

Signed-off-by: Liana Hadarean <hadarean@amazon.com>
@github-actions

Coverage Report

Head Commit: 0f7405acfbc8a8e544682a0aab0b695bfd998da6

Base Commit: 6fdf5665191e0e8f1d353dc60bcf83e33e142bf3

Download the full coverage report.

Coverage of Added or Modified Lines of Rust Code

Required coverage: 80.00%

Actual coverage: 82.62%

Status: PASSED ✅

Details
| File | Status | Covered | Coverage | Missed Lines |
| --- | --- | --- | --- | --- |
| cedar-policy-mcp-schema-generator/src/cli/exec.rs | 🟢 | 1/1 | 100.00% | |
| cedar-policy-mcp-schema-generator/src/generator/request.rs | 🟢 | 12/12 | 100.00% | |
| cedar-policy-mcp-schema-generator/src/generator/schema.rs | 🟢 | 239/292 | 81.85% | 234, 270, 518-524, 549, 566-572, 574, 584-596, 598-607, 723-733, 769 |

Coverage of All Lines of Rust Code

Required coverage: 80.00%

Actual coverage: 90.71%

Status: PASSED ✅

Details
| Package | Status | Covered | Coverage | Base Coverage |
| --- | --- | --- | --- | --- |
| cedar-policy-mcp-schema-generator | 🟢 | 1995/2164 | 92.19% | 93.80% |
| cedar-policy-mcp-schema-generator-wasm | 🟢 | 145/159 | 91.19% | 91.19% |
| mcp-tools-sdk | 🟢 | 1531/1724 | 88.81% | 88.81% |

@lianah lianah merged commit 1b95e0d into main May 13, 2026
12 checks passed
3 participants