-
Notifications
You must be signed in to change notification settings - Fork 229
feat(compass-collection): LLM Output Validation - Mock Data Generator CLOUDP-333855 #7342
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements LLM output validation for the mock data generator by adding a validation layer that ensures a one-to-one mapping between input schema and faker schema. The validation filters out invalid fields from the LLM response and defaults unmapped input fields to "Unrecognized" faker methods.
- Refactors the
validateFakerSchema
function to accept input schema and validate against it - Adds logic to filter out faker schema fields that don't exist in the input schema
- Implements fallback handling for unmapped input schema fields with "Unrecognized" faker method
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
packages/compass-collection/src/modules/collection-tab.ts | Updated validateFakerSchema function to validate faker schema against input schema and handle unmapped fields |
packages/compass-collection/src/components/mock-data-generator-modal/mock-data-generator-modal.spec.tsx | Added comprehensive tests for schema validation including filtered fields and unmapped field handling |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
...mpass-collection/src/components/mock-data-generator-modal/mock-data-generator-modal.spec.tsx
Show resolved
Hide resolved
5cabab1
to
9b9bea4
Compare
packages/compass-collection/src/components/mock-data-generator-modal/types.ts
Outdated
Show resolved
Hide resolved
9b9bea4
to
867b002
Compare
[activeField]: { | ||
...currentMapping, | ||
mongoType: newJsonType, | ||
mongoType: newJsonType as MongoDBFieldType, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of casting here, is it possible for us to just type the input arg as MongoDBFieldType
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tracking this down, the type is unclear because the onChange
callback of <Select />
component provides a string. I don't see any way to clarify the type for the leafygreeen component (will ask them to confirm), so we might need the typecasting there. But still, it would be good if you don't do the typecasting here but at the source, otherwise it's hard to tell why are we typecasting and how do we even know that the type is correct.
onChange={(value) => onJsonTypeSelect(value as MongoDBFieldType)}
const { fieldPath, ...fieldMapping } = field; | ||
result[fieldPath] = { | ||
...fieldMapping, | ||
mongoType: fieldMapping.mongoType as MongoDBFieldType, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can make a sanity check here that mongoType is valid member of MongoDBFieldType, and if not, pass null perhaps? Then below in validateFakerSchema, if this mongoType is null, we can use the input schema's mongoType? Or something similar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, could we use a more fitting type in the zod definition?
mongoType: z.string(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the type in the zod definition to use MongoDBFieldType
.
… across mock data generator components
53384e8
to
70f02bf
Compare
packages/compass-collection/src/components/mock-data-generator-modal/types.ts
Outdated
Show resolved
Hide resolved
z.object({ | ||
fieldPath: z.string(), | ||
mongoType: z.string(), | ||
mongoType: z.string() as z.ZodType<MongoDBFieldType>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point you can validate the string. Unfortunately it seems mongodb-schema doesn't export the values (only the type). You can copy them here and create a ticket to add this export from the mongodb-schema
package.
You can use z.custom<MongoDBFieldType>
with a validator function that checks the values. https://zod.dev/api?id=custom
...field, | ||
): FakerSchema => { | ||
// Transform to keyed object structure | ||
const fakerSchemaRaw = transformFakerSchemaToObject(fakerSchemaArray); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's generally better to have functions with a single, clear purpose (for readability and maintainability). if we don't need the array variant for validation, we can move the transformation out, so that this function only takes an object and returns a validated object.
…d outside validation method
Description
Validates input schema against the LLM output schema (faker) to ensure a one-to-one map.
Screen.Recording.2025-09-19.at.11.53.01.AM.mov
Checklist
Motivation and Context
Open Questions
Dependents
Types of changes