-
Notifications
You must be signed in to change notification settings - Fork 2.9k
NIFI-15209 JoltTransformRecord should not only take schema from first record #10545
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Introduction of a new property for JoltTransformRecord, which will enable the outputting of different schemas by having multiple output flowfiles
exceptionfactory
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this issue @sammu97. As a general note, it seems like it would be cleaner to avoid moving all of the test schema and JSON files to a new directory, in order to focus on the actual changes proposed. Can that be adjusted?
|
Yes sure @exceptionfactory , will handle this as soon as i can. Also, I'm seeing that some checks are failing on code checkout, is this due to the Cloudflare outage? |
Yes, they were due to the outage, I have restarted the checks. |
|
Looks like the build failed on some of the OSs, im suspecting a file ordering issue. Will investigate and update the PR accordingly |
- Added new test for jolt which filters out everything - Cleaned up some irrelevant code - Fix for non-deterministic ordering in test
Disabling checks on Windows
|
@exceptionfactory Had to make some fixes for Windows as the checks are usually omitted. However, any idea about the error for the Mac tests?
|
Test file rename
@sammu97 the node cache issue in the build appears to have been an intermittent problem over the weekend. I spotted other PRs with similar errors, but then things seem to be working again this morning. I've restarted the failed job on your PR and so far things like happier 🤞 |
@ChrisSamo632 Yep, seems like it's already past the step that was failing. Thanks! |
Some minor fixes for JoltTransformRecord. Inserted creation of new flowfile after we check that we have a valid record
|
@exceptionfactory Just a small note too. I've also amended some logic for the testNoRecords() test, as I have put out a small change that if the Jolt has no records to transform, in my opinion there should be no resulting flowfile as there is nothing to write. Not sure what you think about this, should I be leaving the old logic? |
Closing of writer
This PR aims to fix an unwanted behaviour of having fields omitted after a JoltTransformRecord on a batch of records within the same FlowFile, due to multiple outputs having more than 1 schema. The current implementation of the processor retrieves the schema of the FIRST transformed record, and abides by that schema throughout the rest of the transformations. A new property is introduced for the JoltTransformRecord, where users can decide to either keep the same behaviour, or utilize the new PARTITION_BY_SCHEMA strategy, which will split the transformations into separate FlowFIles, according to the number of schemas.
Summary
NIFI-15209
Tracking
Please complete the following tracking steps prior to pull request creation.
Issue Tracking
Pull Request Tracking
NIFI-00000NIFI-00000Pull Request Formatting
mainbranchVerification
Please indicate the verification steps performed prior to pull request creation.
Build
./mvnw clean install -P contrib-checkLicensing
LICENSEandNOTICEfilesDocumentation