@sammu97

@sammu97 sammu97 commented Nov 18, 2025

This PR aims to fix an unwanted behaviour in which fields are omitted after JoltTransformRecord processes a batch of records within the same FlowFile, when the transformed outputs have more than one schema. The current implementation of the processor retrieves the schema of the FIRST transformed record and applies that schema to all remaining transformations. A new property is introduced for JoltTransformRecord that lets users either keep the existing behaviour or use the new PARTITION_BY_SCHEMA strategy, which splits the transformed records into separate FlowFiles, one per distinct schema.
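To illustrate the difference between the two strategies, here is a minimal, self-contained sketch in plain Java. It does not use the NiFi API: records are modelled as maps, a "schema" is assumed to be just the set of field names, and the class and method names (`SchemaPartitionSketch`, `applyFirstSchema`, `partitionBySchema`) are hypothetical and are not taken from this PR.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class SchemaPartitionSketch {

    // Assumption: a schema is modelled as the sorted set of field names.
    static Set<String> schemaOf(Map<String, Object> record) {
        return new TreeSet<>(record.keySet());
    }

    // Models the existing behaviour: every record is written with the schema
    // of the first transformed record, so fields outside it are dropped.
    static List<Map<String, Object>> applyFirstSchema(List<Map<String, Object>> transformed) {
        List<Map<String, Object>> out = new ArrayList<>();
        if (transformed.isEmpty()) {
            return out;
        }
        Set<String> first = schemaOf(transformed.get(0));
        for (Map<String, Object> record : transformed) {
            Map<String, Object> projected = new LinkedHashMap<>(record);
            projected.keySet().retainAll(first);
            out.add(projected);
        }
        return out;
    }

    // Models the partitioning idea: records are grouped by schema, and each
    // group would be written to its own output (one FlowFile per schema).
    static Map<Set<String>, List<Map<String, Object>>> partitionBySchema(
            List<Map<String, Object>> transformed) {
        Map<Set<String>, List<Map<String, Object>>> partitions = new LinkedHashMap<>();
        for (Map<String, Object> record : transformed) {
            partitions.computeIfAbsent(schemaOf(record), k -> new ArrayList<>()).add(record);
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> transformed = List.of(
                Map.of("id", 1, "name", "a"),
                Map.of("id", 2, "name", "b", "email", "b@example.com"),
                Map.of("id", 3, "name", "c"));

        // The "email" field of the second record is lost under the first-schema approach.
        System.out.println(applyFirstSchema(transformed).get(1).containsKey("email")); // false

        // Two distinct schemas give two partitions.
        System.out.println(partitionBySchema(transformed).size()); // 2
    }
}
```

In this sketch an empty batch yields an empty map of partitions, so nothing would be written for it.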

Summary

NIFI-15209

Tracking

Please complete the following tracking steps prior to pull request creation.

Issue Tracking

Pull Request Tracking

  • Pull Request title starts with Apache NiFi Jira issue number, such as NIFI-00000
  • Pull Request commit message starts with Apache NiFi Jira issue number, such as NIFI-00000

Pull Request Formatting

  • Pull Request based on current revision of the main branch
  • Pull Request refers to a feature branch with one commit containing changes

Verification

Please indicate the verification steps performed prior to pull request creation.

Build

  • Build completed using ./mvnw clean install -P contrib-check
    • JDK 21
    • JDK 25

Licensing

  • New dependencies are compatible with the Apache License 2.0 according to the License Policy
  • New dependencies are documented in applicable LICENSE and NOTICE files

Documentation

  • Documentation formatting appears as expected in rendered files

Introduction of a new property for JoltTransformRecord, which will enable the outputting of different schemas by having multiple output flowfiles
@sammu97 sammu97 marked this pull request as draft November 18, 2025 20:09
Contributor

@exceptionfactory exceptionfactory left a comment

Thanks for working on this issue @sammu97. As a general note, it seems like it would be cleaner to avoid moving all of the test schema and JSON files to a new directory, in order to focus on the actual changes proposed. Can that be adjusted?

@sammu97
Author

sammu97 commented Nov 19, 2025

Yes, sure @exceptionfactory, I will handle this as soon as I can.

Also, I'm seeing that some checks are failing on code checkout. Is this due to the Cloudflare outage?

@exceptionfactory
Contributor

Yes, they were due to the outage. I have restarted the checks.

@sammu97
Author

sammu97 commented Nov 19, 2025

Looks like the build failed on some of the OSs. I suspect a file ordering issue. I will investigate and update the PR accordingly.

Jordan Sammut added 4 commits November 19, 2025 14:34
- Added new test for jolt which filters out everything
- Cleaned up some irrelevant code
- Fix for non-deterministic ordering in test
Disabling checks on Windows
@sammu97
Author

sammu97 commented Nov 23, 2025

@exceptionfactory Had to make some fixes for Windows as the checks are usually omitted. However, any idea about the error for the Mac tests?

The template is not valid. .github/workflows/ci-workflow.yml (Line: 224, Col: 16): hashFiles('**/package-lock.json') failed. Fail to hash files under directory '/Users/runner/work/nifi/nifi'

Test file rename
@sammu97 sammu97 marked this pull request as ready for review November 23, 2025 16:01
@ChrisSamo632
Contributor

@sammu97 the node cache issue in the build appears to have been an intermittent problem over the weekend. I spotted other PRs with similar errors, but things seem to be working again this morning. I've restarted the failed job on your PR and so far things look happier 🤞

@sammu97
Author

sammu97 commented Nov 24, 2025

@ChrisSamo632 Yep, seems like it's already past the step that was failing. Thanks!

Some minor fixes for JoltTransformRecord. Inserted creation of new flowfile after we check that we have a valid record
@sammu97
Author

sammu97 commented Nov 24, 2025

@exceptionfactory Just a small note too. I've also amended the logic for the testNoRecords() test: I made a small change so that if the Jolt transform has no records to transform, no FlowFile is produced, since in my opinion there is nothing to write. Not sure what you think about this. Should I keep the old logic?
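To make the question concrete, here is a rough sketch of the "create the output only after a valid record is seen" idea in plain Java. It is an assumption-laden illustration, not the processor code: records are strings, a `null` entry stands in for a record that the transform filtered out, and `LazyOutputSketch` and `writeValidRecords` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class LazyOutputSketch {

    // Returns the written output, or Optional.empty() when no valid record
    // was found, in which case no output is created at all.
    static Optional<String> writeValidRecords(List<String> records) {
        StringBuilder out = null; // created lazily, on the first valid record
        for (String record : records) {
            if (record == null) {
                continue; // assumption: null models a record with nothing to write
            }
            if (out == null) {
                out = new StringBuilder();
            } else {
                out.append('\n');
            }
            out.append(record);
        }
        return Optional.ofNullable(out).map(StringBuilder::toString);
    }

    public static void main(String[] args) {
        // No records, or only filtered-out records: nothing is produced.
        System.out.println(writeValidRecords(List.of()).isPresent());
        System.out.println(writeValidRecords(Arrays.asList(null, null)).isPresent());

        // At least one valid record: output is produced.
        System.out.println(writeValidRecords(Arrays.asList("a", null, "b")).isPresent());
    }
}
```

The old logic would correspond to creating the output up front and returning it even when it is empty.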

Closing of writer