Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Json to Avro converter supports capturing failures #23

Merged

Conversation

johnny-schmidt
Copy link

@johnny-schmidt johnny-schmidt commented Jul 22, 2024

Summary

This is to provide support for capturing field-level conversion failures when converting from json to avro. It adds a FieldConversionFailureListener option to the record reader, which provides a hook for replacing the field value and adding metadata to the record when a failure occurs.

Most of the changes here are to support that. Additionally I had to thread the original field name through the reader so that it could be added to the metadata if needed. (Sometimes the field name is edited to make it avro compatible.)

Additionally I had to update the dependencies, as one of the underlying artifacts vanished.

Checklist

  • Write unit tests
  • Make sure there is no logging in the Json / Avro conversion code
  • Update documentation in README.md

Copy link
Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @johnny-schmidt and the rest of your teammates on Graphite Graphite

@johnny-schmidt johnny-schmidt force-pushed the epic-8349-certify-s3/issue-8556-s3-v2-fields-gt branch from 8b6e5d1 to e93e688 Compare July 22, 2024 19:50
@johnny-schmidt johnny-schmidt marked this pull request as ready for review July 22, 2024 19:53
@johnny-schmidt johnny-schmidt requested a review from gisripa July 22, 2024 19:55
Copy link

@gisripa gisripa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm.

I had to read the test to figure pushPostProcessingAction needs to be called for a sample Listener impl.

@johnny-schmidt johnny-schmidt force-pushed the epic-8349-certify-s3/issue-8556-s3-v2-fields-gt branch from e93e688 to 71331fd Compare July 22, 2024 23:04
@johnny-schmidt
Copy link
Author

johnny-schmidt commented Jul 22, 2024

lgtm.

I had to read the test to figure pushPostProcessingAction needs to be called for a sample Listener impl.

I updated the readme.

A listener can be set to react to conversion failures at the field level. It will be called with metadata about the field and failure, and it may do any of the following:

* return a replacement value for the field
* call `pushPostProcessingAction` to register a function to apply to the record (eg, to add metadata about the failure)
* (re)throw an exception if the failure is unrecoverable

@johnny-schmidt johnny-schmidt force-pushed the epic-8349-certify-s3/issue-8556-s3-v2-fields-gt branch from 71331fd to 7a2e83d Compare July 22, 2024 23:06
@johnny-schmidt johnny-schmidt force-pushed the epic-8349-certify-s3/issue-8556-s3-v2-fields-gt branch from 7a2e83d to e52e86e Compare July 22, 2024 23:06
@johnny-schmidt johnny-schmidt merged commit 2e7d28b into master Jul 22, 2024
1 check passed
@johnny-schmidt johnny-schmidt deleted the epic-8349-certify-s3/issue-8556-s3-v2-fields-gt branch July 22, 2024 23:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants