Implement custom format for object keys uploaded by S3Writer #441

Merged: 3 commits, Sep 23, 2024


jfzunigac
Contributor

This PR adds two new configs to S3Writer so that the keys of uploaded objects can follow a custom format:

  • filenamePattern : A regex with named groups that extracts fields (tokens) from the filename
  • filenameTokens : The list of token names extracted by filenamePattern; it is derived automatically but can also be set explicitly for testing

keyPrefix and filenameFormat are also replaced with keyFormat, so that tokens extracted from the filename can be used in keyFormat with the syntax %{TOKEN}, e.g.:

# Configuration for writer
writer.type=s3
writer.s3.maxFileSizeMB=5
writer.s3.minUploadTimeInSeconds=30
writer.s3.bucket=my-bucket
writer.s3.keyFormat=%{service}/%{index}/my_log.%TIMESTAMP
writer.s3.maxRetries=3
writer.s3.filenamePattern=^(?<service>[a-zA-Z0-9]+)_.*_(?<index>\\d+)\\.log$
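To illustrate how filenamePattern and keyFormat could fit together, here is a minimal sketch in Java. It is not the S3Writer code from this PR: the class KeyFormatSketch, the methods extractTokens and buildKey, and the sample filename myservice_app_12.log are all hypothetical, and the sketch leaves default formatters such as %TIMESTAMP unresolved.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the actual S3Writer implementation.
public class KeyFormatSketch {

  // Reads each named group listed in `tokens` from `filename`.
  // Returns an empty map when the filename does not match the pattern.
  public static Map<String, String> extractTokens(
      Pattern pattern, List<String> tokens, String filename) {
    Map<String, String> values = new LinkedHashMap<>();
    Matcher matcher = pattern.matcher(filename);
    if (matcher.matches()) {
      for (String token : tokens) {
        values.put(token, matcher.group(token));
      }
    }
    return values;
  }

  // Replaces every %{TOKEN} placeholder in `keyFormat` with its extracted value.
  // Placeholders without a value (and formatters like %TIMESTAMP) are left as-is.
  public static String buildKey(String keyFormat, Map<String, String> values) {
    String key = keyFormat;
    for (Map.Entry<String, String> entry : values.entrySet()) {
      key = key.replace("%{" + entry.getKey() + "}", entry.getValue());
    }
    return key;
  }

  public static void main(String[] args) {
    // Same regex as the config above; "\\d" in Java source is the regex \d.
    Pattern pattern =
        Pattern.compile("^(?<service>[a-zA-Z0-9]+)_.*_(?<index>\\d+)\\.log$");
    Map<String, String> values =
        extractTokens(pattern, Arrays.asList("service", "index"), "myservice_app_12.log");
    // prints "myservice/12/my_log.%TIMESTAMP"
    System.out.println(buildKey("%{service}/%{index}/my_log.%TIMESTAMP", values));
  }
}
```

In this sketch a filename that does not match the pattern yields no tokens, so the %{...} placeholders stay in the key unchanged; how the real writer handles that case is not stated in this PR description.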

Additionally, we add some default formatters:

  • %UUID
  • %TIMESTAMP
  • %HOST
  • %LOGNAME

@jfzunigac jfzunigac requested a review from a team as a code owner September 20, 2024 23:12
@jfzunigac jfzunigac merged commit ce85282 into pinterest:master Sep 23, 2024
1 check passed