# Conflicts: # marimba/core/wrappers/dataset.py # pyproject.toml
Updated the logger to include the file extension and indicate when incomplete iFDO files will be saved with the '.incomplete' suffix. This provides clearer feedback on the file naming and format during validation failures.
Hi @GermanHydrogen,

Thank you for implementing this iFDO validation feature. This is a good improvement to help ensure FAIR compliance. I've tested the implementation and it works perfectly for both YAML and JSON output formats.

I made a small modification to the logging warning message for better clarity and consistency with other dataset logging statements. This now shows the full filename with correct extension and clearly explains what's happening.

I note that the … I see two main options to address this:

Option 1: Field-specific ignore list
Option 2: Post-processing rename

My preference would be for Option 1 as it maintains validation for all other required fields, provides a clean, temporary workaround, doesn't require immediate changes to all of our existing pipelines, and can be easily removed once our DOI workflow is resolved.

What are your thoughts on this approach? Do you see any other solutions or have experience with similar publication workflow challenges?
Hi @cjackett, the problem you are describing is also true for GEOMAR, which is currently working on a solution for templating the image handles based on the image UUIDs, so that they are known prior to publication. I would also support Option 1, but the ignore list should be set by a CLI argument or environment variable to allow for customization. I will look into implementing this next week.
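For illustration, the configurable ignore list proposed above could look roughly like the sketch below. The environment variable name `MARIMBA_IFDO_IGNORE_FIELDS`, both helper functions, and the idea of representing validation failures as a list of field names are hypothetical and not taken from marimba; the field names are example values only.

```python
import os

# Hypothetical environment variable name, chosen only for this sketch.
IGNORE_ENV_VAR = "MARIMBA_IFDO_IGNORE_FIELDS"


def load_ignored_fields(environ=None):
    """Parse a comma-separated list of field names from the environment."""
    environ = os.environ if environ is None else environ
    raw = environ.get(IGNORE_ENV_VAR, "")
    # Strip whitespace and skip empty entries such as trailing commas.
    return {name.strip() for name in raw.split(",") if name.strip()}


def filter_missing_fields(missing_fields, ignored_fields):
    """Drop ignored fields while keeping every other failure, in order."""
    return [field for field in missing_fields if field not in ignored_fields]


# Example values only: pretend validation reported these two fields as missing.
missing = ["image-set-handle", "image-set-name"]
ignored = load_ignored_fields({IGNORE_ENV_VAR: "image-set-handle"})
remaining = filter_missing_fields(missing, ignored)
print(remaining)  # prints ['image-set-name']
```

Validation still reports every field that is not explicitly ignored, so the workaround can be dropped by unsetting the variable.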
Added iFDO validation based on the iFDO JSON schema to warn the user if an incomplete iFDO is written to a dataset. This is done by logging a warning and appending the suffix '.incomplete' to the filename of the iFDO. For this to work, I had to add the iFDO JSON schema to the marimba package.
This addition does not break the behavior of marimba.
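As a rough sketch of the naming behaviour described above: the function name, its parameters, and the exact position of the suffix (after the file extension) are assumptions for illustration, not marimba's actual code. The validation result is passed in as a boolean here; in the PR it comes from checking the iFDO against the JSON schema.

```python
from pathlib import Path


def ifdo_output_path(dataset_dir, stem, extension, is_complete):
    """Build the iFDO output path, marking incomplete files by suffix.

    Assumption: the '.incomplete' suffix goes after the file extension.
    """
    name = f"{stem}.{extension}"
    if not is_complete:
        # Flag the file so it is not mistaken for a valid iFDO.
        name += ".incomplete"
    return Path(dataset_dir) / name


complete = ifdo_output_path("dataset", "ifdo", "yml", is_complete=True)
incomplete = ifdo_output_path("dataset", "ifdo", "json", is_complete=False)
print(complete.name)    # prints ifdo.yml
print(incomplete.name)  # prints ifdo.json.incomplete
```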