-
Notifications
You must be signed in to change notification settings - Fork 915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix validation checks for duplicate keys #4092
Fix validation checks for duplicate keys #4092
Conversation
Signed-off-by: puneeter <[email protected]>
Signed-off-by: puneeter <[email protected]>
Signed-off-by: puneeter <[email protected]>
Signed-off-by: puneeter <[email protected]>
Signed-off-by: puneeter <[email protected]>
Signed-off-by: puneeter <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @puneeter and also for giving more context on Slack (https://kedro-org.slack.com/archives/C03RKP2LW64/p1723193517960259). I've left some suggestions to improve the PR. Most importantly, I think we should change the validation to only check parameters
and not all other config type files.
Signed-off-by: puneeter <[email protected]>
Signed-off-by: puneeter <[email protected]>
Signed-off-by: puneeter <[email protected]>
Signed-off-by: puneeter <[email protected]>
Signed-off-by: puneeter <[email protected]>
Signed-off-by: puneeter <[email protected]>
Signed-off-by: puneeter <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @puneeter ! This looks great now ⭐
Signed-off-by: puneeter <[email protected]>
Signed-off-by: puneeter <[email protected]>
Signed-off-by: puneeter <[email protected]>
…e/multiple-namespace-keys-params Signed-off-by: puneeter <[email protected]>
1d2c766
to
b25fb93
Compare
def _check_duplicates(self, key: str, config_per_file: dict[Path, Any]) -> None: | ||
if key == "parameters": | ||
seen_file_to_keys = { | ||
file: self._get_all_keys(OmegaConf.to_container(config, resolve=False)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't config
already a DictConfig
? why do you need to convert it again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, DictConfig .items()
method resolves the values and that's not what I want. So, I convert the DictConfig to an unresolved dictionary. Let me know if you have some alternative/better way to handle this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you give me an example with just DictConfig? I am not sure if I am following, if config is resolved here already how would converting it back to DictConfig helps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the line 361 in this module. We are doing an iteration over the dictionary keys and values. Copying the function here:
def _get_all_keys(self, cfg: Any, parent_key: str = "") -> set[str]:
keys: set[str] = set()
for key, value in cfg.items():
full_key = f"{parent_key}.{key}" if parent_key else key
if isinstance(value, dict):
keys.update(self._get_all_keys(value, full_key))
else:
keys.add(full_key)
return keys
With cfg
being DictConfig
, the iteration would resolve the interpolations, e.g, ${test_env}
in one of our tests. But if I do OmegaConf.to_container(cfg, resolve=False)
it will return a python dict
and not DictConfig
and since I use resolve=False
it won't resolve the interpolations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this make sense!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good in general, left some comment.
Signed-off-by: puneeter <[email protected]>
…e/multiple-namespace-keys-params Signed-off-by: puneeter <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the push and making all the necessary changes, this looks good to me now!
Thank you! ✨
Description
Resolves #4088
Also linked #4077
Development notes
.
separated keys to validate duplication)Developer Certificate of Origin
We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a
Signed-off-by
line in the commit message. See our wiki for guidance.If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.
Checklist
RELEASE.md
file