Fix validation checks for duplicate keys #4092

Merged
merged 20 commits into from
Aug 28, 2024

Conversation

puneeter
Contributor

@puneeter puneeter commented Aug 15, 2024

Description

Resolves #4088
Also linked #4077

Development notes

  • The OmegaConfigLoader now validates keys at the most granular level: it builds dot-separated key paths from the nested config and checks those paths for duplicates.
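
A minimal sketch of what validating at the most granular level could look like, under stated assumptions: this is not the code from this PR, `flatten_keys` and `find_duplicates` are hypothetical helper names, and the file names and parameter values are made up for illustration. Two files may share a top-level key as long as their dot-separated leaf keys differ.

```python
from typing import Any


def flatten_keys(cfg: dict[str, Any], parent_key: str = "") -> set[str]:
    """Return the dot-separated paths of all leaf keys in a nested dict."""
    keys: set[str] = set()
    for key, value in cfg.items():
        full_key = f"{parent_key}.{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse so that only leaf keys end up in the result.
            keys.update(flatten_keys(value, full_key))
        else:
            keys.add(full_key)
    return keys


def find_duplicates(
    config_per_file: dict[str, dict[str, Any]],
) -> dict[tuple[str, str], set[str]]:
    """Map each pair of files to the leaf keys that both of them define."""
    flattened = {name: flatten_keys(cfg) for name, cfg in config_per_file.items()}
    names = sorted(flattened)
    duplicates: dict[tuple[str, str], set[str]] = {}
    for i, first in enumerate(names):
        for second in names[i + 1 :]:
            overlap = flattened[first] & flattened[second]
            if overlap:
                duplicates[(first, second)] = overlap
    return duplicates


# Hypothetical parameter files: all three share the top-level key "model",
# but only files a and c define the same leaf key.
files = {
    "parameters_a.yml": {"model": {"learning_rate": 0.1}},
    "parameters_b.yml": {"model": {"n_estimators": 100}},
    "parameters_c.yml": {"model": {"learning_rate": 0.5}},
}
duplicates = find_duplicates(files)
```

In this sketch only the pair `parameters_a.yml` and `parameters_c.yml` is reported, for the key `model.learning_rate`; sharing the top-level key `model` is not treated as a conflict on its own.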

Developer Certificate of Origin

We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a Signed-off-by line in the commit message. See our wiki for guidance.

If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.

Checklist

  • Read the contributing guidelines
  • Signed off each commit with a Developer Certificate of Origin (DCO)
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the RELEASE.md file
  • Added tests to cover my changes
  • Checked if this change will affect Kedro-Viz, and if so, communicated that with the Viz team

@puneeter puneeter requested a review from merelcht as a code owner August 15, 2024 09:21
@puneeter puneeter marked this pull request as draft August 15, 2024 09:24
Signed-off-by: puneeter <[email protected]>
Signed-off-by: puneeter <[email protected]>
Signed-off-by: puneeter <[email protected]>
Signed-off-by: puneeter <[email protected]>
@puneeter puneeter marked this pull request as ready for review August 15, 2024 10:51
Member

@merelcht merelcht left a comment

Thanks for the PR @puneeter and also for giving more context on Slack (https://kedro-org.slack.com/archives/C03RKP2LW64/p1723193517960259). I've left some suggestions to improve the PR. Most importantly, I think we should change the validation to check only parameters and not the other config file types.

kedro/config/omegaconf_config.py (outdated)
kedro/config/omegaconf_config.py (outdated)
RELEASE.md (outdated)
Signed-off-by: puneeter <[email protected]>
Signed-off-by: puneeter <[email protected]>
@merelcht merelcht requested a review from noklam August 21, 2024 17:09
Signed-off-by: puneeter <[email protected]>
Signed-off-by: puneeter <[email protected]>
Signed-off-by: puneeter <[email protected]>
Signed-off-by: puneeter <[email protected]>
@puneeter puneeter requested a review from merelcht August 21, 2024 18:20
Member

@merelcht merelcht left a comment

Thanks @puneeter ! This looks great now ⭐

RELEASE.md (outdated)
@merelcht merelcht requested review from ankatiyar and removed request for astrojuanlu and yetudada August 22, 2024 15:13
Signed-off-by: puneeter <[email protected]>
Signed-off-by: puneeter <[email protected]>
…e/multiple-namespace-keys-params

Signed-off-by: puneeter <[email protected]>
@puneeter puneeter force-pushed the feature/multiple-namespace-keys-params branch from 1d2c766 to b25fb93 Compare August 23, 2024 06:44
    def _check_duplicates(self, key: str, config_per_file: dict[Path, Any]) -> None:
        if key == "parameters":
            seen_file_to_keys = {
                file: self._get_all_keys(OmegaConf.to_container(config, resolve=False))
Contributor

Isn't config already a DictConfig? Why do you need to convert it again?

Contributor Author

The DictConfig.items() method resolves the values, which is not what I want here, so I convert the DictConfig to an unresolved dictionary. Let me know if you have an alternative or better way to handle this.

Contributor

Can you give me an example with just DictConfig? I am not sure I follow: if config is already resolved here, how would converting it back to DictConfig help?

Contributor Author

See line 361 in this module, where we iterate over the dictionary keys and values. Copying the function here:

    def _get_all_keys(self, cfg: Any, parent_key: str = "") -> set[str]:
        keys: set[str] = set()

        for key, value in cfg.items():
            full_key = f"{parent_key}.{key}" if parent_key else key
            if isinstance(value, dict):
                keys.update(self._get_all_keys(value, full_key))
            else:
                keys.add(full_key)
        return keys

With cfg being a DictConfig, the iteration would resolve the interpolations, e.g. ${test_env} in one of our tests. OmegaConf.to_container(cfg, resolve=False), on the other hand, returns a plain Python dict rather than a DictConfig, and because I pass resolve=False the interpolations are left unresolved.

Contributor

Thanks, this makes sense!

Contributor

@noklam noklam left a comment

Looks good in general; I left some comments.

kedro/config/omegaconf_config.py (outdated)
kedro/config/omegaconf_config.py (outdated)
Signed-off-by: puneeter <[email protected]>
@puneeter puneeter requested a review from noklam August 27, 2024 18:26
…e/multiple-namespace-keys-params

Signed-off-by: puneeter <[email protected]>
Contributor

@noklam noklam left a comment

Thanks for the push and for making all the necessary changes. This looks good to me now!

Thank you! ✨

@noklam noklam merged commit 080b265 into kedro-org:main Aug 28, 2024
40 checks passed
Development

Successfully merging this pull request may close these issues.

Can't use same top-level keys in two different parameters.yaml
3 participants