
chore: Backpush to main changes to skeleton of python hooks #741

Open
wants to merge 72 commits into master

Conversation


@MaxymVlasov (Collaborator) commented on Jan 3, 2025

Put an x into the boxes that apply:

  • This PR introduces a breaking change.
  • This PR fixes a bug.
  • This PR adds new functionality.
  • This PR enhances existing functionality.

Description of your changes

This PR includes only changes to the skeleton of the Python hooks, without adding the hooks themselves, as that would require a new release.

Changes include, but are not limited to:

  • Addition of basic tests in pytest (mostly generated by GH Copilot, so they may not be very useful, but at least they catch basic issues)
  • Renaming of basic skeleton identifiers to names more familiar from the Bash hooks (e.g., CLI_SUBCOMMAND_NAME to HOOK_ID)
  • Dropping of Python 2 support from the Python package definition - it was never actually supported, as terraform_docs_replace has used Python 3 syntax since the hook was added
  • Addition of linters and fixes for most of the issues they found
    • It would be nice to support relative imports of submodules, but I didn't find a solution that passes the linters, especially mypy: error: No parent module -- cannot perform relative import - Found 1 error in 1 file (errors prevented further checking)
  • Reimplementation of the _common.sh functionality, located mostly in _common.py, _run_on_whole_repo.py and _logger.py. The common::parse_cmdline function was fully reworked by @webknjaz in previous PRs, and its common functionality now lives in _cli_parsing.py

This PR DOES NOT include any hook additions or breaking changes to existing hooks. The branch with the hooks contains a bunch of TODOs, some of which could result in rework of functionality in this PR. But there are already many changes here, and that could block @webknjaz from implementing testing best practices, so I checked out a new branch, dropped everything that could require a new release, and created this PR to merge the non-user-facing changes to master.

How can we test changes

You can add this snippet to your .pre-commit-config.yaml:

repos:
  - repo: https://github.com/antonbabenko/pre-commit-terraform
    rev: a278a04b257f82ee86c84410e7f462fb357a9810
    hooks:
      - id: terraform_docs_replace

or play around using the updated CONTRIBUTING.md notes.

Additional references

This PR includes and is based on the work of @webknjaz, who set up this best-practice CLI structure in previous PRs and fixed an essential bug with accessing .pre-commit-hooks.yaml values in the not-yet-released Python hooks (#740, included in this PR), and on the initial work of @ericfrederich, who showed us in #652 that a reimplementation in Python can be done relatively easily.

Comment on lines +116 to +117
if exit_code != 0:
    final_exit_code = exit_code
Contributor:

How about reporting the number of failures through the return code?

Suggested change
-if exit_code != 0:
-    final_exit_code = exit_code
+final_exit_code += exit_code
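As an illustration of that idea, a minimal sketch of the accumulating variant; the cap is an added assumption, since POSIX exit statuses are truncated to 8 bits and a raw sum could wrap back to 0:

final_exit_code = 0
for exit_code in per_dir_exit_codes:  # hypothetical iterable of per-directory results
    final_exit_code += exit_code != 0  # count failures instead of keeping only the last code
# exit statuses are taken modulo 256, so cap the count to keep it non-zero on failure
return min(final_exit_code, 255)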

return final_exit_code


class BinaryNotFoundError(Exception):
Contributor:

This would be useful in a dedicated module. And it should probably inherit a LookupError, maybe together with something else:

Suggested change
-class BinaryNotFoundError(Exception):
+class BinaryNotFoundError(FileNotFoundError, LookupError):
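For illustration, a sketch of what the multiple inheritance buys callers; the docstring and the try/except are assumptions, not code from this PR:

class BinaryNotFoundError(FileNotFoundError, LookupError):
    """Raised when neither Terraform nor OpenTofu can be located."""


try:
    raise BinaryNotFoundError('no terraform or tofu binary on PATH')
except LookupError:
    # callers that only know the stdlib hierarchy can still catch it,
    # and `except FileNotFoundError` / `except OSError` work as well
    pass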

Contributor @webknjaz commented on Jan 7, 2025:

In general, it's a bad structural decision to have a soup-mixture style module holding just about any arbitrary low-level thing from the project. It's better to use more specific names that represent what the module contains. Most of the modules in the project expose common functions; only the outer layer of the app doesn't re-export them, really.

Give this module a narrow purpose and stick with it. Then move anything that doesn't fit that category into other modules with names that have some semantic meaning, rather than a generic English word that means whatever the reader would like to guess it means.

Honestly, I'd hoped WPS100 would catch this, but that check is rather dumb right now: https://wemake-python-styleguide.rtfd.io/en/latest/pages/usage/violations/naming.html#wemake_python_styleguide.violations.naming.WrongModuleNameViolation.

return expanded_args


def per_dir_hook(
Contributor:

This name doesn't look like a function: there's no action in it, no verb. It looks like the name of a constant and should be renamed to something that sounds like it does things, not just exists.

from copy import copy


class ColoredFormatter(logging.Formatter):
Contributor:

Integrate Rich instead.
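For reference, a minimal sketch of what that could look like with Rich's stock logging handler; the configuration values here are illustrative:

import logging

from rich.logging import RichHandler

# RichHandler does its own level/color rendering, so the format string can stay minimal
logging.basicConfig(
    level=logging.INFO,
    format='%(message)s',
    handlers=[RichHandler(rich_tracebacks=True)],
)
logging.getLogger(__name__).warning('colored output without a hand-rolled Formatter')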

Contributor:

This whole thing reads weird. Perhaps you meant entire?

Check if a function is defined in the global scope.

Args:
    scope (dict): The scope (usually globals()) to check in.
Contributor:

If you're ever going to integrate Sphinx, beware that it's expected that you use RST syntax in docstrings:

Suggested change
-    scope (dict): The scope (usually globals()) to check in.
+    scope (dict): The scope (usually ``globals()``) to check in.

(this is a simplistic example that would probably be :py:func:`globals` IRL)

return is_defined and is_callable


def is_hook_run_on_whole_repo(hook_id: str, file_paths: list[str]) -> bool:
Contributor:

(entire)



@runtime_checkable
Contributor:

Any reason to do this in this PR?

Comment on lines +17 to +20
if 'pre_commit_terraform.__main__' in sys.modules:
    importlib.reload(sys.modules['pre_commit_terraform.__main__'])
else:
    import pre_commit_terraform.__main__  # noqa: F401
Contributor:

Nooooooo! Don't do this! Use runpy like you're supposed to. This module is to be invoked as an entry-point and not in the middle of runtime. So in general you should've instead just called it through sys.executable and checked the return code + the output. No need to patch stuff in this case.

mock_return_code = 0

mocker.patch('sys.argv', mock_argv)
mock_invoke_cli_app = mocker.patch(
Contributor:

Also, you could do a --help instead of patching most of the logic out. It would give you your successful return code.
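A sketch combining the two suggestions above (spawn the real entry point through sys.executable and ask for --help); the 'usage' assertion assumes argparse's default help output:

import subprocess
import sys


def test_main_module_prints_help():
    """Run the package as a CLI in a fresh interpreter instead of reloading it in-process."""
    result = subprocess.run(
        [sys.executable, '-m', 'pre_commit_terraform', '--help'],
        capture_output=True,
        text=True,
        check=False,
    )

    assert result.returncode == 0
    assert 'usage' in result.stdout.lower()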

Comment on lines +13 to +69
def test_populate_common_argument_parser(mocker):
    parser = ArgumentParser(add_help=False)
    populate_common_argument_parser(parser)
    args = parser.parse_args(
        ['-a', 'arg1', '-h', 'hook1', '-i', 'init1', '-e', 'env1', 'file1', 'file2'],
    )

    assert args.args == ['arg1']
    assert args.hook_config == ['hook1']
    assert args.tf_init_args == ['init1']
    assert args.env_vars_strs == ['env1']
    assert args.files == ['file1', 'file2']


def test_populate_common_argument_parser_defaults(mocker):
    parser = ArgumentParser(add_help=False)
    populate_common_argument_parser(parser)
    args = parser.parse_args([])

    assert args.args == []
    assert args.hook_config == []
    assert args.tf_init_args == []
    assert args.env_vars_strs == []
    assert args.files == []


def test_populate_common_argument_parser_multiple_values(mocker):
    parser = ArgumentParser(add_help=False)
    populate_common_argument_parser(parser)
    args = parser.parse_args(
        [
            '-a',
            'arg1',
            '-a',
            'arg2',
            '-h',
            'hook1',
            '-h',
            'hook2',
            '-i',
            'init1',
            '-i',
            'init2',
            '-e',
            'env1',
            '-e',
            'env2',
            'file1',
            'file2',
        ],
    )

    assert args.args == ['arg1', 'arg2']
    assert args.hook_config == ['hook1', 'hook2']
    assert args.tf_init_args == ['init1', 'init2']
    assert args.env_vars_strs == ['env1', 'env2']
    assert args.files == ['file1', 'file2']
Contributor:

These are really variations of a single test, so they should be a single test function. To run it multiple times, use @pytest.mark.parametrize. Don't forget to specify test IDs for better UX/DX in the test report and invocation.
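A condensed sketch of that shape; the import path of populate_common_argument_parser and the test IDs are assumptions:

from argparse import ArgumentParser

import pytest

from pre_commit_terraform._cli_parsing import populate_common_argument_parser


@pytest.mark.parametrize(
    ('argv', 'expected_args', 'expected_files'),
    (
        pytest.param(['-a', 'arg1', 'file1'], ['arg1'], ['file1'], id='single-values'),
        pytest.param([], [], [], id='defaults'),
        pytest.param(
            ['-a', 'arg1', '-a', 'arg2', 'file1', 'file2'],
            ['arg1', 'arg2'],
            ['file1', 'file2'],
            id='repeated-flags',
        ),
    ),
)
def test_populate_common_argument_parser(argv, expected_args, expected_files):
    """Parse the common CLI flags and check the resulting namespace."""
    parser = ArgumentParser(add_help=False)
    populate_common_argument_parser(parser)

    args = parser.parse_args(argv)

    assert args.args == expected_args
    assert args.files == expected_files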

# ?
# ? populate_common_argument_parser
# ?
def test_populate_common_argument_parser(mocker):
Contributor:

Also, why didn't you write a docstring, so I would know what you're testing/expecting here?

Contributor:

In general, I see a huge overuse of mocking here. It's best to avoid it entirely where possible, or keep only a few exceptions. Designing the runtime code well (for example, through dependency injection) allows testing without mocking random things out.
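For illustration, one way such an injection seam could look, so a test passes a stub instead of patching module internals; all names here are hypothetical:

import subprocess
from collections.abc import Callable


def run_per_dir(
    dirs: list[str],
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,  # injectable seam
) -> int:
    """Count failing directories; the process runner is injected rather than hard-wired."""
    failures = 0
    for directory in dirs:
        completed = run(['terraform', 'validate'], cwd=directory, check=False)
        failures += completed.returncode != 0
    return failures


def test_run_per_dir_counts_failures():
    def fake_run(cmd, cwd, check):  # no real terraform binary needed, no mocker.patch
        return subprocess.CompletedProcess(cmd, returncode=1 if cwd == 'bad' else 0)

    assert run_per_dir(['good', 'bad'], run=fake_run) == 1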

mock_subcommand_modules,
)

from pre_commit_terraform._cli_subcommands import (
Contributor:

Do not add imports within functions. It's a really bad idea that makes the runtime less predictable and more difficult to interpret. Also, it's already imported at the top; it's not going to load the file a second time, since it's already cached. Don't disable the linters that report that this is dumb.

Comment on lines +2 to +3
import os
from os.path import join
Contributor:

Double import. Don't. Just don't. Enable the linters and fix the problems instead of showing this to other humans.

# ?
# ? get_unique_dirs
# ?
def test_get_unique_dirs_empty():
Contributor:

Docstring?

Comment on lines +148 to +181
def test_parse_env_vars_empty():
    env_var_strs = []
    result = parse_env_vars(env_var_strs)
    assert result == {}


def test_parse_env_vars_single():
    env_var_strs = ['VAR1=value1']
    result = parse_env_vars(env_var_strs)
    assert result == {'VAR1': 'value1'}


def test_parse_env_vars_multiple():
    env_var_strs = ['VAR1=value1', 'VAR2=value2']
    result = parse_env_vars(env_var_strs)
    assert result == {'VAR1': 'value1', 'VAR2': 'value2'}


def test_parse_env_vars_with_quotes():
    env_var_strs = ['VAR1="value1"', 'VAR2="value2"']
    result = parse_env_vars(env_var_strs)
    assert result == {'VAR1': 'value1', 'VAR2': 'value2'}


def test_parse_env_vars_with_equal_sign_in_value():
    env_var_strs = ['VAR1=value=1', 'VAR2=value=2']
    result = parse_env_vars(env_var_strs)
    assert result == {'VAR1': 'value=1', 'VAR2': 'value=2'}


def test_parse_env_vars_with_empty_value():
    env_var_strs = ['VAR1=', 'VAR2=']
    result = parse_env_vars(env_var_strs)
    assert result == {'VAR1': '', 'VAR2': ''}
Contributor:

parametrize

Comment on lines +187 to +226
def test_expand_env_vars_no_vars():
    args = ['arg1', 'arg2']
    env_vars = {}
    result = expand_env_vars(args, env_vars)
    assert result == ['arg1', 'arg2']


def test_expand_env_vars_single_var():
    args = ['arg1', '${VAR1}', 'arg3']
    env_vars = {'VAR1': 'value1'}
    result = expand_env_vars(args, env_vars)
    assert result == ['arg1', 'value1', 'arg3']


def test_expand_env_vars_multiple_vars():
    args = ['${VAR1}', 'arg2', '${VAR2}']
    env_vars = {'VAR1': 'value1', 'VAR2': 'value2'}
    result = expand_env_vars(args, env_vars)
    assert result == ['value1', 'arg2', 'value2']


def test_expand_env_vars_no_expansion():
    args = ['arg1', 'arg2']
    env_vars = {'VAR1': 'value1'}
    result = expand_env_vars(args, env_vars)
    assert result == ['arg1', 'arg2']


def test_expand_env_vars_partial_expansion():
    args = ['arg1', '${VAR1}', '${VAR2}']
    env_vars = {'VAR1': 'value1'}
    result = expand_env_vars(args, env_vars)
    assert result == ['arg1', 'value1', '${VAR2}']


def test_expand_env_vars_with_special_chars():
    args = ['arg1', '${VAR_1}', 'arg3']
    env_vars = {'VAR_1': 'value1'}
    result = expand_env_vars(args, env_vars)
    assert result == ['arg1', 'value1', 'arg3']
Contributor:

parametrize

Comment on lines +232 to +263
def test_get_tf_binary_path_from_hook_config():
    hook_config = ['--tf-path=/custom/path/to/terraform']
    result = get_tf_binary_path(hook_config)
    assert result == '/custom/path/to/terraform'


def test_get_tf_binary_path_from_pct_tfpath_env_var(mocker):
    hook_config = []
    mocker.patch.dict(os.environ, {'PCT_TFPATH': '/env/path/to/terraform'})
    result = get_tf_binary_path(hook_config)
    assert result == '/env/path/to/terraform'


def test_get_tf_binary_path_from_terragrunt_tfpath_env_var(mocker):
    hook_config = []
    mocker.patch.dict(os.environ, {'TERRAGRUNT_TFPATH': '/env/path/to/terragrunt'})
    result = get_tf_binary_path(hook_config)
    assert result == '/env/path/to/terragrunt'


def test_get_tf_binary_path_from_system_path_terraform(mocker):
    hook_config = []
    mocker.patch('shutil.which', return_value='/usr/local/bin/terraform')
    result = get_tf_binary_path(hook_config)
    assert result == '/usr/local/bin/terraform'


def test_get_tf_binary_path_from_system_path_tofu(mocker):
    hook_config = []
    mocker.patch('shutil.which', side_effect=[None, '/usr/local/bin/tofu'])
    result = get_tf_binary_path(hook_config)
    assert result == '/usr/local/bin/tofu'
Contributor:

parametrize


def test_get_tf_binary_path_from_pct_tfpath_env_var(mocker):
    hook_config = []
    mocker.patch.dict(os.environ, {'PCT_TFPATH': '/env/path/to/terraform'})
Contributor:

use monkeypatch.setenv()
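Something along these lines, using pytest's built-in monkeypatch fixture, which undoes the change automatically after the test:

def test_get_tf_binary_path_from_pct_tfpath_env_var(monkeypatch):
    monkeypatch.setenv('PCT_TFPATH', '/env/path/to/terraform')

    assert get_tf_binary_path([]) == '/env/path/to/terraform'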

Comment on lines +269 to +274
with pytest.raises(
    BinaryNotFoundError,
    match='Neither Terraform nor OpenTofu binary could be found. Please either set the "--tf-path"'
    + ' hook configuration argument, or set the "PCT_TFPATH" environment variable, or set the'
    + ' "TERRAGRUNT_TFPATH" environment variable, or install Terraform or OpenTofu globally.',
):
Contributor:

Suggested change
-with pytest.raises(
-    BinaryNotFoundError,
-    match='Neither Terraform nor OpenTofu binary could be found. Please either set the "--tf-path"'
-    + ' hook configuration argument, or set the "PCT_TFPATH" environment variable, or set the'
-    + ' "TERRAGRUNT_TFPATH" environment variable, or install Terraform or OpenTofu globally.',
-):
+error_msg = (
+    r'^Neither Terraform nor OpenTofu binary could be found. Please either set the "--tf-path" '
+    'hook configuration argument, or set the "PCT_TFPATH" environment variable, or set the '
+    '"TERRAGRUNT_TFPATH" environment variable, or install Terraform or OpenTofu globally\.$'
+)
+with pytest.raises(BinaryNotFoundError, match=error_msg):
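An alternative sketch that sidesteps manual regex escaping altogether, since pytest's match= argument is treated as a regular expression; the call inside the with block is assumed from the surrounding test:

import re

error_msg = (
    'Neither Terraform nor OpenTofu binary could be found. Please either set the "--tf-path" '
    'hook configuration argument, or set the "PCT_TFPATH" environment variable, or set the '
    '"TERRAGRUNT_TFPATH" environment variable, or install Terraform or OpenTofu globally.'
)
with pytest.raises(BinaryNotFoundError, match=re.escape(error_msg)):
    get_tf_binary_path([])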

Contributor:

This module is rather pointless. Delete it.

def sample_function():
    pass

scope = globals()
Contributor:

Wow, this is nasty! Messing up the entire process state for all the test functions, eh? Don't do this. At least copy the thing instead of side-effecting everything...

Suggested change
-scope = globals()
+scope = globals().copy()

Comment on lines +17 to +18
scope = globals()
scope['sample_function'] = sample_function
Contributor:

But really, can't you just avoid the unnecessarily complicated route? KISS.

Suggested change
-scope = globals()
-scope['sample_function'] = sample_function
+scope = {'sample_function': sample_function}

Comment on lines +10 to +46
# ?
# ? is_function_defined
# ?
def test_is_function_defined_existing_function():
    def sample_function():
        pass

    scope = globals()
    scope['sample_function'] = sample_function

    assert is_function_defined('sample_function', scope) is True


def test_is_function_defined_non_existing_function():
    scope = globals()

    assert is_function_defined('non_existing_function', scope) is False


def test_is_function_defined_non_callable():
    non_callable = 'I am not a function'
    scope = globals()
    scope['non_callable'] = non_callable

    assert is_function_defined('non_callable', scope) is False


def test_is_function_defined_callable_object():
    class CallableObject:
        def __call__(self):
            pass

    callable_object = CallableObject()
    scope = globals()
    scope['callable_object'] = callable_object

    assert is_function_defined('callable_object', scope) is True
Contributor:

Parametrize

Comment on lines +71 to +108
def test_is_hook_run_on_whole_repo(mocker, mock_git_ls_files, mock_hooks_config):
    # Mock the return value of git ls-files
    mocker.patch('subprocess.check_output', return_value='\n'.join(mock_git_ls_files))
    # Mock the return value of reading the .pre-commit-hooks.yaml file
    mocker.patch('builtins.open', mocker.mock_open(read_data=yaml.dump(mock_hooks_config)))
    # Mock the Path object to return a specific path
    mock_path = mocker.patch('pathlib.Path.resolve')
    mock_path.return_value.parents.__getitem__.return_value = Path('/mocked/path')
    # Mock the read_text method of Path to return the hooks config
    mocker.patch('pathlib.Path.read_text', return_value=yaml.dump(mock_hooks_config))

    # Test case where files match the included pattern and do not match the excluded pattern
    files = [
        'environment/prd/backends.tf',
        'environment/prd/data.tf',
        'environment/prd/main.tf',
        'environment/prd/outputs.tf',
        'environment/prd/providers.tf',
        'environment/prd/variables.tf',
        'environment/prd/versions.tf',
        'environment/qa/backends.tf',
    ]
    assert is_hook_run_on_whole_repo('example_hook_id', files) is True

    # Test case where files do not match the included pattern
    files = ['environment/prd/README.md']
    assert is_hook_run_on_whole_repo('example_hook_id', files) is False

    # Test case where files match the excluded pattern
    files = ['environment/prd/.terraform/config.tf']
    assert is_hook_run_on_whole_repo('example_hook_id', files) is False

    # Test case where hook_id is not found
    with pytest.raises(
        ValueError,
        match='Hook ID "non_existing_hook_id" not found in .pre-commit-hooks.yaml',
    ):
        is_hook_run_on_whole_repo('non_existing_hook_id', files)
Contributor:

This is actually multiple tests stuffed into one. They should be split, perhaps parametrized.

Instead of mocking, you can use tmp_path to generate an actual example Git repo and just run the test against that. With mocks you're mostly testing your mocks and not the logic, especially with this many in place.
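A rough sketch of that direction, not a drop-in replacement: it assumes the function shells out to git ls-files in the current working directory, that terraform_fmt (used here as a stand-in hook ID) exists in the real .pre-commit-hooks.yaml, and that "whole repo" means the passed files equal every tracked file matching the hook's patterns:

import subprocess


def test_is_hook_run_on_whole_repo_against_real_repo(tmp_path, monkeypatch):
    """Exercise the logic against a throwaway Git repo instead of a pile of mocks."""
    repo = tmp_path / 'repo'
    repo.mkdir()
    subprocess.run(['git', 'init', '--quiet', str(repo)], check=True)
    (repo / 'main.tf').write_text('# matches the hook files pattern\n')
    (repo / 'variables.tf').write_text('# second tracked file\n')
    subprocess.run(['git', '-C', str(repo), 'add', '.'], check=True)
    monkeypatch.chdir(repo)

    assert is_hook_run_on_whole_repo('terraform_fmt', ['main.tf', 'variables.tf']) is True
    assert is_hook_run_on_whole_repo('terraform_fmt', ['main.tf']) is False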

Contributor:

This isn't very useful either. Delete.

Comment on lines +28 to +36
assert isinstance(MockSubcommandModule(), CLISubcommandModuleProtocol)

class InvalidSubcommandModule:
    HOOK_ID = 'invalid_hook'

    def populate_argument_parser(self, subcommand_parser: ArgumentParser) -> None:
        pass

assert not isinstance(InvalidSubcommandModule(), CLISubcommandModuleProtocol)
Contributor:

This looks like two separate tests; they shouldn't be one.

Contributor @webknjaz commented on Jan 7, 2025:

Honestly, this is merely checking that the CPython stdlib works. I don't see it being useful and would just remove it.
