
chore: Backpush to main changes to skeleton of python hooks #741

Open
wants to merge 72 commits into master

Conversation


@MaxymVlasov (Collaborator) commented on Jan 3, 2025

Put an x into the boxes that apply:

  • This PR introduces a breaking change.
  • This PR fixes a bug.
  • This PR adds new functionality.
  • This PR enhances existing functionality.

Description of your changes

This PR includes only changes to the skeleton of the Python hooks, without adding the hooks themselves, as that would require a new release.

Changes include, but are not limited to:

  • Addition of basic tests in pytest (mostly generated by GH Copilot, so they may not be very useful, but at least they catch basic issues)
  • Renaming of basic skeleton identifiers to names more familiar from the Bash hooks (e.g., CLI_SUBCOMMAND_NAME to HOOK_ID)
  • Dropping of Python 2 support from the Python package definition - it was never actually supported, as terraform_docs_replace has used Python 3 syntax since the hook was added
  • Addition of linters and fixes for most of the issues they found
    • It would be nice to support relative imports of submodules, but I didn't find a solution that passes the linters, especially mypy: error: No parent module -- cannot perform relative import - Found 1 error in 1 file (errors prevented further checking)
  • Reimplementation of the _common.sh functionality, located mostly in _common.py, _run_on_whole_repo.py and _logger.py. The common::parse_cmdline function was fully reworked by @webknjaz in previous PRs, and its common functionality now lives in _cli_parsing.py

This PR DOES NOT include any hook additions or breaking changes to existing hooks. The branch with the hooks contains a bunch of TODOs, some of which could result in rework of functionality in this PR. But there are already many changes here, and that could block @webknjaz from implementing testing best practices, so I checked out a new branch, dropped everything that could require a new release, and created this PR to merge the non-user-facing changes to master.

How can we test changes

You can add this snippet to your .pre-commit-config.yaml:

repos:
  - repo: https://github.com/antonbabenko/pre-commit-terraform
    rev: a278a04b257f82ee86c84410e7f462fb357a9810
    hooks:
      - id: terraform_docs_replace

or play around using the updated CONTRIBUTING.md notes.

Additional references

This PR includes and is based on the work of @webknjaz, who set up this best-practice CLI structure in previous PRs and fixed an essential bug with accessing .pre-commit-hooks.yaml values in the not-yet-released Python hooks (#740, included in this PR), and on the initial work of @ericfrederich, who showed us in #652 that a reimplementation in Python can be done relatively easily.

Comment on lines +116 to +117
if exit_code != 0:
    final_exit_code = exit_code
Contributor:

How about reporting the number of failures through the return code?

Suggested change
-if exit_code != 0:
-    final_exit_code = exit_code
+final_exit_code += exit_code
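As an illustration of that idea, a minimal sketch of the accumulating variant; the cap is an added assumption, since POSIX exit statuses are truncated to 8 bits and a raw sum could wrap back to 0:

final_exit_code = 0
for exit_code in per_dir_exit_codes:  # hypothetical iterable of per-directory results
    final_exit_code += exit_code != 0  # count failures instead of keeping only the last code
# exit statuses are taken modulo 256, so cap the count to keep it non-zero on failure
return min(final_exit_code, 255)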

return final_exit_code


class BinaryNotFoundError(Exception):
Contributor:

This would be useful in a dedicated module. And it should probably inherit a LookupError, maybe together with something else:

Suggested change
-class BinaryNotFoundError(Exception):
+class BinaryNotFoundError(FileNotFoundError, LookupError):
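For illustration, a sketch of what the multiple inheritance buys callers; the docstring and the try/except are assumptions, not code from this PR:

class BinaryNotFoundError(FileNotFoundError, LookupError):
    """Raised when neither Terraform nor OpenTofu can be located."""


try:
    raise BinaryNotFoundError('no terraform or tofu binary on PATH')
except LookupError:
    # callers that only know the stdlib hierarchy can still catch it,
    # and `except FileNotFoundError` / `except OSError` work as well
    pass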

Contributor @webknjaz commented on Jan 7, 2025:

In general, it's a bad structural decision to have a soup-mixture style module holding just about any arbitrary low-level thing from the project. It's better to use more specific names that represent what the module contains. Most of the modules in the project expose common functions; only the outer layer of the app doesn't re-export them, really.

Give this module a narrow purpose and stick with it. Then move anything that doesn't fit that category into other modules with names that have some semantic meaning, rather than a generic English word that means whatever the reader would like to guess it means.

Honestly, I'd hoped WPS100 would catch this, but that check is rather dumb right now: https://wemake-python-styleguide.rtfd.io/en/latest/pages/usage/violations/naming.html#wemake_python_styleguide.violations.naming.WrongModuleNameViolation.

return expanded_args


def per_dir_hook(
Contributor:

This name doesn't look like a function: there's no action in it, no verb. It looks like the name of a constant and should be renamed to something that sounds like it does things, not just exists.

from copy import copy


class ColoredFormatter(logging.Formatter):
Contributor:

Integrate Rich instead.
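For reference, a minimal sketch of what that could look like with Rich's stock logging handler; the configuration values here are illustrative:

import logging

from rich.logging import RichHandler

# RichHandler does its own level/color rendering, so the format string can stay minimal
logging.basicConfig(
    level=logging.INFO,
    format='%(message)s',
    handlers=[RichHandler(rich_tracebacks=True)],
)
logging.getLogger(__name__).warning('colored output without a hand-rolled Formatter')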

Contributor:

This whole thing reads weird. Perhaps you meant entire?

Check if a function is defined in the global scope.

Args:
    scope (dict): The scope (usually globals()) to check in.
Contributor:

If you're ever going to integrate Sphinx, beware that it's expected that you use RST syntax in docstrings:

Suggested change
-    scope (dict): The scope (usually globals()) to check in.
+    scope (dict): The scope (usually ``globals()``) to check in.

(this is a simplistic example that would probably be :py:func:`globals` IRL)

return is_defined and is_callable


def is_hook_run_on_whole_repo(hook_id: str, file_paths: list[str]) -> bool:
Contributor:

(entire)



@runtime_checkable
Contributor:

Any reason to do this in this PR?

Comment on lines +17 to +20
if 'pre_commit_terraform.__main__' in sys.modules:
    importlib.reload(sys.modules['pre_commit_terraform.__main__'])
else:
    import pre_commit_terraform.__main__  # noqa: F401
Contributor:

Nooooooo! Don't do this! Use runpy like you're supposed to. This module is to be invoked as an entry-point and not in the middle of runtime. So in general you should've instead just called it through sys.executable and checked the return code + the output. No need to patch stuff in this case.

mock_return_code = 0

mocker.patch('sys.argv', mock_argv)
mock_invoke_cli_app = mocker.patch(
Contributor:

Also, you could do a --help instead of patching most of the logic out. It would give you your successful return code.
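A sketch combining the two suggestions above (spawn the real entry point through sys.executable and ask for --help); the 'usage' assertion assumes argparse's default help output:

import subprocess
import sys


def test_main_module_prints_help():
    """Run the package as a CLI in a fresh interpreter instead of reloading it in-process."""
    result = subprocess.run(
        [sys.executable, '-m', 'pre_commit_terraform', '--help'],
        capture_output=True,
        text=True,
        check=False,
    )

    assert result.returncode == 0
    assert 'usage' in result.stdout.lower()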

Comment on lines +13 to +69
def test_populate_common_argument_parser(mocker):
    parser = ArgumentParser(add_help=False)
    populate_common_argument_parser(parser)
    args = parser.parse_args(
        ['-a', 'arg1', '-h', 'hook1', '-i', 'init1', '-e', 'env1', 'file1', 'file2'],
    )

    assert args.args == ['arg1']
    assert args.hook_config == ['hook1']
    assert args.tf_init_args == ['init1']
    assert args.env_vars_strs == ['env1']
    assert args.files == ['file1', 'file2']


def test_populate_common_argument_parser_defaults(mocker):
    parser = ArgumentParser(add_help=False)
    populate_common_argument_parser(parser)
    args = parser.parse_args([])

    assert args.args == []
    assert args.hook_config == []
    assert args.tf_init_args == []
    assert args.env_vars_strs == []
    assert args.files == []


def test_populate_common_argument_parser_multiple_values(mocker):
    parser = ArgumentParser(add_help=False)
    populate_common_argument_parser(parser)
    args = parser.parse_args(
        [
            '-a',
            'arg1',
            '-a',
            'arg2',
            '-h',
            'hook1',
            '-h',
            'hook2',
            '-i',
            'init1',
            '-i',
            'init2',
            '-e',
            'env1',
            '-e',
            'env2',
            'file1',
            'file2',
        ],
    )

    assert args.args == ['arg1', 'arg2']
    assert args.hook_config == ['hook1', 'hook2']
    assert args.tf_init_args == ['init1', 'init2']
    assert args.env_vars_strs == ['env1', 'env2']
    assert args.files == ['file1', 'file2']
Contributor:

These are really variations of a single test, so they should be a single test function. To run it multiple times, use @pytest.mark.parametrize. Don't forget to specify test IDs for better UX/DX in the test report and invocation.
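A condensed sketch of that shape; the import path of populate_common_argument_parser and the test IDs are assumptions:

from argparse import ArgumentParser

import pytest

from pre_commit_terraform._cli_parsing import populate_common_argument_parser


@pytest.mark.parametrize(
    ('argv', 'expected_args', 'expected_files'),
    (
        pytest.param(['-a', 'arg1', 'file1'], ['arg1'], ['file1'], id='single-values'),
        pytest.param([], [], [], id='defaults'),
        pytest.param(
            ['-a', 'arg1', '-a', 'arg2', 'file1', 'file2'],
            ['arg1', 'arg2'],
            ['file1', 'file2'],
            id='repeated-flags',
        ),
    ),
)
def test_populate_common_argument_parser(argv, expected_args, expected_files):
    """Parse the common CLI flags and check the resulting namespace."""
    parser = ArgumentParser(add_help=False)
    populate_common_argument_parser(parser)

    args = parser.parse_args(argv)

    assert args.args == expected_args
    assert args.files == expected_files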

# ?
# ? populate_common_argument_parser
# ?
def test_populate_common_argument_parser(mocker):
Contributor:

Also, why didn't you write a docstring, so I would know what you're testing/expecting here?

Contributor:

In general, I see a huge overuse of mocking here. It's best to avoid it entirely where possible, or keep only a few exceptions. Designing the runtime code well (for example, through dependency injection) allows testing without mocking random things out.
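For illustration, one way such an injection seam could look, so a test passes a stub instead of patching module internals; all names here are hypothetical:

import subprocess
from collections.abc import Callable


def run_per_dir(
    dirs: list[str],
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,  # injectable seam
) -> int:
    """Count failing directories; the process runner is injected rather than hard-wired."""
    failures = 0
    for directory in dirs:
        completed = run(['terraform', 'validate'], cwd=directory, check=False)
        failures += completed.returncode != 0
    return failures


def test_run_per_dir_counts_failures():
    def fake_run(cmd, cwd, check):  # no real terraform binary needed, no mocker.patch
        return subprocess.CompletedProcess(cmd, returncode=1 if cwd == 'bad' else 0)

    assert run_per_dir(['good', 'bad'], run=fake_run) == 1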

mock_subcommand_modules,
)

from pre_commit_terraform._cli_subcommands import (
Contributor:

Do not add imports within functions. It's a really bad idea that makes the runtime less predictable and more difficult to interpret. Also, it's already imported at the top; it's not going to load the file a second time, since it's already cached. Don't disable the linters that report that this is dumb.

Comment on lines +2 to +3
import os
from os.path import join
Contributor:

Double import. Don't. Just don't. Enable the linters and fix the problems instead of showing this to other humans.

# ?
# ? get_unique_dirs
# ?
def test_get_unique_dirs_empty():
Contributor:

Docstring?

Comment on lines +148 to +181
def test_parse_env_vars_empty():
    env_var_strs = []
    result = parse_env_vars(env_var_strs)
    assert result == {}


def test_parse_env_vars_single():
    env_var_strs = ['VAR1=value1']
    result = parse_env_vars(env_var_strs)
    assert result == {'VAR1': 'value1'}


def test_parse_env_vars_multiple():
    env_var_strs = ['VAR1=value1', 'VAR2=value2']
    result = parse_env_vars(env_var_strs)
    assert result == {'VAR1': 'value1', 'VAR2': 'value2'}


def test_parse_env_vars_with_quotes():
    env_var_strs = ['VAR1="value1"', 'VAR2="value2"']
    result = parse_env_vars(env_var_strs)
    assert result == {'VAR1': 'value1', 'VAR2': 'value2'}


def test_parse_env_vars_with_equal_sign_in_value():
    env_var_strs = ['VAR1=value=1', 'VAR2=value=2']
    result = parse_env_vars(env_var_strs)
    assert result == {'VAR1': 'value=1', 'VAR2': 'value=2'}


def test_parse_env_vars_with_empty_value():
    env_var_strs = ['VAR1=', 'VAR2=']
    result = parse_env_vars(env_var_strs)
    assert result == {'VAR1': '', 'VAR2': ''}
Contributor:

parametrize

Comment on lines +187 to +226
def test_expand_env_vars_no_vars():
    args = ['arg1', 'arg2']
    env_vars = {}
    result = expand_env_vars(args, env_vars)
    assert result == ['arg1', 'arg2']


def test_expand_env_vars_single_var():
    args = ['arg1', '${VAR1}', 'arg3']
    env_vars = {'VAR1': 'value1'}
    result = expand_env_vars(args, env_vars)
    assert result == ['arg1', 'value1', 'arg3']


def test_expand_env_vars_multiple_vars():
    args = ['${VAR1}', 'arg2', '${VAR2}']
    env_vars = {'VAR1': 'value1', 'VAR2': 'value2'}
    result = expand_env_vars(args, env_vars)
    assert result == ['value1', 'arg2', 'value2']


def test_expand_env_vars_no_expansion():
    args = ['arg1', 'arg2']
    env_vars = {'VAR1': 'value1'}
    result = expand_env_vars(args, env_vars)
    assert result == ['arg1', 'arg2']


def test_expand_env_vars_partial_expansion():
    args = ['arg1', '${VAR1}', '${VAR2}']
    env_vars = {'VAR1': 'value1'}
    result = expand_env_vars(args, env_vars)
    assert result == ['arg1', 'value1', '${VAR2}']


def test_expand_env_vars_with_special_chars():
    args = ['arg1', '${VAR_1}', 'arg3']
    env_vars = {'VAR_1': 'value1'}
    result = expand_env_vars(args, env_vars)
    assert result == ['arg1', 'value1', 'arg3']
Contributor:

parametrize

Comment on lines +232 to +263
def test_get_tf_binary_path_from_hook_config():
    hook_config = ['--tf-path=/custom/path/to/terraform']
    result = get_tf_binary_path(hook_config)
    assert result == '/custom/path/to/terraform'


def test_get_tf_binary_path_from_pct_tfpath_env_var(mocker):
    hook_config = []
    mocker.patch.dict(os.environ, {'PCT_TFPATH': '/env/path/to/terraform'})
    result = get_tf_binary_path(hook_config)
    assert result == '/env/path/to/terraform'


def test_get_tf_binary_path_from_terragrunt_tfpath_env_var(mocker):
    hook_config = []
    mocker.patch.dict(os.environ, {'TERRAGRUNT_TFPATH': '/env/path/to/terragrunt'})
    result = get_tf_binary_path(hook_config)
    assert result == '/env/path/to/terragrunt'


def test_get_tf_binary_path_from_system_path_terraform(mocker):
    hook_config = []
    mocker.patch('shutil.which', return_value='/usr/local/bin/terraform')
    result = get_tf_binary_path(hook_config)
    assert result == '/usr/local/bin/terraform'


def test_get_tf_binary_path_from_system_path_tofu(mocker):
    hook_config = []
    mocker.patch('shutil.which', side_effect=[None, '/usr/local/bin/tofu'])
    result = get_tf_binary_path(hook_config)
    assert result == '/usr/local/bin/tofu'
Contributor:

parametrize


def test_get_tf_binary_path_from_pct_tfpath_env_var(mocker):
    hook_config = []
    mocker.patch.dict(os.environ, {'PCT_TFPATH': '/env/path/to/terraform'})
Contributor:

use monkeypatch.setenv()
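Something along these lines, using pytest's built-in monkeypatch fixture, which undoes the change automatically after the test:

def test_get_tf_binary_path_from_pct_tfpath_env_var(monkeypatch):
    monkeypatch.setenv('PCT_TFPATH', '/env/path/to/terraform')

    assert get_tf_binary_path([]) == '/env/path/to/terraform'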

Comment on lines +269 to +274
with pytest.raises(
    BinaryNotFoundError,
    match='Neither Terraform nor OpenTofu binary could be found. Please either set the "--tf-path"'
    + ' hook configuration argument, or set the "PCT_TFPATH" environment variable, or set the'
    + ' "TERRAGRUNT_TFPATH" environment variable, or install Terraform or OpenTofu globally.',
):
Contributor:

Suggested change
-with pytest.raises(
-    BinaryNotFoundError,
-    match='Neither Terraform nor OpenTofu binary could be found. Please either set the "--tf-path"'
-    + ' hook configuration argument, or set the "PCT_TFPATH" environment variable, or set the'
-    + ' "TERRAGRUNT_TFPATH" environment variable, or install Terraform or OpenTofu globally.',
-):
+error_msg = (
+    r'^Neither Terraform nor OpenTofu binary could be found. Please either set the "--tf-path" '
+    'hook configuration argument, or set the "PCT_TFPATH" environment variable, or set the '
+    '"TERRAGRUNT_TFPATH" environment variable, or install Terraform or OpenTofu globally\.$'
+)
+with pytest.raises(BinaryNotFoundError, match=error_msg):
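An alternative sketch that sidesteps manual regex escaping altogether, since pytest's match= argument is treated as a regular expression; the call inside the with block is assumed from the surrounding test:

import re

error_msg = (
    'Neither Terraform nor OpenTofu binary could be found. Please either set the "--tf-path" '
    'hook configuration argument, or set the "PCT_TFPATH" environment variable, or set the '
    '"TERRAGRUNT_TFPATH" environment variable, or install Terraform or OpenTofu globally.'
)
with pytest.raises(BinaryNotFoundError, match=re.escape(error_msg)):
    get_tf_binary_path([])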

Contributor:

This module is rather pointless. Delete it.

def sample_function():
    pass

scope = globals()
Contributor:

Wow, this is nasty! Messing up the entire process state for all the test functions, eh? Don't do this. At least copy the thing instead of side-effecting everything...

Suggested change
-scope = globals()
+scope = globals().copy()

Comment on lines +17 to +18
scope = globals()
scope['sample_function'] = sample_function
Contributor:

But really, can't you just avoid the unnecessarily complicated route? KISS.

Suggested change
-scope = globals()
-scope['sample_function'] = sample_function
+scope = {'sample_function': sample_function}

Comment on lines +10 to +46
# ?
# ? is_function_defined
# ?
def test_is_function_defined_existing_function():
    def sample_function():
        pass

    scope = globals()
    scope['sample_function'] = sample_function

    assert is_function_defined('sample_function', scope) is True


def test_is_function_defined_non_existing_function():
    scope = globals()

    assert is_function_defined('non_existing_function', scope) is False


def test_is_function_defined_non_callable():
    non_callable = 'I am not a function'
    scope = globals()
    scope['non_callable'] = non_callable

    assert is_function_defined('non_callable', scope) is False


def test_is_function_defined_callable_object():
    class CallableObject:
        def __call__(self):
            pass

    callable_object = CallableObject()
    scope = globals()
    scope['callable_object'] = callable_object

    assert is_function_defined('callable_object', scope) is True
Contributor:

Parametrize

Comment on lines +71 to +108
def test_is_hook_run_on_whole_repo(mocker, mock_git_ls_files, mock_hooks_config):
    # Mock the return value of git ls-files
    mocker.patch('subprocess.check_output', return_value='\n'.join(mock_git_ls_files))
    # Mock the return value of reading the .pre-commit-hooks.yaml file
    mocker.patch('builtins.open', mocker.mock_open(read_data=yaml.dump(mock_hooks_config)))
    # Mock the Path object to return a specific path
    mock_path = mocker.patch('pathlib.Path.resolve')
    mock_path.return_value.parents.__getitem__.return_value = Path('/mocked/path')
    # Mock the read_text method of Path to return the hooks config
    mocker.patch('pathlib.Path.read_text', return_value=yaml.dump(mock_hooks_config))

    # Test case where files match the included pattern and do not match the excluded pattern
    files = [
        'environment/prd/backends.tf',
        'environment/prd/data.tf',
        'environment/prd/main.tf',
        'environment/prd/outputs.tf',
        'environment/prd/providers.tf',
        'environment/prd/variables.tf',
        'environment/prd/versions.tf',
        'environment/qa/backends.tf',
    ]
    assert is_hook_run_on_whole_repo('example_hook_id', files) is True

    # Test case where files do not match the included pattern
    files = ['environment/prd/README.md']
    assert is_hook_run_on_whole_repo('example_hook_id', files) is False

    # Test case where files match the excluded pattern
    files = ['environment/prd/.terraform/config.tf']
    assert is_hook_run_on_whole_repo('example_hook_id', files) is False

    # Test case where hook_id is not found
    with pytest.raises(
        ValueError,
        match='Hook ID "non_existing_hook_id" not found in .pre-commit-hooks.yaml',
    ):
        is_hook_run_on_whole_repo('non_existing_hook_id', files)
Contributor:

This is actually multiple tests stuffed into one. They should be split, perhaps parametrized.

Instead of mocking, you can use tmp_path to generate an actual example Git repo and just run the test against that. With mocks you're mostly testing your mocks and not the logic, especially with this many in place.
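A rough sketch of that direction, not a drop-in replacement: it assumes the function shells out to git ls-files in the current working directory, that terraform_fmt (used here as a stand-in hook ID) exists in the real .pre-commit-hooks.yaml, and that "whole repo" means the passed files equal every tracked file matching the hook's patterns:

import subprocess


def test_is_hook_run_on_whole_repo_against_real_repo(tmp_path, monkeypatch):
    """Exercise the logic against a throwaway Git repo instead of a pile of mocks."""
    repo = tmp_path / 'repo'
    repo.mkdir()
    subprocess.run(['git', 'init', '--quiet', str(repo)], check=True)
    (repo / 'main.tf').write_text('# matches the hook files pattern\n')
    (repo / 'variables.tf').write_text('# second tracked file\n')
    subprocess.run(['git', '-C', str(repo), 'add', '.'], check=True)
    monkeypatch.chdir(repo)

    assert is_hook_run_on_whole_repo('terraform_fmt', ['main.tf', 'variables.tf']) is True
    assert is_hook_run_on_whole_repo('terraform_fmt', ['main.tf']) is False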

Contributor:

This isn't very useful either. Delete.

Comment on lines +28 to +36
assert isinstance(MockSubcommandModule(), CLISubcommandModuleProtocol)

class InvalidSubcommandModule:
    HOOK_ID = 'invalid_hook'

    def populate_argument_parser(self, subcommand_parser: ArgumentParser) -> None:
        pass

assert not isinstance(InvalidSubcommandModule(), CLISubcommandModuleProtocol)
Contributor:

This looks like two separate tests; they shouldn't be one.

Contributor @webknjaz commented on Jan 7, 2025:

Honestly, this is merely checking that the CPython stdlib works. I don't see it being useful and would just remove it.
