
Replace hcl2json Docker call with python-hcl2 #337

Draft
gavinelder wants to merge 11 commits into chore/v26-tf-variable-validation from chore/v26-replace-hcl2json-docker

Conversation

@gavinelder

Stacked on PR #335 (`chore/v26-tf-variable-validation`) — merge that first.

Replaces the `tmccombs/hcl2json` Docker container call in `scripts/installer/utils/extractors.py` with the in-process `python-hcl2` library. The previous setup worked but put the Docker daemon on the critical path of every Python script that reads tfvars — including the `data.external` block in `modules/connection_strings/v1.0.0/main.tf` that Terraform itself invokes during plan (and therefore during `terraform test`).

Why now

This was the load-bearing reason every `.tftest.hcl` file in #335 needed an `override_data` block stubbing the connection-strings external data source. Removing the Docker dependency means the real Python script can run during `terraform test` and we can drop those overrides.

Changes

`scripts/installer/utils/extractors.py`

  • Swapped the `subprocess.run(["docker", "run", ...])` block for `hcl2.load(fp)`.
  • Added a small `_unwrap_hcl_strings` helper because python-hcl2 preserves the surrounding double quotes on string scalars (`foo = "bar"` parses to `'"bar"'`); the consumer code expects plain Python strings, matching the previous Docker output shape.
  • Lists, dicts, bools, numbers pass through unchanged.

`scripts/installer/requirements.txt` (new)

  • `python-hcl2>=8,<9`. Kept separate from `tests/requirements.txt` (pytest fixture stack with a pre-existing malformed line) so the installer-runtime dep set stays isolated.

`tests/terraform/*.tftest.hcl` × 4

  • Dropped the `override_data { target = module.connection_strings.data.external.generate_db_connection_string }` block from each file. The script now runs successfully under `terraform test` without a daemon. The four `aws_ssm_parameter` overrides remain — those are real AWS calls that still need mocking.

`.github/workflows/terraform-test.yml`

  • Replaced the placeholder comment about Python deps with an actual `pip install -r scripts/installer/requirements.txt` step.
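
The step amounts to something like this (a sketch; the exact step name and placement in the job are assumptions):

```yaml
- name: Install installer script dependencies
  run: pip install -r scripts/installer/requirements.txt
```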

Verification

  • `make terraform_test` → 32 passed, 0 failed without a Docker daemon running.
  • `terraform fmt -check -recursive` clean.
  • `generate_db_connection_string.py` produces the same single-line JSON output (`{"status": "0", "value": "?allowPublicKeyRetrieval=true&useSSL=false&permitMysqlScheme=true"}`) as before.
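
For reference, that single-line output is exactly what Terraform's `external` data source protocol requires: one JSON object of string values on stdout. A minimal sketch of the output step (values taken from the verification run above; the surrounding script logic is not shown):

```python
import json

# Terraform's external data source protocol: print a single JSON object
# whose values are all strings to stdout.
result = {
    "status": "0",
    "value": "?allowPublicKeyRetrieval=true&useSSL=false&permitMysqlScheme=true",
}
print(json.dumps(result))
```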

Tradeoff

Adds one third-party Python dependency. The previous comment in `extractors.py` explained the original author preferred an immutable container hash to avoid "introducing packages from the internet" — that constraint loses to the daily friction of a Docker daemon requirement on local dev, CI, and `terraform test`. `python-hcl2` is well-maintained, and the surface area we use (HCL → dict) is trivially auditable.

Stacked on #335. Automated by Claude Code.

gavinelder added 11 commits May 6, 2026 17:20

scripts/installer/utils/extractors.py previously shelled out to a vendored
`tmccombs/hcl2json` Docker image to convert terraform.tfvars to JSON. That
worked, but pulled the Docker daemon onto the critical path of every Python
script that reads tfvars — including the data.external block in
modules/connection_strings/v1.0.0/main.tf, which terraform itself invokes
during `plan` (and therefore during `terraform test`).

Swaps to `python-hcl2` (in-process, pure Python). Behavioural notes:
  - python-hcl2 preserves the surrounding double quotes on string scalars
    (`foo = "bar"` parses to the Python string `'"bar"'`). A small
    `_unwrap_hcl_strings` walks the parsed structure and strips them so the
    downstream consumers see the same shape they did with the Docker output.
  - Lists, dicts, bools, numbers pass through unchanged.

Knock-on cleanups enabled by the swap:
  - Each tests/terraform/*.tftest.hcl file dropped its
    `override_data { target = module.connection_strings.data.external.generate_db_connection_string }`
    block — the script now runs successfully under `terraform test` without
    needing a daemon. (The make_terraform_test wrapper still copies the
    fixture tfvars to project root because the script reads from disk by
    hard-coded path; no change there.)
  - .github/workflows/terraform-test.yml: replaced the comment on Python
    deps with an actual `pip install -r scripts/installer/requirements.txt`.


scripts/installer/requirements.txt (new)
  - python-hcl2>=8,<9 — the only runtime third-party dep this script needs.
  - Kept separate from tests/requirements.txt, which carries the pytest
    fixture stack (and at present a malformed line) and shouldn't be coupled
    to the installer-script runtime.

Local: terraform fmt -check -recursive (clean), make terraform_test
(32 passed, 0 failed) without a Docker daemon running.

Matches the pinning pattern used elsewhere in this workflow
(terraform_version: 1.14.8, tflint_version: v0.61.0).

…suite

Now that scripts/installer/utils/extractors.py runs in-process via python-hcl2,
the existing pytest plan-based suite under tests/unit/ can run in CI without
a Docker daemon. Adds two new steps to the workflow and renames the workflow
file to ci.yml, since the scope is no longer just `terraform test`.

New steps:
  - extractors.py smoke — imports the parser against the generated fixture and
    asserts a sensible value comes through. Fast end-to-end check that the
    new HCL parser handles the real tfvars shape.
  - pytest -m local — runs the plan-based test suite (the @pytest.mark.local
    selector). These tests invoke `terraform plan` against the mocked fixture
    and assert on rendered config / output content. Testcontainer-based markers
    stay out until a separate job provisions a Docker daemon.
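
If the `local` marker is registered in pytest config — hypothetical; the repo's actual registration may differ — it would look roughly like:

```ini
[pytest]
markers =
    local: plan-based tests that run terraform plan against mocked fixtures
    testcontainer: tests that require a Docker daemon (excluded in this job)
```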

Workflow rename:
  - .github/workflows/terraform-test.yml -> ci.yml. The job is no longer just
    one type of test; updated the file's path filter and the chore/v26-** push
    trigger comment accordingly.
  - Path filter widened to scripts/installer/** and tests/** so the pytest job
    triggers on all relevant changes.
  - Job timeout bumped 15m -> 20m to give the pytest suite headroom.

tests/requirements.txt:
  - Replaced the malformed `pip install requests=2.31.0` line (which was a
    bash command pasted into a requirements file) with the actual requirement
    `requests==2.31.0`. The comment above already explained why this pin is
    necessary (testcontainers / docker-py compatibility); kept that.

The AWS provider deprecated the `overwrite` argument on aws_ssm_parameter in
5.x and will remove it in v6. Plan/apply currently emits ~27 warnings on
every run from the four resources in 005_parameter_store.tf that used it.

Removed the attribute. Behaviour change: Terraform now manages SSM parameters
via its normal lifecycle — creates them if absent, updates them in place,
deletes them on resource removal. The previous `overwrite = false` default
was meant to guard against clobbering parameters created outside Terraform,
which is no longer the recommended pattern (use `terraform import` instead).
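
Under that recommended pattern, a pre-existing parameter would be adopted via `terraform import` (or, on Terraform 1.5+, an `import` block). The resource address and parameter name here are hypothetical:

```hcl
import {
  to = aws_ssm_parameter.example    # hypothetical resource address
  id = "/example/app/db_password"   # hypothetical SSM parameter name
}
```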

Knock-on cleanup of the now-unused flag:
  - variables.tf: removed `flag_overwrite_ssm_keys` declaration.
  - templates/TEMPLATE_terraform.tfvars: removed the SSM section header,
    accompanying explanatory comment block, and the variable assignment.

Local: terraform fmt clean, terraform test 32/32 passing, and the recurring
deprecation warnings are gone from `terraform plan` output.

The installer package lives at scripts/installer/, not at the project root.
Without PYTHONPATH=scripts, the inline 'python3 -c "from installer..."'
invocation fails with ModuleNotFoundError.

tests/utils/filehandling.py and tests/utils/assertions/verify_assertions.py
import yaml at module-load time; pytest collection fails with
ModuleNotFoundError before any test runs without it. Pre-existing dep that
was never listed (pyyaml gets pulled in transitively when running locally
via other tools, but CI doesn't have that).

tests/remote/ imports boto3 at collection time; even with -m local filtering
out its tests, pytest's collection step fails with ModuleNotFoundError.
That suite should run in a future real-AWS job, not the plan-based one.

Two more Docker/AWS dependencies were blocking the pytest plan-based suite
in CI. Both fixed in-place rather than worked around in the workflow.

tests/conftest.py
  - The session_setup fixture used `docker run ... hcl2json` to convert
    009_define_file_templates.tf to JSON for downstream test consumers
    (tests/utils/terraform/template_generator.py reads the JSON output).
    Replaced with `hcl2.load(fp)` + `json.dump(...)` — same on-disk shape
    so the consumer needs no changes.

tests/utils/preflight/preflight.py
  - check_aws_sso_token() unconditionally shelled out to `aws sts
    get-caller-identity`. Added an opt-out via CX_SKIP_AWS_CHECK=true env
    var, used by CI where the test fixture sets `use_mocks = true` so no
    real AWS calls are made.
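
A minimal sketch of that gate (the real function's signature and return convention in preflight.py may differ):

```python
import os
import subprocess

def check_aws_sso_token() -> bool:
    # Opt-out for CI: the fixture uses mocks, so no real AWS call is needed.
    if os.environ.get("CX_SKIP_AWS_CHECK", "").lower() == "true":
        return True
    # Otherwise verify a live session by calling STS.
    result = subprocess.run(
        ["aws", "sts", "get-caller-identity"],
        capture_output=True,
    )
    return result.returncode == 0
```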

.github/workflows/ci.yml
  - Removed `chore/v26-**` from push triggers — the pull_request trigger
    already covers PR visibility, and having both was firing two duplicate
    runs per push.
  - Added a `concurrency` group keyed on workflow + ref + event so future
    overlapping pushes cancel each other instead of stacking.
  - Pytest step now sets CX_SKIP_AWS_CHECK=true.
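
The concurrency stanza is presumably along these lines (a sketch; the exact group key expression is an assumption):

```yaml
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}-${{ github.event_name }}
  cancel-in-progress: true
```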

session_setup unconditionally moved `terraform.tfvars` to a .backup file at
session start and restored it at session end. This works when a tester has
an existing local tfvars; CI runs the workflow with a clean checkout and
no project-root tfvars, so the move failed with FileNotFoundError.

Both calls are now gated on Path(...).exists().
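
A sketch of the gated backup/restore, with hypothetical paths (the fixture's actual file names and structure may differ):

```python
import shutil
from pathlib import Path

TFVARS = Path("terraform.tfvars")           # hypothetical project-root path
BACKUP = Path("terraform.tfvars.backup")

def backup_tfvars():
    # CI checkouts have no project-root tfvars; only move if one exists.
    if TFVARS.exists():
        shutil.move(TFVARS, BACKUP)

def restore_tfvars():
    if BACKUP.exists():
        shutil.move(BACKUP, TFVARS)
```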

… job

The bare `terraform plan` invocations in the pytest plan-based suite need
the AWS provider to initialize cleanly. The fixture pinned
`aws_profile = "development"` (a dev SSO profile not present in CI) and
no AWS_* env credentials, so plan failed before any test ran.

  - Workflow step now sed's `aws_profile = ""` into the generated
    base-overrides.auto.tfvars so the provider falls back to env credentials.
  - Pytest step exports AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY /
    AWS_DEFAULT_REGION dummies so the provider initialization succeeds.
  - All AWS data sources are gated on `use_mocks = true` in the fixture, so
    no real AWS API calls fire; the provider just needs to think it can
    authenticate to satisfy plan-time validation.
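
The sed step amounts to something like this (file name from the commit message; the original profile value shown here is an assumption):

```shell
# Simulate the generated overrides file, then blank the profile in place
# so the AWS provider falls back to env credentials.
printf 'aws_profile = "development"\n' > base-overrides.auto.tfvars
sed -i 's/aws_profile = ".*"/aws_profile = ""/' base-overrides.auto.tfvars
cat base-overrides.auto.tfvars
```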

The pytest @pytest.mark.local suite still fails in CI because the bare
`terraform plan` invocations in tests/utils/terraform/executor.py need
the AWS provider to initialize, and the executor swallows stderr (DEVNULL)
so the failure cause isn't surfaced.

Reverting just the pytest step and the related env-var workarounds. Keeping
all the underlying Docker-free improvements (extractors.py, conftest.py
hcl2 swap, CX_SKIP_AWS_CHECK gate, tfvars backup gating, pyyaml dep,
malformed-requirements fix) since they're needed for the pytest job whenever
it does land.

The benefits already delivered in this PR:
  - terraform fmt -check (new)
  - tflint --recursive (new)
  - terraform test (32/32, no Docker daemon required)
  - extractors.py smoke (validates the python-hcl2 swap end-to-end)
  - de-duplicated CI runs (concurrency group + drop chore/v26 push trigger)