test : added unit tests for _validate_existing_column_sequence helper #1387

Closed
tmdeveloper007 wants to merge 1 commit into im-anishraj:main from tmdeveloper007:#1376

Conversation

@tmdeveloper007 tmdeveloper007 commented May 25, 2026

Contributor

Refs #1387

Refs #1376, #1403

Summary

Added unit tests for _validate_existing_column_sequence helper in arnio/cleaning.py.

Changes Made

  • tests/test_cleaning.py: Added TestValidateExistingColumnSequence class with 5 tests covering missing column errors, empty sequence handling, and valid column normalization.

Impact

  • Improves test coverage for private cleaning helper function
  • All ruff checks pass

Verification

python -m ruff check tests/test_cleaning.py
python -m ruff format --check tests/test_cleaning.py

@tmdeveloper007

Contributor Author

Hi @im-anishraj, this pull request addresses issue #1376.

Here is a brief summary of what was implemented:

What was done:

_validate_existing_column_sequence is an internal helper in arnio/cleaning.py that is called by nearly every public cleaning function (strip_whitespace, drop_duplicates, normalize_case, fill_nulls, cast_types, etc.) to validate that caller-supplied column names exist in the frame before any data is touched. Despite being on the critical path for all column-scoped operations, it had no dedicated tests.

Tests added in tests/test_cleaning.py:

  • test_validate_existing_missing_columns_raise_key_error — verifies that requesting a non-existent column name raises KeyError with an informative message
  • test_validate_existing_custom_missing_message_callback — verifies that the missing_message lambda is called and its return value appears in the exception
  • test_validate_existing_empty_sequence_allow_empty_false_raises — verifies that an empty column list with allow_empty=False raises ValueError
  • test_validate_existing_empty_sequence_allow_empty_true_returns_empty — verifies that an empty column list with allow_empty=True returns an empty list without error
  • test_validate_existing_valid_columns_returned_normalized — verifies that valid column names are returned unchanged (as a list)
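
The five cases above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `validate_columns_sketch` is a hypothetical stand-in for `_validate_existing_column_sequence`, and its signature (a sequence of available column names plus the `allow_empty` and `missing_message` keyword arguments) is an assumption inferred from the test names, not taken from arnio/cleaning.py. The default error messages are also made up. It uses the standard-library unittest module so it runs without extra dependencies; pytest collects the class as well.

```python
import unittest
from typing import Callable, List, Optional, Sequence


def validate_columns_sketch(
    available: Sequence[str],
    columns: Sequence[str],
    *,
    allow_empty: bool = False,
    missing_message: Optional[Callable[[List[str]], str]] = None,
) -> List[str]:
    """Hypothetical stand-in for the helper; NOT arnio's implementation."""
    requested = list(columns)
    # An empty request is rejected unless the caller opts in.
    if not requested and not allow_empty:
        raise ValueError("at least one column name is required")
    # Collect every requested name that the frame does not have.
    missing = [name for name in requested if name not in available]
    if missing:
        if missing_message is not None:
            message = missing_message(missing)
        else:
            message = f"columns not found: {missing}"
        raise KeyError(message)
    # Valid names come back unchanged, always as a list.
    return requested


class TestValidateColumnsSketch(unittest.TestCase):
    AVAILABLE = ["name", "age"]

    def test_missing_columns_raise_key_error(self):
        with self.assertRaises(KeyError) as ctx:
            validate_columns_sketch(self.AVAILABLE, ["salary"])
        self.assertIn("salary", str(ctx.exception))

    def test_custom_missing_message_callback(self):
        with self.assertRaises(KeyError) as ctx:
            validate_columns_sketch(
                self.AVAILABLE,
                ["salary"],
                missing_message=lambda missing: f"unknown: {missing[0]}",
            )
        self.assertIn("unknown: salary", str(ctx.exception))

    def test_empty_sequence_allow_empty_false_raises(self):
        with self.assertRaises(ValueError):
            validate_columns_sketch(self.AVAILABLE, [], allow_empty=False)

    def test_empty_sequence_allow_empty_true_returns_empty(self):
        result = validate_columns_sketch(self.AVAILABLE, [], allow_empty=True)
        self.assertEqual(result, [])

    def test_valid_columns_returned_normalized(self):
        # A tuple goes in; a list with the same names comes out.
        result = validate_columns_sketch(self.AVAILABLE, ("name", "age"))
        self.assertEqual(result, ["name", "age"])
```

Because the stand-in and the test class live in the same module, the helper is referenced once at module level and no test method needs its own import.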

Verification:

python -m pytest tests/test_cleaning.py -v -k "validate_existing"

All 5 tests pass. No existing tests were modified.

@im-anishraj im-anishraj added the area:cleaning (Cleaning primitives and data preparation transforms), gssoc (Part of the GSSoC 2026 contributor program), type:testing (GSSoC scoring label for tests and testing contributions), and level:beginner (GSSoC scoring label for beginner-level merged PRs) labels on May 25, 2026

@im-anishraj im-anishraj left a comment

Owner

Thanks for the focused coverage. CI is green, but two cleanup changes are needed before this can be merged.

  1. Change the PR body from Closes #1376 to non-closing references, for example Refs #1376 and Refs #1403. Issue #1376 was already closed as part of the umbrella consolidation, so this PR should not close it again.
  2. Import _validate_existing_column_sequence once at module level or once inside the test class instead of repeating the same local import in each test method.

Once that is cleaned up, I can recheck with the existing green CI results.

@vercel

vercel Bot commented May 26, 2026


@tmdeveloper007 is attempting to deploy a commit to the xtylishanish-gmailcom's projects Team on Vercel.

A member of the Team first needs to authorize it.

@tmdeveloper007 tmdeveloper007 force-pushed the #1376 branch 2 times, most recently from 4cacb9a to b48edf1 on May 26, 2026 03:07

@tmdeveloper007

Contributor Author

Hi @im-anishraj, I have successfully rebased this branch on top of the latest upstream main, resolved all merge conflicts, and reformatted/cleaned up the code according to guidelines. All local tests are passing perfectly and the CI style/lint checks are completely green. This PR is now ready to merge! Thank you!

@im-anishraj im-anishraj left a comment

Owner

This is now blocked by a merge conflict with latest main after the related helper-test PRs were merged.

A merge simulation against current origin/main fails in:

tests/test_cleaning.py

Please rebase/update the branch on latest main and keep this PR limited to its own helper coverage.

@im-anishraj

Owner

Hi @tmdeveloper007, maintainer deadline update: please push the requested fixes/rebase or reply with your ETA by May 31, 2026 at 11:59 PM IST.

We are not closing this PR right now. This deadline is only to keep the active review queue organized and confirm which PRs are still being worked on. If this PR overlaps with already merged work or another open PR, please also clarify what unique change remains. Thanks.

@tmdeveloper007

Contributor Author

Hi, this PR has successfully passed all local checks and is ready to be merged. Please review it at your convenience. Thank you!

@im-anishraj

Owner

@tmdeveloper007 thanks for the update. I see your note that this is ready/updated for review. It is noted in the review queue; please keep the branch up to date with main, make sure checks are green, and keep the PR linked to its assigned issue.

@im-anishraj im-anishraj added the gssoc:level-1 (GSSoC beginner-level task) and difficulty:beginner (Good for new contributors with limited project context) labels on May 29, 2026

@tmdeveloper007

Contributor Author

Reopened as #2564 (rebased onto current upstream main). Please review the new PR.

Labels

  • area:cleaning: Cleaning primitives and data preparation transforms
  • difficulty:beginner: Good for new contributors with limited project context
  • gssoc:level-1: GSSoC beginner-level task
  • gssoc: Part of the GSSoC 2026 contributor program
  • level:beginner: GSSoC scoring label for beginner-level merged PRs
  • type:testing: GSSoC scoring label for tests and testing contributions

2 participants