@grusin-db
Collaborator

Changes

Adds a row_count dataset-level check. For each record in the dataset, it computes the row count of the table that the record names. If the count fails, the failure reason (e.g. the table does not exist, insufficient permissions) is written to a dedicated column.

Example structure of the YAML checks:

# row_count check with single column containing table names
- criticality: error
  check:
    function: row_count
    arguments:
      table_expr: table_name # this column will contain the name of a table to compute row counts for

# row_count check with multiple columns forming fully qualified table name
- criticality: error
  check:
    function: row_count
    arguments:
      table_expr: "catalog_name || '.' || schema_name || '.' || table_name" # expression that computes the fully qualified table name
      row_count_column: table_row_count
      row_count_error_column: table_row_count_error
      worker_count: 16
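
The behaviour described above (count rows per table, record the failure reason instead of raising, run lookups in parallel) can be sketched in plain Python. This is only an illustration under assumptions, not the implementation in this PR: it uses no Spark or DQX APIs, `row_counts_with_errors`, `fake_count_rows`, and `FAKE_TABLES` are hypothetical names, and the result keys simply mirror the column names from the second YAML example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def row_counts_with_errors(
    table_names: Iterable[str],
    count_rows: Callable[[str], int],
    worker_count: int = 16,
) -> List[Dict[str, object]]:
    """Count rows for each table, capturing failures as text instead of raising."""

    def count_one(name: str) -> Dict[str, object]:
        try:
            return {
                "table_name": name,
                "table_row_count": count_rows(name),
                "table_row_count_error": None,
            }
        except Exception as exc:
            # Keep the reason in a dedicated field so one bad table
            # does not abort the whole run.
            return {
                "table_name": name,
                "table_row_count": None,
                "table_row_count_error": f"{type(exc).__name__}: {exc}",
            }

    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        # pool.map yields results in the same order as the input names.
        return list(pool.map(count_one, list(table_names)))


# Hypothetical stand-in for a real table lookup.
FAKE_TABLES = {"main.sales.orders": 3, "main.sales.customers": 0}


def fake_count_rows(name: str) -> int:
    if name not in FAKE_TABLES:
        raise LookupError(f"table {name} does not exist")
    return FAKE_TABLES[name]


results = row_counts_with_errors(
    ["main.sales.orders", "main.sales.missing"],
    fake_count_rows,
    worker_count=2,
)
for row in results:
    print(row)
```

In this sketch a missing table yields a row with an empty count and a populated error field, so downstream checks can still evaluate the remaining tables.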

Updated the docs with new examples.

Tests

  • manually tested
  • added unit tests
  • added integration tests
  • added end-to-end tests
  • added performance tests

@github-actions

All commits in PR should be signed ('git commit -S ...'). See https://docs.github.com/en/authentication/managing-commit-signature-verification/signing-commits

@grusin-db changed the title from "Feat/row count" to "feat: row_count dataset level check function" on Nov 20, 2025
github-actions bot commented Nov 20, 2025

✅ 427/427 passed, 3 flaky, 35 skipped, 3h38m17s total

Flaky tests:

  • 🤪 test_save_checks_to_table_with_unresolved_for_each_column (3.384s)
  • 🤪 test_save_results_in_table_in_user_installation_only_quarantine (8.901s)
  • 🤪 test_e2e_workflow_serverless (8m27.933s)

Running from acceptance #3207

codecov bot commented Nov 20, 2025

Codecov Report

❌ Patch coverage is 94.11765% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.18%. Comparing base (e34661b) to head (462eafe).

Files with missing lines                   Patch %   Lines
src/databricks/labs/dqx/check_funcs.py     96.55%    1 Missing ⚠️
src/databricks/labs/dqx/utils.py           80.00%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #939      +/-   ##
==========================================
+ Coverage   90.15%   90.18%   +0.02%     
==========================================
  Files          60       60              
  Lines        5221     5255      +34     
==========================================
+ Hits         4707     4739      +32     
- Misses        514      516       +2     
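
The percentages in the report follow from the Hits and Lines counts in the diff above. A quick check, assuming the project figures are truncated to two decimals rather than rounded (an assumption that happens to reproduce the numbers shown); `pct_floor` is a hypothetical helper:

```python
import math


def pct_floor(ratio: float, digits: int = 2) -> float:
    """Convert a ratio to a percentage, truncated to `digits` decimals."""
    scale = 10 ** digits
    return math.floor(ratio * 100 * scale) / scale


base_ratio = 4707 / 5221   # hits / lines on main
head_ratio = 4739 / 5255   # hits / lines on the PR head

base = pct_floor(base_ratio)
head = pct_floor(head_ratio)
delta = pct_floor(head_ratio - base_ratio)

# Patch coverage: 34 added lines, 2 of them uncovered.
patch = round((34 - 2) / 34 * 100, 5)

print(base, head, delta, patch)
```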
