Python-based data validation tool designed to support QA and structured data review in high-volume operational environments. Built around rule-based checks to identify missing, invalid, or inconsistent data before downstream processing or reporting.
Processes CSV or Excel inputs and generates a clear output of data quality issues for review.
- Loads structured data from CSV or Excel files
- Applies validation checks for missing, invalid, or inconsistent values such as dates, IDs, and key fields
- Flags records that require review based on defined rules
- Outputs results to `error_report.csv` for QA or audit follow-up
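
The checks listed above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of `data_checker.py`: the column names (`record_id`, `event_date`), the date format, the `find_issues` helper, and the three-column report layout are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical key fields; the real dataset's schema may differ.
REQUIRED_COLUMNS = ["record_id", "event_date"]


def find_issues(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per flagged issue: row index, column, and a short reason."""
    issues = []

    # Missing values in key fields.
    for col in REQUIRED_COLUMNS:
        for idx in df.index[df[col].isna()]:
            issues.append({"row": idx, "column": col, "issue": "missing value"})

    # Invalid dates: present but not parseable as YYYY-MM-DD (assumed format).
    parsed = pd.to_datetime(df["event_date"], format="%Y-%m-%d", errors="coerce")
    bad_dates = df["event_date"].notna() & parsed.isna()
    for idx in df.index[bad_dates]:
        issues.append({"row": idx, "column": "event_date", "issue": "invalid date"})

    # Inconsistent IDs: the same ID appearing on more than one row.
    dupes = df["record_id"].notna() & df["record_id"].duplicated(keep=False)
    for idx in df.index[dupes]:
        issues.append({"row": idx, "column": "record_id", "issue": "duplicate ID"})

    return pd.DataFrame(issues, columns=["row", "column", "issue"])


# Small in-memory sample standing in for the input file.
sample = pd.DataFrame({
    "record_id": ["A1", "A2", "A2", None],
    "event_date": ["2024-01-15", "2024-13-45", None, "2024-02-01"],
})
report = find_issues(sample)
```

In a real run, the input would typically come from `pd.read_csv` or `pd.read_excel`, and the result could be written out with `report.to_csv("error_report.csv", index=False)`.
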
- Python (pandas)
- Microsoft Excel
- CSV and Excel structured data formats
- `demo_data.csv` - Sample dataset for testing
- `data_checker.py` - Validation and rule-checking script
- `tracker_template.xlsx` - Optional structured input template
- Populate `demo_data.csv` or use the Excel template
- Run `data_checker.py`
- Review `error_report.csv` for flagged issues
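
For the review step, a short summary of the report can help show where issues cluster. The sketch below assumes a report with `row`, `column`, and `issue` columns, which is a hypothetical layout; the real `error_report.csv` may be structured differently. An in-memory string stands in for the file so the example runs on its own.

```python
import io

import pandas as pd

# Stand-in for error_report.csv; the real file's columns may differ.
report_csv = io.StringIO(
    "row,column,issue\n"
    "1,event_date,invalid date\n"
    "2,event_date,missing value\n"
    "3,record_id,missing value\n"
)
report = pd.read_csv(report_csv)

# Count flagged issues per column and issue type.
summary = (
    report.groupby(["column", "issue"])
    .size()
    .reset_index(name="count")
    .sort_values("count", ascending=False)
)
print(summary.to_string(index=False))
```

To run this against an actual report, replace `report_csv` with the path to `error_report.csv`.
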
Built to reflect real-world data validation and QA workflows across regulated and high-volume environments. Focuses on improving data integrity, enforcing consistency, and reducing rework by identifying issues early in the process.