Commit ca07291

Add required rule for a column (#119)
1 parent f86acf2 commit ca07291

20 files changed (+490, -201 lines changed)
File renamed without changes.

README.md

+32-28
Original file line number | Diff line number | Diff line change
@@ -11,11 +11,11 @@
1111
<!-- /top-badges -->
1212

1313
<!-- rules-counter -->
14-
[![Static Badge](https://img.shields.io/badge/Rules-366-green?label=Total%20number%20of%20rules&labelColor=darkgreen&color=gray)](schema-examples/full.yml)
14+
[![Static Badge](https://img.shields.io/badge/Rules-367-green?label=Total%20number%20of%20rules&labelColor=darkgreen&color=gray)](schema-examples/full.yml)
1515
[![Static Badge](https://img.shields.io/badge/Rules-153-green?label=Cell%20rules&labelColor=blue&color=gray)](src/Rules/Cell)
1616
[![Static Badge](https://img.shields.io/badge/Rules-206-green?label=Aggregate%20rules&labelColor=blue&color=gray)](src/Rules/Aggregate)
17-
[![Static Badge](https://img.shields.io/badge/Rules-7-green?label=Extra%20checks&labelColor=blue&color=gray)](#extra-checks)
18-
[![Static Badge](https://img.shields.io/badge/Rules-32/54/13-green?label=Plan%20to%20add&labelColor=gray&color=gray)](tests/schemas/todo.yml)
17+
[![Static Badge](https://img.shields.io/badge/Rules-8-green?label=Extra%20checks&labelColor=blue&color=gray)](#extra-checks)
18+
[![Static Badge](https://img.shields.io/badge/Rules-32/54/9-green?label=Plan%20to%20add&labelColor=gray&color=gray)](tests/schemas/todo.yml)
1919
<!-- /rules-counter -->
2020

2121
A console utility designed for validating CSV files against a strictly defined schema and validation rules outlined
@@ -232,7 +232,6 @@ columns:
232232
count: 10
233233
234234
```
235-
236235
<!-- /readme-sample-yml -->
237236

238237

@@ -242,19 +241,33 @@ In the [example Yml file](schema-examples/full.yml) you can find a detailed desc
242241
It's also covered by tests, so it's always up-to-date.
243242

244243
**Important notes**
244+
245245
* I have deliberately refused typing of columns (like `type: integer`) and replaced them with rules,
246246
which can be combined in any sequence and completely at your discretion.
247247
This gives you great flexibility when validating CSV files.
248-
* All fields (unless explicitly stated otherwise) are optional, and you can choose not to declare them. Up to you.
249-
* If you specify a wrong rule name, non-existent values (which are not in the example below) or a different variable
250-
type for any of the options, you will get a schema validation error. At your own risk, you can use the `skip-schema`
248+
* All options (unless explicitly stated otherwise) are optional, and you can choose not to declare them. Up to you.
249+
* If you specify a wrong rule name, non-existent values (which are not in the example below) or a different variable
250+
type for any of the options, you will get a schema validation error. At your own risk, you can use the `--skip-schema`
251251
option to avoid seeing these errors and use your keys in the schema.
252-
252+
* All rules except `not_empty` are ignored for empty strings (length 0). If the value must be non-empty,
253+
use `not_empty: true` as extra rule. Keep in mind that a space (` `) is also a character. In this case the string
254+
length
255+
will be `1`. If you want to avoid such situations, add the `is_trimmed: true` rule.
256+
* All rules don't depend on each other. They know nothing about each other and cannot influence each other.
257+
* You can use the rules in any combination. Or not use any of them. They are grouped below simply for ease of navigation
258+
and reading.
259+
* If you see the value for the rule is `is_some_rule: true` - that's just an enable flag. In other cases, these are rule
260+
parameters.
261+
* The order of rules execution is the same as in the schema. But in reality it will only change the order of errors in
262+
the report.
263+
* Most of the rules are case-sensitive. Unless otherwise specified.
264+
* As a backup plan, you can always use the `regex` rule. But it is much more reliable to use clear combinations of rules.
265+
That way it will be more obvious what went wrong.
253266

254267
Below you'll find the full list of rules and a brief commentary and example for context.
255268
This part of the readme is also covered by autotests, so this code is always up-to-date.
256269

257-
In any unclear situation, look into it first.
270+
In any unclear situation, look into it first ;)
258271

259272
<!-- full-yml -->
260273
```yml
@@ -265,7 +278,7 @@ In any unclear situation, look into it first.
265278
name: CSV Blueprint Schema Example # Name of a CSV file. Not used in the validation process.
266279
description: | # Any description of the CSV file. Not used in the validation process.
267280
This YAML file provides a detailed description and validation rules for CSV files
268-
to be processed by JBZoo/Csv-Blueprint tool. It includes specifications for file name patterns,
281+
to be processed by CSV Blueprint tool. It includes specifications for file name patterns,
269282
CSV formatting options, and extensive validation criteria for individual columns and their values,
270283
supporting a wide range of data validation rules from basic type checks to complex regex validations.
271284
This example serves as a comprehensive guide for creating robust CSV file validations.
@@ -298,26 +311,17 @@ structural_rules: # Here are default values.
298311
# This will not affect the validator, but will make it easier for you to navigate.
299312
# For convenience, use the first line as a header (if possible).
300313
columns:
301-
- name: Column Name (header) # Any custom name of the column in the CSV file (first row). Required if "csv_structure.header" is true.
314+
- name: Column Name (header) # Any custom name of the column in the CSV file (first row). Required if "csv.header" is true.
302315
description: Lorem ipsum # Description of the column. Not used in the validation process.
303316
example: Some example # Example of the column value. Schema will also check this value on its own.
304317
305-
# Important notes about the validation rules.
306-
# 1. All rules except "not_empty" ignored for empty strings (length 0).
307-
# If the value must be non-empty, use "not_empty" as extra rule!
308-
# 2. All rules don't depend on each other. They are independent.
309-
# They know nothing about each other and cannot influence each other.
310-
# 3. You can use the rules in any combination. Or not use any of them.
311-
# They are grouped below simply for ease of navigation and reading.
312-
# 4. If you see the value for the rule is "true" - that's just an enable flag.
313-
# In other cases, these are rule parameters.
314-
# 5. The order of rules execution is the same as in the schema. But it doesn't matter.
315-
# The result will be the same in any order.
316-
# 6. Most of the rules are case-sensitive. Unless otherwise specified.
317-
# 7. As backup plan, you always can use the "regex" rule. ON YOUR OWN RISK!
318+
# If the column is required. If true, the column must be present in the CSV file. If false, the column can be missing in the CSV file.
319+
# So, if you want to make the column optional, set this value to false, and it will validate the column only if it is present.
320+
# By default, the column is required. It works only if "csv.header" is true and "structural_rules.allow_extra_columns" is false.
321+
required: true
318322
319323
####################################################################################################################
320-
# Data validation for each(!) value in the column.
324+
# Data validation for each(!) value in the column. Please, see notes in README.md
321325
# Every rule is optional.
322326
rules:
323327
# General rules
@@ -487,7 +491,7 @@ columns:
487491

488492
# Check if the column is sorted in a specific order.
489493
# - Direction: ["asc", "desc"].
490-
# - Method: ["natural", "regular", "numeric", "string"].
494+
# - Method: ["natural", "regular", "numeric", "string"].
491495
# See: https://www.php.net/manual/en/function.sort.php
492496
sorted: [ asc, natural ] # Expected ascending order, natural sorting.
493497

@@ -821,7 +825,8 @@ Behind the scenes to what is outlined in the yml above, there are additional che
821825
<!-- extra-rules -->
822826

823827
* With `filename_pattern` rule, you can check if the file name matches the pattern.
824-
* Property `name` is not defined in a column. If `csv.header: true`.
828+
* Checks if property `name` is not defined in a column. Only if `csv.header: true`.
829+
* If property `required` is set to `true`, the column must be present in CSV. Only if `csv.header: true`
825830
* Check that each row matches the number of columns.
826831
* With `strict_column_order` rule, you can check that the columns are in the correct order.
827832
* With `allow_extra_columns` rule, you can check that there are no extra columns in the CSV file.
@@ -1284,7 +1289,6 @@ It's random ideas and plans. No promises and deadlines. Feel free to [help me!](
12841289
file name.
12851290

12861291
* **Validation**
1287-
* `required` flag for the column.
12881292
* Multi values in one cell.
12891293
* Custom cell rule as a callback. It's useful when you have a complex rule that can't be described in the schema
12901294
file.

schema-examples/full.json

+2-1
@@ -1,6 +1,6 @@
11
{
22
"name" : "CSV Blueprint Schema Example",
3-
"description" : "This YAML file provides a detailed description and validation rules for CSV files\nto be processed by JBZoo\/Csv-Blueprint tool. It includes specifications for file name patterns,\nCSV formatting options, and extensive validation criteria for individual columns and their values,\nsupporting a wide range of data validation rules from basic type checks to complex regex validations.\nThis example serves as a comprehensive guide for creating robust CSV file validations.\n",
3+
"description" : "This YAML file provides a detailed description and validation rules for CSV files\nto be processed by CSV Blueprint tool. It includes specifications for file name patterns,\nCSV formatting options, and extensive validation criteria for individual columns and their values,\nsupporting a wide range of data validation rules from basic type checks to complex regex validations.\nThis example serves as a comprehensive guide for creating robust CSV file validations.\n",
44

55
"filename_pattern" : "\/demo(-\\d+)?\\.csv$\/i",
66

@@ -23,6 +23,7 @@
2323
"name" : "Column Name (header)",
2424
"description" : "Lorem ipsum",
2525
"example" : "Some example",
26+
"required" : true,
2627

2728
"rules" : {
2829
"not_empty" : true,

schema-examples/full.php

+2-1
@@ -17,7 +17,7 @@
1717
return [
1818
'name' => 'CSV Blueprint Schema Example',
1919
'description' => 'This YAML file provides a detailed description and validation rules for CSV files
20-
to be processed by JBZoo/Csv-Blueprint tool. It includes specifications for file name patterns,
20+
to be processed by CSV Blueprint tool. It includes specifications for file name patterns,
2121
CSV formatting options, and extensive validation criteria for individual columns and their values,
2222
supporting a wide range of data validation rules from basic type checks to complex regex validations.
2323
This example serves as a comprehensive guide for creating robust CSV file validations.
@@ -44,6 +44,7 @@
4444
'name' => 'Column Name (header)',
4545
'description' => 'Lorem ipsum',
4646
'example' => 'Some example',
47+
'required' => true,
4748

4849
'rules' => [
4950
'not_empty' => true,
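
For context, a schema that marks one column as optional might look like the sketch below. The column names `id` and `comment` are hypothetical, and the top-level `csv`, `structural_rules`, and `columns` keys are assumed to mirror the YAML layout from the README hunk. Only `name`, `required`, and `rules.not_empty` appear in this file's diff. A real schema file would `return` the array, as `schema-examples/full.php` does.

```php
<?php

declare(strict_types=1);

// Hypothetical schema sketch in the array format of schema-examples/full.php.
$schema = [
    // Per the README comment, "required" works only with a header row
    // and with extra columns disallowed.
    'csv'              => ['header' => true],
    'structural_rules' => ['allow_extra_columns' => false],
    'columns'          => [
        [
            'name'     => 'id',       // hypothetical column name
            'required' => true,       // the default after this commit
            'rules'    => ['not_empty' => true],
        ],
        [
            'name'     => 'comment',  // hypothetical column name
            'required' => false,      // may be missing; validated only if present
            'rules'    => ['not_empty' => true],
        ],
    ],
];
```

With `required: false`, the README comment says the column is validated only when it is present in the CSV file.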

schema-examples/full.yml

+8-17
@@ -17,7 +17,7 @@
1717
name: CSV Blueprint Schema Example # Name of a CSV file. Not used in the validation process.
1818
description: | # Any description of the CSV file. Not used in the validation process.
1919
This YAML file provides a detailed description and validation rules for CSV files
20-
to be processed by JBZoo/Csv-Blueprint tool. It includes specifications for file name patterns,
20+
to be processed by CSV Blueprint tool. It includes specifications for file name patterns,
2121
CSV formatting options, and extensive validation criteria for individual columns and their values,
2222
supporting a wide range of data validation rules from basic type checks to complex regex validations.
2323
This example serves as a comprehensive guide for creating robust CSV file validations.
@@ -50,26 +50,17 @@ structural_rules: # Here are default values.
5050
# This will not affect the validator, but will make it easier for you to navigate.
5151
# For convenience, use the first line as a header (if possible).
5252
columns:
53-
- name: Column Name (header) # Any custom name of the column in the CSV file (first row). Required if "csv_structure.header" is true.
53+
- name: Column Name (header) # Any custom name of the column in the CSV file (first row). Required if "csv.header" is true.
5454
description: Lorem ipsum # Description of the column. Not used in the validation process.
5555
example: Some example # Example of the column value. Schema will also check this value on its own.
5656

57-
# Important notes about the validation rules.
58-
# 1. All rules except "not_empty" ignored for empty strings (length 0).
59-
# If the value must be non-empty, use "not_empty" as extra rule!
60-
# 2. All rules don't depend on each other. They are independent.
61-
# They know nothing about each other and cannot influence each other.
62-
# 3. You can use the rules in any combination. Or not use any of them.
63-
# They are grouped below simply for ease of navigation and reading.
64-
# 4. If you see the value for the rule is "true" - that's just an enable flag.
65-
# In other cases, these are rule parameters.
66-
# 5. The order of rules execution is the same as in the schema. But it doesn't matter.
67-
# The result will be the same in any order.
68-
# 6. Most of the rules are case-sensitive. Unless otherwise specified.
69-
# 7. As backup plan, you always can use the "regex" rule. ON YOUR OWN RISK!
57+
# If the column is required. If true, the column must be present in the CSV file. If false, the column can be missing in the CSV file.
58+
# So, if you want to make the column optional, set this value to false, and it will validate the column only if it is present.
59+
# By default, the column is required. It works only if "csv.header" is true and "structural_rules.allow_extra_columns" is false.
60+
required: true
7061

7162
####################################################################################################################
72-
# Data validation for each(!) value in the column.
63+
# Data validation for each(!) value in the column. Please, see notes in README.md
7364
# Every rule is optional.
7465
rules:
7566
# General rules
@@ -239,7 +230,7 @@ columns:
239230

240231
# Check if the column is sorted in a specific order.
241232
# - Direction: ["asc", "desc"].
242-
# - Method: ["natural", "regular", "numeric", "string"].
233+
# - Method: ["natural", "regular", "numeric", "string"].
243234
# See: https://www.php.net/manual/en/function.sort.php
244235
sorted: [ asc, natural ] # Expected ascending order, natural sorting.
245236

schema-examples/full_clean.yml

+2-1
@@ -16,7 +16,7 @@
1616
name: 'CSV Blueprint Schema Example'
1717
description: |
1818
This YAML file provides a detailed description and validation rules for CSV files
19-
to be processed by JBZoo/Csv-Blueprint tool. It includes specifications for file name patterns,
19+
to be processed by CSV Blueprint tool. It includes specifications for file name patterns,
2020
CSV formatting options, and extensive validation criteria for individual columns and their values,
2121
supporting a wide range of data validation rules from basic type checks to complex regex validations.
2222
This example serves as a comprehensive guide for creating robust CSV file validations.
@@ -39,6 +39,7 @@ columns:
3939
- name: 'Column Name (header)'
4040
description: 'Lorem ipsum'
4141
example: 'Some example'
42+
required: true
4243

4344
rules:
4445
not_empty: true

src/Csv/Column.php

+19-26
@@ -24,25 +24,22 @@
2424
final class Column
2525
{
2626
private const FALLBACK_VALUES = [
27-
'inherit' => '',
2827
'name' => '',
2928
'description' => '',
30-
'type' => 'base', // TODO: class
31-
'required' => false,
32-
'allow_empty' => false,
33-
'regex' => null,
29+
'required' => true,
3430
'rules' => [],
3531
'aggregate_rules' => [],
3632
];
3733

38-
private int $id;
34+
private ?int $csvOffset = null;
35+
private int $schemaId;
3936
private Data $column;
4037
private array $rules;
4138
private array $aggRules;
4239

43-
public function __construct(int $id, array $config)
40+
public function __construct(int $schemaId, array $config)
4441
{
45-
$this->id = $id;
42+
$this->schemaId = $schemaId;
4643
$this->column = new Data($config);
4744
$this->rules = $this->prepareRuleSet('rules');
4845
$this->aggRules = $this->prepareRuleSet('aggregate_rules');
@@ -53,28 +50,30 @@ public function getName(): string
5350
return $this->column->getString('name', self::FALLBACK_VALUES['name']);
5451
}
5552

56-
public function getId(): int
53+
public function getCsvOffset(): ?int
5754
{
58-
return $this->id;
55+
return $this->csvOffset;
5956
}
6057

61-
public function getDescription(): string
58+
public function getSchemaId(): int
6259
{
63-
return $this->column->getString('description', self::FALLBACK_VALUES['description']);
60+
return $this->schemaId;
6461
}
6562

66-
public function getHumanName(): string
63+
public function getDescription(): string
6764
{
68-
return $this->getId() . ':' . \trim($this->getName());
65+
return $this->column->getString('description', self::FALLBACK_VALUES['description']);
6966
}
7067

71-
public function getKey(): string
68+
public function getHumanName(): string
7269
{
73-
if ($this->getName() !== '') {
74-
return $this->getName();
70+
if ($this->csvOffset !== null) {
71+
$prefix = $this->csvOffset;
72+
} else {
73+
$prefix = $this->schemaId;
7574
}
7675

77-
return (string)$this->getId();
76+
return $prefix . ':' . \trim($this->getName());
7877
}
7978

8079
public function isRequired(): bool
@@ -92,11 +91,6 @@ public function getAggregateRules(): array
9291
return $this->aggRules;
9392
}
9493

95-
public function getInherit(): string
96-
{
97-
return $this->column->getString('inherit', self::FALLBACK_VALUES['inherit']);
98-
}
99-
10094
public function getValidator(): ValidatorColumn
10195
{
10296
return new ValidatorColumn($this);
@@ -107,17 +101,16 @@ public function validateCell(string $cellValue, int $line = Error::UNDEFINED_LIN
107101
return $this->getValidator()->validateCell($cellValue, $line);
108102
}
109103

110-
public function setId(int $realIndex): void
104+
public function setCsvOffset(int $csvOffset): void
111105
{
112-
$this->id = $realIndex;
106+
$this->csvOffset = $csvOffset;
113107
}
114108

115109
private function prepareRuleSet(string $schemaKey): array
116110
{
117111
$rules = [];
118112

119113
$ruleSetConfig = $this->column->getSelf($schemaKey, [])->getArrayCopy();
120-
121114
foreach ($ruleSetConfig as $ruleName => $ruleValue) {
122115
$rules[$ruleName] = $ruleValue;
123116
}
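
The reworked `getHumanName()` uses the CSV offset as the prefix once it has been set and falls back to the schema index otherwise. A minimal standalone sketch of that selection follows, assuming a free function `humanName()` (hypothetical, not part of the `Column` class) in place of the method:

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper mirroring the prefix logic in Column::getHumanName().
 * $csvOffset stays null until the column has been matched to the CSV file.
 */
function humanName(int $schemaId, ?int $csvOffset, string $name): string
{
    // "??" falls back only on null, so an offset of 0 is kept as a valid prefix.
    $prefix = $csvOffset ?? $schemaId;

    return $prefix . ':' . \trim($name);
}

echo humanName(2, null, ' Price '), PHP_EOL; // prints "2:Price"
echo humanName(2, 0, 'Price'), PHP_EOL;      // prints "0:Price"
```

The null check matters here: a truthiness test such as `$csvOffset ?: $schemaId` would wrongly discard offset `0`, which is why the commit compares against `null` explicitly.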
