For developers
The primary intention of the Main class in Schema Guru is to take the entered subcommand, create the corresponding subcommand object and pass control to it. All subcommand-related code is contained in the cli package.
Since the argonaut parser doesn't support anything similar to subcommands, we implemented them ourselves in the GuruCommand trait. It contains a separate argonaut parser, the name of the command (like "schema" or "ddl") and its description for the help message. Everything else is up to the Command classes, which mostly look like usual App objects.
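For illustration, a subcommand could be wired up roughly like this; the member names below (title, description, processor) and the SchemaCommand object are assumptions made for the sketch, not the actual Schema Guru API:

```scala
// Hypothetical sketch of a subcommand trait; names are illustrative only.
trait GuruCommand {
  def title: String                        // subcommand name, e.g. "schema" or "ddl"
  def description: String                  // one-line text for the help message
  def processor(args: Array[String]): Unit // entry point invoked by Main; prints output
}

object SchemaCommand extends GuruCommand {
  val title = "schema"
  val description = "Derive JSON Schema from a set of JSON instances"
  def processor(args: Array[String]): Unit =
    println(s"Running '$title' with ${args.length} arguments")
}
```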
Nothing in Schema Guru except the output method in subcommand classes has the Unit type. All code related to input from the file system is contained inside the utils package. Everything else works with pure functions and has no shared state.
All schema types are described in the schema package. Most of them represent JSON Schema types like "string", "object", "null", etc., but there are a few auxiliary schema types not present in the JSON Schema Specification.
Each schema type needs to mix in the JsonSchema trait. This trait requires each type to do the following (a minimal sketch follows the list):
- implement the method toJson, which shows how to represent this type as a JSON object
- implement the partial function mergeSameType, which provides fine-grained control over merging two schemas of the same type
- implicitly provide a SchemaContext when creating (by deriving, merging or just instantiating) a schema type object.
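A minimal sketch of what such a mixin could look like, assuming json4s for JSON values; the exact signatures of toJson, mergeSameType and SchemaContext are simplified guesses rather than the real trait:

```scala
import org.json4s._
import org.json4s.JsonDSL._

// Simplified stand-in for the real definition
case class SchemaContext(enumCardinality: Int)

trait JsonSchema {
  def toJson: JValue
  def mergeSameType(implicit ctx: SchemaContext): PartialFunction[JsonSchema, JsonSchema]
}

// Example schema type mixing in the trait; requires an implicit SchemaContext on creation
case class BooleanSchema()(implicit val schemaContext: SchemaContext) extends JsonSchema {
  def toJson: JValue = ("type" -> "boolean")

  def mergeSameType(implicit ctx: SchemaContext): PartialFunction[JsonSchema, JsonSchema] = {
    case BooleanSchema() => this   // two boolean schemas merge into one
  }
}
```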
None of the properties is required in any type of JSON Schema, thus all properties in all schema types are optional. The minimal JSON Schema is just an empty hash {}. It will validate any JSON value. It is represented as ZeroSchema in our types (one of those auxiliary schema types).
Another special schema type is ProductType. It helps us map the "dynamic world" of JSON to the "static world" of Scala types, because the JSON Schema Specification states that a value can have any of two or more types. When we try to merge two schemas of different types, like "string" and "object" or "null" and "integer", we get a ProductType as output, with all information present in the original types. ProductType optionally contains each of the schema types and provides correct output via toJson (because in the end, only this output matters). If we try to merge an "object" schema into the product type ["string", "null"], it will just place all the object's info into its place in the ProductType case class.
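For illustration, the product type might be modelled along these lines; the field set and helper below are assumptions made for the sketch, not the actual case class:

```scala
// Hypothetical, simplified member schemas, just enough to make the sketch self-contained
sealed trait Schema
case class StringSchema()  extends Schema
case class IntegerSchema() extends Schema
case class NullSchema()    extends Schema
case class ObjectSchema(properties: Map[String, Schema]) extends Schema

// Each slot is optional; whichever slots are filled determine the emitted "type" array
case class ProductType(
  stringSchema:  Option[StringSchema]  = None,
  integerSchema: Option[IntegerSchema] = None,
  nullSchema:    Option[NullSchema]    = None,
  objectSchema:  Option[ObjectSchema]  = None
) extends Schema {
  // Merging an "object" schema into ["string", "null"] just fills the empty slot
  def withObject(o: ObjectSchema): ProductType = copy(objectSchema = Some(o))
}

// ProductType(stringSchema = Some(StringSchema()), nullSchema = Some(NullSchema()))
//   .withObject(ObjectSchema(Map.empty))
// => a product type covering ["string", "null", "object"]
```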
If we try to merge ["string", "object"] with another ["string", "object"], it will use mergeSameType for each corresponding type. If we try to merge "string" and "string"... well, we will get a "string"; that is not a product case.
merge for all schema types is defined in terms of partial functions. Each type must define a mergeSameType partial function containing the logic for merging two schemas of the same type, because each property needs its own merge rules: "minimum" takes the lesser value and eliminates the greater, "maximum" the other way round, "format" should be eliminated if a different value is encountered, and so on. merge sequentially tries each of four partial functions, and mergeSameType is the first one. It stops as soon as one of these four functions is defined over the argument (the other three are already defined in the JsonSchema trait).
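A rough sketch of how that chaining could look, using PartialFunction composition with orElse; the names of the other three cases (mergeWithZero, mergeIntoProduct, createProduct) are invented here for illustration:

```scala
case class SchemaContext(enumCardinality: Int)

trait JsonSchema { self =>
  // each concrete schema type supplies only this partial function
  def mergeSameType(implicit ctx: SchemaContext): PartialFunction[JsonSchema, JsonSchema]

  // the remaining three cases live in the trait; their names here are guesses
  protected def mergeWithZero: PartialFunction[JsonSchema, JsonSchema] = {
    case _: ZeroSchema => self                          // {} adds no constraints
  }
  protected def mergeIntoProduct: PartialFunction[JsonSchema, JsonSchema] = {
    case ProductType(members) => ProductType(self :: members)
  }
  protected def createProduct: PartialFunction[JsonSchema, JsonSchema] = {
    case other => ProductType(List(self, other))        // incompatible types become a product
  }

  def merge(other: JsonSchema)(implicit ctx: SchemaContext): JsonSchema =
    (mergeSameType orElse mergeWithZero orElse mergeIntoProduct orElse createProduct)(other)
}

case class ZeroSchema() extends JsonSchema {
  def mergeSameType(implicit ctx: SchemaContext): PartialFunction[JsonSchema, JsonSchema] =
    { case other => other }
}

case class ProductType(members: List[JsonSchema]) extends JsonSchema {
  def mergeSameType(implicit ctx: SchemaContext): PartialFunction[JsonSchema, JsonSchema] =
    { case ProductType(more) => ProductType(members ++ more) }
}
```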
SchemaContext is a special case class which gives Schema Guru hints about how to create and merge schema types. It is basically something passed in from the outer world (like user preferences) that will affect our schemas. It can be a limit on enum cardinality or rules for applying pattern suggestions. It is implicitly passed around by every function that creates and merges schema types.
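As a small example of what that implicit passing could look like; deriveEnum and the exact SchemaContext fields below are made up for the sketch:

```scala
case class SchemaContext(enumCardinality: Int)

object EnumDerivationExample {
  // keep an enum only while its cardinality stays within the user-supplied limit
  def deriveEnum(values: List[Int])(implicit ctx: SchemaContext): Option[List[Int]] =
    if (values.distinct.size <= ctx.enumCardinality) Some(values.distinct) else None

  implicit val ctx: SchemaContext = SchemaContext(enumCardinality = 5)

  val kept    = deriveEnum(List(1, 2, 2, 3))   // Some(List(1, 2, 3))
  val dropped = deriveEnum((1 to 10).toList)   // None: too many distinct values
}
```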
All the work happens in convertsJsonsToSchema and mergeAndTransform. They are called sequentially and could probably even be a single long function. The first one takes a list of JSON instances (received from the file system or the network) and tries to convert each one into a micro-schema. A micro-schema is a usual subclass of JsonSchema, but it is "micro" because it can validate only the one value it was derived against. For example, for the value 42 the micro-schema will be {"type": "integer", "minimum": 42, "maximum": 42, "enum": [42]}. No value other than 42 will pass validation against such a schema.
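A sketch of that derivation for integers, assuming json4s values; the IntegerSchema shape mirrors the example above, while the function name and field names are illustrative:

```scala
import org.json4s._

object MicroSchemaExample {
  // enumValues plays the role of the "enum" property in the example above
  case class IntegerSchema(minimum: BigInt, maximum: BigInt, enumValues: List[BigInt])

  def deriveMicroSchema(json: JValue): Option[IntegerSchema] = json match {
    case JInt(n) => Some(IntegerSchema(minimum = n, maximum = n, enumValues = List(n)))
    case _       => None  // other JSON types would get micro-schemas of their own kind
  }

  // deriveMicroSchema(JInt(42))
  // => Some(IntegerSchema(42, 42, List(42))), i.e. {"minimum": 42, "maximum": 42, "enum": [42]}
}
```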
Then we pass the list of these micro-schemas to mergeAndTransform. Now all of these micro-schemas will be merged (summed using a Monoid) into one schema which validates all of them. For example, merging two integers with micro-schemas like {"maximum": 10, "minimum": 10} and {"maximum": 14, "minimum": 14} gives us {"maximum": 14, "minimum": 10}, which validates both. So these boundaries only ever expand, just enough to validate all the merged micro-schemas.
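A minimal sketch of that boundary-expanding merge; the hand-rolled merge function below only stands in for the Monoid-based summation the text mentions:

```scala
object MicroSchemaMerge {
  case class IntegerSchema(minimum: BigInt, maximum: BigInt)

  // boundaries only ever widen, so the merged schema validates both inputs
  def merge(a: IntegerSchema, b: IntegerSchema): IntegerSchema =
    IntegerSchema(minimum = a.minimum min b.minimum, maximum = a.maximum max b.maximum)

  // merge(IntegerSchema(10, 10), IntegerSchema(14, 14))
  // => IntegerSchema(minimum = 10, maximum = 14)
}
```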
The last step in creating a meaningful JSON Schema is to apply the required transformations. The method transform, defined on complex schema types (object, array, product), recursively applies its argument (a partial function) to all nested primitive schema types (string, number, integer). One of those transformations, for example, is encaseNumericRange, which expands the above {"maximum": 14, "minimum": 10} to something more useful like {"maximum": 32767, "minimum": 0} (a positive 16-bit integer).
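A hedged sketch of such a transformation; the specific range check below (positive 16-bit) follows the example above, but the name and shape of the partial function are assumptions:

```scala
object RangeTransformExample {
  case class IntegerSchema(minimum: BigInt, maximum: BigInt)

  // expands an observed range to the nearest standard one (here: positive 16-bit integer)
  val encaseNumericRange: PartialFunction[IntegerSchema, IntegerSchema] = {
    case IntegerSchema(min, max) if min >= 0 && max <= 32767 =>
      IntegerSchema(minimum = 0, maximum = 32767)
  }

  // encaseNumericRange(IntegerSchema(10, 14))
  // => IntegerSchema(0, 32767), matching {"minimum": 0, "maximum": 32767} above
}
```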
All the DDL generation logic is contained in the schema-ddl package.