Merged
52 changes: 38 additions & 14 deletions README.md
Original file line number Diff line number Diff line change
@@ -6,43 +6,67 @@
[![CI](https://github.com/jmarshrossney/dirconf/actions/workflows/ci.yml/badge.svg)](https://github.com/jmarshrossney/dirconf/actions/workflows/ci.yml)
[![Docs](https://github.com/jmarshrossney/dirconf/actions/workflows/docs.yml/badge.svg)](https://jmarshrossney.github.io/dirconf)

`dirconf` is a Python tool for declaratively specifying what a valid configuration directory looks like.
`dirconf` is a Python tool for declaratively specifying configuration directory structures, and constructing Python `dict` representations of their contents.

For full user documentation and examples please visit **[https://jmarshrossney.github.io/dirconf/](https://jmarshrossney.github.io/dirconf/)**.

## Motivations

I wrote this because I sometimes have to work with quite old scientific models that require various configuration files and data inputs in various formats to be present in various locations.
I was (and remain) concerned about how easy it can be to misconfigure certain models without realising, and how common workflows compromise reproducibility.

`dirconf` helps by

1. Allowing the user to describe the structure of a directory representing a valid configuration, and validate real directories against this description.

2. Facilitating the generation of new configurations and metadata programmatically, in Python, as opposed to copying and editing files by hand or writing shell scripts.
1. Allowing the user to describe the structure of a directory representing a valid configuration using Python [dataclasses](https://docs.python.org/3/library/dataclasses.html), and validate real directories against this description.

3. Providing a consistent mechanism through which complex, distributed configurations in legacy formats can be validated using excellent tools such as [JSON Schema](https://json-schema.org/) and [Pydantic](https://docs.pydantic.dev/).
2. Providing a scaffold for defining consistent read/write mechanisms through which complex, distributed configurations in legacy formats can be mapped to Python `dict`s.

Configurations are specified using Python [dataclasses](https://docs.python.org/3/library/dataclasses.html); `dirconf` has no dependencies beyond the standard library.
The ability to represent configurations as `dict`s is very useful indeed.
With no extra effort, we can:

For full user documentation and examples please visit **[https://jmarshrossney.github.io/dirconf/](https://jmarshrossney.github.io/dirconf/)**.
- Validate configurations using excellent tools such as [JSON Schema](https://json-schema.org/) and [Pydantic](https://docs.pydantic.dev/).
- Generate new configurations and metadata programmatically, as opposed to copying and editing files by hand or writing shell scripts.
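
As a rough illustration of the second point, once a configuration is an ordinary `dict`, a small parameter sweep needs only the standard library. The sketch below does not use `dirconf` itself: the `make_variants` helper and the keys in `base_config` are hypothetical names chosen for illustration, and a real configuration would come from reading a directory rather than from a literal.

```python
import copy
import itertools


def make_variants(base_config: dict, sweeps: dict[str, list]) -> list[dict]:
    """Return one deep-copied config per combination of swept values.

    `sweeps` maps a key inside base_config["params"] to the values to try.
    (Hypothetical helper; not part of dirconf.)
    """
    names = list(sweeps)
    variants = []
    for values in itertools.product(*(sweeps[name] for name in names)):
        config = copy.deepcopy(base_config)  # never mutate the base config
        config["params"].update(zip(names, values))
        # Derive a unique, human-readable identifier for each variant.
        suffix = "_".join(f"{n}{v}" for n, v in zip(names, values))
        config["id"] = f"{base_config['id']}_{suffix}"
        variants.append(config)
    return variants


# Assumed layout, for illustration only.
base_config = {
    "id": "basic",
    "params": {"a": 1.0, "b": 2.0, "c": 3.0},
    "init_state": [0.0, 0.0, 0.0],
    "switch": True,
}

variants = make_variants(base_config, {"a": [0.5, 1.5], "b": [2.0, 4.0]})
for variant in variants:
    print(variant["id"], variant["params"])
```

Each variant is an independent copy, so it can be validated and written out on its own without affecting the base configuration.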

## Installation

`dirconf` is a Python package and thus can be installed using `pip`, or tools such as `uv` and `poetry` that wrap around `pip`.

```sh
pip install dirconf
uv add dirconf
```

or with `uv`:
or

```sh
uv add dirconf
pip install dirconf
```

or the equivalent command for other package managers (poetry etc).
Currently Python versions equal to or above 3.12 are supported.

## Overview of usage

There are two essential steps for adapting `dirconf` to a specific use-case.

1. **Define handlers** satisfying the `Handler` protocol for each of the paths (files and directories) present in your configuration.
2. **Define the structure of a valid configuration** in terms of its paths and their respective handlers, by subclassing the `DirConfig` class. This is most easily done using the `make_dirconfig` function.

The custom `DirConfig` subclass can then be used to

1. **Read** a configuration from the filesystem into a Python `dict`.
2. **Write** a configuration `dict` to the filesystem.

These steps are most easily understood through examples: see [the docs](https://jmarshrossney.github.io/dirconf/101.html).
All examples are based on self-contained [marimo](https://marimo.io/) notebooks, which can be found in the [examples](examples/) directory.
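
To give a feel for the two steps above without reproducing the real API, here is a minimal standard-library sketch of the general pattern: one handler object per path, each able to read and write its own file, combined into a directory-level read and write. Everything in it (`SketchHandler`, `JsonHandler`, `TextHandler`, `LAYOUT`, `read_dir`, `write_dir`) is a hypothetical stand-in and should not be taken as the signatures of `Handler`, `DirConfig` or `make_dirconfig`; the linked docs describe the actual interface.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Protocol


class SketchHandler(Protocol):
    """Hypothetical stand-in for a handler; NOT dirconf's actual `Handler`."""

    def read(self, path: Path) -> Any: ...

    def write(self, path: Path, data: Any) -> None: ...


class JsonHandler:
    """Reads and writes a single JSON file."""

    def read(self, path: Path) -> Any:
        return json.loads(path.read_text())

    def write(self, path: Path, data: Any) -> None:
        path.write_text(json.dumps(data, indent=2))


class TextHandler:
    """Reads and writes a single plain-text file."""

    def read(self, path: Path) -> str:
        return path.read_text()

    def write(self, path: Path, data: str) -> None:
        path.write_text(data)


# Assumed layout: relative path -> handler responsible for that path.
LAYOUT: dict[str, SketchHandler] = {
    "params.json": JsonHandler(),
    "notes.txt": TextHandler(),
}


def read_dir(root: Path) -> dict[str, Any]:
    """Check that every expected path exists, then read each into a dict."""
    missing = [name for name in LAYOUT if not (root / name).exists()]
    if missing:
        raise FileNotFoundError(f"missing paths: {missing}")
    return {name: handler.read(root / name) for name, handler in LAYOUT.items()}


def write_dir(root: Path, config: dict[str, Any]) -> None:
    """Write each entry of the dict back to its path under `root`."""
    root.mkdir(parents=True, exist_ok=True)
    for name, handler in LAYOUT.items():
        handler.write(root / name, config[name])


with tempfile.TemporaryDirectory() as tmp:
    original = {"params.json": {"a": 1.0}, "notes.txt": "hello"}
    write_dir(Path(tmp) / "run", original)
    round_tripped = read_dir(Path(tmp) / "run")
    print(round_tripped == original)
```

Keeping the per-file logic in small handler objects means the directory-level code never needs to know about individual file formats.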

## Philosophy

## Development
`dirconf` contains ~700 lines of code (including docstrings) and has no dependencies beyond the Standard Library.

Contributions are welcome!
This is by design.
I have no intention of developing `dirconf` into a more sophisticated tool than it already is.
The aim is that it works seamlessly alongside other tools and packages for parsing and validation, without ever getting in the way or creating conflicts.

Please open a Pull Request against the `main` branch.
With that out of the way, please feel free to raise an [issue](https://github.com/jmarshrossney/dirconf/issues) or make a [pull request](https://github.com/jmarshrossney/dirconf/pulls) to suggest a change or feature.

See [`CONTRIBUTING.md`](CONTRIBUTING.md) for full details.

17 changes: 10 additions & 7 deletions docs/index.md
@@ -1,17 +1,21 @@
# Home

`dirconf` is a simple tool for the meta-configuration of collections of configuration files, leaning heavily on Python [dataclasses](https://docs.python.org/3/library/dataclasses.html).
`dirconf` is a Python tool for declaratively specifying configuration directory structures, and constructing Python `dict` representations of their contents.

I wrote this because I sometimes work with quite old scientific models requiring various configuration files and data inputs in various formats to be present in various locations. I was (and remain) concerned about how easy it can be to misconfigure certain models without realising, and how common workflows compromise reproducibility.
I wrote this because I sometimes have to work with quite old scientific models that require various configuration files and data inputs in various formats to be present in various locations.
I was (and remain) concerned about how easy it can be to misconfigure certain models without realising, and how common workflows compromise reproducibility.

`dirconf` helps by

1. Allowing the user to describe the structure of a directory representing a valid configuration, and validate real directories against this description.
1. Allowing the user to describe the structure of a directory representing a valid configuration using Python [dataclasses](https://docs.python.org/3/library/dataclasses.html), and validate real directories against this description.

2. Facilitating the generation of new configurations and metadata programmatically, in Python, as opposed to copying and editing files by hand or writing shell scripts.
2. Providing a scaffold for defining consistent read/write mechanisms through which complex, distributed configurations in legacy formats can be mapped to Python `dict`s.

3. Providing a consistent mechanism through which complex, distributed configurations in legacy formats can be validated using excellent tools such as [JSON Schema](https://json-schema.org/) and [Pydantic](https://docs.pydantic.dev/).
The ability to represent configurations as `dict`s is very useful indeed.
With no extra effort, we can:

- Validate configurations using excellent tools such as [JSON Schema](https://json-schema.org/) and [Pydantic](https://docs.pydantic.dev/).
- Generate new configurations and metadata programmatically, as opposed to copying and editing files by hand or writing shell scripts.

## Installation

@@ -31,7 +35,6 @@ I wrote this because I sometimes work with quite old scientific models requiring
```

Currently Python versions equal to or above 3.12 are supported.
It has no dependencies other than the Standard Library.


## Overview of usage
@@ -50,7 +53,7 @@ These steps are most easily understood through examples.
To start with, take a look at the [Usage](101.md) section.
More realistic examples can be found under the 'examples' heading.

All of the examples (including 'Usage') are based on self-contained [marimo](https://marimo.io/) notebooks, which can be browsed and downloaded [here](https://github.com/jmarshrossney/dirconf/tree/main/examples/)
All examples are based on self-contained [marimo](https://marimo.io/) notebooks, which can be found [here](https://github.com/jmarshrossney/dirconf/tree/main/examples/).


## Philosophy
87 changes: 85 additions & 2 deletions examples/101/notebook.py
@@ -2,6 +2,7 @@
# requires-python = ">=3.12"
# dependencies = [
# "marimo",
# "pydantic",
# "pyyaml",
# ]
# ///
@@ -779,9 +780,91 @@ def _(mo):
@app.cell(hide_code=True)
def _(mo):
    mo.md(r"""
    ### Strategies for validation
    ## Config Validation

    *To do.*
    A primary motivation for reading file-based configurations into Python dicts is to enable validation using Python tooling.

    Here we demonstrate how to validate the configuration dict returned by `read` using [Pydantic](https://docs.pydantic.dev/).
    """)
    return


@app.cell
def _():
    from pydantic import BaseModel

    class ParamsModel(BaseModel):
        a: float
        b: float
        c: float

    class ConfigModel(BaseModel):
        id: str
        params: ParamsModel
        init_state: list[float]
        switch: bool

    return (ConfigModel,)


@app.cell(hide_code=True)
def _(mo):
    mo.md(r"""
    We can now validate the 'basic' configuration from earlier:
    """)
    return


@app.cell
def _(config_dict):
    config_dict
    return


@app.cell
def _(ConfigModel, config_dict):
    validated_config = ConfigModel(**config_dict["config"])
    validated_config
    return


@app.cell(hide_code=True)
def _(mo):
    mo.md(r"""
    If the configuration contains invalid data, Pydantic will raise a clear validation error:
    """)
    return


@app.cell
def _(ConfigModel):
    try:
        ConfigModel(
            id=123,
            params={"a": "not a float", "b": 2.0, "c": 3.0},
            init_state=[0, 0, 0],
            switch=True,
        )
    except Exception as e:
        print(type(e).__name__)
        print(e)
    return


@app.cell(hide_code=True)
def _(mo):
    mo.md(r"""
    !!! tip
        You can integrate validation directly into your workflow by wrapping the `read` method:

        ```python
        def read_validated(config_instance, path):
            config_dict = config_instance.read(path)
            config_dict["config"] = ConfigModel(**config_dict["config"]).model_dump()
            return config_dict
        ```

        This ensures that every time you load a configuration, it is automatically validated against your Pydantic model.
    """)
    return

2 changes: 1 addition & 1 deletion pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "uv_build"

[project]
name = "dirconf"
version = "0.5.0"
version = "0.5.1"
description = "Build declarative schemas for multi-file configuration directories using Python dataclasses, with dict-based read and write."
authors = [
{ name = "Joe Marsh Rossney", email = "[email protected]" }
2 changes: 1 addition & 1 deletion uv.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.