Add JSON-LD taxonomy validation to CI #52

Open
wants to merge 2 commits into
base: main

Collaborator

@imbrou imbrou commented Jan 26, 2025

This PR adds automated validation for JSON-LD taxonomy files to ensure they follow DFC naming conventions.

Changes

.
├── .github
│   └── workflows
│       ├── validate-taxonomies.yml
│       └── scripts
│           └── validate_taxonomy.py

Description

  • Adds a GitHub Actions workflow to validate all JSON files
    • Triggers when a pull request to main is created AND contains (edits or adds) a JSON file
  • Validates:
    • JSON-LD syntax
    • URI naming patterns (alphanumeric, starts with letter)
    • Class names use PascalCase
    • Property names use camelCase
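As a rough illustration of the checks listed above (not the actual contents of validate_taxonomy.py), here is a minimal sketch. The function names, the regular expressions, and the rule that a node is a class or property when its `@type` ends in `Class` or `Property` are all assumptions made for the example. It only checks that the file parses as JSON, which is weaker than full JSON-LD validation.

```python
import json
import re

# Assumed patterns; the script in this PR may use different ones.
ALPHANUMERIC = re.compile(r"^[A-Za-z][A-Za-z0-9]*$")
PASCAL_CASE = re.compile(r"^[A-Z][A-Za-z0-9]*$")
CAMEL_CASE = re.compile(r"^[a-z][A-Za-z0-9]*$")


def local_name(uri: str) -> str:
    """Return the part of a URI after the last '#', '/' or ':'."""
    return re.split(r"[#/:]", uri)[-1]


def check_node(node: dict) -> list[str]:
    """Return naming problems for a single node (hypothetical rules)."""
    name = local_name(node["@id"])
    types = node.get("@type", [])
    if isinstance(types, str):
        types = [types]
    if not ALPHANUMERIC.match(name):
        return [f"{name!r}: must be alphanumeric, starting with a letter"]
    if any(t.endswith("Class") for t in types) and not PASCAL_CASE.match(name):
        return [f"{name!r}: class names must use PascalCase"]
    if any(t.endswith("Property") for t in types) and not CAMEL_CASE.match(name):
        return [f"{name!r}: property names must use camelCase"]
    return []


def check_document(text: str) -> list[str]:
    """Parse one file and collect problems for every node that has an @id."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"Invalid JSON: {exc.msg} (line {exc.lineno})"]
    if isinstance(data, dict):
        nodes = data.get("@graph", [data])
    elif isinstance(data, list):
        nodes = data
    else:
        return ["Top-level value must be an object or an array"]
    errors = []
    for node in nodes:
        if isinstance(node, dict) and "@id" in node:
            errors.extend(check_node(node))
    return errors
```

A CI script built this way would typically print the collected messages and exit with a non-zero status when the list is not empty.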

Testing Done

Tested against current taxonomies:

  • facets.json
  • measures.json
  • productTypes.json
  • vocabulary.json

Here is the result of this test, to be validated before merging this PR.

Breaking Changes

None. This only adds CI validation.

Next steps

  • Configure the repo to prevent the merge of PRs on main if the CI fails.

@imbrou imbrou self-assigned this Jan 26, 2025
@RaggedStaff RaggedStaff left a comment

Thanks @imbrou

This looks great!

Just looking at the test output and I'm wondering if we could have more meaningful error messages?

Currently it is failing on the alphanumeric check and giving the same "Must be alphanumeric, starting with a letter" message for everything.

Could we split it out and have separate messages for:

  • Remove all non-alphanumeric characters
  • Must start with a lowercase letter
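One possible way to produce the two separate messages suggested above is sketched here. The function name `describe_id_problems` is hypothetical, and the lowercase-first rule is applied to every identifier for simplicity, which may need adjusting for PascalCase class names.

```python
import re


def describe_id_problems(identifier: str) -> list[str]:
    """Return one message per distinct problem instead of a single generic one."""
    problems = []
    # Collect each offending character once so the message can name it.
    bad_chars = sorted(set(re.findall(r"[^A-Za-z0-9]", identifier)))
    if bad_chars:
        problems.append(
            "Remove all non-alphanumeric characters (found: " + " ".join(bad_chars) + ")"
        )
    # re.match anchors at the start, so this tests only the first character.
    if not re.match(r"[a-z]", identifier):
        problems.append("Must start with a lowercase letter")
    return problems
```

An identifier containing an underscore would then get only the first message, while one that starts with a digit would get only the second.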

Also noting - we envisage that IDs will correspond (as closely as possible) to the English skos:prefLabel. In the future it would be nice to have a warning if the label isn't similar to the ID. 🤔 Not urgent - we're catching the auto-generated IDs due to the underscore.
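A label-similarity warning like the one described could be sketched with the standard library's `difflib`. The function names, the normalisation step, and the 0.8 threshold are assumptions chosen for illustration and would need tuning against the real taxonomies.

```python
import re
from difflib import SequenceMatcher


def normalise(text: str) -> str:
    """Lowercase and drop everything except ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def label_matches_id(identifier: str, pref_label: str, threshold: float = 0.8) -> bool:
    """Return True when the ID is close enough to the English prefLabel."""
    ratio = SequenceMatcher(None, normalise(identifier), normalise(pref_label)).ratio()
    return ratio >= threshold
```

Because this would be a warning rather than an error, the CI step could print the mismatches without failing the build.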
