@jmorice91 jmorice91 commented Oct 1, 2025

This PR addresses the review comments on the previous PR #583.

List of things to check before making a PR

Before merging your code, please check the following:

  • you have added a line describing your changes to the Changelog;
  • you have added unit tests for any new or improved feature;
  • in case you updated dependencies, you have checked pdi/docs/CheckList.md;
  • you have checked your code format:
    • you have checked that you respect all conventions specified in CONTRIBUTING.md;
    • you have checked that the indentation and formatting conform to the .clang-format;
    • you have documented with doxygen any new or changed function / class;
  • you have correctly updated the copyright headers:
    • your institution is in the copyright header of every file you (substantially) modified;
    • you have checked that the end-year of the copyright there is the current one;
  • you have updated the AUTHORS file:
    • you have added yourself to the AUTHORS file;
    • if this is a new contribution, you have added it to the AUTHORS file;
  • you have added everything to the user documentation:
    • any new CMake configuration option;
    • any change in the yaml config;
    • any change to the public or plugin API;
    • any other new or changed user-facing feature;
    • any change to the dependencies;
  • you have correctly linked your MR to one or more issues:
    • your MR solves an identified issue;
    • your commits contain the Fix #issue keyword to autoclose the issue when merged.

@jmorice91 jmorice91 self-assigned this Oct 2, 2025
  Each string is the name of a dataset to create in the file on first
- access, with the type described in the value. The string key can also be
+ access, with the type described in the value. The string key is
  a regular expression (regex), and be used to define "generic keys",
Member

Shall we add a reference to the regex's standard?

Member

@jbigot jbigot Oct 2, 2025

Patterns and replacement strings support the following regular expression grammars:

  • Modified ECMAScript regular expression grammar. This is the default grammar.
  • Basic POSIX regular expression grammar.
  • Extended POSIX regular expression grammar.
  • The regular expression grammar used by the awk utility in POSIX.
  • The regular expression grammar used by the grep utility in POSIX. This is effectively the same as the basic POSIX regular expression grammar, with the addition of newline '\n' as an alternation separator.
  • The regular expression grammar used by the grep utility, with the -E option, in POSIX. This is effectively the same as the extended POSIX regular expression grammar, with the addition of newline '\n' as an alternation separator in addition to '|'.

Some grammar variations (such as case-insensitive matching) are also available, see this page for details.
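To make the difference between the grammars listed above concrete, here is a minimal sketch of how a grammar flag is passed when constructing a `std::regex`. The helper name `matches_with_grammar` is hypothetical and is not part of PDI; only the `std::regex` calls come from the standard library.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of PDI): compile `pattern` with the given
// grammar flag and report whether it matches the whole of `text`.
bool matches_with_grammar(const std::string& pattern,
                          const std::string& text,
                          std::regex::flag_type grammar)
{
    std::regex re(pattern, grammar);
    return std::regex_match(text, re);
}
```

For instance, `matches_with_grammar("a+", "aaa", std::regex::ECMAScript)` treats `+` as a quantifier, whereas with `std::regex::basic` the `+` is an ordinary character, so the same pattern only matches the literal text `a+`. A variation such as case-insensitive matching can be combined with a grammar: `std::regex::ECMAScript | std::regex::icase`.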

Member

If no grammar is chosen, ECMAScript is assumed to be selected
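As a small illustration of that rule, the sketch below passes only a variation flag (`icase`) and no grammar flag. The function name `matches_ignoring_case` is hypothetical; the pattern in the usage note relies on `\d`, an ECMAScript character class, to show that the ECMAScript grammar is the one applied.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: only a variation flag is given, no grammar flag,
// so the standard library interprets the pattern as ECMAScript.
bool matches_ignoring_case(const std::string& pattern, const std::string& text)
{
    std::regex re(pattern, std::regex::icase);
    return std::regex_match(text, re);
}
```

With this helper, `matches_ignoring_case("data_\\d+", "DATA_42")` is expected to succeed, since `\d+` is read as "one or more digits" and the letters are compared case-insensitively.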

Contributor Author

@jmorice91 jmorice91 Oct 2, 2025

I propose adding the following sentence at the end of this paragraph:

For the regex grammar, we use the default grammar in C++: ECMAScript.

What do you think?

Member

@jbigot jbigot Oct 2, 2025

Contributor Author

I'm okay with your proposal, @jbigot.
I suggest specifying the grammar explicitly when we create a regex:

std::regex dset_regex(dset_name_string, std::regex::ECMAScript);

instead of

std::regex dset_regex(dset_name_string);

Just in case the default grammar changes in the future.
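A minimal sketch of how this could look in a small helper. The function names `make_dataset_regex` and `key_matches_dataset` are hypothetical, and whole-name matching with `std::regex_match` is an assumption made for illustration; the actual plugin code may match differently.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: build the regex for a dataset key with the
// grammar spelled out instead of relying on the default argument.
std::regex make_dataset_regex(const std::string& dset_name_string)
{
    return std::regex(dset_name_string, std::regex::ECMAScript);
}

// Assumption: a key applies to a dataset when it matches the whole name.
bool key_matches_dataset(const std::string& dset_name_string,
                         const std::string& dataset_name)
{
    return std::regex_match(dataset_name, make_dataset_regex(dset_name_string));
}
```

For example, the key `group[0-9]+/array` would match a dataset named `group12/array` but not `group/array`. Keeping the flag in one helper also means the grammar choice is stated in a single place.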
