Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inline Options for Roles and Directives #28

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

agoose77
Copy link
Contributor

@agoose77 agoose77 commented Jan 29, 2025

This is a WIP draft for adding "inline attributes" to MyST.

Preview on MyST Sandbox

Spec Issue: jupyter-book/myst-spec#68

agoose77 and others added 2 commits January 29, 2025 14:49
@agoose77 agoose77 force-pushed the agoose77/mep-inline-attributes branch from ce47cdf to 4b5b761 Compare January 29, 2025 14:59
@rowanc1 rowanc1 changed the title Add MEP for inline attributes Inline Options for Roles and Directives Jan 29, 2025
meps/mep-0003.md Outdated Show resolved Hide resolved
@agoose77 agoose77 marked this pull request as ready for review January 29, 2025 17:14
@agoose77
Copy link
Contributor Author

@choldgraf @rowanc1 @fwkoch this is now set to active!

Copy link
Contributor

@choldgraf choldgraf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small comment, I'm supportive of this change and excited to see it unlock new functionality for roles and directives (especially for HTML outputs). I like the fact that it doesn't break any existing syntax (since only a subset of invalid syntax will now be valid), and that it follows patterns that have become common across adjacent markdown syntax ecosystems. +1 from me

meps/mep-0003.md Outdated Show resolved Hide resolved
meps/mep-0003.md Outdated

### Inline Attribute Syntax

The internal inline syntax for specifying content will follow existing standards in [Pandoc](https://pandoc.org/chunkedhtml-demo/8.18-divs-and-spans.html)/[djot](https://htmlpreview.github.io/?https://github.com/jgm/djot/blob/master/doc/syntax.html#inline-attributes), with the addition of the role/directive name that already exists in MyST Markdown.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

except by placing it in front it goes against these standards, and also somewhat breaks djot's 1st design goal

It should be possible to parse djot markup in linear time, with no backtracking.

now every time you encounter a { you have to do a complex parse to work out if it is a role. It will also mean people will have to escape { in certain circumstances where you can use them currently in plain text

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sentence could be cleaned up, we are referring to internal inline syntax i.e. the things inside of the braces. MyST syntax is already significantly different than djot/pandoc here, and this proposal is not a large step by any means in aligning the syntax.

I don't understand how placing it in front changes the complexity of where we are now with myst; can you clarify that? With this proposal, we simply allow more options inside of the braces than we did before.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand how placing it in front changes the complexity of where we are now with myst

this is how you check if you have reached a role during parsing:
https://github.com/jupyter-book/mystmd/blob/d1f8768eb4dbada0a370765ba696fbb3c00ba66f/packages/markdown-it-myst/src/roles.ts#L31-L32

how will you do this now?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, so you are saying that we will no longer be able to parse this with a regexp because it is complex, so the inline code has to be parsed and then could reject. The implementation we have experimented with so far is using regexp still, but that will fail in more complex situations, so we would have to parse to do that properly.

Hadn't thought about the performance implications of this, thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're in a slightly tricky part of the problem space because of MyST Markdown's history.

This MEP (roles and directives)

In my thinking, roles and directives are related to spans and divs (notwithstanding the fact that we have span roles and div directives). This MEP is designed to add configuration to roles (and directives) in a way that differs from djot et al.'s inline attributes.

For example,

{cite style=author}`hello_world_1984`

vs

*Hello world*{.red}

There is some overlap — adding labels and IDs to roles feels natural. We could more strictly separate the features, such that this MEP only addresses options:

{cite style=author}`hello_world_1984`

and prevents IDs, classes, etc. i.e. forbids

{cite style=author .red}`hello_world_1984`

This would then allow us to strictly talk about those attribute sets in a different proposal. But, I don't immediately see the merit of this — it feels like boilerplate and worse readability.

So, given that we already have roles, and these roles would benefit from accepting configuration (options), this MEP extends the role definition syntax to include the ability to set configuration and annotation using inline attributes.

A future MEP for annotations

This does, of course, raise the question of what an MEP that allows for annotation of inline elements should look like. It's worth zooming out here to think about MyST vs djot:

Djot
Djot's inline attribute syntax allows the author to set ID, class, and arbitrary key-value pairs. It appears that the intention for these key-value pairs is to be carried as custom data for transforms to inspect (i.e. an attributes bundle). As such, Djot's approach feels quite Quarto/Pandoc-like in that extensibility is handled through annotations and transforms.

MyST
MyST meanwhile handles extensibility through custom roles and directives, and transforms. In this way, setting arbitrary key-value annotations feels less useful. Given that MyST has a well-defined AST, it feels like we want to set existing schema-defined attributes rather than arbitrary key-value pairs.

% Role with ID
{delete #old}`See this old text`

% Role with option
{eval quote=true}`"Hello" + "World"`

% Inline attribute
![](){width=128px}

Something that comes to mind here is that setting e.g. image widths from {width=128px} feels fragile — how do we validate this?

I am still thinking about this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the response @agoose77, but I wouldn't say it really addresses more point, of how you parse the inline syntax 😬

You liken it to an HTML span, yet in HTML every normal < must be represented as the escaped &lt.
Is that the case here, does every { character now need to have an escaped version?

Or how exactly do you decide when something is a role?

As I've said below, a reference paring implementation would clear this up

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is your main point on performance of backtracking @chrisjsewell?

For escaping, in both myst-parser and mystmd we support \{role}`body` as escaped roles. So only need to escape { when it is part of a role pattern. (demo). I don't think that anything here changes that escape pattern, but there is more parsing required beyond a regexp to recognize a role with this MEP.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeh kinda both; there is the possibility that now escaping of { has to be done in a lot more places, because it is a lot less of a "certainty" whether a { is related to a role or not.

This also applies to when reading, if there would be any longish attributes.

Personally, I feel the inline reads better, when the less important configuration is put at the tail.
From the djot discussion, I would actually say this syntax would be the cleanest/tursest:

this is some inline with a  role :name`content`{#uniqueid .classname key="value"} and some more text

as you are reading it (or parsing), you get the most important information first - the type of the content, then content itself, and then the less important things

rowanc1

This comment was marked as outdated.


## Questions or Objections Considered

Adopting the proposed syntax opens up a currently impossible way to specify options for roles. The same syntax can also be applied to directives, however, it introduces a third way to specify options for directives. We find that acceptable, as this is for short-hand properties like classes and IDs, which ideally should not take up an extra line.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this paragraph sums things up really nicely: Fundamentally, the motivation for this MEP is to enable options, classes, and identifiers on MyST roles in a standardized way. This was not possible before and will be possible after this MEP is implemented. 👍

In doing so, it introduces a new, terser way to specify options, classes, and identifiers on directives as well. If authors like this new syntax, great, it's less typing for them! If not, no problem. Either way, the structured MyST data will end up the same. I think this follows a nice philosophy of making things for authors as easy and flexible as possible, while not adding bloat at the AST level.

@choldgraf
Copy link
Contributor

Thanks for the review and approval @fwkoch !

Referring to our MEP decision making process. I think we are ready to accept this unless a core team member wants to raise an objection. Here are the criteria for accepting MEPs:

  1. Active for 5 days. This MEP is currently Active and has been for 2 weeks. ✅
  2. Approval from at least two core team members (myself and @fwkoch ) ✅
  3. No Request Changes currently active from a core team member ✅

I will hold off merging for a day or two to give others a time to object if they wish.

@stevejpurves
Copy link

This is great! can't wait to see this materialise :D

Somewhat related -- but not intended for/to-delay this MEP -- I'd love to see this syntax be available on blocks, something like:

+++ { part=abstract }

Although, i've also thought that blocks should support JSON5 too, sans doublequotes.

+++ { part:abstract }

Copy link
Member

@rowanc1 rowanc1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went through a spelling pass and a few minor clarifications.

@agoose77
Copy link
Contributor Author

I'd love to see this syntax be available on blocks, something like:

Yes, I hard-agree on this. It feels natural.

@choldgraf
Copy link
Contributor

choldgraf commented Feb 11, 2025

@agoose77 is there anything that you'd want to clean up on this proposal before "accepting" it and implementing it in the spec (which will be the final act here once we merge this)?

If not, then I'd suggest that we merge this and move forward!

@stevejpurves i think your proposal would be a great mini-JEP to extend this one, especially once we have some experience with people using this syntax to understand the UX around it

@chrisjsewell
Copy link
Contributor

before "accepting" it and implementing it in the spec

would it not be a good idea to actually provide a reference implementation first, before accepting it into the spec 😅

this is standard practice across most such specs

@choldgraf
Copy link
Contributor

I'm happy with either approach - @agoose77 what would you like to do? I'm referencing the MEP process here which doesn't note a requirement of reference implementations, so I don't think it's a requirement.

@mmcky
Copy link

mmcky commented Feb 13, 2025

Thanks @agoose77 for proposing this change and everyone else for your work and comments on this MEP. It is great to see detailed discussions about the syntax and proposed changes. I recognise a great deal of time, energy and work goes into these MEPs - thank you.

It will be great to have this type of functionality in MyST and look forward to using it. I'm also glad that the more verbose options for directives (yaml block and shorter :class: syntax) will be retained and continue to be supported in new option development as I find it makes teaching MyST simpler.

My 2c - I do think a technical implementation (or reference implementation) is a good next step before final acceptance in a version approved spec, as it helps to uncover any edge cases that are not always trivial to think through when thinking just about syntax. However, it is not a requirement of the process, and I agree we should follow the process as a guiding principle so agree with @choldgraf happy with either approach as well.


Here are some thoughts I have re: possible updates to the process:

A full technical implementation can feel onerous to contributors before suggesting a change, as you can end up spending a lot of time building something that might get rejected. Perhaps we could introduce three stages to allow ideas to be pitched and discussed to get a strategic direction before the full investment in a reference implementation.

Perhaps we could adopt a traffic light style 3 stages for full acceptance.

  • Approved - Would appear in the MEP list of changes to give community notice and enable bigger changes (like this) time for a reference implementation but knowing the community agrees with the change and the planned direction.
  • Technical - build and ensure Syntax works and/or change can be implemented, revise MEP if any minor issues develop due to performance degradations, or unforeseen complexity.
  • Accepted (or Cancelled) - Accepted is the formal inclusion in a version release of the spec. I think cancelled would be used very rarely only if major technical issue arose.

In the context of this MEP, once merged the status would be:

🟢 Approved
🟡 Technical
🟡 Accepted

In the context of smaller MEPs all three could be achieved in a single merge process.

@choldgraf
Copy link
Contributor

choldgraf commented Feb 14, 2025

I had a backchannel about this with @rowanc1 and he suggested holding off on accepting this until there's an implementation to look at. So I'm going to consider this as "Changes Requested" and thus hold off on accepting the MEP as-is. @agoose77 - want to take a stab at an implementation so we can discuss and then @rowanc1 can give feedback on whether that's enough to accept?

Also a process note: I'm acting as a steward of this MEP, so I'm not trying to weigh in on the decision other than ensuring we follow the process we've defined and keep things moving forward. If core team members want Angus to do something like change text, add implementation, etc before accepting, please do not click "approve PR" and instead click "request changes" with the changes you'd like to see. Here's the language we have for accepting a MEP:

A MEP may be accepted when all of the following conditions are met:

  1. More than five (5) weekdays have passed since the proposal was marked as Active.
  2. At least two PR Approvals from core team or steering council members.
  3. No Request Changes from a core team member.

Requesting an implementation example is reasonable for something of this level of complexity, and the right way to do so is to "request changes" and explain what you'd like to see happen that would unblock it. </end pedantic comment :-) >

@mmcky that's an interesting idea - can we discuss process changes in a different issue so that we don't clutter up this one?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants