[naga] Comments parsing (naga + wgsl) #6364


Open · wants to merge 32 commits into trunk

Conversation


@Vrixyz Vrixyz commented Oct 3, 2024

Connections
None as far as I know.

Description
Generating automated documentation à la https://docs.rs is very useful, but currently difficult.

naga currently discards comment information, which limits how helpful such documentation can be, so this pull request adds support for parsing comments.

Testing

Status

  • Add a token DocComment
  • Add a token DocCommentModule
  • Rust syntax was chosen: [naga] Comments parsing (naga + wgsl) #6364 (comment)
    • Parse /// comments
    • Parse /** */ comments
    • Parse //! module comments
    • Parse /*! */ module comments
  • Add comments to ast::Struct
  • Support comments for multiple items
    • struct
      • struct members
    • var (global variables)
    • consts
    • functions
    • module level comment
      • For maximum compatibility with naga_oil, module comments use their own syntax: they must start with //!.
        • This is because naga_oil injects imported modules at the beginning of the file, not necessarily exactly where the import line was written. That prompted commit 02caf3a.
    • alias ; 🚫 I consider this out of scope for this PR.
  • Add comments list in naga::Module
  • Added more test and adapted existing tests
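The four doc-comment forms in the status list above follow Rust's conventions. As a rough illustration of how they split into item-level and module-level docs (this classifier is hypothetical; the PR's actual lexer works on tokens, not raw prefixes):

```rust
/// Hypothetical classifier mirroring the Rust-style comment forms listed
/// above; illustration only, not the PR's real lexer.
#[derive(Debug, PartialEq)]
enum CommentKind {
    ItemDoc,   // `///` or `/** */`
    ModuleDoc, // `//!` or `/*! */`
    Plain,     // `//` or `/* */`
}

fn classify(comment: &str) -> CommentKind {
    // Order matters: `///` also starts with `//`, and `/**` with `/*`.
    if comment.starts_with("///") || comment.starts_with("/**") {
        CommentKind::ItemDoc
    } else if comment.starts_with("//!") || comment.starts_with("/*!") {
        CommentKind::ModuleDoc
    } else {
        CommentKind::Plain
    }
}

fn main() {
    assert_eq!(classify("/// docs for a struct"), CommentKind::ItemDoc);
    assert_eq!(classify("/*! module docs */"), CommentKind::ModuleDoc);
    assert_eq!(classify("// just a note"), CommentKind::Plain);
    println!("ok");
}
```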

Checklist

  • Run cargo fmt.
  • Run cargo clippy. If applicable, add:
    • --target wasm32-unknown-unknown
    • --target wasm32-unknown-emscripten
  • Run cargo xtask test to run tests.
  • Add change to CHANGELOG.md. See simple instructions inside file.

@Vrixyz Vrixyz requested a review from a team October 3, 2024 15:56
@Vrixyz Vrixyz marked this pull request as draft October 3, 2024 15:56
@jimblandy (Member)

This doesn't get the comments out of the front end, though, does it?

It seems to me it could be fine to just add a field like:

    doc_comments: Option<Box<DocComments>>,

to naga::Module, and then have something like

struct DocComments {
    types: FastIndexMap<Handle<Type>, Vec<Span>>,
    global_variables: FastIndexMap<Handle<GlobalVariable>, Vec<Span>>,
    ...
}

I think it would make sense for Naga to adopt Rust's /// syntax as a WGSL extension.

@Vrixyz (Author)

Vrixyz commented Oct 4, 2024

This doesn't get the comments out of the front end, though, does it?

Correct; I'm looking into propagating them to the module now. My first intuition was to add comments to naga::TypeInner::Struct and the like, but I'm worried about performance.

Your idea of a "flat" DocComments with an index map to Spans is interesting and probably a better approach. Thanks for the suggestion; I'll go in that direction.


@jimblandy jimblandy left a comment


Some more comments here.

You'll also need to change compact::compact to adjust the type handles. Even though only named types can have comments and compaction never removes named types, compaction may remove anonymous types, and that renumbers the handles.

naga/src/lib.rs Outdated
@@ -2263,4 +2263,27 @@ pub struct Module {
pub functions: Arena<Function>,
/// Entry points.
pub entry_points: Vec<EntryPoint>,
/// Comments, usually serving as documentation
pub comments: Comments,
Member

I think this should be an Option<Box<Comments>>, so that users who aren't interested in collecting these comments don't need to spend memory on them.

Author

@Vrixyz Vrixyz Jan 9, 2025

I made the change; should we also add a feature flag for that?

Member

Maybe I didn't really say what I meant:

I think users should be able to choose to collect or ignore comments at run time. That is, naga::front::wgsl::FrontEnd::new should take an Options type, or FrontEnd should have a new_with_comments constructor - or whatever, something boring but tasteful - and then parsing should return a Module whose comments field is either populated or not.
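A minimal sketch of the run-time opt-in described here, under loudly stated assumptions: the names `Options`, `parse_with_options`, and the string-collecting `Module` are hypothetical stand-ins, not naga's real API, and the "parsing" is a trivial line filter standing in for the real front end.

```rust
// Hypothetical sketch of an options-driven front end; names and the
// string-based comment collection are illustrative, not naga's API.
#[derive(Default)]
struct Options {
    collect_comments: bool,
}

struct Module {
    // Boxed so callers who don't opt in only pay for the `None`.
    comments: Option<Box<Vec<String>>>,
}

fn parse_with_options(source: &str, options: &Options) -> Module {
    let comments = if options.collect_comments {
        // Stand-in for the real lexer work: grab `///` lines.
        let collected: Vec<String> = source
            .lines()
            .filter(|l| l.trim_start().starts_with("///"))
            .map(str::to_owned)
            .collect();
        Some(Box::new(collected))
    } else {
        None
    };
    Module { comments }
}

fn main() {
    let src = "/// a doc comment\nstruct S {}";
    // Default: comments are ignored, no memory spent on them.
    assert!(parse_with_options(src, &Options::default()).comments.is_none());
    // Opt in: comments are populated.
    let m = parse_with_options(src, &Options { collect_comments: true });
    assert_eq!(m.comments.unwrap().len(), 1);
    println!("ok");
}
```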

Author

Thanks! I addressed this in 57865be; I'm open to suggestions about the implementation.

I updated a few tests to test both options.

@jimblandy (Member)

Note: I branched off 0.22 because I'm not confident with jumping on the maybe unstable trunk.

It should be fine to work on trunk. This PR should be rebased on that.

@jimblandy (Member)

@Vrixyz Since this isn't passing CI, to make it easier to track stuff I need to review, I'm going to convert this to a draft PR until those issues are sorted out. Feel free to flip it back into non-draft when you're ready.

@jimblandy jimblandy marked this pull request as draft January 8, 2025 03:34
@Vrixyz Vrixyz force-pushed the token-comment branch 3 times, most recently from 0025864 to 243445c Compare January 9, 2025 10:23
@Vrixyz Vrixyz changed the base branch from v22 to trunk January 9, 2025 10:24
@Vrixyz (Author)

Vrixyz commented Jan 9, 2025

I rebased on trunk (GitHub comment reviews are degraded, but I addressed those 🤞). Tests are encouraging, but there's still a major bug I'm investigating (validation_error_messages): I think spans are incorrect, which I hope to fix by working on Jim's feedback.

❓ I'm curious about "pushing rules": I imagine parsing should contain a rule for parsing comments, so we push that rule when checking for comments?

@Vrixyz Vrixyz marked this pull request as ready for review January 10, 2025 10:20
@Vrixyz Vrixyz requested a review from jimblandy January 10, 2025 10:21
@jimblandy (Member)

❓ I'm curious about "pushing rules": I imagine parsing should contain a rule for parsing comments, so we push that rule when checking for comments?

You mean Parser::rules? You shouldn't need to worry about that at all. You should just call start_byte_offset and span_from directly.

The rule stack is currently being used as a rough approximation to WGSL's template list discovery, which you definitely do not want to get tangled up with, and which shouldn't affect your work at all. And it's used for some sanity checks.

@jimblandy jimblandy added naga Shader Translator area: naga front-end lang: WGSL WebGPU Shading Language labels Feb 6, 2025
@Vrixyz (Author)

Vrixyz commented Feb 25, 2025

Let's go with the Rust syntax. Can we take care of it in this PR?

I did that in 9cb6d53 👍

@Vrixyz Vrixyz changed the title [naga] Comments parsing [naga] Comments parsing (naga + wgsl) Feb 27, 2025
@cwfitzgerald cwfitzgerald assigned teoxoy and unassigned jimblandy Apr 9, 2025
Comment on lines +336 to +339
} else if let Token::Trivia = token {
self.input = rest;
} else if let Token::CommentDocModule(_) = token {
self.input = rest;
Member

Should it be valid to have these items in-between the doc comment and the item itself? It seems odd to me.

Author

You're probably right; I thought it wouldn't hurt not to fail, but Rust does indeed raise an error if a module comment is in the wrong place. I can do that.

Member

What about Token::Trivia (comments)? Does Rust allow those in-between doc comments?

let mut constructing_token = if !save_comments {
Token::Trivia
} else {
let mut peeker = char_indices.clone().peekable();
Member

We can make char_indices peekable, avoiding the clone.

Author

char_indices is actually an iterator (std::str::CharIndices from the standard library), so the clone is not expensive. We could make the API more ergonomic, but ultimately I think that would lead to the same generated code.
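For reference, `CharIndices` just borrows the source string and tracks a byte offset, so cloning it for lookahead is O(1), and advancing the clone leaves the original untouched — the peek-by-clone pattern under discussion:

```rust
// Demonstrates that cloning `CharIndices` gives independent, cheap lookahead.
fn main() {
    let source = "// hi";
    let mut iter = source.char_indices();
    let mut lookahead = iter.clone();

    // Advance only the clone.
    assert_eq!(lookahead.next(), Some((0, '/')));
    assert_eq!(lookahead.next(), Some((1, '/')));

    // The original iterator has not moved.
    assert_eq!(iter.next(), Some((0, '/')));
    println!("ok");
}
```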

Member

That's fair; I should have mentioned that my main concern was readability, as it took me a while to figure out what the code was doing. Though I realize making char_indices peekable doesn't improve the flow much.

Member

Actually, since we are already cloning the iterator, we shouldn't need to make it peekable at all; we can just call next on it.

@@ -1182,6 +1200,9 @@ impl Parser {
ExpectedToken::Token(Token::Separator(',')),
));
}
// Save a lexer to be able to backtrack comments if need be.
let mut lexer_comments = lexer.clone();
Member

We can then just call accumulate_doc_comments here instead.

Author

The reason I went this route is that not all items support comments: if we tried to accumulate on everything, a Vec would be instantiated to store the result, only to be dropped once we realize we don't need it. I suspect this would impact performance, but I didn't measure it.

Member

Empty Vecs don't allocate so I don't think it should be a problem.
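A quick demonstration of that point: `Vec::new` is guaranteed not to allocate until the first push, so creating a comment buffer speculatively is free when it stays empty.

```rust
// `Vec::new` performs no heap allocation; capacity stays 0 until a push.
fn main() {
    let empty: Vec<&str> = Vec::new();
    assert_eq!(empty.capacity(), 0); // no heap allocation yet

    let mut filled: Vec<&str> = Vec::new();
    filled.push("/// doc");
    assert!(filled.capacity() >= 1); // allocation happens lazily, on push
    println!("ok");
}
```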

Author

@Vrixyz Vrixyz Apr 17, 2025

accumulate_doc_comments leaves the lexer in a "bad" state though, where it consumes one token too many in order to know where to stop. I could change this behaviour to peek before consuming, or return the last consumed token (which is not a comment) for further parsing 🤔

Member

accumulate_doc_item_comments doesn't set self.input = rest when it encounters a token it can't consume and so it won't advance the tokenizer.

token
}
while let (Token::CommentDocModule(_), span) = peek_any_next(&lexer) {
comments.push(lexer.source.index(span));
Member

Doesn't Token::CommentDocModule(_) already contain the &str?

Author

Correct! As I'm peeking, I can't use it though, because the borrow checker is unhappy; I think I have to make a next_module_comment function for the lexer.

Author

Actually, it's not 100% trivial; changing this would either:

  • require a peek first, then getting the comment if relevant;
  • use an accumulate_comments for module docs, then backtrack to get the last token which caused accumulate_comments to exit;
  • or change the accumulate_comments implementation to return the last (token, rest).

Member

A similar method to accumulate_doc_item_comments would work and we wouldn't need to backtrack.

Comment on lines +2858 to +2863
fn peek_any_next<'a>(lexer: &'a Lexer) -> (Token<'a>, Span) {
let mut cloned = lexer.clone();
let token = cloned.next_until(|_| true, false);
token
}
while let (Token::CommentDocModule(_), span) = peek_any_next(&lexer) {
Member

We could add a next_if method on the lexer for this, making the code easier to look at.
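A sketch of what such a `next_if` could look like, using the clone-based lookahead already present in the snippet above. `ToyLexer` and its token representation are illustrative stand-ins, not naga's `Lexer`:

```rust
// Hypothetical `next_if`: clone-based lookahead that only commits the
// advance when the predicate matches. Illustration only.
#[derive(Clone)]
struct ToyLexer<'a> {
    tokens: &'a [&'a str],
    pos: usize,
}

impl<'a> ToyLexer<'a> {
    fn next_token(&mut self) -> Option<&'a str> {
        let t = self.tokens.get(self.pos).copied();
        self.pos += 1;
        t
    }

    /// Consume the next token only if `pred` accepts it.
    fn next_if(&mut self, pred: impl Fn(&str) -> bool) -> Option<&'a str> {
        let mut probe = self.clone();
        match probe.next_token() {
            Some(t) if pred(t) => {
                *self = probe; // commit the advance
                Some(t)
            }
            _ => None, // leave `self` untouched
        }
    }
}

fn main() {
    let mut lexer = ToyLexer { tokens: &["//! module doc", "struct"], pos: 0 };
    let mut comments = Vec::new();
    while let Some(c) = lexer.next_if(|t| t.starts_with("//!")) {
        comments.push(c);
    }
    assert_eq!(comments, ["//! module doc"]);
    // The non-comment token is still available for normal parsing.
    assert_eq!(lexer.next_token(), Some("struct"));
    println!("ok");
}
```

The loop never over-consumes: when the predicate rejects a token, the probe clone is discarded and the lexer stays put, which also sidesteps the backtracking concern discussed earlier.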

@teoxoy (Member)

teoxoy commented Apr 11, 2025

The overall approach looks good; I mainly left some comments that I think would improve the implementation, and one last question about usage.
