Implement spans #41

Closed
dtolnay opened this issue Oct 14, 2016 · 7 comments

Comments

@dtolnay
Owner

dtolnay commented Oct 14, 2016

This is a requirement for implementing something like rustfmt against syn.

@mystor
Collaborator

mystor commented Jan 24, 2017

How would you imagine these spans being implemented? My immediate thought was that they could be implemented by adding a

span: &'a str

field to each of the AST structs, which holds a string slice into the input string. This wouldn't include line and column number info, which would then be at minimum O(n) time to compute, but would act as a sort-of-byte-index tracker.

The other simple option would be to do literally that: record the starting and ending byte indexes of each AST node in the source, like:

span: (usize, usize)

Actually tracking the occurrence of newline characters when parsing seems unpleasant, and it would likely require non-trivial changes to nom, which is especially problematic if we are going to stabilize and expose our internal nom module as we're talking about doing in #81. In addition, even simple changes like adding the byte offsets seem like they would be easier if we tweak how our nom fork works internally so that it tracks the original input string in addition to the current working substring (in order to be able to calculate byte offsets).
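
To make the byte-offset idea concrete, here is a minimal sketch, not syn's or nom's actual code. It assumes the parser keeps both the original input and the remaining substring, and that the remaining substring is always a subslice of the original. The names `offset_in` and `line_col` are hypothetical. Line and column are derived lazily from an offset with an O(n) scan, so newlines never have to be tracked during parsing.

```rust
/// Byte offset of `rest` inside `input`.
/// Assumption: `rest` is a subslice of `input` (as it would be for the
/// "current working substring" of a parser running over `input`).
fn offset_in(input: &str, rest: &str) -> usize {
    let start = input.as_ptr() as usize;
    let here = rest.as_ptr() as usize;
    debug_assert!(here >= start && here + rest.len() <= start + input.len());
    here - start
}

/// 1-based line and 0-based byte column for a byte offset.
/// This scans the text before `offset`, so it is O(n) per lookup.
/// Panics if `offset` is out of range or not on a char boundary.
fn line_col(input: &str, offset: usize) -> (usize, usize) {
    let before = &input[..offset];
    let line = before.bytes().filter(|&b| b == b'\n').count() + 1;
    let col = match before.rfind('\n') {
        Some(newline) => offset - newline - 1,
        None => offset,
    };
    (line, col)
}
```

A `(usize, usize)` span for a node would then be `offset_in(input, rest)` taken before and after parsing that node.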

@mystor
Collaborator

mystor commented Jan 24, 2017

I should also add that one of the nice things about byte offsets is that they make it very easy to retrieve the original source text for an AST node from the source string, which is useful for error reporting when using full, for example.
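
As an illustration of that point, a byte-offset span can be turned back into source text with a single slice. This is only a sketch: the `Span` struct and `source_text` function are hypothetical names, not part of syn.

```rust
/// Hypothetical span type: half-open byte range `start..end` into the source.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    start: usize,
    end: usize,
}

/// Returns the source text covered by `span`, or `None` if the range is
/// out of bounds, reversed, or does not fall on UTF-8 char boundaries.
fn source_text<'a>(source: &'a str, span: Span) -> Option<&'a str> {
    source.get(span.start..span.end)
}
```

Using `str::get` rather than indexing means a stale or malformed span yields `None` instead of a panic, which seems preferable inside error-reporting code.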

@dtolnay
Owner Author

dtolnay commented Jan 24, 2017

Once we get procedural macros I would like to take advantage of the spans contained in those rather than implementing our own separate system. The parser will be able to parse a TokenStream rather than a string and it can keep track of the span of each syntax tree node. Then the user's procedural macro logic will be able to trigger errors on particular syntax tree nodes that rustc is able to display in the right place.

I haven't been keeping track of how far we are from a usable API for iterating through a TokenStream but once we have that, we can implement TokenStream parsing behind a cfg in syn.

@mystor
Collaborator

mystor commented Jan 24, 2017

I would also like spans for string inputs, for situations where I am parsing full .rs files with syn. Do you think whatever solution we end up using will support both?

@dtolnay
Owner Author

dtolnay commented Jan 24, 2017

I think nom handles this with their InputIter trait which abstracts the difference between &[u8] and &str so that most parsers work with either one. We could do a similar abstraction over &str vs TokenStream and treat spans differently in the two cases. For now, (usize, usize) for the string case seems good to me.
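
One way such an abstraction could look is sketched below. Everything here is hypothetical: `SpannedInput`, `StrInput`, `TokenInput`, `Token`, and `OpaqueSpan` are illustrative names, and `OpaqueSpan` is only a stand-in for a compiler-provided span, since the real `TokenStream` API is not assumed here. The idea is that each input kind chooses its own span type.

```rust
/// Hypothetical trait: an input that can describe the span of its first `len` units.
trait SpannedInput {
    type Span;
    fn span(&self, len: usize) -> Self::Span;
}

/// String input: the remaining text plus its byte offset in the original source.
struct StrInput<'a> {
    rest: &'a str,
    offset: usize,
}

impl<'a> SpannedInput for StrInput<'a> {
    type Span = (usize, usize);

    /// Byte range covered by the next `len` bytes (clamped to what remains).
    fn span(&self, len: usize) -> (usize, usize) {
        let len = len.min(self.rest.len());
        (self.offset, self.offset + len)
    }
}

/// Stand-in for a span handed out by the compiler (assumption, opaque id only).
#[derive(Debug, Clone, Copy, PartialEq)]
struct OpaqueSpan(u32);

/// Stand-in for a token that carries its own span.
struct Token {
    span: OpaqueSpan,
}

/// Token input: the remaining tokens.
struct TokenInput<'a> {
    rest: &'a [Token],
}

impl<'a> SpannedInput for TokenInput<'a> {
    type Span = Option<(OpaqueSpan, OpaqueSpan)>;

    /// Spans of the first and last of the next `len` tokens.
    /// `None` when `len` is zero or exceeds the remaining tokens.
    fn span(&self, len: usize) -> Option<(OpaqueSpan, OpaqueSpan)> {
        let covered = self.rest.get(..len)?;
        Some((covered.first()?.span, covered.last()?.span))
    }
}
```

Parser code generic over `I: SpannedInput` could then store `I::Span` in each syntax tree node without knowing which kind of input it is reading.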

@mystor mystor mentioned this issue Feb 7, 2017
@dtolnay
Owner Author

dtolnay commented Feb 12, 2017

@mystor have you looked into possibly using strata_rs for your use case? It looks like they have spans (which they call Extent) already.

In general it looks like that library is designed for more advanced use cases and it may serve us better to direct people who need spans to use strata_rs and keep syn focused on the proc macro use case.

@dtolnay
Owner Author

dtolnay commented May 22, 2017

This is superseded by #142 which will use real spans from the compiler.


2 participants