Support include files in Bikeshed and ReSpec. #60

jyasskin · 2020-05-09T03:17:39Z

Fixes the part of #27 that can be fixed in PR-Preview.

Initial description:

This depends on a to-be-added extension to
https://hg.csswg.org/dev/bikeshed-web to accept an include_urls[] query
parameter.

I sent @plinss a preliminary patch based on the -F [email protected] solution suggested in #27, but I've now discovered that PR-Preview uses the url parameter to send the file instead of the file parameter. If it sounds good to y'all, I'll update my patch to bikeshed-web to accept extra URLs instead of (or maybe in addition to) extra files.

However, the new plan, described below, is to update Bikeshed to accept a URL as its root spec argument, and resolve includes relative to that. Then bikeshed-web can just pass the url to Bikeshed, which will handle the recursive include fetching. This PR just makes sure to re-generate the spec when included files change.

tobie · 2020-05-09T13:56:10Z

I’m curious what’s missing from the way params are handled to warrant this special treatment for the include parameter. Couldn’t we just add what’s needed to the object passed to parameters instead?

jyasskin · 2020-05-09T15:01:25Z

When I thought it was going to be a file upload, that definitely wouldn't fit in params, but I guess users could build the raw URLs themselves to avoid any special handling. The raw URLs are complicated though, so this seems like better ergonomics?

tobie · 2020-05-09T15:44:42Z

Could we build the raw URL and pass it to the config along with the other things we are passing? It seems like it’s maybe a little less ergonomic, but easier to explain and more consistent.

jyasskin · 2020-05-09T16:28:22Z

I'll try something like that.

tobie · 2020-05-10T16:34:23Z

lib/models/pr.js

    get relevantSrcFiles() {
-        return [this.config.src_file];
+        return [this.config.src_file] + this.config.includes;


Mmm. This implies that we do have to do some special casing.

Have you considered just auto-detecting for include statements in the src and doing the right thing without the editor having to specify the includes?

I thought about it briefly, but you're currently not fetching the source at all, so we'd have to add that in order to scan it for includes, and I'm not eager to write my own partial Bikeshed parser to do that scan.

We could have the server do the scan, and resolve relative to the URL, but that again raises the question of how to write the parser. We could teach Bikeshed to resolve includes relative to URLs, if we give it the URL of the base file instead of a local path. @tabatkins, does that sound plausible?

We could teach Bikeshed to resolve includes relative to URLs, if we give it the URL of the base file instead of a local path.

That would be ideal, obviously.

@marcoscaceres, would that also work for ReSpec?

yes, I believe that would work. cc @sidvishnoi.

I realized that having Bikeshed and Respec fetch their own includes doesn't solve the problem this line addresses of having PR Preview know when relevant source files have changed.

If we want to fetch the files here, it wouldn't be too hard to do a recursive line-by-line grep for ^path:(.*)$ (Bikeshed) and data-include="([^"]*)"|data-include='([^']*)' (Respec) to identify the watched files.

It'll probably still be good for the processors to fetch their own files on the server, since the scan here will be an overestimate, so I think we shouldn't upload everything it finds to the processor web service.

Feel free to override me on any of that, but I'll go in that direction until I hear otherwise.
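
The line-by-line scan proposed above could look roughly like the following. This is a minimal sketch, not the code that landed: `findIncludes` is a hypothetical helper name, the two patterns are copied from the comment, and the sample input is made up. Because both patterns run on every line regardless of processor, the result is an overestimate, as noted above.

```javascript
// Hypothetical helper: list the include targets named in one source file.
// The two patterns come from the review comment; everything else is assumed.
const BIKESHED_INCLUDE = /^path:(.*)$/;
const RESPEC_INCLUDE = /data-include="([^"]*)"|data-include='([^']*)'/g;

function findIncludes(source) {
    const found = new Set();
    for (const line of source.split(/\r?\n/)) {
        const bikeshed = BIKESHED_INCLUDE.exec(line);
        if (bikeshed) {
            found.add(bikeshed[1].trim());
        }
        for (const respec of line.matchAll(RESPEC_INCLUDE)) {
            // Only one of the two capture groups is set for any given match.
            found.add((respec[1] !== undefined ? respec[1] : respec[2]).trim());
        }
    }
    return [...found];
}

// Made-up sample mixing both syntaxes.
const sample = [
    "<pre class=include>",
    "path: sections/intro.bs",
    "</pre>",
    "<section data-include='sections/api.html'></section>",
].join("\n");
console.log(findIncludes(sample));
```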

the problem this line addresses of having PR Preview know when relevant source files have changed.... If we want to fetch the files here, it wouldn't be too hard to do a recursive line-by-line grep for..

👍 It would be nice if pr-preview finds these dependencies rather than users specifying includes.

ReSpec (and spec-generator) already fetches includes relative to base URL, so this passing of includes files is not relevant to ReSpec, unless I'm missing the point of this PR.

I'm happy to either let Bikeshed take a base URL (and then do url-fetches of its include files, rather than local file fetches), or add a command that parses a document and returns all the files it might be including. (You'd then have to check if those paths actually exist, because things like local boilerplate files are included by default when they exist, without an explicit indicator.)
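
Resolving includes relative to the URL of the base file, as discussed in this thread, maps onto the standard WHATWG `URL` constructor. A minimal sketch: `resolveInclude` is a hypothetical name, and the base URL and include paths are made-up examples rather than anything PR Preview or Bikeshed actually uses.

```javascript
// Hypothetical helper: resolve an include path against the spec's own URL.
function resolveInclude(includePath, baseUrl) {
    return new URL(includePath, baseUrl).href;
}

// Made-up raw-content URL, for illustration only.
const base = "https://example.com/org/repo/abc123/spec/index.bs";
console.log(resolveInclude("sections/intro.bs", base));
// → https://example.com/org/repo/abc123/spec/sections/intro.bs
console.log(resolveInclude("../shared/terms.include", base));
// → https://example.com/org/repo/abc123/shared/terms.include
```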

sidvishnoi · 2020-05-11T19:20:07Z

lib/models/pr.js

    get relevantSrcFiles() {
-        return [this.config.src_file];
+        return [this.config.src_file] + this.config.includes;


Suggested change

-        return [this.config.src_file] + this.config.includes;
+        return [this.config.src_file].concat(this.config.includes);

Python much? 😄

Heh, 😳, a test caught that for me somewhere else, but this function must not be tested. I re-learned the other day that PHP uses . for string concatenation, but I'm sure I'll forget it again by the next time I need it.

There are very few tests because there is very little budget. :)
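
For readers who do not write JavaScript daily, the bug in the original line is that `+` applied to arrays converts both operands to strings. A short illustration with made-up file names:

```javascript
const srcFile = "index.bs";
const includes = ["intro.include", "api.include"];

// `+` stringifies both arrays and joins the two strings.
const broken = [srcFile] + includes;
console.log(typeof broken, broken);
// → string index.bsintro.include,api.include

// concat() returns a new flat array.
const fixed = [srcFile].concat(includes);
console.log(fixed);
```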

jyasskin · 2020-05-14T00:06:16Z

How's this look?

marcoscaceres · 2020-05-14T09:35:33Z

lib/scan-includes.js

+    /** When this gets back to empty, we're done searching includes. */
+    const inProgressFetches = new Set();
+
+    await new Promise((resolve, reject) => {


At first glance, this promise seems redundant as it doesn't operate on a callback... just await scanOneFile() and get it to throw as normal. You can probably also extract scanOneFile() outside of this function and get it to return recursiveIncludes ... they could be passed as default arguments... something like:

    scanOneFile(path);

    async function scanOneFile(path, inProgressFetches = new Set(), recursiveIncludes = new Set()) {
        // do sync stuff... can throw
        // do recursion:
        await scanOneFile(path, inProgressFetches, recursiveIncludes);
        // eventually
        return recursiveIncludes;
    }

Good point. I was worried about allowing maximal parallelism in the fetches, but a Promise.all() seems to allow the same amount, and it's much simpler to read.

Doing the accumulator sets as parameters initialized by default arguments seems less clear, to me, than making it a nested function. Similarly, it seems clearer which value is returned if it's returned unconditionally as the last line of scanIncludes(), than if readers have to also check what scanOneFile() returns.
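
The shape described here, a nested function that fills one shared set while `Promise.all()` keeps sibling fetches parallel, might look like the sketch below. It is not the code from `lib/scan-includes.js`: `fetchText` and `findIncludes` are hypothetical stand-ins passed in by the caller, and the in-memory file table exists only to make the example runnable.

```javascript
// Sketch only: fetchText(path) and findIncludes(text) are assumed helpers.
async function scanIncludes(rootPath, fetchText, findIncludes) {
    const seen = new Set([rootPath]);

    async function scanOneFile(path) {
        const text = await fetchText(path);
        const children = [];
        for (const included of findIncludes(text)) {
            // Mark files before awaiting so include cycles terminate.
            if (!seen.has(included)) {
                seen.add(included);
                children.push(included);
            }
        }
        // Sibling includes are fetched in parallel.
        await Promise.all(children.map((child) => scanOneFile(child)));
    }

    await scanOneFile(rootPath);
    seen.delete(rootPath); // report the includes, not the spec itself
    return seen;
}

// In-memory stand-in for fetching files from a repository.
const files = {
    "index.bs": "path: a.include\npath: b.include",
    "a.include": "path: b.include",
    "b.include": "path: index.bs",
};
const fetchText = async (path) => files[path];
const findIncludes = (text) =>
    text
        .split("\n")
        .filter((line) => line.startsWith("path:"))
        .map((line) => line.slice("path:".length).trim());

scanIncludes("index.bs", fetchText, findIncludes).then((includes) => {
    console.log([...includes]);
});
```

The set returned on the last line of `scanIncludes()` is the only result a reader has to track, which is the readability argument made above.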

marcoscaceres · 2020-05-14T10:08:31Z

lib/models/pr.js

    get relevantSrcFiles() {
-        return [this.config.src_file];
+        return [this.config.src_file].concat(this.includes);


nit:

Suggested change

-        return [this.config.src_file].concat(this.includes);
+        return [].concat(this.config.src_file, this.includes);

I'd somewhat expected that to iterate the letters in src_file, but it works. However, it turns out that this.includes both includes src_file on its own, and is a Set, so this line didn't work at all. Fixed.
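
Both observations in this reply can be checked directly: `concat()` appends a string argument as a single element, and it treats a `Set` the same way, which is why the suggested line could not work once `this.includes` was a set. A small illustration with made-up values; spreading is one way to flatten a set, not necessarily the fix that was applied.

```javascript
const srcFile = "index.bs";
const includes = new Set(["index.bs", "intro.include"]);

// A string is appended whole; concat() does not iterate its characters.
console.log([].concat(srcFile)); // → [ 'index.bs' ]

// A Set is not spread either: the result holds the Set object itself.
const wrong = [].concat(srcFile, includes);
console.log(wrong.length); // → 2
console.log(wrong[1] instanceof Set); // → true

// Spreading into a new Set flattens and de-duplicates in one step.
const merged = [...new Set([srcFile, ...includes])];
console.log(merged); // → [ 'index.bs', 'intro.include' ]
```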

I'm not sure what style you want here, @tobie. I could just assign to this.relevantSrcFiles and remove the getter?

jyasskin · 2020-05-14T23:57:10Z

lib/scan-includes.js

+                        for (let i = 1; i < match.length; i++) {
+                            const included = match[i].trim();


It's even simpler with the 1-capture regex.

sidvishnoi · 2020-05-14T18:35:06Z

As most specs don't make use of the includes feature, I think it makes sense to not incur the fetching/parsing overhead for every spec. Maybe add a hasIncludes: boolean flag in .pr-preview.json and only look for includes if asked to?

jyasskin · 2020-05-15T00:00:00Z

I'm happy to add hasIncludes to the config, but since @tobie was resistant to adding new keys to the top level there, I want to double check with him that it's the right thing to do.

tobie · 2020-05-15T23:14:54Z

I'm happy to add hasIncludes to the config, but since @tobie was resistant to adding new keys to the top level there, I want to double check with him that it's the right thing to do.

Yeah, I think paying the extra CPU cycles to avoid the UX cost is fine. Let’s not add this.

tobie

I think we’re getting close to landing this. I’m super excited; as far as I remember, it’s the biggest contribution to PR Preview to date. <3

tobie

This seems good to go.

Would love reviews from @marcoscaceres and @sidvishnoi before merging.

What's the story on the ReSpec/Bikeshed path now?

And thanks all, it's awesome to see folks contributing this way!

tobie · 2020-05-16T13:25:03Z

I do have a slight concern about the roll-out, though, as open pull requests will have the master branch already cached. So all of the newly added includes will show up as a diff for currently open pull requests that get modified before they're merged.

sidvishnoi

Don't forget to change the PR title before merging!

marcoscaceres · 2020-05-18T14:54:45Z

lib/models/pr.js

+        if (this.includes) {
+            return new Set(this.includes).add(this.config.src_file);
+        }
+        return new Set([this.config.src_file]);


More of a nit (ignore if you disagree), but it's usually better to just return arrays in an API like this. Sets are generally only useful inside algorithms to avoid duplicates and perform simple "set" things, but as accessor properties, arrays are better IMO. Also, this will violate .relevantSrcFiles === .relevantSrcFiles.

Yeah, I'd be more comfortable with this too, tbh.

This violates the === property for arrays too, right? And does even before this PR?

The Set is useful in the single caller of this function, in touchesSrcFile: it switches an O(N^2) algorithm to O(N).

This could just as easily be a method as a getter; I just left the kind of property that was already here.
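
The identity point applies to any getter that builds its result on each access, whether it returns an array or a `Set`. A minimal illustration using a hypothetical stand-in class, not the real model in `lib/models/pr.js`:

```javascript
// Hypothetical stand-in for the PR model, reduced to the one accessor.
class FakePR {
    constructor(srcFile) {
        this.srcFile = srcFile;
    }
    get relevantSrcFiles() {
        return [this.srcFile]; // a fresh array on every access
    }
}

const pr = new FakePR("index.bs");
console.log(pr.relevantSrcFiles === pr.relevantSrcFiles); // → false
```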

This violates the === property for arrays too, right? And does even before this PR?

True. Maybe just rename it function getRelevantSrcFiles() or deriveRelevantSrcFiles()? At least idiomatically, it gives it a reason to return a new thing every time.

The Set is useful in the single caller of this function, in touchesSrcFile: it switches an O(N^2) algorithm to O(N).

True, but given that we are only dealing with maybe tens of files, I don't think it makes a huge difference. If we were processing thousands of files, it might have a noticeable performance impact... or the JIT might even treat it as a set under the hood or do some other magic optimization.
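
The trade-off under discussion is between two ways of asking whether a pull request touches any relevant file. A sketch with made-up file lists; `touchesWithArray` and `touchesWithSet` are hypothetical names, and the real `touchesSrcFile` may differ.

```javascript
// Array lookup: each includes() call scans the list, so the total work is
// proportional to changed.length * relevant.length.
function touchesWithArray(changed, relevant) {
    return changed.some((file) => relevant.includes(file));
}

// Set lookup: has() is constant time on average, so the total work is
// proportional to changed.length + relevant.length.
function touchesWithSet(changed, relevant) {
    const relevantSet = new Set(relevant);
    return changed.some((file) => relevantSet.has(file));
}

const relevant = ["index.bs", "intro.include", "api.include"];
console.log(touchesWithArray(["README.md", "api.include"], relevant)); // → true
console.log(touchesWithSet(["README.md"], relevant)); // → false
```

Both functions return the same answers; only the cost differs, and with tens of files the difference is unlikely to be measurable.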

marcoscaceres · 2020-05-18T14:56:44Z

lib/scan-includes.js

+        let body;
+        try {
+            body = await fetchUrl(resolvedUrl, GITHUB_SERVICE);
+        } catch {


This is always risky for debugging if things go wrong... maybe console.warn(err)?

@tobie, is anything watching the console warnings for PR-Preview?

Yes. It's using Papertrail.

You could log both cases. Like it is being done for caching.
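
One way to keep the failure visible without aborting the scan is to bind the error in the `catch` and log it. This is a sketch of the suggestion, not the committed code: `fetchBody` is a hypothetical wrapper, and the fetcher and logger are passed in as parameters so the example runs on its own.

```javascript
// Hypothetical wrapper: returns the body, or null after logging the failure.
async function fetchBody(resolvedUrl, fetchUrl, log = console.warn) {
    try {
        return await fetchUrl(resolvedUrl);
    } catch (err) {
        log(`Failed to fetch include ${resolvedUrl}: ${err.message}`);
        return null;
    }
}

// Stand-in fetcher that always fails, to exercise the logging path.
const failingFetch = async () => {
    throw new Error("404 Not Found");
};
fetchBody("https://example.com/missing.include", failingFetch).then((body) => {
    console.log(body); // → null
});
```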

jyasskin · 2020-05-18T14:59:45Z

I think ReSpec will Just Work, and probably has been Just Working except that PR-Preview could miss times it needed to regenerate the preview.

For Bikeshed, I need to teach it to operate over URLs instead of just files, and then I need to have bikeshed-web use that support instead of fetching the URLs you pass it itself.

I'm not so worried about in-progress PRs that suddenly get a big diff once the bikeshed-web support lands: specs with includes are very badly supported now, so it's unlikely people are looking at them much. The in-progress PRs will just be a last set of PRs that don't work well, and even they will at least show the new version correctly.

tobie · 2020-05-18T15:09:48Z

I think ReSpec will Just Work, and probably has been Just Working except that PR-Preview could miss times it needed to regenerate the preview.

Oh—Interesting.

For Bikeshed, I need to teach it to operate over URLs instead of just files, and then I need to have bikeshed-web use that support instead of fetching the URLs you pass it itself.

Should we enable this for Bikeshed right away or wait until that's implemented?

jyasskin · 2020-05-18T15:37:13Z

For Bikeshed, I need to teach it to operate over URLs instead of just files, and then I need to have bikeshed-web use that support instead of fetching the URLs you pass it itself.

Should we enable this for Bikeshed right away or wait until that's implemented?

I'd be comfortable enabling it for Bikeshed right away: the only harm should be re-generating previews when includes change, when those includes should change the output but won't yet. But it's up to you.

tobie · 2020-05-20T16:23:52Z

This looks good. I'll deploy it as soon as I have a good enough window of time to be able to test it and roll it back/fix it if there are issues.

tobie · 2020-05-21T16:43:53Z

How tf did this now make @sidvishnoi and myself the authors of this pull request!?

sidvishnoi · 2020-05-21T16:47:58Z

Things go 🤷‍♂️ when you rebase and merge

tobie · 2020-05-21T17:06:53Z

I squashed but it wasn't picked up (super flaky network, probably a UI glitch). Now force pushing to master.

This triggers a build if any of the includes of a spec are modified by a pull request. This is in preparation for adding support for Bikeshed includes. ReSpec includes should already be supported. Co-authored-by: Sid Vishnoi <[email protected]>

tobie reviewed May 10, 2020

sidvishnoi reviewed May 11, 2020

jyasskin force-pushed the support-bikeshed-includes branch from 1fa8627 to 7b1fd39 May 11, 2020 20:51

marcoscaceres reviewed May 14, 2020

sidvishnoi reviewed May 14, 2020: lib/scan-includes.js (outdated)

sidvishnoi reviewed May 14, 2020: lib/scan-includes.js (outdated)

sidvishnoi reviewed May 14, 2020: test/scan-includes.js (outdated)

tobie reviewed May 14, 2020: lib/models/pr.js

tobie reviewed May 15, 2020: lib/services.js (outdated)

tobie reviewed May 15, 2020: lib/models/pr.js (outdated)

tobie reviewed May 15, 2020: lib/scan-includes.js; lib/scan-includes.js (outdated)

tobie self-requested a review May 16, 2020 13:12

tobie approved these changes May 16, 2020

tobie reviewed May 16, 2020: lib/services.js (outdated)

sidvishnoi approved these changes May 16, 2020

marcoscaceres reviewed May 18, 2020

marcoscaceres approved these changes May 18, 2020

jyasskin and others added 12 commits May 18, 2020 08:33

Delete some trailing whitespace.

f0f3a05

Thanks default editor settings.

Support include files in Bikeshed.

3890b99

This depends on a to-be-added extension to http://hg.csswg.org/dev/bikeshed-web/ to accept an include_urls[] query parameter.

Scan for recursive includes when deciding whether a PR changes the spec.

5f0b6cc

await allows just as much parallelism as the manual Promise wrangling.

5be8676

Also added tests and fixed bugs around recursive includes and the respec regexp.

Fix touchesSrcFile.

7695777

Use a simpler regex for respec includes

f505ec6

Co-authored-by: Sid Vishnoi <[email protected]>

Fix a test that doesn't run scanIncludes.

524435f

Simplify the aggregation of regex matches.

63b5fa6

Have scanIncludes() omit the spec itself.

1abc1c3

Tobie's review

a72fe6a

Improve the description of GitHub.

4316de7

Co-authored-by: Tobie Langel <[email protected]>

Add some logging.

a273335

jyasskin force-pushed the support-bikeshed-includes branch from 4f63104 to a273335 May 18, 2020 15:33

jyasskin changed the title ~~Support include files in Bikeshed.~~ Support include files in Bikeshed and ReSpec. May 18, 2020

jyasskin added 2 commits May 20, 2020 14:32

Expose arrays rather than sets in public APIs.

1f1b9e2

Also changed a property to a function since its return value wasn't === with itself.

Support Node 12.

2d74a4d

tobie merged commit c2c3ead into tobie:master May 21, 2020

jyasskin deleted the support-bikeshed-includes branch May 21, 2020 20:01

jyasskin mentioned this pull request May 21, 2020

Support bikeshed include files. #27

Open
