~30-40% perf win using 'charCodeAt' in CParser.write() #57

Merged: 8 commits into dscape:master on Jan 4, 2019
Conversation

@DLehenbauer (Author):

Uses character codes (i.e., numbers) instead of string comparisons inside the CParser.write() state machine for a significant perf win (measured on node v8.11.4).

Also includes:

  • Benchmark for comparison (npm i && npm run bench)
  • Small set of easy-to-debug tests targeting the parser (optional)
  • .vscode configuration for debugging tests & benchmark (optional)

@DLehenbauer (Author) left a comment:

Hi @evan-king and @dscape - Any interest in working w/me to get this PR accepted? The perf improvement seems substantial.

.vscode/launch.json (outdated, resolved)
@@ -0,0 +1,4 @@
{
@DLehenbauer (Author):

(Ditto re: this VS Code config file to set tab indentation to 2 spaces.)

@@ -0,0 +1,117 @@
const { Suite } = require("benchmark");
@DLehenbauer (Author):

This is the benchmark I used to measure the impact of the change. It parses the four .json files under "../samples" and, for me, shows a ~30+% speedup under node 8.11.4. The speedup was significant enough that I didn't bother disabling turbo boost, sleep states, etc. on the CPU.
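For reference, a minimal sketch of how such an old-vs-new comparison can be wired up with the 'benchmark' package; the file paths, the 'clarinet-last-published' alias, and the run() helper are assumptions based on this PR's description, not the actual benchmark script:

const { Suite } = require("benchmark");
const fs = require("fs");
const path = require("path");

const newClarinet = require("../clarinet");               // development version in the working tree
const oldClarinet = require("clarinet-last-published");   // last published version, installed via 'niv'

const sample = fs.readFileSync(path.join(__dirname, "../samples/npm.json"), "utf8");

// Parse one sample document end-to-end with a given clarinet build.
function run(clarinet) {
    const parser = clarinet.parser();
    parser.write(sample);
    parser.close();
}

new Suite()
    .add("old-npm", () => run(oldClarinet))
    .add("new-npm", () => run(newClarinet))
    .on("cycle", event => console.log(String(event.target)))
    .on("complete", function () {
        console.log("Fastest is " + this.filter("fastest").map("name"));
    })
    .run();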


parser.onkey = name => {
this.key++;
assert(name !== "𝓥𝓸𝓵𝓭𝓮𝓶𝓸𝓻𝓽");
@DLehenbauer (Author):

"He Who Must Not Be Named" ;-)

package.json (outdated):
-    "test": "mocha -r should -t 10000 -s 2000 test/clarinet.js test/npm.js test/utf8-chunks.js test/position.js"
+    "test": "mocha -r should -t 10000 -s 2000 test/parser.spec.js test/clarinet.js test/npm.js test/utf8-chunks.js test/position.js",
+    "bench": "node benchmark/index.js",
+    "postinstall": "niv clarinet@latest --destination clarinet-last-published"
@DLehenbauer (Author):

'niv' (npm-install-version) is a tool for installing multiple versions of a package side-by-side under node_modules; the new benchmark uses it to compare the development version against the last published version. As an alternative, we could check in a snapshot of the previous 'clarinet.js' under './benchmark/'.

"should": "1.0.x",
"underscore": "1.2.3"
},
"scripts": {
"test": "mocha -r should -t 10000 -s 2000 test/clarinet.js test/npm.js test/utf8-chunks.js test/position.js"
"test": "mocha -r should -t 10000 -s 2000 test/parser.spec.js test/clarinet.js test/npm.js test/utf8-chunks.js test/position.js",
@DLehenbauer (Author):

Just FYI - I don't believe the 'should' package is still used?

@@ -1,4 +1,6 @@
;(function (clarinet) {
"use strict";

// non node-js needs to set clarinet debug on root
var env
, fastlist
@DLehenbauer (Author):

Just FYI - I don't believe 'fastlist' is used?

}
- c = chunk.charAt(i++);
+ c = chunk.charCodeAt(i++);
parser.position++;
starti = i-1;
if (!c) break;
@DLehenbauer (Author):

I'm not a Unicode expert, but I believe surrogates are still processed correctly because:

  1. The high/low surrogates won't match any of the char codes we handle explicitly, so...
  2. They'll fall through to the pre-existing code that appends them via substring (just below the fold). (A short illustration follows.)
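A short illustration of that reasoning (the character and hex values are just for demonstration):

// "𝓥" (U+1D4E5) occupies two UTF-16 code units.
const s = "𝓥";
s.length;                        // 2
s.charCodeAt(0).toString(16);    // "d835" (high surrogate)
s.charCodeAt(1).toString(16);    // "dce5" (low surrogate)

// Neither code unit matches any of the ASCII character codes the state
// machine compares against, so both fall through to the path that copies
// text with substring(), which keeps the pair intact:
s.substring(0, 2) === "𝓥";       // true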

@@ -548,93 +638,83 @@ else env = window;
continue;

case S.TRUE:
if (c==='') continue; // strange buffers
@DLehenbauer (Author):

Not sure these "strange buffers" can be reproduced? If they do occur, charCodeAt(..) returns NaN for an out-of-range index, which is still falsy. I could put these checks back defensively, but I'd be curious to see if this case still occurs.
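For what it's worth, reading past the end of a string is falsy with either API, so the existing if (!c) break guard still fires (a quick check, not from the PR):

"ab".charAt(5);      // ""  - empty string, falsy
"ab".charCodeAt(5);  // NaN - also falsy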

@@ -361,25 +454,25 @@ else env = window;

if (clarinet.DEBUG) console.log(i,c,clarinet.STATE[parser.state]);
parser.position ++;
if (c === "\n") {
if (c === Char.lineFeed) {
parser.line ++;
parser.column = 0;
@DLehenbauer (Author):

FYI - I'm not sure 'column' is updated every time we advance a character position. It might be more reliable to store the 'columnStart' position and then calculate the column on demand with:

get column() { return this.position - this.columnStart; }
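A rough sketch of what that suggestion might look like; 'PositionTracker' and its fields are illustrative, not part of clarinet:

class PositionTracker {
    constructor() {
        this.position = 0;     // absolute character index across all chunks
        this.columnStart = 0;  // value of 'position' at the start of the current line
        this.line = 1;
    }
    advance(code) {
        this.position++;
        if (code === 0x0A /* '\n' */) {
            this.line++;
            this.columnStart = this.position;
        }
    }
    // Column is derived on demand instead of being updated on every character.
    get column() { return this.position - this.columnStart; }
}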

@evan-king (Collaborator) left a comment:

Thank you for following up on the PR. Though I've been quite busy of late, it hadn't slipped by unnoticed and I've been pondering it for a while. Especially given my lack of time to address any issues that arise in a timely manner, my top priority is stability.

With that in mind (and in addition to the other trivial feedback given), there are 2 things I'd like to know:

  • how much of the performance gain came from simply removing all the empty string checks
  • whether anything useful can be unearthed about why they were once needed

The latter is a tall order, as the git history is pretty sparse and seems to lack any higher level summaries/explanations. But I'd be a lot more comfortable removing them if I could identify what situation did require them, and thus rationalize treating them as dead code.

If they don't significantly impact performance, leaving them in would be the easier way to just get this pushed through.

@@ -0,0 +1,178 @@
"use strict";
@evan-king (Collaborator):

I rather favor test suites set up in this way (filthy ES6 classes/OOP notwithstanding ;)), and especially being able to run more focused tests for debugging, so I'm happy to keep it.

@DLehenbauer (Author):

I was lazy about analyzing the empty string checks. Looking again, it's pretty easy to convince oneself that the removed empty string checks were previously unreachable.

The normal control flow is to advance i until charAt(i) reads off the end of the string, which results in c being assigned an empty string at the end of every write() call (i.e., no "strange buffers" required).

When c is an empty string, the if (!c) break; near the top of the main while loop terminates the loop before the switch statement can dispatch to the S.TRUE*/S.FALSE*/S.NULL* cases.

Since c is not reassigned before we dispatch to the true/false/null cases, and nothing falls through to these cases, we can conclude that c cannot be an empty string when we enter the true/false/null cases (and consequently, the removed empty string checks were indeed unreachable).
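Paraphrasing the shape of the write() loop (simplified, not the actual clarinet source):

while (true) {
    c = chunk.charCodeAt(i++);   // past the end: charAt() gives "", charCodeAt() gives NaN
    if (!c) break;               // ...either is falsy, so the loop exits here, before the switch
    switch (parser.state) {
        // ...
        case S.TRUE:   // by the time these cases run, c holds a real character code,
        case S.FALSE:  // so the removed `if (c === '')` guards could never be reached
        case S.NULL:
            // ...
    }
}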

@evan-king (Collaborator) commented Jan 4, 2019:

I'm pretty comfortable with that analysis. However, it does still leave the question of how much performance gain came from removing extraneous conditional checks and not from the use of string literal comparison vs character value constants.

The former makes more sense to me as a cause of sub-optimal performance (especially post-Spectre). With that in mind, I want to be certain the larger change is justified by a meaningful impact. Otherwise, significant changes are being introduced for still-unquantified benefit that may not even remain relevant as future engine-level optimizations come down the pipeline, and future maintainers will be as afraid to disrupt or remove performance-related code as I am to introduce it.

To be clear, what I'm asking for is rough benchmark numbers for the removed conditionals alone, and for the optimized code with conditionals preserved, alongside the current vanilla and fully optimized code. If the charCodeAt optimizations count for at least 50% of the 30% gain, then I'll merge as-is. If not, I'll consider the matter further first.

@DLehenbauer (Author):

I understand... The perf gain is 100% from switching to 'charCodeAt()'. The reason for the performance improvement is that 'charCodeAt()' returns a number, which can be compared in a single machine op and involves zero heap allocation or resulting GC tax.

Calling 'charAt()' on the other hand semantically allocates a 1-character string on each invocation. This allocation can potentially be optimized away, but since the string escapes the module (via parser.c and parser.p) it would be a difficult optimization to detect.
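A tiny illustration of the difference (the Char.t constant is just for the example; the PR's actual constant names may differ):

const chunk = "true";

// charAt(): semantically produces a new one-character string, then a string comparison.
if (chunk.charAt(0) === "t") { /* ... */ }

// charCodeAt(): produces a number, so the comparison is a single integer
// compare with no allocation and nothing for the GC to clean up.
const Char = { t: 0x74 };  // character code for "t"
if (chunk.charCodeAt(0) === Char.t) { /* ... */ }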

My only motivation for removing the empty string checks was that they appeared to be unreachable (or my understanding of the code was incorrect). Removing them felt more honest than mechanically converting them, as removal would attract scrutiny in review. :-)

Here are the benchmark results from removing the conditionals only:

> node index.js

old-creationix x 5,022 ops/sec ±0.25% (93 runs sampled)
new-creationix x 5,036 ops/sec ±0.24% (95 runs sampled)
Fastest is new-creationix

old-npm x 6.97 ops/sec ±0.32% (22 runs sampled)
new-npm x 6.99 ops/sec ±0.23% (22 runs sampled)
Fastest is new-npm,old-npm

old-twitter x 4.09 ops/sec ±1.00% (15 runs sampled)
new-twitter x 4.07 ops/sec ±0.45% (15 runs sampled)
Fastest is old-twitter

old-wikipedia x 118,013 ops/sec ±0.46% (92 runs sampled)
new-wikipedia x 115,889 ops/sec ±1.34% (93 runs sampled)
Fastest is old-wikipedia

And here are the results w/the conversion to character codes:

> node index.js

old-creationix x 5,011 ops/sec ±0.60% (93 runs sampled)
new-creationix x 8,548 ops/sec ±0.29% (92 runs sampled)
Fastest is new-creationix

old-npm x 6.96 ops/sec ±0.61% (22 runs sampled)
new-npm x 9.87 ops/sec ±0.24% (29 runs sampled)
Fastest is new-npm

old-twitter x 4.06 ops/sec ±0.22% (15 runs sampled)
new-twitter x 6.29 ops/sec ±0.36% (20 runs sampled)
Fastest is new-twitter

old-wikipedia x 116,979 ops/sec ±0.32% (94 runs sampled)
new-wikipedia x 175,097 ops/sec ±0.34% (92 runs sampled)
Fastest is new-wikipedia

PS - In tweaking the benchmark to get these measurements, I ran into an issue with the 'npm-install-version' package I had added. (I pushed a change to avoid the dependency.)

@evan-king (Collaborator):

Thanks for the thorough PR support. Approved and merging.

@evan-king evan-king merged commit 7f65dd3 into dscape:master Jan 4, 2019
@DLehenbauer DLehenbauer deleted the perf branch April 5, 2019 16:24