
Rethink about the runtime encoding of ocaml values in javascript #24

Closed
bobzhang opened this issue Jan 19, 2016 · 72 comments

Comments

@bobzhang
Member

Goal:

  1. Less work to patch the compiler
  2. OCaml Array is JS Array

Some documentation about the current encoding is here: https://github.com/bloomberg/ocamlscript/blob/master/docs/ffi.md. A problem with this encoding is that Pfield i is no longer the same as Parrayref, while the OCaml compiler internally assumes they are the same, for example in stdlib/camlinternalOO.ml, bytecomp/translobj.ml, and bytecomp/translclass.ml (there might be other files I am missing). (Recent changes on the trunk of stdlib/camlinternalOO require us to sync up the change.)

Also, I am thinking that Obj.tag is not used much in the compiler itself (except by the GC, which is not relevant for the JS backend).

So I am proposing that blocks with tag zero are encoded as arrays, and blocks with non-zero tag (mostly normal variants) are encoded as arrays plus an extra property attached via Object.defineProperty, so that they stay polymorphic and Pfield and Parrayref behave the same.

for example

type u =
  | B of int * int
  | A of int * int * int

A (1,2,3) will be encoded as Object.defineProperty([1,2,3], 't', { 'value': 1 })
B (1,2) will be encoded as [1,2]
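A minimal runnable sketch of this proposal (encodeBlock is illustrative, not actual ocamlscript runtime code):

```javascript
// Sketch of the proposed encoding: tag 0 stays a plain array; a non-zero
// tag is attached as a non-enumerable 't' property, so indexed access
// (Pfield / Parrayref) is uniform across both kinds of blocks.
function encodeBlock(tag, fields) {
  return tag === 0 ? fields : Object.defineProperty(fields, 't', { value: tag });
}

const a = encodeBlock(1, [1, 2, 3]); // A (1,2,3)
const b = encodeBlock(0, [1, 2]);    // B (1,2)

console.log(a[2], a.t); // 3 1  -- same indexed access for both tags
console.log(b[0], b.t); // 1 undefined
```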

@bobzhang
Member Author

Note that Object.defineProperty is slow, and since variants in general will not be very big, we can use objects like this for A(1,2,3):

{ t: 1, 0: 1, 1: 2, 2: 3 }

This encoding is actually efficient in most common cases, like the option, list, map, and set types, since their internal representations only have two branches.
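As a sketch (the cons helper is illustrative), here is a list 1 :: 2 :: [] under this object-literal encoding, with 0 standing in for [] as an OCaml immediate:

```javascript
// Hypothetical object-literal encoding for x :: y: tag 0 plus two fields.
function cons(hd, tl) {
  return { t: 0, 0: hd, 1: tl };
}

const l = cons(1, cons(2, 0)); // 1 :: 2 :: []

// A match compiles to a test against the immediate plus field loads:
function sum(l) {
  let acc = 0;
  while (l !== 0) { acc += l[0]; l = l[1]; }
  return acc;
}
console.log(sum(l)); // 3
```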

@m2ym

m2ym commented Jan 19, 2016

As far as I know, ocamlscript currently encodes OCaml strings (pure byte arrays) as JS strings (UTF-16 strings?). I think this may go wrong in particular situations. Sorry if this is unrelated to this issue.

@bobzhang
Member Author

@m2ym indeed, there is a corner case with range patterns:

function '\000' .. '\255' -> 1 

in ocaml will be compiled into

function _ -> 1

do you have anything else in mind? thanks! (ocamlscript will require -safe-string)

@bobzhang
Member Author

we can work around this issue with x.charCodeAt(i) & 0xff, but I am not sure whether it is worth the effort (some of these masks can be optimized away)
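A sketch of that masking workaround (byteAt is a hypothetical helper, not ocamlscript runtime code): masking each UTF-16 code unit with 0xff recovers byte semantics for strings that arrive with out-of-range units.

```javascript
// Hypothetical byte-access helper: OCaml String.get over a JS string,
// clamping each UTF-16 code unit back into byte range.
function byteAt(s, i) {
  return s.charCodeAt(i) & 0xff;
}

const s = 'A\u0141'; // a code unit > 255 could only show up via the FFI
console.log(byteAt(s, 0)); // 65
console.log(byteAt(s, 1)); // 0x141 & 0xff = 0x41, i.e. 65
```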

@bobzhang
Member Author

On second thought, it might not be a problem: since strings are immutable and are lexed by ocamllex, there is no way to construct a string out of range; it can only happen via the FFI. @alainfrisch?

@alainfrisch

Yes, you could get strings from JS through the FFI.

@bobzhang
Member Author

another benefit of such an encoding is that tuples are quite easy to map for the FFI; even if we go to the Typedtree we can still guarantee the mapping of tuples. Since the tag field is singled out, it is easy to tell an OCaml object from a JS object.

@copy
Contributor

copy commented Jan 23, 2016

Float arrays and records could be represented using Float64Array. Maybe integer arrays too (using Int32Array).
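A sketch of that suggestion (Float64Array is a standard typed-array API; its use as the OCaml float array representation is hypothetical here):

```javascript
// Hypothetical: an OCaml float array [| 1.5; 2.5; 3.5 |] as a Float64Array.
const floats = new Float64Array([1.5, 2.5, 3.5]);

// Array.get / Array.set keep their bracket syntax unchanged:
floats[1] = floats[0] + floats[2];
console.log(floats[1], floats.length); // 5 3
```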

@bobzhang
Member Author

@copy I have had similar ideas, but Float64Array is not widely supported across JS engines. How does js_of_ocaml do the encoding? Do you have any idea why it does not use Float64Array, if it doesn't?

@copy
Contributor

copy commented Jan 24, 2016

I don't know why they don't use it, maybe they haven't considered it yet.

I agree about engine support. It could be enabled conditionally with a flag, or shimmed, though.

@bobzhang
Member Author

bobzhang commented Feb 1, 2016

@hhugo do you have any suggestion?

@bobzhang
Member Author

bobzhang commented Feb 9, 2016

For the record, encoding variants as JS objects would help debugging a lot, for example:

function $Cons(x, y) { this._0 = x; this._1 = y; this.t = "Cons" }
$Cons.prototype.toString = function () { return `${this.t}(${this._0}, ${this._1})` }

The benefit is that we can make use of the prototype chain in JS and get pretty output when debugging; the challenge is that we need to know where constructors are called.

Every call site x::y has to be replaced by new $Cons(x,y)

for records we can do something similar:

type t = { x : int ; y : int }

function $t(x, y) { this._0 = x; this._1 = y; this.labels = ['x', 'y'] }
$t.prototype.toString = function () { ... }

We can still preserve the structural comparison semantics of OCaml. The same challenge as for encoding variants applies; if we go this way, it is less work than compiling from the Typedtree, but we still get reasonable output.
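A runnable sketch of this constructor-per-variant idea (the $Cons name mirrors the snippet above; everything here is illustrative):

```javascript
// Hypothetical constructor-per-variant encoding with a friendly toString.
function $Cons(x, y) { this._0 = x; this._1 = y; this.t = 'Cons'; }
$Cons.prototype.toString = function () {
  return `${this.t}(${this._0}, ${this._1})`;
};

// every x :: y call site becomes `new $Cons(x, y)`
const l = new $Cons(1, new $Cons(2, 0));

console.log(String(l));          // Cons(1, Cons(2, 0))
console.log(l instanceof $Cons); // tag check via the prototype chain
```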

@bobzhang
Member Author

bobzhang commented Feb 9, 2016

note that the reason lazy is broken is also that the OCaml compiler internally treats objects and arrays in a uniform way

@bobzhang
Member Author

so currently, records, tuples, arrays, and polymorphic variants are all compiled into arrays (offset 0); normal variants are compiled into array-like objects:

{ 0: field_0, 1: field_1, ..., n-1: field_n_1, length: n, tag: obj.tag }

The option and list types are handled specially:

Some x is compiled into [x]
x::y is compiled into [x, y]
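A sketch of these special cases (helper names hypothetical): since option and list each have a single non-constant constructor, no tag is needed, and None / [] compile to the immediate 0.

```javascript
// Hypothetical special-case encodings: Some x -> [x], x :: y -> [x, y],
// with 0 standing for both None and [] (OCaml immediates).
const some = (x) => [x];
const cons = (hd, tl) => [hd, tl];

// Matching is a test against the immediate plus a field load:
const getOr = (opt, dflt) => (opt === 0 ? dflt : opt[0]);

console.log(getOr(some(42), -1));       // 42
console.log(getOr(0, -1));              // -1  (None)
console.log(cons(1, cons(2, 0))[1][0]); // 2
```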

@bmeurer
Contributor

bmeurer commented Feb 12, 2016

As discussed in email previously, I'd strongly suggest using object literals with 0-based indexing and the tag, as you described:

{ 0: field_0, 1: field_1, ..., n-1: field_n_1, length: n, tag: obj.tag }

This might be slower in some cases than the array literal version

let o = [field_0, ..., field_n_1]; o.tag = tag;

but I'm sure we (and other VMs) can fix the performance differences when you run into tricky performance problems. I think the object literal version has the most potential for usability (i.e. FFI), readability and consistent performance.

@bobzhang
Member Author

@bmeurer that's very kind of you; we will move forward with this encoding for variants

@bobzhang
Member Author

For the record, we can use Float64Array when applicable (since the [] operator works on typed arrays as well)

@bmeurer
Contributor

bmeurer commented Feb 16, 2016 via email

@bobzhang
Member Author

@bmeurer thanks for the info; it's easy for us to switch. You mean like this?

Object.defineProperty([1,2,3], 'tag', { 'value' : tag_value})

What are the best property settings?

Another question: should I write a wrapper function

let f = (arr, tag) => Object.defineProperty(arr, 'tag', { 'value': tag })
f([1,2,3], tag_value)

or always inline it? I guess I should inline it for the peephole optimization, right?

@bmeurer
Contributor

bmeurer commented Feb 16, 2016

Yes, I'd also go for

Object.defineProperty(arr, 'tag', {'value':tag_value})

i.e. a non-writable, non-configurable 'tag' property. Not sure whether the helper function makes sense. It might save you some space, and shouldn't really hurt performance, as the store should not become polymorphic as long as tag is always a small integer.
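For reference, a sketch of what that descriptor gives you (standard Object.defineProperty semantics): with a bare { value: ... } descriptor, the writable, enumerable, and configurable attributes all default to false.

```javascript
// The tag ends up non-writable, non-enumerable, and non-configurable,
// so it is effectively frozen and invisible to enumeration.
const block = Object.defineProperty([1, 2, 3], 'tag', { value: 1 });

const d = Object.getOwnPropertyDescriptor(block, 'tag');
console.log(d.writable, d.enumerable, d.configurable); // false false false
console.log(JSON.stringify(block)); // [1,2,3] -- the tag never serializes
```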

@bobzhang
Member Author

@bmeurer thanks, we will go this way; it will not save much space after gzipping anyway.
For the record, int arrays are also distinguishable at the Lambda layer, so we can compile OCaml int arrays to Int32Array when applicable
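A sketch of the Int32Array idea (a hypothetical compilation target; on the JS backend ints are 32-bit, so the typed array's wrapping behaviour lines up):

```javascript
// Hypothetical: an OCaml int array [| 1; 2; 3 |] as an Int32Array.
const ints = new Int32Array([1, 2, 3]);

ints[0] = ints[1] + ints[2]; // plain [] access, like a JS array
console.log(ints[0]); // 5

ints[0] = 4294967297; // 2^32 + 1: stores wrap to 32 bits
console.log(ints[0]); // 1
```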

@jordwalke

What about creating "prototypical class" constructors for each tag type? When creating objects, you just create "new" instances of those. Every object already inherently has a constructor with its prototype, so this doesn't seem like it would add any more memory allocations. Checking the tag could be as simple as checking reference identity on that constructor/prototype of the object/record etc.

@bobzhang
Member Author

@jordwalke can you elaborate (with an example)? The thing is, we don't have the type definition. But we can add a name property for a better debugging experience when available, like below:

{ tag: 0, 0: field_0, 1: field_1, name: constructor_name_when_available_just_for_debugging }

@bobzhang
Member Author

For the record, labels are actually available when constructing a record, so when compiling records (in debug mode)

we can generate

type t = { x : float; y : float }
let make_t x y = { x ; y }

type u = A of int
let make_a x = A x

function make_t(x, y) {
  return Object.defineProperty([x, y], 'labels', { 'value': ['x', 'y'] })
}
function make_a(x) {
  return Object.defineProperties([x], { 'name': { 'value': 'A' }, 'tag': { 'value': 0 } })
}

With this, I think we can debug OCaml-generated JS in a JS debugger without too much pain (with or without source maps).

@jordwalke

Old reply: Regarding the FFI API, I really don't care too much, because I would imagine a toolchain that compiles interop based on Flow type definitions. (Flow is likely a better match than TypeScript because it has non-nullability by default and even polymorphic variants! TypeScript is what you get when C# developers build static typing. Flow is what you get when OCaml developers build static typing - but I digress.)

Why is it that you have access to the field labels at debug time, but not for other compilation modes? And yes, just having those debuggable values fulfills the purpose - the actual representation doesn't matter as long as it performs very well.

@jordwalke

If you benchmark, don't forget to test on JavaScriptCore (JIT and non-JIT). It is one of the best, lightest weight, and most versatile JS engines for mobile where perf matters most. I'm always happy to help people test on JSC (it's already installed on every Mac by default!)

@bobzhang
Member Author

@jordwalke at debug time, field access still goes through the array. For example, with type t = {x : string}, (f : t).x is still f[0]; the difference is that in debug mode we can attach the labels to f, so by looking at the debugger you know f[0] is f["x"].
The main motivation for designing it this way is not performance; it is that the label information is not available when destructuring the record.

@bobzhang
Member Author

It's ok for variants; the thing is that some compiler internals (mostly in translclass/translobj) assume a block is an array. Unless upstream lends a hand, it will be a headache to maintain those patches.

@jordwalke

jordwalke commented Apr 18, 2016

It would be fine to have them continue to be delivered as an array, as long as every type chose a distinct index range.

So, a point {x=21; y=22} would have an array with indices [0=>21, 1=>22], but then a user {age=21; name="bob"} would have an array with indices [9=>21, 10=>"bob"].

We don't need to know the actual original "types"; we just need to guarantee that the index ranges (for creation and access) are consistent for two values p and q if and only if p and q have the same type.

Of course, you could then normalize all the ranges back to start at zero in the typical compiler.

@bmeurer
Contributor

bmeurer commented Apr 18, 2016

Arrays with holes will waste space and are in general slower than packed arrays, although likely still a bit faster than regular objects with array-index-like properties.

@bobzhang Object.defineProperty in a hot function will always be way slower than adding a property via a store to a previously defined literal. slow3.js usually translates to highly efficient code that just allocates and initializes the literal with slack space for the tag property, and then transitions the object to a new hidden class and stores the tag value. So that's just bump pointer allocation plus a bunch of machine level stores. While slow4.js always goes to the runtime (switching to C++) for the Object.defineProperty call.
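slow3.js and slow4.js themselves are not shown in this thread; the following is a reconstruction of the two patterns being compared, based on the description above:

```javascript
// slow3-style: allocate the array literal, then add the tag with a plain
// property store (bump allocation plus one hidden-class transition).
function makeBlockStore(tag, a, b, c) {
  const o = [a, b, c];
  o.tag = tag;
  return o;
}

// slow4-style: Object.defineProperty, which is handled in the C++ runtime.
function makeBlockDefine(tag, a, b, c) {
  return Object.defineProperty([a, b, c], 'tag', { value: tag });
}

console.log(makeBlockStore(1, 1, 2, 3).tag);  // 1
console.log(makeBlockDefine(1, 1, 2, 3).tag); // 1
```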

@bmeurer
Contributor

bmeurer commented Apr 18, 2016

@bobzhang Ok, I figured out what's causing the slowdown in slow.js vs fast.js. The access to the objects is properly optimized (in V8), but the allocation of the literal currently is not. So what happens is that we generate fast code for [a,...,z], but need to fall back to the generic allocation path (which is super slow compared to the inline path) for {0:a, ..., n-1:z}. I'm not exactly sure why we do this, though, as there doesn't seem to be an obvious reason why we can't support both in the inline path. Maybe it's just because that didn't turn out to be relevant (and somehow, in the back of my head, I was pretty sure we had already fixed this some time ago).

@jordwalke

jordwalke commented Apr 18, 2016

@bmeurer:

Arrays with holes will waste space and are in general slower than packed arrays, although likely still a bit faster than regular objects with array index like properties.

I was not suggesting that there be actual holes in the arrays that are allocated in JavaScript. I was only suggesting that holes be placed in the index ranges in the compiler's intermediate representation. Those holes are only there to ensure that intermediate representations maintain a distinct "meaning" for various offsets. We don't need to know everything about the type at this later stage of the compiler - only its memory layout, and some hole starting index that uniquely classifies which other structures it is compatible with. I would then suggest taking those hole-ridden ranges and converting them into plain Objects as follows:

var obj = {
  field55: 21,
  field56: "bob"
};

This has all the benefits of the third test case that I created called "String Object Keys" above, but without the issue that JIT optimizers may have their hidden classes confused by every structure having the fields located at keys "str0", "str1", "str2".

The actual native ocaml compiler would want to disregard those holes. I'm merely suggesting a way that, via index ranges, everything we needed to know about the distinct type can be conveyed without actually having to track the type through the various intermediate representations.

For every possible engine, including legacy engines deployed to node/browsers, it seems this would be optimal, correct?

@bmeurer
Contributor

bmeurer commented Apr 18, 2016

@jordwalke Indeed, that's a good suggestion, and I suppose it will be optimizable on all engines.

@bobzhang
Member Author

@bmeurer thanks for looking into it. Is there any downside with respect to slow3.js?
Supposing V8 does the inline path for {0: ..., n: ...} in the future, what about access? Would it be as fast as array access?

@bobzhang
Member Author

@jordwalke the general policy for patches is that they should be sound -- which means that even if a patch is missing somewhere, the output should still be correct (maybe less efficient or uglier). I think we can discuss this more in the future.

@jordwalke

@jordwalke the general policy for patches is that they should be sound -- which means that even if a patch is missing somewhere, the output should still be correct (maybe less efficient or uglier). I think we can discuss this more in the future.

I do not believe I proposed anything unsound.

@bmeurer
Contributor

bmeurer commented Apr 19, 2016

It's almost the same performance on access, yes.

@bobzhang
Member Author

@bmeurer cool, it seems slow3.js is the best encoding at this time -- I used to believe that patching an object with a property could cause de-optimization, but that is not true in this case, right?

@bmeurer
Contributor

bmeurer commented Apr 20, 2016

@bobzhang In V8 this won't cause de-opts, but the assignment to x.tag will not be inlined into the function (but use a store IC instead), because the array literal doesn't reserve space for in-object properties, so we need to allocate out-of-object properties backing store for tag first.

@bobzhang
Member Author

@bmeurer @jordwalke so we will go with the slow3 version in the short term; in the future we can provide a command line argument to allow different runtime encodings for different engines. Also, any suggestions that can help give hints to VM engines to optimize such code are much appreciated!

@bmeurer
Contributor

bmeurer commented Apr 20, 2016

I want to address the issues with the object literals in V8. I'll try to reserve some time for that during the year.

@bobzhang
Member Author

@bmeurer cool, thank you in advance!

@bobzhang
Member Author

Feel free to re-open this issue if anyone has better ideas.

@ergl

ergl commented Feb 22, 2017

Is there still an issue with the performance of object literals?

I'm hitting an issue with the current array representation — I'm sending the representation of union types around the network, but given that most serialization formats either ignore or strip properties in arrays, there's no guarantee that the representation will be the same after being encoded and decoded again.

@glennsl
Contributor

glennsl commented Feb 22, 2017

@ergl There's apparently some work on "efficient deriving" that should fix this. In the meantime, you could use this for serialization: https://github.com/BuckleTypes/transit-bsc

@ergl

ergl commented Feb 22, 2017

@glennsl thanks for the link, I ended up implementing a converter for array encoding <-> object literal encoding, as I'm not in control of the transport format right now
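A sketch of such a converter (this assumes the array encoding carries its tag in a non-enumerable 'tag' property, as discussed above; the function names are hypothetical, not ergl's actual code):

```javascript
// Convert between the array encoding (tag as a hidden property, lost by
// most serializers) and a plain object literal that survives JSON.
function toSerializable(v) {
  return Array.isArray(v)
    ? { tag: v.tag | 0, fields: v.map(toSerializable) }
    : v;
}

function fromSerializable(v) {
  if (v === null || typeof v !== 'object') return v;
  const arr = v.fields.map(fromSerializable);
  return Object.defineProperty(arr, 'tag', { value: v.tag });
}

const block = Object.defineProperty([1, [2]], 'tag', { value: 1 });
const wire = JSON.stringify(toSerializable(block)); // tag survives the trip
const back = fromSerializable(JSON.parse(wire));
console.log(back.tag, back[0], back[1][0]); // 1 1 2
```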

@bobzhang
Member Author

bobzhang commented Feb 22, 2017 via email

bsansouci pushed a commit to bsansouci/bucklescript that referenced this issue Feb 12, 2018
* replace the basic theme Reason project with the bsb-native example

* note  behavior to create a new project

* add back bs-dependencies to basic template bsconfig; code review nits

* add back code review nit; rebase off master
@jacobp100

jacobp100 commented Mar 24, 2019

Hope this isn't too off topic. I was just looking at a common use-case: mapping over a list. I made a test demo using an implementation like List.map (immutable updates) and Belt.List.map (mutable updates). I tried with both an array and object representation.

I tested in Chrome and Safari using esbench. For the immutable case, the object wins out slightly in Chrome, and by quite a lot in Safari. For the mutable case, the object wins out by quite a lot in both browsers (both were faster than any immutable updates, too).

My results in Safari were,

Test                  Ops/s
mapImmutableArray       407,790
mapImmutableObject      439,402
mapMutableArray         822,669
mapMutableObject      1,367,456

Code:
let listArray = null
let listObject = null
for (let i = 100; i >= 0; i -= 1) {
    listArray = [i, listArray]
    listObject = { current: i, next: listObject }
}

const fn = x => x * 2 + 1

const mapImmutableArray = (fn, x) => x != null ? [fn(x[0]), mapImmutableArray(fn, x[1])] : null
const mapImmutableObject = (fn, x) => x != null ? { current: fn(x.current), next: mapImmutableObject(fn, x.next) } : null

const mapMutableArray = (fn, x) => {
    if (x == null) return null
    const out = [fn(x[0]), null]
    let writeTo = out
    let readFrom = x[1]
    while (readFrom != null) {
        const next = [fn(readFrom[0]), null]
        writeTo[1] = next
        writeTo = next
        readFrom = readFrom[1]
    }
    return out
}

const mapMutableObject = (fn, x) => {
    if (x == null) return null
    const out = { current: fn(x.current), next: null }
    let writeTo = out
    let readFrom = x.next
    while (readFrom != null) {
        const next = { current: fn(readFrom.current), next: null }
        writeTo.next = next
        writeTo = next
        readFrom = readFrom.next
    }
    return out
}
