Details
- Task
- Resolution: Done
- P3: Somewhat important
- None
- 5.12
- None
Description
Objective
The port of QML to the WASM VM defaults to using the QML byte code interpreter to run JavaScript binding expressions and code in .js files.
The port does not use a just-in-time compiler. It is conceivable to generate a WASM module on the fly, but because calls in and out of the JS VM are so frequent, the extra overhead of function calls across WASM modules may not be worth it.
Instead, the objective of this task is to investigate how much overhead the QML byte code interpreter has when running in the WASM VM.
- On the traditional target architectures, the QML interpreter uses computed goto to speed up jumping between byte code instructions. This is supported when using clang and gcc.
The WASM MVP specification does not support such control flow, so the byte code interpreter for QML uses a traditional switch statement to dispatch to the individual byte code instruction implementations. This makes the interpreter running in the WASM VM slower than on other architectures.
- Each byte code instruction involves a step of decoding its parameters. This applies to all target architectures.
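The two overheads listed above can be illustrated with a minimal sketch. The six-instruction byte code below is hypothetical and is not the real QML instruction set; the names runSwitch, runComputedGoto and factorialProgram are made up for this illustration. The first loop dispatches through a switch and decodes operands from the byte stream; the second does the same work with the "labels as values" extension that gcc and clang provide.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical mini instruction set (assumption, not the QML byte code).
enum Op : std::uint8_t { LoadAcc, LoadI, MulAccI, IncI, JumpIfILessEq, Halt };

// Byte code for 12!: acc = 1; i = 2; do { acc *= i; ++i; } while (i <= 12);
std::vector<std::uint8_t> factorialProgram() {
    return { LoadAcc, 1, LoadI, 2, MulAccI, IncI, JumpIfILessEq, 12, 4, Halt };
}

// Switch-based dispatch: every instruction goes back through the switch.
std::int64_t runSwitch(const std::vector<std::uint8_t> &code) {
    std::int64_t acc = 0, i = 0;
    std::size_t pc = 0;
    for (;;) {
        switch (code[pc++]) {
        case LoadAcc: acc = code[pc++]; break;   // decode one operand
        case LoadI:   i = code[pc++]; break;
        case MulAccI: acc *= i; break;
        case IncI:    ++i; break;
        case JumpIfILessEq: {
            std::uint8_t limit = code[pc++];     // decode two operands
            std::uint8_t target = code[pc++];
            if (i <= limit) pc = target;
            break;
        }
        case Halt: return acc;
        default:   return -1;                    // unknown opcode
        }
    }
}

#if defined(__GNUC__)
// Computed-goto dispatch: each handler jumps straight to the next handler.
// Operand decoding is still required, as on every target architecture.
std::int64_t runComputedGoto(const std::vector<std::uint8_t> &code) {
    // Order must match the Op enumeration.
    static void *const labels[] = { &&op_LoadAcc, &&op_LoadI, &&op_MulAccI,
                                    &&op_IncI, &&op_JumpIfILessEq, &&op_Halt };
    std::int64_t acc = 0, i = 0;
    std::size_t pc = 0;
#define DISPATCH() goto *labels[code[pc++]]
    DISPATCH();
op_LoadAcc: acc = code[pc++]; DISPATCH();
op_LoadI:   i = code[pc++]; DISPATCH();
op_MulAccI: acc *= i; DISPATCH();
op_IncI:    ++i; DISPATCH();
op_JumpIfILessEq: {
        std::uint8_t limit = code[pc++];
        std::uint8_t target = code[pc++];
        if (i <= limit) pc = target;
        DISPATCH();
    }
op_Halt: return acc;
#undef DISPATCH
}
#endif
```

Calling runSwitch(factorialProgram()) evaluates the same 12! loop as the measured test case, only on this toy instruction set. The sketch says nothing about the actual size of the overhead in the QML interpreter.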
Prototype
A prototype was written to translate the byte code instructions directly to WASM expressions, to eliminate the cost of the old-style switch statement as well as the instruction decoding. In a straightforward manner, the prototype compiles a .qml or .js file to a relocatable WASM module. Every JavaScript function or binding expression is mapped to a WASM function, with a parameter structure suitable for the JavaScript virtual machine. The QML binary data structure is embedded as a data segment in the WASM module and, along with the individual functions, made visible to the LLD linker via a symbol table in the linking section. The byte code instructions in the function code are decoded and translated to WASM expressions and control flow structures.
Here is a simple example of what this translation can look like. Assuming the presence of a function local called $acc (i64) that represents the byte code VM's accumulator, $enginePtr for the address (i32) of the V4 engine and $jsStack (i32) for the address of the JavaScript stack, the "loadName" instruction that looks up a property in the activation and stores the result in the JS stack can be encoded as follows:
    (set_local $acc
      (call $qv4runtime_method_loadName
        (get_local $enginePtr)
        (i32.const 5) ;; index in compilation unit's string table for the name
      )
    )
    (i64.store offset=16
      (get_local $jsStack)
      (get_local $acc)
    )
As the above example also shows, function calls into the JS run-time can be resolved directly by the linker and require no extra indirection.
Results
The prototype was developed to the point where a simple factorial test case could run and be used to measure the switch and decoding overhead. The following example was measured:
    var d1 = +new Date
    for (var x = 0; x < 1000000; ++x) {
        var res = 1;
        for (var i = 2; i <= 12; i=i+1)
            res = res * i;
        if (res != 479001600) {
            console.log("KO", res)
        }
    }
    var d2 = +new Date
    console.log("done in", d2 - d1)
The generated code involves a large number of calls into the JS runtime as well as reasonably complex control flow, due to the loops and the exception checking after run-time calls.
The above example runs ~35% faster when compiled directly to WASM compared to the traditional byte code interpreter.
Implementation caveats
The prototype uses binaryen for code generation, and LLVM's WASM
back-end together with a WASM-enabled lld to link the traditional Qt
code with the ahead-of-time generated WASM module. This imposes
a requirement on a bleeding-edge toolchain, which even needed
modifications.
There are two possible ways of productizing this, if desired:
- Pursue the path of relocatable WASM module generation by upstreaming
changes to binaryen and emscripten as well as putting together an SDK
with the latest set of tools for use with the Qt port.
The downsides of this approach are the dependency on a toolchain and spec that are
still in development as well as the work required to upstream
changes to allow for embedding relocatable wasm objects in archives
and "passing" them through to emscripten. The upside of this approach
is that the code generator can be fully tested against the entire
ECMAScript test suite as well as the QML test suite by embedding the
binaryen interpreter. This was used for prototype development.
- Replace the generation of WASM code with generation of C++ code.
The downside of this approach is that it is harder to test the code
generator for correctness. The upside is that it does not require any
changes to the SDK or toolchain. The approach itself has also not yet
been validated with a prototype.
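To make the second option more concrete, here is a minimal sketch of what generating C++ instead of WASM could look like. It is purely illustrative: the six-instruction byte code, the emitCpp function and the shape of the emitted code are assumptions made up for this example, and the task itself did not prototype this approach. Each instruction is decoded once at generation time and emitted as a plain statement, with byte code offsets turned into labels so that jumps become gotos.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical mini instruction set (assumption, not the QML byte code).
enum Op : std::uint8_t { LoadAcc, LoadI, MulAccI, IncI, JumpIfILessEq, Halt };

// Byte code for 12!: acc = 1; i = 2; do { acc *= i; ++i; } while (i <= 12);
std::vector<std::uint8_t> factorialProgram() {
    return { LoadAcc, 1, LoadI, 2, MulAccI, IncI, JumpIfILessEq, 12, 4, Halt };
}

// Emits C++ source text for one function. Operands are decoded here, at
// generation time, so the emitted code has neither a dispatch switch nor
// a decoding step.
std::string emitCpp(const std::vector<std::uint8_t> &code, const std::string &name) {
    std::ostringstream out;
    out << "long long " << name << "() {\n"
        << "    long long acc = 0, i = 0;\n";
    std::size_t pc = 0;
    while (pc < code.size()) {
        out << "L" << pc << ": ";              // label named after the byte code offset
        switch (code[pc++]) {
        case LoadAcc: out << "acc = " << int(code[pc++]) << ";\n"; break;
        case LoadI:   out << "i = " << int(code[pc++]) << ";\n"; break;
        case MulAccI: out << "acc *= i;\n"; break;
        case IncI:    out << "++i;\n"; break;
        case JumpIfILessEq: {
            int limit = code[pc++];
            int target = code[pc++];
            out << "if (i <= " << limit << ") goto L" << target << ";\n";
            break;
        }
        case Halt:    out << "return acc;\n"; break;
        default:      out << "return -1; // unknown opcode\n"; break;
        }
    }
    out << "    return acc;\n}\n";             // guard for programs without Halt
    return out.str();
}
```

Calling emitCpp(factorialProgram(), "factorial12") returns the source of a function whose loop body is the statement sequence "acc *= i; ++i; if (i <= 12) goto L4;". A real generator would emit calls into the JS run-time and exception checks instead of plain arithmetic.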
Issue Links
- relates to QTBUG-64064 QtQuick for webassembly (Closed)