This subtask is a generic go-to page for everything related to the performance of the generated code. It contains a mix of thoughts, general ideas, tricks and drive-by remarks on generating performant C++ code that is comparable in run time to the QQmlComponent-based workflow (yes, it is actually non-trivial to generate faster code "out of the box" in the broad case - e.g. calqlatr demo).
So, the list (in no particular order):
- (1) The big problem of generated code vs QQmlComponent is the lack of QQmlObjectCreator (and its shared state) in the former case. There are multiple things that allow QQmlObjectCreator to actually be useful and save cycles here-and-there:
- (1.2) Bindings. QQmlObjectCreator's model is to not install the eager (and deferred?) bindings straight away, but instead to store them in shared state and then enable - and remove some of - them at finalization time (e.g. possibly all script bindings that assign an enumeration value are eliminated after a single evaluation). This basically means that no in-between property change triggers a binding evaluation (!= delayed evaluation). For qmltc it means that, to provide a similar facility, we must somehow remember the bindings that need to be enabled after all of the document's objects are constructed and the document root construction is finalized. This seems fairly problematic on its own. Without doing so, we risk that certain property changes trigger re-evaluations in random places (due to id lookups).
- (1.3) Q_INTERFACES and `Component.onCompleted` objects. Similarly to (eager) bindings, QQmlObjectCreator just remembers which objects need special handling and then does it once for all objects in QQmlObjectCreator::finalize. qmltc might get away with generating hardcoded instructions that do the necessary evil at the end of the document root's construction. The obvious problem is: the current document must also take care of all types that are QML-originated (that is, they come from other QML documents) – as the order is partially well-defined, it is going to be tricky but actually necessary.
- Note: the Q_INTERFACES calls (such as QQmlParserStatus' componentComplete()) can be heavy (e.g. QQuickItem's componentComplete() triggers a layout update). qmltc must ensure that we only do the necessary evil for Q_INTERFACES after all bindings for all objects are set. The problem is that certain properties' setters might easily check whether componentComplete() has already been called (or not) and perform additional, heavy steps (see e.g. QQuickText::setText and QQuickTextPrivate::updateLayout). Thus, making sure we only start the completion after all possible bindings have been enabled could reduce the run time overhead (by not doing useless extra work).
- Note 2: Experiments comparing qmltc generated code and QQmlComponent for the NumberPad type in the calqlatr demo indeed reveal that the currently generated code is flawed: we get many more QQuickText::updateLayout() calls in the qmltc case (actually, from the logs it seems like updateLayout() is called twice every time in the qmltc case but only once when using QQmlComponent).
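To illustrate the ordering concern from (1.2) and (1.3), here is a minimal sketch in plain C++ without Qt. Every name in it (`MockText`, `MockCreatorState`, the two helper functions) is hypothetical and only models the idea: a setter that does extra layout work once completion has happened, and a shared state that records bindings during construction and enables them before any completion callback runs. It is not how QQuickText or QQmlObjectCreator are actually implemented.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an item such as QQuickText; not a Qt class.
struct MockText {
    std::string text;
    bool completed = false;
    int layoutUpdates = 0;

    void setText(std::string t) {
        text = std::move(t);
        if (completed)        // assumption: setters do extra work after completion
            ++layoutUpdates;
    }
    void componentComplete() {
        completed = true;
        ++layoutUpdates;      // assumption: completion itself triggers one layout pass
    }
};

// Hypothetical shared state: bindings are only recorded during construction
// and are enabled in finalize(), before any completion callback is invoked.
struct MockCreatorState {
    std::vector<std::function<void()>> pendingBindings;
    std::vector<MockText *> needsCompletion;

    void finalize() {
        for (auto &enable : pendingBindings)
            enable();
        pendingBindings.clear();
        for (MockText *t : needsCompletion)
            t->componentComplete();
        needsCompletion.clear();
    }
};

// Completion happens first, the binding is applied afterwards.
int layoutUpdatesWhenCompletedFirst() {
    MockText t;
    t.componentComplete();
    t.setText("7");
    return t.layoutUpdates;
}

// The binding is remembered and applied before completion.
int layoutUpdatesWhenBindingsFirst() {
    MockText t;
    MockCreatorState state;
    state.pendingBindings.push_back([&t] { t.setText("7"); });
    state.needsCompletion.push_back(&t);
    state.finalize();
    return t.layoutUpdates;
}
```

In this toy model the "completed first" order performs two layout updates and the "bindings first" order performs one, which has the same shape as the twice-versus-once observation in Note 2. Whether that is the real cause in the qmltc case is not established by this sketch.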
- (2) QQmlListReference. Used (almost) always when object bindings are present. qmltc can fetch QQmlListProperty directly, without going through the meta-object system. Being able to construct QQmlListReference out of QQmlListProperty (which is the backbone of the data anyway) would eliminate the need to perform costly lookups - if we're querying property by a string name. At present, there's QQmlListReference(QVariant, QQmlEngine), maybe this is "good enough"?
- (3) QQmlPropertyValueInterceptor and QQmlPropertyValueSource use setTarget(QQmlProperty). Fetching a QQmlProperty is actually not free at all: unlike a property value read, which is a direct C++ function call, we have to go through property caches when instantiating a QQmlProperty. The question is: do we really need a full-blown QQmlProperty, or are some bits of it enough? (e.g. querying QMetaProperty is much faster)
- Note on QQmlProperty: there's a constructor that accepts QQmlContext as additional parameter (and allows the use of type name cache) - could creating a property through this API be cheaper?
- Anyhow, this is a fairly rare case it seems
- (4) Separated construction and setup/finalization of objects. Due to the non-recursive nature of object finalization (see QQmlObjectCreator::finalize, which is "flat"), object "construction" (creation, context setup, etc.) and "finalization" (binding setup, Q_INTERFACES handling, etc.) have to be separated. For the qmltc model it means that the document root has to call multiple independent functions on children in a defined order.
- Children access is non-trivial: the current approach is to use QObject::children().at(index) and then, if the child of a child is needed, we have to do two calls to the function one way or another
- The independent functions (set bindings, call componentComplete(), call componentFinalized(), handle Component.onCompleted, etc.) either have to be recursive - calling the children's functions - and respect base classes, or all the calls have to be performed in the document root. The latter means that several levels of indirection are needed: the QML code might need to call something on "a child of a child of a child of a document root" element.
- A decent strategy to handle this would be: introduce a "QmltcObjectCreator" (template class) that creates the document root. Additionally, the QmltcObjectCreator has a `std::array<QObject *, N> flatElements` structure that holds all sub-elements of the document root. The size N of the array is determined by the document root (generated class), which must have this information. Then, the document root can trivially access any created element across QML documents at run time and call the necessary things for each (e.g. classBegin(), binding creation, componentComplete(), componentFinalized(), Component.onCompleted). This model should significantly accelerate the access, as the pattern is simple and cache-friendly for the CPU. The code generator might also reorder certain calls (where possible) to improve cache locality.
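The flat-array idea from (4) could look roughly like the sketch below. It is plain C++ without Qt, and all names (`MockObject`, `MockDocumentRoot`, `MockObjectCreator`, `runFlatFinalization`) are hypothetical stand-ins, not existing qmltc or Qt API. It assumes the generated document root exposes its element count as a compile-time constant, and it collapses ownership and construction into the creator for brevity. The phase order follows the list in the text (classBegin, bindings, componentComplete).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for QObject plus parser-status style hooks; not Qt API.
struct MockObject {
    std::vector<std::string> *log = nullptr;
    std::string name;
    void classBegin()        { log->push_back(name + ".classBegin"); }
    void enableBindings()    { log->push_back(name + ".bindings"); }
    void componentComplete() { log->push_back(name + ".componentComplete"); }
};

// Hypothetical generated document root: it knows how many elements
// (itself plus all sub-elements) the document contains.
struct MockDocumentRoot {
    static constexpr std::size_t elementCount = 3;
};

template <typename Root>
class MockObjectCreator {
public:
    explicit MockObjectCreator(std::vector<std::string> *log) {
        for (std::size_t i = 0; i < Root::elementCount; ++i) {
            storage[i].log = log;
            storage[i].name = "obj" + std::to_string(i);
            flatElements[i] = &storage[i];
        }
    }

    // Each phase is a single flat pass over all elements, so a phase only
    // starts after the previous one has finished for every object.
    void finalize() {
        for (MockObject *o : flatElements) o->classBegin();
        for (MockObject *o : flatElements) o->enableBindings();
        for (MockObject *o : flatElements) o->componentComplete();
    }

    // Flat view of every element of the document, indexable in O(1).
    std::array<MockObject *, Root::elementCount> flatElements{};

private:
    std::array<MockObject, Root::elementCount> storage{};
};

// Builds a three-element "document" and returns the order of the calls.
std::vector<std::string> runFlatFinalization() {
    std::vector<std::string> log;
    MockObjectCreator<MockDocumentRoot> creator(&log);
    creator.finalize();
    return log;
}
```

With this layout, reaching "a child of a child of a child" is one array index instead of chained QObject::children().at(index) calls, and each finalization phase is a linear walk over contiguous pointers. How elements from other QML documents get their slots in the array is left open here, as it is in the text.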