Qt for Python, PYSIDE-2404

Create an On-Demand Initializer for PySide

Details

    • Task
    • Resolution: Incomplete
    • P2: Important
    • None
    • None
    • Shiboken
    • None
    • c0b74a794 (dev), 8aa92c970 (6.5), 2bb8b0f7b (dev), a6b47fbd8 (6.6), 6e897a9ef (dev), 7f69d4d56 (dev), d9f3fb812 (dev), ceae763ac (6.6), 9b2ba5e6b (dev), d7b1c851d (6.6), 9b240cd08 (dev), 14618acc1 (6.6), dcbe4810a (dev), 546548acc (dev), fb0270f39 (dev)

    Description

      The Initialization Problem

      All PySide types are created at the very beginning of a program, when a module is imported. This takes a noticeable amount of time, and the introduction of the new enums makes the effect even more obvious:

      • New enums require a considerable amount of Python code, which is naturally about 20-30 times slower than the C++ equivalent.
      • If you look at the effort that goes into initialization compared to the benefit, you see that most of it could be saved by not initializing at all.

      Early attempts

      A first, relatively naive approach was not to create enums directly, but only when they are used. For this purpose initialization checks were built into functions like PyObject_GetAttr. This worked quite well, but was not reliable:

      • Besides the mentioned changes there are many other places where initializations have to be checked.
      • This also happens in all sorts of conversion functions that show up everywhere in the generated wrapper code.
      • All these conversion functions must still work, even if the used Python wrappers don't exist yet.

      How to solve the problem in the first place?
      After various other attempts, this approach emerged as promising in the end:

      Register classes that do not exist yet

      In the original implementation there is a mapping in every module which assigns class names to wrapper classes. This is the communication center of every module. It looks like an unsolvable problem to stop creating this structure. But there is another possibility:

      The essential new approach is now to change this mapping in such a way that the classes are still found, but without the classes necessarily having to exist.

      • Instead of registering classes, we register functions that can create such classes.
      • Access to the classes, e.g. in conversion functions, is wrapped in a Shiboken::resolve function, so that nothing changes from the calling function's point of view: the requested class is delivered, even if it is only initialized by the resolve function.

      The existing mapping is changed so that it either holds the PyTypeObject directly, as before, or calls a PyTypeObject-valued function which then generates the PyTypeObject just in time. Because both possibilities are supported, the change to late initialization can be made gradually. We call this Dual Registration.

      Concrete Example of Dual Registration Preparation

      Normally, the QtCore module header contains these structures for registration:

      // Current module's type array.
      PyTypeObject **SbkPySide6_QtCoreTypes = nullptr;
      // Current module's PyObject pointer.
      PyObject *SbkPySide6_QtCoreModuleObject = nullptr;
      // Current module's converter array.
      SbkConverter **SbkPySide6_QtCoreTypeConverters = nullptr;
      

      and a usage example is

      Shiboken::Conversions::pointerToPython(SbkPySide6_QtCoreTypes[SBK_QCHILDEVENT_IDX], event)
      

      New PyTypeTypeF * array replaces PyTypeObject **

      The new, preparatory structure looks like this instead:

      // Current module's type array.
      Shiboken::PyTypeTypeF *SbkPySide6_QtCoreTypes = nullptr;
      // Current module's PyObject pointer.
      PyObject *SbkPySide6_QtCoreModuleObject = nullptr;
      // Current module's converter array.
      SbkConverter **SbkPySide6_QtCoreTypeConverters = nullptr;
      

      and its slightly modified usage like this:

      Shiboken::Conversions::pointerToPython(Shiboken::resolve(SbkPySide6_QtCoreTypes[SBK_QCHILDEVENT_IDX]), event)
      

      Instead of an array of PyTypeObject *, the added Shiboken::resolve function now operates on these pairs:

      struct PyTypeTypeF {
          PyTypeObject *pyType;
          PyTypeObject *(*pyTypeF)();
      };
      

      This function is very simple, but with a huge effect:

      PyTypeObject *resolve(const PyTypeTypeF &pair)
      {
          return pair.pyTypeF ? pair.pyTypeF() : pair.pyType;
      }
      
      • If the pyTypeF field is not set, then the return value is taken from PyTypeObject * directly, as it was before this whole change.
      • If the pyTypeF field is set to a PyTypeObject * valued function, this function is called. It has to perform the lazy initialization.

      Note the effect:

      • By construction, the requested type is still delivered at the right time. But now the type is created at the latest possible moment.
      • The registration is still well defined because it describes how to get from name to type. The values are constant and will never be changed after assignment.
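
      The behaviour of the pair and of resolve can be tried out without Python. The following is only a minimal sketch under assumptions: FakeTypeObject stands in for PyTypeObject, TypePair mirrors PyTypeTypeF, and createQChildEventType, creationCount and demoDualRegistration are hypothetical names invented for this illustration, not Shiboken code. It also assumes that the creator function caches its result, since resolve calls it on every access.

```cpp
#include <cassert>
#include <string>

// Stand-in for CPython's PyTypeObject so the sketch compiles without Python.h.
struct FakeTypeObject {
    std::string name;
};

// Same shape as the PyTypeTypeF pair above, using the stand-in type.
struct TypePair {
    FakeTypeObject *pyType;
    FakeTypeObject *(*pyTypeF)();
};

// Counts how often the expensive creation step runs (hypothetical, demo only).
static int creationCount = 0;

// Hypothetical lazy creator: builds the type on the first call and caches it.
FakeTypeObject *createQChildEventType()
{
    static FakeTypeObject storage;
    static bool created = false;
    if (!created) {
        ++creationCount;
        storage.name = "QChildEvent";
        created = true;
    }
    return &storage;
}

// Mirrors the resolve function: prefer the creator function when it is set.
FakeTypeObject *resolve(const TypePair &pair)
{
    return pair.pyTypeF ? pair.pyTypeF() : pair.pyType;
}

// Exercises both registration styles; returns the total number of creations.
int demoDualRegistration()
{
    static FakeTypeObject eagerType{"QObject"};
    const TypePair registry[] = {
        {&eagerType, nullptr},             // eager entry, as before the change
        {nullptr, createQChildEventType},  // lazy entry, created on demand
    };
    const int before = creationCount;
    assert(resolve(registry[0]) == &eagerType);
    assert(creationCount == before);       // eager access creates nothing
    FakeTypeObject *first = resolve(registry[1]);
    FakeTypeObject *second = resolve(registry[1]);
    assert(first == second);               // the cached object is reused
    assert(first->name == "QChildEvent");
    return creationCount;
}
```

      In this sketch the table entries themselves are never modified after assignment. The caching lives entirely inside the creator function, which is one possible way to keep the registration constant.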

      A New Approach: Delay at the Import Level

      When thinking about possible initialization savings, there is a working example from Python itself:

      PEP 690

      This PEP proposes how modules in Python can be delay-loaded. A delayed module is not eagerly loaded when the import happens; the real import happens when something from the module is actually used.

      In PySide, the situation is slightly different, because it is not enough to trigger execution when the first object of the module is used. That would trigger almost every time. The saving for PySide modules occurs only when this is done at a finer granularity, i.e. per class. But the idea is quite similar. It is the implementation of

      • Load Classes Only when they are Used

      The necessary functions have already been implemented in sbkmodules.cpp. The module import is modified in such a way that only the names of the classes are defined. They are not Python objects yet. The initialization takes place only when a getattr function is called.

      This has an advantage over the former approaches: we no longer try to change large amounts of generated initialization code and many conversion functions. Instead, we try to leave everything untouched and simply initialize a class when it is accessed through the intercepted module_getattr function.
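
      The idea of defining only names at import time and creating a class on first attribute access can be sketched as follows. This is a simplified model under assumptions, not the code in sbkmodules.cpp: LazyModule, FakeClass, registerName, getAttr and demoLazyModule are hypothetical names, and a plain C++ map takes the place of the Python module and its getattr hook.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Stand-in for a Python class object (hypothetical, avoids Python.h).
struct FakeClass {
    std::string name;
};

// Hypothetical module: names are known at import, classes are built on first access.
class LazyModule {
public:
    using Factory = std::function<FakeClass()>;

    // Import time: record only the name and how to create the class.
    void registerName(const std::string &name, Factory factory)
    {
        m_factories.emplace(name, std::move(factory));
    }

    // Plays the role of an intercepted module-level getattr.
    const FakeClass *getAttr(const std::string &name)
    {
        auto ready = m_classes.find(name);
        if (ready != m_classes.end())
            return &ready->second;          // already initialized
        auto pending = m_factories.find(name);
        if (pending == m_factories.end())
            return nullptr;                 // unknown name
        auto inserted = m_classes.emplace(name, pending->second());
        m_factories.erase(pending);         // the creator is needed only once
        return &inserted.first->second;
    }

    std::size_t initializedCount() const { return m_classes.size(); }

private:
    std::unordered_map<std::string, Factory> m_factories;
    std::unordered_map<std::string, FakeClass> m_classes;
};

// Registers two names, touches one, and returns how many classes were created.
int demoLazyModule()
{
    LazyModule qtCore;
    qtCore.registerName("QObject", [] { return FakeClass{"QObject"}; });
    qtCore.registerName("QTimer", [] { return FakeClass{"QTimer"}; });
    assert(qtCore.initializedCount() == 0); // import created no classes
    const FakeClass *timer = qtCore.getAttr("QTimer");
    assert(timer != nullptr && timer->name == "QTimer");
    assert(qtCore.getAttr("QTimer") == timer);   // second access reuses it
    assert(qtCore.getAttr("Missing") == nullptr);
    return static_cast<int>(qtCore.initializedCount());
}
```

      In this model, QObject is registered but never accessed, so it is never created. That is the saving the per-class delay aims for.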

      Remaining Problems with New Approach

      While the new way simplifies things considerably, we still have the problem that PySide modules want to fill a static array of fully generated classes. In the previous approach we tried to delay this initialization by using class-generating functions instead of the classes themselves. This pushed the complexity into the code generator, which made trying out different approaches very tedious and error-prone.

      Update 2024-01-17: First tests are very positive!

      Update 2024-04-04: This is now finally integrated.

      ... and needs a fix for polymorphic functions


            People

              ctismer Christian Tismer