Uploaded image for project: 'Qt for Python'
  1. Qt for Python
  2. PYSIDE-1313

PySide::GlobalReceiverV2::qt_metacall can corrupt CPython's internal exception state

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • P2: Important
    • 5.15.1
    • 5.14.2.1
    • PySide
    • None

    Description

      Consider the following short program. demo() is a trivial async function that creates a QObject instance, connects a Python signal, and then exits. For current purposes, "async function" is basically just the same as a generator. Down below I'll explain why I suspect the underlying bug affects lots of different situations beyond just async functions, but this is the reproducer I have at hand.

      Anyway, when we call send(None) on this object, we expect to get a StopIteration exception.

      from PySide2 import QtCore
      
      class MyQObject(QtCore.QObject):
          sig = QtCore.Signal()
      
      async def demo():
          myqobject = MyQObject()
          myqobject.sig.connect(lambda: None)
          return 1
      
      coro = demo()
      try:
          coro.send(None)
      except StopIteration as exc:
          print(f"OK: got {exc!r}")
      except SystemError as exc:
          print(f"WTF: got {exc!r}")
      

      Actual output:

      StopIteration: 1
      WTF: got SystemError("<method 'send' of 'coroutine' objects> returned NULL without setting an error")
      

      So there are two weird things here: the StopIteration exception is being printed on the console for some reason, and then the actual send method is raising SystemError instead of StopIteration, with a strange error message complaining about what looks like a bug in the Python interpreter.

      Here's what's happening:

      1. In CPython, the gen_send_ex function implements the send method on coroutines and generators. When a generator/coroutine completes, it uses the C API to raise StopIteration, and then it DECREFs the generator/coroutine frame object.

      2. In our code, the Py_DECREF call causes the reference count on myqobject to drop to 0, so it's garbage collected, and its tp_dealloc hook runs.

      3. From there, we eventually end up in PySide2::GlobalReceiverV2::qt_metacall, which contains the following code:

          // SignalManager::callPythonMetaMethod might have failed, in that case we have to print the
          // error so it considered "handled".
          if (PyErr_Occurred()) {
              int reclimit = Py_GetRecursionLimit();
              // Inspired by Python's errors.c: PyErr_GivenExceptionMatches() function.
              // Temporarily bump the recursion limit, so that PyErr_Print will not raise a recursion
              // error again. Don't do it when the limit is already insanely high, to avoid overflow.
              if (reclimit < (1 << 30))
                  Py_SetRecursionLimit(reclimit + 5);
              PyErr_Print();
              Py_SetRecursionLimit(reclimit);
          }
      

      This is the bug: this code can be run from arbitrary contexts, including contexts where an exception has already been raised. gen_send_exPy_DECREF is one of them, but there are probably lots of others. So you can't just blindly call into Python, and you can't assume that PyErr_Occurred() indicates that an exception was raised by your code.

      I think this is currently a showstopper for using PySide2 and async/await together, and might be the cause of other strange behaviors as well.

      For an example of the correct way to call Python when you might be in some weird context like a GC destructor, see the slot_tp_finalize code in CPython, which is used to invoke user-supplied _del_ methods:

      https://github.com/python/cpython/blob/5f4b229df7812f1788287095eb6b138bb21876a4/Objects/typeobject.c#L6952-L6967

      Key points:

      • You need to save/restore the exception state before invoking the Python code, using PyErr_Fetch/PyErr_Restore
      • Instead of that tricky loop, you can probably just call PyErr_WriteUnraisable. It'll also produce a much clearer message, so I wouldn't have been left staring at the plain text StopIteration: 1 wondering where the heck it was coming from .
      • I'm not sure if this would be a functional change or just a stylistic one, but it's definitely weird to be calling PyErr_Occurred() directly. Usually you check the return value from invoking Python code to tell you whether it failed or not, and then if it failed you know that there's a live exception.

      Also, I noticed that PyErr_Occurred occurs more than 40 times in the PySide2 codebase. I'm nervous that there might be more places that use it incorrectly like this.

      Full stack trace for the point where the StopIteration gets discarded:

      #0  0x0000000000677eb7 in _PyErr_Fetch (p_traceback=0x7ffd9fda77d0, 
          p_value=0x7ffd9fda77d8, p_type=0x7ffd9fda77e0, tstate=0x2511280)
          at ../Python/errors.c:399
      #1  _PyErr_PrintEx (tstate=0x2511280, set_sys_last_vars=1) at ../Python/pythonrun.c:670
      #2  0x00007f1afb455967 in PySide::GlobalReceiverV2::qt_metacall(QMetaObject::Call, int, void**) ()
         from /home/njs/.user-python3.8/lib/python3.8/site-packages/PySide2/libpyside2.abi3.so.5.14
      #3  0x00007f1afaf2f657 in void doActivate<false>(QObject*, int, void**) ()
         from /home/njs/.user-python3.8/lib/python3.8/site-packages/PySide2/Qt/lib/libQt5Core.so.5
      #4  0x00007f1afaf2a37f in QObject::destroyed(QObject*) ()
         from /home/njs/.user-python3.8/lib/python3.8/site-packages/PySide2/Qt/lib/libQt5Core.so.5
      #5  0x00007f1afaf2d742 in QObject::~QObject() ()
         from /home/njs/.user-python3.8/lib/python3.8/site-packages/PySide2/Qt/lib/libQt5Core.so.5
      #6  0x00007f1afb852681 in QObjectWrapper::~QObjectWrapper() ()
         from /home/njs/.user-python3.8/lib/python3.8/site-packages/PySide2/QtCore.abi3.so
      #7  0x00007f1afbf785bb in SbkDeallocWrapperCommon ()
         from /home/njs/.user-python3.8/lib/python3.8/site-packages/shiboken2/libshiboken2.abi3.so.5.14
      #8  0x00000000005a4fbc in subtype_dealloc (self=<optimized out>)
          at ../Objects/typeobject.c:1289
      #9  0x00000000005e8c08 in _Py_Dealloc (op=<optimized out>) at ../Objects/object.c:2215
      #10 _Py_DECREF (filename=0x881795 "../Objects/frameobject.c", lineno=430, 
          op=<optimized out>) at ../Include/object.h:478
      #11 frame_dealloc (f=Frame 0x7f1afc572dd0, for file qget-min.py, line 12, in demo ())
          at ../Objects/frameobject.c:430
      #12 0x00000000004fdf30 in _Py_Dealloc (
          op=Frame 0x7f1afc572dd0, for file qget-min.py, line 12, in demo ())
          at ../Objects/object.c:2215
      #13 _Py_DECREF (filename=<synthetic pointer>, lineno=279, 
          op=Frame 0x7f1afc572dd0, for file qget-min.py, line 12, in demo ())
          at ../Include/object.h:478
      #14 gen_send_ex (gen=0x7f1afbd08440, arg=<optimized out>, exc=<optimized out>, 
          closing=<optimized out>) at ../Objects/genobject.c:279
      

      See also: https://bugs.python.org/issue40789

      Attachments

        For Gerrit Dashboard: PYSIDE-1313
        # Subject Branch Project Status CR V

        Activity

          People

            crmaurei Cristian Maureira-Fredes
            njs Nathaniel Smith
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Gerrit Reviews

                There are no open Gerrit changes