Waiting on a thread that is emitting a signal will deadlock if at least one receiver connected via BlockingQueuedConnection lives in a thread that doesn’t spin an event loop (anymore).
While this is rather obvious, the problem is that this situation sometimes cannot be avoided, for instance when shutting down the application: QCoreApplication::exec returns, and objects are destroyed as we leave the main function. If one of those objects is a QThread that needs to be waited on before destruction (to prevent the qFatal message we get when destroying a running thread), then the problem surfaces.
See the attached code for a runnable reproducer that shows the deadlock (with the symptom ultimately being at least the qWarning message) with a very high hit-rate. It’s a bit contrived so that the problem shows up frequently; moving or repeating e.g. some isInterruptionRequested() handling until after the msleep, like one would probably do in a real application, only reduces the likelihood of the issue manifesting.
What happens is:
- sender thread reaches the emit statement
- receiver thread exits its event loop and signals the sender thread to finish
- sender thread emits the signal and blocks, waiting for the event to be delivered
- receiver thread waits on the sender thread
(the last two might happen in the opposite order, it doesn’t matter)
If the main thread (in which the receiver object is likely to live) has to wait for the thread to finish, then the main thread won’t process events. So if, on its way towards finishing, the thread emits a signal that a main-thread object is required to handle, we have a deadlock.
And since there is no way to make “unlock receiver thread and emit blocking-queued signal” a single atomic step, there's always a chance for a deadlock, and from what I see no real solution at present.
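The sequence above can be modelled without Qt. The following is only a sketch using the C++ standard library: EventQueue and blockingEmitDeadlocks are hypothetical names, the queue stands in for the receiver's posted events, and the timeout exists only so that the demonstration terminates (a real blocking-queued emit waits indefinitely).

```cpp
#include <chrono>
#include <deque>
#include <functional>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the receiver thread's posted-event queue.
struct EventQueue {
    std::mutex m;
    std::deque<std::function<void()>> events;
    void post(std::function<void()> e) {
        std::lock_guard<std::mutex> lock(m);
        events.push_back(std::move(e));
    }
};

// Returns true if the sender was still blocked after giveUpAfter,
// i.e. the model ran into the deadlock described in the report.
bool blockingEmitDeadlocks(std::chrono::milliseconds giveUpAfter) {
    EventQueue receiverQueue;
    std::promise<void> delivered;                 // fulfilled only if the "slot" runs
    std::future<void> done = delivered.get_future();
    bool stuck = false;

    std::thread sender([&] {
        // "emit": post the call, then block until the receiver has handled it.
        receiverQueue.post([&] { delivered.set_value(); });
        // The timeout is an assumption of this sketch so that it can finish.
        stuck = done.wait_for(giveUpAfter) == std::future_status::timeout;
    });

    // The receiver has already left its event loop: it never drains
    // receiverQueue and only waits for the sender (QThread::wait in the report).
    sender.join();
    return stuck;
}
```

Calling blockingEmitDeadlocks(std::chrono::milliseconds(200)) returns true in this model, because nothing ever processes the posted event while the receiver waits.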
The “solution” I’ve come up with for this reproducer is to remove all posted events for the receiver if waiting for the thread to finish fails after a “reasonable amount of time”. See the commented-out section in the SendingThread destructor.
This is not a general solution - waiting for "some time" never is, and we cannot and should not expect the sender to know all receivers.
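The idea behind that workaround can be sketched in the same Qt-free style. This is a model under assumptions, not the code from the reproducer: BlockingCallEvent, EventQueue::removePosted and shutdownWithFallback are hypothetical names, and the model assumes (as the workaround relies on) that discarding an undelivered blocking event releases the sender waiting on it.

```cpp
#include <chrono>
#include <deque>
#include <future>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical posted event for a blocking call. Destroying it without
// delivery still releases the waiting sender (reported as "not delivered").
class BlockingCallEvent {
public:
    explicit BlockingCallEvent(std::promise<bool> released)
        : released_(std::move(released)) {}
    ~BlockingCallEvent() { if (!done_) released_.set_value(false); }
    void deliver() { done_ = true; released_.set_value(true); }
private:
    std::promise<bool> released_;
    bool done_ = false;
};

struct EventQueue {
    std::mutex m;
    std::deque<std::unique_ptr<BlockingCallEvent>> events;
    void post(std::unique_ptr<BlockingCallEvent> e) {
        std::lock_guard<std::mutex> lock(m);
        events.push_back(std::move(e));
    }
    void removePosted() {                 // drop everything that is still pending
        std::lock_guard<std::mutex> lock(m);
        events.clear();
    }
};

// Returns true if the sender only finished after its pending event was dropped.
bool shutdownWithFallback(std::chrono::milliseconds patience) {
    EventQueue receiverQueue;
    std::promise<void> finishedPromise;
    std::future<void> finished = finishedPromise.get_future();
    bool delivered = true;

    std::thread sender([&] {
        std::promise<bool> released;
        std::future<bool> result = released.get_future();
        receiverQueue.post(std::make_unique<BlockingCallEvent>(std::move(released)));
        delivered = result.get();         // blocks like a blocking-queued emit
        finishedPromise.set_value();
    });

    // Receiver side: no event loop anymore, so wait with a timeout and, each
    // time the wait fails, discard the undelivered events to free the sender.
    bool usedFallback = false;
    while (finished.wait_for(patience) == std::future_status::timeout) {
        receiverQueue.removePosted();
        usedFallback = true;
    }
    sender.join();
    return usedFallback && !delivered;
}
```

The retry loop is there because a single timed wait could expire before the sender has posted its event, which illustrates why a fixed “reasonable amount of time” is fragile.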
But perhaps it hints at what a more elegant solution could be:
- add an ability to block all BlockingQueued signal emissions from a thread requested to shut down
This could be an explicit API, or be triggered implicitly by QThread::wait or QThread::requestInterruption
- and - the more difficult part - add the capability to abort all ongoing signal emissions
For this we probably need to remember, for each sender thread, which BlockingQueued QMetaCallEvent it has currently posted (it can only be one, obviously), and then abort the delivery of that event when needed. This could again be done with an explicit API, or be added implicitly to QThread::wait()
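To make the two steps concrete, here is a rough standard-library model of such bookkeeping. Nothing in it exists in Qt: BlockingEmitGuard, emitBlocking, markDelivered, requestShutdown and shutdownWhileEmitting are hypothetical names, and a real implementation would live inside QThread and the event delivery code. The point of the sketch is that the shutdown check and the registration of the in-flight emission happen under one mutex, so a shutdown request either prevents the emission or finds and aborts it.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical per-sender-thread state; at most one blocking emit is in flight.
class BlockingEmitGuard {
public:
    enum class Result { Delivered, Refused, Aborted };

    // Sender side: stands in for emitting a blocking-queued signal.
    Result emitBlocking() {
        std::unique_lock<std::mutex> lock(m_);
        if (shutdownRequested_)
            return Result::Refused;               // step 1: block new emissions
        inFlight_ = true;                         // remember the pending emission
        // (posting the event to the receiver's queue would happen here)
        cv_.wait(lock, [this] { return delivered_ || aborted_; });
        inFlight_ = false;
        const Result r = delivered_ ? Result::Delivered : Result::Aborted;
        delivered_ = aborted_ = false;
        return r;
    }

    // Receiver side: called once the event has been handled.
    void markDelivered() {
        std::lock_guard<std::mutex> lock(m_);
        if (inFlight_) { delivered_ = true; cv_.notify_all(); }
    }

    // Waiting side: called before waiting on the sender thread.
    void requestShutdown() {
        std::lock_guard<std::mutex> lock(m_);
        shutdownRequested_ = true;
        if (inFlight_) { aborted_ = true; cv_.notify_all(); }  // step 2: abort
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    bool shutdownRequested_ = false;
    bool inFlight_ = false;
    bool delivered_ = false;
    bool aborted_ = false;
};

// Returns true if waiting on the sender completed without any delivery,
// regardless of whether the shutdown request came before or during the emit.
bool shutdownWhileEmitting() {
    BlockingEmitGuard guard;
    auto first = BlockingEmitGuard::Result::Delivered;
    auto second = BlockingEmitGuard::Result::Delivered;
    std::thread sender([&] {
        first = guard.emitBlocking();             // Refused or Aborted
        second = guard.emitBlocking();            // always Refused after shutdown
    });
    guard.requestShutdown();                      // receiver never processes events
    sender.join();                                // cannot deadlock in this model
    return first != BlockingEmitGuard::Result::Delivered
        && second == BlockingEmitGuard::Result::Refused;
}
```

In this model the sender has to cope with an emission being refused or aborted, which is a behaviour change that an actual API would need to expose to callers somehow.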
|For Gerrit Dashboard: QTBUG-93259|
|346183,2||WIP: prototype for preventing deadlock with blocking signals||dev||qt/qtbase||Status: NEW||-2||0|