Status: Need More Info
Priority: P1: Critical
Affects Version/s: 5.9.1
Fix Version/s: None
Component/s: Quick: SceneGraph
Environment: I have observed this crash on CentOS 7.3, Ubuntu 17.04, and macOS Sierra. [^qt-5.9.1.patch]
I don't have an easily-distillable example that demonstrates this problem, but I have confirmed that QSGDefaultDistanceFieldGlyphCache can exhibit a use-after-free bug that causes unpredictable behavior (often a segmentation fault).
Steps to reproduce:
- Create a QQuickWidget whose scene graph contains a QML Text item (represented by a QQuickTextNode). I don't think it's key to this issue, but my scene graph contains a Map item, with the Text item added to the Map as a MapQuickItem object.
- Place the QQuickWidget into a QDockWidget that is docked into a QMainWindow.
- Create some code that updates the contents of the text periodically, for instance a timer that changes the contents of the text. (I don't know how the internals of the scene graph work, so I'm not sure exactly what is required to trigger the bug.)
- Undock the widget from the main window so it becomes its own top-level window.
- Redock the widget to the main window.
- Repeat the previous two steps (undock, then redock) until you see a segmentation fault.
By tracing through the Qt source (the stack traces below are from a macOS build), I found that undocking the widget from the main window triggers a resizeEvent() on the QQuickWidget. This triggers the creation of a new frame buffer object, which results in the OpenGL context being recreated. Here is the stack trace where the OpenGL context is destroyed:
The QQuickTextNode in the scene graph creates a QSGDistanceFieldGlyphNode object to represent the distance field glyphs that are used to render the text. The QSGDistanceFieldGlyphNode object has a member variable called m_glyph_cache of type QSGDefaultDistanceFieldGlyphCache *. This cache is used during the rendering process.
QSGDefaultDistanceFieldGlyphCache has a member variable called m_funcs of type QOpenGLFunctions *. This pointer is used to access the OpenGL API function table as needed. However, it is initialized once, in the QSGDefaultDistanceFieldGlyphCache constructor, to the QOpenGLFunctions object associated with the context passed to the constructor. Nothing guards against that context being destroyed while the QSGDefaultDistanceFieldGlyphCache object is still in use, which is what happens during the undocking process described above. The cache is left holding a dangling pointer to the function table of the now-destroyed context, and that memory has been returned to the allocator. The memory pointed to by m_funcs may therefore be reallocated elsewhere in the application and overwritten, corrupting the function table and typically ending in a segmentation fault.
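To make the lifetime problem concrete, here is a minimal standalone model in plain C++ (no Qt). FakeFunctions, FakeContext, and CachingGlyphCache are hypothetical stand-ins for QOpenGLFunctions, the OpenGL context, and QSGDefaultDistanceFieldGlyphCache; they are not the real Qt classes. The real cache holds a raw pointer. The model uses std::weak_ptr instead, so that the stale state can be observed safely rather than by dereferencing freed memory.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for QOpenGLFunctions: a table owned by a context.
struct FakeFunctions {
    int contextId;
};

// Hypothetical stand-in for the OpenGL context: it owns its function table.
struct FakeContext {
    explicit FakeContext(int id)
        : funcs(std::make_shared<FakeFunctions>(FakeFunctions{id})) {}
    std::shared_ptr<FakeFunctions> funcs;
};

// Models a cache that captures the function table once, at construction.
// A weak_ptr replaces the raw pointer so staleness is detectable without
// undefined behavior.
class CachingGlyphCache {
public:
    explicit CachingGlyphCache(const FakeContext &ctx) : m_funcs(ctx.funcs) {}
    bool functionsStillValid() const { return !m_funcs.expired(); }

private:
    std::weak_ptr<FakeFunctions> m_funcs;
};

// Returns true when the cache was valid before the context was destroyed
// and stale afterwards, i.e. the cache outlived its context.
bool cacheOutlivesContext() {
    auto ctx = std::make_unique<FakeContext>(1);
    CachingGlyphCache cache(*ctx);
    const bool validBefore = cache.functionsStillValid();
    ctx.reset();  // models the context being destroyed during undocking
    const bool validAfter = cache.functionsStillValid();
    return validBefore && !validAfter;
}
```

In the model, cacheOutlivesContext() reports that the captured table is gone once the context is reset. With a raw pointer, as in the real class, the same sequence leaves a pointer that still looks valid but refers to freed memory.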
Here's an example of the stack trace at the point of the crash:
It's not clear to me what the appropriate fix is here. I found that if I discontinued use of m_funcs altogether and instead always used the current OpenGL context to provide the QOpenGLFunctions object, the crash seemed to go away. For reference, I attached a quick patch that implements this (although I'm sure it needs some massaging before it would be suitable for merging).
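The approach described above (asking the current context for its functions at each use instead of caching them) can be sketched with a similar plain C++ model. This is not the attached patch; ModelContext, ModelFunctions, and LookupGlyphCache are hypothetical names, and the static currentContext() only imitates the role that QOpenGLContext::currentContext() plays in Qt.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for a per-context function table.
struct ModelFunctions {
    int contextId;
};

// Hypothetical stand-in for an OpenGL context with a "current" notion.
class ModelContext {
public:
    explicit ModelContext(int id) : m_funcs{id} {}
    ~ModelContext() {
        // A destroyed context can no longer be current.
        if (s_current == this)
            s_current = nullptr;
    }
    ModelFunctions *functions() { return &m_funcs; }
    void makeCurrent() { s_current = this; }
    static ModelContext *currentContext() { return s_current; }

private:
    ModelFunctions m_funcs;
    static ModelContext *s_current;
};
ModelContext *ModelContext::s_current = nullptr;

// Models a cache that stores no function pointer at all and instead
// resolves the functions from the current context on every use.
class LookupGlyphCache {
public:
    // Returns the id of the context whose functions would be used,
    // or -1 if no context is current.
    int contextIdInUse() const {
        ModelContext *ctx = ModelContext::currentContext();
        return ctx ? ctx->functions()->contextId : -1;
    }
};

// Destroys and recreates the context while the cache stays alive, and
// returns the context ids seen by the cache before and after.
std::pair<int, int> idsAcrossRecreation() {
    LookupGlyphCache cache;
    auto first = std::make_unique<ModelContext>(1);
    first->makeCurrent();
    const int before = cache.contextIdInUse();
    first.reset();  // old context destroyed, as on undock
    auto second = std::make_unique<ModelContext>(2);
    second->makeCurrent();
    const int after = cache.contextIdInUse();
    return {before, after};
}
```

Because the lookup happens at the point of use, the cache in this model always sees the table of whichever context is current and never holds a pointer across a context's destruction. The trade-off is an extra lookup per use and a requirement that a context actually is current whenever the cache is used.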