Details
-
Bug
-
Resolution: Done
-
P1: Critical
-
5.11.0
-
None
-
Python 2.7 or 3.6.6 built against 10.6 SDK (https://www.python.org/ftp/python/3.6.6/python-3.6.6-macosx10.6.pkg)
macOS 10.12
Qt 5.12 against 10.12 SDK
PySide2 5.11.0 against 10.12 SDK
-
-
qtbase 22c1a46a03bc3347afc0e7462e19558283d0e1b7
Description
Problem
When PySide (5.11 or dev) is built against Qt 5.12 (dev) on macOS 10.12 and run with the official Python package (2 or 3, it doesn't matter), which is built against the macOS 10.6 SDK, no widgets are painted, all GUI examples are broken, and a number of tests in the CI hang.
If you use a self-compiled Python, or a newer Python package built against a newer macOS SDK (10.9), everything works properly.
The current workaround that lets tests and CI integrations pass is at https://codereview.qt-project.org/#/c/233830/
What follows is the story of investigating this obscure bug.
Investigation
The Python script used for testing was examples/tutorial/t1.py from the pyside-setup repo.
After some poking around in the Qt cocoa QPA plugin, and enabling the following logging categories
QT_LOGGING_RULES="qt.qpa.window=true;qt.qpa.drawing=true;qt.qpa.screen=true;qt.qpa.windows=true"
as well as sprinkling some additional qCDebug()s inside qnsview_drawing.mm,
the first observed issue was that -[QNSView drawLayer] was not called, and hence no widget content was displayed. Much later I noticed that the backingLayer was not created at all, which proved to be crucial.
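The logging setup described above can also be applied from a small launcher instead of the shell. This is only a sketch: the helper names build_debug_env and run_example are made up for illustration, and the script path is the t1.py example mentioned earlier.

```python
import os
import subprocess
import sys

# Logging categories quoted in the report above.
QPA_CATEGORIES = ("qt.qpa.window", "qt.qpa.drawing", "qt.qpa.screen", "qt.qpa.windows")


def build_debug_env(base_env=None):
    """Return a copy of the environment with the QPA logging rules enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["QT_LOGGING_RULES"] = ";".join(f"{category}=true" for category in QPA_CATEGORIES)
    return env


def run_example(script="examples/tutorial/t1.py"):
    """Run the example with the current interpreter and the logging rules set.

    Hypothetical helper; the path assumes the pyside-setup checkout is the
    current working directory.
    """
    return subprocess.call([sys.executable, script], env=build_debug_env())
```

Running the example through the same interpreter binary matters here, because the behaviour depends on which Python build hosts the process.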
After a great many test builds I determined that the widgets got painted (and drawLayer was called) if the official Python in use was built with its minimum deployment target set to 10.9, but not with 10.6.
And yet when I built Python myself with the deployment target set to 10.6, everything worked. So that ended up being a bit of a dead end; we will come back to it later.
At the same time I was running tests with Qt 5.11, and there everything worked, even with deployment target 10.6. After some advice from colleagues and scouring the git log, the initial culprit commit in qtbase turned out to be https://codereview.qt-project.org/#/c/230735/ .
Qt switched to using layer-backed NSViews and that somehow broke rendering when using Python.
The first thing to try, which is also the workaround already committed as a temporary measure, was to set the environment variable QT_MAC_WANTS_LAYER=0, which disables layer-backed views. With that the issue went away!
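For a script that cannot rely on the shell environment, the same workaround could be applied from Python. A minimal sketch, assuming the variable only needs to be present before the QApplication is created; the helper name disable_layer_backing is hypothetical.

```python
import os


def disable_layer_backing(env=None):
    """Set QT_MAC_WANTS_LAYER=0 in the given mapping (os.environ by default)."""
    target = os.environ if env is None else env
    target["QT_MAC_WANTS_LAYER"] = "0"
    return target


# Intended usage (assumption: the variable is read when Qt creates its views,
# so it is set ahead of the PySide2 import to be on the safe side):
#
#   disable_layer_backing()
#   from PySide2.QtWidgets import QApplication
```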
Still I didn't know why layer-backed NSViews didn't work.
At some point it occurred to me to try building against Qt 5.11 and explicitly opt in to layer-backed NSViews. To my surprise everything worked, even with the official 10.6-built Python!
That means some change between Qt 5.11 and Qt 5.12 must have introduced a regression, because layer backing itself worked in 5.11!
After a lot of disassembly debugging, comparing execution flows, and re-reading the relevant commits, I found it: https://codereview.qt-project.org/#/c/223718/4
If I revert the commit by explicitly calling -[NSView setWantsLayer], then the backingLayer is created, -[QNSView displayLayer] is called, and drawing works!
Apparently it is not enough to just override -[NSView wantsLayer]; calling setWantsLayer has side effects which are important in our case.
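The difference between overriding the getter and calling the setter can be shown with a toy model. This is not AppKit code: ToyView and OverridingView are invented stand-ins, and the only behaviour modelled is the one observed above, namely that the setter creates the backing layer as a side effect.

```python
class ToyView:
    """Toy stand-in for NSView; illustrates the observed behaviour only."""

    def __init__(self):
        self._wants_layer = False
        self.backing_layer = None

    def wants_layer(self):
        return self._wants_layer

    def set_wants_layer(self, flag):
        # The side effect that matters: the setter creates (or drops) the layer.
        self._wants_layer = flag
        self.backing_layer = "layer" if flag else None


class OverridingView(ToyView):
    """Mimics a view that only overrides the getter and never calls the setter."""

    def wants_layer(self):
        return True


view = OverridingView()
print(view.wants_layer(), view.backing_layer)  # getter says yes, layer is missing
view.set_wants_layer(True)
print(view.wants_layer(), view.backing_layer)  # now the layer exists
```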
But remember that this issue only happened with the official 10.6-built Python, so this doesn't explain why I would need to call setWantsLayer explicitly for that build, but not for any other Python build.
I went back to disassembling and execution tracing between the working and non-working case, to see what was the difference. In the working case something must be calling -[NSView setWantsLayer] for us, or something similar to it.
After some time I found it, and this is the relevant decompiled source code:
    /* @class NSWindow */
    -(void)setContentView:(void *)arg2 {
        r14 = arg2;
        rbx = self;
        r12 = *ivar_offset(_contentView);
        rdi = *(rbx + r12);
        if (rdi != r14) goto loc_885c9;
        ....

    loc_885c9:
        [rdi removeFromSuperview];
        *(rbx + r12) = r14;
        rsi = rbx->_borderView;
        if (rsi != 0x0) {
            rax = [&var_50 frame];
        }
        else {
            var_40 = intrinsic_movaps(var_40, 0x0);
            var_50 = intrinsic_movaps(var_50, 0x0);
        }
        rax = [rbx styleMask];
        rax = [&var_70 contentRectForFrameRect:rbx styleMask:rax, r8];
        if ((__NSConstraintBasedLayout() != 0x0) && ([rbx _layoutEngine] != 0x0)) {
            [r14 setAutoresizingMask:0x12];
        }
        [r14 setTranslatesAutoresizingMaskIntoConstraints:0x1];
        [*(rbx + r12) setFrame:0x1];
        if (((__NSViewLayerBackWindowFrame() == 0x0) || ([*(rbx + r12) wantsLayer] == 0x0)) || ([rbx->_borderView wantsUpdateLayer] == 0x0)) goto loc_88753;   <---- important part

    loc_8870e:
        r13 = r12;
        r12 = *ivar_offset(_borderView);
        [*(rbx + r12) setWantsLayer:0x1];   <---- second important part
        rax = *ivar_offset(_auxiliaryStorage);
        var_78 = rax;
        rax = *(rbx + rax);
        rcx = *ivar_offset(_auxWFlags);
        rdx = *(rax + rcx);
        rsi = 0x100000 | *(int32_t *)(rax + rcx + 0x8);
        goto loc_887c1;
In the non-working 10.6 case __NSViewLayerBackWindowFrame() returned "0", so the call to setWantsLayer marked above was skipped.
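The guard in front of the setWantsLayer call in the listing above boils down to a three-way AND. The function below is a simplified model of that one branch, with a made-up name and boolean parameters standing in for the three calls in the decompiled condition; it is not a description of the rest of setContentView.

```python
def border_view_gets_layer(layer_back_window_frame, content_wants_layer, border_wants_update_layer):
    """Model of the condition guarding -[_borderView setWantsLayer:YES].

    The decompiled code jumps past the call if ANY of the three checks is 0,
    so the call is reached only when all three are non-zero.
    """
    return bool(layer_back_window_frame and content_wants_layer and border_wants_update_layer)


# Non-working case from the report: the app config check returns 0.
print(border_view_gets_layer(0, 1, 1))
# Working case: all three checks pass.
print(border_view_gets_layer(1, 1, 1))
```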
The code for "__NSViewLayerBackWindowFrame" is:
    void __NSViewLayerBackWindowFrame() {
        __NSGetBoolAppConfig(@"NSViewLayerBackWindowFrame", 0x1, _sNSViewLayerBackWindowFrameComputedValue, _NSViewLayerBackWindowFrameDefaultValueFunction);
        return;
    }
Googling for "NSViewLayerBackWindowFrame" returned a single result, unsurprisingly in Chromium: https://bugs.chromium.org/p/chromium/issues/detail?id=312462
Reading through that bug report and some other pages about NSGetBoolAppConfig, I found a way to set this app config property with the "defaults write" command-line tool.
First I enabled
defaults write -globalDomain NSLogUnusualAppConfig -bool YES
and reran the example with the 10.6-based Python to see this:
    2018-07-05 13:08:46.063 python[10379:1573175] NSLogUnusualAppConfig=YES
    2018-07-05 13:08:46.063 python[10379:1573175] NSSpacePerDisplay=YES
    2018-07-05 13:08:46.099 python[10379:1573175] NSScreenReturnsNilWhenEmpty=YES
    2018-07-05 13:08:46.136 python[10379:1573175] NSLayerPerformanceUpdates=NO
    2018-07-05 13:08:46.136 python[10379:1573175] NSWindowShouldValidateFirstResponder=NO
    2018-07-05 13:08:46.137 python[10379:1573175] NSButtonDelay=0.4
    2018-07-05 13:08:46.137 python[10379:1573175] NSButtonPeriod=0.075
    2018-07-05 13:08:46.139 python[10379:1573175] NSControlsUseWeakTargets=NO
    2018-07-05 13:08:46.139 python[10379:1573175] NSControlInvalidateLayout=NO
    2018-07-05 13:08:46.140 python[10379:1573175] NSViewLayoutCheckForResizeSubviewsOverride=YES
    2018-07-05 13:08:46.141 python[10379:1573175] NSWindowDisableTilingConstraintsOnDesktop=NO
    2018-07-05 13:08:46.142 python[10379:1573175] NSWindowAdjustSecondaryScreenFillingWindows=YES
    2018-07-05 13:08:46.143 python[10379:1573175] NSViewKeepLayersAround=NO
    2018-07-05 13:08:46.144 python[10379:1573175] NSViewDoLayoutBeforeSetDefaultKeyViewLoop=NO
    2018-07-05 13:08:46.154 python[10379:1573175] NSAlwaysMatchRequestedMaskOf1=YES
    2018-07-05 13:08:46.162 python[10379:1573175] NSImageLeakSystemAndBundleImages=YES
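A listing like this is easier to compare against a second run once it is parsed into key/value pairs. A minimal sketch, assuming one log entry per line in the format shown above; parse_app_config and diff_app_config are hypothetical helper names.

```python
import re

# Matches the trailing "Key=Value" of a line such as
# "2018-07-05 13:08:46.136 python[10379:1573175] NSLayerPerformanceUpdates=NO"
_ENTRY = re.compile(r"\]\s+(\w+)=(\S+)\s*$")


def parse_app_config(log_text):
    """Collect Key=Value pairs from NSLogUnusualAppConfig output."""
    config = {}
    for line in log_text.splitlines():
        match = _ENTRY.search(line)
        if match:
            config[match.group(1)] = match.group(2)
    return config


def diff_app_config(first, second):
    """Return {key: (value_in_first, value_in_second)} for every differing key.

    A key that is missing from one run shows up with None on that side.
    """
    keys = sorted(set(first) | set(second))
    return {k: (first.get(k), second.get(k)) for k in keys if first.get(k) != second.get(k)}
```

Feeding the captured output of two runs, one per interpreter, through parse_app_config and then diff_app_config leaves only the keys worth experimenting with.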
I then reran the same with the 10.9-based Python and diffed the two lists to see which config values differ. After some trial and error I tried this:
    defaults write -globalDomain NSLayerPerformanceUpdates -bool YES
    defaults write -globalDomain NSViewLayerBackWindowFrame -bool YES
And then the example worked with the 10.6-based Python!
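Because these are written to the global defaults domain, they affect every application for the current user and are worth removing again after the experiment. The sketch below builds the command lines for both directions; the function names are made up, and the "defaults delete" form is the standard counterpart of "defaults write" rather than something taken from the investigation itself.

```python
import subprocess

# The two keys that made the example work in the experiment above.
KEYS = ("NSLayerPerformanceUpdates", "NSViewLayerBackWindowFrame")


def defaults_write_cmd(key, value=True):
    """Build the 'defaults write' command line for a boolean global default."""
    return ["defaults", "write", "-globalDomain", key, "-bool", "YES" if value else "NO"]


def defaults_delete_cmd(key):
    """Build the 'defaults delete' command line that removes the key again."""
    return ["defaults", "delete", "-globalDomain", key]


def apply(commands, runner=subprocess.check_call):
    """Run each command; the runner can be swapped out for a dry run."""
    for command in commands:
        runner(command)


# Dry run that only prints the commands:
apply([defaults_write_cmd(k) for k in KEYS], runner=print)
```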
Apparently it's not the minimum deployment target that is important, but the version of the SDK against which you build! Depending on the SDK you build against, certain silent behaviour changes can happen via these secret / private App config values.
The official Python is built on OS X 10.6 against the 10.6 SDK, whereas I was building with the 10.12 SDK, which explains why I couldn't reproduce the issue with a self-built Python.
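One way to tell the deployment target and the SDK apart for a given interpreter binary is to look at its Mach-O load commands with "otool -l". This was not part of the investigation above; it is a sketch that assumes the binary carries an LC_VERSION_MIN_MACOSX load command with "version" and "sdk" fields (binaries from newer toolchains use LC_BUILD_VERSION instead, which this does not handle). The function name is hypothetical.

```python
def parse_version_min(otool_output):
    """Return (deployment_target, sdk) from 'otool -l' text, or None if absent."""
    in_version_min = False
    version = sdk = None
    for line in otool_output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # e.g. "Load command 9" headers
        name, value = fields
        if name == "cmd":
            # Track whether the following fields belong to the command we want.
            in_version_min = value == "LC_VERSION_MIN_MACOSX"
        elif in_version_min and name == "version":
            version = value
        elif in_version_min and name == "sdk":
            sdk = value
    if version is None and sdk is None:
        return None
    return version, sdk
```

The text to parse could come from something like subprocess.check_output(["otool", "-l", path_to_python], universal_newlines=True), where path_to_python is the interpreter binary under test.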
Solution
The cleanest solution for making layer-backed NSViews work with a Python built against an old SDK is to revert https://codereview.qt-project.org/#/c/223718/4 and explicitly call -[NSView setWantsLayer] ourselves.
Addendum
Here is the disassembly of -[NSView setWantsLayer] and -[NSView _doSetWantsLayerYES], which shows why it's important to call the former explicitly. setWantsLayer calls "_doSetWantsLayerYES", which ends up creating the backingLayer. In the 10.6 SDK case neither Qt code nor AppKit code called setWantsLayer, so no backing layer was created and nothing was drawn.
    -(void)setWantsLayer:(char)arg2 {
        r14 = arg2;
        r12 = _cmd;
        rbx = self;
        rax = *ivar_offset(_viewAuxiliary);
        r13 = rbx + rax;
        rax = *(rbx + rax);
        if (rax == 0x0) {
            [rbx _allocAuxiliary:0x0];
            rax = *r13;
        }
        if (*(int16_t *)&rax->_vFlags5 >= 0x0) {
            [rbx _setHasAutoCanDrawSubviewsIntoLayer:0x0];
            rax = *r13;
            if ((*(int8_t *)(rax + *ivar_offset(_vFlags5) + 0x1) & 0x40) != 0x0) {
                [rbx _didChangeAutoSetWantsLayer:0x0];
                rax = *r13;
            }
        }
        r15 = sign_extend_64(r14);
        if ((*(int32_t *)&rax->_vFlags2 >> 0x1b & 0x1) != r15) {
            [rbx->_window _lockViewHierarchyForModificationWithExceptionHandler:0x0];
            if (__NSDebugLayerActivity() != 0x0) {
                NSLog(@"-[%@(%p) %@%ld]", [rbx class], rbx, NSStringFromSelector(r12), sign_extend_64(r14));
            }
            _os_nospin_lock_lock(__NSViewAuxiliaryBitfieldLock);
            rax = *r13;
            rcx = *ivar_offset(_vFlags2);
            *(int32_t *)(rax + rcx) = 0xfffffffff7ffffff & *(int32_t *)(rax + rcx) | (r15 & 0x1) << 0x1b;
            _os_nospin_lock_unlock(__NSViewAuxiliaryBitfieldLock);
            if (r14 != 0x0) {
                [rbx _doSetWantsLayerYES];
            }
            else {
                [rbx _doSetWantsLayerNO];
            }
            if ([rbx layerContentsRedrawPolicy] > 0x0) {
                [rbx setNeedsDisplay:0x1];
            }
            [rbx->_window _unlockViewHierarchyForModification];
            if (0x0 != 0x0) {
                objc_exception_rethrow();
            }
        }
        return;
    }

    /* @class NSView */
    -(void)_doSetWantsLayerYES {
        rbx = self;
        r15 = [CATransaction disableActions];
        [CATransaction setDisableActions:0x1];
        if ((__NSGetBoolAppConfig(@"NSViewLayerBackWindowFrame", 0x1, _sNSViewLayerBackWindowFrameComputedValue, _NSViewLayerBackWindowFrameDefaultValueFunction) != 0x0) && ([rbx _canAutoLayerBackBorderView] != 0x0)) {
            [[rbx->_window _borderView] _setHasAutoSetWantsLayer:0x1];
        }
        r13 = *ivar_offset(_viewAuxiliary);
        rax = *(rbx + r13);
        if ((*(int8_t *)(rax + *ivar_offset(_vFlags2) + 0x3) & 0x10) == 0x0) {
            if ((*(int8_t *)(rax + *ivar_offset(_vFlags4) + 0x2) & 0x80) != 0x0) {
                [rbx _setSurfaceBacked:0x0];
            }
            _os_nospin_lock_lock(__NSViewAuxiliaryBitfieldLock);
            rax = *(rbx + r13);
            rcx = *ivar_offset(_vFlags2);
            *(int32_t *)(rax + rcx) = *(int32_t *)(rax + rcx) | 0x10000000;
            _os_nospin_lock_unlock(__NSViewAuxiliaryBitfieldLock);
            _NSViewAttachLayerSurfaceIfRoot(rbx);
            [rbx _updateDrawDelegateForAlphaValue];
            if ((rbx->_layer == 0x0) && ((0x8000800 & *(int32_t *)(*(rbx + r13) + *ivar_offset(_vFlags2))) != 0x800)) {
                [rbx _createLayerAndInitialize];
            }
            rsi = @selector(_childrenGainedLayerTreeAncestor);
            rdi = rbx;
        }
        else {
            rdi = rbx->_superview;
            rsi = @selector(_insertMissingSubviewLayers);
        }
        (*_objc_msgSend)(rdi, rsi);
        [CATransaction setDisableActions:sign_extend_64(r15)];
        return;
    }
References
https://developer.apple.com/documentation/appkit/nsview/1483695-wantslayer?language=objc
https://developer.apple.com/documentation/quartzcore/calayerdelegate?language=objc#relationships
https://bugs.chromium.org/p/chromium/issues/detail?id=312462
https://mjtsai.com/blog/2013/12/31/defaults-for-debugging/
https://developer.apple.com/documentation/foundation/nsuserdefaults
https://ss64.com/osx/defaults.html
https://codereview.qt-project.org/#/c/233830/
https://codereview.qt-project.org/#/c/223718/4