QTBUG-78976

WIN64: failure to link Qt5Gui generated by ICC 19.x under LTCG or IPO

Details

    • Type: Bug
    • Resolution: Done
    • Priority: P3: Somewhat important
    • Fix Version/s: 5.13.2
    • Affects Version/s: 5.13.1, 5.14.0 Alpha
    • Component/s: Build System
    • Labels: None
    • Environment: ICC 19.1 Technical Preview (package 085), ICC 19.0.5, Windows 10 x64, Windows SDK 10.0.18362.0, MSVC 2019 16.3.1 (VC++ 14.22), MSVC 2019 16.4 preview 1 (VC++ 14.24.28117)
    • Platform/s: Windows
    • Commits: da12c06b99ef1a41b0ee7f84516a928d1a625ba6 (qt/qtbase/5.13.2)

    Description

      The following elusive, ICE-like (internal compiler error) failure occurs at link time when building any Qt tool or example that uses Qt5Gui with ICC 19.x under LTCG/IPO:

      xilink: remark #10397: optimization reports are generated in *.optrpt files in the output location
      ipo-1: warning #11031: disabling user-directed function packaging (COMDATs)
      ipo-2: warning #11031: disabling user-directed function packaging (COMDATs)
      xilink: remark #10397: optimization reports are generated in *.optrpt files in the output location
      
                ": internal error: 010101_0
      
      xilink: error #10014: problem during multi-file optimization compilation (code 4)
      xilink: error #10014: problem during multi-file optimization compilation (code 4)
      jom: C:\qt5\qtbase\examples\widgets\mainwindows\mainwindow\Makefile.Release [release\mainwindow.exe] Error 4

      Test case:

      configure -prefix C:\qt-icc -release -ltcg -mp -platform win32-icc -opengl desktop -opensource -confirm-license -skip qtwebengine -qt-pcre -qt-libpng -qt-libjpeg -sql-sqlite -qt-freetype -avx2 -c++std c++17
      cd C:\qt5\qtbase\examples\widgets\mainwindows\mainwindow\
      jom -j1
      

      Conservative patch:

      Line 25 of mkspecs/win32-icc/qmake.conf (https://github.com/qt/qtbase/blob/5.14/mkspecs/win32-icc/qmake.conf) is wrong, because this is not the correct option to disable IPO with ICC 19 on Windows:

      QMAKE_CFLAGS_DISABLE_LTCG = -Qno-ipo

      The correct form should be:

      QMAKE_CFLAGS_DISABLE_LTCG = -Qipo-

      As a result, the "simd" CONFIG feature, which controls how the ISA-specific optimized sources are compiled in simd.prf, now correctly disables IPO for those sources on intel_icl (line 31 of https://github.com/qt/qtbase/blob/5.14/mkspecs/features/simd.prf):

      ltcg: cflags += $$QMAKE_CFLAGS_DISABLE_LTCG
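
      To check which value a given mkspec actually supplies for this variable, a throwaway project along the following lines may help. This is only a sketch: the file name check_ltcg.pro is hypothetical, and the project is not part of Qt. Running qmake on it with the win32-icc mkspec should print the flag that simd.prf would append inside its ltcg scope.

      ```
      # check_ltcg.pro: hypothetical scratch project, not part of Qt
      TEMPLATE = aux

      # Print the flag the active mkspec uses to switch IPO/LTCG off.
      message("QMAKE_CFLAGS_DISABLE_LTCG = $$QMAKE_CFLAGS_DISABLE_LTCG")

      # simd.prf only appends that flag when ltcg is present in CONFIG.
      ltcg: message("ltcg is in CONFIG")
      else: message("ltcg is not in CONFIG")
      ```

      Before the conservative patch the message should show -Qno-ipo, and after it -Qipo-.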

      For the adventurous: patch for enabling full LTCG/IPO with ICC 19.x (successfully tested in all of the aforementioned environments):

      Change line 25 (https://github.com/qt/qtbase/blob/5.14/mkspecs/win32-icc/qmake.conf) to:

      QMAKE_CFLAGS_DISABLE_LTCG = -Qipo-

      Add this line to the win32-icc mkspec (https://github.com/qt/qtbase/blob/5.14/mkspecs/win32-icc/qmake.conf):

      QMAKE_CFLAGS_HIGHEST_ISA  = -QxHost

      Change simd.prf (line 21 https://github.com/qt/qtbase/blob/5.14/mkspecs/features/simd.prf):

      defineTest(addSimdCompiler) {
          name = $$1
          upname = $$upper($$name)
          headers_var = $${upname}_HEADERS
          sources_var = $${upname}_SOURCES
          csources_var = $${upname}_C_SOURCES
          asm_var = $${upname}_ASM
      
          CONFIG($$1) {
              cflags = $$eval(QMAKE_CFLAGS_$${upname})
              ltcg: cflags += $$QMAKE_CFLAGS_DISABLE_LTCG
              contains(QT_CPU_FEATURES, $$name) {
                  # Default compiler settings include this feature, so just add to SOURCES
                  SOURCES += $$eval($$sources_var)
                  export(SOURCES)
              } else {
                  # We need special compiler flags
      

      To:

      defineTest(addSimdCompiler) {
          name = $$1
          upname = $$upper($$name)
          headers_var = $${upname}_HEADERS
          sources_var = $${upname}_SOURCES
          csources_var = $${upname}_C_SOURCES
          asm_var = $${upname}_ASM
      
          CONFIG($$1) {
              ltcg {
                  intel_icl: cflags = $$eval(QMAKE_CFLAGS_HIGHEST_ISA) $$QMAKE_CFLAGS_LTCG
                  else: cflags = $$eval(QMAKE_CFLAGS_$${upname}) $$QMAKE_CFLAGS_DISABLE_LTCG
              } else {
                  cflags = $$eval(QMAKE_CFLAGS_$${upname})
              }
              contains(QT_CPU_FEATURES, $$name) {
                  # Default compiler settings include this feature, so just add to SOURCES
                  SOURCES += $$eval($$sources_var)
                  export(SOURCES)
              } else {
                  # We need special compiler flags
      

      According to this commit (Fix leaking ISA extensions in LTCG builds), "(...) common subexpression elimination instruction set extensions may leak from the objects where they were enabled when doing link-time optimizations. To avoid that this patch disables LTCG/LTO on files built with extra instruction set extensions". While this may be true for other compilers that perform interprocedural optimizations, whether per translation unit or on the whole program, it seems that ICC 19.x can successfully apply whole-program global optimizations across object files using different extra ISA sets. The only requirement is that the intermediate language (IL) in the (mock) object files generated at compile time uses the same feature set setting (e.g., -xCORE-AVX2 or -xSSE4.1; spelled -QxCORE-AVX2 and -QxSSE4.1 on Windows) across all files and at link time. Alternatively, all object files can be compiled and linked with -xHOST (-QxHost on Windows), which tells the compiler to generate instructions for the highest instruction set available on the compilation host processor.

      In short, the error with IPO arises because "simd.prf" makes, e.g., the qdrawhelper_avx2, qdrawhelper_sse4 and qdrawhelper_sse2 units generate intermediate objects with different ISA targets (-QxCORE-AVX2, -QxSSE4.1, -QxSSE2) instead of the highest compatible ISA (in this case -QxCORE-AVX2), which confuses the linker in the final phase.
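
      The per-unit flag mismatch described above can be inspected from qmake itself. The following is a minimal sketch, assuming a scratch project (the file name isa_flags.pro is hypothetical) processed with the win32-icc mkspec; it only prints what the mkspec defines and does not change the build.

      ```
      # isa_flags.pro: hypothetical scratch project, not part of Qt
      TEMPLATE = aux

      # ISA names as they appear in the QMAKE_CFLAGS_<ISA> mkspec variables.
      for(isa, $$list(SSE2 SSE4_1 AVX2)) {
          message("QMAKE_CFLAGS_$${isa} = $$eval(QMAKE_CFLAGS_$${isa})")
      }

      # Stays empty unless the mkspec defines it, as the patch above proposes.
      message("QMAKE_CFLAGS_HIGHEST_ISA = $$QMAKE_CFLAGS_HIGHEST_ISA")
      ```

      If the three ISA variables print different -Qx values, the SIMD units are being compiled for different targets, which is the situation the report identifies as the trigger for the IPO failure.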

      ICC 19.x is able to analyze global optimization opportunities across intermediate object files and automatically avoid the "border" or leak effects in programs built with extra ISA. Additionally, it avoids AVX-SSE transition penalties by generating equivalent AVX-128 code from SSE intrinsics and skipping _mm256_zeroupper() where appropriate, provided all the objects are compiled and linked using the same (or the highest available) feature set setting. According to the "Intel 64 and IA-32 Architectures Optimization Reference Manual" and Intel's "Avoiding AVX-SSE Transition Penalties", this is the recommended procedure if you are targeting only one architecture per program. It is not the recommended procedure if you want to use Intel's auto-dispatching feature (e.g., -axCORE-AVX2, -axSSE4.1). Some critical modules may work better when using only SSE, but the automatic coarse-grained optimizations made by ICC 19.x seem to compile programs mixing SSE and AVX successfully.

          People

            Assignee: qtbuildsystem Qt Build System Team
            Reporter: fboni Francisco Boni Neto