Qt / QTBUG-127402

Use of PCRE2 JIT in QRegularExpression makes QRegularExpression unsafe for use inside routines that employ QtConcurrent

Details

    • Bug
    • Resolution: Invalid
    • Not Evaluated
    • None
    • 6.7.2
    • None
    • All platforms; Qt's internal PCRE2 defaults to JIT mode
    • All

    Description

      According to the PCRE2 docs, using the JIT makes PCRE2 unsafe in multithreaded environments unless very special handling is in place, and that handling is typically never employed.

      The problem is that QRegularExpression uses PCRE2 and enables the JIT. This makes using QRegularExpression in routines invoked by QtConcurrent a big problem: it causes random crashes in ~QExplicitlySharedDataPointer, as follows:

      Target 0: (Sigil) stopped.
      (lldb) bt
      * thread #48, name = 'Thread (pooled)', stop reason = EXC_BAD_ACCESS (code=1, address=0x228095ff8)
        * frame #0: 0x00000001078d4940 QtCore`sljit_free_exec + 240
          frame #1: 0x00000001078e8bb1 QtCore`_pcre2_jit_free_16 + 129
          frame #2: 0x00000001078c441f QtCore`pcre2_code_free_16 + 31
          frame #3: 0x000000010772f424 QtCore`QExplicitlySharedDataPointer<QRegularExpressionPrivate>::~QExplicitlySharedDataPointer() + 36
          frame #4: 0x00000001003c29b5 Sigil`GumboInterface::update_style_urls(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 1701
          frame #5: 0x00000001003c3576 Sigil`GumboInterface::build_attributes(GumboAttribute*, bool, bool, bool) + 1366
          frame #6: 0x00000001003b8f3d Sigil`GumboInterface::serialize(GumboInternalNode*, GumboInterface::UpdateTypes) + 829
          frame #7: 0x00000001003bd7f2 Sigil`GumboInterface::serialize_contents(GumboInternalNode*, GumboInterface::UpdateTypes) + 402
          frame #8: 0x00000001003b928f Sigil`GumboInterface::serialize(GumboInternalNode*, GumboInterface::UpdateTypes) + 1679
          frame #9: 0x00000001003bd7f2 Sigil`GumboInterface::serialize_contents(GumboInternalNode*, GumboInterface::UpdateTypes) + 402
          frame #10: 0x00000001003b928f Sigil`GumboInterface::serialize(GumboInternalNode*, GumboInterface::UpdateTypes) + 1679
          frame #11: 0x00000001003bd7f2 Sigil`GumboInterface::serialize_contents(GumboInternalNode*, GumboInterface::UpdateTypes) + 402
          frame #12: 0x00000001003b928f Sigil`GumboInterface::serialize(GumboInternalNode*, GumboInterface::UpdateTypes) + 1679
          frame #13: 0x00000001003bd7f2 Sigil`GumboInterface::serialize_contents(GumboInternalNode*, GumboInterface::UpdateTypes) + 402
          frame #14: 0x00000001003b928f Sigil`GumboInterface::serialize(GumboInternalNode*, GumboInterface::UpdateTypes) + 1679
          frame #15: 0x00000001003bd7f2 Sigil`GumboInterface::serialize_contents(GumboInternalNode*, GumboInterface::UpdateTypes) + 402
          frame #16: 0x00000001003b928f Sigil`GumboInterface::serialize(GumboInternalNode*, GumboInterface::UpdateTypes) + 1679
          frame #17: 0x00000001003bd7f2 Sigil`GumboInterface::serialize_contents(GumboInternalNode*, GumboInterface::UpdateTypes) + 402
          frame #18: 0x00000001003b8d31 Sigil`GumboInterface::serialize(GumboInternalNode*, GumboInterface::UpdateTypes) + 305
          frame #19: 0x00000001003bca35 Sigil`GumboInterface::perform_style_updates(QString const&, QString const&) + 309
          frame #20: 0x0000000100058bba Sigil`PerformHTMLUpdates::operator()() + 234
          frame #21: 0x000000010006d4df Sigil`UniversalUpdates::UpdateOneHTMLFile(HTMLResource*, QHash<QString, QString> const&, QHash<QString, QString> const&) + 239
          frame #22: 0x00000001000701fa Sigil`QtConcurrent::MappedEachKernel<QList<HTMLResource*>::const_iterator, std::__1::__bind<QString (&)(HTMLResource*, QHash<QString, QString> const&, QHash<QString, QString> const&), std::__1::placeholders::__ph<1> const&, QHash<QString, QString>&, QHash<QString, QString>&> >::runIteration(QList<HTMLResource*>::const_iterator, int, QString*) + 42
          frame #23: 0x0000000100070292 Sigil`QtConcurrent::MappedEachKernel<QList<HTMLResource*>::const_iterator, std::__1::__bind<QList<std::__1::pair<QString, QString> > (&)(HTMLResource*, QHash<QString, CSSInfo*> const&), std::__1::placeholders::__ph<1> const&, QHash<QString, CSSInfo*>&> >::runIterations(QList<HTMLResource*>::const_iterator, int, int, QList<std::__1::pair<QString, QString> >*) + 66
          frame #24: 0x00000001000705a7 Sigil`QtConcurrent::IterateKernel<QList<HTMLResource*>::const_iterator, QString>::forThreadFunction() + 359
          frame #25: 0x0000000100947996 QtConcurrent`QtConcurrent::ThreadEngineBase::run() + 134
          frame #26: 0x00000001076fb741 QtCore`QThreadPoolThread::run() + 225
          frame #27: 0x00000001076f39c9 QtCore`QThreadPrivate::start(void*) + 329
          frame #28: 0x00007ff8026851d3 libsystem_pthread.dylib`_pthread_start + 125
      
      

      If I force the use of:

      QT_ENABLE_REGEXP_JIT=0

      everything works.
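
The same workaround can be applied from inside the application instead of the shell. The following is a minimal sketch, not a definitive fix: the helper name `disableQtRegexpJit` is hypothetical, and it assumes that Qt reads `QT_ENABLE_REGEXP_JIT` no earlier than the first QRegularExpression use, so the call has to happen at the very top of `main()`, before any threads are started. In Qt code, `qputenv("QT_ENABLE_REGEXP_JIT", "0")` would be the more idiomatic equivalent of the platform calls used here.

```cpp
#include <cstdlib>   // std::getenv
#include <stdlib.h>  // setenv (POSIX) or _putenv_s (Windows CRT)

// Hypothetical helper (not part of Qt): sets QT_ENABLE_REGEXP_JIT=0 for the
// current process. Assumption: it is called before the first
// QRegularExpression is used and before any worker threads exist.
// Returns true if the environment variable was set successfully.
inline bool disableQtRegexpJit()
{
#ifdef _WIN32
    return _putenv_s("QT_ENABLE_REGEXP_JIT", "0") == 0;
#else
    return setenv("QT_ENABLE_REGEXP_JIT", "0", /*overwrite=*/1) == 0;
#endif
}
```

Calling `disableQtRegexpJit()` as the first statement of `main()` keeps the workaround in one place. It trades matching speed for stability, since every pattern then runs through the PCRE2 interpreter.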

      But nowhere does the documentation say that QRegularExpression is not thread-safe and that its code should not be used with QtConcurrent.

      If you plan to default to using the PCRE2 JIT within QRegularExpression, you should handle its use properly to make it thread-safe.

      See the PCRE2 docs here:

      https://www.pcre.org/current/doc/html/pcre2jit.html#SEC7

      And especially this part:

      In a multithread application, if you do not specify a JIT stack, or if you assign or pass back NULL from a callback, that is thread-safe, because each thread has its own machine stack. However, if you assign or pass back a non-NULL JIT stack, this must be a different stack for each thread so that the application is thread-safe.
      
      Strictly speaking, even more is allowed. You can assign the same non-NULL stack to a match context that is used by any number of patterns, as long as they are not used for matching by multiple threads at the same time. For example, you could use the same stack in all compiled patterns, with a global mutex in the callback to wait until the stack is available for use. However, this is an inefficient solution, and not recommended.
      
      This is a suggestion for how a multithreaded program that needs to set up non-default JIT stacks might operate:
      
        During thread initialization
          thread_local_var = pcre2_jit_stack_create(...)
      
        During thread exit
          pcre2_jit_stack_free(thread_local_var)
      
        Use a one-line callback function
          return thread_local_var
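
The three steps quoted above map naturally onto a C++ `thread_local` object. The sketch below shows only that pattern and makes no claim about how QRegularExpression is implemented. To stay compilable without PCRE2, it uses hypothetical stand-ins: `FakeJitStack`, `fakeJitStackCreate` and `fakeJitStackFree` take the place of `pcre2_jit_stack`, `pcre2_jit_stack_create()` and `pcre2_jit_stack_free()`, and the stack sizes are arbitrary illustration values.

```cpp
#include <cstddef>
#include <memory>
#include <thread>

// Hypothetical stand-in for pcre2_jit_stack.
struct FakeJitStack {
    std::size_t startSize;
    std::size_t maxSize;
};

// Hypothetical stand-in for pcre2_jit_stack_create().
inline FakeJitStack *fakeJitStackCreate(std::size_t startSize, std::size_t maxSize)
{
    return new FakeJitStack{startSize, maxSize};
}

// Hypothetical stand-in for pcre2_jit_stack_free().
inline void fakeJitStackFree(FakeJitStack *stack)
{
    delete stack;
}

// The "one-line callback": returns the calling thread's own stack.
// The thread_local object is created on the thread's first call
// ("thread initialization") and its deleter runs when that thread
// exits ("thread exit"), so no two threads ever share a stack.
inline FakeJitStack *jitStackCallback(void * /*callbackData*/)
{
    thread_local std::unique_ptr<FakeJitStack, void (*)(FakeJitStack *)> stack(
        fakeJitStackCreate(32 * 1024, 512 * 1024), fakeJitStackFree);
    return stack.get();
}

// Usage check: a worker thread must receive a different stack than the caller.
inline bool stacksDifferAcrossThreads()
{
    FakeJitStack *callerStack = jitStackCallback(nullptr);
    FakeJitStack *workerStack = nullptr;
    std::thread worker([&workerStack] { workerStack = jitStackCallback(nullptr); });
    worker.join();
    // workerStack has been freed at this point; only the addresses are compared.
    return workerStack != nullptr && workerStack != callerStack;
}
```

With the real library, a function shaped like `jitStackCallback` would be the callback registered through `pcre2_jit_stack_assign()`. The lazy `thread_local` initialization also covers pool threads, such as those used by QtConcurrent, which the application does not create itself.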
      

      Please fix QRegularExpression to be thread safe.

          People

            peppe Giuseppe D'Angelo
            kevinhendricks Kevin B. Hendricks