Qt / QTBUG-7049

The Qt Reg-Exp engine sometimes gives false positive empty match

Details

    • Type: Bug
    • Resolution: Done
    • Priority: P3: Somewhat important
    • Fix Version/s: 4.7.0
    • Affects Version/s: 4.5.2, 4.5.3, 4.6.0
    • None
    • Commit: dadb99ea2c59d7d0f7a83134b7df5aaaaf80a995

    Description

      STEPS LEADING TO PROBLEM:

      Compile and execute the following program:

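      #include <QDebug>
      #include <QRegExp>

      int main(int argc, char** argv)
      {
        QRegExp r("(a)|(b)");
        // Search "xxb" starting at offset 1 and print the index of the match.
        qDebug() << r.indexIn("xxb", 1);
        // Positions of the whole match and of each capture group.
        qDebug() << r.pos(0) << r.pos(1) << r.pos(2);
        // Captured text of the whole match and of each capture group.
        qDebug() << r.cap(0) << r.cap(1) << r.cap(2);
        return 0;
      }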

      EXPECTED OUTCOME:

      The output of the program is
      2
      2 -1 2
      "b" "" "b"

      EXPLANATION:

      Since cap(1) is an empty string, pos(1) should be -1 according to the documentation at http://doc.trolltech.com/4.6/qregexp.html#pos
      [QUOTATION]
      For zero-length matches, pos() always returns -1. (For example, if cap(4)
      would return an empty string, pos(4) returns -1.) This is a feature of the
      implementation.
      [/QUOTATION]

      ACTUAL OUTCOME:

      The output of the program is
      2
      2 0 2
      "b" "" "b"

      WHY IS IT IMPORTANT:
      For a generic regular expression of the form (...something-a...)|(...something-b...), I am looking for a way to find out which of the two (or more) alternatives, "something-a" or "something-b", was found. My first attempt was to compare pos(1) with -1, but that does not work: 0 is returned, although according to the documentation -1 should be returned. A workaround in this case is to compare cap(1).length() with zero.
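
The workaround described above can be sketched for any number of alternatives. The sketch below is written against std::regex rather than QRegExp so that it builds without Qt; with QRegExp the same test would be cap(i).length() != 0. The helper names first_nonempty_group and which_alternative are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: returns the 1-based index of the first capture group
// whose captured text is non-empty, or 0 if every group is empty.
// This mirrors the "compare cap(i).length() with zero" workaround.
std::size_t first_nonempty_group(const std::smatch& m)
{
    for (std::size_t i = 1; i < m.size(); ++i) {
        if (m[i].length() != 0)
            return i;
    }
    return 0;
}

// Hypothetical helper: searches `text` from `offset` and reports which
// alternative (capture group) captured text. Returns 0 if nothing matched
// or if all groups are empty.
std::size_t which_alternative(const std::string& pattern,
                              const std::string& text,
                              std::size_t offset = 0)
{
    if (offset > text.size())
        return 0;
    const std::regex re(pattern);
    std::smatch m;
    if (!std::regex_search(text.cbegin() + offset, text.cend(), m, re))
        return 0;
    return first_nonempty_group(m);
}
```

For the pattern and input from this report, which_alternative("(a)|(b)", "xxb", 1) selects group 2. The length test cannot tell a group that matched an empty string from a group that did not take part in the match, so it is only suitable when no alternative can match the empty string.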

      WHY FIXING ONLY THIS BUG IS NOT ENOUGH:
      Furthermore, it would be very useful to know how to find out whether the second of the two alternatives was found in the following case:
      QRegExp r("(a*)|(b*)");
      r.indexIn("axx");     // first case
      r.indexIn("axx", 1);  // second case
      In the first case the answer is "no"; in the second case it is "yes". However, cap(2) returns "" (an empty string) in both cases.
      Thus I would like to have a method bool QRegExp::is_matched(int i) that returns true if and only if the i-th sub-pattern was matched.
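
As an illustration of the requested semantics only, the C++ standard library exposes a per-group flag of this kind as std::sub_match::matched. The sketch below uses std::regex, not QRegExp, and its alternation rules may differ from those of the Qt engine; GroupStatus and group_status are hypothetical names introduced for this example.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical result type: whether a group took part in the match,
// and the text it captured (empty if it did not take part).
struct GroupStatus {
    bool matched;
    std::string text;
};

// Hypothetical helper: runs a search and reports the status of one capture
// group. An out-of-range group or a failed search yields {false, ""}.
GroupStatus group_status(const std::string& pattern,
                         const std::string& text,
                         std::size_t group)
{
    const std::regex re(pattern);
    std::smatch m;
    if (!std::regex_search(text, m, re) || group >= m.size())
        return GroupStatus{false, std::string()};
    // sub_match::matched is true only if the group participated in the match,
    // even when the text it captured is empty.
    return GroupStatus{m[group].matched, m[group].str()};
}
```

With the pattern "(a*)|(b*)" and the input "xx", both groups capture an empty string under std::regex, but only group 1 reports matched == true. That distinction between "matched an empty string" and "did not match" is what a method such as is_matched(int) would provide.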

          People

            Assignee: João Abecasis (biochimia)
            Reporter: Ilya Dogolazky (ilya)