  1. Qt
  2. QTBUG-25649

QRegExp/W3CXmlSchema11: \p{P} (punctuation class) does not match '-'

    Details

    • Type: Bug
    • Status: Reported
    • Priority: Not Evaluated
    • Resolution: Unresolved
    • Affects Version/s: 4.8.x
    • Fix Version/s: None
    • Component/s: XML: QtXmlPatterns
    • Labels:
      None
    • Environment:
      Seen in both 4.7.2 and current 4.8 branch (bddffd6d6). Tested on Linux 12.1, gcc 4.6.2

      Description

      According to http://www.regular-expressions.info/xml.html and http://www.regular-expressions.info/unicode.html,
      the regexp "\p{P}" should match any punctuation, including the dash ("-").
      While the subset "\p{Pd}" matches "-", the more generic "\p{P}" does not.

      (Stumbled upon this in a real-world XML Schema I'm not authorized to share)

      See unit test below: The first three QVERIFY pass, the last one fails; all four should pass.

      diff --git a/tests/auto/qregexp/tst_qregexp.cpp b/tests/auto/qregexp/tst_qregexp.cpp
      index d444558..11ed5d4 100644
      --- a/tests/auto/qregexp/tst_qregexp.cpp
      +++ b/tests/auto/qregexp/tst_qregexp.cpp
      @@ -74,6 +74,7 @@ private slots:
           void testEscapingWildcard();
           void testInvalidWildcard_data();
           void testInvalidWildcard();
      +    void testPunctuationDash();
           void caretAnchoredOptimization();
           void isEmpty();
           void prepareEngineOptimization();
      @@ -993,6 +994,19 @@ void tst_QRegExp::testInvalidWildcard()
           QCOMPARE(re.isValid(), isValid);
       }
       
      +void tst_QRegExp::testPunctuationDash()
      +{
      +    QRegExp re(QLatin1String("\\p{Pd}"));
      +    re.setPatternSyntax(QRegExp::W3CXmlSchema11);
      +    QVERIFY(!re.exactMatch(QLatin1String(".")));
      +    QVERIFY(re.exactMatch(QLatin1String("-")));
      +
      +    QRegExp re2(QLatin1String("\\p{P}"));
      +    re2.setPatternSyntax(QRegExp::W3CXmlSchema11);
      +    QVERIFY(re2.exactMatch(QLatin1String(".")));
      +    QVERIFY(re2.exactMatch(QLatin1String("-"))); //fails
      +}
      +
       void tst_QRegExp::caretAnchoredOptimization()
       {
           QString s = "--babnana---";
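
      The expectation in the test rests on how Unicode general categories nest: the one-letter class P covers every two-letter punctuation subcategory (Pc, Pd, Ps, Pe, Pi, Pf, Po), so any character matched by \p{Pd} should also be matched by \p{P}. The following is a minimal standalone sketch of that rule, not Qt's implementation. The function names generalCategory and matchesProperty and the tiny lookup table are hypothetical and cover only a few ASCII characters.

```cpp
#include <string>

// Hypothetical, hand-written lookup for a few ASCII characters only.
// A real engine would consult the full Unicode character database.
// Returns an empty string for characters this toy table does not know.
inline std::string generalCategory(char32_t c)
{
    switch (c) {
    case U'-': return "Pd"; // HYPHEN-MINUS: dash punctuation
    case U'.': return "Po"; // FULL STOP: other punctuation
    case U'_': return "Pc"; // LOW LINE: connector punctuation
    case U'(': return "Ps"; // LEFT PARENTHESIS: open punctuation
    case U')': return "Pe"; // RIGHT PARENTHESIS: close punctuation
    case U'+': return "Sm"; // PLUS SIGN: math symbol
    case U'a': return "Ll"; // LATIN SMALL LETTER A: lowercase letter
    default:   return "";   // unknown to this sketch
    }
}

// Hypothetical matcher for the name inside \p{...}.
// A one-letter name ("P") matches every subcategory starting with that
// letter; a two-letter name ("Pd") must match the category exactly.
inline bool matchesProperty(const std::string &property, char32_t c)
{
    const std::string category = generalCategory(c);
    if (category.empty() || property.empty())
        return false;
    if (property.size() == 1)
        return category[0] == property[0];
    return category == property;
}
```

      Under these assumptions, matchesProperty("P", U'-') and matchesProperty("Pd", U'-') are both true, mirroring the four QVERIFY checks in the test above.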

            People

            • Assignee: Unassigned
            • Reporter: Frank Osterfeld (frank.osterfeld)
            • Votes: 0
            • Watchers: 0
