  1. Qt
  2. QTBUG-63150

QXmlStreamWriter produces XML that cannot be read with QXmlStreamReader

Details

    • Bug
    • Resolution: Done
    • P2: Important
    • 5.11.0
    • 5.6.2, 5.7.1, 5.8.0, 5.9.1
    • None
    • 3b5b8f1d4ab8092e5dd337b7b4e32d85fda2e0b7

    Description

      Consider the following test case to be added to tst_qxmlstream.cpp:

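      // Needs <limits> for std::numeric_limits.
      void tst_QXmlStream::readBack() const
      {
          // Start of the current run of failing characters; max() means "no run open".
          ushort error = std::numeric_limits<ushort>::max();
      
          // Note: the loop stops before 0xffff, so that value itself is never written.
          for (ushort c = 0; c < std::numeric_limits<ushort>::max(); ++c) {
              QBuffer buffer;
      
              // Write a minimal document whose only text content is the character c.
              QVERIFY(buffer.open(QIODevice::WriteOnly));
              QXmlStreamWriter writer(&buffer);
              writer.writeStartDocument();
              writer.writeTextElement("a", QString(QChar(c)));
              writer.writeEndDocument();
              buffer.close();
      
              // Read the same bytes back and consume the whole document.
              QVERIFY(buffer.open(QIODevice::ReadOnly));
              QXmlStreamReader reader(&buffer);
              do {
                  reader.readNext();
              } while (!reader.atEnd());
      
              if (reader.hasError()) {
                  // First failure of a new run: remember it and show the XML.
                  if (error > c) {
                      error = c;
                      qDebug() << "problematic XML:" << buffer.data();
                  }
              } else if (error < c) {
                  // The run ended at c - 1: report it and reset.
                  qDebug() << showbase << hex << "range" << error << "-" << (c - 1) << "is problematic";
                  error = std::numeric_limits<ushort>::max();
              }
          }
      
          // A run still open after the loop is reported up to max().
          if (error < std::numeric_limits<ushort>::max()) {
              qDebug() << showbase << hex << "range" << error << "-"
                       << std::numeric_limits<ushort>::max() << "is problematic";
          }
      }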
      

      This writes one short XML snippet for each possible 16-bit character and reads it back. QXmlStreamReader rejects a number of those snippets, so they are apparently invalid XML. Running the test generates the following output:

      QDEBUG : tst_QXmlStream::readBack() problematic XML: "<?xml version=\"1.0\" encoding=\"UTF-8\"?><a>\x00</a>\n"
      QDEBUG : tst_QXmlStream::readBack() range 0x0 - 0x8 is problematic
      QDEBUG : tst_QXmlStream::readBack() problematic XML: "<?xml version=\"1.0\" encoding=\"UTF-8\"?><a>\x0B</a>\n"
      QDEBUG : tst_QXmlStream::readBack() range 0xb - 0xc is problematic
      QDEBUG : tst_QXmlStream::readBack() problematic XML: "<?xml version=\"1.0\" encoding=\"UTF-8\"?><a>\x0E</a>\n"
      QDEBUG : tst_QXmlStream::readBack() range 0xe - 0x1f is problematic
      QDEBUG : tst_QXmlStream::readBack() problematic XML: "<?xml version=\"1.0\" encoding=\"UTF-8\"?><a></?a>\n"
      QDEBUG : tst_QXmlStream::readBack() range 0xd800 - 0xdfff is problematic
      QDEBUG : tst_QXmlStream::readBack() problematic XML: "<?xml version=\"1.0\" encoding=\"UTF-8\"?><a>\xEF\xBF\xBE</a>\n"
      QDEBUG : tst_QXmlStream::readBack() range 0xfffe - 0xffff is problematic
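
      The rejected ranges line up with the complement of the XML 1.0 Char production (#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]) when it is restricted to single 16-bit values. The following is a minimal standalone sketch in plain C++ without Qt that derives those ranges from the production. The names isXmlChar16 and invalidRanges are made up for illustration and do not exist in Qt.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// XML 1.0 "Char" production, restricted to one 16-bit code unit:
//   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
// A surrogate (0xD800-0xDFFF) on its own is never a valid character.
// Hypothetical helper for illustration, not a Qt function.
constexpr bool isXmlChar16(std::uint16_t c)
{
    return c == 0x9 || c == 0xA || c == 0xD
        || (c >= 0x20 && c <= 0xD7FF)
        || (c >= 0xE000 && c <= 0xFFFD);
}

// Collects the inclusive ranges of 16-bit values that are not valid
// XML 1.0 characters, in ascending order.
std::vector<std::pair<unsigned, unsigned>> invalidRanges()
{
    std::vector<std::pair<unsigned, unsigned>> ranges;
    bool inRange = false;
    unsigned start = 0;

    // unsigned counter so that 0xFFFF itself is included without overflow
    for (unsigned c = 0; c <= 0xFFFF; ++c) {
        const bool bad = !isXmlChar16(static_cast<std::uint16_t>(c));
        if (bad && !inRange) {
            start = c;
            inRange = true;
        } else if (!bad && inRange) {
            ranges.push_back({start, c - 1});
            inRange = false;
        }
    }
    if (inRange)
        ranges.push_back({start, 0xFFFFu});
    return ranges;
}
```

      invalidRanges() yields 0x0 - 0x8, 0xb - 0xc, 0xe - 0x1f, 0xd800 - 0xdfff and 0xfffe - 0xffff, which are the ranges reported by the test.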
      

      QXmlStreamWriter should refuse to generate broken XML. It should escape the characters it can escape and ignore the others. The null character, for example, cannot be expressed in XML, and neither can lone surrogates in the range 0xd800 - 0xdfff. However, if I add code to QXmlStreamWriterPrivate::writeEscaped that ignores those characters, tst_QXmlStream::checkBaseline fails because it actually expects the broken XML to be written. What is the rationale behind this?
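
      The "escape what can be escaped, ignore the rest" behaviour asked for above could look roughly like the sketch below. It is standalone C++ on std::u16string, not the actual QXmlStreamWriterPrivate::writeEscaped code. The function name escapeOrDrop is hypothetical, and the choice to keep well-formed surrogate pairs while dropping lone surrogates is an assumption about what "ignore" should mean.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Qt: escapes markup characters and
// silently drops 16-bit code units that XML 1.0 cannot represent.
std::u16string escapeOrDrop(const std::u16string &in)
{
    std::u16string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        const char16_t c = in[i];
        const bool high = c >= 0xD800 && c <= 0xDBFF;
        const bool low = c >= 0xDC00 && c <= 0xDFFF;

        if (high && i + 1 < in.size()
            && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Well-formed surrogate pair (U+10000 and above): keep both units.
            out += c;
            out += in[++i];
            continue;
        }
        if (high || low)
            continue; // lone surrogate: cannot be expressed, drop it

        if (c == u'<')
            out += u"&lt;";
        else if (c == u'>')
            out += u"&gt;";
        else if (c == u'&')
            out += u"&amp;";
        else if (c == 0x9 || c == 0xA || c == 0xD || (c >= 0x20 && c <= 0xFFFD))
            out += c; // valid XML 1.0 character, written as-is
        // Everything else (0x0-0x8, 0xb-0xc, 0xe-0x1f, 0xfffe, 0xffff) is dropped.
    }
    return out;
}
```

      With a filter like this in front of the writer, every snippet produced by the test above would contain only characters from the XML 1.0 Char production. Whether dropping is preferable to reporting an error is a design decision this sketch does not settle.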

          People

            ulherman Ulf Hermann
            ulherman Ulf Hermann