QTBUG-135033

QXmlStreamReader::addData() can parse Latin1 data incorrectly

Details

    • 2
    • 4b8659ebf (dev), cfda48772 (6.9), d852646d3 (6.8), 536610798 (tqtc/lts-6.5)
    • Foundation Sprint 128

    Description

      The patch https://codereview.qt-project.org/c/qt/qtbase/+/419210 converted the QXmlStreamReader constructor and the addData() method to take QAnyStringView.

      However, it didn't set the lockEncoding flag when handling Latin1 strings in addData().

      This can lead to an incorrect result when a Latin1-encoded XML document with a proper "encoding" attribute is passed as a Latin1 string to the addData() method:

      • first, the data is converted to UTF-8
      • later, the parser reads the "encoding" attribute and tries to decode the already-converted data again using the specified encoding.
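
      The two steps above can be modelled outside Qt. The following is a minimal sketch in plain C++, not the QXmlStreamReader implementation; the helper names latin1ToUtf8, decodeAsLatin1 and doubleConverted are hypothetical and exist only for illustration. It assumes the second step amounts to reading the UTF-8 bytes as if they were Latin1.

```cpp
#include <cassert>
#include <string>

// Step 1 (illustrative): encode Latin-1 bytes as UTF-8.
// Bytes below 0x80 are copied; the rest become two-byte sequences.
std::string latin1ToUtf8(const std::string &latin1)
{
    std::string utf8;
    for (unsigned char c : latin1) {
        if (c < 0x80) {
            utf8 += static_cast<char>(c);
        } else {
            utf8 += static_cast<char>(0xC0 | (c >> 6));
            utf8 += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    assert(utf8.size() >= latin1.size()); // UTF-8 never shrinks Latin-1 input
    return utf8;
}

// Step 2 (illustrative): decode bytes as Latin-1, one code point per byte.
std::u16string decodeAsLatin1(const std::string &bytes)
{
    std::u16string out;
    for (unsigned char c : bytes)
        out += static_cast<char16_t>(c);
    return out;
}

// Hypothetical model of the faulty path: the input is converted to UTF-8,
// then the UTF-8 bytes are decoded again as Latin-1.
std::u16string doubleConverted(const std::string &latin1)
{
    return decodeAsLatin1(latin1ToUtf8(latin1));
}
```

      Under these assumptions, doubleConverted("M\xE5rten") turns the single byte 0xE5 into the two code points U+00C3 and U+00A5, which matches the failing result reported in this issue, while decodeAsLatin1("M\xE5rten") alone yields the expected U+00E5.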

      A simple test that illustrates the problem:

      const auto in = "<?xml version=\"1.0\" encoding=\"iso-8859-1\"?>"
                      "<a>M\xE5rten</a>"_L1;
      QXmlStreamReader reader;
      reader.addData(in);
      QVERIFY(reader.readNextStartElement());
      QString text = reader.readElementText();
      QCOMPARE(text, "M\xE5rten"_L1); // FAIL! The result is "M\u00C3\u00A5rten"
      

      The QXmlStreamReader(QAnyStringView) constructor is not affected, because it already sets the flag correctly.

            People

              ivan.solovev Ivan Solovev
              ivan.solovev Ivan Solovev
              Vladimir Minenko Vladimir Minenko
              Alex Blasche Alex Blasche