QTBUG-104362

QDomImplementation::DropInvalidChars strips emoji and other non-BMP characters


    Details

    • Commits:
      da0d7f61c8 (qt/qtbase/dev) da0d7f61c8 (qt/tqtc-qtbase/dev) c0aa21734d (qt/tqtc-qtbase/6.2) a65f67f5ab (qt/qtbase/6.3) a65f67f5ab (qt/tqtc-qtbase/6.3) 1892763c8c (qt/qtbase/6.4) 1892763c8c (qt/tqtc-qtbase/6.4) ce27e6f022 (qt/tqtc-qtbase/5.15)

      Description

      Downstream bug: https://bugs.kde.org/show_bug.cgi?id=455255

      We load the SVG XML by wrapping a QIODevice in a QXmlInputSource and passing it to QDomDocument::setContent. Once QDomImplementation::setInvalidDataPolicy(QDomImplementation::DropInvalidChars) has been called, QDomDocument::setContent strips the non-BMP characters (encoded as UTF-8 in the XML) from the document.

      QDomDocument loads the file as UTF-16 code units. In fixedCharData, it checks the code units one at a time with QXmlUtils::isChar, which rejects surrogate code units. Because every non-BMP code point is encoded in UTF-16 as a surrogate pair, both halves are rejected and all non-BMP code points are stripped.
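The mechanism can be illustrated without Qt. The following is a minimal sketch in plain C++, not the actual Qt code: `isXmlChar`, `dropInvalidPerUnit`, and `dropInvalidPerCodePoint` are hypothetical names introduced here. `isXmlChar` follows the XML 1.0 `Char` production; the first filter checks each UTF-16 code unit on its own (the behavior described above), and the second shows one possible surrogate-aware alternative, which is not necessarily how the issue was fixed in Qt.

```cpp
#include <cstddef>
#include <string>

// XML 1.0 "Char" production, applied to a full Unicode code point.
// Surrogates (U+D800..U+DFFF) are not valid XML characters on their own.
bool isXmlChar(char32_t c) {
    return c == 0x9 || c == 0xA || c == 0xD
        || (c >= 0x20 && c <= 0xD7FF)
        || (c >= 0xE000 && c <= 0xFFFD)
        || (c >= 0x10000 && c <= 0x10FFFF);
}

// Hypothetical stand-in for the reported behavior: every UTF-16 code unit
// is validated in isolation, so both halves of a surrogate pair are dropped.
std::u16string dropInvalidPerUnit(const std::u16string &in) {
    std::u16string out;
    for (char16_t unit : in) {
        if (isXmlChar(unit))
            out.push_back(unit);
    }
    return out;
}

// Hypothetical surrogate-aware variant: a high surrogate followed by a low
// surrogate is decoded to one code point and validated as a whole.
std::u16string dropInvalidPerCodePoint(const std::u16string &in) {
    std::u16string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        const char16_t unit = in[i];
        const bool isHigh = unit >= 0xD800 && unit <= 0xDBFF;
        if (isHigh && i + 1 < in.size()) {
            const char16_t next = in[i + 1];
            if (next >= 0xDC00 && next <= 0xDFFF) {
                const char32_t cp = 0x10000
                    + ((static_cast<char32_t>(unit) - 0xD800) << 10)
                    + (static_cast<char32_t>(next) - 0xDC00);
                if (isXmlChar(cp)) {
                    out.push_back(unit);
                    out.push_back(next);
                }
                ++i; // skip the low surrogate that was just consumed
                continue;
            }
        }
        // BMP character or lone surrogate: validate the unit by itself.
        if (isXmlChar(unit))
            out.push_back(unit);
    }
    return out;
}
```

With the input `u"a\U0001F600b"` (an emoji between two ASCII letters), the per-unit filter yields `u"ab"`, while the surrogate-aware variant returns the string unchanged. Lone surrogates are still dropped by both.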


            People

            Assignee:
            sonakur Sona Kurazyan
            Reporter:
            alvinhochun Alvin Wong
            Votes:
            0
            Watchers:
            2

