Details
- Bug
- Resolution: Unresolved
- P3: Somewhat important
- None
- 6.2.0 Beta2
- None
Description
QStringDecoder currently seems to accept any UTF-32 input, including code points in the disallowed range U+D800 to U+DFFF, which is reserved for UTF-16 surrogates, and out-of-range values >= 0x110000. It also accepts input whose length is not a multiple of 4 bytes without indicating an error (although it does produce replacement characters). Similarly, QStrings constructed from invalid UCS-4 data do not indicate errors in any way.
Example:
QStringDecoder toUtf16(QStringDecoder::Utf32, QStringDecoder::Flag::Stateless);

const char32_t out_of_range[] = {0x4010300};
qDebug() << toUtf16(QByteArrayView(reinterpret_cast<const char *>(out_of_range), sizeof(out_of_range)))
         << toUtf16.hasError(); // "U+10300" false

const char32_t surrogates[] = {0xd800, 0xdf00};
qDebug() << toUtf16(QByteArrayView(reinterpret_cast<const char *>(surrogates), sizeof(surrogates)))
         << toUtf16.hasError(); // "U+10300" false
qDebug() << toUtf16(QByteArrayView(reinterpret_cast<const char *>(surrogates), sizeof(surrogates)-1))
         << toUtf16.hasError(); // "\uD800�" false
In the comments above, "U+10300" stands in for the actual character, which Jira cannot display. Note that hasError() returns false in every case, and changing the decoder flags does not seem to affect this.
For QStrings constructed from the same data, I would expect either an empty string or one that contains replacement characters.
This simplifies exploiting bugs like QTBUG-95689.
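For illustration, the checks this report expects can be sketched without Qt. This is only a sketch under stated assumptions, not Qt's implementation or a proposed patch: the names isValidScalar, DecodeResult and decodeUtf32Strict are hypothetical, and the input is assumed to be native-endian UTF-32 without a BOM.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper, not part of Qt: true if cp is a Unicode scalar value,
// i.e. below 0x110000 and outside the surrogate range U+D800 to U+DFFF.
constexpr bool isValidScalar(char32_t cp)
{
    const bool isSurrogate = cp >= 0xD800 && cp <= 0xDFFF;
    return !isSurrogate && cp < 0x110000;
}

// Hypothetical result type: the decoded UTF-16 text plus an error flag.
struct DecodeResult {
    std::u16string text;
    bool hasError = false;
};

// Hypothetical strict decoder for native-endian UTF-32 bytes (assumption:
// no BOM handling). Invalid code units and a truncated tail each become
// U+FFFD and set the error flag.
DecodeResult decodeUtf32Strict(const char *data, std::size_t size)
{
    DecodeResult result;
    const std::size_t count = size / sizeof(char32_t);
    for (std::size_t i = 0; i < count; ++i) {
        char32_t cp;
        std::memcpy(&cp, data + i * sizeof(char32_t), sizeof(cp));
        if (!isValidScalar(cp)) {
            result.text.push_back(u'\uFFFD');
            result.hasError = true;
        } else if (cp >= 0x10000) {
            // Encode a supplementary-plane code point as a surrogate pair.
            cp -= 0x10000;
            result.text.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            result.text.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            result.text.push_back(static_cast<char16_t>(cp));
        }
    }
    if (size % sizeof(char32_t) != 0) {
        // Trailing bytes that do not form a complete UTF-32 code unit.
        result.text.push_back(u'\uFFFD');
        result.hasError = true;
    }
    return result;
}
```

With the three inputs from the example, this sketch sets hasError in every case and emits replacement characters instead of passing the surrogates or the out-of-range value through.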
Issue Links
- relates to: QTBUG-95689 Missing overflow handling allows alternative Punycode-encoded domain name representations (Closed)