Details
Bug
Resolution: Done
P2: Important
None
4.8.5, 5.1.1
All environments
f3dfb372534ebe6553439ee8bfa62e94ab200889
Description
QTextCodec treats Unicode noncharacter code points as invalid input.
Strictly speaking, the UTF encodings (UTF-8, UTF-16, UTF-32) are reversible: every Unicode scalar value must round-trip through them. That means that the 66 noncharacter code points (including U+FFFE and U+FFFF) are valid in UTF-8, and hence are perfectly acceptable within UTF-8 text.
According to the Unicode standard, they SHOULD NOT be used in information interchange, but a recent corrigendum (Corrigendum #9) clarifies that noncharacters CAN be exchanged (see http://www.unicode.org/versions/corrigendum9.html).
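To make the point concrete, here is a minimal sketch in plain C++ (deliberately not using Qt) that identifies the 66 noncharacters and shows that they encode to ordinary, well-formed UTF-8 byte sequences. The names isNoncharacter, encodeUtf8 and countNoncharacters are hypothetical helpers written for this illustration; they are not part of QTextCodec or any Qt API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// True for the 66 Unicode noncharacters: U+FDD0..U+FDEF (32 code points)
// plus the last two code points of each of the 17 planes (34 code points).
constexpr bool isNoncharacter(std::uint32_t cp)
{
    return cp <= 0x10FFFF
        && ((cp >= 0xFDD0 && cp <= 0xFDEF) || (cp & 0xFFFE) == 0xFFFE);
}

// Hypothetical helper: encodes one Unicode scalar value as UTF-8.
// Noncharacters get no special treatment; only surrogates and values
// above U+10FFFF are rejected, since those are not scalar values.
std::string encodeUtf8(std::uint32_t cp)
{
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Counts the noncharacters across the whole code space (U+0000..U+10FFFF).
int countNoncharacters()
{
    int count = 0;
    for (std::uint32_t cp = 0; cp <= 0x10FFFF; ++cp) {
        if (isNoncharacter(cp))
            ++count;
    }
    return count;
}
```

Under these assumptions, encodeUtf8(0xFFFF) yields the three bytes EF BF BF and encodeUtf8(0xFFFE) yields EF BF BE, both structurally valid UTF-8. A decoder that round-trips text faithfully would be expected to accept them rather than substitute U+FFFD.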