Details
- Type: Bug
- Status: Closed
- Priority: P2: Important
- Resolution: Done
- Affects Version/s: 4.8.5, 5.1.1
- Fix Version/s: None
- Component/s: Core: QString and Unicode
- Environment: All environments
- Commits: f3dfb372534ebe6553439ee8bfa62e94ab200889
Description
QTextCodec treats Unicode non-character code points as invalid.
Strictly speaking, the UTF-xx encodings are reversible. That means that the 66 non-character code points (including U+FFFE and U+FFFF) are valid UTF-8 and hence perfectly acceptable within UTF-8 text messages.
According to the Unicode standard, they SHOULD NOT be used in information interchange, but a recent corrigendum clarifies that non-characters CAN be exchanged (see http://www.unicode.org/versions/corrigendum9.html).
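
To make the figure of 66 concrete, here is a minimal standalone C++ sketch. It does not use Qt or QTextCodec and is not taken from the referenced commit. The function names `isNoncharacter`, `countNoncharacters`, and `encodeUtf8` are hypothetical, chosen for illustration only. The sketch assumes the usual definition of the non-characters (U+FDD0..U+FDEF plus the last two code points of each of the 17 planes) and shows that such a code point has an ordinary, well-formed UTF-8 byte sequence.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true for the 66 Unicode non-characters, i.e.
// U+FDD0..U+FDEF (32 code points) plus U+nFFFE and U+nFFFF for each of
// the 17 planes (34 code points).
constexpr bool isNoncharacter(char32_t cp)
{
    return (cp >= 0xFDD0 && cp <= 0xFDEF)
        || (cp <= 0x10FFFF && (cp & 0xFFFE) == 0xFFFE);
}

// Hypothetical helper: counts the non-characters in U+0000..U+10FFFF.
int countNoncharacters()
{
    int count = 0;
    for (char32_t cp = 0; cp <= 0x10FFFF; ++cp) {
        if (isNoncharacter(cp))
            ++count;
    }
    return count;
}

// Hypothetical helper: plain UTF-8 encoder for a Unicode scalar value.
// Assumption: the caller passes a value in U+0000..U+10FFFF that is not a
// surrogate; the assert documents that precondition. Non-characters are
// deliberately NOT rejected, since they have ordinary UTF-8 encodings.
std::string encodeUtf8(char32_t cp)
{
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));

    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

With these helpers, `countNoncharacters()` yields 66, and `encodeUtf8(0xFFFF)` produces the three bytes EF BF BF, which is the byte sequence a strict-but-reversible UTF-8 decoder would be expected to accept and map back to U+FFFF.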