Details
-
Bug
-
Resolution: Fixed
-
P2: Important
-
5.15.10
-
None
-
08b722e68 (dev)
Description
Problem:
At least some escaped Unicode characters produce wrong output in the .ts file when the characters that follow them are valid hex digits.
"\u00A0test" in source produces a non-breaking space followed by "test" in .ts file (correct)
"\u00A0anotherTest" in source produces "ਊnotherTest" in .ts file (wrong)
Steps to reproduce:
- Create an empty project
- Add a translated string that includes "\u00A0" (non-breaking space) immediately followed by one of the characters a, b, c, d, e or f.
- Add a "TRANSLATIONS +=" setting to the .pro file
- Run Tools->External->Linguist->lupdate
Expected behaviour:
The .ts file should contain the string with the "\u00A0" part encoded as a non-breaking space (U+00A0) and the following characters unaltered.
Actual behaviour:
The .ts file contains an unwanted Unicode letter and the following character is removed.
More Details:
I am very sure that when parsing the \u escape sequence
- the maximum length is ignored. \u is only allowed to be followed by four hex digits.
- the leading zeros are dropped. In my tests I used "\u00A0" with a following "a". The result was the character ਊ, which has the Unicode value 0A0A, i.e. the digits "00A0" plus the trailing "a" read as one hex number.
I also tested the UTF-32 escape "\U000000A0" with the same result.
Using the non-breaking space directly as a single character in a UTF-8 source file works for Linguist, but QtCreator replaces any non-breaking spaces (even inside strings) with normal spaces when saving the file. So this is not a viable workaround.
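To illustrate the suspected cause, here is a minimal sketch and not lupdate's actual code. The helper names `hexValue`, `parseGreedy` and `parseBounded` are made up for this example, and the greedy variant is only an assumption about how the parser behaved before the fix. It shows why reading hex digits without a length limit turns "00A0" followed by "a" into 0x0A0A and swallows the "a", while a limit of four digits (eight for \U) leaves the following text alone.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: numeric value of one hex digit, or -1 if c is not a hex digit.
static int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Greedy variant (assumed to mirror the reported behaviour):
// consumes every hex digit that follows the escape prefix.
// pos points just past "\u" on entry and past the consumed digits on return.
static std::uint32_t parseGreedy(const std::string &s, std::size_t &pos)
{
    std::uint32_t value = 0;
    while (pos < s.size() && hexValue(s[pos]) >= 0)
        value = value * 16 + hexValue(s[pos++]);
    return value;
}

// Bounded variant: consumes at most maxDigits hex digits
// (4 for \u, 8 for \U), so later characters are left untouched.
static std::uint32_t parseBounded(const std::string &s, std::size_t &pos, int maxDigits)
{
    std::uint32_t value = 0;
    for (int i = 0; i < maxDigits && pos < s.size() && hexValue(s[pos]) >= 0; ++i)
        value = value * 16 + hexValue(s[pos++]);
    return value;
}
```

With the input "00A0anotherTest" (the text after "\u"), the greedy variant consumes five characters and yields 0x0A0A, whereas the bounded variant with a limit of 4 consumes four characters and yields 0x00A0, leaving "anotherTest" intact.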
In the appended example I used the string "One\u00A0and\u00A0two", where you can see very clearly that the first occurrence (with following "a") produces a wrong result. The second (with following "t") works just fine.
Attachments
For Gerrit Dashboard: QTBUG-105573

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
606461,7 | lupdate: Correct Parsing of Escaped Unicode Characters | dev | qt/qttools | MERGED | +2 | +1 |