Qt / QTBUG-105573

lupdate produces wrong output for strings with escaped unicode characters in some cases

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: P2: Important
    • None
    • Affects Version/s: 5.15.10
    • Component/s: Tools: Linguist
    • None

    Description

      Problem:

      Some escaped Unicode characters produce wrong output in the .ts file when the character immediately following the escape sequence is itself a valid hex digit.

      "\u00A0test" in source produces a non-breaking space followed by "test" in the .ts file (correct)

      "\u00A0anotherTest" in source produces "ਊnotherTest" in .ts file (wrong)

      Steps to reproduce:

      • Create an empty project
      • Add a translatable string containing "\u00A0" (non-breaking space) immediately followed by one of the characters a, b, c, d, e or f.
      • Add a "TRANSLATIONS +=" setting to the .pro file
      • In Qt Creator, run Tools->External->Linguist->lupdate

      Expected behaviour:

      The .ts file should contain the string with the "\u00A0" part encoded as a non-breaking space, and the following characters unaltered.

      Actual behaviour:

      The .ts file contains an unwanted Unicode character, and the character following the escape sequence is removed.

      More Details:
      I am very sure that, when the \u escape sequence is parsed:

      1. The fixed length is ignored. \u must be followed by exactly four hex digits, but the parser appears to consume every hex digit that follows.
      2. The leading zeros are dropped. In my tests I used "\u00A0" followed by "a". The result was the character ਊ, which has the Unicode code point U+0A0A.

      I also tested the eight-digit form "\U000000A0", with the same result.
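
      The hypothesis above can be illustrated with a small standalone sketch. It is not lupdate's code: the names `Decoded`, `readHex`, `decodeGreedy` and `decodeFixed` are hypothetical, and the "greedy" variant only models the suspected behaviour (consuming every hex digit after the escape) so that it can be compared with a fixed-width reader.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

// Result of decoding one escape: the code point and the untouched remainder.
struct Decoded {
    std::uint32_t codePoint;
    std::string rest;
};

// Reads hex digits from `s` starting at `pos`.
// maxDigits == 0 means "no limit", which models the suspected bug.
static std::uint32_t readHex(const std::string &s, std::size_t &pos, std::size_t maxDigits)
{
    std::uint32_t value = 0;
    std::size_t count = 0;
    while (pos < s.size()
           && std::isxdigit(static_cast<unsigned char>(s[pos]))
           && (maxDigits == 0 || count < maxDigits)) {
        const unsigned char c = static_cast<unsigned char>(s[pos]);
        const int digit = std::isdigit(c) ? c - '0' : std::tolower(c) - 'a' + 10;
        value = value * 16 + static_cast<std::uint32_t>(digit);
        ++pos;
        ++count;
    }
    return value;
}

// `text` is the part of the literal that follows "\u" or "\U".

// Suspected behaviour: keep reading while the next character is a hex digit.
Decoded decodeGreedy(const std::string &text)
{
    std::size_t pos = 0;
    const std::uint32_t cp = readHex(text, pos, 0);
    return Decoded{cp, text.substr(pos)};
}

// Expected behaviour: read exactly `width` digits (4 for \u, 8 for \U).
Decoded decodeFixed(const std::string &text, std::size_t width)
{
    assert(width == 4 || width == 8);
    std::size_t pos = 0;
    const std::uint32_t cp = readHex(text, pos, width);
    return Decoded{cp, text.substr(pos)};
}
```

      With the input "00A0anotherTest", the greedy variant yields the code point 0x0A0A and the remainder "notherTest", which matches the observed "ਊnotherTest". The fixed-width variant yields 0x00A0 and leaves "anotherTest" intact. The eight-digit input "000000A0anotherTest" behaves the same way under both variants.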

      Using the non-breaking space directly as a single character in a UTF-8 source file works for Linguist, but Qt Creator replaces any non-breaking spaces (even inside strings) with normal spaces when saving the file, so this is not a viable workaround.

      In the attached example I used the string "One\u00A0and\u00A0two", where you can see clearly that the first occurrence (followed by "a") produces a wrong result. The second (followed by "t") works fine.
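
      For comparison, the following sketch shows how a conforming C++ compiler decodes the same literal, which is the result one would expect lupdate to reproduce. The names `kText` and `kLength` are chosen for illustration only. The checks run at compile time and rely on the language rule that \u takes exactly four hex digits.

```cpp
#include <cstddef>

// The string from the example, as UTF-16 code units.
constexpr char16_t kText[] = u"One\u00A0and\u00A0two";

// Number of code units, excluding the terminating null.
constexpr std::size_t kLength = sizeof(kText) / sizeof(kText[0]) - 1;

// 3 ("One") + 1 + 3 ("and") + 1 + 3 ("two") code units.
static_assert(kLength == 11, "no character is swallowed by the escapes");

// First escape: a non-breaking space, and the 'a' after it survives.
static_assert(kText[3] == 0x00A0, "first escape is U+00A0");
static_assert(kText[4] == u'a', "'a' is not absorbed into the escape");

// Second escape: followed by 't', which is not a hex digit anyway.
static_assert(kText[7] == 0x00A0, "second escape is U+00A0");
static_assert(kText[8] == u't', "'t' follows the second escape");
```

      If this translation unit compiles, the compiler agrees that both escapes decode to U+00A0 and that the "a" of "and" remains a separate character.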


          People

            kkohne Kai Köhne
            markus_s Markus Steinhilber