Qt / QTBUG-100879

Anchor tags with attributes break QTextEdit/QTextBrowser character composition

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: P2: Important
    • None
    • Affects Version/s: 5.15.2, 6.2.2
    • None
    • Environment: Same on Linux X11 Xubuntu 20.04 and Windows 10, and with both gcc and MSVC.
    • Platform/s: Linux/X11, Windows

    Description

      When HTML output to a QTextEdit/QTextBrowser widget contains composed characters, where accents or marks are placed above or below a letter, and a word is surrounded by an anchor ("a") tag that has attributes (it doesn't seem to matter which attribute), the first letter inside that tag is rendered decomposed, with its constituent parts drawn as individual characters.  Without the anchor tag, or with an anchor tag that has no attributes, the characters appear just fine.

      The text happens to be in an RTL language, Hebrew to be exact.  I've tested it on Linux (X11, Xubuntu 20.04) and Windows 10 with identical results, and I've tried multiple fonts.  With some fonts the issue is much more obvious than with others, but the problem is there nonetheless.

      I use these anchors so the software can navigate the text and track exactly where the user's cursor is placed within specific words of the HTML text in a QTextBrowser, so they are quite important to my project.  But having the anchors breaks the rendering of the first composed character inside the tag.  Looking for a workaround, I've even tried inserting an extra space inside the tag, on either side of the word, and it makes no difference.

      I'm experiencing this on both Qt 6.2.2 and Qt 5.15.2 with identical behavior, and I think it goes back even further in the Qt 5 history, so it doesn't seem to be a new bug.

      Attached is a sample app that illustrates the problem, along with two screenshots, one from Linux and one from Windows.  The font in the Windows screenshot makes the problem more subtle, but if you look carefully at the composition placement, you'll see that it's just as wrong there as it is on Linux.

      The six lines printed are various combinations of the same text with and without the anchor tags.  The first line, with no anchors, is fine.  So is the second line, which has an anchor tag with no attributes.  The first word on each of lines three, four, and five is broken, because each has an anchor with an attribute around it, but the other words on those lines are fine because they have no anchors.  Line six, which has an anchor around every single word, shows broken composition of the first character in all seven words.
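
The six cases above could be generated along these lines. This is only a sketch in Python, not the code from the attached sample app: the word list (Genesis 1:1, chosen only because it has seven pointed words), the attribute names, and the helper names `anchor` and `build_lines` are all assumptions made for illustration, as is the choice to wrap only the first word on line two.

```python
# Hypothetical stand-in text: seven Hebrew words with vowel points.
WORDS = ["בְּרֵאשִׁית", "בָּרָא", "אֱלֹהִים", "אֵת", "הַשָּׁמַיִם", "וְאֵת", "הָאָרֶץ"]


def anchor(word, attrs=""):
    """Wrap a word in an <a> tag; attrs is a raw attribute string, possibly empty."""
    open_tag = f"<a {attrs}>" if attrs else "<a>"
    return f"{open_tag}{word}</a>"


def build_lines(words):
    """Return six HTML paragraphs matching the six cases in the report."""
    first, rest = words[0], words[1:]
    lines = [
        " ".join(words),                                 # 1: no anchors
        " ".join([anchor(first)] + rest),                # 2: anchor without attributes
        " ".join([anchor(first, 'id="w1"')] + rest),     # 3: first word, one attribute
        " ".join([anchor(first, 'name="w1"')] + rest),   # 4: same, different attribute
        " ".join([anchor(first, 'href="#w1"')] + rest),  # 5: same, different attribute
        # 6: every word wrapped in an anchor that carries an attribute
        " ".join(anchor(w, f'id="w{i}"') for i, w in enumerate(words, 1)),
    ]
    return [f"<p>{line}</p>" for line in lines]


HTML_LINES = build_lines(WORDS)

if __name__ == "__main__":
    print("\n".join(HTML_LINES))
```

The joined output could then be handed to the widget (for example through setHtml) to compare the rendering of each line.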

      And while this example doesn't illustrate it, the problem also seems to break the QTextCursor move logic, for example when calling movePosition by character or by word.  That seems to indicate that the underlying QString somehow got decomposed, similar to calling QString::normalized() on it.  But that's a supposition on my part, based on how it seems to be behaving.
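
One way to test that supposition would be to compare the code point sequence that goes into the widget with the one that comes back out. The sketch below only shows the comparison itself, in Python, with `unicodedata.normalize` standing in for QString::normalized(); it does not touch Qt, and the helper names `code_points` and `changed_by_normalization` are made up for illustration.

```python
import unicodedata


def code_points(text):
    """List each character of text as 'U+XXXX NAME' for side-by-side comparison."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


def changed_by_normalization(text, form="NFD"):
    """Return True if normalizing text to the given form alters its code points."""
    return unicodedata.normalize(form, text) != text


# Bet followed by dagesh and sheva, written with escapes to keep the order explicit.
SAMPLE = "\u05D1\u05BC\u05B0"

if __name__ == "__main__":
    print("as stored:")
    for entry in code_points(SAMPLE):
        print("  ", entry)
    print("after NFD:")
    for entry in code_points(unicodedata.normalize("NFD", SAMPLE)):
        print("  ", entry)
    print("changed by NFD:", changed_by_normalization(SAMPLE))
```

In the Qt application, the string to inspect would be the one read back from the document (for instance via toPlainText()), compared against the text that was originally put inside the anchor.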

      I'm not sure I've seen the issue with LTR languages, but I don't have any good LTR examples that use as many character compositions.  So, I can't say for sure if it's a general issue or an RTL vs LTR issue too.

       

      Attachments

        Activity

          People

            Assignee: Qt Quick and Widgets Team (qt.team.quick.subscriptions)
            Reporter: Donna Whisnant (dewhisna)
            Votes: 0
            Watchers: 1

            Dates

              Created:
              Updated:

              Gerrit Reviews

                There are no open Gerrit changes