When retrieving the character at the current cursor/caret position using the GetCharAtOffset method from the AT-SPI Text interface, the character is not reported correctly when its UTF-16 representation is a surrogate pair (two 16-bit code units), i.e. when its Unicode code point is above 65535 (U+FFFF).
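For illustration, here is a minimal Python sketch of how a code point above U+FFFF maps to a UTF-16 surrogate pair and back. The helper names `to_utf16_units` and `from_surrogates` are made up for this example and are not part of Qt or AT-SPI; the arithmetic is the standard UTF-16 encoding rule.

```python
def to_utf16_units(code_point):
    """Return the UTF-16 code units for a Unicode code point (hypothetical helper)."""
    if code_point <= 0xFFFF:
        # Characters in the Basic Multilingual Plane fit in one code unit.
        return [code_point]
    v = code_point - 0x10000
    # High surrogate carries the top 10 bits, low surrogate the bottom 10 bits.
    return [0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)]


def from_surrogates(high, low):
    """Combine a high/low surrogate pair into one code point (hypothetical helper)."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


units = to_utf16_units(ord("𐐷"))
print([hex(u) for u in units])       # ['0xd801', '0xdc37']
print(hex(from_surrogates(*units)))  # 0x10437
```

A value that holds only one of the two code units cannot represent such a character, so both units have to be combined before the code point is returned.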
Sample steps to reproduce:
1) build the attached sample program "qt-sample-app-character":
2) run the app, which shows a QTextEdit containing the text "abc𐐷d" and places the cursor immediately before the "𐐷" (U+10437) character
3) run the attached Python/pyatspi script that retrieves the character at the cursor position using the GetCharAtOffset method from the AT-SPI Text interface that the QTextEdit provides on Linux.
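The attached script is not reproduced here. As a rough sketch only, a pyatspi script for step 3 might look like the following. It assumes a running AT-SPI session with pyatspi installed, and it locates the text widget by searching for an accessible whose text contains "abc", which is a choice made for this example. The helper names `format_char`, `has_sample_text` and `char_at_caret` are hypothetical.

```python
def format_char(code_point):
    """Format a code point as hex followed by the character itself."""
    return f"{hex(code_point)} {chr(code_point)}"


def has_sample_text(acc):
    """True if the accessible implements the Text interface and contains 'abc'."""
    try:
        text = acc.queryText()
    except NotImplementedError:
        return False
    return "abc" in text.getText(0, -1)


def char_at_caret():
    """Return the code point at the caret of the first matching text object."""
    import pyatspi  # imported lazily: requires a running AT-SPI session

    desktop = pyatspi.Registry.getDesktop(0)
    for app in desktop:
        if app is None:
            continue
        acc = pyatspi.findDescendant(app, has_sample_text)
        if acc is not None:
            text = acc.queryText()
            # getCharacterAtOffset returns the character as an integer.
            return text.getCharacterAtOffset(text.caretOffset)
    return None


# Usage, while the sample app is running:
#   cp = char_at_caret()
#   print("no matching text object" if cp is None else format_char(cp))
```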
An incorrect code point is reported:
The script should print the Unicode code point of the "𐐷" character in hex ("0x10437") and the original character:
Note: There's a similar/related Gerrit change for LibreOffice:
|For Gerrit Dashboard: QTBUG-113438|
|476339,4||a11y atspi: Report correct char code point when it's > 65535||dev||qt/qtbase||Status: MERGED||+2||0|
|483636,2||a11y atspi: Report correct char code point when it's > 65535||6.6||qt/qtbase||Status: MERGED||+2||0|
|486112,2||a11y atspi: Report correct char code point when it's > 65535||6.5||qt/qtbase||Status: MERGED||+2||0|