Type: Bug
Resolution: Unresolved
Priority: P3: Somewhat important
Fix Version/s: None
Affects Version/s: dev
Component/s: Core: QString and Unicode
Labels: None
Platform/s: All
Story Points: 21
Sprint: Foundation PM Prioritized

Discovered at: https://codereview.qt-project.org/c/qt/qtbase/+/488440/23..24/src/corelib/text/qstringconverter.h#b245

QStringDecoder::appendToBuffer(char16_t *out, QByteArrayView ba)

produces inconsistent results when the data is fed piecemeal, compared to feeding it in one chunk or using alternative decoding facilities such as QString::fromUtf8().

The Unicode standard doesn't enforce a strict design for handling illegal sequences, but I think we can all agree that whichever one we use should be consistent across all the conversion mechanisms we provide.

const char illFormed[] = u8"a\xe0\x9f\x80""a";

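This ill-formed sequence consists of: one code point (a); \xe0, a legal lead byte of a three-byte sequence; \x9f, illegal as the byte following \xe0 (which requires \xa0..\xbf); \x80, an illegal stray continuation byte; and one code point (a).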
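To illustrate why \x9f cannot follow \xe0, here is a minimal sketch of the second-byte check for three-byte sequences, based on the well-formed byte ranges in the Unicode standard. The function name `isValidSecondByteOfThree` is hypothetical and not part of Qt.

```cpp
#include <cstdint>

// Hypothetical helper, not part of Qt: reports whether `second` may follow
// the three-byte lead `lead` in well-formed UTF-8.
bool isValidSecondByteOfThree(std::uint8_t lead, std::uint8_t second)
{
    if (lead < 0xE0 || lead > 0xEF)
        return false;                   // not a three-byte lead at all
    std::uint8_t lo = 0x80, hi = 0xBF;  // default continuation range
    if (lead == 0xE0)
        lo = 0xA0;                      // excludes overlong encodings
    if (lead == 0xED)
        hi = 0x9F;                      // excludes UTF-16 surrogates
    return second >= lo && second <= hi;
}
```

With this check, `isValidSecondByteOfThree(0xE0, 0x9F)` is false, so the \xe0 lead is left without a valid continuation and the following \x9f and \x80 are stray continuation bytes.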

QString::fromUtf8() and QStringDecoder, when given the data in one chunk, write a replacement character for every illegal sequence encountered:

"a\uFFFD\uFFFD\uFFFDa"

However, feeding the data byte-by-byte produces:

"a\uFFFDa"

writing only a single replacement character for the whole sequence. Perhaps QStringDecoder's internal state handling is causing this inconsistency.
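The consistency property asked for here can be sketched without Qt. The following is an illustration only, not Qt's implementation and not the attached reproducer: a small stateful UTF-8 to UTF-16 decoder whose output does not depend on how the input is split. The names `StreamingUtf8Decoder` and `decodeInChunks` are hypothetical, and the policy of one U+FFFD per truncated sequence or stray byte is an assumption chosen to match the one-chunk result above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

constexpr char16_t kReplacement = 0xFFFD;

// Hypothetical illustration, not Qt code: keeps the partially decoded
// sequence in member state so that chunk boundaries cannot change the output.
class StreamingUtf8Decoder {
public:
    void feed(const std::string &chunk, std::u16string &out) {
        for (unsigned char b : chunk)
            feedByte(b, out);
    }
    // Call once at end of input: an unfinished sequence becomes one U+FFFD.
    void finish(std::u16string &out) {
        if (remaining_ > 0)
            out.push_back(kReplacement);
        remaining_ = 0;
    }

private:
    std::uint32_t codePoint_ = 0;
    int remaining_ = 0;                   // continuation bytes still expected
    std::uint8_t lo_ = 0x80, hi_ = 0xBF;  // valid range for the next byte

    void begin(std::uint32_t bits, int remaining, std::uint8_t lo, std::uint8_t hi) {
        codePoint_ = bits; remaining_ = remaining; lo_ = lo; hi_ = hi;
    }
    static void emit(std::uint32_t cp, std::u16string &out) {
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {                          // encode as a surrogate pair
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    void feedByte(std::uint8_t b, std::u16string &out) {
        if (remaining_ > 0) {
            if (b >= lo_ && b <= hi_) {
                codePoint_ = (codePoint_ << 6) | (b & 0x3F);
                lo_ = 0x80; hi_ = 0xBF;
                if (--remaining_ == 0)
                    emit(codePoint_, out);
                return;
            }
            // Truncated sequence: replace it, then treat b as a fresh byte.
            out.push_back(kReplacement);
            remaining_ = 0;
        }
        if (b < 0x80) out.push_back(static_cast<char16_t>(b));
        else if (b >= 0xC2 && b <= 0xDF) begin(b & 0x1F, 1, 0x80, 0xBF);
        else if (b == 0xE0) begin(b & 0x0F, 2, 0xA0, 0xBF);
        else if (b == 0xED) begin(b & 0x0F, 2, 0x80, 0x9F);
        else if (b >= 0xE1 && b <= 0xEF) begin(b & 0x0F, 2, 0x80, 0xBF);
        else if (b == 0xF0) begin(b & 0x07, 3, 0x90, 0xBF);
        else if (b >= 0xF1 && b <= 0xF3) begin(b & 0x07, 3, 0x80, 0xBF);
        else if (b == 0xF4) begin(b & 0x07, 3, 0x80, 0x8F);
        else out.push_back(kReplacement); // stray continuation or invalid lead
    }
};

// Decodes `input` by feeding it to one decoder in pieces of `chunkSize` bytes.
std::u16string decodeInChunks(const std::string &input, std::size_t chunkSize) {
    assert(chunkSize > 0);
    StreamingUtf8Decoder decoder;
    std::u16string out;
    for (std::size_t pos = 0; pos < input.size(); pos += chunkSize)
        decoder.feed(input.substr(pos, chunkSize), out);
    decoder.finish(out);
    return out;
}
```

Under these assumptions, `decodeInChunks(input, input.size())` and `decodeInChunks(input, 1)` give the same result for the ill-formed input above, because the pending \xe0 is replaced at the moment the invalid \x9f arrives, regardless of which call delivers it.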

Please find the reproducer attached.

 Reproducer output:
"fromUtf8" OK 
"decoder" OK 
"decoderAppendFull" OK 
"decoderAppendPieceMeal" "a�a" != "a���a"

main.cpp
14 Sep '23 11:48
3 kB
Dennis Oberst


Assignee: Dennis Oberst
Reporter: Dennis Oberst
PM Owner: Vladimir Minenko
RnD Owner: Alex Blasche
Votes: 0
Watchers: 7
Created: 14 Sep '23 10:56
Updated: 07 Nov '24 06:51

There are no open Gerrit changes
