- Technical task
- Resolution: Unresolved
- P2: Important
Foundation Sprint 113, Foundation Sprint 114, Foundation Sprint 115, Foundation Sprint 116, Foundation Sprint 117, Foundation Sprint 118, Foundation Sprint 119, Foundation Sprint 120, Foundation Sprint 121, Foundation Sprint 122, Foundation Sprint 123, Foundation Sprint 124, Foundation Sprint 125, Foundation Sprint 126, Foundation Sprint 127, Foundation Sprint 128, Foundation Sprint 129, Foundation Sprint 130, Foundation Sprint 131, Foundation Sprint 132, Foundation Sprint 133, Foundation Sprint 134, Foundation Sprint 135
TL;DR: Provide for QAnyStringView what QStringIterator does for QStringView.
I would prefer to call the result QUcs4Iterator to put the focus on what you get out of the iteration rather than the "anything string-like enough" that's the source of the UCS4 code-points it delivers.
When parsing localised text it is necessary to be able to do two things, often interleaved:
- Ask for the next UCS4 character;
- Ask whether what remains starts with some specified string.
As long as the iterator that delivers the former can tell us the index at which it would start looking for the next UCS4 character, we can do the latter by applying our existing startsWith() methods to the sliced() remainder of a view. When that does find a match, we can use code along the lines of
    const qsizetype from = ucs4.consumed();
    Q_ASSERT(text.sliced(from).startsWith(asUni));
    // Step ucs4 over this match of asUni:
    while (ucs4.hasNext() && !text.first(ucs4.consumed()).sliced(from).startsWith(asUni))
        (void) ucs4.next();
    // asUni should end at a UCS4-boundary within text:
    Q_ASSERT(text.first(ucs4.consumed()).sliced(from) == asUni);
to step over the asUni we found, on the reasonable assumption that it starts and ends on UCS4 boundaries within the text, however encoded.
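For readers without the proof-of-concept patch to hand, the same pattern can be sketched in plain standard C++. Everything here is a hypothetical stand-in rather than the proposed Qt API: `Utf16Ucs4Iterator` and `skipIfStartsWith()` are names invented for illustration, the input is restricted to UTF-16, and the step-over loop compares lengths instead of repeating the startsWith() check.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for the proposed iterator, restricted to UTF-16 input.
class Utf16Ucs4Iterator
{
public:
    explicit Utf16Ucs4Iterator(std::u16string_view text) : m_text(text) {}

    bool hasNext() const { return m_pos < m_text.size(); }

    // Index (in UTF-16 code units) at which the next code point would start.
    std::size_t consumed() const { return m_pos; }

    // Decode one code point; an unpaired surrogate yields U+FFFD.
    char32_t next()
    {
        assert(hasNext());
        const char16_t first = m_text[m_pos++];
        if (first >= 0xD800 && first <= 0xDBFF && hasNext()) {
            const char16_t second = m_text[m_pos];
            if (second >= 0xDC00 && second <= 0xDFFF) {
                ++m_pos;
                return 0x10000 + ((char32_t(first) - 0xD800) << 10)
                               + (char32_t(second) - 0xDC00);
            }
        }
        if (first >= 0xD800 && first <= 0xDFFF)
            return 0xFFFD;
        return first;
    }

private:
    std::u16string_view m_text;
    std::size_t m_pos = 0;
};

// If what remains of text starts with token, step ucs4 over it and return true.
// Assumes ucs4 iterates over text and token ends on a code-point boundary.
bool skipIfStartsWith(Utf16Ucs4Iterator &ucs4, std::u16string_view text,
                      std::u16string_view token)
{
    const std::size_t from = ucs4.consumed();
    if (text.substr(from, token.size()) != token) // substr() clamps the count
        return false;
    // Step ucs4 over this match of token:
    while (ucs4.hasNext() && ucs4.consumed() - from < token.size())
        (void) ucs4.next();
    // token should end at a code-point boundary within text:
    assert(ucs4.consumed() - from == token.size());
    return true;
}
```

A parser would interleave the two operations: call `next()` when it wants a single character and `skipIfStartsWith()` when it is looking for a multi-character token such as a localised sign or separator.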
Adding an iterator over the UCS4 code-points found in arbitrary (within limits) string data, packaged by QAnyStringView, would thus decouple parsers for string data from the tedious details of how exactly that data is encoded.
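To make the decoupling concrete, here is a minimal sketch in standard C++, not the proposed Qt class. It assumes two hypothetical iterators, `Utf8Ucs4Iterator` and `Utf32Ucs4Iterator`, that share the hasNext(), next() and consumed() interface, and a hypothetical `parseLeadingDigits()` template that never looks at the encoding. The UTF-8 decoder assumes well-formed input and the parser ignores overflow, both for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical UTF-8 decoder; assumes well-formed input for brevity.
class Utf8Ucs4Iterator
{
public:
    explicit Utf8Ucs4Iterator(std::string_view text) : m_text(text) {}
    bool hasNext() const { return m_pos < m_text.size(); }
    std::size_t consumed() const { return m_pos; } // index in bytes
    char32_t next()
    {
        assert(hasNext());
        const auto lead = static_cast<unsigned char>(m_text[m_pos++]);
        if (lead < 0x80)
            return lead;
        // Number of continuation bytes implied by the lead byte:
        const int extra = lead >= 0xF0 ? 3 : lead >= 0xE0 ? 2 : 1;
        char32_t ucs4 = lead & (0x3F >> extra);
        for (int i = 0; i < extra && hasNext(); ++i)
            ucs4 = (ucs4 << 6) | (static_cast<unsigned char>(m_text[m_pos++]) & 0x3F);
        return ucs4;
    }
private:
    std::string_view m_text;
    std::size_t m_pos = 0;
};

// Hypothetical UTF-32 counterpart: each code unit already is a code point.
class Utf32Ucs4Iterator
{
public:
    explicit Utf32Ucs4Iterator(std::u32string_view text) : m_text(text) {}
    bool hasNext() const { return m_pos < m_text.size(); }
    std::size_t consumed() const { return m_pos; } // index in code units
    char32_t next() { assert(hasNext()); return m_text[m_pos++]; }
private:
    std::u32string_view m_text;
    std::size_t m_pos = 0;
};

// Reads leading decimal digits (ASCII or Arabic-Indic) and returns their value.
// digitsEnd receives the index, in the source's own units, just past the last
// digit. Overflow is not handled in this sketch.
template <typename Ucs4Iterator>
std::uint64_t parseLeadingDigits(Ucs4Iterator ucs4, std::size_t *digitsEnd)
{
    std::uint64_t value = 0;
    *digitsEnd = ucs4.consumed();
    while (ucs4.hasNext()) {
        const char32_t ch = ucs4.next();
        int digit;
        if (ch >= U'0' && ch <= U'9')
            digit = static_cast<int>(ch - U'0');
        else if (ch >= 0x0660 && ch <= 0x0669)
            digit = static_cast<int>(ch - 0x0660);
        else
            break;
        value = value * 10 + digit;
        *digitsEnd = ucs4.consumed();
    }
    return value;
}
```

The parser body is identical for both encodings; only the index reported through `digitsEnd` differs, because it is expressed in the units of the underlying data.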
I include a proof-of-concept implementation, developed in the course of reworking the parsing of date, time and numeric data, as a provisional illustration of an MVP for the actual parsers I'm working on. It shall have successor commits that deploy it in a basic parser for numbers.
- depends on: QTBUG-119713 Qt needs a QUtf8StringIterator (Open)
- is required for: QTBUG-138680 Port QLocale's number-parsing to new digit-conversion component (In Progress)
For Gerrit Dashboard: QTBUG-138686

| # | Subject | Branch | Project | Status | CR | V |
|---|---|---|---|---|---|---|
| 663585,7 | Add proof-of-concept QUcs4Iterator class | dev | qt/qtbase | NEW | 0 | 0 |