Details

Type: Task
Resolution: Unresolved
Priority: P3: Somewhat important
Fix Version/s: None
Affects Version/s: 6.0.4, 6.1.3, 6.2.5, 6.3.1, 6.4.0 Beta4
Component/s: Core: Containers and Algorithms, Core: I/O
Labels: None
Epic Link: qsizetype
Story Points: 8
Sprint: Foundation PM Prioritized

Description

While working on ~~QTBUG-104972~~, which fixed the most obvious problems with widening qCompress() and qUncompress() from 32-bit to 64-bit-sized Qt containers, it turned out that we use an ad-hoc format: the decompressed data's length is prepended to the compressed data as a 32-bit unsigned big-endian integer.
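
For illustration, the length prefix described above could be written and read roughly like this. This is a minimal sketch in plain C++, not the actual Qt code; `writeLengthPrefix` and `readLengthPrefix` are hypothetical names.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: store a 32-bit length as four big-endian bytes
// (most significant byte first).
std::array<std::uint8_t, 4> writeLengthPrefix(std::uint32_t length)
{
    std::array<std::uint8_t, 4> prefix{};
    prefix[0] = static_cast<std::uint8_t>(length >> 24);
    prefix[1] = static_cast<std::uint8_t>(length >> 16);
    prefix[2] = static_cast<std::uint8_t>(length >> 8);
    prefix[3] = static_cast<std::uint8_t>(length);
    return prefix;
}

// Hypothetical helper: read the four big-endian bytes back as an unsigned value.
std::uint32_t readLengthPrefix(const std::array<std::uint8_t, 4> &prefix)
{
    return (std::uint32_t{prefix[0]} << 24)
         | (std::uint32_t{prefix[1]} << 16)
         | (std::uint32_t{prefix[2]} << 8)
         |  std::uint32_t{prefix[3]};
}
```

With this layout, a length of 0x01020304 is stored as the bytes 01 02 03 04.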

Upon uncompressing, we use that field as a hint for the output buffer, but if the buffer turns out to be too small (Z_BUF_ERROR), we double the buffer's size and try again.
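
The retry behaviour can be sketched as follows. This is a simplified model, not the Qt implementation: `TryDecompress` is a hypothetical stand-in for a call such as zlib's uncompress(), and starting at 1 byte for a zero hint is an assumption of the sketch. It ignores every error other than "buffer too small".

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for the decompressor: returns true on success and
// false when `out` is too small (the Z_BUF_ERROR case).
using TryDecompress = std::function<bool(std::vector<unsigned char> &out)>;

// Start with `hint` bytes (assumed minimum: 1 byte) and double the buffer
// after every failed attempt. Returns the number of failed rounds.
int decompressWithRetries(std::size_t hint, const TryDecompress &attempt,
                          std::vector<unsigned char> &out)
{
    std::size_t capacity = hint > 0 ? hint : 1;
    int failedRounds = 0;
    for (;;) {
        out.resize(capacity);
        if (attempt(out))
            return failedRounds;
        capacity *= 2; // buffer was too small: double and try again
        ++failedRounds;
    }
}
```

Every failed round repeats the decompression work from the start, which is why a hint far below the real size is expensive.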

In this way, 64-bit platforms can actually qCompress() more than 4GiB of data (except Windows, cf. ~~QTBUG-106542~~) and qUncompress() can decompress it again, albeit at the expense of several rounds that end in Z_BUF_ERROR.

This problem is exacerbated by the current code simply narrowing the input size to 32 bits: an input size of UINT_MAX + 1 is stored as 0 and therefore starts with a 1-byte buffer, which is resized 32 times until it's 8GiB and can finally hold the output.
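
The effect of the narrowing can be checked with a few lines of arithmetic. The following is a rough model only: `narrowedHint` and `doublingsNeeded` are hypothetical names, and the model treats a buffer as sufficient as soon as it is at least as large as the real size.

```cpp
#include <cstdint>

// Models the narrowing described above: only the low 32 bits survive.
std::uint32_t narrowedHint(std::uint64_t size)
{
    return static_cast<std::uint32_t>(size);
}

// Counts how often a buffer of `hint` bytes (assumed minimum: 1 byte) must be
// doubled before it can hold `realSize` bytes.
// Assumes realSize <= 2^63 so that the doubling cannot overflow.
int doublingsNeeded(std::uint64_t hint, std::uint64_t realSize)
{
    std::uint64_t capacity = hint > 0 ? hint : 1;
    int doublings = 0;
    while (capacity < realSize) {
        capacity *= 2;
        ++doublings;
    }
    return doublings;
}
```

In this model, `doublingsNeeded(narrowedHint(size), size)` with `size` equal to UINT_MAX + 1 yields 32, because the narrowed hint is 0.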

This task is about finding ways to encode the real length in a way that

- old code doesn't choke on
- allows new code to calculate the right buffer size on the first try

Failing that, we should at the very least minimize the number of rounds with Z_BUF_ERROR.

Some ideas:

- using 0xffff'ffff for anything ≥ 4GiB (saturation arithmetic, minimizes rounds)
- using a length greater than INT_MAX and less than UINT_MAX that, when repeatedly doubled, produces a buffer minimally larger than the real buffer (minimizes overallocation)
- using a floating point encoding, provided the value interpreted as a uint is > INT_MAX and ≤ the real value
  - untested example: 0b1EEE'EEES 0xSS 0xSS 0xSS where the E bits form a 6-bit unsigned exponent and the S bits a 25-bit unsigned significand (could use the MSB as part of the significand)
- ...
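
The first two ideas can be sketched as follows. This is an untested-in-Qt illustration under stated assumptions: all function names are hypothetical, sizes below 4GiB are stored unchanged, and the second function may return exactly UINT_MAX in edge cases. The floating point idea is not covered here.

```cpp
#include <cstdint>

constexpr std::uint64_t kUIntMax = 0xFFFFFFFFu;

// Idea 1 (saturation): anything that does not fit in 32 bits becomes 0xffff'ffff.
std::uint32_t saturatedHint(std::uint64_t size)
{
    return size > kUIntMax ? 0xFFFFFFFFu : static_cast<std::uint32_t>(size);
}

// Helper: ceil(value / 2^shift) without overflow, for shift < 64.
std::uint64_t ceilDivPow2(std::uint64_t value, int shift)
{
    const std::uint64_t mask = (std::uint64_t{1} << shift) - 1;
    return (value >> shift) + ((value & mask) != 0 ? 1 : 0);
}

// Idea 2: pick the hint h = ceil(size / 2^k) for the smallest k that makes h
// fit in 32 bits. Doubling h exactly k times then gives a buffer that exceeds
// the real size by less than 2^k bytes.
std::uint32_t doublingFriendlyHint(std::uint64_t size)
{
    int shift = 0;
    while (ceilDivPow2(size, shift) > kUIntMax)
        ++shift;
    return static_cast<std::uint32_t>(ceilDivPow2(size, shift));
}
```

For a size of UINT_MAX + 1, the saturated hint is 0xffff'ffff (one doubling, almost 100% overallocation), while the second function returns 0x8000'0000 (one doubling, no overallocation). For sizes of 4GiB and above, the second hint always has its most significant bit set, so it is greater than INT_MAX.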

Acceptance criteria: qCompress() encodes the length field such that

- old qUncompress(), interpreting it as a 32-bit signed BE field, continues to work
  - succeeds if it would have succeeded with the old format
- new code decodes the length field such that the resulting length is larger than the output data
  - but not by more than 100%
- if any of the above is unachievable, the fall-back is to minimize the number of required Z_BUF_ERROR rounds in qUncompress() (old and new versions)

Issue Links

resulted from

QTBUG-104972 qUncompress(qCompress()) round-trip fails for data > 2GiB

Closed

People

Assignee: Qt Core & Network

Reporter: Marc Mutz

PM Owner: Vladimir Minenko

RnD Owner: Alex Blasche

Votes: 0

Watchers: 2

Dates

Created: 12 Sep '22 06:49

Updated: 28 Nov '24 11:27
