Details
-
Task
-
Resolution: Unresolved
-
P3: Somewhat important
-
None
-
6.0.4, 6.1.3, 6.2.5, 6.3.1, 6.4.0 Beta4
-
None
-
8
-
Foundation PM Prioritized
Description
While working on QTBUG-104972, which fixed the most obvious problems with widening the functions from 32-bit to 64-bit-sized Qt containers, it turned out that we use an ad-hoc format: the decompressed data's length is prepended to the compressed data as a 32-bit unsigned big-endian integer.
Upon uncompressing, we use that field as a hint for the output buffer, but if the buffer turns out to be too small (Z_BUF_ERROR), we double the buffer's size and try again.
In this way, 64-bit platforms can actually qCompress() more than 4GiB of data (except Windows, cf. QTBUG-106542) and qUncompress() can decompress it again, albeit at the expense of several rounds that end in Z_BUF_ERROR.
This problem is exacerbated by the current code simply narrowing the input size to 32 bits: an input size of UINT_MAX + 1 therefore starts with a 1-byte buffer, which is resized 32 times until it's 8GiB and can finally hold the output.
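To make the cost of the narrowing concrete, here is a minimal sketch that models the behaviour described above without calling zlib. The names writeLengthPrefix(), readLengthPrefix() and failedRounds() are hypothetical helpers, not Qt API. The model assumes a round succeeds as soon as the buffer is at least as large as the real data, and that a zero hint is bumped to one byte.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: writes the 4-byte big-endian length prefix,
// narrowing the size to 32 bits the way the description says we do today.
inline void writeLengthPrefix(std::uint64_t realSize, unsigned char out[4])
{
    const auto narrowed = static_cast<std::uint32_t>(realSize); // high bits are lost
    out[0] = static_cast<unsigned char>((narrowed >> 24) & 0xff);
    out[1] = static_cast<unsigned char>((narrowed >> 16) & 0xff);
    out[2] = static_cast<unsigned char>((narrowed >> 8) & 0xff);
    out[3] = static_cast<unsigned char>(narrowed & 0xff);
}

// Hypothetical helper: reads the prefix back as a 32-bit unsigned value.
inline std::uint32_t readLengthPrefix(const unsigned char in[4])
{
    return (std::uint32_t(in[0]) << 24) | (std::uint32_t(in[1]) << 16)
         | (std::uint32_t(in[2]) << 8) | std::uint32_t(in[3]);
}

// Models the retry loop: start with the hint (at least 1 byte), double the
// buffer after every round that would end in Z_BUF_ERROR, and count those
// rounds. Assumption: a round succeeds once buffer >= realSize.
inline int failedRounds(std::uint64_t realSize, std::uint32_t hint)
{
    assert(realSize < (std::uint64_t{1} << 62)); // keeps the doubling from overflowing
    std::uint64_t buffer = std::max<std::uint64_t>(hint, 1);
    int rounds = 0;
    while (buffer < realSize) {
        buffer *= 2;
        ++rounds;
    }
    return rounds;
}
```

Under this model, an input of UINT_MAX + 1 bytes is stored as 0 and costs 32 failed rounds, and 5GiB is stored as 1GiB and costs 3 failed rounds. A hint of 0xffff'ffff for the same 5GiB would cost a single failed round.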
This task is about finding ways to encode the real length in a way that
- old code doesn't choke on
- allows new code to calculate the right buffer size on the first try
Failing that, we should at the very least minimize the number of rounds with Z_BUF_ERROR.
Some ideas:
- using 0xffff'ffff for anything ≥ 4GiB (saturation arithmetic, minimizes rounds)
- using a length > INT_MAX and < UINT_MAX that, when repeatedly doubled, produces a buffer minimally larger than the real buffer (minimizes overallocation)
- using a floating point encoding, provided the value interpreted as a uint is > INT_MAX and ≤ the real value
- untested example: 0b1EEE'EEES 0xSS 0xSS 0xSS where E is a 6-bit unsigned exponent and S is a 25-bit unsigned significand (could use the MSB as part of the significand)
- ...
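The three ideas above can be sketched as follows. This is an untested-in-Qt illustration, not a proposal for the final format. All function names are hypothetical, and the floating-point layout is read here as value = S << E with S rounded up, which is only one possible interpretation of the bit pattern. The sketch also leaves open how new code would tell such a field apart from a plain length ≥ 2GiB.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: ceil(n / 2^k) without overflowing 64 bits (k < 64).
inline std::uint64_t ceilShift(std::uint64_t n, unsigned k)
{
    const std::uint64_t mask = (std::uint64_t{1} << k) - 1;
    return (n >> k) + ((n & mask) != 0 ? 1 : 0);
}

// Idea 1: saturation. Anything that does not fit becomes 0xffff'ffff.
inline std::uint32_t encodeSaturated(std::uint64_t n)
{
    return n > UINT32_MAX ? UINT32_MAX : static_cast<std::uint32_t>(n);
}

// Idea 2: store a hint h with INT_MAX < h < UINT_MAX such that repeated
// doubling lands on the smallest h * 2^k that still covers n.
inline std::uint32_t encodeDoublingHint(std::uint64_t n)
{
    if (n <= UINT32_MAX)
        return static_cast<std::uint32_t>(n); // fits, store as-is
    unsigned k = 1;
    while (ceilShift(n, k) >= UINT32_MAX)
        ++k;
    return static_cast<std::uint32_t>(ceilShift(n, k));
}

// Idea 3: 0b1EEE'EEES 0xSS 0xSS 0xSS, interpreted here as S << E.
// Assumption: only used for sizes that no longer fit into 32 bits.
inline std::uint32_t encodeFloatLike(std::uint64_t n)
{
    assert(n > UINT32_MAX && n < (std::uint64_t{1} << 63));
    unsigned e = 0;
    while (ceilShift(n, e) > 0x1ffffff) // significand must fit into 25 bits
        ++e;
    return 0x80000000u | (std::uint32_t(e) << 25) | std::uint32_t(ceilShift(n, e));
}

// Decodes a field produced by encodeFloatLike(); the result is >= the
// encoded size and exceeds it by less than 2^E bytes.
inline std::uint64_t decodeFloatLike(std::uint32_t field)
{
    const unsigned e = (field >> 25) & 0x3f;
    const std::uint64_t s = field & 0x1ffffff;
    return s << e;
}
```

For 5GiB + 1 bytes, the doubling hint is 2684354561, so one doubling yields a buffer one byte larger than needed. The float-like field decodes to 5368709376, which is 255 bytes more than needed, on the first try. Note that the doubling hint on its own does not tell new code how many doublings to apply, so it only limits overallocation rather than avoiding the failed rounds.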
Acceptance criteria: qCompress() encodes the length field such that
- old qUncompress(), interpreting it as a 32-bit signed BE field, continues to work
- succeeds if it would have succeeded with the old format
- new code decodes the length field such that the resulting length is larger than the output data
- but not by more than 100%
- if any of the above is unachievable, the fall-back is to minimize the number of required Z_BUF_ERROR rounds in qUncompress() (old and new versions)
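The two criteria for new code can be written as a small predicate, which may be handy for a unit test of whatever encoding is chosen. This is a sketch only: newDecoderOk() is a hypothetical name, and it reads "larger than the output data" as "at least as large", which is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical check for a candidate encoding: the decoded length must
// cover the real data, and the overshoot must not exceed 100% of it.
inline bool newDecoderOk(std::uint64_t realSize, std::uint64_t decoded)
{
    assert(realSize > 0);
    return decoded >= realSize && decoded - realSize <= realSize;
}
```

For example, a decoded length of 5368709376 for 5GiB + 1 bytes of data passes, while a decoded length of three times the real size, or one smaller than the real size, does not.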
Issue Links
- resulted from QTBUG-104972 qUncompress(qCompress()) round-trip fails for data > 2GiB (Closed)