Details
- Type: Bug
- Resolution: Unresolved
- Priority: P2: Important
- Affects Version: 5.12.3
- Environment: Tested on macOS 10.14.5
Description
The Problem
I found that QSet's set operations are extremely slow when operating on large sets of certain sizes.
The affected methods appear to be
- subtract()
- unite()
- intersect()
I only tested subtract(), but the other methods use the same pattern in the code, so I expect them to show the same behavior.
The pattern that causes the bad performance is this:

    typename QSet<T>::const_iterator i = other.constEnd();
    while (i != other.constBegin()) {
        --i;
        // ...
    }
The issue is that other.constBegin() is re-evaluated in every loop iteration, and unlike constEnd() it is rather slow when the buckets of the underlying QHash are sparsely populated, because constBegin() has to scan through the buckets until it finds the first non-empty one. The same issue therefore occurs with a plain QHash when this pattern is used.
Benchmark & Fix
I attach a benchmark application showing the behavior.
Note that since the behavior depends on the entries in the set and on qGlobalQHashSeed(), I use fixed values for both so that the bad case is always reproduced. Alternatively, you can comment out the line setting qGlobalQHashSeed() and run the benchmark 3-4 times until you hit the bad performance.
The benchmark also contains a modified version of QSet (Patched::QSet inside qset_patched.h) which fixes the bad performance. I also attach a patch file that fixes the original QSet.
Here is an example of a benchmark run with the bad performance:
********* Start testing of QSetSubtractBenchmark *********
Config: Using QtTest library 5.12.3, Qt 5.12.3 (x86_64-little_endian-lp64 shared (dynamic) release build; by Clang 10.0.1 (clang-1001.0.46.4) (Apple))
qSetGlobalQHashSeed: forced seed value is not 0, cannot guarantee that the hashing functions will produce a stable value.
QDEBUG : QSetSubtractBenchmark::initTestCase() qGlobalQHashSeed(): 608128366
QDEBUG : QSetSubtractBenchmark::initTestCase() smallSetSize: 5
QDEBUG : QSetSubtractBenchmark::initTestCase() largeSetSize: 262153
PASS   : QSetSubtractBenchmark::initTestCase()
PASS   : QSetSubtractBenchmark::benchmarkPatchedSmallMinusLarge()
RESULT : QSetSubtractBenchmark::benchmarkPatchedSmallMinusLarge(): 1 msecs per iteration (total: 1, iterations: 1)
PASS   : QSetSubtractBenchmark::benchmarkOriginalSmallMinusLarge()
RESULT : QSetSubtractBenchmark::benchmarkOriginalSmallMinusLarge(): 19,160 msecs per iteration (total: 19,160, iterations: 1)
PASS   : QSetSubtractBenchmark::benchmarkPatchedLargeMinusSmall()
RESULT : QSetSubtractBenchmark::benchmarkPatchedLargeMinusSmall(): 44 msecs per iteration (total: 44, iterations: 1)
PASS   : QSetSubtractBenchmark::benchmarkOriginalLargeMinusSmall()
RESULT : QSetSubtractBenchmark::benchmarkOriginalLargeMinusSmall(): 55 msecs per iteration (total: 55, iterations: 1)
PASS   : QSetSubtractBenchmark::benchmarkPatchedSmallMinusSmall()
RESULT : QSetSubtractBenchmark::benchmarkPatchedSmallMinusSmall(): 0 msecs per iteration (total: 0, iterations: 1)
PASS   : QSetSubtractBenchmark::benchmarkOriginalSmallMinusSmall()
RESULT : QSetSubtractBenchmark::benchmarkOriginalSmallMinusSmall(): 0 msecs per iteration (total: 0, iterations: 1)
PASS   : QSetSubtractBenchmark::benchmarkPatchedLargeMinusLarge()
RESULT : QSetSubtractBenchmark::benchmarkPatchedLargeMinusLarge(): 48 msecs per iteration (total: 48, iterations: 1)
PASS   : QSetSubtractBenchmark::benchmarkOriginalLargeMinusLarge()
RESULT : QSetSubtractBenchmark::benchmarkOriginalLargeMinusLarge(): 19,120 msecs per iteration (total: 19,120, iterations: 1)
QDEBUG : QSetSubtractBenchmark::cleanupTestCase() 1048592
PASS   : QSetSubtractBenchmark::cleanupTestCase()
Totals: 10 passed, 0 failed, 0 skipped, 0 blacklisted, 77651ms
********* Finished testing of QSetSubtractBenchmark *********
As you can see, in the "smallSet - largeSet" case the original QSet takes ~19 s to execute subtract() while the patched version takes ~1 ms; in the "largeSet - largeSet" case it is ~19 s versus ~48 ms.
Of course, the problem only shows up for large sets in certain size ranges. The size I use in the benchmark is 262153, which results in a capacity()/size() ratio of about 2.
Notes
- A better solution might be to make QHash::constBegin() fast. But I don't know whether that is possible without worsening the algorithmic complexity of insertion and removal. If it is not, this behavior of constBegin() and begin() should perhaps be noted in the documentation (for both QSet and QHash). Something like: "Note that this method can be slow for large hashes/sets (O(n)), depending on the values in the hash/set and the value of qGlobalQHashSeed(). So try to avoid calling it inside a loop or a loop condition."
- I also added a check

      if (isEmpty())
          break;

  inside QSet::subtract(). This makes subtract() faster when a big set is subtracted from a small (or even empty) set, and I didn't see a performance decrease for small sets.