- Bug
- Resolution: Unresolved
- P2: Important
- None
- 6.10.0
- None
Description
I have noticed another critical bug in LoD (Level of Detail) handling with an Instance Table. Unfortunately, I cannot provide a specific visual example this time, as the issue lies within the underlying culling mechanism.
The intended purpose of the system is to dynamically switch instances to lower-poly models and/or cull them based on distance. Visually, the model switching and culling appear to work correctly.
However, the core problem is that the number of entries in the instance table (the instance count passed to the draw call) does not decrease after the distance-based filtering/culling process. The GPU therefore still has to process the original number of instances in every draw call, even though the culling looks correct on screen.
As a result, instead of offloading the GPU, the work is multiplied: all n instances are submitted for every LoD level in every frame, so with two levels the GPU processes 2n instances per frame. Consequently, the FPS not only fails to improve but significantly worsens after implementing this LoD system.
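To make the cost concrete, here is a minimal arithmetic sketch. The function names and the numbers in the usage note are hypothetical and only illustrate the counting argument above; they are not taken from the engine.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Instances submitted per frame when every LoD level draws the full table,
// which is the behavior described in this report.
std::size_t submittedWithoutCompaction(std::size_t tableSize, std::size_t lodLevels)
{
    return tableSize * lodLevels;
}

// Instances submitted per frame if each LoD level drew only the instances
// that fall inside its own distance range.
std::size_t submittedWithCompaction(const std::vector<std::size_t> &inRangePerLevel)
{
    return std::accumulate(inRangePerLevel.begin(), inRangePerLevel.end(),
                           std::size_t{0});
}
```

For an assumed table of 10000 instances and two LoD levels, the first function gives 20000 submitted instances per frame. If, say, 300 instances are in range of the first level and 9700 in range of the second, the second function gives 10000, and fewer whenever some instances fall outside every range.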
This is a major performance issue. I sincerely hope that a fix will be available soon.
Steps to Reproduce
- Set up the Scene: Create one large Instance Table and connect at least two models to it: one with a high polygon count and one with a significantly lower polygon count, to ensure a measurable performance difference.
- Configure LoD Parameters: Use the following parameters to set the distance thresholds for instanced LoD switching/culling:
  - instancingLodMin
  - instancingLodMax
- Run the Scene: Place a camera in a position where the culling and LoD switching should be actively occurring (i.e., instances are far away).
- Crucial Step: Monitor GPU Load: Observe the GPU load (e.g., using a profiling tool like RenderDoc, Afterburner, or similar) while the culling system is active.
Expected Behavior
When instances are culled (i.e., fall outside of the instancingLodMax range) or switched to a lower LoD model, the total GPU load should decrease as the number of effective draw calls or rendered triangles is reduced. The engine should pass a reduced instance count to the graphics API (glDrawElementsInstanced or equivalent).
Actual Behavior
The visual culling and model switching work, but the GPU load remains high (or even increases). Profiling shows that the system still forces the GPU to process the original, unfiltered number of instances in the draw calls, leading to poor performance and decreased FPS.
Example
import QtQuick
import QtQuick3D

Node {
    id: root

    property var camera: null
    property var instancing: null

    // Resources
    L0_Jungle_RedTall_Tree_Small {
        instancing: root.instancing
        instancingLodMin: 0
        instancingLodMax: 10
    }

    L1_Jungle_RedTall_Tree_Small {
        instancing: root.instancing
        instancingLodMin: 10
        instancingLodMax: 10000
    }

    // Animations:
}
The problem is in cullLodInstances: this function does not reduce the number of entries in the array. The best solution would be to set count to the number of instances that pass the distance test and to physically remove the unused entries from the lodData array.
static void cullLodInstances(QByteArray &lodData, const void *instances, int count,
                             const QVector3D &cameraPosition,
                             float minThreshold, float maxThreshold)
{
    const QSSGRenderInstanceTableEntry *instance
            = reinterpret_cast<const QSSGRenderInstanceTableEntry *>(instances);
    QSSGRenderInstanceTableEntry *dest
            = reinterpret_cast<QSSGRenderInstanceTableEntry *>(lodData.data());
    for (int i = 0; i < count; ++i) {
        // The instance translation is stored in the w component of each row.
        const float x = cameraPosition.x() - instance->row0.w();
        const float y = cameraPosition.y() - instance->row1.w();
        const float z = cameraPosition.z() - instance->row2.w();
        const float distanceSq = x * x + y * y + z * z;
        if (distanceSq >= minThreshold * minThreshold
            && (maxThreshold < 0 || distanceSq < maxThreshold * maxThreshold))
            *dest = *instance;
        else
            *dest = {}; // culled entry is zeroed but still occupies a slot
        dest++;
        instance++;
    }
}
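The proposed change can be sketched roughly as follows. This is a standalone illustration, not a patch against Qt Quick 3D: `InstanceEntry`, `Vec3`, `Vec4`, and `cullLodInstancesCompacted` are hypothetical stand-ins, and the output uses `std::vector` instead of `QByteArray` so the example compiles without Qt. The distance test is the same as in the function above; the only difference is that rejected instances are skipped instead of being written as empty entries, and the number of kept instances is returned.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for QSSGRenderInstanceTableEntry. Only the w component
// of each row (the instance translation) is used by the distance test.
struct Vec4 { float x = 0.0f, y = 0.0f, z = 0.0f, w = 0.0f; };
struct InstanceEntry { Vec4 row0, row1, row2; };
struct Vec3 { float x, y, z; };

// Copies only the instances whose distance to the camera lies in
// [minThreshold, maxThreshold) into lodData and returns how many were kept.
// A negative maxThreshold means "no upper limit", as in the original check.
std::size_t cullLodInstancesCompacted(std::vector<InstanceEntry> &lodData,
                                      const InstanceEntry *instances,
                                      std::size_t count,
                                      const Vec3 &cameraPosition,
                                      float minThreshold,
                                      float maxThreshold)
{
    assert(minThreshold >= 0.0f);
    lodData.clear();
    lodData.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        const InstanceEntry &instance = instances[i];
        const float x = cameraPosition.x - instance.row0.w;
        const float y = cameraPosition.y - instance.row1.w;
        const float z = cameraPosition.z - instance.row2.w;
        const float distanceSq = x * x + y * y + z * z;
        const bool inRange = distanceSq >= minThreshold * minThreshold
                && (maxThreshold < 0.0f || distanceSq < maxThreshold * maxThreshold);
        if (inRange)
            lodData.push_back(instance); // keep: written to the next free slot
        // Rejected instances are skipped, so they take up no slot at all.
    }
    return lodData.size();
}
```

The caller would then use the returned value, rather than the original count, as the instance count for that LoD level's draw call and upload only that many entries.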
These are the models to test with:
low poly and high poly objects to test.zip