Details
-
Task
-
Resolution: Unresolved
-
P3: Somewhat important
-
None
-
Qt Creator 4.6.0-beta1, Qt Creator 4.7.0-beta1
-
None
Description
DWARF unwinding is rather inefficient: it requires us to take a large stack snapshot for each sample and adds a lot of CPU overhead on the host. This severely limits how many samples we can take in a given time and impacts our ability to do, for example, memory profiling.
There is an ARM-specific debug info format ("exidx", IIRC) that we could process on the target, similar to how frame pointer unwinding works. Then we would only need to transfer a list of pointers to the host, not the whole stack. That would require hacking the kernel, though we might be able to upstream the resulting patches.
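To illustrate why on-target unwinding reduces the data sent to the host, here is a minimal sketch of frame pointer unwinding over a simulated stack. It is not the kernel's or perfparser's code. It assumes 64-bit little-endian words and a frame record of (saved frame pointer, return address) stored at the frame pointer; the actual 32-bit ARM layout differs. The names `unwind_frame_pointers` and `make_demo_stack` and all addresses are hypothetical.

```python
import struct
from typing import List, Tuple

WORD = 8  # assumption: 64-bit words; a 32-bit ARM target would use 4


def unwind_frame_pointers(stack: bytes, stack_base: int, fp: int, pc: int,
                          max_depth: int = 64) -> List[int]:
    """Follow the chain of saved frame pointers and collect return addresses.

    Assumed frame record layout at address fp: [saved fp][return address].
    Returns the call chain innermost-first, starting with the sampled pc.
    """
    frames = [pc]
    stack_top = stack_base + len(stack)
    while len(frames) < max_depth:
        # Stop if the frame record is misaligned or outside the stack.
        if fp % WORD or fp < stack_base or fp + 2 * WORD > stack_top:
            break
        next_fp, return_address = struct.unpack_from("<QQ", stack, fp - stack_base)
        if return_address == 0:
            break
        frames.append(return_address)
        # Frames must move towards higher addresses, otherwise the chain is broken.
        if next_fp <= fp:
            break
        fp = next_fp
    return frames


def make_demo_stack() -> Tuple[bytes, int]:
    """Build a 64-byte fake stack with two frame records (hypothetical addresses)."""
    base = 0x7000
    stack = bytearray(64)
    struct.pack_into("<QQ", stack, 0x10, 0x7030, 0x400200)  # inner frame
    struct.pack_into("<QQ", stack, 0x30, 0, 0x400100)       # outermost frame
    return bytes(stack), base


if __name__ == "__main__":
    stack, base = make_demo_stack()
    chain = unwind_frame_pointers(stack, base, fp=0x7010, pc=0x400300)
    print([hex(a) for a in chain])
```

In this toy case the result is three addresses (24 bytes) instead of the 64-byte snapshot. An exidx-based unwinder on the target would produce the same kind of pointer list, but would look up each frame's layout in the unwind tables instead of relying on a frame pointer chain.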
There is also a technique for capturing up to the last 16 branches at very low cost on certain Intel CPUs (presumably the Last Branch Record, LBR, facility), which might be enough to reconstruct a somewhat usable stack.
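As a rough sketch of how a short branch history could yield a partial stack, the following replays branch records from oldest to newest, pushing call sites and popping them on returns. The record format `(kind, from_addr, to_addr)`, the function name `stack_from_branches`, and all addresses are assumptions made for illustration. Real hardware records look different and may not classify branches this way.

```python
from typing import List, Tuple

# Hypothetical record: (kind, from_addr, to_addr), kind is "call", "ret" or "jump".
Branch = Tuple[str, int, int]


def stack_from_branches(branches: List[Branch], current_pc: int) -> List[int]:
    """Reconstruct a partial call stack from branch records, oldest first.

    Returns addresses innermost-first: the current pc followed by the
    call sites that are still active according to the recorded history.
    """
    call_sites: List[int] = []
    for kind, from_addr, _to_addr in branches:
        if kind == "call":
            call_sites.append(from_addr)
        elif kind == "ret" and call_sites:
            call_sites.pop()
        # A "ret" with no recorded call means the matching call already fell
        # out of the limited history; outer frames are then simply unknown.
        # Plain jumps do not change the call stack and are ignored.
    return [current_pc] + call_sites[::-1]


if __name__ == "__main__":
    history = [
        ("call", 0x100, 0x200),
        ("call", 0x210, 0x300),
        ("ret", 0x320, 0x214),
        ("jump", 0x220, 0x230),
        ("call", 0x240, 0x400),
    ]
    print([hex(a) for a in stack_from_branches(history, current_pc=0x410)])
```

Because the history is bounded, any frame whose call happened more than 16 branches ago is missing from the result, which is why the stack would only be "somewhat usable".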