Details
-
Task
-
Resolution: Unresolved
Description
Nightly health check builds should always pass, so any failure there indicates flakiness.
To monitor this, a Grafana dashboard is needed. The dashboard should display the failures on a timeline, with the failure reasons categorized. The currently known reasons are 'sccache error' and 'malformed universal file'; all remaining failures can be categorized as 'other'.
Modify the Coin log parser and create a separate dashboard to track the number of failed builds:
Every failed build must be checked and the results written to the coin_extra database
Build the dashboard
1) Android emulator fails to start:
Example of a failure:
https://coin.ci.qt.io/coin/api/log/qt/qtlocation/b74c2684acbc187ac0a22400752661f50039c116/LinuxRHEL_8_8x86_64AndroidAndroid_ANYx86GCCqtci-linux-RHEL-8.8-x86_64-50-a6a815AndroidTestRun_Sccache_UseConfigure_WarningsAreErrors/e06e4ea1db753a74190a2e5ee21e1223440f0375/test_1718060160/log.txt.gz
Waiting a few minutes for the emulator to fully boot...
agent:2024/06/10 23:15:53 build.go:404: bootanim= boot_completed= bootcomplete=
agent:2024/06/10 23:15:53 build.go:256: Virtual Memory Total: 28150202368, Free:14611292160, UsedPercent:16.198523%
agent:2024/06/10 23:15:53 build.go:268: | PID| PPID|Status| CPU-%|Command
Example of a success:
Emulator started successfully
https://coin.ci.qt.io/coin/api/log/qt/qtdeclarative/0ca02952cfbe4a736803e2cf1857705e9a46a5a3/LinuxRHEL_8_8x86_64AndroidAndroid_ANYx86GCCqtci-linux-RHEL-8.8-x86_64-50-ce1770AndroidTestRun_Sccache_UseConfigure_WarningsAreErrors/84f278b0c70f8affcdb76615ee671ac1b9c12ee2/test_1718836764/log.txt.gz
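A minimal sketch of how the parser could tell the two cases apart, based only on the two excerpts above. It assumes that a failed start shows a boot-property line with all three values empty and that a successful start always prints "Emulator started successfully"; the function name is hypothetical and the real logs may need more markers.

```python
import re

# Markers copied from the log excerpts above (assumed to be representative).
SUCCESS_MARKER = "Emulator started successfully"
BOOT_PROPS = re.compile(r"bootanim=(\S*) boot_completed=(\S*) bootcomplete=(\S*)")


def emulator_failed_to_start(log_text: str) -> bool:
    """Return True if the log looks like an Android emulator start failure."""
    if SUCCESS_MARKER in log_text:
        return False
    # A boot-property line with three empty values is treated as the
    # failure signal (assumption drawn from the failing example).
    for match in BOOT_PROPS.finditer(log_text):
        if not any(match.groups()):
            return True
    return False
```

Checking the success marker first keeps a log that eventually booted from being flagged just because an earlier poll printed empty properties.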
2) sccache fails:
sccache: error: failed to execute compile
agent:2023/10/31 03:31:37 build.go:404: sccache: caused by: error reading compile response from server
agent:2023/10/31 03:31:37 build.go:404: sccache: caused by: Failed to read response header
agent:2023/10/31 03:31:37 build.go:404: sccache: caused by: An existing connection was forcibly closed by the remote host. (os error 10054)
agent:2023/10/31 03:31:37 build.go:404: ninja: build stopped: subcommand failed.
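A possible way to pull the sccache error and its "caused by" chain out of such a log, for example to show the root cause on the dashboard. This is only a sketch: it assumes one agent entry per line in the raw log and the "agent:<date> <time> <file>:<line>: " prefix seen in the excerpt; the function name is hypothetical.

```python
import re

# Agent prefix as it appears in the excerpt, e.g.
# "agent:2023/10/31 03:31:37 build.go:404: ".
AGENT_PREFIX = re.compile(r"^agent:\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} \S+:\d+: ")
ERROR = "sccache: error: "
CAUSE = "sccache: caused by: "


def sccache_error_chain(log_text: str) -> list[str]:
    """Return the first sccache error followed by its 'caused by' messages."""
    chain: list[str] = []
    for line in log_text.splitlines():
        message = AGENT_PREFIX.sub("", line.strip())
        if not chain:
            # Still looking for the first sccache error line.
            if message.startswith(ERROR):
                chain.append(message[len(ERROR):])
        elif message.startswith(CAUSE):
            chain.append(message[len(CAUSE):])
        else:
            # First line that is not part of the chain ends it.
            break
    return chain
```

An empty list means no sccache error was found; the last element of a non-empty list is the innermost cause.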
Dashboard:
https://testresults.qt.io/grafana/d/edpu33bwhsiyof/build-failures?orgId=1
28 June 2024
Categorization by failure type: pick the first failure
(we will not store precise information if both sccache and the Android emulator fail in the same build/test; such a case is categorized as either sccache or Android)
- sccache must be checked on both failed build and failed test workitems, on all branches
- Android case: failed tests only (since emulator failures cannot be recovered)
- categorization names: sccache_error, android_emulator_start_failure
- run daily at 6 am Finland time; data covered: from 6 am the previous day to 6 am the current day
- inform Rami Potinkara once it is running stably
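
The rules above could be sketched roughly as follows. The marker strings are taken from the log excerpts in the description; `categorize`, `reporting_window`, and the `workitem_kind` values "build" and "test" are hypothetical names, not existing Coin parser API. 'malformed universal file' is left out because no marker text for it is given here.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

SCCACHE = "sccache_error"
ANDROID = "android_emulator_start_failure"
OTHER = "other"

# Assumed log markers, copied from the examples in the description.
MARKERS = {
    SCCACHE: "sccache: error:",
    ANDROID: "bootanim= boot_completed= bootcomplete=",
}


def categorize(log_text, workitem_kind):
    """Categorize one failed workitem by the marker that appears first."""
    hits = []
    for category, marker in MARKERS.items():
        # The Android case is only checked on failed test workitems.
        if category == ANDROID and workitem_kind != "test":
            continue
        position = log_text.find(marker)
        if position != -1:
            hits.append((position, category))
    # Earliest marker wins; anything unmatched falls into 'other'.
    return min(hits)[1] if hits else OTHER


def reporting_window(now, tz=None):
    """Return (start, end): 6 am the previous day to the latest 6 am in tz."""
    tz = tz or ZoneInfo("Europe/Helsinki")  # Finland time
    local_now = now.astimezone(tz)
    end = local_now.replace(hour=6, minute=0, second=0, microsecond=0)
    if local_now < end:
        # Run started before 6 am: use the previous day's boundary.
        end -= timedelta(days=1)
    return end - timedelta(days=1), end
```

A daily run would call `reporting_window(datetime.now(ZoneInfo("Europe/Helsinki")))` once, then `categorize` on the log of every failed workitem that falls inside that window.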