Details
-
Bug
-
Resolution: Done
-
P1: Critical
-
None
-
master, production
-
None
Description
Coin log:
coin.log:19192:2018-07-25 02:49:09,510 ERROR:workitem(2966): Agent 1532438685-1873 is in an invalid state: Watchdog (activated on: 1532474978.7978728) has not been updated for 300s, timeout., buildKey qt/qtbase/3516fabfd8f50fbca3555a2f3d80b16eebf965fb/WindowsWindows_10x86_64WindowsWindows_10x86_64MSVC2015qtci-windows-10-x86_64-10-6c2312DebugAndRelease_Release_ForceDebugInfo_OpenGLDynamic/2a48a3b7dcb9652b1532a3ad14de7580b5b9b1ee/Build, will try another agent
coin.log:19193:2018-07-25 02:49:09,513 DEBUG:workitem(2966): Work item (buildKey: qt/qtbase/3516fabfd8f50fbca3555a2f3d80b16eebf965fb/WindowsWindows_10x86_64WindowsWindows_10x86_64MSVC2015qtci-windows-10-x86_64-10-6c2312DebugAndRelease_Release_ForceDebugInfo_OpenGLDynamic/2a48a3b7dcb9652b1532a3ad14de7580b5b9b1ee/Build) changed state from Running to WaitingForHardware
coin.log:19194:2018-07-25 02:49:09,515 DEBUG:opennebulahardwarepool(2966): Dispose VM: 672577, buildKey: qt/qtbase/3516fabfd8f50fbca3555a2f3d80b16eebf965fb/WindowsWindows_10x86_64WindowsWindows_10x86_64MSVC2015qtci-windows-10-x86_64-10-6c2312DebugAndRelease_Release_ForceDebugInfo_OpenGLDynamic/2a48a3b7dcb9652b1532a3ad14de7580b5b9b1ee/Build, agent: 1532438685-1873
coin.log:20417:2018-07-25 02:53:29,911 INFO:workitem(2966): Agent running 10.215.196.98:49782 (buildKey: qt/qtbase/3516fabfd8f50fbca3555a2f3d80b16eebf965fb/WindowsWindows_10x86_64WindowsWindows_10x86_64MSVC2015qtci-windows-10-x86_64-10-6c2312DebugAndRelease_Release_ForceDebugInfo_OpenGLDynamic/2a48a3b7dcb9652b1532a3ad14de7580b5b9b1ee/Build)
coin.log:20418:2018-07-25 02:53:29,913 DEBUG:workitem(2966): Work item (buildKey: qt/qtbase/3516fabfd8f50fbca3555a2f3d80b16eebf965fb/WindowsWindows_10x86_64WindowsWindows_10x86_64MSVC2015qtci-windows-10-x86_64-10-6c2312DebugAndRelease_Release_ForceDebugInfo_OpenGLDynamic/2a48a3b7dcb9652b1532a3ad14de7580b5b9b1ee/Build) changed state from WaitingForHardware to WaitingForAgent
coin.log:20419:2018-07-25 02:53:29,917 DEBUG:workitem(2966): Work item (buildKey: qt/qtbase/3516fabfd8f50fbca3555a2f3d80b16eebf965fb/WindowsWindows_10x86_64WindowsWindows_10x86_64MSVC2015qtci-windows-10-x86_64-10-6c2312DebugAndRelease_Release_ForceDebugInfo_OpenGLDynamic/2a48a3b7dcb9652b1532a3ad14de7580b5b9b1ee/Build) changed state from WaitingForAgent to Running
coin.log:26383:2018-07-25 03:09:50,847 INFO:storage(30140): upload of UploadModuleBuildArtifact to qt/qtbase/3516fabfd8f50fbca3555a2f3d80b16eebf965fb/WindowsWindows_10x86_64WindowsWindows_10x86_64MSVC2015qtci-windows-10-x86_64-10-6c2312DebugAndRelease_Release_ForceDebugInfo_OpenGLDynamic/2a48a3b7dcb9652b1532a3ad14de7580b5b9b1ee/build_1532474920
coin.log:26384:2018-07-25 03:09:50,850 ERROR:storage(30140): Could not save artifacts to /home/vmbuilder/ci-working-dir/storage/qt/qtbase/3516fabfd8f50fbca3555a2f3d80b16eebf965fb/WindowsWindows_10x86_64WindowsWindows_10x86_64MSVC2015qtci-windows-10-x86_64-10-6c2312DebugAndRelease_Release_ForceDebugInfo_OpenGLDynamic/2a48a3b7dcb9652b1532a3ad14de7580b5b9b1ee/build_1532474920/artifacts.tar.gz: [Errno 11] Resource temporarily unavailable
Traceback (most recent call last):
  File "src/storage.py", line 464, in handle_upload_artifact
    fcntl.lockf(tf, fcntl.LOCK_EX | fcntl.LOCK_NB)
BlockingIOError: [Errno 11] Resource temporarily unavailable
2018-07-25 03:10:41,190 INFO:workitem(2966): Agent FINISHED FAIL: <could not determine failure location - check log!>: qt/qtbase/3516fabfd8f50fbca3555a2f3d80b16eebf965fb/WindowsWindows_10x86_64WindowsWindows_10x86_64MSVC2015qtci-windows-10-x86_64-10-6c2312DebugAndRelease_Release_ForceDebugInfo_OpenGLDynamic/2a48a3b7dcb9652b1532a3ad14de7580b5b9b1ee/Build
When looking at the file system locks, a Python process (PID 29473) still holds a POSIX write lock on the partial artifact file, so the upload is deadlocked:
(env) 13:04:05 /home/vmbuilder/ci-working-dir/logs$ sudo lslocks | grep 1532474920/artifacts.tar.gz
python 29473 POSIX 14.1M WRITE 0 0 0 /home/vmbuilder/ci-working-dir/storage/qt/qtbase/3516fabfd8f50fbca3555a2f3d80b16eebf965fb/WindowsWindows_10x86_64WindowsWindows_10x86_64MSVC2015qtci-windows-10-x86_64-10-6c2312DebugAndRelease_Release_ForceDebugInfo_OpenGLDynamic/2a48a3b7dcb9652b1532a3ad14de7580b5b9b1ee/build_1532474920/artifacts.tar.gz.partial
The code should be fixed so that the lock is released if the agent that is uploading the artifact gets killed.
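One possible shape for the requested behaviour is sketched below. It is an illustration only, not the actual Coin code: the function name save_artifact, the retries and delay parameters, and the retry policy are all hypothetical. The sketch takes the same non-blocking fcntl.lockf lock that appears in the traceback, treats BlockingIOError as a retryable condition instead of letting it propagate, and always releases the lock in a finally block. Note that a POSIX record lock is also dropped by the kernel once the holding process closes the file or exits, so a lock that stays listed in lslocks suggests the process handling the old upload is still alive and would need to be terminated as well.

```python
import fcntl
import os
import time


def save_artifact(data: bytes, final_path: str,
                  retries: int = 3, delay: float = 0.1) -> bool:
    """Write data to final_path through a locked '.partial' file.

    Hypothetical helper: returns True on success and False if another
    process held the lock during every attempt.
    """
    partial_path = final_path + ".partial"
    for _ in range(retries):
        # Open in append mode so that an existing file is not truncated
        # before this process actually owns the lock.
        with open(partial_path, "ab") as tf:
            try:
                fcntl.lockf(tf, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except BlockingIOError:
                # Another process holds the lock: back off and retry
                # instead of failing the whole work item.
                time.sleep(delay)
                continue
            try:
                tf.truncate(0)
                tf.write(data)
                tf.flush()
                os.fsync(tf.fileno())
                # Atomically publish the finished file.
                os.replace(partial_path, final_path)
                return True
            finally:
                # Release the lock even if writing or renaming failed.
                fcntl.lockf(tf, fcntl.LOCK_UN)
    return False
```

A caller would log and reschedule the upload when the function returns False, rather than reporting the build itself as failed.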
Issue Links
For Gerrit Dashboard: QTQAINFRA-2124

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
235317,7 | Handle BlockingIOError while saving artifacts | master | qtqa/tqtc-coin-ci | MERGED | +2 | 0 |
236266,18 | Terminate agent process if workitem does not exist in scheduler | master | qtqa/tqtc-coin-ci | MERGED | +2 | 0 |