Details
-
Bug
-
Resolution: Done
-
P0: Blocker
-
None
-
production
-
None
Description
A couple of hours after every Coin restart, the service stops responding and stops starting up integrations, which means the CI is completely blocked.
Based on analysis, there is reasonable suspicion that the cause is too many threads left behind by the scheduler process. Counting the threads of the scheduler process gives:
vmbuilder@vmbuilder:~/qt-ci$ ps -T -p 25179 | wc -l
4967
Eventually this causes the Python process to refuse to create new threads because it does not have enough resources to do so.
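The thread count above could also be tracked from Python, which may help confirm whether the number keeps growing between restarts. The following is a minimal sketch, not part of Coin: the helper names `count_threads` and `thread_names_summary` are hypothetical, and the first one assumes a Linux host where /proc/<pid>/task is available.

```python
import os
import threading


def count_threads(pid):
    """Return the number of OS threads of a process.

    Assumption: Linux, where each thread has an entry under /proc/<pid>/task.
    Unlike `ps -T ... | wc -l`, this count does not include a header line.
    """
    return len(os.listdir(f"/proc/{pid}/task"))


def thread_names_summary():
    """Count live Python threads in the current process, grouped by name.

    Only sees threads created through the threading module in this process,
    so it has to run inside the suspected process to be useful.
    """
    counts = {}
    for thread in threading.enumerate():
        counts[thread.name] = counts.get(thread.name, 0) + 1
    return counts


if __name__ == "__main__":
    print("OS threads in this process:", count_threads(os.getpid()))
    print("Python threads by name:", thread_names_summary())
```

Logging `count_threads` for the scheduler PID at regular intervals would show whether the count grows steadily. Giving worker threads descriptive names would make the output of `thread_names_summary` point more directly at the code path that leaves them behind.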