Details
- User Story
- Resolution: Out of scope
- P2: Important
Description
Today the root partition on the OpenNebula host filled up: utilisation went from 90% to 100% in one go and things stopped working. This has happened before on other hosts and on shared storage.
As a sysadmin, I would like all servers to normally run with less than 70% disk utilisation, and to be notified when they do not.
Alerts
Every host should be set up automatically to raise alerts:
- yellow alerts if utilisation goes over 80%
- red alerts if it goes over 90%
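A minimal sketch of the two thresholds above, assuming a Python check run on each host. The names `classify` and `percent_used` are hypothetical, "over" is read as strictly greater than, and the percentage is computed from `shutil.disk_usage`, which may differ slightly from what `df` reports on filesystems with reserved blocks.

```python
import shutil

# Thresholds taken from the list above, in percent.
YELLOW_THRESHOLD = 80.0
RED_THRESHOLD = 90.0


def classify(percent):
    """Return "red", "yellow", or None for a utilisation percentage."""
    if percent > RED_THRESHOLD:
        return "red"
    if percent > YELLOW_THRESHOLD:
        return "yellow"
    return None


def percent_used(path="/"):
    """Return the utilisation of the filesystem holding `path`, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    # Check only the root partition here; a real check would loop over
    # every mount point that matters on the host.
    level = classify(percent_used("/"))
    print(level or "ok")
```

Keeping the classification separate from the measurement makes the thresholds easy to test without a full disk.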
How to deliver alerts
- emails
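Email delivery could be sketched as follows, assuming each host can reach an SMTP relay. The function names, the subject format, the default relay `localhost`, and the example addresses are all assumptions for illustration, not an agreed setup.

```python
import smtplib
import socket
from email.message import EmailMessage


def build_alert(level, mount, percent, sender, recipient):
    """Compose an alert email for one mount point on this host."""
    host = socket.gethostname()
    msg = EmailMessage()
    msg["Subject"] = f"[{level.upper()}] {host}: {mount} at {percent:.0f}% disk utilisation"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(
        f"Host: {host}\n"
        f"Mount point: {mount}\n"
        f"Utilisation: {percent:.1f}%\n"
        f"Alert level: {level}\n"
    )
    return msg


def send_alert(msg, smtp_host="localhost"):
    """Hand the message to an SMTP relay (assumed to accept mail from this host)."""
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)


if __name__ == "__main__":
    # Placeholder addresses; printing instead of sending keeps the sketch safe to run.
    alert = build_alert("red", "/", 93.2, "monitor@example.invalid", "sysadmin@example.invalid")
    print(alert)
```

Putting the host name and alert level in the subject line lets recipients filter yellow and red alerts in their mail client.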
How to automate setting this up on each and every host
Puppet, Ansible, CFEngine, cdist?
(feel free to edit description as details are figured out)
Issue Links
relates to:
- QTQAINFRA-3361 qtinfluxdb01 /data partition is full (Closed)
- QTQAINFRA-1851 Cachefilesd needs a restart once in a while (Closed)
- QTQAINFRA-3242 qtinfluxdb01 needs more disk space and swap (Closed)
- QTQAINFRA-1864 Not enough disk space on ON1-OpenNebula (Closed)
- QTQAINFRA-3275 Set up automatic archiving on old stuff from ci-files02 (In Progress)
- QTQAINFRA-3274 Move CI-Master aka vmbuilder to the Compellent (Closed)
- QTQAINFRA-3276 Increase disk size of downloads.qt.io (Closed)
- QTQAINFRA-3277 Increase disk size of master.qt.io (Closed)
- QTQAINFRA-1759 CI system should be automatically monitored (Closed)