Details
- Task
- Resolution: Done
- P2: Important
Description
Monitor health statistics for all of our build and test VMs. Requirements:
- Install a monitoring utility on all of our Tier2 images.
  - Telegraf? It is the one already in use for the host machines.
- The utility must be able to run custom monitoring commands at custom intervals, for example "ioping" on a custom directory, in order to measure the I/O latency.
- Send all statistics to a remote database.
  - Most likely InfluxDB, as it is already used for recording the host machines' metrics.
- Make sure the VMs don't cache any metrics but send them immediately. The build VMs are short-lived by definition: they can be killed the moment something goes wrong, and we definitely don't want to miss those metrics.
- Data retention on the database is of secondary importance; it's OK to delete logs after a month, or even after only a week.
- We'll most likely need to assign a unique hostname to each build VM in Coin.
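A minimal sketch of what a custom monitoring command could look like, assuming the collector can run an external program on its own interval and read InfluxDB line protocol from its stdout (Telegraf's exec input with the "influx" data format works this way, as far as I know). The script does not call "ioping"; it times a single write+fsync in the target directory as a stand-in. The measurement name `io_latency`, the field name `write_fsync_us` and the `host`/`path` tags are hypothetical names chosen for illustration.

```python
import os
import socket
import sys
import tempfile
import time


def escape_tag(value: str) -> str:
    """Escape commas, equals signs and spaces, which are special in tag keys/values."""
    return value.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def measure_write_latency_us(directory: str, size: int = 4096) -> float:
    """Time one write+fsync of `size` bytes to a temporary file in `directory`.

    This is a rough stand-in for an ioping-style probe, not a replacement for it.
    """
    payload = b"\0" * size
    with tempfile.NamedTemporaryFile(dir=directory) as tmp:
        start = time.perf_counter()
        tmp.write(payload)
        tmp.flush()
        os.fsync(tmp.fileno())
        elapsed = time.perf_counter() - start
    return elapsed * 1_000_000  # seconds -> microseconds


def to_line_protocol(measurement: str, tags: dict, fields: dict, timestamp_ns: int) -> str:
    """Format one point as: measurement,tag=value field=value timestamp."""
    tag_part = "".join(
        f",{escape_tag(k)}={escape_tag(v)}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(f"{k}={float(v)}" for k, v in sorted(fields.items()))
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"


if __name__ == "__main__":
    # The directory to probe is passed as the first argument (default: current dir).
    directory = sys.argv[1] if len(sys.argv) > 1 else "."
    line = to_line_protocol(
        "io_latency",  # hypothetical measurement name
        {"host": socket.gethostname(), "path": os.path.abspath(directory)},
        {"write_fsync_us": measure_write_latency_us(directory)},
        time.time_ns(),
    )
    # Print and flush immediately so nothing is buffered if the VM is killed.
    print(line, flush=True)
```

The `host` tag is taken from the VM's hostname, which is why each build VM would need a unique one for its points to be distinguishable in the database.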
Issue Links
- relates to: QTQAINFRA-3089 Implement centralised log aggregation for all hosts/VMs in Coin and OpenNebula (Reported)
Gerrit Reviews
For Gerrit Dashboard: QTQAINFRA-3088

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
268518,54 | Install telegraf on all provisioned VMs | 5.12 | qt/qt5 | ABANDONED | 0 | 0 |
270137,15 | Start telegraf at the beginning of each build | master | qtqa/tqtc-coin-ci | MERGED | +2 | 0 |
270406,12 | Create a new category of provisioning scripts | master | qtqa/tqtc-coin-ci | ABANDONED | -2 | 0 |
270481,14 | Run provisioning scripts in order | master | qtqa/tqtc-coin-ci | ABANDONED | -1 | 0 |
270566,14 | Increase RLIMIT_NOFILE if it's too small | master | qtqa/tqtc-coin-ci | MERGED | +2 | 0 |
274852,3 | Workitem: Save agent ids in storage | master | qtqa/tqtc-coin-ci | ABANDONED | 0 | 0 |
274882,8 | Install telegraf on all provisioned VMs | 5.13 | qt/qt5 | MERGED | +2 | 0 |
275879,13 | Print links to metrics data | master | qtqa/tqtc-coin-ci | MERGED | +2 | 0 |
277056,2 | Update previously committed patch to match the branch's platforms | 5.13 | qt/qt5 | MERGED | +2 | 0 |