  1. Qt Quality Assurance Infrastructure
  2. QTQAINFRA-6450

postgres: need monitoring metrics on what is cached in the shared buffers

Details

    • Task
    • Resolution: Fixed
    • P2: Important
    • Metrics / Test Results

    Description

      I can hear the database sometimes "thrashing" the disk, which leads to slowness. In order to debug why this is happening, I need monitoring metrics:

      • How much of shared_buffers is occupied by tables and how much by indexes
      • How much of it is occupied by each specific table or index
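      One possible way to collect these two numbers is the pg_buffercache extension. The sketch below is only an illustration, not a tested deployment: the query follows the example in the pg_buffercache documentation with relkind added, it assumes the extension is installed in the monitored database, and the names BUFFERCACHE_SQL, KIND_NAMES and summarize are made up for this example. The helper assumes the default 8 kB block size.

```python
# Query adapted from the pg_buffercache documentation example, extended with
# relkind. Assumes CREATE EXTENSION pg_buffercache has been run in this database.
BUFFERCACHE_SQL = """
SELECT n.nspname, c.relname, c.relkind, count(*) AS buffers
FROM pg_buffercache b
JOIN pg_class c
  ON b.relfilenode = pg_relation_filenode(c.oid)
 AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                           WHERE datname = current_database()))
JOIN pg_namespace n ON n.oid = c.relnamespace
GROUP BY n.nspname, c.relname, c.relkind
ORDER BY buffers DESC
"""

# pg_class.relkind codes for the relation kinds that hold data pages.
KIND_NAMES = {"r": "table", "i": "index", "t": "toast", "m": "matview"}


def summarize(rows, block_size=8192):
    """Turn (schema, relname, relkind, buffers) rows into byte counts.

    Returns (per_kind, per_relation): bytes of shared_buffers used by each
    relation kind, and by each schema-qualified relation.
    block_size is an assumption (the PostgreSQL default of 8192 bytes).
    """
    per_kind = {}
    per_relation = {}
    for schema, relname, relkind, buffers in rows:
        kind = KIND_NAMES.get(relkind, "other")
        nbytes = buffers * block_size
        per_kind[kind] = per_kind.get(kind, 0) + nbytes
        per_relation[f"{schema}.{relname}"] = nbytes
    return per_kind, per_relation
```

      The rows would come from running BUFFERCACHE_SQL with whatever database driver is available; summarize itself only does arithmetic, so it can be checked without a database.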

      I suspect that, when this thrashing happens, a particular index has grown too large and evicted everything else from the cache.

      Additional metrics that would be nice:

      • How much of the read/write IO happens for each table/index entity.

      In other words, I want telegraf to send several numbers for each and every table and index.
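      A rough sketch of how the per-entity IO numbers could reach Telegraf, assuming a script run by Telegraf's exec input with the influx data format. The query reads the standard pg_statio_user_tables and pg_statio_user_indexes views, which only cover the read side (blocks read vs. blocks found in shared buffers), so write IO is not covered here. The measurement name pg_relation_io and the helper names are invented for this example.

```python
# Per-relation block reads and buffer hits from the cumulative statistics views.
STATIO_SQL = """
SELECT schemaname, relname AS entity, 'table' AS kind,
       heap_blks_read AS blks_read, heap_blks_hit AS blks_hit
FROM pg_statio_user_tables
UNION ALL
SELECT schemaname, indexrelname, 'index', idx_blks_read, idx_blks_hit
FROM pg_statio_user_indexes
"""


def escape_tag(value):
    """Escape the characters that are special in line-protocol tag values."""
    for ch in (",", "=", " "):
        value = value.replace(ch, "\\" + ch)
    return value


def to_line(measurement, tags, fields):
    """Format one InfluxDB line-protocol record with integer fields."""
    tag_part = ",".join(
        f"{key}={escape_tag(str(val))}" for key, val in sorted(tags.items())
    )
    field_part = ",".join(
        f"{key}={int(val)}i" for key, val in sorted(fields.items())
    )
    return f"{measurement},{tag_part} {field_part}"


def statio_lines(rows):
    """Convert (schema, entity, kind, blks_read, blks_hit) rows to lines."""
    return [
        to_line(
            "pg_relation_io",  # hypothetical measurement name
            {"schema": schema, "entity": entity, "kind": kind},
            {"blks_read": blks_read, "blks_hit": blks_hit},
        )
        for schema, entity, kind, blks_read, blks_hit in rows
    ]
```

      The counters are cumulative, so a rate over time would have to be derived on the dashboard side.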

      TODO

      1. Create a PostgreSQL user with only monitoring rights, and use it in Telegraf
      2. Write queries that do the monitoring (see comments below), run them regularly in Telegraf
      3. Create Grafana dashboards for these new metrics
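      For step 1, a minimal sketch of the statements that might be used, assuming PostgreSQL 10 or later, where the predefined pg_monitor role exists and is allowed to read pg_buffercache by default. The role name telegraf and the helper names are placeholders, and authentication (password or pg_hba.conf setup) is deliberately left out.

```python
def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def monitoring_role_sql(role="telegraf"):
    """Return the statements that set up a monitoring-only login role.

    The role name is a placeholder. CREATE EXTENSION has to be run by a
    superuser in the database that is going to be monitored.
    """
    ident = quote_ident(role)
    return [
        "CREATE EXTENSION IF NOT EXISTS pg_buffercache",
        f"CREATE ROLE {ident} LOGIN",
        f"GRANT pg_monitor TO {ident}",
    ]


if __name__ == "__main__":
    # Print the statements so they can be reviewed before being run in psql.
    for statement in monitoring_role_sql():
        print(statement + ";")
```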

          People

            jimis Dimitrios Apostolou