  1. Qt Quality Assurance Infrastructure
  2. QTQAINFRA-6444

testresults influx logical databases: maintenance, identification and cleanup


Details

    • Task
    • Resolution: Unresolved
    • Not Evaluated

    Description

       

      InfluxDB on testresults still serves as the main database for displaying test results in Grafana (it provides data for the fastcheck, slowcheck, flaky failed and blacklisted dashboards). While we move to PostgreSQL, we still use it and need to maintain it.
      The influx instance can host several logical databases, which can be listed with the "show databases" query. The Grafana dashboards use the coin and coin_extra databases, and probably other databases related to benchmarks.
      > show databases
      name: databases
      name


      _internal
      demo
      test
      feature_system
      qmlbench -> should be cleaned see comment
      eriks_playground
      qtest_benchmarks
      coin
      erik_test
      creator_benchmarks
      core_benchmarks
      qt3dstudio_runtime_benchmarks
      QtWaylandTests
      qmlbench_NDA_devices
      3dstudio
      qtquick3dTests
      coin_test
      qmlbench_archive -> to be moved somewhere else
      restage_statistics
      restage_statistics2
      coin_extra
      coin_capacity
      audun_playground
      juho_test
      qmlbench_boot2qt
      jahelaak_personal
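
The listing above could also be fetched and sorted automatically. The following is a minimal sketch, assuming the instance is InfluxDB 1.x with its HTTP /query endpoint reachable and that no authentication is needed (both are assumptions, not confirmed in this ticket). The function names and the KNOWN_IN_USE set are hypothetical; the set simply mirrors the "in use" databases named in this ticket.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Databases this ticket lists as in use; everything else has unknown status.
KNOWN_IN_USE = {"coin", "coin_extra", "coin_capacity"}


def parse_show_databases(payload: dict) -> list[str]:
    """Extract database names from an InfluxDB 1.x /query JSON response."""
    names = []
    for result in payload.get("results", []):
        for series in result.get("series", []):
            names.extend(row[0] for row in series.get("values", []))
    return names


def split_by_status(names, in_use=KNOWN_IN_USE):
    """Return (in_use, unknown) lists, skipping InfluxDB's own _internal database."""
    used = [n for n in names if n in in_use]
    unknown = [n for n in names if n not in in_use and n != "_internal"]
    return used, unknown


def fetch_databases(base_url: str) -> list[str]:
    """Run SHOW DATABASES over HTTP; base_url is a placeholder such as http://host:8086."""
    url = f"{base_url}/query?{urlencode({'q': 'SHOW DATABASES'})}"
    with urlopen(url, timeout=10) as resp:
        return parse_show_databases(json.load(resp))
```

Calling split_by_status(fetch_databases(base_url)) would give the list of databases whose owners still need to be identified.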

       Recently the influx instance ran out of memory and kept resetting itself, so the dashboards displayed no data. This happens when we store too much data and influx uses up all available memory. To recover, we deleted data from before 01/01/2023, as tracked in https://bugreports.qt.io/browse/QTQAINFRA-6421. That ticket, however, covers only the 'coin' databases. We do not check how much data is stored in, or written to, the other databases. Some of them contain project benchmark data, and some contain private (individual) data. Ideally, private databases should live on a separate influx instance, because a process we do not know about can write too much data and stop influx from working. We should identify who uses each database and remove the databases that are not used. Private databases should be moved to another influx instance.

      1) Identify which logical databases on the testresults influxdb are in use, and which have unknown status:
      daniel.smith, ausutter 

       

      in use:

      coin - coin tasks data
      coin_extra - results of preprocessing of coin data, source of information for blacklisted, flaky summary dashboards
      coin_capacity - coin capacity data, retention policy of 31 days

       

      not in use

       

       

      2) We should also check which processes use influx (especially those issuing write queries). Requests are logged at:
      /var/log/influxdb/influxd.log
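
One way to find the writers is to count write requests per client and database in that log. This is a rough sketch, assuming the log contains InfluxDB 1.x style [httpd] access-log entries with the request line in quotes; the exact line format on the testresults host is an assumption and should be checked against a real entry first. The names WRITE_RE, DB_RE and count_writes are hypothetical.

```python
import re
from collections import Counter

# Assumed shape of an HTTP access-log entry, for example:
#   [httpd] 10.0.0.5 - coin [01/Oct/2024:10:00:00 +0000] "POST /write?db=coin&precision=ns HTTP/1.1" 204 0 ...
WRITE_RE = re.compile(r'\[httpd\] (?P<client>\S+) .*?"POST /write\?(?P<query>\S*) HTTP')
DB_RE = re.compile(r'(?:^|&)db=([^&]+)')


def count_writes(lines):
    """Count write requests per (client address, database) pair."""
    counts = Counter()
    for line in lines:
        m = WRITE_RE.search(line)
        if not m:
            continue  # not a write request, or not an access-log line
        db = DB_RE.search(m.group("query"))
        counts[(m.group("client"), db.group(1) if db else "?")] += 1
    return counts
```

Passing the open log file to count_writes and looking at most_common(10) on the result would show the heaviest writers first.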

      3) We should create rules, a script, or a calendar event or task for regular archiving and deletion of data. The exact procedure was already created by jimis.
      I suggest doing the next archival around Christmas 2024, when influx usage is lower than usual. We should do it in advance, before influx gets bloated and unusable. So far we have done it roughly once a year. The process is documented at https://bugreports.qt.io/browse/QTQAINFRA-3501.
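
To illustrate only the deletion part, here is a small sketch that builds InfluxQL statements for a cutoff date and for a retention policy. It is not the procedure from QTQAINFRA-3501, which also covers archiving; the function names are hypothetical, and the statements would still have to be run against the chosen database (for example from the influx shell) after the archive has been verified.

```python
from datetime import datetime, timezone


def delete_before_statement(cutoff: datetime) -> str:
    """Build an InfluxQL DELETE that drops all points older than cutoff (given as UTC)."""
    stamp = cutoff.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"DELETE WHERE time < '{stamp}'"


def retention_policy_statement(db: str, name: str, days: int) -> str:
    """Build a CREATE RETENTION POLICY statement so data in that policy expires on its own."""
    return f'CREATE RETENTION POLICY "{name}" ON "{db}" DURATION {days}d REPLICATION 1'
```

A retention policy, like the 31-day one on coin_capacity, only expires data written into that policy, so it complements the yearly cleanup rather than replacing it.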

      4) Monitoring influx availability - this can be done in many ways, but I believe it is the CI/QA team that should discover downtime, not developers.
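
As one possible approach, a periodic job could probe the /ping endpoint, which InfluxDB 1.x answers with HTTP 204 when it is up. This is a minimal sketch under that assumption; influx_is_up is a hypothetical name, and the base URL, schedule and alerting channel are left open.

```python
from urllib.error import URLError
from urllib.request import urlopen


def influx_is_up(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if the /ping endpoint answers with 204 within the timeout."""
    try:
        with urlopen(f"{base_url}/ping", timeout=timeout) as resp:
            return resp.status == 204
    except (URLError, OSError):
        # Connection refused, timeout, DNS failure or an HTTP error status.
        return False
```

Run from cron or a CI job, a False result could notify the CI/QA team before developers notice empty dashboards.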

       


          People

            anwojcie Anna Wojciechowska