  1. Coin
  2. COIN-1233

Add low-level execution and queue metrics to coin

Details

    • Task
    • Resolution: Unresolved
    • P2: Important
    • None
    • None
    • Other
    • None

    Description

      Our current metrics collection for coin focuses on macro statistics and OS-reported data. This has sometimes helped us monitor overall system health, but it gives little insight into coin internals, and the system still hiccups on unidentified processes for unclear reasons.

      To help with this, we can wrap a number of functions in coin, in both the server and the agent, to report execution time, failure counts for intermittent actions, RPC latency, and similar metrics using Prometheus. A Prometheus server collects these stats and provides time-series data for them.
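
      The wrapping approach described above could look roughly like the following Python sketch. It is only an illustration under stated assumptions: the decorator name `instrumented`, the helper `snapshot`, the metric name `agent_run_job`, and the function `run_job` are all hypothetical and not taken from coin. The in-memory dictionaries stand in for the counter and histogram types that a Prometheus client library would normally provide, so the sketch runs with the standard library alone.

```python
import functools
import threading
import time
from collections import defaultdict

# Hypothetical in-memory store. A real metrics module would more likely use
# counter and histogram types from a Prometheus client library instead.
_lock = threading.Lock()
_call_seconds = defaultdict(list)  # metric name -> observed durations (seconds)
_failures = defaultdict(int)       # metric name -> number of raised exceptions


def instrumented(name):
    """Wrap a function to record its execution time and failure count."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            except Exception:
                # Count the failure, then let the caller see the exception.
                with _lock:
                    _failures[name] += 1
                raise
            finally:
                # Record the duration for successful and failed calls alike.
                elapsed = time.perf_counter() - start
                with _lock:
                    _call_seconds[name].append(elapsed)
        return wrapper
    return decorator


def snapshot():
    """Return call count, total seconds, and failure count per metric name."""
    with _lock:
        names = set(_call_seconds) | set(_failures)
        return {
            n: {
                "calls": len(_call_seconds[n]),
                "seconds_total": sum(_call_seconds[n]),
                "failures": _failures[n],
            }
            for n in names
        }


@instrumented("agent_run_job")  # hypothetical metric name
def run_job(fail=False):
    """Stand-in for a coin function worth measuring."""
    if fail:
        raise RuntimeError("job failed")
    return "ok"
```

      Calling `run_job()` once successfully and once with `fail=True` would leave `snapshot()["agent_run_job"]` reporting two calls and one failure. Keeping the instrumentation in a decorator means the wrapped functions themselves need no changes, which may make it easier to swap the storage for a real Prometheus client later.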

      This work also requires installing new software and creating new VMs, which will be tracked in other tickets. This ticket covers only the development of a metrics module for coin, in Golang and Python as needed.

      Attachments

        Activity

          People

            daniel.smith Daniel Smith
            daniel.smith Daniel Smith
            Votes: 0
            Watchers: 1

            Dates

              Created:
              Updated:

              Gerrit Reviews

                There are no open Gerrit changes