Details

    • Sub-task
    • Resolution: Out of scope
    • P1: Critical
    • 2012q4
    • Performance tracking
    • None

    Description

      To reduce BM database growth, we could consider storing only the significant changes of each time series.

      Instead of storing every raw observation, we would store only the significant changes, together with statistics summarizing what is reported between changes. When the median observation of a new complete snapshot does not differ significantly from the median observation of the snapshot of the current change, the statistics (i.e. various sums and counts) associated with the current change are updated and the observations themselves are discarded. Otherwise, the new snapshot starts a new change.

      Information to keep for each change could typically include the following:

      • base value
      • base snapshot (SHA-1 and first upload timestamp)
      • last snapshot (SHA-1 and first upload timestamp)
      • snapshot count: sshot_count
      • observation count: obs_count (mean observation count per snapshot: obs_count / sshot_count)
      • sum of observations: obs_sum (mean observation: obs_sum / obs_count)
      • sum of median observations: med_obs_sum (mean median observation per snapshot: med_obs_sum / sshot_count)
      • sum of relative standard errors (RSEs): rse_sum (mean RSE per snapshot: rse_sum / sshot_count)
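
      The per-change record listed above could be sketched roughly as follows. This is only an illustration, not an agreed schema: the class name Change, the helpers absorb, means and start_change, and the choice of the base snapshot's median as the base value are all assumptions. The RSE is passed in precomputed, since how it is derived is not specified here.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Change:
    """One significant change plus running statistics (hypothetical layout)."""
    base_value: float      # assumed: median observation of the base snapshot
    base_snapshot: tuple   # (SHA-1, first upload timestamp)
    last_snapshot: tuple   # (SHA-1, first upload timestamp)
    sshot_count: int = 0
    obs_count: int = 0
    obs_sum: float = 0.0
    med_obs_sum: float = 0.0
    rse_sum: float = 0.0

    def absorb(self, snapshot_id, observations, rse):
        """Fold one complete snapshot into the sums and counts.

        The raw observations are not stored anywhere after this call.
        """
        self.last_snapshot = snapshot_id
        self.sshot_count += 1
        self.obs_count += len(observations)
        self.obs_sum += sum(observations)
        self.med_obs_sum += median(observations)
        self.rse_sum += rse

    def means(self):
        """Derived means, computed from the stored sums and counts."""
        return {
            "obs_per_snapshot": self.obs_count / self.sshot_count,
            "observation": self.obs_sum / self.obs_count,
            "median_per_snapshot": self.med_obs_sum / self.sshot_count,
            "rse_per_snapshot": self.rse_sum / self.sshot_count,
        }


def start_change(snapshot_id, observations, rse):
    """Open a new change whose base and last snapshot are the given one."""
    change = Change(median(observations), snapshot_id, snapshot_id)
    change.absorb(snapshot_id, observations, rse)
    return change
```

      Storing sums and counts rather than means keeps each update a constant-time addition, and every mean in the list can still be recovered on demand.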

      A difference tolerance DT has to be defined in order to determine when a change is considered significant. Two values v1 and v2 are significantly different iff (v1 / v2 > DT) or (v2 / v1 > DT).

      The difference tolerance could typically be 1.1 by default, but could be overridden for individual time series (e.g. high-priority benchmarks could have a smaller tolerance to capture more changes).
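
      As a minimal sketch, the significance test and the per-series override might look like this. The names DEFAULT_DT, DT_OVERRIDES, tolerance_for and the series name are hypothetical, and the sketch assumes all values are positive, since the ratio test is not meaningful otherwise.

```python
DEFAULT_DT = 1.1

# Hypothetical per-series overrides; the series name is made up.
DT_OVERRIDES = {"high_priority_benchmark": 1.02}


def tolerance_for(series_name):
    """Return the difference tolerance for a series, falling back to the default."""
    return DT_OVERRIDES.get(series_name, DEFAULT_DT)


def is_significant(v1, v2, dt=DEFAULT_DT):
    """True iff (v1 / v2 > dt) or (v2 / v1 > dt).

    Only the larger of the two ratios can exceed a tolerance >= 1, so a
    single comparison of max / min is equivalent to the two-sided test.
    """
    if v1 <= 0 or v2 <= 0:
        raise ValueError("the ratio test assumes positive values")
    return max(v1, v2) / min(v1, v2) > dt
```

      With the default tolerance, 100 versus 105 is not a significant difference, but the same pair is significant for a series whose tolerance is overridden to 1.02.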

      The fewer significant changes a time series contains, the greater the compression potential. The corner case of the value changing significantly on every snapshot is not assumed to be likely in practice.
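
      To illustrate how the number of stored changes depends on the data, the following sketch reduces a sequence of per-snapshot medians to one entry per significant change. The function name compress and the [base_value, sshot_count] pair layout are assumptions made for this example, and each new median is compared against the base value of the current change.

```python
def compress(medians, dt=1.1):
    """Return [base_value, sshot_count] pairs, one per significant change.

    Assumes positive medians. Each median is compared against the base
    value of the current change, not against the previous median.
    """
    changes = []
    for m in medians:
        if changes:
            base = changes[-1][0]
            if not (m / base > dt or base / m > dt):
                changes[-1][1] += 1  # same change: only update the count
                continue
        changes.append([m, 1])  # significant difference: start a new change
    return changes


stable = compress([100, 101, 99, 102, 100, 98])   # best case: one change
noisy = compress([100, 120, 100, 120])            # corner case: one change per snapshot
```

      In the stable series six snapshots collapse into a single change, while in the noisy series nothing is saved because every snapshot starts a new change.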

      Note that the new database structure will affect most of the current features of the BM tool. Most of them will, however, become easier to implement, since it is often the significant changes of a time series that we are interested in anyway.

          People

            bdo Jo Asplin (Inactive)

              Time Tracking

                Original Estimate: 2 weeks
                Remaining Estimate: 2 weeks
                Time Spent: Not specified
