Coin / COIN-969

Reduce time required for each integration


Details

    Type: User Story
    Resolution: Unresolved
    Priority: P2: Important

Description

Our integrations take a very long time. This is why we constantly have to fight for CPU time on the build and test machines, and why we cannot afford to properly test all downstream modules that may be affected by a change. It is probably also why we often see failures after OS and compiler updates: I still don't know what the exact procedure for phasing those in is, but since we generally have trouble getting a submodule update through at all, I assume we don't do a complete test round for each platform upgrade.

I can think of several approaches we could take to reduce integration times:

1. Rank test functions by failure rate. The ones that fail often are either flaky or canaries for common mistakes. The flaky ones should be fixed, and the canaries should be run first so that integrations fail faster when the mistake is repeated (see the first sketch after this list).
2. Produce code coverage vs. run time statistics for each test function. Test functions with low code coverage and high run time should be investigated to see whether they are worth the effort (also covered by the first sketch below).
3. Rank test functions by code coverage and then examine which test functions still provide additional coverage over the cumulative coverage of the previous ones. The ones that don't provide additional coverage should be investigated as possibly redundant; mind that they might still test important edge cases not visible via pure coverage, though (see the second sketch below).
4. Once we've identified suitable canaries in all modules, we might start running downstream canary tests on upstream changes. This would reduce the submodule update churn.
5. More aggressively blacklist tests and, in turn, wind down the repeat-on-failure protocol. The whole idea that a test should be repeated if it fails is misguided: tests that randomly fail don't provide any insight into the quality of the changes at hand, so there's no point in repeating them.
6. Run the tests of one integration in parallel over multiple VMs. We did have this once, IIRC? It could be done much more aggressively (see the third sketch below).
7. Produce statistics on which platforms take the longest to integrate for each module and which tests are responsible for that. Then, together with the other data, determine whether it's worth running those tests on those specific platforms.
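
A minimal sketch of ideas 1 and 2, assuming per-test statistics (total runs, failures, average runtime, covered lines) can be exported from the CI results database. The TestStats record, the function names, and the 10-lines-per-second threshold are all hypothetical, not existing Coin code:

    # Sketch for ideas 1 and 2. TestStats is hypothetical; real numbers
    # would come from the CI results database.
    from dataclasses import dataclass

    @dataclass
    class TestStats:
        name: str
        runs: int            # executions across past integrations
        failures: int        # executions that failed
        runtime: float       # average wall-clock seconds per run
        covered_lines: int   # lines covered when run in isolation

    def failure_rate(t: TestStats) -> float:
        return t.failures / t.runs if t.runs else 0.0

    def rank_canaries(tests: list[TestStats]) -> list[TestStats]:
        # Frequent failers first: flaky ones to fix, canaries to run
        # early so a repeated mistake fails the integration quickly.
        return sorted(tests, key=failure_rate, reverse=True)

    def low_value_tests(tests, min_lines_per_second=10.0):
        # Idea 2: little coverage bought per second of runtime makes a
        # test a candidate for investigation.
        return [t for t in tests
                if t.runtime > 0
                and t.covered_lines / t.runtime < min_lines_per_second]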
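
Idea 3 is essentially a greedy set-cover pass over per-test coverage data. A sketch, assuming each test's coverage is available as a set of (file, line) pairs; the coverage input format and the function name are hypothetical:

    # Sketch for idea 3: order tests by the additional coverage each one
    # provides; tests that add nothing new are redundancy candidates.
    def rank_by_marginal_coverage(coverage):
        # coverage: dict mapping test name -> set of (file, line) pairs
        remaining = dict(coverage)
        covered = set()
        ordered, redundant = [], []
        while remaining:
            # Greedily pick the test adding the most uncovered lines.
            best = max(remaining, key=lambda t: len(remaining[t] - covered))
            if not remaining[best] - covered:
                # Nothing left adds coverage; the rest only repeat what
                # earlier tests cover (but may still hit edge cases).
                redundant.extend(remaining)
                break
            covered |= remaining.pop(best)
            ordered.append(best)
        return ordered, redundant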
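
For idea 6, distributing one integration's tests over N VMs is a scheduling problem, and longest-processing-time-first is a common greedy heuristic for it. A sketch, assuming average per-test runtimes are known; durations and shard_tests are hypothetical names:

    # Sketch for idea 6: spread one integration's tests over n_vms
    # shards, assigning the longest tests first to the shard with the
    # least accumulated work (LPT heuristic).
    import heapq

    def shard_tests(durations, n_vms):
        # durations: dict mapping test name -> average seconds
        shards = [[] for _ in range(n_vms)]
        heap = [(0.0, i) for i in range(n_vms)]  # (total seconds, shard)
        heapq.heapify(heap)
        for name, secs in sorted(durations.items(),
                                 key=lambda kv: kv[1], reverse=True):
            load, i = heapq.heappop(heap)
            shards[i].append(name)
            heapq.heappush(heap, (load + secs, i))
        return shards

Combined with the per-platform timing statistics from idea 7, the same sharding could be computed separately for each platform.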


            People

              jujokini Jukka Jokiniva
              ulherman Ulf Hermann
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

Dates

    Created:
    Updated:
