  1. Coin
  2. COIN-902

"Coin failed to acquire a virtual machine" should not happen so frequently

    Details

    • Type: Bug
    • Status: Open
    • Priority: P1: Critical
    • Resolution: Unresolved
    • Affects Version/s: master
    • Fix Version/s: None
    • Component/s: Scheduler
    • Labels:
      None

      Description

      We have many failures where an integration gets canceled because a machine cannot be acquired for some work item.
      These failures waste both CI capacity and developer time due to constant re-staging, which delays important integrations, especially before release time.

      Here's a recent failure:

      https://codereview.qt-project.org/c/qt/qtwebengine/+/424421/4#message-e16a8a92f6511412a5c1ed6d5f78e497a9d77cef
      https://testresults.qt.io/coin/integration/qt/qtwebengine/tasks/1659597034

      My suggestion would be that instead of failing with that error message, we should either increase the timeout threshold, or put the job back into the queue so it is retried later.

      But failing a whole integration just because a machine was not acquired within 2-4 hours is unreasonable.
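
      To illustrate the "put it back into the queue" option, here is a rough sketch. It is not based on Coin's actual scheduler code: WorkItem, AcquireTimeout, run_queue, the acquire callback and max_attempts are all hypothetical names made up for this example. The idea is only that an acquisition timeout sends the work item to the back of the queue, and the integration fails only after a bounded number of attempts.

```python
import collections
from dataclasses import dataclass

# All names below are hypothetical; none are taken from Coin's code base.


class AcquireTimeout(Exception):
    """Raised when no machine became available within the timeout."""


@dataclass
class WorkItem:
    name: str
    attempts: int = 0


def run_queue(items, acquire, max_attempts=3):
    """Process work items, requeueing those whose machine acquisition times out.

    `acquire` is a callable that takes a WorkItem and raises AcquireTimeout
    when no machine could be acquired. Returns (completed, failed) as lists
    of item names.
    """
    queue = collections.deque(items)
    completed, failed = [], []
    while queue:
        item = queue.popleft()
        item.attempts += 1
        try:
            acquire(item)
        except AcquireTimeout:
            if item.attempts < max_attempts:
                # Retry later, behind the items that are still waiting.
                queue.append(item)
            else:
                # Only give up once the retry budget is exhausted.
                failed.append(item.name)
            continue
        completed.append(item.name)
    return completed, failed
```

      With something like this, a single slow acquisition would delay one work item instead of canceling the whole integration, and max_attempts would still cap how long a stuck item can linger.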

            People

            Assignee:
            tosaario Toni Saario
            Reporter:
            alexandru.croitor Alexandru Croitor
            Votes:
            0
            Watchers:
            3

              Dates

              Created:
              Updated:

                Gerrit Reviews

                There are no open Gerrit changes