Qt Quality Assurance Infrastructure / QTQAINFRA-928

Repeated disconnects of Windows Jenkins slaves slow down continuous integration

Details

    • Bug
    • Resolution: Out of scope
    • P0: Blocker
    • None
    • 2014q1
    • Jenkins
    • None

    Description

      A substantial number of integration attempts abort on Windows slaves with the following exception:

      hudson.remoting.ChannelClosedException: channel is already closed
      	at hudson.remoting.Channel.send(Channel.java:549)
      	at hudson.remoting.Request.call(Request.java:129)
      	at hudson.remoting.Channel.call(Channel.java:751)
      	at hudson.EnvVars.getRemote(EnvVars.java:405)
      	at hudson.model.Computer.getEnvironment(Computer.java:942)
      	at jenkins.model.CoreEnvironmentContributor.buildEnvironmentFor(CoreEnvironmentContributor.java:29)
      	at hudson.model.Run.getEnvironment(Run.java:2267)
      	at hudson.model.AbstractBuild.getEnvironment(AbstractBuild.java:905)
      	at hudson.matrix.MatrixRun$MatrixRunExecution.getParentWorkspaceLease(MatrixRun.java:156)
      	at hudson.matrix.MatrixRun$MatrixRunExecution.decideWorkspace(MatrixRun.java:169)
      	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:517)
      	at hudson.model.Run.execute(Run.java:1759)
      	at hudson.matrix.MatrixRun.run(MatrixRun.java:146)
      	at hudson.model.ResourceController.execute(ResourceController.java:89)
      	at hudson.model.Executor.run(Executor.java:240)
      Caused by: java.io.IOException: Connection aborted: org.jenkinsci.remoting.nio.NioChannelHub$MonoNioTransport@7602ce78[name=ci-win7-x86-25]
      	at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.abort(NioChannelHub.java:211)
      	at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:631)
      	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
      	at java.lang.Thread.run(Thread.java:679)
      Caused by: java.io.IOException: Connection reset by peer
      	at sun.nio.ch.FileDispatcher.read0(Native Method)
      	at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
      	at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:251)
      	at sun.nio.ch.IOUtil.read(IOUtil.java:224)
      	at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:254)
      	at org.jenkinsci.remoting.nio.FifoBuffer$Pointer.receive(FifoBuffer.java:136)
      	at org.jenkinsci.remoting.nio.FifoBuffer.receive(FifoBuffer.java:306)
      	at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:564)
      	... 7 more
      

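      The innermost cause in the trace is a "Connection reset by peer" raised while the master reads from the slave's socket, but it is unfortunately not evident at this point what exactly makes the slave drop the connection.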

      In the last 10 days we have had 375 attempted builds in the CI system. Of those, 134 (roughly 36%) aborted due to this exception.
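
      A count like the one above could be reproduced from saved console logs with a small script. The following is only a sketch: it assumes each build's console output is available as a string, and the function name summarize is made up for illustration. The two patterns it matches (the exception class name and the [name=...] suffix) are taken from the trace quoted in this report.

```python
import re

# Signature strings copied from the stack trace in this report.
CHANNEL_CLOSED = "hudson.remoting.ChannelClosedException"
NODE_NAME = re.compile(r"\[name=([^\]]+)\]")


def summarize(logs):
    """Hypothetical helper: count logs that contain the exception.

    Returns (aborted, share, per_node), where share is the fraction of
    all logs that contain the exception and per_node maps a slave name
    to the number of affected logs naming it.
    """
    aborted = 0
    per_node = {}
    for text in logs:
        if CHANNEL_CLOSED not in text:
            continue
        aborted += 1
        match = NODE_NAME.search(text)
        node = match.group(1) if match else "unknown"
        per_node[node] = per_node.get(node, 0) + 1
    share = aborted / len(logs) if logs else 0.0
    return aborted, share, per_node


# The share reported above, computed from the two counts in this report.
reported_share = 134 / 375
```

      Grouping by slave name may help show whether the disconnects cluster on a few machines or are spread evenly across the Windows slaves.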

          People

            tosaraja Tony Sarajärvi
            shausman Simon Hausmann