Buildbot slaves priority

Problem

I have set up a latent slave in Buildbot to help avoid congestion. My builds are configured to run on either the permanent slave or the latent one. The idea is that the latent slave is woken up only when needed, but Buildbot selects one slave or the other at random, so sometimes I have to wait for the latent slave to wake up even though the permanent one is idle.

Is there a way to prioritize buildbot slaves?

Attempted solutions

1. Custom nextSlave

Following @david-dean's suggestion, I've created a nextSlave function as follows (updated to working version):

import random
import traceback

from twisted.python import log


def slave_selector(builder, slave_builders):
    # Prefer 'host-slave'; use 'support-slave' only when it has fewer
    # running builds than the host, or when the host is not available.
    try:
        host = None
        support = None
        for slave_builder in slave_builders:
            if slave_builder.slave.slavename == 'host-slave':
                host = slave_builder
            elif slave_builder.slave.slavename == 'support-slave':
                support = slave_builder

        if host and support and len(support.slave.slave_status.runningBuilds) < len(host.slave.slave_status.runningBuilds):
            log.msg('host-slave has many running builds, launching build in support-slave')
            return support
        if not support:
            log.msg('no support slave found, launching build in host-slave')
        elif not host:
            log.msg('no host slave found, launching build in support-slave')
            return support
        else:
            log.msg('launching build in host-slave')
        return host
    except Exception as e:
        log.err(str(e))
        log.err(traceback.format_exc())
        log.msg('Selecting random slave')
        # Fall back to a random candidate, or None if the list is empty.
        return random.choice(slave_builders) if slave_builders else None

And then passed it to BuilderConfig. The result is that I get this in twistd.log:

2014-04-28 11:01:45+0200 [-] added buildset 4329 to database

But the build never starts: in the web UI it always appears as Pending, and none of the log messages I added show up in twistd.log.

2. Trying to mimic default behavior

I've had a look at the Buildbot code to see how this is done by default. In the file ./master/buildbot/process/buildrequestdistributor.py, class BasicBuildChooser, you have:

self.nextSlave = self.bldr.config.nextSlave
if not self.nextSlave:
    self.nextSlave = lambda _,slaves: random.choice(slaves) if slaves else None

So I've set exactly that lambda function in my BuilderConfig, and I get exactly the same result: the build does not start.
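
Written out as a named function, the default quoted above amounts to the following. This is only a standalone sketch: `default_next_slave` is a hypothetical name, and plain strings stand in for the real candidate objects so that it runs without a Buildbot master.

```python
import random


def default_next_slave(builder, slaves):
    """Pick any candidate at random; return None when the list is empty."""
    return random.choice(slaves) if slaves else None


# Strings stand in for the real candidate objects (assumption for illustration).
print(default_next_slave(None, ['host-slave', 'support-slave']))
print(default_next_slave(None, []))
```

Because the choice is uniform, the permanent and the latent slave are equally likely to be picked, which matches the behavior described in the problem statement.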

Mange answered 25/4, 2014 at 8:46 Comment(6)
Related question - Mange
Have you properly reconfigured (or, preferably, restarted) the master when making these changes? - Nigro
Yes; otherwise it would not "hang", it would just behave as it did before the code changes. - Mange
OK, reconfiguring applied the changes but did not make Twisted show the logs. After restarting I can debug my function. - Mange
Did you finally resolve this issue? What does the working nextSlave function look like? - Schonthal
Yes, the text before the code says "updated to working version". - Mange

You can set up a nextSlave function to assign slaves to a builder in a custom manner; see: http://docs.buildbot.net/current/manual/cfg-builders.html#builder-configuration
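
A minimal sketch of such a function, assuming (as in the question's code) that each candidate in the list exposes its slave name as `.slave.slavename`. The names `PRIORITY` and `prioritized_next_slave` and the stand-in classes `FakeSlave` and `FakeSlaveBuilder` are hypothetical; they exist only so the example runs without a Buildbot master.

```python
from collections import namedtuple

# Stand-ins for Buildbot's objects (hypothetical, for illustration only).
FakeSlave = namedtuple('FakeSlave', ['slavename'])
FakeSlaveBuilder = namedtuple('FakeSlaveBuilder', ['slave'])

# Slave names in order of preference; the names are taken from the question.
PRIORITY = ['host-slave', 'support-slave']


def prioritized_next_slave(builder, candidates):
    """Return the candidate with the highest priority, or None if there is none."""
    by_name = {c.slave.slavename: c for c in candidates}
    for name in PRIORITY:
        if name in by_name:
            return by_name[name]
    # No preferred slave among the candidates: use any candidate, if one exists.
    return candidates[0] if candidates else None


host = FakeSlaveBuilder(FakeSlave('host-slave'))
support = FakeSlaveBuilder(FakeSlave('support-slave'))

chosen = prioritized_next_slave(None, [support, host])
print(chosen.slave.slavename)
```

In master.cfg the function would be passed to BuilderConfig as `nextSlave=prioritized_next_slave`. This sketch only orders the candidates it is given; it does not look at running builds the way the question's version does.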

Prerequisite answered 26/4, 2014 at 7:51 Comment(4)
nextSlave: "If provided, this is a function that controls which slave will be assigned future jobs. The function is passed two arguments, the Builder object which is assigning a new job, and a list of BuildSlave objects. The function should return one of the BuildSlave objects, or None if none of the available slaves should be used. The function can optionally return a Deferred, which should fire with the same results." I've seen that you can get a list of runningBuilds for a slave from slave_status; does it also contain enqueued builds? - Mange
It looks like both host and slave are None at the end of the function, so it returns None. For debugging, I'd raise an exception that outputs the slavename. I don't think you can access the twisted.log directly. - Prerequisite
I added names = [s.slavename for s in build_slaves]; raise Exception(str(names)) but no exception seems to be raised; it is as if the function was never called. In the log I only get 2014-04-30 09:53:30+0200 [-] added buildset 4343 to database and the build stays pending. - Mange
Actually there is a bug in the documentation (reported here). - Mange