Constraint Programming: Scheduling with multiple workers
Asked Answered
S

2

3

I'm new to constraint programming. I imagine this is an easy problem but I can't wrap my head around it. Here's the problem:

  • We have multiple machines (N), each with a limited resource (let's say memory, and it can be the same for all machines.)
  • We have T tasks, each with a duration and each requiring some amount of the resource.
  • A machine can work on multiple tasks at the same time as long as its resource isn't exceeded.
  • A task cannot be split among machines and it has to be done in one shot (i.e. no pausing).

How do we assign the tasks to the machines to minimize the end-time or the number of machines used?

It seems like I should be able to achieve this with the cumulative predicate but it seems like it's limited to scheduling one set of tasks to one worker with a limited global resource rather than a variable number of workers.

I'm just learning CP & MiniZinc. Any ideas on how to generalize cumulative? Alternatively, is there an existing MiniZinc model I can understand that does something like this (or close enough?)

Thanks,

PS: I don't have any concrete data since this is a hypothetical/learning exercise for the most part. Imagine you have 10 machines and 10 tasks with various durations (in hours): 2,4,6,5,2,1,4,6,3,2,12 with memory requirements (GBs): 1,2,4,2,1,8,12,4,1,10. Each machine has 32 GBs of ram.

Senile answered 27/12, 2015 at 4:0 Comment(3)
It always helps with a (small) concrete example with the specific limits and constraints. Some questions: Does the N machines have the same or different limited resources? Can a task be split between different machines or must it be on one machine until it's finished? What should the model output?Hagerty
Thanks (and by the way thanks for your awesome website, I've been using the MiniZinc models to learn!). I added a toy example in the comment and: the tasks cannot be shared by a machine, and all machines have the same resources. Also I don't think tasks should be paused/restarted (i.e. they have to be contiguous in time). In terms of output, anything reasonable works, probably a matrix of machines x tasks and each cell is the starting time.Senile
This description sounds a lot like the ICON Energy Challenge, at least the scheduling part of that competition.Ria
H
4

Here's a model that seems to be correct. However, it don't use "cumulative" at all since I wanted to visualize as much as possible (see below).

The main idea is that - for each time step, 1..max_step - each machine must have only tasks <= 32 GB. The foreach loop checks - for each machine - that the sum of memory of each task that is active at that time on that machine is below 32GB.

The output section shows the solution in different ways. See comments below.

The model is a slightly edited version of http://hakank.org/minizinc/scheduling_with_multiple_workers.mzn

Update: I should also mention that this model allows for different size of RAM on the machines, e.g. some machines have 64GB and some 32GB. This is demonstrated - but commented - in the model on my site. Since the model use value_precede_chain/2 - which ensures that the machines are used in order - it's recommended that the machines are ordered of decreasing size of RAM (so the bigger machines are used first).

(Also, I've modeled the problem in Picat: http://hakank.org/picat/scheduling_with_multiple_workers.pi )

include "globals.mzn"; 
int: num_tasks = 10; 
int: num_machines = 10;

array[1..num_tasks] of int: duration = [2,4,6,5,2,1,4,6,3,12]; % duration of tasks

array[1..num_tasks] of int: memory   = [1,2,4,2,1,8,12,4,1,10];  % RAM requirements (GB)
int: max_time = 30; % max allowed time

% RAM for each machine (GB)
array[1..num_machines] of int: machines_memory = [32 | i in  1..num_machines];


% decision variables
array[1..num_tasks] of var 1..max_time: start_time; % start time for each task
array[1..num_tasks] of var 1..max_time: end_time;   % end time for each task
array[1..num_tasks] of var 1..num_machines: machine; % which machine to use
array[1..num_machines,1..max_time] of var 0..max(machines_memory): machine_used_ram;

var 1..num_machines: machines_used = max(machine);
var 1..max_time: last_time = max(end_time);

% solve :: int_search(start_time ++ machine ++ array1d(machine_used_ram), first_fail, indomain_split, complete)  minimize last_time;
solve :: int_search(start_time ++ machine ++ array1d(machine_used_ram), first_fail, indomain_split, complete) minimize machines_used;

constraint
  forall(t in 1..num_tasks) (
    end_time[t] = start_time[t] + duration[t] -1
  ) 
  % /\ cumulative(start_time,duration,[1 | i in  1..num_tasks],machines_used)
  /\
  forall(m in 1..num_machines) (
     % check all the times when a machine is used
     forall(tt in 1..max_time) (
        machine_used_ram[m,tt] = sum([memory[t]*(machine[t]=m)*(tt in start_time[t]..end_time[t]) | t in 1..num_tasks]) /\ 
        machine_used_ram[m,tt] <= machines_memory[m]

        % sum([memory[t]*(machine[t]=m)*(tt in start_time[t]..end_time[t]) | t in 1..num_tasks]) <= machines_memory[m]
     )

  )

  %  ensure that machine m is used before machine m+1 (for machine_used)
  /\ value_precede_chain([i | i in 1..num_machines],machine)
;

output [
  "start_time: \(start_time)\n",
  "durations : \(duration)\n",
  "end_time  : \(end_time)\n",
  "memory    : \(memory)\n",
  "last_time : \(last_time)\n",
  "machine   : \(machine)\n",
  "machines_used: \(machines_used)\n",
]
++
[ "Machine memory per time:\n    "]
++
[ show_int(3,tt) | tt in 1..max_time ]
++
[
  if tt = 1 then "\n" ++ "M" ++ show_int(2, m) ++ ": "  else " " endif ++
 show_int(2,machine_used_ram[m,tt])
  | m in 1..num_machines, tt in 1..max_time
]
++ ["\n\nTime / task: machine(task's memory)\n  Task "] ++
[
  show_int(7,t)
  | t in 1..num_tasks
]
++ 
[
  if t = 1 then "\nTime " ++ show_int(2,tt) ++ " " else " " endif ++
    if tt in fix(start_time[t])..fix(end_time[t]) then
      show_int(2,fix(machine[t])) ++ "(" ++ show_int(2,memory[t]) ++    ")"
    else 
       "      " 
    endif 
   | tt in 1..fix(last_time), t in 1..num_tasks
 ] 
;

The model has two "modes": one to minimize the time ("minimize last_time") and one to minimize the number of machine used ("minimize machines_used").

The result of minimizing the time is:

start_time: [11, 8, 3, 8, 11, 8, 9, 7, 8, 1]
durations : [2, 4, 6, 5, 2, 1, 4, 6, 3, 12]
end_time  : [12, 11, 8, 12, 12, 8, 12, 12, 10, 12]
memory    : [1, 2, 4, 2, 1, 8, 12, 4, 1, 10]
last_time : 12
machine   : [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
machines_used: 1
Machine memory per time:
       1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
M 1: 10 10 14 14 14 14 18 31 31 31 32 30  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
M 2:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
M 3:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
M 4:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
M 5:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
M 6:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
M 7:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
M 8:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
M 9:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
M10:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

 Time / task: machine(task's memory)
  Task       1      2      3      4      5      6      7      8      9     10
 Time  1                                                                 1(10)
 Time  2                                                                 1(10)
 Time  3                1( 4)                                            1(10)
 Time  4                1( 4)                                            1(10)
 Time  5                1( 4)                                            1(10)
 Time  6                1( 4)                                            1(10)
 Time  7                1( 4)                              1( 4)         1(10)
 Time  8         1( 2)  1( 4)  1( 2)         1( 8)         1( 4)  1( 1)  1(10)
 Time  9         1( 2)         1( 2)                1(12)  1( 4)  1( 1)  1(10)
 Time 10         1( 2)         1( 2)                1(12)  1( 4)  1( 1)  1(10)
 Time 11  1( 1)  1( 2)         1( 2)  1( 1)         1(12)  1( 4)         1(10)
 Time 12  1( 1)                1( 2)  1( 1)         1(12)  1( 4)         1(10)
 ----------
 ==========

The first part "Machine memory per time" shows how loaded each machine (1..10) is per time step (1..30). The second part "Time / task: machine(task's memory)" shows for each time step (rows) and tasks (columns) which machine that is used and the memory of the task in the form "machine(memory of the machine)".

The second way of using the model, to minimize the number of used machines, give this result (edited to save space). I.e. one machine is enough for handling all the tasks during the allowed time (1..22 time steps).

 start_time: [19, 11, 3, 9, 20, 22, 13, 7, 17, 1]
 durations : [2, 4, 6, 5, 2, 1, 4, 6, 3, 12]
 end_time  : [20, 14, 8, 13, 21, 22, 16, 12, 19, 12]
 memory    : [1, 2, 4, 2, 1, 8, 12, 4, 1, 10]
 last_time : 22
 machine   : [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
 machines_used: 1
 Machine memory per time:
       1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
 M 1: 10 10 14 14 14 14 18 18 16 16 18 18 16 14 12 12  1  1  2  2  1  8  0  0  0  0  0  0  0  0
 M 2:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 ....
 Time / task: machine(task's memory)
   Task       1      2      3      4      5      6      7      8      9     10
 Time  1                                                                 1(10)
 Time  2                                                                 1(10)
 Time  3                1( 4)                                            1(10)
 Time  4                1( 4)                                            1(10)
 .....
 ----------
 ==========
Hagerty answered 28/12, 2015 at 1:16 Comment(5)
This is amazing. Thanks! I haven't had a chance to go through it yet. I'm still learning so it'll take me a bit of time to wrap my head around it.Senile
This is the part I struggled with when trying to solve it on my own. Thanks again! forall(m in 1..num_machines) ( % check the memory of the times when a task is run on the machine forall(tt in 1..max_time) ( machine_used_ram[m,tt] = sum([memory[t]*(machine[t]=m)*(tt in start_time[t]..end_time[t]) | t in 1..num_tasks]) /\ machine_used_ram[m,tt] <= machines_memory[m]Senile
@Senile Great! Yes, that's the hard part in the model. The "cumulative" constraint has a similar construct to ensure that each time slice is below or equal to the limit (the last argument of the "cumulative" constraint).Hagerty
I'm not sure I followed that: are you saying that this line "cumulative(start_time,duration,[1 | i in 1..num_tasks],machines_used)" is equivalent to the forall that follows it?Senile
Mike: No, it's not equivalent but the "cumulative" use a similar principle. You can check the definition of cumulative in the MiniZinc distribution: share/minizinc/std/cumulative.mzn.Hagerty
A
0

This is an old question, but here is a CP Optimizer model for this problem (in Python). In this version, I minimize a lexicographical objective : first minimize the makespan (optimal value is 12), then given this makespan, minimize the number of used machines (here, one can execute all the tasks on a single machine and still finish at 12).

DUR = [2,4,6,5,2,1,4,6,3,12]
MEM = [1,2,4,2,1,8,12,4,1,10]
CAP = 32
TASKS    = range(len(DUR))
MACHINES = range(10)

from docplex.cp.model import *
model = CpoModel()

# Decision variables: tasks and alloc
task  = [interval_var(size=DUR[i]) for i in TASKS]
alloc = [ [interval_var(optional=True) for j in MACHINES] for i in TASKS]

# Objective terms
makespan  = max(end_of(task[i]) for i in TASKS)
nmachines = sum(max(presence_of(alloc[i][j]) for i in TASKS) for j in MACHINES)

# Objective: minimize makespan, then number of machine used
model.add(minimize_static_lex([makespan, nmachines])) 

# Allocation of tasks to machines
model.add([alternative(task[i], [alloc[i][j] for j in MACHINES]) for i in TASKS])

# Machine capacity
model.add([sum(pulse(alloc[i][j],MEM[i]) for i in TASKS) <= CAP  for j in MACHINES])

# Resolution
sol = model.solve(trace_log=True)

# Display solution
for i in TASKS:
    for j in MACHINES:
        s = sol.get_var_solution(alloc[i][j])
        if s.is_present():
            print('Task ' + str(i) + ' scheduled on machine ' + str(j) + ' on [' + str(s.get_start()) + ',' +  str(s.get_end()) + ')')

And the result is:

! ----------------------------------------------------------------------------
! Minimization problem - 110 variables, 20 constraints
! Initial process time : 0.00s (0.00s extraction + 0.00s propagation)
!  . Log search space  : 66.4 (before), 66.4 (after)
!  . Memory usage      : 897.0 kB (before), 897.0 kB (after)
! Using parallel search with 8 workers.
! ----------------------------------------------------------------------------
!          Best Branches  Non-fixed    W       Branch decision
                    0        110                 -
+ New bound is 12; 0
*            12      111  0.01s        1      (gap is 100.0% @ crit. 2 of 2)
  New objective is 12; 7
*            12      131  0.01s        1      (gap is 100.0% @ crit. 2 of 2)
  New objective is 12; 6
*            12      151  0.01s        1      (gap is 100.0% @ crit. 2 of 2)
  New objective is 12; 5
*            12      171  0.01s        1      (gap is 100.0% @ crit. 2 of 2)
  New objective is 12; 4
*            12      191  0.01s        1      (gap is 100.0% @ crit. 2 of 2)
  New objective is 12; 3
*            12      211  0.01s        1      (gap is 100.0% @ crit. 2 of 2)
  New objective is 12; 2
*            12      231  0.01s        1      (gap is 100.0% @ crit. 2 of 2)
  New objective is 12; 1
! ----------------------------------------------------------------------------
! Search completed, 7 solutions found.
! Best objective         : 12; 1 (optimal)
! Best bound             : 12; 1
! ----------------------------------------------------------------------------
! Number of branches     : 1318
! Number of fails        : 40
! Total memory usage     : 6.7 MB (6.6 MB CP Optimizer + 0.1 MB Concert)
! Time spent in solve    : 0.00s (0.00s engine + 0.00s extraction)
! Search speed (br. / s) : 131800.0
! ----------------------------------------------------------------------------

Task 0 scheduled on machine 4 on [4,6)
Task 1 scheduled on machine 4 on [4,8)
Task 2 scheduled on machine 4 on [1,7)
Task 3 scheduled on machine 4 on [0,5)
Task 4 scheduled on machine 4 on [4,6)
Task 5 scheduled on machine 4 on [0,1)
Task 6 scheduled on machine 4 on [0,4)
Task 7 scheduled on machine 4 on [1,7)
Task 8 scheduled on machine 4 on [4,7)
Task 9 scheduled on machine 4 on [0,12)
Ailis answered 24/4, 2019 at 10:43 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.