How to design acceptance probability function for simulated annealing with multiple distinct costs?

I am using simulated annealing to solve an NP-complete resource scheduling problem. For each candidate ordering of the tasks I compute several different costs (or energy values). Some examples are (though the specifics are probably irrelevant to the question):

  • global_finish_time: The total number of days that the schedule spans.
  • split_cost: The number of days by which each task is delayed due to interruptions by other tasks (this is meant to discourage interruption of a task once it has started).
  • deadline_cost: The sum of the squared number of days by which each missed deadline is overdue.

The traditional acceptance probability function looks like this (in Python):

import math

def acceptance_probability(old_cost, new_cost, temperature):
    # Always accept an improvement; otherwise accept with a probability
    # that decays exponentially with the size of the cost increase.
    if new_cost < old_cost:
        return 1.0
    else:
        return math.exp((old_cost - new_cost) / temperature)

So far I have combined my first two costs into one by simply adding them, so that I can feed the result into acceptance_probability. But what I would really want is for deadline_cost to always take precedence over global_finish_time, and for global_finish_time to take precedence over split_cost.

So my question to Stack Overflow is: how can I design an acceptance probability function that takes multiple energies into account but always considers the first energy to be more important than the second energy, and so on? In other words, I would like to pass in old_cost and new_cost as tuples of several costs and still return a sensible acceptance probability.

Edit: After a few days of experimenting with the proposed solutions I have concluded that the only way that works well enough for me is Mike Dunlavey's suggestion, even though this creates many other difficulties with cost components that have different units. I am practically forced to compare apples with oranges.

So, I put some effort into "normalizing" the values. First, deadline_cost is a sum of squares, so it grows quadratically while the other components grow linearly. To address this I use the square root to get a similar growth rate. Second, I developed a function that computes a linear combination of the costs, but auto-adjusts the coefficients according to the highest cost component seen so far.

For example, if the tuple of highest costs is (A, B, C) and the input cost vector is (x, y, z), the linear combination is BCx + Cy + z. That way, no matter how high z gets it will never be more important than an x value of 1.
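
In code, the idea looks roughly like this (a simplified sketch, not my actual implementation; the function name and the fixed three-component tuple are only illustrative):

def combine_costs(costs, max_costs):
    # costs is the (x, y, z) tuple, ordered from highest to lowest priority;
    # max_costs is a list holding the highest value seen so far per component.
    for i, c in enumerate(costs):
        max_costs[i] = max(max_costs[i], c)
    x, y, z = costs
    _, B, C = max_costs  # (A, B, C) in the notation above; A is not needed
    # Weight each component by the product of the maxima of the
    # lower-priority components: B*C*x + C*y + z.
    return B * C * x + C * y + z

max_costs = [0.0, 0.0, 0.0]
old_energy = combine_costs((3.0, 12.0, 2.0), max_costs)
new_energy = combine_costs((3.0, 11.0, 4.0), max_costs)
# old_energy and new_energy can then be fed to acceptance_probability as before.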

This creates "jaggies" in the cost function as new maximum costs are discovered. For example, if C goes up then BCx and Cy will both be higher for a given (x, y, z) input and so will differences between costs. A higher cost difference means that the acceptance probability will drop, as if the temperature was suddenly lowered an extra step. In practice though this is not a problem because the maximum costs are updated only a few times in the beginning and do not change later. I believe this could even be theoretically proven to converge to a correct result since we know that the cost will converge toward a lower value.

One thing that still has me somewhat confused is what happens when the maximum costs are 1.0 and lower, say 0.5. With a maximum vector of (0.5, 0.5, 0.5) this would give the linear combination 0.5*0.5*x + 0.5*y + z, i.e. the order of precedence is suddenly reversed. I suppose the best way to deal with it is to use the maximum vector to scale all values to given ranges, so that the coefficients can always be the same (say, 100x + 10y + z). But I haven't tried that yet.
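
A rough, untried sketch of that rescaling (the helper name is made up, and the 100/10/1 coefficients are just the example values mentioned above):

def scaled_combination(costs, max_costs, coefficients=(100.0, 10.0, 1.0)):
    # Scale each component by the highest value seen so far, so that the
    # fixed coefficients, not the magnitudes of the maxima, decide precedence.
    scaled = [c / m if m > 0 else 0.0 for c, m in zip(costs, max_costs)]
    return sum(k * s for k, s in zip(coefficients, scaled))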

Crabstick answered 9/7, 2009 at 14:33 Comment(3)
I would be interested to know if this is an industry or academic problem. Regards – Dwight
It is not academic. I am using this as an alternative to MS Project. The main goal of the program is to make it easier to answer the question "when can your team add feature X to our software?" – Crabstick
I know this question is years old but for anyone else who stumbles on this page via Google... in fuzzy logic the weighted sum is the equivalent of logical-OR, so you're effectively saying "if condition A OR condition B etc". What you really want is A AND B AND C, and to do that you use multiplication. There are a few caveats (e.g. your weights now need to be powers) but it's far better than the mess you get trying to special-case everything. Wiki "Weighted sum model" and "Weighted product model" for more details. – Aloes
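
To illustrate that last comment, here is a tiny sketch of a weighted product combination (the exponent weights and the +1 shift to keep components positive are assumptions on my part, not part of the comment):

def weighted_product(costs, weights):
    # Weighted product model: the weights act as exponents, so the
    # combination behaves like a fuzzy AND rather than an OR.
    score = 1.0
    for c, w in zip(costs, weights):
        score *= (1.0 + c) ** w  # shift by 1 so a zero cost does not zero the product
    return score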

mbeckish is right.

Could you make a linear combination of the different energies, and adjust the coefficients?

Possibly log-transforming them in and out?

I've done some MCMC using Metropolis-Hastings. In that case I define the (non-normalized) log-likelihood of a particular state (given its priors), and I find that to be a good way to clarify my thinking about what I want.
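
One possible reading of the log-transform idea in code (the coefficients and the choice of log1p are placeholders to be tuned):

import math

def log_energy(costs, coefficients):
    # Combine the energies on a log scale so that components with very
    # different magnitudes become comparable.
    return sum(k * math.log1p(c) for k, c in zip(coefficients, costs))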

Xenophobe answered 9/7, 2009 at 14:55 Comment(2)
The different quantities do not always have compatible units. For example the deadline value is squared to get a least-squares type of optimization, i.e. I prefer delaying 3 tasks by 1 day each rather than delaying 1 task by 3 days. I have considered this but I'm afraid I will run into many boundary cases where the system is not doing the right thing because I didn't make the coefficients "just right" (if there even is such a thing). Also see my reply to mbeckish. – Crabstick
@flodin: You do want your overall energy surface to be continuous, so I would be shy of IF statements. Other than that, you can make it pretty nonlinear, like having a square-law repulsion from boundary cases - just a thought. – Xenophobe

I would consider something along the lines of:

import math

def acceptance_probability(old, new, temperature):
    # old and new are (deadline_cost, global_finish_time, split_cost) tuples.
    # The same Metropolis formula is used in every branch here, but each
    # branch could just as well use its own function.
    if new[0] > old[0]:
        return math.exp((old[0] - new[0]) / temperature)
    elif new[1] > old[1]:
        return math.exp((old[1] - new[1]) / temperature)
    elif new[2] > old[2]:
        return math.exp((old[2] - new[2]) / temperature)
    else:
        return 1.0

Of course each of the three places you calculate the probability could use a different function.

Dwight answered 9/7, 2009 at 14:52 Comment(3)
I'll try it and get back to you. I was thinking about something similar, but I see a potential problem: a difference of X in the first value yields the same probability as a difference of X in the second value. Intuitively, a difference in the second value ought to correspond to a probability that is in some sense infinitely smaller. A problem here is that it's hard to convince yourself through trial and error that your algorithm is sound; it might work for simple cases but create weird behavior in complex scenarios. I'm wishing for some theoretical confirmation of the method. – Crabstick
I guess this is a heuristic approach, which is not uncommon for NP-complete problems. – Dwight
I have tried it out and it is generating fairly good solutions. The one issue is that once the highest-priority component has settled on an optimal value, the algorithm is too likely to jump out of that solution even at low temperatures. This is logical, since moving from (0, 0) to (1, 0) has exactly the same probability as moving from (0, 0) to (0, 1). I'll leave the question open for a while and continue to experiment to see if anything better comes up. Right now I'm considering some sort of magnitude difference in the probability when evaluating a lower-priority component. – Crabstick

I would take a hint from multi-objective evolutionary algorithms (MOEA) and have it transition only if all of the objectives simultaneously pass the acceptance_probability function you gave. This will have the effect of exploring the Pareto front, much like standard simulated annealing explores plateaus of same-energy solutions.

However, this does give up on the idea of having the first one take priority.

You will probably have to tweak your parameters, such as giving it a higher initial temperature.
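
A minimal sketch of that rule, reusing acceptance_probability from the question (drawing a separate random number per objective is one interpretation; multiplying the per-objective probabilities would be another):

import random

def accept_move(old_costs, new_costs, temperature):
    # Accept the candidate only if every objective passes its own
    # Metropolis test at the current temperature.
    return all(
        random.random() < acceptance_probability(o, n, temperature)
        for o, n in zip(old_costs, new_costs)
    )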

Sumer answered 19/8, 2010 at 18:51 Comment(0)

It depends on what you mean by "takes precedence". For example, what if the deadline_cost goes down by 0.001, but the global_finish_time cost goes up by 10000? Do you return 1.0, because the deadline_cost decreased, and that takes precedence over anything else? This seems like it is a judgment call that only you can make, unless you can provide enough background information on the project so that others can suggest their own informed judgment call.

Coakley answered 9/7, 2009 at 14:41 Comment(1)
Yes, deadlines are always more important than the global finish time. Even if the global finish time goes up by 10000, I want the system to favor a lower deadline cost. This is what I tried to explain in the question; I'm sorry if it wasn't clear. – Crabstick
