Explain the Differential Evolution method

Can someone please explain the Differential Evolution method? The Wikipedia definition is extremely technical.

A dumbed-down explanation followed by a simple example would be appreciated :)

Zarzuela answered 21/9, 2011 at 2:2 Comment(0)

Here's a simplified description. DE is an optimisation technique which iteratively modifies a population of candidate solutions to make it converge to an optimum of your function.

You first initialise your candidate solutions randomly. Then at each iteration and for each candidate solution x you do the following:

  1. you produce a trial vector: v = a + ( b - c ) / 2, where a, b, c are three distinct candidate solutions picked randomly among your population.
  2. you randomly swap vector components between x and v to produce v'. At least one component from v must be swapped.
  3. you replace x in your population with v' only if it is a better candidate (i.e. it optimises your function better).

(Note that the above algorithm is very simplified; don't code from it, find a proper specification elsewhere instead.)

Unfortunately the Wikipedia article lacks illustrations. The method is easier to understand with a graphical representation; you'll find some in these slides: http://www-personal.une.edu.au/~jvanderw/DE_1.pdf .

It is similar to a genetic algorithm (GA) except that the candidate solutions are not represented as binary strings (chromosomes) but (usually) as real-valued vectors. One key aspect of DE is that the mutation step size (see step 1 for the mutation) is dynamic, that is, it adapts to the configuration of your population and will tend to zero when it converges. This makes DE less vulnerable to genetic drift than GA.
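
For intuition only (and keeping the caveat above in mind), a minimal Java-style sketch of that loop could look like the following. NP, D, maxIterations, the function f and the random generator rnd are placeholders, not part of the description above:

// Assumes: double[][] pop holds NP randomly initialised candidate vectors of length D,
// f(double[]) is the function to minimise, and rnd is a java.util.Random instance.
for (int iter = 0; iter < maxIterations; ++iter)
{
  for (int i = 0; i < NP; ++i)
  {
    // pick three distinct candidates a, b, c, all different from i
    int a, b, c;
    do { a = rnd.nextInt(NP); } while (a == i);
    do { b = rnd.nextInt(NP); } while (b == i || b == a);
    do { c = rnd.nextInt(NP); } while (c == i || c == a || c == b);

    // step 1: trial vector v = a + (b - c) / 2
    double[] v = new double[D];
    for (int j = 0; j < D; ++j)
      v[j] = pop[a][j] + (pop[b][j] - pop[c][j]) / 2.0;

    // step 2: randomly swap components of x into v, keeping at least one component from v
    int keep = rnd.nextInt(D);
    for (int j = 0; j < D; ++j)
      if (j != keep && rnd.nextDouble() < 0.5)
        v[j] = pop[i][j];

    // step 3: keep whichever of x and v is the better candidate (minimisation assumed)
    if (f(v) < f(pop[i]))
      pop[i] = v;
  }
}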

Homeo answered 22/9, 2011 at 18:20 Comment(5)
Doesn't step 3 make DE susceptible to getting stuck in local maxima?Zarzuela
In fact the three steps above correspond to the typical evolutionary operations: mutation, cross-over and selection. Without the selection operation (step 3) your population will never converge at all and the method will remain random. Algorithmic variants of DE have been introduced to increase diversity in the population by tweaking the individual operators. For instance you could inject Gaussian noise to the trial vector (v).Homeo
You are right, with one exception. Step 3 says that x is replaced only if v' is a better candidate. Typical Genetic Algorithms accept worse candidates to avoid getting stuck in local maxima. So my question remains: isn't this a problem?Zarzuela
@BobRoberts Can you explain Gaussian noise and its effect on the trial vector?Dodecahedron
If the noise variance in your function is high then conventional DE is known to perform quite badly compared to other algorithms because it is less stochastic and more greedy. For example, one possible way to overcome this problem is to inject noise when creating the trial vector to improve exploration. Instead of dividing by 2 in the first step, you could multiply by a random number between 0.5 and 1 (randomly chosen for each v). See for instance 'Improved Differential Evolution Algorithms for Handling Noisy Optimization Problems' by S. Das, A. Konar, U. Chakraborty (2005).Homeo

Answering my own question...

Overview

  • The principal difference between Genetic Algorithms (GA) and Differential Evolution (DE) is that Genetic Algorithms rely on crossover as the primary search mechanism, whereas DE, like other evolution strategies, relies on mutation.
  • DE generates new candidates by adding a weighted difference between two population members to a third member (more on this below).
  • If the resulting candidate is superior to the candidate with which it was compared, it replaces it; otherwise, the original candidate remains unchanged.

Definitions

  • The population is made up of NP candidates.
  • Xi = A parent candidate at index i (indexes range from 0 to NP-1) from the current generation. Also known as the target vector.
  • Each candidate contains D parameters.
  • Xi(j) = The jth parameter in candidate Xi.
  • Xa, Xb, Xc = three random parent candidates.
  • Difference vector = (Xb - Xa)
  • F = A weight that determines the rate of the population's evolution.
    • Ideal values: [0.5, 1.0]
  • CR = The probability of crossover taking place.
    • Range: [0, 1]
  • Xc` = A mutant vector obtained through the differential mutation operation. Also known as the donor vector.
  • Xt = The child of Xi and Xc`. Also known as the trial vector.

Algorithm

  1. For each candidate in the population
    • for (int i = 0; i<NP; ++i)
  2. Choose three distinct parents at random (they must differ from each other and i)
// Pick three distinct parent indices a, b and c, all different from each other and from i
do
{
  a = random.nextInt(NP);
} while (a == i);
do
{
  b = random.nextInt(NP);
} while (b == i || b == a);
do
{
  c = random.nextInt(NP);
} while (c == i || c == b || c == a);
  3. (Mutation step) Add a weighted difference vector between two population members to a third member
    • Xc` = Xc + F * (Xb - Xa)
  4. (Crossover step) For every variable in Xi, apply uniform crossover with probability CR to inherit from Xc`; otherwise, inherit from Xi. At least one variable must be inherited from Xc`
// R ensures that at least one parameter is inherited from the donor vector Xc`
int R = random.nextInt(D);
for (int j = 0; j < D; ++j)
{
  double probability = random.nextDouble();
  if (probability < CR || j == R)
    Xt[j] = Xc`[j];   // inherit from the donor vector
  else
    Xt[j] = Xi[j];    // keep the parameter of the target vector
}
  5. (Selection step) If Xt is superior to Xi then Xt replaces Xi in the next generation. Otherwise, Xi is kept unmodified.
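
Putting steps 1 to 5 together, here is a self-contained Java sketch of the algorithm described above, using a sphere function purely as a stand-in objective. The population size, bounds, F, CR and the fixed 1000-generation budget are illustrative assumptions, not prescribed values:

import java.util.Random;

public class DifferentialEvolution
{
  static final int NP = 20;      // population size
  static final int D = 3;        // parameters per candidate
  static final double F = 0.8;   // differential weight
  static final double CR = 0.9;  // crossover probability
  static final Random random = new Random();

  // Objective to minimise; the sphere function is used here purely as a stand-in.
  static double objective(double[] x)
  {
    double sum = 0;
    for (double v : x) sum += v * v;
    return sum;
  }

  public static void main(String[] args)
  {
    // Initialise the population randomly within [-100, 100]
    double[][] pop = new double[NP][D];
    for (int i = 0; i < NP; ++i)
      for (int j = 0; j < D; ++j)
        pop[i][j] = -100 + 200 * random.nextDouble();

    for (int generation = 0; generation < 1000; ++generation)
    {
      for (int i = 0; i < NP; ++i)
      {
        // Step 2: pick three distinct parents, all different from i
        int a, b, c;
        do { a = random.nextInt(NP); } while (a == i);
        do { b = random.nextInt(NP); } while (b == i || b == a);
        do { c = random.nextInt(NP); } while (c == i || c == b || c == a);

        // Steps 3 and 4: mutation (donor vector) followed by uniform crossover
        double[] Xt = new double[D];
        int R = random.nextInt(D);  // guarantees at least one parameter from the donor
        for (int j = 0; j < D; ++j)
        {
          double donor = pop[c][j] + F * (pop[b][j] - pop[a][j]);  // Xc` = Xc + F * (Xb - Xa)
          Xt[j] = (random.nextDouble() < CR || j == R) ? donor : pop[i][j];
        }

        // Step 5: greedy selection
        if (objective(Xt) <= objective(pop[i]))
          pop[i] = Xt;
      }
    }

    // Report the best candidate found
    double[] best = pop[0];
    for (double[] candidate : pop)
      if (objective(candidate) < objective(best))
        best = candidate;
    System.out.println("best objective value: " + objective(best));
  }
}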


Zarzuela answered 13/4, 2015 at 2:50 Comment(2)
In the code above you obtain R using eandom.nextInt(D) but then immediately inside the loop you get it again using random.nextDouble()? Are you sure that's right? Plus how can you compare a floating point variable R with j in the statement j == R?Nealy
@rhody Thank you for pointing out these typos. I have updated the answer.Zarzuela

The working of the DE algorithm is very simple. Suppose you need to optimise (minimise, for example) ∑Xi^2 (the sphere model) within a given range, say [-100, 100]. We know that the minimum value is 0. Let's see how DE works.

DE is a population-based algorithm. Each individual in the population carries a fixed number of chromosomes (imagine a group of people, each with the same number of genes). Let me explain DE with respect to the function above.

We need to fix the population size and the number of chromosomes or genes (called parameters). For instance, let's consider a population of size 4 in which each individual has 3 chromosomes (or genes, or parameters). Let's call the individuals R1, R2, R3 and R4.

Step 1 : Initialize the population

We randomly initialise the population within the range [-100, 100]:

        G1    G2    G3    objective fn value
R1 -> |-90  |  2  | 1   |   =>8105
R2 -> |  7  |  9  | -50 |   =>2630
R3 -> |  4  |  2  | -9.2|   =>104.64
R4 -> | 8.5 |  7  |  9  |   =>202.25

The objective function value is calculated using the given objective function. In this case it is ∑Xi^2, so for R1 the objective value is (-90)^2 + 2^2 + 1^2 = 8105. The other rows are computed in the same way.
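
As a quick sanity check, the values in the table can be reproduced with a few lines of Java (the rows below are exactly the table above):

double[][] population = { {-90, 2, 1}, {7, 9, -50}, {4, 2, -9.2}, {8.5, 7, 9} };
for (double[] individual : population)
{
  double value = 0;
  for (double gene : individual) value += gene * gene;
  System.out.println(value);   // prints roughly 8105, 2630, 104.64 and 202.25 (up to floating-point rounding)
}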

Step 2 : Mutation

Fix a target vector, say R1, then randomly select three other vectors (individuals), say R2, R3 and R4, and perform mutation. Mutation is done as follows:

MutantVector = R2 + F * (R3 - R4)

(The vectors can be chosen randomly; they need not be in any particular order.) F (the scaling factor or mutation constant), typically chosen in the range [0, 1], is one of the few control parameters DE has. In simple words, it controls how different the mutant vector becomes. Let's take F = 0.5.

|  7  |  9  | -50 |   +   0.5 * (  |  4  |  2  | -9.2 |   -   | 8.5 |  7  |  9  |  )

Now performing the mutation gives the following mutant vector:

MV = | 4.75 | 6.5 | -59.1 |   =>3557.62
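
The same mutation step can be checked with a few lines of Java, using exactly the R2, R3, R4 and F values above:

double[] R2 = {7, 9, -50}, R3 = {4, 2, -9.2}, R4 = {8.5, 7, 9};
double F = 0.5;
double[] MV = new double[3];
for (int j = 0; j < 3; ++j)
  MV[j] = R2[j] + F * (R3[j] - R4[j]);   // gives roughly {4.75, 6.5, -59.1}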

Step 3 : Crossover

Now that we have a target vector (R1) and a mutant vector MV formed from R2, R3 and R4, we need to do a crossover. Consider R1 and MV as two parents from which we need a child. Crossover determines how much information is taken from each parent, and it is controlled by the crossover rate (CR). Every gene/chromosome of the child is determined as follows:

for each gene, a random number between 0 and 1 is generated; if it is greater than CR, the gene is inherited from the target (R1), otherwise from the mutant (MV).

Let's set CR = 0.9. Since each individual has 3 chromosomes, we generate 3 random numbers between 0 and 1, say 0.21, 0.97 and 0.80. The first and last are less than CR, so those positions in the child are filled with values from MV, while the second position is filled with the gene taken from the target (R1).

Target-> |-90 | 2 | 1 |    Mutant-> | 4.75 | 6.5 | -59.1 |

random num - 0.21, =>  `Child -> |4.75 | -- | --    |`
random num - 0.97, =>  `Child -> |4.75 | 2  | --    |`
random num - 0.80, =>  `Child -> |4.75 | 2  | -59.1 |`

Trial vector/child vector -> | 4.75 | 2 | -59.1 |   =>3519.37
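
Here is the same crossover rule written literally in Java, using the three random draws from above. Note that this simple version, unlike the previous answer, does not force at least one gene to come from the mutant:

double CR = 0.9;
double[] target = {-90, 2, 1};
double[] mutant = {4.75, 6.5, -59.1};
double[] draws  = {0.21, 0.97, 0.80};   // the three random numbers used above
double[] child  = new double[3];
for (int j = 0; j < 3; ++j)
  child[j] = (draws[j] > CR) ? target[j] : mutant[j];   // gives {4.75, 2, -59.1}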

Step 4 : Selection

Now we have the child and the target. Compare the objective function values of both and see which is smaller (it is a minimisation problem); select that individual for the next generation.

 R1                        -> | -90  |  2  | 1     |   =>8105
Trial vector/child vector  -> | 4.75 |  2  | -59.1 |   =>3519.37

Clearly, the child is better, so the target (R1) is replaced with the child. The new population becomes:

        G1      G2     G3      objective fn value
R1 -> | 4.75 |  2   | -59.1 |   =>3519.37
R2 -> |  7   |  9   | -50   |   =>2630
R3 -> |  4   |  2   | -9.2  |   =>104.64
R4 -> | 8.5  |  7   |  9    |   =>202.25

This procedure is repeated for every individual in the population, and the whole process continues either until the desired number of generations has been reached or until we obtain a sufficiently good value. Hope this helps.
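
In code, the outer loop therefore looks roughly like the sketch below, where evolveOneGeneration, bestObjective, maxGenerations and desiredValue are hypothetical placeholders standing in for Steps 2 to 4 applied to every individual, the best value in the population, and the two stopping criteria:

// hypothetical driver loop: stop after a generation budget or once the result is good enough
for (int generation = 0; generation < maxGenerations; ++generation)
{
  evolveOneGeneration(population);   // mutation, crossover and selection for each individual
  if (bestObjective(population) <= desiredValue)
    break;
}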

Durable answered 13/1, 2016 at 8:52 Comment(0)
