How to prevent genetic algorithm from converging on local minima?

Asked 18/11, 2014 at 3:35 Answered 24/11, 2014 at 12:48

Solved java algorithm genetic-algorithm sudoku convergence

I am trying to build a 4 x 4 sudoku solver by using the genetic algorithm. I have some issues with values converging to local minima. I am using a ranked approach and removing the bottom two ranked answer possibilities and replacing them with a crossover between the two highest ranked answer possibilities. For additional help avoiding local mininma, I am also using mutation. If an answer is not determined within a specific amount of generation, my population is filled with completely new and random state values. However, my algorithm seems to get stuck in local minima. As a fitness function, I am using:

(Total Amount of Open Squares * 7 (possible violations at each square; row, column, and box)) - total Violations

population is an ArrayList of integer arrays in which each array is a possible end state for sudoku based on the input. Fitness is determined for each array in the population.

Would someone be able to assist me in determining why my algorithm converges on local minima or perhaps recommend a technique to use to avoid local minima. Any help is greatly appreciated.

Fitness Function:

public int[] fitnessFunction(ArrayList<int[]> population)
{
    int emptySpaces = this.blankData.size();
    int maxError = emptySpaces*7;
    int[] fitness = new int[populationSize];

    for(int i=0; i<population.size();i++)
    {
        int[] temp = population.get(i);
        int value = evaluationFunc(temp);

        fitness[i] = maxError - value;
        System.out.println("Fitness(i)" + fitness[i]);
    }

    return fitness;
}

Crossover Function:

public void crossover(ArrayList<int[]> population, int indexWeakest, int indexStrong, int indexSecStrong, int indexSecWeak)
{
    int[] tempWeak = new int[16];
    int[] tempStrong = new int[16];
    int[] tempSecStrong = new int[16];
    int[] tempSecWeak = new int[16];

    tempStrong = population.get(indexStrong);
    tempSecStrong = population.get(indexSecStrong);
    tempWeak = population.get(indexWeakest);
    tempSecWeak = population.get(indexSecWeak);
    population.remove(indexWeakest);
    population.remove(indexSecWeak);


    int crossoverSite = random.nextInt(14)+1;

    for(int i=0;i<tempWeak.length;i++)
    {
        if(i<crossoverSite)
        {
            tempWeak[i] = tempStrong[i];
            tempSecWeak[i] = tempSecStrong[i];
        }
        else
        {
            tempWeak[i] = tempSecStrong[i];
            tempSecWeak[i] = tempStrong[i];
        }
    }
    mutation(tempWeak);
    mutation(tempSecWeak);
    population.add(tempWeak);
    population.add(tempSecWeak);

    for(int j=0; j<tempWeak.length;j++)
    {
        System.out.print(tempWeak[j] + ", ");
    }
    for(int j=0; j<tempWeak.length;j++)
    {
        System.out.print(tempSecWeak[j] + ", ");
    }
}

Mutation Function:

public void mutation(int[] mutate)
{
    if(this.blankData.size() > 2)
    {
        Blank blank = this.blankData.get(0);
        int x = blank.getPosition();

        Blank blank2 = this.blankData.get(1);
        int y = blank2.getPosition();

        Blank blank3 = this.blankData.get(2);
        int z = blank3.getPosition();

        int rando = random.nextInt(4) + 1;

        if(rando == 2)
        {
            int rando2 = random.nextInt(4) + 1;
            mutate[x] = rando2;
        }
        if(rando == 3)
        {
            int rando2 = random.nextInt(4) + 1;
            mutate[y] = rando2;
        }
        if(rando==4)
        {
            int rando3 = random.nextInt(4) + 1;
            mutate[z] = rando3;
        }
    }

Crinkleroot answered 18/11, 2014 at 3:35 Comment(2)

In addition, what is an appropriate population size and max number of generations for a 4x4 game of sudoku? – Crinkleroot 18/11, 2014 at 3:55

Frankly, I think this is misapplication of technology. Genetic algorithms are best suited for fuzzy questions (e.g. the traveling salesman problem) where the optimal answer is elusive, but it's easy to find answers that are reasonably good. In a well designed sudoku puzzle, there is one and only one correct answer, and it can be found quickly and easily with an exhaustive search. – Fallon 18/11, 2014 at 4:53

The reason you see rapid convergence is that your methodology for "mating" is not very good. You are always producing two offspring from "mating" of the top two scoring individuals. Imagine what happens when one of the new offspring is the same as your top individual (by chance, no crossover and no mutation, or at least none that have an effect on the fitness). Once this occurs, the top two individuals are identical which eliminates the effectiveness of crossover.

A more typical approach is to replace EVERY individual on every generation. There are lots of possible variations here, but you might do a random choice of two parents weighted fitness.

Regarding population size: I don't know how hard of a problem sudoku is given your genetic representation and fitness function, but I suggest that you think about millions of individuals, not dozens.

If you are working on really hard problems, genetic algorithms are much more effective when you place your population on a 2-D grid and choosing "parents" for each point in the grid from the nearby individuals. You will get local convergence, but each locality will have converged on different solutions; you get a huge amount of variation produced from the borders between the locally-converged areas of the grid.

Another technique you might think about is running to convergence from random populations many times and store the top individual from each run. After you build up a bunch of different local minima genomes, build a new random population from those top individuals.

Ornelas answered 18/11, 2014 at 4:35 Comment(5)

I agreed with you right up to the point where you said, "millions of individuals". By making the simple restriction that each row has 4 unique numbers, one can reduce the exhaustive search space to 331,776 possibilities. Seems pointless to run multiple generations with millions of individuals, when the brute force solution only has to consider 300K possibilities. – Fallon 18/11, 2014 at 4:57

So the problem with genetic algorithms is that it uses heuristics to attempt to predict the correct answer. This means that a poor implementation might require more time/space than a brute force approach. – Invalidism 18/11, 2014 at 6:6

In regard to population size, there are generally 2 schools of thought - large population, few generations or small population many generations. It's up to you to pick the path you want to pursue (I personally prefer the small pop approach). There are many resources out there that will tell you general values to use (I saw ~12 individuals, 1000s of generations somewhere), but it's really up to you – Invalidism 18/11, 2014 at 6:6

@Fallon As I said, I don't know how "hard" sudoko is. If the search space is trivial as you say it is, then a genetic algorithm is a silly tool to use. – Ornelas 18/11, 2014 at 7:13

@Invalidism On hard problems, small populations converge before you get very far (certainly before you get to 1000s of generations). If you want to use small populations, then you will want to use many runs with a methodology that utilizes the winners for further work (as I outlined in my answer). – Ornelas 18/11, 2014 at 7:16

I think the Sudoku is a permutation problem. therefore i suggest you to use random permutation numbers for initializing population and use the crossover method which Compatible to permutation problems.

Proximity answered 24/11, 2014 at 12:48 Comment(0)

Recommended topics

Hot tags