Generating a sudoku of a desired difficulty?

Asked 7/5, 2012 at 20:33 Answered 24/2, 2015 at 15:31

So, I've done a fair bit of reading into generation of a Sudoku puzzle. From what I can tell, the standard way to have a Sudoku puzzle of a desired difficulty is to generate a puzzle, and then grade it afterwards, and repeat until you have one of an acceptable rating. This can be refined by generating via backtracing using some of the more complex solving patterns (XY-wing, swordfish, etc.), but that's not quite what I'm wanting to do here.

What I want to do, but have been unable to find any real resource on, is generate a puzzle from a "difficulty value" (0-1.0 value, 0 being the easiest, and 1.0 being the hardest).

For example, I want create a moderately difficult puzzle, so the value .675 is selected. Now using that value I want to be able to generate a moderately difficult puzzle.

Anyone know of something like this? Or perhaps something with a similar methodology?

Beaming answered 7/5, 2012 at 20:33 Comment(4)

No, never found anything like that. Part of the thing is that "difficulty" is very relative. – Paraplegia 7/5, 2012 at 20:39

I don't believe this is possible. The only technique I know of is to do as you mention -- generate, grade, throw out if outside difficulty range. Also, as Mat said, difficulty is hard to measure as different algorithms solve different ways. – Intuition 7/5, 2012 at 20:53

I understand that, but the "generate, rate, throw away, generate rate, throw away, generate, keep" idea seems wildly inefficient. Also, looking at all the games that have the option to create a sudoku puzzle via difficulty (e.g. easy, med, hard) seem to do so in a fraction of a second, it doesn't seem very likely that they're doing this. Especially the ones on devices, like iphone or android. – Beaming 7/5, 2012 at 21:6

possible duplicate of Creating sudoku initial boards – Angers 28/6, 2014 at 0:50

Adding another answer for generating a sudoku of desired difficulty on-the-fly.

This means that unlike other approaches the algorithm runs only once and returns a sudoku configuration matching the desired difficulty (with high probability within a range or with probability=1)

Various solutions for generating (and rating) a sudoku difficulty have to do with human-based techniques and approaches, which can be easily rated.

Then one (after having generated a sudoku configuration) re-solves the sudoku with the human-like solver and depending on the techniques the solver used (e.g pairs, x-wing, swordfish etc.) a difficulty rate is also assigned.

Problems with this approach (and requirements for the use case i had)

In order to generate a sudoku with given difficulty, with previous method one needs to solve a sudoku twice (once with the basic algorithm and once with the human-like solver).
One has to (pre-)generate many sudokus which can only be rated as to difficulty after being solved by the human-like solver. So one cannot generate a desired sudoku on-the-fly once.
The human-like solver can be complicated and in most cases (if not all) is tightly coupled to 9x9 sudoku grids. So no easy generalisation to other sudokus (e.g 4x4, 16x16, 6x6 etc.)
The difficulty rating of the human-like techniques is very subjective. For example why x-wing is taken to be more difficult than hidden singles? (personaly have solved many difficult published sudoku puzzles manualy and never used such techniques)

Another approach was used which has the following benefits:

Generalises well to arbitrary sudokus (9x9, 4x4, 6x6, 16x16 etc..)
The sudoku configuration, with desired difficulty, is generated once and on-the-fly
The difficulty rating is objective.

How it works?

First of all, the simple fact that the more difficult the puzzle, the more time it needs to be solved.

But time to be solved is intimately correlated to both number of clues (givens) and average alternatives to be investigated per empty cell.

Extending my previous answer, it was mentioned that for any sudoku puzzle the minimum number of clues is an objective property of the puzzle (for example for 9x9 grids the minimum number of clues for having a valid sudoku is 17)

One can start from there and compute minimum number of clues per difficulty level (linear correlation).

Furthermore at each step of the sudoku generation process, one can make sure the average alternatives (to be investigated) per empty cell is within given bounds (as a function of desired difficulty)

Depending on whether the algorithm uses backtrack or not (for the use case discussed the algorithm does no backtracking) the desired difficulty can be reached either with probability=1 or with high probability within bounds (respectively).

Tests of the sudokus generated with this algorithm and difficulty rating based on the previous approaches (human-like solver), show a correlation of desired and estimated difficulty rates, plus a greater ability for generalisation to arbitrary sudoku configurations.

(have used this online sudoku solver (and also this one) to correlate the difficulty rates of the test sudokus)

The code is available free on github sudoku.js (along with sample demo application), a scaled-down version of CrossWord.js a professional crossword builder in JavaScript, by same author

Coheir answered 24/2, 2015 at 15:31 Comment(1)

To be technically accurate, the minimum number of clues for any board to be well-defined is ≥ 17. It is not a trivial process to find a board that requires only 17 clues to be solved: most boards require more. – Manuel 12/12, 2023 at 8:4

It's not as elegant as what you ask, but you can simulate this behavior with caching:

Decide how many "buckets" you want for puzzles. For example, let's say you choose 20. Thus, your buckets will contain puzzles of different difficulty ranges: 0-.05, .05-.1, .1-.15, .. , .9-.95, .95-1
Generate a puzzle
Grade the puzzle
Put it in the appropriate bucket (or throw it away when the bucket is full)
Repeat till your buckets are "filled". The size of the buckets and where they are stored will be based on the needs of your application.

Then when a user requests a certain difficulty puzzle, give them a cached one from the bucket they choose. You might also want to consider swapping numbers and changing orientation of puzzles with known difficulties to generate similar puzzles with the same level of difficulty. Then repeat the above as needed when you need to refill your buckets with new puzzles.

Rabideau answered 23/5, 2012 at 14:48 Comment(0)

The sudoku difficulty is related in an interesting way to the (minimum) amount of information needed to specify a unique solution for a given grid.

Sounds like information theory, yes it has applications here too.

Sudoku puzzles should have a unique solution. Furthermore sudoku puzzles have certain symmetries, i.e by row, by column and by sub-square.

These symmetries specify the minimum number of clues (and their position more or less) needed so that the solution would be unique (i.e using a sudoku compiler or an algorithm like backtrack-search).

This would be the most difficult/hard sudoku puzzle level (i.e minimum needed number of clues). Then all other difficulty levels from less hard to easy are generated by allowing more clues than the minimum amount needed.

It should be noted that sudoku difficulty levels are not standard, as explained above, one can have as many or as few difficulty levels as one wants. What is standard is the minimum number (and position) of clues (which is the hardest level and which is relatd to the sudoku symmetries), then one can generate as many difficulty levels as one wants simply by allowing extra/redundant clues to be visible as well.

Coheir answered 4/8, 2014 at 1:2 Comment(2)

Surprised me to see someone comment on such an old post. This problem is long since passed, but I do appreciate your input. The method I ended up using was generating some staggeringly large number of puzzles(about 10 million) and then rating them. The rating they would be given was based on what sudoku solving techniques were required to solve them, and (objectively) how difficult those techniques were to use. Examples of solving techniques used for rating. – Beaming 15/8, 2014 at 18:21

@ZachLHelms, thanks, actually i dont mind answering old questions if i think i have sth to say. Lately i'm doing a (profesional) puzzle application involving sudokus among other puzzles. The technique you mention is standard, but i do not want to use such a technique as the application is online so i investigate alternative techniques using symmetries and minimum information, fact remains difficulty levels are not standard. Have answered some other posts about puzzle generation on similar work i do (at present) – Coheir 15/8, 2014 at 19:34

Well, you can't know how complicated it is, before you know how to solve it. And Sudoku solving (and therefore also the difficulty rating) belongs to the NP-C complexity class, that means it's (most likely) logically impossible to find an algorithm that is (asymptotically) faster than the proposed randomly-guess-and-check.

However, if you can find one, you have solved the P versus NP problem and should clear a cupboard for the Fields Medal... :)

Coomer answered 23/5, 2012 at 15:8 Comment(0)

Recommended topics

Hot tags