Algorithmically generate a Zebra/Einstein puzzle
Asked Answered
B

6

22

Firstly I'm not necessarily looking for a complete algorithm I can just copy and paste, then call it a day. Any "general approach" solutions would be fine for me!

This entire post was spurred by a slow day at work, and stumbling upon this site and not being able to figure out how they implemented their generator.

The Problem

For those of you who don't know, the "Zebra Puzzle" or "Einstein's Puzzle" is a famous logic puzzle that you've probably ran into before.

The full wiki article is here, but I'll post the relevent bits.

There are five houses.
The Englishman lives in the red house.
The Spaniard owns the dog.
Coffee is drunk in the green house.
The Ukrainian drinks tea.
The green house is immediately to the right of the ivory house.
The Old Gold smoker owns snails.
Kools are smoked in the yellow house.
Milk is drunk in the middle house.
The Norwegian lives in the first house.
The man who smokes Chesterfields lives in the house next to the man with the fox.
Kools are smoked in the house next to the house where the horse is kept. 
The Lucky Strike smoker drinks orange juice.
The Japanese smokes Parliaments.
The Norwegian lives next to the blue house.

Now, who drinks water? Who owns the zebra? In the interest of clarity, it must be
added that each of the five houses is painted a different color, and their inhabitants
are of different national extractions, own different pets, drink different beverages 
and smoke different brands of American cigarets [sic]. One other thing: in statement 
6, right means your right. 

This is all well and good. I've found several concise and neat ways online to solve this problem, especially using constraint programming. However, what interests me is making more of these types of puzzles.

Making More

Obviously, a matrix representation is a logical way to think about this. With each column containing a person, house, what they drink, what type of car they drive, etc.

My initial thought was to start with a randomly generated grid that is complete (ie, solved) then (somehow) create hints from the solved version that uniquely identify it. Every time something can be determined, it's removed from the grid.

Ripping off the site I listed at the beginning, the following "hints" that can be used to solve the grid can be of the following type:

  • The person/animal/plant lives/grows in a given house.

  • The person/animal/plant does not live/grow in a given house.

  • The person/animal/plant lives in the same house as the other person/animal/plant.

  • The person/animal/plant is a direct neighbor of the other person/animal/plant.

  • The person/animal/plant is the left or right neighbor of other person/animal/plant.

  • There is one house between the person/animal/plant and the other person/animal/plant.

  • There is one house between the person/animal/plan and the other person/animal/plant on the left or right.

  • There are two houses between the person/animal/plant and the other person/animal/plant.

  • There are two houses between the person/animal/plan and the other person/animal/plant on the left or right.

  • The person/animal/plant lives left or right from the other person/animal/plant.

You can see how these could be generalized, extended, etc;

The difficulty is, using my approach (starting with a complete grid and generating these hints), I'm not sure how to make sure the set of hints I create would absolutely result in the target grid.

For example, if you say "The Englishman does not own a pine tree" you cannot decisively pair two things together at any given time in the puzzle. Yet if there were only two trees remaining to solve for, this could in fact be a decisive piece of evidence.

Am I thinking about this the entirely wrong way? Would a better approach be to create a grid with some randomized, pre-defined known elements (ie, the red house is in the middle) and then build up the grid using these hints as rules for building?

Any advice, articles to read, programming techniques to learn about, etc. would be greatly appreciated!

Beadle answered 28/1, 2013 at 19:12 Comment(1)
Related post: gamedev.stackexchange.com/questions/5909/…Araby
M
11

Here's a simple algorithm making use of your solver:

  1. Generate a random puzzle instance.

  2. Build a set C of all possible clues that pertain to this puzzle instance. (There are a finite and in fact quite small number of possible clues: for example if there are 5 houses, there are 5 possible clues of the form "Person A lives in house B", 8 possible clues of the form "Person A lives next to house B", and so on.)

  3. Pick a random permutation c1, c2, ..., cn of the clues in C.

  4. Set i = 1.

  5. If i > n, we are done. The set C of clues is minimal.

  6. Let D = C − { ci }. Run your solver on the set D of clues and count the number of possible solutions.

  7. If there is exactly one solution, set C = D.

  8. Set i = i + 1 and go back to step 5.

(You can speed this up by removing clues in batches rather than one at a time, but it makes the algorithm more complicated to describe.)

Mucor answered 28/1, 2013 at 20:0 Comment(5)
Thanks, Gareth! This is a great sounding solution!Beadle
+1: this looks right to me. Though step #2 is no small matter. It also leaves the possibility of redundant and/or obviated clues in the Set. Such as "A lives to the right of B" and "B lives next to A". Or less apparent ones like "A lives next to B", "B lives next to C" "C does not live next to A".Garter
This algorithm starts out with C containing all possible clues, including many redundant subsets. Then it removes clues from this set until no more can be removed without making the puzzle ambiguous. At this point there can be no redundant subsets remaining.Mucor
There is something cooler than this that is studied in automated theorem proving. If you extend the statements to 1st order logic, something cool shows up. For example, instead of "the Englishman lives in the red house", you can have "all Englishmen live in a red house" or "some Englishmen live in a red house" or.... What this creates is a universe and a set of predicates. Look familiar? In fact, this builds a theory. You can later ask is a statement true in the theory? There is a calculus that lets you convert this to computation (en.wikipedia.org/wiki/Resolution_(logic)).Migrant
The original problem can be converted into a universe and set of predicate in propositional logic. Because of the completeness theorem in propositional logic, every statement must be decidable (long story :p). No such completeness exists for 1st order logic assuming consistency, which is why the extension is cooler. It lets you generate more complex questions within your theory. Questions for which an answer cannot be "decided" in finite time.Migrant
M
4

I'm not entirely confident in this solution but this is how I would approach it:

Starting from a random solution (i.e. green house holds polish that smokes LM, red house holds irish that smokes cloves etc). you may look at that solution as a graph of relations between statements. where every element (polish, red house etc) is connected to all other elements either by a "yes" edge or a "no edge" (in our case the polish is connected to the green house with a "yes" edge and to the cloves with a "no edge" (amongst many other edges, this initial graph is a full, directional graph)).

Now, if you randomly take away edges, until you are left with a minimal connected graph, you should have a graph representing a solvable puzzle. translate every yes edge to "the foo is/does bar" and every no edge to "the foo isn't/doesn't bar".

intuitively this sounds about right to me. but again, this is in no way a formal or recognized way to do this, and may be entirely wrong.

Matadi answered 28/1, 2013 at 19:40 Comment(4)
That's a clever approach to it!, thanks for the suggestion! I'll play around with some algorithms tonight :) I'll mark this as an answer if I can confirm it works!Beadle
thanks, if this approach fails you can always use Gareths iterative approach. but I would like to know if my way works out or not :)Matadi
What is a "minimal connected graph" in this approach? And are you taking into account that there are four times as many NO connections as YES connections? (that is, YES connections are unique within one of either of its two degrees of freedom, but NO connections are not).Garter
a "no" edge represents a valid clue, just like a yes edge does, it is equivalent to starting with a set of all possible clues. how the yes to no ratio matter? a minimal connected graph would be one that allows for the puzzle to be solved, so it means a graph containing a path from at least one vertex to all other vertices. I've also just now realised this approach misses the more complex "the man who smokes cloves doesn't live in the blue house" type of clues. not sure if Gareths iterative solution covers those either..Matadi
M
2

You can also do it the other way around (which will get you a solver as well):

  1. Generate S, a list of all possible solutions (i.e. tables).
  2. Generate a random fact f (for example: the Norwegian has a cat).
  3. Set T = S
  4. Filter out from T all solutions that violate f.
  5. If |T| = 0 then goto 2 (the fact contradicts a previous one)
  6. If |T| < |S| then set S = T and F.append(f) (the fact is not already embodied by previous facts)
  7. IF |S| > 1 then goto 2

Once done - F will be the list of facts that lead to the only table left in S.

Admittedly, this is very much brute force, and will probably not work well with a table that is 5X5 or more.

Maneuver answered 9/12, 2013 at 10:8 Comment(0)
U
1

Interestingly, the "Einstein" Puzzle (quotes intended, everything "clever" tends to be assigned to Einstein maybe to have more glamour), is related to Sudoku Generation Algorithm (by a proper translation of terms) and also to Rubik Cube (3x3x3) Solving Algorithm

In the case of Sudoku, the clues correspond to already assigned numbers on the grid and the missing information to empty slots

In the case of Rubik Cube (which i find more interesting), the clues correspond to symmetries of the cube (eg green color is next to red color, sth like that) And the missing data are found by re-aligning (solving) the cube

This is a rough outline, thanks

Uniformity answered 5/5, 2014 at 19:26 Comment(0)
B
0

Here's another thought - once you've generated your complete grid take a photo of it on your phone (Or duplicate the design) before removing items as you provide clues. You might get half way through the task and forget what the original/ final layout is meat to look like and can avoid misinforming your subjects/ test takers.

Thinking of doing one for Easter, similar pattern, 5 people, 5 chocolate types, 5 ages, 5 different Easter hats, 5 different favourite drinks, ice creams etc.

Billen answered 26/3, 2019 at 13:56 Comment(0)
F
0

I think the foundation of Gareth Rees' answer is sound, but I think an additive approach rather than a subtractive one would work better.

You start with a complete set of potential clues (C), an empty set of accepted clues (D), and an empty puzzle solution (S).

You then loop through the following process until S is viable:

  1. Take the next clue ( C i ) from C.
  2. Update S by attempting to solve the puzzle with C i
  3. Has S changed? If yes, add C i to D; otherwise, discard it.
  4. Is S complete? If yes, you can stop and use D as your list of clues; otherwise, repeat from step 1.

This will absolutely guarantee that there are no unnecessary clues, because you only keep the clues that add to your puzzle solution. And it will likely be much faster than starting with dozens (or hundreds) of clues and whittling them down to a few. (Also note: With some tweaking, you can easily modify this process to reduce the puzzle difficulty by adding extra clues even after S is solvable.)

Falsify answered 22/4, 2023 at 19:36 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.