I recently implemented Minimax and Alpha Beta Pruning algorithms and I am 100% sure that(autograder) I implemented them correctly. But when I executed my program they behaved differently. I am 99% sure that the end state of minimax and Alpha beta should be the same. Am I right? Can they differ on their path to achieve the result? Because we ignored some values min will select which will not be selected by max or vice versa.
I know this is an old question however....
Yes Alpha-beta and minimax returns the same answer. All Alpha-Beta does is prevent minimax from making calculations that are 100% guaranteed to NOT be an optimal state for the current player (MAX or MIN).
You may however have equivalent actions for a given state. How your algorithm decides which equivalent actions to return depends on how it is implemented. If sets/unordered lists are used somewhere, the order in which evaluations are made may change.
This may also depends on what you do if Alpha/Beta value is equal to the current best option. Since equal values would not produce a better result, there's not point in exploring that path further. Therefore you would just keep the "first best action encountered". However with Minimax, you explore everything anyways, so you may decide to keep the "last best" value. That's one case where Minimax would return a different action than Alpha-Beta. But they are still equivalent as far as your scoring function is concerned...
© 2022 - 2024 — McMap. All rights reserved.