In case you're still stuck on this problem, as the other answers/comments only give you a partial answer -- here's a crack at the problem space.
You can actually use A* with a few modifications to capture a (mostly) unordered M point path. The only things you need to change are the heuristic and the termination criteria.
First you need to change your heuristic to account for paths through M points. The heuristic must be admissible and consistent, so it must equal a value less than or equal the true path cost and it can only decrease as you get closer to the goal (must be monotonic increasing).
You can think of the path you're now taking as M subpaths you must complete, each of which acts as a normal path. Thus for a single point graph (with only a terminating space) you can use a standard heuristic like Euclidean distance. This is a greedy estimate that suggests a straight path is optimal and for which you can't do better under ideal circumstances (it's admissible).
For more than one path the greedy approach similarly says that an unblocked straight path between points is the fastest you can go. It's also still a consistent heuristic because you can't step further away and still have a better score. The hard part is picking which ordering of M points is the fastest without obstructions in order to maintain an admissible heuristic. You can find the optimal choice of M points in a graph where all tiles are walk-able by breadth first searching the Euclidean distance from your current tile to each of M points, to each of M-1 remaining points, ..., to the terminating square. This operation is expensive as you need to do it for each square you reach -- but you can use dynamic programming or search caching to bring the required amortized computation down to O(M) per step.
Now, once you have the M point paths which would be fastest without obstacles you can use the Euclidean distance between each point in that path and your current position as the heuristic. It's a greedy movement estimate, so it's always admissible (you can't beat the estimated cost) and it's consistent because you can't walk away from the next greedy optimal point and reduce your cost because choosing a different greedy point from the current tile would not be admissible.
Finally your terminating criteria needs to change to reaching M points, where the last point is the terminating tile. This imitates walking M graphs without needing to construct M! possible graphs to walk. The provided heuristic will let A* do it's magic without changing the underlying algorithm and should be about as effective as you can get while maintaining the required constraints on the heuristic on generic grids.