TSP: Worst case ratio grows
As part of my high school thesis I am describing heuristics for the Traveling Salesman Problem. I was reading this case study (page 8), but I cannot understand what these sentences mean:

The running time for NN as described is Θ(N²). [...] In particular, we are guaranteed that NN(I)/OPT(I) ≤ (1/2)(log₂ N + 1).

That part is very clear to me. But then:

No substantially better guarantee is possible, however, as there are instances for which the ratio grows as Θ(log N).

What is the meaning of "there are instances for which"?

The same thing happens with the greedy algorithm:

...but the worst examples known for Greedy only make the ratio grow as (log N)/(3 log log N).

So what's the meaning of those statements? Does it have to do with non-Euclidean instances (I wouldn't think so, because you just have to read through the column of a distance matrix in order to solve it)? Or just instances with, e.g., multiple nodes at the same distance from the starting node, which require the algorithm to split the solution tree or something similar?

EDIT: Thanks to @templatetypedef (your answer will still be accepted as correct), it all makes sense. However, I would like to ask if someone knows any example (or even just a link) of these specific graphs (it doesn't matter for which algorithm). I don't think that it is too much off-topic; it would rather add something useful to the topic.

Renzo answered 19/8, 2015 at 18:8 Comment(3)
Your high school thesis seems more advanced than some undergraduate theses... +1 – Gambrell
There exists a constant c > 0 such that for every integer n > 0 there exists a TSP instance with n points such that (the length of the tour produced by NN) / (the length of the optimal tour) > c log n. You can probably mentally rewrite c = 0.000001 (say). – Leucoplast
Re your edit: down.cenet.org.cn/upfile/47/20061030111839196.pdf – Leucoplast
Take a look at these two statements side-by-side:

In particular, we are guaranteed that NN(I)/OPT(I) ≤ (1/2)(log₂ N + 1).

No substantially better guarantee is possible, however, as there are instances for which the ratio grows as Θ(log N).

The first statement says that algorithm NN, in the worst case, produces an answer that is at most (roughly) a factor of ½ lg N larger than the true optimum (to see this, just multiply both sides by OPT(I)). That's great news! The natural follow-up question, then, is whether the actual bound is even tighter than that. For example, that result doesn't rule out the possibility that we might also have NN(I)/OPT(I) ≤ log log N or NN(I)/OPT(I) ≤ 2. Those are much tighter bounds.

That's where the second statement comes in. This statement says that there are known instances of TSP (that is, specific graphs with weights on them) where the ratio NN(I) / OPT(I) is Θ(log n) (that is, the ratio is proportional to the log of the number of nodes in the graph). Because we already know about inputs like this, there's no possible way that NN(I) / OPT(I) could be bounded from above by something like log log n or 2, since those bounds are too tight.
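To see the ratio NN(I)/OPT(I) concretely, here is a minimal Python sketch comparing nearest neighbor against a brute-force optimum. The four-point instance is made up for illustration; it is not one of the worst-case families the paper refers to, it just shows NN landing above optimal while staying inside the guarantee:

```python
from itertools import permutations

# Hypothetical tiny instance: four cities on a line at these coordinates.
points = [0.0, 1.0, -1.5, 4.0]
dist = [[abs(a - b) for b in points] for a in points]

def tour_length(tour):
    """Length of the closed tour visiting the given city indices in order."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def nearest_neighbor(start=0):
    """Greedy NN tour: repeatedly hop to the closest unvisited city (Θ(N²) overall)."""
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        here = tour[-1]
        nxt = min(unvisited, key=lambda j: dist[here][j])
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def optimal():
    """Brute-force optimum: try every ordering of the remaining cities."""
    return min(((0,) + p for p in permutations(range(1, len(points)))),
               key=tour_length)

nn_len = tour_length(nearest_neighbor())   # 13.0 on this instance
opt_len = tour_length(optimal())           # 11.0 on this instance
ratio = nn_len / opt_len                   # ≈ 1.18, within the (1/2)(log2 N + 1) bound
```

The worst-case families from the paper are carefully built so that this ratio keeps growing like log N as the instance gets bigger, rather than settling at a constant as it does here.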

However, that second statement in isolation is not very useful. We know that there are inputs that would cause the algorithm to produce something that's off by a log factor, but might it still be possible that there are inputs that cause it to be off by a lot more; say, by an exponential factor? With the first statement, we know that this can't happen, since we know we never get more than a log factor off.

Think of these statements in two steps: the first statement gives an upper bound on how bad the approximation can be - it's never worse than a ½(lg N + 1) factor of optimal. The second statement gives a lower bound on how bad the approximation can be - there are specific cases where the algorithm can't do any better than a Θ(log N) approximation of the optimal answer.
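As a sanity check on how slowly that upper bound grows, you can tabulate ½(log₂ N + 1) for a few instance sizes (a throwaway snippet, not from the paper):

```python
import math

def nn_upper_bound(n):
    """The paper's guarantee: NN(I)/OPT(I) <= 0.5 * (log2(N) + 1)."""
    return 0.5 * (math.log2(n) + 1)

for n in (16, 1024, 10**6):
    print(n, nn_upper_bound(n))
# Even at a million cities, NN is guaranteed to be within a factor of ~10.5 of optimal.
```

So the upper and lower bounds sandwich the worst case fairly tightly: both grow logarithmically, just with different constants.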

(Note that the Θ(log n) here has nothing to do with the runtime - it's just a way of saying "something logarithmic.")

Going forward, keep an eye out for both upper bounds and lower bounds. The two collectively tell you a lot more than any one individually does.

Best of luck with the thesis!

Ammonium answered 19/8, 2015 at 18:17 Comment(0)
