Finite Metric Embeddings: Good Algorithm?

I have a finite metric space given as a (symmetric) k-by-k distance matrix. I would like an algorithm to (approximately) isometrically embed this in Euclidean space R^(k-1). While it is not always possible to do this exactly by solving the system of equations given by the distances, I am looking for a solution that embeds with some (very small) controllable error.

I currently use multidimensional scaling (MDS) with the output dimension set to (k-1). It occurs to me that in general MDS may be optimized for the situation where you are trying to reduce the ambient embedding dimension to something less than (k-1) (typically 2 or 3), and that there may be a better algorithm for my restricted case.

Question: What is a good/fast algorithm for realizing a metric space of size k in R^{k-1} using the Euclidean distance?

Some parameters and pointers:

(1) My k's are relatively small. Say 3 < k < 25

(2) I don't actually care if I embed in R^{k-1}. If it simplifies things/makes things faster any R^N would also be fine as long as it's isometric. I'm happy if there's a faster algorithm or one with less error if I increase to R^k or R^(2k+1).

(3) If you can point to a python implementation I'll be even happier (a rough sketch of the classical MDS I currently use is at the end of the question).

(4) Anything better than MDS would work.
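For reference, here is a minimal numpy sketch of the classical (Torgerson) MDS I have in mind, with the output dimension left at its maximum; the function name and tolerance are just placeholders:

import numpy as np

def classical_mds(D, tol=1e-12):
    # D is the k-by-k matrix of pairwise distances (not squared).
    k = D.shape[0]
    J = np.eye(k) - np.ones((k, k)) / k      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # Gram matrix of the centered points
    eigvals, eigvecs = np.linalg.eigh(B)     # eigenvalues in ascending order
    # Negative eigenvalues mean the metric has no exact Euclidean realization;
    # their magnitude measures the unavoidable error.
    keep = eigvals > tol
    return eigvecs[:, keep] * np.sqrt(eigvals[keep])

X = classical_mds(D) gives at most k-1 columns, and np.linalg.norm(X[i] - X[j]) reproduces D[i, j] exactly whenever B is positive semidefinite.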

Ramayana answered 28/9, 2011 at 17:16 Comment(9)
Do your distances actually obey the triangle inequality?Whistler
This might be better suited for math.stackexchange.com (probably not advanced enough for mathoverflow).Yuki
I am not sure if there is a faster algorithm than MDS that solves this in this setting. Where does your distance matrix come from? You might be better off immediately reducing the dimension of the data that you produced the distance matrix from (only if this is possible of course).Retinoscope
@j_random_hacker: Yes they do. It's a metric space linkRamayana
@LiKao: The distance matrix comes from the bottleneck distance between the persistent homology of a collection of point clouds.Ramayana
I had this problem stuck in my mind all day, and I am very sure that you do not need a quadratic solver at all. I am currently on a train, but I will post a solution as soon as I am back home, so you can decide if it is useful.Retinoscope
I just found a counterexample to your reasoning that such a placement always exists. Taking the system of distances d_{1,2}=d_{2,1}=1; d_{1,3}=d_{3,1}=1; d_{1,4}=d_{4,1}=1; d_{2,3}=d_{3,2}=2; d_{2,4}=d_{4,2}=sqrt(2); d_{3,4}=d_{4,3}=1; there is no embedding of these four points in R^3. Note that p_1, p_2 and p_3 have to lie on a straight line to satisfy their distances. Also d_{1,4}, d_{2,4} and d_{1,2} force p_1, p_2 and p_4 to form a right-angled triangle with the right angle at p_1, so the segment p_1p_4 is perpendicular to the line through p_1, p_2, p_3. Hence p_1, p_4, p_3 also form a right triangle with legs of length 1, and the distance d_{4,3}=1 cannot be achieved.Retinoscope
Yes you're right. I was misremembering a theorem (finite metric spaces embed into l_\infty, not l_2 as I said). An easy example: take the square with corners distance one apart but set the diagonal distances to 2. I didn't check, but your example also works. To see that you can embed into l_\infty, take X \rightarrow \R^{|X|} given by x \mapsto (...,d(x,y),...) (with d(x,y) in the yth coordinate spot). You can use the triangle inequality to show that this works.Ramayana
You can also characterize when a finite metric space embeds into l_2 in terms of a certain associated matrix (that you can easily calculate and check). link. With this in mind I am going to change my question to: what methods exist to embed into l_2, and can you quantify their error and efficiency?Ramayana

Why not try LLE? You can also find the code there.
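For what it's worth, a minimal sketch with scikit-learn's LocallyLinearEmbedding; note that this assumes you still have the original point cloud X, since LLE works on coordinates rather than on a precomputed distance matrix (the data below is only a placeholder):

import numpy as np
from sklearn.manifold import LocallyLinearEmbedding

X = np.random.rand(20, 10)                               # placeholder point cloud, k = 20 points
lle = LocallyLinearEmbedding(n_neighbors=5, n_components=3)
Y = lle.fit_transform(X)                                 # embedded coordinates, shape (20, 3)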

Gide answered 2/3, 2012 at 17:49 Comment(0)

OK, as promised, here is a simple solution:

Notation: Let d_{i,j}=d_{j,i} denote the squared distance between points i and j. Let N be the number of points. Let p_i denote point i and p_{i,k} the k-th coordinate of that point.

Let's hope I derive the algorithm correctly now. There will be some explanations afterwards so you can check the derivation (I hate it when many indexes appear).

The algorithm uses dynamic programming to arrive at the correct solution.

Initialization:
p_{i,k} := 0 for all i and k   (point 1 stays at the origin)

Calculation:
for i = 2 to N do
   sum = 0
   for j = 2 to i - 1 do
     accu = d_{i,1} - d_{i,j} + p_{j,j-1}^2
     for k = 1 to j-2 do
       accu = accu + p_{j,k}^2 - 2 p_{i,k}*p_{j,k}
     done
     p_{i,j-1} = accu / ( 2 p_{j,j-1} )
     sum = sum + p_{i,j-1}^2
   done
   p_{i,i-1} = sqrt( d_{i,1} - sum )
done
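As a quick sanity check (my own toy example, not from the question): take three points with all pairwise distances 1, so all squared distances d_{i,j} = 1. The algorithm puts p_1 at the origin and gives p_{2,1} = sqrt(d_{2,1}) = 1. For i = 3 the single linear step gives accu = d_{3,1} - d_{3,2} + p_{2,1}^2 = 1, hence p_{3,1} = 1/2, and the final square root gives p_{3,2} = sqrt(1 - 1/4) = sqrt(3)/2, which is the expected equilateral triangle.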

If I have not done any grave index mistakes (I usually do) this should do the trick.

The idea behind this:

We set the first point arbitrarily at the origin to make our life easier. Note that for a point p_i we never set a coordinate k with k > i-1. I.e. for the second point we only set the first coordinate, for the third point we only set the first and second coordinates, etc.

Now let's assume we have coordinates for all points p_j with j < i and we want to calculate the coordinates for p_i so that all d_{i,j} with j < i are satisfied (we do not care yet about any constraints involving points that come later). If we write down the set of equations we have

d_{i,j} = \sum_{k=1}^{N-1} (p_{i,k} - p_{j,k})^2

Because both p_{i,k} and p_{j,k} are equal to zero for k > i-1 we can reduce this to:

d_{i,j} = \sum_{k=1}^{i-1} (p_{i,k} - p_{j,k})^2

Also note that by the loop invariant all p_{j,k} will be zero when k>j-1. So we split this equation:

d_{i,j} = \sum_{k=1}^{j-1} (p_{i,k} - p_{j,k})^2 + \sum_{k=j}^{i-1} p_{i,k}^2

For the first point (j = 1, i.e. the origin) we just get:

d_{i,1} = \sum_{k=1}^{i-1} p_{i,k}^2

This one will need some special treatment later.

Now if we expand the binomials in the general equation we get:

d_{i,j} = \sum_{k=1}^{j-1} ( p_{i,k}^2 - 2 p_{i,k} p_{j,k} + p_{j,k}^2 ) + \sum_{k=j}^{i-1} p_{i,k}^2

Subtract the equation for j = 1 from this and you are left with:

d_{i,j} - d_{i,1} = \sum_{k=1}^{j-1} ( p_{j,k}^2 - 2 p_{i,k} p_{j,k} )

for all j > 1.

If you look at this you'll note that all squares of coordinates of p_i are gone and the only squares we need are already known. This is a set of linear equations that can easily be solved using methods from linear algebra. Actually there is one more special thing about this set of equations: the equations are already in triangular form, so you only need the final substitution step. After that we are left with one single quadratic equation (the constraint to the first point), which we can solve by taking one square root.

I hope you can follow my reasoning. It's a bit late and my head is a bit spinning from those indexes.

EDIT: Yes, there were indexing mistakes. Fixed. I'll try to implement this in python when I have time in order to test it.
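In the meantime, here is a rough (untested) NumPy transcription of the pseudocode above. It takes the ordinary (non-squared) distance matrix and simply gives up (raises an error) when it hits a zero pivot or a negative value under the square root, which covers both the degenerate case discussed in the comments below and metrics with no Euclidean realization:

import numpy as np

def place_points(D, tol=1e-12):
    # D: symmetric N-by-N matrix of pairwise distances (not squared).
    # Returns an N-by-(N-1) array of coordinates, point 0 at the origin.
    D2 = np.asarray(D, dtype=float) ** 2          # squared distances d_{i,j}
    N = D2.shape[0]
    P = np.zeros((N, max(N - 1, 1)))
    for i in range(1, N):
        s = 0.0
        for j in range(1, i):
            # solve the (triangular) linear equation for coordinate j-1 of point i
            accu = D2[i, 0] - D2[i, j] + P[j, j - 1] ** 2
            for k in range(j - 1):
                accu += P[j, k] ** 2 - 2.0 * P[i, k] * P[j, k]
            if abs(P[j, j - 1]) < tol:
                raise ValueError("degenerate configuration (earlier points span fewer dimensions)")
            P[i, j - 1] = accu / (2.0 * P[j, j - 1])
            s += P[i, j - 1] ** 2
        rest = D2[i, 0] - s
        if rest < -tol:
            raise ValueError("metric has no exact Euclidean embedding")
        P[i, i - 1] = np.sqrt(max(rest, 0.0))
    return P

For the sizes in the question (k < 25) this runs instantly, and checking np.linalg.norm(P[i] - P[j]) against D[i, j] is an easy way to verify a run.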

Retinoscope answered 29/9, 2011 at 22:5 Comment(13)
One additional note: It might actually not be necessary to use the full R^{N-1}, for example if three points are on a line, four points on a plane, etc. In this case there will be a division by zero in the algorithm above. This can easily be accounted for by detecting when a newly calculated coordinate will be set to zero in the first calculation. In this case you can just skip this point for the rest of the calculation and reduce the number of dimensions of the final point set.Retinoscope
Ok, I was wrong in my last comment. As exemplified by the comment to the original question, this problem might not have a solution. That means the algorithm can only find the answer in some cases where an answer exists. I'll be looking for a proof that it finds the answer in all such cases, but have not yet found one.Retinoscope
The complexity of this algorithm is O(n^4); I doubt it is going to be better than MDS.Gide
@g24l: How did you arrive at O(n^4)? The code in the inner loop takes O(1), thus the whole loop will take O(j), the second level then takes O(i^2), and finally the full algorithm takes O(n^3), not O(n^4). Also MDS takes O(n^2) for each iteration with an unknown number of iterations until convergence (possibly infinite time). So unless you can be sure that n iterations will be enough to converge, this algorithm is better, because it has a guarantee.Retinoscope
You end up with n^2 equations, which will require at least O((n^2)^2) operations to solve at each step. Thus your complexity is O(n^3)+O(Kn^4) where K is the number of iterations for convergence. Moreover, the linear system at the end should be numerically sensitive.Gide
@g24l: Look at the algorithm again. One trick is that the equations will be in triangular form. Solving each line of the triangular matrix takes O(n). There are O(n) lines, so we have O(n*n) steps for solving each matrix. You solve the matrix once for each point, so we have O(n*n*n). There are not O(n^2) equations at each step, since only the last point to be added needs to be compared to all other points (so O(n) equations). The three loops from i=2 to n, j=2 to i-1 and k=1 to j-2 reflect this structure. Your complexity will be O(\sum_{i=2}^{N}\sum_{j=2}^{i-1}\sum_{k=1}^{j-2} 1) = O(n^3).Retinoscope
You misunderstood me. The first part of the algorithm is O(n^3), that's for sure. The linear system though, even in triangular form, will require O((n^2)(n^2+1)/2) = O(((n^2)^2)/2) = O(n^4) operations, i.e. K=1/2. Thus your program is O(n^4) worst-case time complexity. That is, you have n^2 unknowns on the diagonal, if I understand correctly, and for the last one you need to make n^2 replacements in the worst case and one division. Thus your complexity is given by \sum_{i=1}^{n^2} i = 1/2 n^2(n^2+1).Gide
@g24l: No, you are still assuming one (or more) linear systems with O(N^2) equations. Look at the algorithm. It is not using such a large linear system. There are N independent linear systems with a maximum of N equations each. Each of these takes at most O(N^2) to solve. So we have O(N)*O(N^2)=O(N^3). I am not sure what you mean by "first part of the algorithm". The initialisation will surely take O(N^2) since you need to assign N coordinates (making them zero) to N points. Then there are the three loops, which have the sum structure I pointed out above, for a complexity of O(N^3).Retinoscope
@g24l: I think you are assuming there is some point where each point is compared to each other point, leading to a system of O(N^2) equations. However there never is an NxN comparison. Instead there are N comparisons of size 1xO(N) each. Each point will only be compared to all the points already assigned. The first point takes no comparisons. The second point has one comparison, so one equation. The third point has two comparisons and two equations, etc. This means O(N^3) stands. Hence the indexes of the loops, which always only run up to the current index of the previous loop.Retinoscope
Maybe I am not getting it the way you write it, but the last equation has a d_{i,j} in it, which implies n^2 equations. Maybe I just don't get it, but still O(n^3) is much slower than having to run MDS with O(n^2) at each iteration, when the iterations to convergence are fewer than n.Gide
That's why I said to look at the algorithm. The last equation at each iteration is using d_{i,j}, but i and j are not both running from 1 to N. i is running from 2 to N over the course of the algorithm, and j is running from 2 to i-1. Hence, as I said, you have a maximum of N equations at each iteration (the max is reached when i reaches N in the final step). Sure, this algorithm takes more time than MDS if MDS converges in N steps (but this cannot be determined beforehand). Also MDS might get stuck in local minima or return a solution with stress left, whereas this algorithm will find an exact solution or fail.Retinoscope
@g24l: i is running from 2 to N during the iterations of the top loop. At each iteration a system of i-1 equations is created and j runs from 2 to i-1 over these equations. Then k runs from 1 to j-2 over the terms in each equation. Each single term of each equation takes O(1) to calculate. Hence you have \sum_{i=2}^{N}\sum_{j=2}^{i-1}\sum_{k=1}^{j-2} O(1) = O(n^3) for the complete running time.Retinoscope
@g24l: One further problem that might confuse you is that I may have used i,j,k inconsistently in the algorithm and in the explanations. I tried to reconstruct it and I am not sure either that I can get through the index mess I created. The indexes i,j,k I used in the comments here refer to the i,j,k of the algorithm though. A point where you are correct is that there is a total of N^2 equations, but these are not all of the same length and will not be solved in one go. Because of the difference in length, these N^2 equations only have N^3 terms (and not N^4 as in the general case).Retinoscope
