What is going on with floating point precision here?
Asked Answered
H

2

6

This question is in reference is an observation from a code-golf challenge.

The submitted R solution is a working solution, but a few of us (maybe just I) seems to be dumbfounded as to why the initial X=m reassignment is necessary.

The code is golfed down a bit by @Giuseppe, so I'll write a few comments for the reader.

function(m){
            X=m
            # Re-assign input m as X

            while(any(X-(X=X%*%m))) 0
            # Instead of doing the meat of the calculation in the code block after `while`
            # OP exploited its infinite looping properties to perform the
            # calculations within the condition check.
            # `-` here is an abuse of inequality check and relies on `any` to coerce
            # the numeric to logical. See `as.logical(.Machine$double.xmin)`
            # The code basically multiplies the matrix `X` with the starting matrix `m`            
            # Until the condition is met: X == X%*%m

            X
            # Return result
           }

Well as far as I can tell. Multiplying X%*%m is equivalent to X%*%X since X is a just an iteratively self-multiplied version of m. Once the matrix has converged, multiplying additional copies of m or X does not change its value. See linear algebra textbook or v(m)%*%v(m)%*%v(m)%*%v(m)%*%v(m)%*%m%*%m after defining the above function as v. Fun right?

So the question is, why does @CodesInChaos's implementation of this idea not work?

function(m){while(any(m!=(m=m%*%m)))0 m}

Is this caused by a floating point precision issue? Or is this caused by the a function in the code such as the inequality check or .Primitive("any")? I do not believe this is caused by as.logical since R seems to coerce errors smaller than .Machine$double.xmin to 0.

Here is a demonstration of above. We are simply looping and taking the difference between m and m%*%m. This error becomes 0 as we try to converge the stochastic matrix. It seems to converge then blow to 0/INF eventually depending on the input.

mat = matrix(c(7/10, 4/10, 3/10, 6/10), 2, 2, byrow = T)

m = mat
for (i in 1:25) {
  m = m%*%m
  cat("Mean Error:", mean(m-(m=m%*%m)), 
      "\n Float to Logical:", as.logical(m-(m=m%*%m)),
      "\n iter", i, "\n")
}

Some additional thoughts on why this is a floating point math issue

1) the loop indicates that this is probably not a problem with any or any logical check/conversion step but rather something to do with float matrix math.

2) @user202729's comment in the original thread that this issue persists in Jelly, a code golf language gives more credence to the idea that this is a perhaps a floating point issue.

Homestretch answered 8/3, 2018 at 19:1 Comment(6)
Additionally, in the first test case, we get convergence to a matrix of zeros, and in the second, we obtain a matrix of Inf, despite the fact that the convergence matrices are identical. This is particularly notable since, as stochastic matrices, we should never ever have entries >1.Richierichlad
With the X %*% m version after k passes through the loop X = m ^ k but in the m %*% m version after k passes through the loop m = (original m) ^ 2 ^ k. How that effects convergence, I don't know.Gonzalez
This doesn't always converge, right? See matrix(1:4, 2).Inflame
@F.Privé That isn't a stochastic matrix.Gonzalez
I missed this detail. Let me try that with t(apply(matrix(runif(4), 2), 1, function(x) x / sum(x))) instead.Inflame
Using while(any(m-(m=m%*%m))) print(m) for the loop shows how weird things can be -- it seems like it is converging for many iterations and then seemingly out of nowhere destabilizesGonzalez
G
5

The different methods iterate different functions, both starting with seed value m. Function iteration only converges to a given fixed point if that fixed point is stable and the seed is within the basin of attraction of that fixed point.

In the original code, you are iterating the function

f <- function(X) X %*% m

The limit matrix is a stable fixed-point under the assumption (stated in the Code Gulf problem) that a well-defined limit exists. Since the function definition depends on m, it isn't surprising that the fixed point is a function of m.

On the other hand, the proposed variation using m = m %*% m is obtained by iterating the function

g <- function(X) X %*% X

Note that all idempotent matrices are fixed points of this function but clearly they can't all be stable fixed points. Apparently, the limiting matrix in the original fixed function is not a stable fixed point of g (even though it is a fixed point).

To really nail this down, you would need to get into the theory of matrix fixed points under function iteration to show why the fixed point in the case of g is unstable.

Gonzalez answered 4/4, 2018 at 15:12 Comment(0)
H
-1

This is indeed a floating point math issue. To see it, see the results of this function:

test2 <- function(m) {
  c <- 0
  res <- list()
  while (any(m!=(m=m%*%m))) {
    c <- c + 1
    res[[c]] <- m
  }
  print(c)
  res
}

To test equality with some tolerance, you can use:

test3 <- function(m) {
  while (!isTRUE(all.equal(m, m <- m %*% m))) 0
  m
}
Handlebar answered 4/4, 2018 at 14:40 Comment(2)
This doesn't really explain why the calculation with X = X%*%m converges but this other one doesn't. They both are done with floating point computations and they both mathematically "should" converge to the steady-state probability matrix.Gonzalez
As you explained in your comment, the new implementation uses "big jumps" in the power iterations. This must be the cause of this instability.Inflame

© 2022 - 2024 — McMap. All rights reserved.