Strange behaviour when computing svd on a covariance matrix: different results between Microsoft R and vanilla R

I was running a principal component analysis on my MacBook with Microsoft R 3.3.0 when I got some strange results. After double-checking with a colleague, I realised that the output of the svd function differs from what vanilla R gives.

To reproduce the result, please load the file (~78 MB) here

With Microsoft R 3.3.0 (x86_64-apple-darwin14.5.0) I get:

> sv <- svd(Cx)
> print(sv$d[1:10])

 [1] 122.73664 104.45759  90.52001  87.21890  81.28256  74.33418  73.29427  66.26472  63.51379
[10]  55.20763

With vanilla R (both R 3.3 and R 3.3.1, on two different Linux machines) I get instead:

> sv <- svd(Cx)
> print(sv$d[1:10])

 [1] 122.73664  34.67177  18.50610  14.04483   8.35690   6.80784   6.14566
 [8]   3.91788   3.76016   2.66381

This does not happen with all data: if I create a random matrix and apply svd to it, I get the same results on both setups. So it looks like some sort of numerical instability, doesn't it?

UPDATE: I computed the SVD of the same matrix (Cx) on the same machine (MacBook) with the same version of R, this time using the svd package, and I finally get the "right" numbers. So the problem seems to be due to the svd implementation used by Microsoft R Open.

UPDATE: The behaviour also occurs on MRO 3.3.1.
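
To tell which of two conflicting outputs is correct without relying on a second SVD implementation, a few implementation-independent checks can help. The following is a minimal sketch, not something from the original post: it builds a small synthetic covariance matrix as a stand-in for Cx (the real 78 MB file is not used here) and relies on the assumption that Cx is symmetric positive semi-definite, as a covariance matrix should be.

```r
# Stand-in for the Cx of the question: a small synthetic covariance matrix,
# so that the snippet runs on its own (an assumption, not the real data).
set.seed(1)
X  <- matrix(rnorm(200 * 5), nrow = 200, ncol = 5)
Cx <- cov(X)

sv <- svd(Cx)

# Check 1: U %*% diag(d) %*% t(V) should reproduce Cx up to rounding error.
recon   <- sv$u %*% diag(sv$d) %*% t(sv$v)
rel_err <- norm(Cx - recon, type = "F") / norm(Cx, type = "F")

# Check 2: for a symmetric positive semi-definite matrix the singular
# values coincide with the eigenvalues.
ev       <- eigen(Cx, symmetric = TRUE)$values
max_diff <- max(abs(sv$d - ev))

# Check 3: for such a matrix the singular values sum to the trace.
trace_diff <- abs(sum(sv$d) - sum(diag(Cx)))

print(c(rel_err = rel_err, max_diff = max_diff, trace_diff = trace_diff))
```

All three numbers should be close to machine precision relative to the scale of Cx. On the real matrix, a set of singular values that fails the trace check or the reconstruction check is likely the wrong one.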

Bloodandthunder answered 14/10, 2016 at 22:18 Comment(6)
With MRS 3.2.2 on Windows, I get a result matching vanilla R. Maybe double-check that you're using the same data on both machines. – Feldman
Use a seed, and make sure you're using the same seed on both machines to help verify. – Caulis
@HongOoi I've checked, it's the same data. Maybe it's something related to the Mac libraries... – Bloodandthunder
I see that Microsoft R on Mac uses the Apple Accelerate framework for BLAS. I'd like to use 1 thread instead of the 4 I'm currently using, but I don't know how to set it. – Bloodandthunder
Could you post your update as an answer ... ? – Saddler
@BenBolker Well, it still seems only a partial answer; I'll post something more complete asap. – Bloodandthunder

This appears to be a bug, as confirmed on the microsoft-r-open GitHub repository. The maintainers say the behaviour is under investigation and that it is related to the Accelerate library on macOS.

Bloodandthunder answered 4/11, 2016 at 11:1 Comment(0)

This kind of example typically forms an ill-conditioned matrix. Some singular values are close to zero, which makes the decomposition numerically sensitive to differences between SVD implementations; that is probably what you are seeing.
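
One way to check for ill-conditioning is to look at the ratio between the largest and smallest singular values (the 2-norm condition number). The snippet below is only a sketch: it uses a hypothetical, nearly rank-deficient matrix as a stand-in for Cx, since the original data are not reproduced here.

```r
# Hypothetical stand-in for Cx: two columns are almost exact linear
# combinations of the other three, so the covariance is nearly singular.
set.seed(42)
base   <- matrix(rnorm(100 * 3), nrow = 100, ncol = 3)
extra1 <- base[, 1] + base[, 2] + 1e-6 * rnorm(100)
extra2 <- base[, 3] - base[, 1] + 1e-6 * rnorm(100)
Cx     <- cov(cbind(base, extra1, extra2))

d <- svd(Cx)$d

# 2-norm condition number: largest over smallest singular value.
cond <- d[1] / d[length(d)]

print(d)
print(cond)
```

A very large ratio means the trailing singular values carry little reliable information. Base R's kappa(Cx, exact = TRUE) should report a comparable figure.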

Sucker answered 16/10, 2016 at 20:28 Comment(1)
Probably, but the difference is huge: you get different principal components on two different machines... – Bloodandthunder