R: Plot Axis Display Values Larger than the Original Data

Asked 1/2, 2021 at 16:45 Answered 27/8, 2022 at 10:33

I am using the R programming language. I am following a tutorial on data visualization over here: https://plotly.com/r/3d-surface-plots/

I created my own data and made a 3D plot:

library(plotly)

set.seed(123)

#generate data
a = rnorm(100,10,10)
b = rnorm(100,5,5)
c = rnorm(100,5,10)
d = data.frame(a,b,c)

#3d plot
fig <- plot_ly(z = ~as.matrix(d))
fig <- fig %>% add_surface()

#view plot
fig

In the resulting plot, there is a point where "y = 97". I don't see how this is possible, since none of the values in the original data frame "d" are anywhere close to 97. I checked this by plotting the distribution of each variable in "d":

#plot individual densities 

plot(density(d$a), main = "density plots", col = "red")
lines(density(d$b), col = "blue")
lines(density(d$c), col = "green")

legend( "topleft", c("a", "b", "c"), 
text.col=c("red", "blue", "green") )

As seen here, none of the variables (a,b,c) from the original data frame "d" have any values that are close to 97.

Thus, my question: can someone please explain how the point (x = 0, y = 97, z = 25.326) can appear on this 3D plot?

Thanks

Theroid answered 1/2, 2021 at 16:45

I am not sure whether this resolves the problem, but I tried the approach from a previous Stack Overflow post, "3D Surface with Plot_ly in r, with x,y,z coordinates":

library(plotly)
set.seed(123)

#generate data
a = rnorm(100,10,10)
b = rnorm(100,5,5)
c = rnorm(100,5,10)
d = data.frame(a,b,c)


data = d

plot_ly() %>% 
  add_trace(data = data,  x=data$a, y=data$b, z=data$c, type="mesh3d" )

Now all of the values shown in the plot appear to come from the original data frame.

However, I am still not sure what the fundamental (and mathematical) difference between these two plots is.

I am curious to see what others have to say.

Thanks

Theroid answered 1/2, 2021 at 20:0

The problem is how your matrix is built. For a surface plot, the z-values (in your case the c variable) should be supplied as a matrix whose rows and columns act as coordinates on a grid, similar to a raster dataset. The values you currently see along the x- and y-axes are not the values of your a and b variables but the column and row numbers of your matrix. Because as.matrix(d) has 100 rows and 3 columns, the y-axis runs from 0 to 99, which is where y = 97 comes from. Open the volcano dataset in R and look at how its data are organized; that should make the expected layout clearer.
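To make this concrete, here is a minimal sketch in base R that reproduces the data from the question and looks up the hovered point directly in the matrix. It assumes plotly's surface trace numbers rows and columns from 0, so y = 97 is row 98 and x = 0 is column 1 (variable a); the name m is only for illustration.

```r
set.seed(123)

# Same data as in the question
a <- rnorm(100, 10, 10)
b <- rnorm(100, 5, 5)
c <- rnorm(100, 5, 10)
d <- data.frame(a, b, c)

# This is the matrix that was passed as z
m <- as.matrix(d)

# 100 rows and 3 columns: rows become y (0 to 99), columns become x (0 to 2)
dim(m)

# Hovered point (x = 0, y = 97): row 97 + 1, column 0 + 1.
# This should be about 25.326, the z value reported in the question.
m[97 + 1, 0 + 1]
```

In other words, the 97 is a row index into d, not a value of a, b, or c.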

Farnese answered 2/2, 2021 at 13:34

As Robbie mentioned, it's to do with how your data is organised. To convert XYZ data to the same format as the volcano dataset, you can use rasterFromXYZ() from the raster package:

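library(raster)
library(plotly)

# rasterFromXYZ() expects x and y (the first two columns) on a regular grid;
# scattered points such as a and b above may need gridding or interpolation first
r <- rasterFromXYZ(d)

# plot raster
plot_ly(z = as.matrix(r), type = "surface")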
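If you would rather not depend on the raster package, the same reshaping can be sketched in base R for data that already sit on a regular grid. Everything below (x_vals, y_vals, xyz, z_mat and the formula for z) is hypothetical example data, not the data from the question, and the plotly call only runs if plotly is installed.

```r
# Hypothetical gridded data: every (x, y) pair appears exactly once
x_vals <- seq(0, 4, by = 1)
y_vals <- seq(0, 3, by = 1)
xyz <- expand.grid(x = x_vals, y = y_vals)  # x varies fastest
xyz$z <- xyz$x^2 + xyz$y

# Reshape the long table into a matrix: one row per y, one column per x
z_mat <- matrix(xyz$z,
                nrow = length(y_vals),
                ncol = length(x_vals),
                byrow = TRUE)

# Passing x and y explicitly makes the axes show real coordinates
# instead of row and column numbers
if (requireNamespace("plotly", quietly = TRUE)) {
  fig <- plotly::plot_ly(x = x_vals, y = y_vals, z = z_mat, type = "surface")
  # print(fig) to view the surface
}
```

The layout matches volcano: each cell of z_mat is the height at one grid position, and the x and y vectors label the columns and rows.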

Marbleize answered 27/8, 2022 at 10:33
