How to interpolate data between sparse points to make a contour plot in R & plotly

Asked 16/2, 2019 at 19:29 Answered 25/2, 2019 at 1:10

Solved r plotly contour r-plotly extrapolation

I'd like to create a contour plot on th xy plane from concentration data at the following coloured points in the fist figure. I don't have corner points at each height so I need to extrapolate the concentration to the edges of the xy plane (xlim=c(0,335),ylim=c(0,426)).

The plotly html file of the points is available here: https://leeds365-my.sharepoint.com/:u:/r/personal/cenmk_leeds_ac_uk/Documents/Documents/HECOIRA/Chamber%20CO2%20Experiments/Sensors.html?csf=1&e=HiX8fF

dput(df)
structure(list(Sensor = structure(c(11L, 12L, 13L, 14L, 15L, 
16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 
29L, 1L, 3L, 4L, 5L, 6L, 8L, 30L, 31L, 32L, 33L, 34L, 35L), .Label = c("N1", 
"N2", "N3", "N4", "N5", "N6", "N7", "N8", "N9", "Control", "A1", 
"A10", "A11", "A12", "A13", "A14", "A15", "A16", "A17", "A18", 
"A19", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9", "R1", 
"R2", "R3", "R4", "R5", "R6"), class = "factor"), calCO2 = c(2237, 
2389.5, 2226.5, 2321, 2101.5, 1830.5, 2418, 2356.5, 435, 2345.5, 
2376, 2451, 2397, 2466, 2518.5, 2087, 2463, 2256.5, 2345.5, 3506, 
2950, 3386, 2511, 2385, 3441, 2473, 2357.5, 2052.5, 2318, 1893.5, 
2251), x = c(83.75, 167.5, 167.5, 167.5, 251.25, 167.5, 251.25, 
251.25, 0, 83.75, 251.25, 167.5, 251.25, 83.75, 83.75, 83.75, 
83.75, 251.25, 167.5, 335, 0, 0, 335, 167.5, 167.5, 167.5, 0, 
335, 335, 167.5, 167.5), y = c(213, 319.5, 319.5, 110, 319.5, 
213, 110, 110, 356, 213, 319.5, 110, 213, 110, 319.5, 319.5, 
110, 213, 213, 0, 0, 426, 426, 426, 0, 213, 213, 70, 213, 426, 
0), z = c(155, 50, 155, 155, 155, 226, 50, 155, 178, 50, 50, 
50, 50, 155, 50, 155, 50, 155, 50, 0, 0, 0, 0, 0, 0, 0, 130, 
50, 120, 130, 130), Type = c("Airnode", "Airnode", "Airnode", 
"Airnode", "Airnode", "Airnode", "Airnode", "Airnode", "Airnode", 
"Airnode", "Airnode", "Airnode", "Airnode", "Airnode", "Airnode", 
"Airnode", "Airnode", "Airnode", "Airnode", "Naveed", "Naveed", 
"Naveed", "Naveed", "Naveed", "Naveed", "Rotronic", "Rotronic", 
"Rotronic", "Rotronic", "Rotronic", "Rotronic")), .Names = c("Sensor", 
"calCO2", "x", "y", "z", "Type"), row.names = c(NA, -31L), class = "data.frame")

require(plotly)

plot_ly(data = subset(df,z==0), x=~x,y=~y, z=~calCO2, type = "contour") %>%
  layout(
    xaxis = list(range = c(340, 0), autorange = F, autorange="reversed"), 
    yaxis = list(range = c(0, 430)))

I'm trying to find something like this. Any help would be much appreciated.

Herrmann answered 16/2, 2019 at 19:29 Comment(4)

and rather than printing head(df) you should include the output of dput(df). This will make your data frame reproducible – Caliph 20/2, 2019 at 18:52

I understand @Caliph – Herrmann 20/2, 2019 at 19:2

I agree, I'm close to share with you the code, but I'd like to try out with you complete data. Could you please share it with us via cloud? – Pigsty 22/2, 2019 at 10:30

Hi @CésarArquero, thank you very much for this! The data set is df above, I have now used dput(df) to show it in it's full construction. – Herrmann 22/2, 2019 at 10:52

First of all you must consider that with +-30 points is not enough to get those clean separated layers that you can see in the example. Said that, lets get into work:

First you can oversee your data in order to guess how is going to be the shape of those layers. Here you can easily see that lower z values have higher CO2 values.

require(dplyr)
require(plotly)
require(akima)
require(plotly)
require(zoo)
require(raster)

plot_ly(df, x=~x,y=~y, z=~z, color =~calCO2)

An important thing is that you have to define the layers you are going to have. These layers must be made from interpolation of values all over a surface. So:

Define the data you are using for each layer.
Interpolate values for z and for calCO2. This is important because these are two different things. z interpolation will make the sape of the graphic and calCO2 will make the color (concentration or whatever). In your image from (https://plot.ly/r/3d-surface-plots/) color and z are representing the same while here, I guess that you want to represent the surface of z and colored it with the calCO2. Thats why you will need to interpolate values for both. Interpolation methods is a world, here I just did a simple interpolation and I've filled NA by mean values.

Here is the code:

## Define your layers in z range (by hand or use quantiles, percentiles, etc.)
df1 <- subset(df, z >= 0 & z <= 125) #layer between 0 and 150m
df2 <- subset(df, z > 125)           #layer between 150 and max

#interpolate values for each layer and for z and co2
z1 <- interp(df1$x, df1$y, df1$z, extrap = TRUE, duplicate = "mean") #interp z layer 1 with spline interp
ifelse(anyNA(z1$z) == TRUE, z1$z[is.na(z1$z)] <- mean(z1$z, na.rm = TRUE), NA) #fill na cells with mean value

z2 <- interp(df2$x, df2$y, df2$z, extrap = TRUE, duplicate = "mean") #interp z layer 2 with spline interp
ifelse(anyNA(z2$z) == TRUE, z2$z[is.na(z2$z)] <- mean(z2$z, na.rm = TRUE), NA) #fill na cells with mean value

c1 <- interp(df1$x, df1$y, df1$calCO2, extrap = F, linear = F, duplicate = "mean") #interp co2 layer 1 with spline interp
ifelse(anyNA(c1$z) == TRUE, c1$z[is.na(c1$z)] <- mean(c1$z, na.rm = TRUE), NA) #fill na cells with mean value

c2 <- interp(df2$x, df2$y, df2$calCO2, extrap = F, linear = F, duplicate = "mean") #interp co2 layer 2 with spline interp
ifelse(anyNA(c2$z) == TRUE, c2$z[is.na(c2$z)] <- mean(c2$z, na.rm = TRUE), NA) #fill na cells with mean value

#THE PLOT
p <- plot_ly(showscale = TRUE) %>%
    add_surface(x = z1$x, y = z1$y, z = z1$z, cmin = min(c1$z), cmax = max(c2$z), surfacecolor = c1$z) %>%
    add_surface(x = z2$x, y = z2$y, z = z2$z, cmin = min(c1$z), cmax = max(c2$z), surfacecolor = c2$z) %>%
    add_trace(data = df, x = ~x, y = ~y, z = ~z, mode = "markers", type = "scatter3d", 
              marker = list(size = 3.5, color = "red", symbol = 10))%>%
    layout(title="Stack Exchange Plot")
p

Dematerialize answered 22/2, 2019 at 17:14 Comment(2)

This is fine if bicubic spline is a reasonable interpolation/extrapolation method for the given case. You note in your answer that "Interpolation methods is a world, here I just did a simple interpolation". This point needs emphasizing, that whether one specific method is reasonable or meaningful depends completely on the nature of the underlying data. Which we cannot tell from OPs question. – Vandal 22/2, 2019 at 21:37

Thank you very much for your answer! I like this very much. The problem I'm seeing is that it does not extrapolate to the full extent of the area (0-335,0-426), unless a sensors is at the extreme points. Looking at the interp command with extrap=true, I thought this would take care of this. Is there a way around this? – Herrmann 27/2, 2019 at 14:13

As Cesar points out, you need to define the "layers" that you want to interpolate over in this 3d system.

Here, I present an approach assuming one layer (i.e. - I use all points along the z direction). Looking at a table of your values will help you to define where the breaks occur. You can re-use the code below for each "layer" you define.

> table(d$z)

  0  50 120 130 155 178 226 
  7  10   1   3   8   1   1

Since you're dealing with spatial data, let's use spatial objects in R to solve this problem.

First, I copy/pasted your data into a variable called d.

# make d into a SpatialPointsDataFrame object
library(sp)
coords <- d[, c("x", "y")]
s      <- SpatialPointsDataFrame(coords = coords, data = d)

# interpolate with a thin plate spline 
# (or another interpolation method: kriging, inverse distance weighting). 
library(raster)
library(fields)
tps <- Tps(coordinates(s), as.vector(d$calCO2))
p   <- raster(s)
p   <- interpolate(p, tps)

# plot raster, points, and contour lines
plot(p)
plot(s, add=T)
contour(p, add=T)

You can imagine splitting your data into layers based on the z value of the point, and re-running this code to generate an interpolation for each layer. Be sure to read up on various interpolation methods to determine which is best suited for your system. Once you have these layers, it's not much more work to port that data into ploty like shown above.

EDIT: taking base --> ggplot --> plotly is straightforward:

# ggplot
library(ggplot2)
p <- ggplot(as.data.frame(p, xy = TRUE), aes(x, y, fill = layer)) + 
  geom_tile() + 
  geom_contour(aes(z = layer), color = "white") + 
  scale_fill_viridis_c() + 
  theme_minimal()

Here's some instructions on adding contour labels.

Turn this into an interactive plotly object.

library(plotly)
ggplotly(p)

And the code in the first post takes you to 3d.

Enlarge answered 25/2, 2019 at 1:10 Comment(1)

Thank you very much for your help on this. I do very much like the simplicity of this. I note, that this treats all CO2 data being on the same plane (same z), but it has variation in CO2 with height. What do you think? – Herrmann 27/2, 2019 at 14:14

Recommended topics

Hot tags