I have daily rainfall from 61 gauging stations for 12 years in a catchment(8000 Km2).
The goal is create 5Km and 25 Km resolution gridded daily rainfall product. As the no of stations are small and not all stations have rain even in rainy season, i choose to go with climatological variogram.
The rain gauge measurement for a typical day(irain) are as below with few missing values denoted by NA.
7.8 4.4 15.4 19.1 5.8 0 42 6.4 21 21 0 0 0 15.6 0 0 10 5 1.2 0 14.4 NA 25 13.2 0 9.2 2 6.6 7.8 13.2 15.4 NA 9 0 15.5 0 18.6 6 0 4.8 10.6 0 9 0 12.4 NA 12 0 3 14 10 10 0 68 21.8 18 14.8 5.4 7 0 NA
As the daily rainfall are skewed, to transform i tested with cube root transform and log transform (log1p) for each day individually. However for all days both the transforms does'nt fit as i tested with shapiro wilk test. So,i choose normal score score transform(NCR) as suggested by Grimes & Pardo(2009) Geostatistical analysis of rainfall. and used Prof. Ashton Shortridge code
The below code is used for generating a climaogical variogram for the monsoon season. Note i had used days where more than 30% of the stations reported rain. Did not find any references for this. Chose 30% as i get about 65% days to pass the threshold.
lag = 3
bins.vario = seq(0, 75, lag)
nb.bins = length(bins.vario) - 1
nb.classes = numeric(nb.bins)
vario.emp = array(0,c(nb.bins,6))
variance.emp = array(NA,c(10000,nb.bins))
vario.emp = as.data.frame(vario.emp)
class(vario.emp) = c("gstatVariogram","data.frame")
names(vario.emp) = c("np","dist","gamma","dir.hor","dir.ver","id")
nRows = nrow(stn.rain.subset)
for (i in 1:nRows)
{
irain = stn.rain.subset[i,]
isMissing = is.na(irain)
isZero = (irain == 0)
irain = irain[!isMissing & !isZero]
irain = as.numeric(irain)
rain.mean[[i]] = mean(irain); rain.var[[i]] = var(irain);
# Testing with log-transformation
# irain.logtt = log1p(irain)
# # qqPlot(irain.logtt,distribution = "norm")
# res = shapiro.test(irain.logtt)
# pval[[i]] = res$p.value
if (var(irain)>0)
{
# Scaling of the rain on each day. rain in mm. After scaling this is unitless
irain.scaled = irain/sqrt(var(irain))
irain.nsc = nscore(irain)
score = irain.nsc$nscore
easting = lon.UTM[!isMissing & !isZero] # Removing the stn.locs with NA values
northing = lat.UTM[!isMissing & !isZero]
rain = data.frame(easting,northing,score)
names(rain) = c("easting","northing","rain")
coordinates(rain) = c("easting","northing")
proj4string(rain) = UTM
v = variogram(rain~1, data = rain,boundaries = bins.vario)
bin.nb = ceiling(v$dist/lag)
nb.classes[bin.nb] = nb.classes[bin.nb]+1
vario.emp[bin.nb,] = vario.emp[bin.nb,]+v
}
The climatological variogram for non-zero rain:
Similarly, i had generated the indicator variogram.
The problem now is how to do kriging with the climatogical variogram.
"model" "psill" "range" "kappa" "ang1" "ang2" "ang3" "anis1" "anis2"
"Nug" 0.446609415762287 0 0 0 0 0 1 1
"Sph" 0.533499909345213 51.7206027419321 0.5 0 0 0 1 1
Similary for kriging read each day non-zero, i had scaleed daily rain and transform it. Not sure if the below approach is correct or not?
rain = data.frame(easting,northing,score)
# Multiplying the nugget and sill with var(rain) for each day.
clim.vrmod$psill = clim.vrmod$psill * var(irain)
krige.ok = krige(rain[,3]~1,locations =~easting+northing,data = rain,newdata=output.grd,model = clim.vrmod)
krige.ok$var1.pred.bt = (backtr(krige.ok$var1.pred,irain.nsc, tails='separate'))*sqrt(var(irain))
krige.ok$var1.se = (krige.ok$var1.var)
The confusions i have are as below:
Is the var1.var the standard-error(mm) or the variance(mm2)?
Should the climatogical variogram (nugget and sill) be scaled with the variance?
Back-transformation should be done with seperate or none option?
Thanks for the help in advance.