I am working with GPS data and trying to figure out how to average the 11th-15th fixes for latitude and longitude. I have seen solutions in similar questions for how to average every n rows. The problem is that occasionally the satellites bomb out and the fixes stop at 13 or 14. So, in these cases, I only want to average 3 or 4 values instead of 5. So I am looking to average values for latitude and longitude starting from where the number in series is 11 until the number in series drops again (or as long as it is increasing? I need it to include the last set, which will not drop back down to a low number again). I have started by deleting all rows where the number in series is NOT in my desired 11-15 ranges. So,for an example dummy data set, this leaves me with:
Date Time Long Lat NoInSeries
12 17/11/2014 22:09:17 115.9508 -31.82850 11
13 17/11/2014 22:09:18 115.9508 -31.82846 12
14 17/11/2014 22:09:19 115.9513 -31.82864 13
15 17/11/2014 22:09:21 115.9511 -31.82863 14
26 18/11/2014 00:07:14 115.9509 -31.82829 11
27 18/11/2014 00:07:15 115.9509 -31.82829 12
28 18/11/2014 00:07:16 115.9509 -31.82830 13
29 18/11/2014 00:07:17 115.9509 -31.82830 14
30 18/11/2014 00:07:18 115.9509 -31.82831 15
56 18/11/2014 10:00:24 115.9513 -31.82670 11
57 18/11/2014 10:00:25 115.9514 -31.82670 12
58 18/11/2014 10:00:26 115.9514 -31.82669 13
59 18/11/2014 10:00:27 115.9514 -31.82668 14
60 18/11/2014 10:00:28 115.9514 -31.82668 15
My desired output would be something like this, with the first one averaging 4 (11-14) and the next two averaging 5 (11-15):
Date Time Long Lat NoInSeries AvgLong Avg Lat
12 17/11/2014 22:09:17 115.9508 -31.82850 11 115.9510 -31.82856
13 17/11/2014 22:09:18 115.9508 -31.82846 12 NA NA
14 17/11/2014 22:09:19 115.9513 -31.82864 13 NA NA
15 17/11/2014 22:09:21 115.9511 -31.82863 14 NA NA
26 18/11/2014 00:07:14 115.9509 -31.82829 11 115.9509 -31.82830
27 18/11/2014 00:07:15 115.9509 -31.82829 12 NA NA
28 18/11/2014 00:07:16 115.9509 -31.82830 13 NA NA
29 18/11/2014 00:07:17 115.9509 -31.82830 14 NA NA
30 18/11/2014 00:07:18 115.9509 -31.82831 15 NA NA
56 18/11/2014 10:00:24 115.9513 -31.82670 11 115.9514 -31.82669
57 18/11/2014 10:00:25 115.9514 -31.82670 12 NA NA
58 18/11/2014 10:00:26 115.9514 -31.82669 13 NA NA
59 18/11/2014 10:00:27 115.9514 -31.82668 14 NA NA
60 18/11/2014 10:00:28 115.9514 -31.82668 15 NA NA
I would then go through and delete all rows where AvgLong==NA, so my final output would just have all the rows where number in series=11 with the averages.
I really don't know where to start with the code for this...the examples I have found have all discussed averaging an exact number of rows, rather than a variable number.
For instance:
c( tapply( x, (row(x)-1)%/%5, mean ) )
Or:
idx <- ceiling(seq_len(nrow(dd)) / 5)
# do colMeans on all columns except last one.
res <- lapply(split(dd[-(ncol(dd))], idx), colMeans, na.rm = TRUE)
# assign first value of "datetime" in each 5-er group as names to list
names(res) <- dd$datetime[seq(1, nrow(df), by=5)]
# bind them to give a matrix
res <- do.call(rbind, res)
Also, the answers I have seen generally then output the averages as a new data frame... Ultimately, I also want to have this averaging under a condition: if schedule is 'Multifix', I want to average 11 to however high it goes up to 15, whereas if schedule is 'Continuous', I want to average from 181 up until however high each one goes...). Something like this:
if(import.list$Schedule=='Multifix'){
...code to average Long and Lat for Number in Series from 11 up to however high it goes (up to 15)...
} else {
...code to average Long and Lat for Number in Series from 241 up to however high it goes...
}
Or perhaps I have an if else statement to define a variable and then use that variable in the function to do the averaging?
...but I imagine this condition could complicate things if the output creates a new dataframe, which is why I was aiming for just adding values to new columns "AvgLong" and "AvgLat." Thanks for any help!!