How to remove the temporal component in an aggregation of a tsibble object?

I have been working with the tsibble package and I can't work out the proper way to remove the time component from an aggregation result. In the following dataset, I want the mean trips by Region and State. Is converting the tsibble to a tibble the proper way (it might be, I am just not sure), or is there some option I am missing to achieve the aggregation?

library(tsibble)
library(dplyr)

tourism %>% group_by(Region, State) %>% summarise(Mean_trips = mean(Trips))

# A tsibble: 6,080 x 4 [1Q]
# Key:       Region, State [76]
# Groups:    Region [76]
   Region   State           Quarter Mean_trips
   <chr>    <chr>             <qtr>      <dbl>
 1 Adelaide South Australia 1998 Q1       165.
 2 Adelaide South Australia 1998 Q2       112.
 3 Adelaide South Australia 1998 Q3       148.

## The result above is not what I want. This is what I want:

tourism %>% as_tibble %>% group_by(Region, State) %>% summarise(Mean_trips = mean(Trips))

# A tibble: 76 x 3
# Groups:   Region [76]
   Region                     State              Mean_trips
   <chr>                      <chr>                   <dbl>
 1 Adelaide                   South Australia        143.  
 2 Adelaide Hills             South Australia          7.18
Whittier answered 5/4, 2020 at 13:39 Comment(3)
I think as_tibble() is the correct way to do that, as the error message suggests when you run tourism %>% select(-Quarter). - Metatarsal
OK, if this is the way to go, please post it as an answer. Thank you! - Whittier
From the reference manual: "Column-wise verbs, including select(), transmute(), summarise(), mutate() & transmute(), keep the time context hanging around. That is, the index variable cannot be dropped for a tsibble. If any key variable is changed, it will validate whether it’s a tsibble internally. Use as_tibble() to leave off the time context." - Grinder

Calling select(-Quarter) on the tourism data gives an informative error message:

library(tsibble)
library(dplyr)

tourism %>% select(-Quarter)

Error: Column Quarter (index) can't be removed. Do you need as_tibble() to work with data frame?

Hence, as_tibble() is the correct way to drop the index: convert to a tibble first, then aggregate.

tourism %>% 
    as_tibble %>% 
    group_by(Region, State) %>% 
    summarise(Mean_trips = mean(Trips))

#   Region                     State              Mean_trips
#   <chr>                      <chr>                   <dbl>
# 1 Adelaide                   South Australia        143.  
# 2 Adelaide Hills             South Australia          7.18
# 3 Alice Springs              Northern Territory      14.2 
# 4 Australia's Coral Coast    Western Australia       47.4 
#...
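
As a minimal sketch of the same approach, the variant below also drops the leftover grouping and checks the class of the result. It assumes dplyr 1.0.0 or later for the `.groups` argument; `mean_trips` is just an illustrative name.

```r
library(tsibble)
library(dplyr)

# Convert to a tibble first so summarise() is free to collapse the Quarter index.
mean_trips <- tourism %>%
  as_tibble() %>%
  group_by(Region, State) %>%
  summarise(Mean_trips = mean(Trips), .groups = "drop")  # assumes dplyr >= 1.0.0

is_tsibble(mean_trips)  # check that the result is a plain tibble, not a tsibble
nrow(mean_trips)        # one row per Region/State pair
```

Note that the mean here is taken over every quarter and every other key value (such as Purpose) within each Region/State pair.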
Metatarsal answered 5/4, 2020 at 13:57 Comment(0)

For the sake of completeness, from the tsibble reference manual:

Column-wise verbs, including select(), transmute(), summarise(), mutate() & transmute(), keep the time context hanging around. That is, the index variable cannot be dropped for a tsibble. If any key variable is changed, it will validate whether it’s a tsibble internally. Use as_tibble() to leave off the time context.

The temporal component cannot be dropped and as_tibble() is the right choice to convert to a tibble.
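
The behaviour described in the manual can be inspected directly. The following is a rough sketch, not part of the manual; `by_quarter` and `flat` are illustrative names, while `index_var()`, `key_vars()` and `is_tsibble()` are tsibble helpers.

```r
library(tsibble)
library(dplyr)

index_var(tourism)  # name of the index column that cannot be dropped
key_vars(tourism)   # names of the key columns

# summarise() on a tsibble keeps the index, so there is one row per key and quarter.
by_quarter <- tourism %>%
  group_by(Region, State) %>%
  summarise(Mean_trips = mean(Trips))

is_tsibble(by_quarter)                        # still a tsibble
index_var(by_quarter) %in% names(by_quarter)  # the index column is still present

# as_tibble() leaves off the time context.
flat <- as_tibble(tourism)
is_tsibble(flat)
```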

Grinder answered 5/4, 2020 at 15:15 Comment(0)
