Overriding "Variables not shown" in dplyr, to display all columns from df
5

39

When I print a local data frame with many columns, I sometimes get the message "Variables not shown", as in this (ridiculous) example, which just needed enough columns.

library(dplyr)
library(ggplot2) # for movies

movies %.% 
 group_by(year) %.% 
 summarise(Length = mean(length), Title = max(title), 
  Dramaz = sum(Drama), Actionz = sum(Action), 
  Action = sum(Action), Comedyz = sum(Comedy)) %.% 
 mutate(Year1 = year + 1)

   year    Length                       Title Dramaz Actionz Action Comedyz
1  1898  1.000000 Pack Train at Chilkoot Pass      1       0      0       2
2  1894  1.000000           Sioux Ghost Dance      0       0      0       0
3  1902  3.555556     Voyage dans la lune, Le      1       0      0       2
4  1893  1.000000            Blacksmith Scene      0       0      0       0
5  1912 24.382353            Unseen Enemy, An     22       0      0       4
6  1922 74.192308      Trapped by the Mormons     20       0      0      16
7  1895  1.000000                 Photographe      0       0      0       0
8  1909  9.266667              What Drink Did     14       0      0       7
9  1900  1.437500      Uncle Josh's Nightmare      2       0      0       5
10 1919 53.461538     When the Clouds Roll by     17       2      2      29
..  ...       ...                         ...    ...     ...    ...     ...
Variables not shown: Year1 (dbl)

I want to see Year1! How do I see all the columns, preferably by default?

Jessejessee answered 18/3, 2014 at 5:41 Comment(0)
57

There's (now) a way of overriding the width of the output that gets printed. If you run this command, all columns will be shown:

options(dplyr.width = Inf)

I wrote it up here.
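
As a hedged sketch of the same idea for a single call: in newer releases, printing is handled by the tibble package, where the print method accepts a width argument and the session-wide option is named tibble.width. The table `wide` below is a made-up stand-in for the summarised movies data, not something from the question.

```r
library(dplyr)

# Hypothetical 30-column, 1-row table standing in for the real result
wide <- as_tibble(setNames(as.list(1:30), paste0("column_number_", 1:30)))

# One-off: print every column of this one object
print(wide, width = Inf)

# Session-wide: set the option, print, then restore the previous value
old <- options(tibble.width = Inf)
wide
options(old)
```

Passing width to print() leaves the global setting alone, which is handy when only one result needs a full look.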

Frodin answered 23/11, 2014 at 9:27 Comment(2)
I think that should be options with an "s". I can't edit since edits must be at least 10 characters. - Testy
This is a nice option, but it is not so useful when you have too many columns. In a df with some 200 columns, everything was displayed but the alignment between rows and columns was lost. Also, most of the rows were truncated at some point because of too many characters. The command to bring back the default behaviour is options(dplyr.width = NULL). - Dome
31

You might like glimpse:

> movies %>%
+  group_by(year) %>%
+  summarise(Length = mean(length), Title = max(title),
+   Dramaz = sum(Drama), Actionz = sum(Action),
+   Action = sum(Action), Comedyz = sum(Comedy)) %>%
+  mutate(Year1 = year + 1) %>% glimpse()
Variables:
$ year    (int) 1893, 1894, 1895, 1896, 1897, 1898, 1899, 1900, 1901, 1902,...
$ Length  (dbl) 1.000000, 1.000000, 1.000000, 1.307692, 1.000000, 1.000000,...
$ Title   (chr) "Blacksmith Scene", "Sioux Ghost Dance", "Photographe", "Ve...
$ Dramaz  (int) 0, 0, 0, 1, 0, 1, 2, 2, 5, 1, 2, 3, 4, 5, 1, 8, 14, 14, 14,...
$ Actionz (int) 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 3, 0, 0, 0, 0, 3, 0, 0, 1, 0,...
$ Action  (int) 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 3, 0, 0, 0, 0, 3, 0, 0, 1, 0,...
$ Comedyz (int) 0, 0, 0, 1, 2, 2, 1, 5, 8, 2, 8, 10, 6, 2, 6, 8, 7, 2, 2, 4...
$ Year1   (dbl) 1894, 1895, 1896, 1897, 1898, 1899, 1900, 1901, 1902, 1903,...NULL
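
For readers without the movies data, here is a minimal self-contained variant of the same pipeline. It assumes the built-in mtcars data set and made-up column names (MeanMpg, MaxHp, Cyl1), none of which appear in the original answer.

```r
library(dplyr)

# Same shape as the pipeline above: group, summarise, then add a column
res <- mtcars %>%
  group_by(cyl) %>%
  summarise(MeanMpg = mean(mpg), MaxHp = max(hp)) %>%
  mutate(Cyl1 = cyl + 1)

# One line per column, so a newly added column is never hidden
glimpse(res)
```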
Mercurialize answered 18/3, 2014 at 9:19 Comment(2)
+1 for discovering glimpse. Personally, the principal reason I like to see all columns is as a convenient way to check whether the column I've added (through summarise or mutate) has actually done what I intended. So glimpse isn't quite right for this. - Jessejessee
For the latest dplyr version, use %>% instead of %.% - Harrison
8

dplyr has its own printing functions for dplyr objects. In this case, the result of your operation is a tbl_df, so the matching print function is dplyr:::print.tbl_df. Reading its source reveals that trunc_mat is the function that decides what is and is not printed, including which variables.

Sadly, dplyr:::print.tbl_df does not pass any parameters on to trunc_mat, and trunc_mat does not support choosing which variables are shown (only how many rows). A workaround is to convert the dplyr result to a data.frame and use head:

res = movies %.% 
 group_by(year) %.% 
 summarise(Length = mean(length), Title = max(title), 
  Dramaz = sum(Drama), Actionz = sum(Action), 
  Action = sum(Action), Comedyz = sum(Comedy)) %.% 
 mutate(Year1 = year + 1)

head(data.frame(res))
  year    Length                       Title Dramaz Actionz Action Comedyz
1 1898  1.000000 Pack Train at Chilkoot Pass      1       0      0       2
2 1894  1.000000           Sioux Ghost Dance      0       0      0       0
3 1902  3.555556     Voyage dans la lune, Le      1       0      0       2
4 1893  1.000000            Blacksmith Scene      0       0      0       0
5 1912 24.382353            Unseen Enemy, An     22       0      0       4
6 1922 74.192308      Trapped by the Mormons     20       0      0      16
  Year1
1  1899
2  1895
3  1903
4  1894
5  1913
6  1923
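
The dispatch described above can be checked directly. This is a small sketch assuming a current dplyr release and the built-in mtcars data as a stand-in for the summarised result.

```r
library(dplyr)

# Stand-in for the dplyr result
tbl <- as_tibble(mtcars)
class(tbl)   # includes "tbl_df", so the truncating print method is chosen

# Converting drops the tbl classes, so print.data.frame is used instead
df <- as.data.frame(tbl)
class(df)
head(df)     # shows every column, wrapping if the console is too narrow
```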
Deepen answered 18/3, 2014 at 5:50 Comment(1)
Pull requests are always welcomed :) But print.tbl_df probably does need an all_columns argument. - Roderickroderigo
2

This is a bit old, but I found it while looking for answers to the same problem. I came up with a solution that keeps to the spirit of piping but is functionally identical to the accepted answer (note that the pipe operator %.% is deprecated in favor of %>%):

movies %>% 
    group_by(year) %>% 
    summarise(Length = mean(length), Title = max(title), 
              Dramaz = sum(Drama), Actionz = sum(Action), 
              Action = sum(Action), Comedyz = sum(Comedy)) %>% 
    mutate(Year1 = year + 1) %>%
    as.data.frame %>%
    head
Watterson answered 13/8, 2014 at 4:54 Comment(1)
Also, the dataset moved from ggplot2 to ggplot2movies, so now we should use library(ggplot2movies). - Chaldean
1

movies %.% group_by(year) %.% ....... %.% print.default

Instead of the default print method, dplyr uses dplyr:::print.tbl_df so that huge data sets don't flood your screen. Once you've whittled your data down to what you want and no longer need saving from your own mistakes, just stick print.default on the end to spit out everything.

BTW, methods(print) shows how many packages need to write their own print functions (think of, e.g., igraph or xts: these are new data types, so R has to be told how to display them on screen).
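
To illustrate how a package-specific print method takes over and how it can be bypassed, here is a base R sketch. The class narrow_df and its method are hypothetical, invented for this example, and the sketch bypasses the method by calling print.data.frame directly rather than print.default.

```r
# Hypothetical S3 print method that deliberately shows only two columns
print.narrow_df <- function(x, ...) {
  cat("narrow_df with", ncol(x), "columns; showing the first 2\n")
  print.data.frame(x[, 1:2, drop = FALSE], ...)
  invisible(x)
}

# Tag a small data frame with the hypothetical class
x <- structure(head(mtcars), class = c("narrow_df", "data.frame"))

print(x)             # dispatches to print.narrow_df
print.data.frame(x)  # calls the data.frame method directly: all columns
```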

HTH the next googler.

Scythia answered 1/5, 2015 at 4:31 Comment(0)