Select the last n columns of data frame in R
S

8

18

Is there a way to systematically select the last columns of a data frame? I would like to be able to move the last columns to be the first columns, but maintain the order of the columns when they are moved. I need a way to do this that does not list all the columns using subset(data, select = c(all the columns listed in the new order)) because I will be using many different data frames.

Here's an example where I would like to move the last 2 columns to the front of the data frame. It works, but it's ugly.

A = rep("A", 5)
B = rep("B", 5)
num1 = c(1:5)
num2 = c(36:40)

mydata2 = data.frame(num1, num2, A, B)

# Move A and B to the front of mydata2
mydata2_move = data.frame(A = mydata2$A, B = mydata2$B, mydata2[, 1:(ncol(mydata2) - 2)])

#  A B num1 num2
#1 A B    1   36
#2 A B    2   37
#3 A B    3   38
#4 A B    4   39
#5 A B    5   40

Changing the number of columns in the original data frame causes issues. This works (see below), but the naming gets thrown off. Why do these two examples behave differently? Is there a better way to do this, and to generalize it?

mydata1_move = data.frame(A = mydata1$A, B = mydata1$B, mydata1[, 1:(ncol(mydata1) - 2)])

#  A B mydata1...1..ncol.mydata1....2..
#1 A B                                1
#2 A B                                2
#3 A B                                3
#4 A B                                4
#5 A B                                5
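A likely explanation for the naming difference, sketched under the assumption that mydata1 holds three columns (num1, A, B). The question never shows how mydata1 was built, so that layout is a guess. With only one column left after removing the last two, the [ , ] subset drops to a plain vector, and data.frame() then builds a column name from the expression itself. Passing drop = FALSE keeps the result a data frame, so the original name survives.

```r
# Assumed layout: the question does not define mydata1.
A <- rep("A", 5)
B <- rep("B", 5)
num1 <- 1:5
mydata1 <- data.frame(num1, A, B)

# Selecting a single column with [ , ] returns a vector, not a data frame
one_col <- mydata1[, 1:(ncol(mydata1) - 2)]
is.data.frame(one_col)  # FALSE

# drop = FALSE keeps the data frame structure, and with it the column name
fixed <- data.frame(A = mydata1$A, B = mydata1$B,
                    mydata1[, 1:(ncol(mydata1) - 2), drop = FALSE])
names(fixed)  # "A" "B" "num1"
```

With mydata2 the same subset returns two columns, which stay a data frame, so the names carry over without any extra argument.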
Savoyard answered 19/1, 2015 at 2:29 Comment(0)
D
4

You could use something like this:

move_to_start <- function(x, to_move) {
  x[, c(to_move, setdiff(colnames(x), to_move))]
} 

move_to_start(mydata2, c('A', 'B'))

#   A B num1 num2
# 1 A B    1   36
# 2 A B    2   37
# 3 A B    3   38
# 4 A B    4   39
# 5 A B    5   40

Alternatively, if you want to move the last n columns to the start:

move_to_start <- function(x, n) {
  x[, c(tail(seq_len(ncol(x)), n), seq_len(ncol(x) - n))]
} 

move_to_start(mydata2, 2)

#   A B num1 num2
# 1 A B    1   36
# 2 A B    2   37
# 3 A B    3   38
# 4 A B    4   39
# 5 A B    5   40
Diatonic answered 19/1, 2015 at 2:38 Comment(0)
D
20

The problem described doesn't match the title, and the existing answers address moving columns rather than explaining how to select the last n columns.

If you wanted to just select the last column in a matrix/data frame without knowing the column name:

mydata2[,ncol(mydata2)]

and if you want last n columns, try

mydata2[, (ncol(mydata2) - n + 1):ncol(mydata2)]

A little cumbersome, but it works. You could write a wrapper function if you plan to use it regularly.
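Such a wrapper might look like the sketch below. The name last_n_cols is made up for illustration and is not part of base R or any package. It uses tail() on the column positions to avoid the index arithmetic, and drop = FALSE so that asking for a single column still returns a data frame.

```r
# Hypothetical helper: return the last n columns of a data frame
last_n_cols <- function(x, n) {
  stopifnot(n >= 0, n <= ncol(x))
  # tail() picks the last n column positions; drop = FALSE avoids
  # collapsing a one-column result to a vector
  x[, tail(seq_len(ncol(x)), n), drop = FALSE]
}

mydata2 <- data.frame(num1 = 1:5, num2 = 36:40,
                      A = rep("A", 5), B = rep("B", 5))

names(last_n_cols(mydata2, 2))  # "A" "B"
names(last_n_cols(mydata2, 1))  # "B"
```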

Darcidarcia answered 23/4, 2016 at 23:11 Comment(2)
I might be wrong, but I reckon it should be (ncol(mydata2)-n+1):ncol(mydata2)Philadelphia
As Charles Yan has pointed out, it should be: mydata2[, (ncol(mydata2)-n+1):ncol(mydata2)]. If you leave the range as ncol(mydata2):ncol(mydata2), you get the last column back as a vector. Each step back in the start index adds one more column from the end. To make n match the number of columns we want to select, we have to add 1, hence -n+1.Ming
W
3

You can do a similar thing using the SOfun package, available on GitHub.

library(SOfun)

foo <- moveMe(colnames(mydata2), "A, B before num1")

mydata2[, foo]

#  A B num1 num2
#1 A B    1   36
#2 A B    2   37
#3 A B    3   38
#4 A B    4   39
#5 A B    5   40

You can move column names like this example from R Help.

x <- names(mtcars)

x
#[1] "mpg"  "cyl"  "disp" "hp"   "drat" "wt"   "qsec" "vs"   "am"   "gear" "carb"

moveMe(x, "hp first; cyl after drat; vs, am, gear before mpg; wt last")
#[1] "hp"   "vs"   "am"   "gear" "mpg"  "disp" "drat" "cyl"  "qsec" "carb" "wt" 
Write answered 19/1, 2015 at 3:14 Comment(2)
I'd say this is the most flexible approach :-) (I mean, I'm not biased or anything....)Emerick
@AnandaMahto Well, you have seen many SO posts and realised that there is need for this kind of flexible operation. I like the name of the function. :) Thanks for editing, by the way.Write
K
3

You can do that using the offset argument of the last_col() function inside select().

Below is an example for the last two columns, followed by a more generic version.

library(dplyr)

mydata <- mydata %>% select(last_col(offset=c(0,1)), everything())

n <- 2
mydata <- mydata %>% select(last_col(offset=0:(n-1)), everything())
Kory answered 25/2, 2019 at 14:58 Comment(3)
That didn't work here. But the following did: n <- ncol(mydata) ## find how many cols are in mydata mydata2 <- select(mydata, last_col(offset=0:(n-1), everything()))Tierza
Both the answer here and Leonardo's suggestion work for me. But note that the columns are displayed in reverse order with the use of offset=Singleminded
What tidyselect package version did this work for? I tried the 0.2.1, which first introduced last_col() as well as the most recent 1.2.0. Neither version took an integer vector, only a single integer value for the offset argument.Pivot
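Since the comments disagree on whether last_col() accepts a vector offset, here is a sketch that passes only single-integer offsets and builds the range with the : operator instead. It assumes dplyr 1.0.0 or later is installed and uses the question's mydata2 as sample data. The last n columns keep their original order.

```r
library(dplyr)

mydata2 <- data.frame(num1 = 1:5, num2 = 36:40,
                      A = rep("A", 5), B = rep("B", 5))

n <- 2
# last_col(n - 1) is the first of the last n columns, last_col() is the final one
moved <- mydata2 %>%
  select(last_col(n - 1):last_col(), everything())

names(moved)  # "A" "B" "num1" "num2"
```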
L
2

Data frames are just lists, so you can rearrange them as you would any list:

newdata <- c(mydata[colNamesToStart],
             mydata[-which(names(mydata) %in% colNamesToStart)])
Languor answered 19/1, 2015 at 2:43 Comment(1)
nice, but that does not return a data.frame.Clayton
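To illustrate the point raised in the comment: c() on two data frames returns a plain list. The sketch below shows two ways to end up with a data frame, using made-up sample data and the answer's colNamesToStart variable. Either wrap the result in as.data.frame(), or reorder with a single list-style [ ] subset, which keeps the data.frame class.

```r
mydata <- data.frame(num1 = 1:5, num2 = 36:40,
                     A = rep("A", 5), B = rep("B", 5))
colNamesToStart <- c("A", "B")

# c() concatenates the columns but loses the data.frame class
as_list <- c(mydata[colNamesToStart],
             mydata[!names(mydata) %in% colNamesToStart])
class(as_list)  # "list"

# Option 1: convert the list back into a data frame
newdata1 <- as.data.frame(as_list)

# Option 2: reorder with one list-style subset, which stays a data frame
newdata2 <- mydata[c(colNamesToStart, setdiff(names(mydata), colNamesToStart))]

names(newdata2)  # "A" "B" "num1" "num2"
```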
T
2

I know this topic is a little dead, but wanted to chime in with a simple dplyr solution:

library(dplyr)

mydata <- mydata %>%
  select(A, B, everything())

If you want to avoid naming the last columns explicitly, use seq() within last_col(). Let n be the number of columns to move to the front:

mydata <- mydata %>%
  select(
    last_col(seq(n - 1, 0)),
    everything()
  )
Tineid answered 27/7, 2017 at 19:52 Comment(1)
Looks like last_col was introduced to the tidyselect package just 3 months after I initially answered. Updated above to use seq within last_col - which preserves original order of the last n columnsTineid
T
2

Another alternative with dplyr:

library(dplyr)

mydata2 <- select(mydata, 2:ncol(mydata), 1)
# select columns 2 through the last column and place them before column 1
Tierza answered 6/1, 2020 at 16:55 Comment(0)
P
0

relocate() was added with dplyr 1.0.0 to help with this. You can rename and use tidy-selection:

library(dplyr)

mydata2 |> 
  relocate(A:B, .before = 1)

To get the last N columns, use the offset argument of last_col(), which counts back from 0 (0 being the last column):

N <- 2
mydata2 |>
  relocate(last_col(N - 1):last_col(), .before = 1)
Pivot answered 6/5 at 19:24 Comment(0)
