How to access the last value in a vector?
L

12

364

Suppose I have a vector that is nested in a data frame one or two levels deep. Is there a quick and dirty way to access the last value without using the length() function? Something à la Perl's $# special variable?

So I would like something like:

dat$vec1$vec2[$#]

instead of:

dat$vec1$vec2[length(dat$vec1$vec2)]
Labiovelar answered 16/9, 2008 at 21:40 Comment(3)
I am by no means an R expert, but a quick google turned up this: <stat.ucl.ac.be/ISdidactique/Rhelp/library/pastecs/html/…> There appears to be a "last" function.Kharif
Related: https://mcmap.net/q/93648/-getting-the-last-n-elements-of-a-vector-is-there-a-better-way-than-using-the-length-function/946850Annadiana
MATLAB has the notation "myvariable(end-k)" where k is an integer less than the length of the vector that will return the (length(myvariable)-k)th element. That would be nice to have in R.Farver
J
484

I use the tail function:

tail(vector, n=1)

The nice thing with tail is that it works on dataframes too, unlike the x[length(x)] idiom.
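
A minimal sketch of that difference, using only base R and the built-in mtcars data set (the variable names are illustrative and not part of the original answer):

```r
# tail() on a plain vector returns the last element.
v <- c(10, 20, 30)
last_value <- tail(v, n = 1)

# On a data frame, tail() returns the last row (still a data frame).
last_row <- tail(mtcars, n = 1)

# By contrast, length() of a data frame is its number of columns,
# so x[length(x)] selects the last column rather than the last row.
last_col <- mtcars[length(mtcars)]

print(last_value)
print(rownames(last_row))
print(names(last_col))
```

For the last row without tail(), mtcars[nrow(mtcars), ] is the usual base R spelling.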

Jetta answered 17/9, 2008 at 13:32 Comment(5)
however x[length(x[,1]),] works on dataframes or x[dim(x)[1],]Radices
Note that for data frames, length(x) == ncol(x) so that's definitely wrong, and dim(x)[1] can more descriptively be written nrow(x).Fresh
@Fresh - kpierce8's suggestion of x[length(x[,1]),] is not wrong (note the comma in the x subset), but it's certainly awkward.Harmsworth
Please note that my benchmark below shows this to be slower than x[length(x)] by a factor of 30 on average for larger vectors!Redistrict
Doesn't work if you want to add stuff from vectors though tail(vector, n=1)-tail(vector, n=2)Roan
R
310

To answer this from a performance-oriented rather than an aesthetic point of view, I've put all of the above suggestions through a benchmark. To be precise, I've considered the suggestions

  • x[length(x)]
  • mylast(x), where mylast is a C++ function implemented through Rcpp,
  • tail(x, n=1)
  • dplyr::last(x)
  • x[end(x)[1]]
  • rev(x)[1]

and applied them to random vectors of various sizes (10^3, 10^4, 10^5, 10^6, and 10^7). Before we look at the numbers, I think it should be clear that anything that becomes noticeably slower with greater input size (i.e., anything that is not O(1)) is not an option. Here's the code that I used:

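# Compile a small C++ helper that returns the last element (requires Rcpp).
Rcpp::cppFunction('double mylast(NumericVector x) { int n = x.size(); return x[n-1]; }')
options(width = 100)
for (n in c(1e3, 1e4, 1e5, 1e6, 1e7)) {
  x <- runif(n)
  # Time each candidate on the same random vector (100 evaluations by default).
  print(microbenchmark::microbenchmark(x[length(x)],
                                       mylast(x),
                                       tail(x, n = 1),
                                       dplyr::last(x),
                                       x[end(x)[1]],
                                       rev(x)[1]))
}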

It gives me

Unit: nanoseconds
           expr   min      lq     mean  median      uq   max neval
   x[length(x)]   171   291.5   388.91   337.5   390.0  3233   100
      mylast(x)  1291  1832.0  2329.11  2063.0  2276.0 19053   100
 tail(x, n = 1)  7718  9589.5 11236.27 10683.0 12149.0 32711   100
 dplyr::last(x) 16341 19049.5 22080.23 21673.0 23485.5 70047   100
   x[end(x)[1]]  7688 10434.0 13288.05 11889.5 13166.5 78536   100
      rev(x)[1]  7829  8951.5 10995.59  9883.0 10890.0 45763   100
Unit: nanoseconds
           expr   min      lq     mean  median      uq    max neval
   x[length(x)]   204   323.0   475.76   386.5   459.5   6029   100
      mylast(x)  1469  2102.5  2708.50  2462.0  2995.0   9723   100
 tail(x, n = 1)  7671  9504.5 12470.82 10986.5 12748.0  62320   100
 dplyr::last(x) 15703 19933.5 26352.66 22469.5 25356.5 126314   100
   x[end(x)[1]] 13766 18800.5 27137.17 21677.5 26207.5  95982   100
      rev(x)[1] 52785 58624.0 78640.93 60213.0 72778.0 851113   100
Unit: nanoseconds
           expr     min        lq       mean    median        uq     max neval
   x[length(x)]     214     346.0     583.40     529.5     720.0    1512   100
      mylast(x)    1393    2126.0    4872.60    4905.5    7338.0    9806   100
 tail(x, n = 1)    8343   10384.0   19558.05   18121.0   25417.0   69608   100
 dplyr::last(x)   16065   22960.0   36671.13   37212.0   48071.5   75946   100
   x[end(x)[1]]  360176  404965.5  432528.84  424798.0  450996.0  710501   100
      rev(x)[1] 1060547 1140149.0 1189297.38 1180997.5 1225849.0 1383479   100
Unit: nanoseconds
           expr     min        lq        mean    median         uq      max neval
   x[length(x)]     327     584.0     1150.75     996.5     1652.5     3974   100
      mylast(x)    2060    3128.5     7541.51    8899.0     9958.0    16175   100
 tail(x, n = 1)   10484   16936.0    30250.11   34030.0    39355.0    52689   100
 dplyr::last(x)   19133   47444.5    55280.09   61205.5    66312.5   105851   100
   x[end(x)[1]] 1110956 2298408.0  3670360.45 2334753.0  4475915.0 19235341   100
      rev(x)[1] 6536063 7969103.0 11004418.46 9973664.5 12340089.5 28447454   100
Unit: nanoseconds
           expr      min         lq         mean      median          uq       max neval
   x[length(x)]      327      722.0      1644.16      1133.5      2055.5     13724   100
      mylast(x)     1962     3727.5      9578.21      9951.5     12887.5     41773   100
 tail(x, n = 1)     9829    21038.0     36623.67     43710.0     48883.0     66289   100
 dplyr::last(x)    21832    35269.0     60523.40     63726.0     75539.5    200064   100
   x[end(x)[1]] 21008128 23004594.5  37356132.43  30006737.0  47839917.0 105430564   100
      rev(x)[1] 74317382 92985054.0 108618154.55 102328667.5 112443834.0 187925942   100

This immediately rules out anything involving rev or end since they're clearly not O(1) (and the resulting expressions are evaluated in a non-lazy fashion). tail and dplyr::last are not far from being O(1) but they're also considerably slower than mylast(x) and x[length(x)]. Since mylast(x) is slower than x[length(x)] and provides no benefits (rather, it's custom and does not handle an empty vector gracefully), I think the answer is clear: Please use x[length(x)].
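
On the empty-vector point: in base R, x[length(x)] on a zero-length vector evaluates x[0] and returns a zero-length vector rather than raising an error. A small sketch of that behaviour follows; last_or_na is a hypothetical helper name, and returning NA for empty input is an assumption about what a caller might want, not something from the benchmark above.

```r
x <- c(3.5, 1.2, 9.9)
empty <- numeric(0)

x[length(x)]          # last element of a non-empty vector
empty[length(empty)]  # x[0]: a zero-length numeric vector, no error

# Hypothetical helper: make the empty case explicit by returning NA.
last_or_na <- function(x) {
  if (length(x) == 0L) return(NA)
  x[length(x)]
}

print(last_or_na(x))
print(last_or_na(empty))
```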

Redistrict answered 15/5, 2016 at 12:39 Comment(3)
^ O(1) solutions should be the only acceptable answer in this question.Catlett
I tried mylastR=function(x) {x[length(x)]} It's faster than mylast in Rcpp, but one time slower than writing x[length(x)] directlyPotful
Even with big vectors there is no meaningful difference. Transforming to seconds shows that for the longest vector the fastest method takes 0.000001133 seconds and the slowest method takes 0.102328667 seconds (both median). Well, nobody will notice that in real life. I would choose readability over benchmarks here.Supraorbital
B
141

If you're looking for something as nice as Python's x[-1] notation, I think you're out of luck. The standard idiom is

x[length(x)]  

but it's easy enough to write a function to do this:

last <- function(x) { return( x[length(x)] ) }

This missing feature in R annoys me too!
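
If more than one trailing element is needed, the same idiom extends naturally. The following is a sketch only: last_n is a hypothetical name, and capping n at the vector length is a design assumption (base R's tail(x, n) behaves similarly for vectors).

```r
# Hypothetical helper: return the last n elements in their original order.
last_n <- function(x, n = 1L) {
  len <- length(x)
  k <- min(n, len)                  # never ask for more than exists
  x[len - rev(seq_len(k)) + 1L]     # indices len-k+1, ..., len
}

x <- c(5, 6, 7, 8, 9)
last_n(x)       # 9
last_n(x, 3)    # 7 8 9
last_n(x, 10)   # the whole vector
```

Building the index with seq_len() avoids mixing negative and positive subscripts when n exceeds the length of x.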

Bootleg answered 17/9, 2008 at 13:27 Comment(1)
Do note that if you want the last few elements of a vector rather than just the last element, there's no need to do anything complex when adapting this solution. R's vectorization allows you to do neat things like get the last four elements of x (last one first) with x[length(x)-0:3].Auster
H
55

Combining lindelof's and Gregg Lind's ideas:

last <- function(x) { tail(x, n = 1) }

Working at the prompt, I usually omit the n=, i.e. tail(x, 1).

Unlike last from the pastecs package, head and tail (from utils) work not only on vectors but also on data frames etc., and also can return data "without first/last n elements", e.g.

but.last <- function(x) { head(x, n = -1) }

(Note that you have to use head for this, instead of tail.)
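
A short sketch of how positive and negative n behave, using made-up data and base R only:

```r
x <- c("a", "b", "c", "d")

tail(x, n = 1)    # last element: "d"
head(x, n = -1)   # everything except the last element: "a" "b" "c"
tail(x, n = -1)   # everything except the first element: "b" "c" "d"

# The same calls apply row-wise to a data frame.
df <- data.frame(id = 1:3, value = c(2.5, 3.5, 4.5))
all_but_last_row <- head(df, n = -1)
print(all_but_last_row)
```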

Hurt answered 30/9, 2008 at 16:28 Comment(1)
Please note that my benchmark below shows this to be slower than x[length(x)] by a factor of 30 on average for larger vectors!Redistrict
S
24

The dplyr package includes a function last():

last(mtcars$mpg)
# [1] 21.4
Sprain answered 7/6, 2016 at 18:51 Comment(4)
This basically boils down to x[[length(x)]] again.Searle
Similar under the hood, but with this answer you don't have to write your own function last() and store that function somewhere, like several people have done above. You get the improved readability of a function, with the portability of it coming from CRAN so that someone else can run the code.Sprain
Can also write as mtcars$mpg %>% last, depending on your preference.Photopia
@RichScriven Unfortunately, it's considerably slower than x[[length(x)]], though!Redistrict
L
20

I just benchmarked these two approaches on a data frame with 663,552 rows using the following code:

system.time(
  resultsByLevel$subject <- sapply(resultsByLevel$variable, function(x) {
    s <- strsplit(x, ".", fixed=TRUE)[[1]]
    s[length(s)]
  })
  )

 user  system elapsed 
  3.722   0.000   3.594 

and

system.time(
  resultsByLevel$subject <- sapply(resultsByLevel$variable, function(x) {
    s <- strsplit(x, ".", fixed=TRUE)[[1]]
    tail(s, n=1)
  })
  )

   user  system elapsed 
 28.174   0.000  27.662 

So, assuming you're working with vectors, accessing the length position is significantly faster.
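
For reference, a self-contained sketch of the faster variant on toy data. The input vector and the name last_piece are made up for illustration; the original resultsByLevel data frame is not reproduced here. The sketch relies on strsplit() being vectorized over its input and uses vapply() so the result type is checked.

```r
# Toy stand-in for resultsByLevel$variable (illustrative values only).
# Assumes no element is an empty string, since that would split to character(0).
variable <- c("exam.math.alice", "exam.bob", "carol")

# Split every string once, then take the last piece of each split.
parts <- strsplit(variable, ".", fixed = TRUE)
last_piece <- vapply(parts, function(s) s[length(s)], character(1))

print(last_piece)
```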

Lymphosarcoma answered 13/5, 2014 at 18:20 Comment(1)
Why not test tail(strsplit(x,".",fixed=T)[[1]],1) for the 2nd case? To me the main advantage of tail is that you can write it in one line. ;)Alleged
G
13

Another way is to take the first element of the reversed vector:

rev(dat$vec1$vec2)[1]
Gastronome answered 11/2, 2014 at 15:36 Comment(4)
This will be expensive though!Centiliter
Please note that this is an operation whose computational cost is linear in the length of the input; in other words, while O(n), it is not O(1). See also my benchmark below for actual numbers.Redistrict
@Redistrict Unless you use an iteratorGastronome
@Gastronome Right. But in that case, your code also wouldn't work, would it? If by iterator you mean what's provided by the iterators package, then (1) you cannot use [1] to access the first element and (2) while you can apply rev to an iterator, it does not behave as expected: it just treats the iterator object as a list of its members and reverses that.Redistrict
T
12

I have another method for finding the last element in a vector. Say the vector is a.

> a<-c(1:100,555)
> end(a)      #Time of the last observation and its cycle position; for a plain vector the first value equals the last index
[1] 101   1
> a[end(a)[1]]   #Gives last element in a vector
[1] 555

There you go!
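
One caveat worth illustrating: end() comes from the stats package and reports the time of the last observation, which coincides with the last index only for a plain vector (implicit start 1, frequency 1). A sketch with a made-up monthly series:

```r
a <- c(1:100, 555)
end(a)          # 101 1: time of last observation and its position in the cycle
a[end(a)[1]]    # 555, because time 101 happens to equal the last index

# For a real time series the "time" is not an index.
monthly <- ts(1:24, start = c(2020, 1), frequency = 12)
end(monthly)              # 2021 12
monthly[end(monthly)[1]]  # NA: there is no element number 2021
monthly[length(monthly)]  # 24
```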

Tila answered 16/1, 2015 at 20:35 Comment(0)
P
11

The data.table package includes a last function:

library(data.table)
last(c(1:10))
# [1] 10
Pyrimidine answered 7/6, 2016 at 18:42 Comment(1)
This basically boils down to x[[length(x)]] again.Searle
H
8

What about

> a <- c(1:100,555)
> a[NROW(a)]
[1] 555
Harr answered 10/9, 2015 at 19:42 Comment(3)
I appreciate that NROW does what you would expect on a lot of different data types, but it's essentially the same as a[length(a)] that OP is hoping to avoid. Using OP's example of a nested vector, dat$vec1$vec2[NROW(dat$vec1$vec2)] is still pretty messy.Rosewood
may be written as nrowChasitychasm
Note: Unlike nrow, NROW treats a vector as 1-column matrix.Dress
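
A quick sketch of the nrow versus NROW distinction on a plain vector, in base R:

```r
a <- c(1:100, 555)

nrow(a)      # NULL: a plain vector has no dim attribute
NROW(a)      # 101: the vector is treated as a one-column matrix
a[NROW(a)]   # 555

m <- matrix(1:6, nrow = 3)
NROW(m)      # 3, same as nrow(m)
```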
D
3

The xts package provides a last function:

library(xts)
a <- 1:100
last(a)
[1] 100
Dawnedawson answered 3/5, 2017 at 12:51 Comment(0)
I
0

As of purrr 1.0.0, pluck now accepts negative integers to index from the right:

library(purrr)

pluck(LETTERS, -1)
# [1] "Z"
Investment answered 14/2, 2023 at 21:45 Comment(0)
