Extracting the last n characters from a string in R

16

376

How can I get the last n characters from a string in R? Is there a function like SQL's RIGHT?

Counterespionage answered 1/11, 2011 at 8:11 Comment(0)
N
375

I'm not aware of anything in base R, but it's straightforward to write a function to do this using substr and nchar:

x <- "some text in a string"

substrRight <- function(x, n){
  substr(x, nchar(x)-n+1, nchar(x))
}

substrRight(x, 6)
[1] "string"

substrRight(x, 8)
[1] "a string"

This is vectorised, as @mdsumner points out. Consider:

x <- c("some text in a string", "I really need to learn how to count")
substrRight(x, 6)
[1] "string" " count"
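
As a follow-up to the comments about NAs and calling nchar(x) twice, here is a minimal sketch of a variant that stores the length in a local variable. The name substr_right and the sample vector are illustrative, not part of the original answer. It relies on substr() treating a start position below 1 as 1 and propagating NA.

```r
# Hypothetical variant of substrRight(): nchar(x) is computed once and reused.
substr_right <- function(x, n) {
  len <- nchar(x)               # one call, used for both start and stop
  substr(x, len - n + 1, len)   # a start below 1 is treated as 1 by substr()
}

x <- c("some text in a string", "abc", NA)
res <- substr_right(x, 6)
print(res)
```

A string shorter than n comes back whole, and a missing value stays missing, so no special-casing should be needed for either.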
Northumbria answered 1/11, 2011 at 8:19 Comment(5)
Use the stringi package. It works fine with NAs and all encodings :)Motorize
Would it be more efficient to avoid calling nchar(x) twice by assigning it to a local variable?Packthread
have been looking for this for a while!Photographer
No mention of which package substrRight is from?Slave
substrRight is a user defined function from above.Sedition
D
310

If you don't mind using the stringr package, str_sub is handy because you can use negatives to count backward:

library(stringr)
x <- "some text in a string"
str_sub(x,-6,-1)
[1] "string"

Or, as Max points out in a comment to this answer,

str_sub(x, start= -6)
[1] "string"
Digress answered 1/11, 2011 at 8:27 Comment(3)
also, str_sub(x,start=-n) gets n last characters.Bootstrap
stringr doesn't work well with NA values and all encodings. I strongly recommend the stringi package :)Motorize
I believe stringr had been remade using stringi as a backend, so should work with NAs etc. now.Danyluk
M
60

Use the stri_sub function from the stringi package. To get a substring from the end, use negative numbers. See the examples below:

stri_sub("abcde",1,3)
[1] "abc"
stri_sub("abcde",1,1)
[1] "a"
stri_sub("abcde",-3,-1)
[1] "cde"

The source is on GitHub: https://github.com/Rexamine/stringi

The package is now available on CRAN; simply type

install.packages("stringi")

to install this package.

Motorize answered 16/7, 2013 at 11:35 Comment(0)
A
22
str = 'This is an example'
n = 7
result = substr(str,(nchar(str)+1)-n,nchar(str))
print(result)

[1] "example"
Archiplasm answered 1/11, 2011 at 8:36 Comment(0)
O
19

Another reasonably straightforward way is to use regular expressions and sub:

sub('.*(?=.$)', '', string, perl=TRUE)

In words: "remove everything that is followed by exactly one final character". To grab more characters off the end, put a repeat count on the dot in the lookahead assertion:

sub('.*(?=.{2}$)', '', string, perl=TRUE)

Here .{2} is equivalent to .., or "any two characters", so the call removes everything that is followed by the last two characters. Likewise,

sub('.*(?=.{3}$)', '', string, perl=TRUE)

keeps the last three characters, and so on. You can set the number of characters to grab with a variable, but you'll have to paste the variable's value into the regular expression string:

n = 3
sub(paste0('.*(?=.{', n, '}$)'), '', string, perl=TRUE)
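
The pattern-building step can be wrapped in a small helper. This is only a sketch: the name last_n_regex is made up for illustration, and it uses sprintf() instead of paste() to assemble the same lookahead pattern.

```r
# Hypothetical helper: keep the last n characters using a lookahead regex.
last_n_regex <- function(x, n) {
  # Builds e.g. ".*(?=.{6}$)": everything that is followed by the final n characters
  pattern <- sprintf(".*(?=.{%d}$)", as.integer(n))
  sub(pattern, "", x, perl = TRUE)
}

one  <- last_n_regex("some text in a string", 6)
many <- last_n_regex(c("12345", "ABCDE"), 2)
print(one)
print(many)
```

If a string has fewer than n characters the pattern does not match, so sub() returns that string unchanged.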
Ollieollis answered 11/9, 2013 at 4:45 Comment(1)
To avoid all the look-aheads etc, you could just do regmatches(x, regexpr(".{6}$", x))Unsightly
C
13

A simple base R solution using the substring() function (who knew this function even existed?):

RIGHT = function(x,n){
  substring(x,nchar(x)-n+1)
}

This works because substring() is essentially substr() underneath, but its last argument defaults to 1,000,000, so the end position can be omitted.

Examples:

> RIGHT('Hello World!',2)
[1] "d!"
> RIGHT('Hello World!',8)
[1] "o World!"
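
One further difference worth sketching: substring() recycles its first argument, so several suffix lengths can be requested from one string in a single call. The vector n below is an illustrative assumption, reusing the two lengths from the examples above.

```r
x <- "Hello World!"
n <- c(2, 8)                             # suffix lengths to extract in one call
suffixes <- substring(x, nchar(x) - n + 1)
print(suffixes)
```

By contrast, substr() returns a result of the same length as x, so it would give only one suffix here.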
Chiasmus answered 4/1, 2018 at 11:24 Comment(0)
B
12

UPDATE: as noted by mdsumner, the original code is already vectorised because substr is. Should have been more careful.

And if you want a vectorised version (based on Andrie's code)

substrRight <- function(x, n){
  sapply(x, function(xx)
         substr(xx, (nchar(xx)-n+1), nchar(xx))
         )
}

> substrRight(c("12345","ABCDE"),2)
12345 ABCDE
 "45"  "DE"

Note that I have changed (nchar(x)-n) to (nchar(x)-n+1) to get n characters.

Borisborja answered 1/11, 2011 at 8:25 Comment(1)
I think you mean "(nchar(x)-n) to (nchar(x)-n+1)"Digress
H
12

Try this:

x <- "some text in a string"
n <- 6
substr(x, nchar(x)-n+1, nchar(x))

It should give:

[1] "string"
Hildegardehildesheim answered 10/8, 2018 at 19:5 Comment(2)
But this returns the last 6 characters not 5Deuterogamy
So perhaps ... substr(x, nchar(x)-(n-1), nchar(x))Klimesh
D
6

An alternative to substr is to split the string into a list of single characters and process that:

N <- 2
sapply(strsplit(x, ""), function(x, n) paste(tail(x, n), collapse = ""), N)
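
Before timing anything, it may be worth confirming that the split-based approach and the substr approach agree. A minimal check, assuming the two-element example vector used earlier in this thread:

```r
x <- c("some text in a string", "I really need to learn how to count")
N <- 6

# Split each string into single characters, keep the last N, and glue them back.
via_split <- sapply(strsplit(x, ""),
                    function(chars, n) paste(tail(chars, n), collapse = ""),
                    N)

# The substr/nchar approach, for comparison.
via_substr <- substr(x, nchar(x) - N + 1, nchar(x))

same <- identical(via_split, via_substr)
print(same)
```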
Dromedary answered 1/11, 2011 at 8:30 Comment(1)
I sense a system.time() battle brewing :-)Hitandmiss
C
4

I use strsplit and tail rather than substr. I want to extract the last 6 characters of "Give me your food." Here are the steps:

(1) Split the characters

splits <- strsplit("Give me your food.", split = "")

(2) Extract the last 6 characters

tail(splits[[1]], n=6)

Output:

[1] " " "f" "o" "o" "d" "."

Each character can be accessed by indexing this result, e.g. tail(splits[[1]], n=6)[i], where i is 1 to 6.

Crumble answered 25/6, 2015 at 18:24 Comment(0)
M
4

Someone above uses a similar solution, but I find it easier to think of it as below:

> text<-"some text in a string" # we want only the last word, "string", which has 6 letters
> n<-5 # the character at nchar(text)-n is itself included, so use one less than the 6 wanted
> substr(x=text,start=nchar(text)-n,stop=nchar(text))

This returns the last 6 characters, "string", as desired.
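
The "discount 1" adjustment comes from substr() treating both start and stop as inclusive positions. A small sketch of the two conventions, where the variable names are illustrative only:

```r
text <- "some text in a string"
n    <- 6
len  <- nchar(text)

with_plus_one    <- substr(text, len - n + 1, len)  # exactly n characters
without_plus_one <- substr(text, len - n, len)      # n + 1 characters

print(nchar(with_plus_one))
print(nchar(without_plus_one))
```

Counting the characters of the result with nchar() is a quick way to catch this kind of off-by-one error.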

Mechelle answered 5/3, 2017 at 18:22 Comment(0)
S
4

For those coming from Microsoft Excel or Google Sheets, you will have seen functions like LEFT(), RIGHT(), and MID(). I have created a package called forstringr; its development version is currently on GitHub.

if(!require("devtools")){
 install.packages("devtools")
}

devtools::install_github("gbganalyst/forstringr")

library(forstringr)
  • str_left(): counts from the left and extracts n characters

  • str_right(): counts from the right and extracts n characters

  • str_mid(): extracts characters from the middle

Examples:


x <- "some text in a string"

str_left(x, 4)

[1] "some"

str_right(x, 6)

[1] "string"

str_mid(x, 6, 4)

[1] "text"

Skaw answered 27/8, 2020 at 11:27 Comment(0)
B
1

I used the following code to get the last character of a string.

    substr(stringOfInterest, nchar(stringOfInterest), nchar(stringOfInterest))

You can lower the start position, e.g. to nchar(stringOfInterest)-2, to get the last few characters instead.

Buke answered 24/7, 2017 at 23:42 Comment(0)
D
0

A small modification of @Andrie's solution also gives the complement (everything except the last n characters) when n is negative:

substrR <- function(x, n) { 
  if(n > 0) substr(x, (nchar(x)-n+1), nchar(x)) else substr(x, 1, (nchar(x)+n))
}
x <- "moSvmC20F.5.rda"
substrR(x,-4)
[1] "moSvmC20F.5"

That was what I was looking for. The same idea works from the left side:

substrL <- function(x, n){ 
  if(n > 0) substr(x, 1, n) else substr(x, -n+1, nchar(x))
}
substrL(substrR(x,-4),-2)
[1] "SvmC20F.5"
Danaides answered 21/11, 2016 at 18:26 Comment(0)
G
0

In case a range of characters needs to be picked, counting from the right:

# For example, to get the date part from the string

substrRightRange <- function(x, m, n){substr(x, nchar(x)-m+1, nchar(x)-m+n)}

value <- "REGNDATE:20170526RN" 
substrRightRange(value, 10, 8)

[1] "20170526"
Gadgeteer answered 2/6, 2018 at 0:20 Comment(0)
D
0

You can use the base R substring() function with its first = and last = arguments. Here the string has 26 characters, so positions 23 to 26 are its last four:


x <- "Just some sample text here"

substring_ex_1 <- substring(x, first = 23, last = 26)
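
To avoid hard-coding the positions, they can be derived from nchar(). A minimal sketch, in which n and substring_ex_2 are illustrative names not used in the original:

```r
x <- "Just some sample text here"
n <- 4                                   # number of trailing characters wanted

# first is computed from the string length instead of being typed in as 23
substring_ex_2 <- substring(x, first = nchar(x) - n + 1, last = nchar(x))
print(substring_ex_2)
```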

Demetra answered 23/1, 2024 at 16:48 Comment(0)
