Reverse only alphabetical patterns in a string in R

I'm trying to learn R, and a sample problem asks me to reverse only the parts of a string that are in alphabetical order:

String: "abctextdefgtext"    
StringNew: "cbatextgfedtext"

Is there a way to identify alphabetical patterns to do this?

Daegal answered 13/4, 2017 at 13:59 Comment(7)
Welcome to SO! What have you tried so far? Please edit your question: stackoverflow.com/posts/43394297/edit - Bilodeau
FYI, R is a language for statistics, where strings are mostly/always static data. - Dorcasdorcea
How do you identify this "part of the string"? - Ophiology
@Ophiology presumably it is a length 2+ "intersection" with abcdefghijklmnopqrstuvwxyz - Dorcasdorcea
@RichScriven I'm not quite following. The first four chars abct are in alphabetical order, but just abc gets reversed. - Ophiology
@Ophiology - Because abc is the only part of abct that is in sequential alphabetical order (if that's a thing, for lack of a better term). t is not the next letter after c. - Angioma
@RichScriven Yes, but I guess you are making an inference which might not be what the OP wants. At first read, I thought that the parts of the string were a given. You are implying that the task is to find them. You are probably right after all, but the description is pretty poor, since only alphabetical order is mentioned, not the sequential part. - Ophiology

Here is one approach in base R, based on the pattern shown in the example. We split the string into individual characters ('v1'), use match to find each character's position in the alphabet (letters), and take the difference between adjacent positions. Wherever that difference is not 1, a new run starts, so the cumulative sum of those breaks gives a grouping variable ('i1'). We then reverse (rev) the characters within each group with ave and paste them together to get the expected output:

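# Split the string into single characters
v1 <- strsplit(str1, "")[[1]]
# Start a new group wherever the alphabet position does not increase by exactly 1
i1 <- cumsum(c(TRUE, diff(match(v1, letters)) != 1L))
# Reverse the characters within each group and glue them back together
paste(ave(v1, i1, FUN = rev), collapse="")
#[1] "cbatextgfedtext"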
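To see how the grouping works, it may help to print the intermediate vectors. The snippet below is only an illustrative walk-through of the code above; the name `pos` is introduced here for readability and is not part of the original answer.

```r
str1 <- "abctextdefgtext"
v1 <- strsplit(str1, "")[[1]]

# Alphabet position of each character ('pos' is a name added for illustration)
pos <- match(v1, letters)
pos
#[1]  1  2  3 20  5 24 20  4  5  6  7 20  5 24 20

# Group id: increases by one each time the position does not step up by exactly 1
i1 <- cumsum(c(TRUE, diff(pos) != 1L))
i1
#[1]  1  1  1  2  3  4  5  6  6  6  6  7  8  9 10

# Groups 1 and 6 are the runs "abc" and "defg"; every other group is one character
result <- paste(ave(v1, i1, FUN = rev), collapse = "")
result
#[1] "cbatextgfedtext"
```

Single-character groups are unchanged by rev, so only the sequential runs end up reversed.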

Or, as @alexislaz mentioned in the comments, work on the integer byte codes of the characters directly:

v1 = as.integer(charToRaw(str1))
rawToChar(as.raw(ave(v1, cumsum(c(TRUE, diff(v1) != 1L)), FUN = rev)))
#[1] "cbatextgfedtext"
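One difference worth keeping in mind: the byte-code variant compares character codes, not alphabet positions, so it treats any run of adjacent codes as a sequence, including digits and uppercase letters. The sketch below wraps the same expression in a function to show this; the name `rev_byte_runs` is hypothetical, and ASCII input is assumed.

```r
# Byte-code variant wrapped in a function (hypothetical name; assumes ASCII input)
rev_byte_runs <- function(s) {
  v <- as.integer(charToRaw(s))
  # New group wherever the code does not increase by exactly 1
  grp <- cumsum(c(TRUE, diff(v) != 1L))
  rawToChar(as.raw(ave(v, grp, FUN = rev)))
}

rev_byte_runs("abctextdefgtext")
#[1] "cbatextgfedtext"
rev_byte_runs("a12b")   # "1" and "2" have adjacent codes 49 and 50
#[1] "a21b"
rev_byte_runs("ABCx")   # uppercase runs are reversed too
#[1] "CBAx"
```

Whether that broader behaviour is wanted depends on how the exercise defines "alphabetical", which the question leaves open.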

EDIT:

1) A mistake was corrected based on @alexislaz's comments

2) Updated with another method suggested by @alexislaz in the comments

data

str1 <- "abctextdefgtext"
Profusion answered 13/4, 2017 at 14:9 Comment(2)
Building on the same approach, an alternative could be v1 = as.integer(charToRaw(str1)); rawToChar(as.raw(ave(v1, cumsum(c(TRUE, diff(v1) != 1L)), FUN = rev))). By the way, it seems that the "defg" sequence is not recognized correctly in the above approach. - Leptospirosis
@Leptospirosis Thank you very much for spotting the mistake and showing another great method (I learned a lot). I didn't know it was not matching. - Profusion

You could do this in base R:

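# Alphabet position of each character
vec <- match(unlist(strsplit(s, "")), letters)
# Run boundaries: 0, every index where the next position is not +1, and the end
x <- c(0, which(diff(vec) != 1), length(vec))
# Reverse the positions within each run
newvec <- unlist(lapply(seq_len(length(x) - 1), function(i) rev(vec[(x[i]+1):x[i+1]])))
# Map the positions back to letters
paste0(letters[newvec], collapse = "")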

#[1] "cbatextgfedtext"

Where s <- "abctextdefgtext"

  1. First you find the positions of each letter in the sequence of letters ([1] 1 2 3 20 5 24 20 4 5 6 7 20 5 24 20)
  2. Having the positions in hand, you look for consecutive numbers and, when found, reverse that sequence. ([1] 3 2 1 20 5 24 20 7 6 5 4 20 5 24 20)
  3. Finally, you get the letters back in the last line.
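
The three steps above can also be packaged as a small reusable function. This is a minimal sketch rather than a definitive implementation: the name `reverse_alpha_runs` is hypothetical, it groups with ave instead of slicing by index, and it assumes that any character that is not a lowercase letter (where match returns NA) should simply break a run and stay in place.

```r
# Hypothetical helper; the NA handling is an assumption, not part of the answers above
reverse_alpha_runs <- function(s) {
  if (!nzchar(s)) return(s)                 # nothing to do for an empty string
  chars <- strsplit(s, "")[[1]]
  pos <- match(chars, letters)              # step 1: alphabet positions (NA if not a-z)
  step <- diff(pos)
  # step 2: a run breaks wherever the step is not exactly 1 (or is NA)
  breaks <- is.na(step) | step != 1L
  grp <- cumsum(c(TRUE, breaks))
  # step 3: reverse within each run and join the characters back together
  paste(ave(chars, grp, FUN = rev), collapse = "")
}

reverse_alpha_runs("abctextdefgtext")
#[1] "cbatextgfedtext"
reverse_alpha_runs("ab1cd")
#[1] "ba1dc"
```

Working on the characters rather than on positions means non-letters pass through unchanged instead of turning into NA when indexing letters.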
Luana answered 13/4, 2017 at 14:27 Comment(0)
