Try this,
df <- data.frame(x=c("A", "B", "A"), y = c("B", "A", "B"))
unique(apply(df, 1, function(x) paste(sort(unlist(strsplit(x, " "))),collapse = " ")))
[1] "A B"
So how does this work?
We are applying a function to each row of the data frame, so we can take each row at a time.
Take the second row of the df,
df[2,]
x y
2 B A
We then split this (strsplit) and unlist the result into a vector with one letter per element. (as.matrix converts the one-row data frame into a character matrix that strsplit can handle.)
unlist(strsplit(as.matrix(df[2,]), " "))
[1] "B" "A"
Use sort to put the letters into alphabetical order, then paste them back together,
paste(sort(unlist(strsplit(as.matrix(df[2,]), " "))), collapse = " ")
[1] "A B"
The apply function then does this for every row, because we set its margin argument to 1, and the unique function picks out the distinct edges.
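As a rough sketch, the steps above can be wrapped in a small helper. The name edge_key is made up for illustration, and stringsAsFactors = FALSE is an assumption added so the example behaves the same on older R versions. strsplit is left out here because every cell already holds a single label, so sort and paste do the work on their own.

```r
# Hypothetical helper: build one order-independent key per row
edge_key <- function(row) {
  # sort() puts the labels in alphabetical order, paste() joins them back
  paste(sort(row), collapse = " ")
}

df <- data.frame(x = c("A", "B", "A"), y = c("B", "A", "B"),
                 stringsAsFactors = FALSE)

keys <- apply(df, 1, edge_key)  # MARGIN = 1 means "row by row"
keys
# [1] "A B" "A B" "A B"

unique(keys)
# [1] "A B"

# Keep the first row of the data frame for each distinct edge
dedup <- df[!duplicated(keys), , drop = FALSE]
dedup
```

Using duplicated on the keys is one way to get back the original rows rather than just the pasted strings, if that is what you need.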
Extension
This can be extended to n variables, for example n=3,
df <- data.frame(x=c("A", "B", "A"), y = c("B", "A", "B"), z = c("C", "D", "D"))
unique(apply(df, 1, function(x) paste(sort(unlist(strsplit(x, " "))),collapse = " ")))
[1] "A B C" "A B D"
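If you would rather keep the labels in separate columns than paste them into one string, a possible variation (not the method above, just a sketch of the same idea) is to sort within each row and call unique on the resulting matrix. The names sorted_rows and unique_sets are made up for this example, and stringsAsFactors = FALSE is added as an assumption for older R versions.

```r
df <- data.frame(x = c("A", "B", "A"),
                 y = c("B", "A", "B"),
                 z = c("C", "D", "D"),
                 stringsAsFactors = FALSE)

# Sort the labels within each row. unname() drops the x/y/z names, which
# no longer describe the values once they have been reordered.
sorted_rows <- t(apply(df, 1, function(r) sort(unname(r))))

# unique() on a matrix removes duplicated rows
unique_sets <- unique(sorted_rows)
unique_sets
```

This should leave a matrix with one row per distinct combination, here A B C and A B D.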
If you need more labels than single letters allow, use multi-letter labels, for example,
df <- data.frame(x=c("A", "BC", "A"), y = c("B", "A", "BC"))
df
x y
1 A B
2 BC A
3 A BC
unique(apply(df, 1, function(x) paste(sort(unlist(strsplit(x, " "))),collapse = " ")))
[1] "A B" "A BC"
Old version
Using the tidyverse package, create a function called rev that orders our edges. Then use mutate to create a new column combining the x and y columns in a form that works well with the rev function, run the new column through the function, and find the unique pairs.
library(tidyverse)
rev <- function(x) {
  unname(sapply(x, function(x) {
    paste(sort(trimws(strsplit(x[1], ',')[[1]])), collapse = ',')
  }))
}
df <- data.frame(x=c("A", "B", "A"), y = c("B", "A", "B"))
rows <- df %>%
  mutate(both = paste(x, y, sep = ", "))
unique(rev(rows$both))
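One thing to watch with the old version is that defining your own rev masks base R's rev function for the rest of the session. A minimal sketch of the same idea under a different name is shown here; sort_pair is a hypothetical name, and paste on the columns is used as a base R stand-in for the mutate step so the example runs without loading the tidyverse.

```r
# Hypothetical name sort_pair(), chosen so base::rev() stays available
sort_pair <- function(x) {
  vapply(x, function(s) {
    parts <- trimws(strsplit(s, ",")[[1]])  # "B, A" -> c("B", "A")
    paste(sort(parts), collapse = ",")       # c("A", "B") -> "A,B"
  }, character(1), USE.NAMES = FALSE)
}

df <- data.frame(x = c("A", "B", "A"), y = c("B", "A", "B"),
                 stringsAsFactors = FALSE)

# Base R stand-in for mutate(both = paste(x, y, sep = ", "))
both <- paste(df$x, df$y, sep = ", ")

unique(sort_pair(both))
# [1] "A,B"
```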