Filtering the Results of Expand.Grid
Asked Answered
A

6

10

I am trying to generate a list of all combinations numbers that satisfy all the following conditions:

  • Any combination is exactly 6 numbers long
  • The possible numbers are only 1,5,7
  • 1 can only be followed by either 1 or 5
  • 5 can only be followed by either 5 or 7
  • 7 can only be followed by 7
  • There must be at least two 1's

I tried to do this with the expand.grid function.

Step 1: First, I generated a list of all 6 length combinations with 1,5,7:

numbers <- c(1, 5, 7)
all_combinations <- data.frame(expand.grid(rep(list(numbers), 6)))

Step 2: Then, I tried to add variables to satisfy the conditions:

all_combinations$starts_with_1 <- ifelse(all_combinations$Var1 == 1, "yes", "no")
all_combinations$numbers_ascending  <- apply(all_combinations, 1, function(x) all(diff(as.numeric(x)) >= 0))


all_combinations$numbers_ascending  <- ifelse(all_combinations$numbers_ascending , "yes", "no")


all_combinations$at_least_two_ones <- apply(all_combinations, 1, function(x) sum(x == 1) >= 2)

all_combinations$at_least_two_ones <- ifelse(all_combinations$at_least_two_ones, "yes", "no")

Step 3: Finally, I tried to keep rows that satisfy all 3 conditions:

all_combinations <- all_combinations[all_combinations$starts_with_1 == "yes" & all_combinations$numbers_ascending == "yes" & all_combinations$at_least_two_ones == "yes", ]

all_combinations

However, the results are all NA:

      Var1 Var2 Var3 Var4 Var5 Var6 starts_with_1 numbers_ascending at_least_two_ones
NA      NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.1    NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.2    NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.3    NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.4    NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.5    NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.6    NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.7    NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.8    NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.9    NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.10   NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.11   NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.12   NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.13   NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.14   NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.15   NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.16   NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.17   NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.18   NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.19   NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.20   NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>

Note: I am trying to do this in a flexible way so that if I need to change something (e.g. modify to at least three 1's, or modify to 7 appearing before 5), I can quickly create a variable to test for this condition. This is why I am using the expand.grid approach.

Annapolis answered 1/5 at 3:42 Comment(0)
S
6

I guess we'll be adjusting it, but how about a regex approach?
Check this out:

library(tidyverse)

# ----------------
my_numbers <- c(1, 5, 7)
my_combinations <- data.frame(expand.grid(rep(list(my_numbers), 6)))

# Patterns
looking <- str_c(
  sep = "|",
  "1{2}")      # At least two "1"

not_looking <- str_c(
  sep = "|",
  "17",        # 1 can only be followed by either 1 or 5
  "51",        # 5 can only be followed by either 5 or 7
  "71", "75")  # 7 can only be followed by 7

# ----------------
my_output <- my_combinations %>% 
  rowwise() %>% 
  mutate(combo = str_flatten(c_across(starts_with("var")))) %>% 
  filter(str_detect(combo, looking), !str_detect(combo, not_looking))

The output:

> my_output
# A tibble: 11 × 7
# Rowwise: 
    Var1  Var2  Var3  Var4  Var5  Var6 combo 
   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> 
 1     1     1     1     1     1     1 111111
 2     1     1     1     1     1     5 111115
 3     1     1     1     1     5     5 111155
 4     1     1     1     5     5     5 111555
 5     1     1     5     5     5     5 115555
 6     1     1     1     1     5     7 111157
 7     1     1     1     5     5     7 111557
 8     1     1     5     5     5     7 115557
 9     1     1     1     5     7     7 111577
10     1     1     5     5     7     7 115577
11     1     1     5     7     7     7 115777

Created on 2024-05-01 with reprex v2.1.0

Suffice answered 1/5 at 4:21 Comment(4)
@ Adriano Mello: thank you so much for your answer! is it possible to modify the "ascending" condition and make it so that the general order of numbers is 1,7,5 ?Annapolis
@Annapolis how would the order of the numbers look like? Just for example "171 >> 571 >> 777 >> 715 >> 515" would be correct?Suffice
E.g. 117555, 111755 , etcAnnapolis
@stats_noob: Those violate the assumptions in the question; they aren't valid results. If you're deviating from the question, do you mean just to replace all 5s with 7s and 7s with 5s?Flofloat
F
13

1) expand.grid The indexes of the numbers must be non-decreasing and since there must be at least n1 1's every feasible solution of n numbers must start with n1 1's. That leaves n-n1 indexes left so form the grid g and then use apply to get a logical vector indicating which rows of g to keep and subset g by it. Finally convert the indexes to the values in x. This last step allows x to be in any order.

filter_rows <- function(g, x) {
  ok  <- function(z) all(diff(z) %in% 0:1)
  out <- g[apply(g, 1, ok), ]
  replace(out, TRUE, lapply(out, \(i) x[i]))
}

f <- function(x = c(1,5,7), n=6, n1=2) {
  do.call(expand.grid, rep(list(1, seq_along(x)), c(n1, n-n1))) |>
    filter_rows(x)
}

# test runs
f() # as per question 
f(n1 = 3) # 3 ones
f(c(1, 7, 5)) # change order

2) gtools gtools has a function, combinations that can compute combinations with duplicates. The filter_rows function is from (1).

library(gtools)

f2 <- function(x = c(1,5,7), n=6, n1=2) {
  data.frame(as.list(rep(1, n1)), 
    combinations(length(x), n-n1, repeats.allowed = TRUE)) |>
  filter_rows(x)
}

# test runs
f2() # as per question 
f2(n1 = 3) # 3 ones
f2(c(1, 7, 5)) # change order

3) RcppAlgos This is similar to (2) except it uses combGeneral from RcppAlgos. The filter_rows function is from (1).

library(RcppAlgos)

f3 <- function(x = c(1,5,7), n=6, n1=2) {
  data.frame(as.list(rep(1, n1)), 
    comboGeneral(length(x), n-n1, repetition = TRUE)) |>
    filter_rows(x)
}

# test runs
f3() # as per question 
f3(n1 = 3) # 3 ones
f3(c(1, 7, 5)) # change o

Update

  • Added (2) and (3)
  • Fixed. Have added one condition that originally I had missed.
  • factored out code common to (1), (2) and (3)
Foretoken answered 1/5 at 5:17 Comment(0)
W
8

Since two 1s are required, and 1 can be followed only by 1 or 5, the first two columns must be one. I give three options below. The first requires some filtering; the others do not.

The first option uses RcppAlgos::comboGeneral. The results will by default be ordered. The only condition that will not be satisfied is that 1 is not followed by 7, so we filter on that condition.

library(RcppAlgos)

v <- c(1, 5, 7)
x <- cbind(1, 1, comboGeneral(v, 4, TRUE))
x[rowSums(x[,-6] == v[1] & x[,-1] == v[3]) == 0,]
#>       [,1] [,2] [,3] [,4] [,5] [,6]
#>  [1,]    1    1    1    1    1    1
#>  [2,]    1    1    1    1    1    5
#>  [3,]    1    1    1    1    5    5
#>  [4,]    1    1    1    1    5    7
#>  [5,]    1    1    1    5    5    5
#>  [6,]    1    1    1    5    5    7
#>  [7,]    1    1    1    5    7    7
#>  [8,]    1    1    5    5    5    5
#>  [9,]    1    1    5    5    5    7
#> [10,]    1    1    5    5    7    7
#> [11,]    1    1    5    7    7    7

The second option permutes the location of a 1 -> 5 or a 5 -> 7 transition by using the freqs argument of RcppAlgos::permuteGeneral. There can be up to 2 transitions and (excluding the first two columns) up to 4 non-transitions.

library(matrixStats)

cbind(1, 1, matrix(v[rowCumsums(permuteGeneral(0:1, 4, TRUE, c(4, 2))) + 1L], ncol = 4))
#>       [,1] [,2] [,3] [,4] [,5] [,6]
#>  [1,]    1    1    1    1    1    1
#>  [2,]    1    1    1    1    1    5
#>  [3,]    1    1    1    1    5    5
#>  [4,]    1    1    1    1    5    7
#>  [5,]    1    1    1    5    5    5
#>  [6,]    1    1    1    5    5    7
#>  [7,]    1    1    1    5    7    7
#>  [8,]    1    1    5    5    5    5
#>  [9,]    1    1    5    5    5    7
#> [10,]    1    1    5    5    7    7
#> [11,]    1    1    5    7    7    7

The third option uses compositionsGeneral and the observation that the number of times each element is repeated in the last 5 columns is a 3-composition of 5 (with zeros allowed):

x <- t(compositionsGeneral(0:5, 3, repetition = TRUE)[,3:1])
cbind(1, matrix(rep(rep(v, ncol(x)), x), ncol(x), 5, 1))
#>       [,1] [,2] [,3] [,4] [,5] [,6]
#>  [1,]    1    1    1    1    1    1
#>  [2,]    1    1    1    1    1    5
#>  [3,]    1    1    1    1    5    5
#>  [4,]    1    1    1    5    5    5
#>  [5,]    1    1    5    5    5    5
#>  [6,]    1    1    1    1    5    7
#>  [7,]    1    1    1    5    5    7
#>  [8,]    1    1    5    5    5    7
#>  [9,]    1    1    1    5    7    7
#> [10,]    1    1    5    5    7    7
#> [11,]    1    1    5    7    7    7
Wrapping answered 1/5 at 10:21 Comment(0)
S
6

I guess we'll be adjusting it, but how about a regex approach?
Check this out:

library(tidyverse)

# ----------------
my_numbers <- c(1, 5, 7)
my_combinations <- data.frame(expand.grid(rep(list(my_numbers), 6)))

# Patterns
looking <- str_c(
  sep = "|",
  "1{2}")      # At least two "1"

not_looking <- str_c(
  sep = "|",
  "17",        # 1 can only be followed by either 1 or 5
  "51",        # 5 can only be followed by either 5 or 7
  "71", "75")  # 7 can only be followed by 7

# ----------------
my_output <- my_combinations %>% 
  rowwise() %>% 
  mutate(combo = str_flatten(c_across(starts_with("var")))) %>% 
  filter(str_detect(combo, looking), !str_detect(combo, not_looking))

The output:

> my_output
# A tibble: 11 × 7
# Rowwise: 
    Var1  Var2  Var3  Var4  Var5  Var6 combo 
   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> 
 1     1     1     1     1     1     1 111111
 2     1     1     1     1     1     5 111115
 3     1     1     1     1     5     5 111155
 4     1     1     1     5     5     5 111555
 5     1     1     5     5     5     5 115555
 6     1     1     1     1     5     7 111157
 7     1     1     1     5     5     7 111557
 8     1     1     5     5     5     7 115557
 9     1     1     1     5     7     7 111577
10     1     1     5     5     7     7 115577
11     1     1     5     7     7     7 115777

Created on 2024-05-01 with reprex v2.1.0

Suffice answered 1/5 at 4:21 Comment(4)
@ Adriano Mello: thank you so much for your answer! is it possible to modify the "ascending" condition and make it so that the general order of numbers is 1,7,5 ?Annapolis
@Annapolis how would the order of the numbers look like? Just for example "171 >> 571 >> 777 >> 715 >> 515" would be correct?Suffice
E.g. 117555, 111755 , etcAnnapolis
@stats_noob: Those violate the assumptions in the question; they aren't valid results. If you're deviating from the question, do you mean just to replace all 5s with 7s and 7s with 5s?Flofloat
C
6

Here's one approach you might use:

fn <- function(numbers, n_cols = 6, n_at_start) {
  v <- seq_along(numbers)
  
  # Generate grid and convert to matrix
  m <- as.matrix(expand.grid(rep(list(v), n_cols)))
  
  # Colwise differences
  dm <- m[, -1] - m[, -n_cols]
  
  # Filter
  m <- m[rowSums(m == 1) >= n_at_start  &   # Row starts with n 1s
         rowSums(dm == 0 | dm == 1) == n_cols - 1, ] # Number followed by itself or next value in sequence
  
  m[] <- numbers[m]
  m

}

fn(c(1,5,7), n_cols = 6, n_at_start = 2)

      Var1 Var2 Var3 Var4 Var5 Var6
 [1,]    1    1    1    1    1    1
 [2,]    1    1    1    1    1    5
 [3,]    1    1    1    1    5    5
 [4,]    1    1    1    5    5    5
 [5,]    1    1    5    5    5    5
 [6,]    1    1    1    1    5    7
 [7,]    1    1    1    5    5    7
 [8,]    1    1    5    5    5    7
 [9,]    1    1    1    5    7    7
[10,]    1    1    5    5    7    7
[11,]    1    1    5    7    7    7
Combo answered 1/5 at 4:58 Comment(0)
A
4

The code below applies the conditions you have coded in your question and produces results that are not all NA.

library(dplyr)

numbers <- c(1, 5, 7)
all_combinations <- data.frame(expand.grid(rep(list(numbers), 6)))

all_combinations %>%
  mutate(
    starts_with_1 = ifelse(Var1 == 1, "yes", "no"),
    numbers_ascending = apply(., 1, function(x) all(diff(as.numeric(x)) >= 0)),
    numbers_ascending = ifelse(numbers_ascending , "yes", "no"),
    at_least_two_ones = apply(., 1, function(x) sum(x == 1) >= 2),
    at_least_two_ones = ifelse(at_least_two_ones, "yes", "no")
  ) %>%
  filter(
    starts_with_1 == "yes",
    numbers_ascending == "yes",
    at_least_two_ones == "yes"
  )

I'm not sure that the logical conditions you have defined give you the required results though because in the output below you can see that there's a case where 1 is followed by 7. There are possibly other cases that don't conform to the rules you defined.

So you might need to refine your logic. But at least this approach will give you results to look at.

   Var1 Var2 Var3 Var4 Var5 Var6 starts_with_1 numbers_ascending at_least_two_ones
1     1    1    1    1    1    1           yes               yes               yes
2     1    1    1    1    1    5           yes               yes               yes
3     1    1    1    1    5    5           yes               yes               yes
4     1    1    1    5    5    5           yes               yes               yes
5     1    1    5    5    5    5           yes               yes               yes
6     1    1    1    1    1    7           yes               yes               yes
7     1    1    1    1    5    7           yes               yes               yes
8     1    1    1    5    5    7           yes               yes               yes
9     1    1    5    5    5    7           yes               yes               yes
10    1    1    1    1    7    7           yes               yes               yes
11    1    1    1    5    7    7           yes               yes               yes
12    1    1    5    5    7    7           yes               yes               yes
13    1    1    1    7    7    7           yes               yes               yes
14    1    1    5    7    7    7           yes               yes               yes
15    1    1    7    7    7    7           yes               yes               yes

You can also simplify the implementation like this:

all_combinations %>%
  mutate(
    numbers_ascending = apply(., 1, function(x) all(diff(as.numeric(x)) >= 0)),
    at_least_two_ones = apply(., 1, function(x) sum(x == 1) >= 2)
  ) %>%
  filter(
    Var1 == 1,
    numbers_ascending,
    at_least_two_ones
  )
Affiche answered 1/5 at 4:31 Comment(4)
@ datawookie: thank you so much for your answer! is it possible to modify the "ascending" condition and make it so that the general order of numbers is 1,7,5 ?Annapolis
Yes, for sure. But that feels like a separate question and I'd suggest you create another question on SO. You can reference this one for context.Affiche
your 6th row doesn't comply with the requirement since 1 should be followed by either 1 or 5 not 7Valerlan
My solution implements the logic supplied in the question. The question as I understood it was not about the logic but why the original implementation was generating a bunch of NA.Affiche
V
4

Idea

Since your have the requirement for at least two 1s and 1 is the smallest digit once you want to start with it. In case, you always have at least two 1s at the beginning of your sequence, e.g., 11xxxx.

For the following four placeholders x, you can sequentially fill it with a feasible number (depending its precedent value, for example, 1is followed by 1 or 5, and 5 is followed by 5 or 7, and so on) and iteratively update the sequence until all of the four x are filled.


Code (with Recursion)

I would say, expand.grid is not a good choice here since it includes too many unnecessary combinations. You can design a recursive function to yield the desired combinations, e.g.,

f <- function(n, v = c(1, 5, 7)) {
    if (n == 2) {
        return(list(c(1, 1)))
    }
    unlist(
        lapply(
            Recall(n - 1),
            \(x)
            Map(
                c,
                list(x),
                unlist(list(v[-3], v[-1], v[3])[match(tail(x, 1), v)])
            )
        ), FALSE
    )
}

which gives

> f(6)
[[1]]
[1] 1 1 1 1 1 1

[[2]]
[1] 1 1 1 1 1 5

[[3]]
[1] 1 1 1 1 5 5

[[4]]
[1] 1 1 1 1 5 7

[[5]]
[1] 1 1 1 5 5 5

[[6]]
[1] 1 1 1 5 5 7

[[7]]
[1] 1 1 1 5 7 7

[[8]]
[1] 1 1 5 5 5 5

[[9]]
[1] 1 1 5 5 5 7

[[10]]
[1] 1 1 5 5 7 7

[[11]]
[1] 1 1 5 7 7 7
Valerlan answered 1/5 at 21:0 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.