Efficiently enumerating multinomials with constant sum - R

Asked 15/9, 2023 at 23:36 Answered 16/9, 2023 at 6:29

Solved r constraints combinatorics memory-efficient

Let's say I have an N-side dice with non-uniform probabilities for each side and I throw it M times. Now instead of observing the individual outcomes, we only observe the sum.

I have to code the likelihood where I have to sum over the multinomial likelihood components restricted to having that observed sum.

If N=3, M = 2 and the sum is 4, than it is clear that I have to sum over the two cases where one of the throws is 1 and the other 3 plus the case where both are 2.

I could also enumerate all possibilities, calculate the sum and restrict the calculation to the combinations with the sum I'm interested in, but clearly that becomes intractable quite quickly with increasing N and M.

So I am looking for an efficient approach to select the constant-sum combinations in R.

Tubercle answered 15/9, 2023 at 23:36 Comment(0)

One option is to use RcppAlgos::compositionsGeneral() which employs 'efficient algorithms for partitioning numbers under various constraints'.

library(RcppAlgos)

compositionsGeneral(3, 2, repetition = TRUE, target = 4)

     [,1] [,2]
[1,]    1    3
[2,]    2    2
[3,]    3    1

As @ThomasIsCoding has pointed out, this approach can fail with the message:

compositionsGeneral(3, 6, repetition = TRUE, target = 10)

Error: Currently, there is no composition algorithm for this case.
 Use permuteCount, permuteIter, permuteGeneral, permuteSample, or
 permuteRank instead.

So to deal with this, we can catch errors and fall back on permuteGeneral() with constraints in this event:

comps <- \(v, m, x) {
  tryCatch(
    compositionsGeneral(v,
                        m,
                        repetition = TRUE,
                        target = x),
    error = function(e)
      permuteGeneral(v,
                     m,
                     repetition = TRUE,
                     constraintFun = "sum",
                     comparisonFun = "==",
                     limitConstraints = x
      )
  )
}


comps(3, 6, 10)

      [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    1    1    1    1    3    3
 [2,]    1    1    1    3    1    3
 [3,]    1    1    1    3    3    1
 [4,]    1    1    3    1    1    3
 [5,]    1    1    3    1    3    1
...
[85,]    2    2    1    1    2    2
[86,]    2    2    1    2    1    2
[87,]    2    2    1    2    2    1
[88,]    2    2    2    1    1    2
[89,]    2    2    2    1    2    1
[90,]    2    2    2    2    1    1

Note that the documentation includes the following about calculating permutations with contraints:

Finding all combinations/permutations with constraints is optimized by organizing them in such a way that when constraintFun is applied, a partially monotonic sequence is produced. Combinations/permutations are added successively, until a particular combination exceeds the given constraint value for a given constraint/comparison function combo. After this point, we can safely skip several combinations knowing that they will exceed the given constraint value.

Macaroon answered 15/9, 2023 at 23:49 Comment(5)

But there are three cases: 1-3, 3-1, and 2-2. No? – Orometer 16/9, 2023 at 4:20

This is not my understanding. He/she said "the 2 cases where ... plus the case where....". And given the description of the problem, the order matter : {1,3} unordered is a more frequent outcome than {2, 2}. – Orometer 16/9, 2023 at 4:49

Ah indeed - I misread it. Thanks. Will edit. – Macaroon 16/9, 2023 at 4:50

Concise and efficient, +1! By the way, what if target = 10 and M = 6? It seems giving errors. – Tremor 16/9, 2023 at 6:35

@Tremor - see update - I guess it's not ideal but users are largely not going to know if the function can determine a solution algorithmically until they call it. – Macaroon 16/9, 2023 at 11:36

I think you need the block partitions of the partitions package:

> blockparts(rep(3, 2), 4)
          
[1,] 3 2 1
[2,] 1 2 3

If N=5 (number of sides), you throw the dice M=2 times, and you get the sum 4, there's a problem, the cases 4-0 and 0-4 are included:

> blockparts(c(5, 5), 4)
              
[1,] 4 3 2 1 0
[2,] 0 1 2 3 4

Instead of filtering, you can do:

> x <- c(1, 1) # minimal values
> y <- c(5, 5) # maximal values
> sweep(blockparts(y - x, 4 - sum(x)), 1L, x, "+")
          
[1,] 3 2 1
[2,] 1 2 3

(this method is given in the doc of the package)

Massasoit answered 16/9, 2023 at 4:19 Comment(0)

You can define a custom function in a recursive manner, like below

f <- function(S, M, N = 3) {
    if (M == 1) {
        if (S <= N && S > 0) {
            return(S)
        } else {
            return(NULL)
        }
    }
    unlist(
        lapply(1:min(S,N), \(k) {
            Map(`c`, k, f(S - k, M - 1))
        }),
        recursive = FALSE
    )
}

such that:

target = 4 and M = 2

> f(4, 2)
[[1]]
[1] 1 3

[[2]]
[1] 2 2

[[3]]
[1] 3 1

target = 5 and M = 3

> f(5, 3)
[[1]]
[1] 1 1 3

[[2]]
[1] 1 2 2

[[3]]
[1] 1 3 1

[[4]]
[1] 2 1 2

[[5]]
[1] 2 2 1

[[6]]
[1] 3 1 1

target = 10 and M = 6

> f(10, 6) 
[[1]]
[1] 1 1 1 1 3 3

[[2]]
[1] 1 1 1 2 2 3

[[3]]
[1] 1 1 1 2 3 2

[[4]]
[1] 1 1 1 3 1 3

[[5]]
[1] 1 1 1 3 2 2

[[6]]
[1] 1 1 1 3 3 1

[[7]]
[1] 1 1 2 1 2 3

[[8]]
[1] 1 1 2 1 3 2

[[9]]
[1] 1 1 2 2 1 3

[[10]]
[1] 1 1 2 2 2 2

[[11]]
[1] 1 1 2 2 3 1

[[12]]
[1] 1 1 2 3 1 2

[[13]]
[1] 1 1 2 3 2 1

[[14]]
[1] 1 1 3 1 1 3

[[15]]
[1] 1 1 3 1 2 2

[[16]]
[1] 1 1 3 1 3 1

[[17]]
[1] 1 1 3 2 1 2

[[18]]
[1] 1 1 3 2 2 1

[[19]]
[1] 1 1 3 3 1 1

[[20]]
[1] 1 2 1 1 2 3

[[21]]
[1] 1 2 1 1 3 2

[[22]]
[1] 1 2 1 2 1 3

[[23]]
[1] 1 2 1 2 2 2

[[24]]
[1] 1 2 1 2 3 1

[[25]]
[1] 1 2 1 3 1 2

[[26]]
[1] 1 2 1 3 2 1

[[27]]
[1] 1 2 2 1 1 3

[[28]]
[1] 1 2 2 1 2 2

[[29]]
[1] 1 2 2 1 3 1

[[30]]
[1] 1 2 2 2 1 2

[[31]]
[1] 1 2 2 2 2 1

[[32]]
[1] 1 2 2 3 1 1

[[33]]
[1] 1 2 3 1 1 2

[[34]]
[1] 1 2 3 1 2 1

[[35]]
[1] 1 2 3 2 1 1

[[36]]
[1] 1 3 1 1 1 3

[[37]]
[1] 1 3 1 1 2 2

[[38]]
[1] 1 3 1 1 3 1

[[39]]
[1] 1 3 1 2 1 2

[[40]]
[1] 1 3 1 2 2 1

[[41]]
[1] 1 3 1 3 1 1

[[42]]
[1] 1 3 2 1 1 2

[[43]]
[1] 1 3 2 1 2 1

[[44]]
[1] 1 3 2 2 1 1

[[45]]
[1] 1 3 3 1 1 1

[[46]]
[1] 2 1 1 1 2 3

[[47]]
[1] 2 1 1 1 3 2

[[48]]
[1] 2 1 1 2 1 3

[[49]]
[1] 2 1 1 2 2 2

[[50]]
[1] 2 1 1 2 3 1

[[51]]
[1] 2 1 1 3 1 2

[[52]]
[1] 2 1 1 3 2 1

[[53]]
[1] 2 1 2 1 1 3

[[54]]
[1] 2 1 2 1 2 2

[[55]]
[1] 2 1 2 1 3 1

[[56]]
[1] 2 1 2 2 1 2

[[57]]
[1] 2 1 2 2 2 1

[[58]]
[1] 2 1 2 3 1 1

[[59]]
[1] 2 1 3 1 1 2

[[60]]
[1] 2 1 3 1 2 1

[[61]]
[1] 2 1 3 2 1 1

[[62]]
[1] 2 2 1 1 1 3

[[63]]
[1] 2 2 1 1 2 2

[[64]]
[1] 2 2 1 1 3 1

[[65]]
[1] 2 2 1 2 1 2

[[66]]
[1] 2 2 1 2 2 1

[[67]]
[1] 2 2 1 3 1 1

[[68]]
[1] 2 2 2 1 1 2

[[69]]
[1] 2 2 2 1 2 1

[[70]]
[1] 2 2 2 2 1 1

[[71]]
[1] 2 2 3 1 1 1

[[72]]
[1] 2 3 1 1 1 2

[[73]]
[1] 2 3 1 1 2 1

[[74]]
[1] 2 3 1 2 1 1

[[75]]
[1] 2 3 2 1 1 1

[[76]]
[1] 3 1 1 1 1 3

[[77]]
[1] 3 1 1 1 2 2

[[78]]
[1] 3 1 1 1 3 1

[[79]]
[1] 3 1 1 2 1 2

[[80]]
[1] 3 1 1 2 2 1

[[81]]
[1] 3 1 1 3 1 1

[[82]]
[1] 3 1 2 1 1 2

[[83]]
[1] 3 1 2 1 2 1

[[84]]
[1] 3 1 2 2 1 1

[[85]]
[1] 3 1 3 1 1 1

[[86]]
[1] 3 2 1 1 1 2

[[87]]
[1] 3 2 1 1 2 1

[[88]]
[1] 3 2 1 2 1 1

[[89]]
[1] 3 2 2 1 1 1

[[90]]
[1] 3 3 1 1 1 1

Tremor answered 16/9, 2023 at 6:29 Comment(0)

Recommended topics

Hot tags