Efficiently enumerate all possible matrices with given constraints

Background

Suppose we have a family of n-by-n matrices M that must meet the following requirements:

  1. Entries of the matrix are either 0 or 1, i.e., Boolean, and the diagonal entries are always 0
  2. The matrix is symmetric, i.e., M == t(M)
  3. There is a constant row (or equivalently, column) sum constraint p, such that all(rowSums(M)==p) == TRUE

Questions

  • Are there any properties of this particular matrix structure, e.g., symmetry, Boolean entries, or something else, that we can exploit to make the search more efficient?
  • It seems that the problem can be interpreted in graph-theoretic terms. For example, the matrix is the adjacency matrix of a graph with n vertices in which every vertex has both in-degree and out-degree equal to p. A single such graph can be generated with sample_degseq, but we may have to find all of its isomorphic mappings. How can we do that using igraph?

My code so far looks like the below, but it is slow for larger n or p (and I am also not sure whether some matrices are missed during the enumeration).

f <- function(n, p) {
    # helper function to check if requirement holds
    checker <- function(M, p, i = nrow(M) - 1) {
        rs <- rowSums(M)
        if ((i == nrow(M) - 1)) {
            return(all(rs == p))
        } else {
            return(all(rs[1:i] == p) && all(rs[-(1:i)] <= p))
        }
    }

    # start from all-0's matrix
    lst <- list(matrix(0, n, n))
    for (i in 1:(n - 1)) {
        js <- (i + 1):n
        r <- list()
        for (mat in lst) {
            k <- p - sum(mat[i, ])
            if (k == 0) {
                if (checker(mat, p, i)) {
                    r <- c(r, list(mat))
                }
            }
            if (k > 0 && length(js) >= k) {
                idx <- combn(length(js), k, \(v) js[v], simplify = FALSE)
                for (u in idx) {
                    mm <- mat
                    mm[i, u] <- 1
                    mm[u, i] <- 1
                    if (checker(mm, p, i)) {
                        r <- c(r, list(mm))
                    }
                }
            }
        }
        lst <- r
    }
    lst
}

Examples

  • For n <- 4 and p <- 2, we can find 3 matrices
[[1]]
     [,1] [,2] [,3] [,4]
[1,]    0    1    1    0
[2,]    1    0    0    1
[3,]    1    0    0    1
[4,]    0    1    1    0

[[2]]
     [,1] [,2] [,3] [,4]
[1,]    0    1    0    1
[2,]    1    0    1    0
[3,]    0    1    0    1
[4,]    1    0    1    0

[[3]]
     [,1] [,2] [,3] [,4]
[1,]    0    0    1    1
[2,]    0    0    1    1
[3,]    1    1    0    0
[4,]    1    1    0    0
  • For n <- 3 and p <- 2, we can find only 1 matrix
[[1]]
     [,1] [,2] [,3]
[1,]    0    1    1
[2,]    1    0    1
[3,]    1    1    0
Skimmia answered 13/8, 2023 at 22:37 Comment(13)
We are talking about huge numbers. There are at least n!/(n/2)! such matrices for any non-trivial p. Any code generating them will be slow.Dacosta
Do you want to generate all such matrices, or just sample them uniformly? Exhaustive enumeration will be infeasible except for the tiniest matrices.Embayment
@Embayment Yes, I want to generate all of them. The value of n is not that big, nothing like hundreds or thousands, but probably up to 10 or 20. I guess there might be igraph solutions, but I have no idea how they could be implemented. Your help would be greatly appreciated!Skimmia
@Skimmia I am quite familiar with this problem, but I'm ill at the moment and too exhausted for an answer. You'll find links to the theory in iopscience.iop.org/article/10.1088/2632-072X/abced5/meta for undirected graphs. The principle in general is that we consider each vertex with as many "dangling edges", or stubs, as its degree. Then we connect stubs up in all possible ways. The trick is how to skip the connections which would prevent getting a simple graph (i.e. only 1s and 0s and 0s on the diagonal), and speed up the procedure. There are more sophisticated ways for this,Embayment
but what's most feasible here is to connect all stubs of a single vertex in one go, and look at whether the remaining degree sequence stays graphical (is_graphical). If not, cut the search, otherwise continue generating in a recursive manner. If the graph is directed, we split each vertex into an "in" and "out" instance and connect out to in only.Embayment
I can post an answer in Mathematica + pseudocode in a couple of days and you can help me translate to R.Embayment
@Embayment That would be great if you can provide some pseudocode. Not in a hurry, take your time. Hope you get well soon :)Skimmia
I will take a look at the theory to see how I can be inspired by it. Thanks for sharing!Skimmia
@Dacosta I don't think there are at least n!/(n/2)! such matrices, since the row sum constraint p and the symmetry property are two picky filtering conditions. For example, n <- 5 and p <- 3 will give an empty list since no such matrix can be found.Skimmia
@Skimmia Wolfram has a table with the number of r-regular graphs on n nodes toward the bottom of this page. Note that the count is always zero when both n and r are odd.Gurdwara
@Gurdwara aha, that's an interesting finding! Thanks for sharing the info. I think here the r-regular graphs should include both connected and disconnected graphs.Skimmia
There may be a combinatoric approach. For any given row, there are choose(n-1, p) combinations. Furthermore, all rows would have the same combinations with a 0 inserted at different locations to account for the diagonal being zero. Last, we know that the sum of the upper triangle equals n * p / 2. I assume graph theory will get you there faster, but I feel like there might be ways to improve the efficiency of a brute-force algorithm.Clovah
@Skimmia If you want to include unconnected graphs, you just need to generate all (valid) partitions of the nodes and then all of the r-regular graphs for those partitions. Now combine.Gurdwara

The question boils down to generating all realizations of a degree sequence as simple undirected graphs. igraph presently contains the following related tools:

  • realize_degseq() creates a single realization (we want all).
  • sample_degseq() does a random sampling of realizations, and some of the methods it implements ensure uniform sampling.
  • is_graphical() tests if any realization exists at all. If it does, we call the degree sequence graphical. This will be the basic building block for our algorithm below.
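
For the constant-degree case in the question, the degree sequence is simply rep(p, n). A quick illustration of these functions (a minimal sketch; as_adjacency_matrix() is only used here to display the result):

library(igraph)

degs <- rep(2, 4)                       # the n = 4, p = 2 case from the question
is_graphical(degs)                      # TRUE: at least one simple realization exists
g <- realize_degseq(degs)               # one deterministic realization
as_adjacency_matrix(g, sparse = FALSE)  # its adjacency matrix

is_graphical(rep(3, 5))                 # FALSE: the degree sum is odd, so no realization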

Suppose our degree sequence is {2, 3, 1, 2}. We can draw this as follows:

[Figure: the four vertices drawn with their dangling stubs]

To generate all realizations, we need to connect up the stubs in all possible ways. Some of these ways will lead to multi-edges or self-loops, i.e. non-simple graphs. We need to discard these.

The trick to efficient generation of all simple realizations is to detect early if we'd be forced to create a non-simple graph and stop. We can do this as follows. Let us take the first vertex, which has d_1 = 2 stubs. There are n-1 = 3 other vertices we can connect these two stubs to. There are choose(n-1, d_1) = choose(3, 2) = 3 ways to do so, namely the following:

[Figures: the three ways of connecting vertex 1's two stubs to pairs of the remaining vertices]

Can we reach a simple graph from all three of these configurations? Notice that after connecting all stubs of vertex 1, what remains is a degree sequence of vertices 2, 3, 4, which may or may not be graphical. We call this the reduced degree sequence. For these three cases, we have:

2 0 2
2 1 1
3 0 1

(We discard the first vertex, whose degree is now 0, since it no longer needs to be considered.)

Only the second of these is graphical, so we only need to continue the construction along this path.

> is_graphical(c(2,0,2))
[1] FALSE
> is_graphical(c(2,1,1))
[1] TRUE
> is_graphical(c(3,0,1))
[1] FALSE

Taking the second reduced degree sequence, we can then recursively continue the construction in the same manner, obtaining the single possible realization of our starting degree sequence:

[Figure: the single simple realization of the degree sequence {2, 3, 1, 2}]

Expressing this recursive construction in pseudocode:

# degrees is a vector of degrees
# vertices is a vector of corresponding vertex names
# generate(degrees, vertices) returns all realizations 
# of the given degrees as simple graphs
generate(degrees, vertices):
    RESULT = {} # empty list
    d1 = degrees[1]  # first degree
    v1 = vertices[1] # first vertex
    drest = degrees[2:]  # rest of degrees
    vrest = vertices[2:] # rest of vertices
    for each size d1 subset S of vrest:
        dreduced = drest
        dreduced[S] -= 1 # decrement by one elements in S to obtain the reduced degree sequence
        if is_graphical(dreduced):
            for each graph in generate(dreduced, vrest),
            add vertex v1 to it and connect it to each of S,
            then append the graph to RESULT
    return RESULT

I have an implementation of this in Mathematica, but I am not fluent enough in R to be able to take the time to implement it in that language.
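
A rough transcription of the pseudocode into R could look like the following (an untested sketch; representing each realization as a two-column edge matrix is just one convenient choice):

# all realizations of `degrees` (with vertex names `vertices`) as simple graphs,
# each returned as a two-column matrix of edges; uses is_graphical() from igraph
library(igraph)

generate <- function(degrees, vertices) {
    keep <- degrees > 0                # vertices with no remaining stubs can be dropped
    degrees <- degrees[keep]
    vertices <- vertices[keep]
    if (length(degrees) == 0) {
        return(list(matrix(integer(0), ncol = 2)))  # one realization: no edges left to place
    }
    d1 <- degrees[1]
    v1 <- vertices[1]
    drest <- degrees[-1]
    vrest <- vertices[-1]
    result <- list()
    if (d1 > length(vrest)) return(result)          # not enough partners for v1's stubs
    for (S in combn(seq_along(vrest), d1, simplify = FALSE)) {
        dreduced <- drest
        dreduced[S] <- dreduced[S] - 1              # reduced degree sequence
        if (!is_graphical(dreduced)) next           # prune: cannot lead to a simple graph
        for (g in generate(dreduced, vrest)) {
            result <- c(result, list(rbind(cbind(v1, vrest[S], deparse.level = 0), g)))
        }
    }
    result
}

# e.g. generate(rep(2, 4), 1:4) should return 3 edge lists, matching the
# three matrices in the question.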


I should note that there are more efficient ways to do the enumeration, but they are a bit more complicated. Here we take the first vertex, and connect all its stubs in all possible ways to the rest of vertices. Then out of these possibilities, we only consider the feasible ones, i.e. those that led to a graphical reduced degree sequence. It is possible to take just a single stub of the first vertex, connect it, and already decide if it is feasible to continue. This can be done using the star-constrained graphicality theorem from this paper.

Embayment answered 18/8, 2023 at 11:44 Comment(1)
Awesome answer with elaborations, +1! I really love the graph interpretation! Cheers!Skimmia

1) This generates N matrices, not necessarily distinct, that satisfy the row and column total constraints; it then filters them down to the ones that satisfy the other constraints and reduces the result to the unique matrices. It will not necessarily give all possible such matrices, but the code is short and maybe good enough.

set.seed(123)
n <- 4      # = no of rows = no of cols
p <- 2      # = row sums = col sums
N <- 1000   # = initial no of matrices to generate 

L <- r2dtable(N, rep(p,n), rep(p,n))
f <- function(x) max(x) == 1 && all(diag(x) == 0) && all(x == t(x))
L2 <- unique(Filter(f, L))
L2

giving:

[[1]]
     [,1] [,2] [,3] [,4]
[1,]    0    0    1    1
[2,]    0    0    1    1
[3,]    1    1    0    0
[4,]    1    1    0    0

[[2]]
     [,1] [,2] [,3] [,4]
[1,]    0    1    1    0
[2,]    1    0    0    1
[3,]    1    0    0    1
[4,]    0    1    1    0

[[3]]
     [,1] [,2] [,3] [,4]
[1,]    0    1    0    1
[2,]    1    0    1    0
[3,]    0    1    0    1
[4,]    1    0    1    0

2) If A is one feasible solution, i.e. an n-by-n matrix satisfying all constraints, then A[P, P] is one too for any permutation vector P. So, if the hypothesis that this generates all solutions is true, find one A using (1) or another method, apply all possible permutations, and take the unique elements. If this were done, a smaller N could be used in (1), since we only need one solution from it. This could be optimized further, since not all permutations are needed, but it is simple and the code is short, so maybe it is sufficient.

library(RcppAlgos)

L3 <- unique(permuteGeneral(4, FUN = \(x) L2[[1]][x, x]))
L3  # same as L2 except for order

giving:

[[1]]
     [,1] [,2] [,3] [,4]
[1,]    0    0    1    1
[2,]    0    0    1    1
[3,]    1    1    0    0
[4,]    1    1    0    0

[[2]]
     [,1] [,2] [,3] [,4]
[1,]    0    1    0    1
[2,]    1    0    1    0
[3,]    0    1    0    1
[4,]    1    0    1    0

[[3]]
     [,1] [,2] [,3] [,4]
[1,]    0    1    1    0
[2,]    1    0    0    1
[3,]    1    0    0    1
[4,]    0    1    1    0

3) This uses integer linear programming to get the solutions. We must provide n, p (as per the question) and nsolns (the number of solutions desired). It gives the same list as (1) and (2) here, except possibly for order.

library(lpSolve)

# inputs
n <- 4
p <- 2
nsolns <- 5 # no of solutions desired

d <- 0*diag(n)

# constraint matrix on rows (transposed)
cRows <- sapply(1:n, function(i) +(row(d) == i))

# constraints on columns (transposed)
cCols <- sapply(1:n, function(j) +(col(d) == j))

# constraint matrix to ensure symmetry (transposed)
cSym <- combn(n,2, FUN = function(ij) {
 i <- ij[1]
 j <- ij[2]
 d[i,j] <- 1
 d[j,i] <- -1
 c(d)
})

# constraint matrix on diagonal (transposed)
cDiag <- sapply(1:n, function(i) {
 d[i,i] <- 1
 c(d)
})

# combine constraints into single constraint matrix
L <- list(cRows, cCols, cDiag, cSym)
M <- t(do.call("cbind", L))

# define RHS of constraints
b <- rep(c(p,p,0,0), sapply(L, ncol))

# find solutions
res <- lp(objective.in = 0*M[1,], 
          const.mat = M, 
          const.dir = "=",
          const.rhs = b, 
          all.bin = TRUE,
          num.bin.solns = nsolns)

# remove junk at end of res$solution
solns.0 <- head(res$solution, nsolns * n * n)

# create 3d array of solutions
solns.a <- array(solns.0, c(n,n,nsolns))

# convert to list of solutions 
solns.L <- lapply(1:nsolns, function(i) solns.a[,,i])

# it will always return nsolns even if there are
# fewer so check for bad ones at end
solns <- Filter(function(x) !all(x == 0) &&
 all(x %in% 0:1), solns.L)

solns

Update

Have simplified (2) as per James Wood's comments. Added (3).

Chopper answered 14/8, 2023 at 16:3 Comment(4)
Thanks for the contribution, +1! Since those matrices are nicely structured, I guess there must be some properties that we can make use of to efficiently enumerate all the family instances. Your solution seems to use Monte Carlo sampling and keeps only the valid ones, but it would be inefficient for larger n or p.Skimmia
I think the hypothesis is true, since both row and column constraints are not violated and the diagonal entries remain untouched. The only issue with the second approach is efficiency: the number of permutations grows dramatically with the size of the matrix, and massive numbers of duplicates are generated, so that filtering out the unique ones becomes extremely computationally heavy.Skimmia
RcppAlgos author here. Very much agree with @Skimmia that the second approach will become untenable quickly as n increases. A quick note on RcppAlgos usage. You can utilize the FUN argument, similar to usage in combn, so there is no need for the extra call to lapply. Also, you could use iterators (e.g. permuteIter) in conjunction with FUN in order to build a solution that is both flexible and saves on memory.Fetial
Have added (2) and (3) and updated (2) based on James Wood's comment.Chopper

Brief disclaimer

This is a bad idea, due to the nature of the task. The number of solutions has a lower bound of something like choose(n - 1, p), which is the number of valid combinations for the first row alone (we omit the first index due to the zero-on-diagonal rule). Combinations grow fast, and since this bound only accounts for the first row, the true count will be much larger. The code I wrote claims that it generated 286884 solutions for n=10, p=2, and I'm not going to torture the poor machine by making it write them all down.
You have been warned.

I've built a solution that appears to be working. However, I'm not going to attempt to prove it beyond explaining what I did - there might be bugs lurking. So far it seems to generate what I would consider reasonable solutions in a quite straightforward manner. It generates the aforementioned 286884 solutions in ~7 seconds, and I consider that good enough for Python. Printing is a bigger bottleneck than generation anyway.
That being said, it is definitely possible to optimize this code further or to pick a better-suited language for this purpose. C, C++, or Rust will all outperform Python by quite a large factor.
The solution is also quite easy to parallelize if for some reason you really need to compute huge numbers of these matrices.

Small rant:

I failed to notice that you're using R and not Python. Sorry for that, but using this as a guide to create R code should be rather straightforward.
That being said, DO NOT DO IT. If your issue is slow code, do not use R. Just don't.
That said, the Rcpp package can give you an instant speedup if you rewrite just the core of your computation in C++ and source it from R, in case you need to handle some things in R.
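
As a small, generic illustration of that last point (rowSumInt is a made-up toy helper, not part of the enumeration itself):

# toy example: define a C++ helper and call it from R via Rcpp
library(Rcpp)

cppFunction('
int rowSumInt(IntegerMatrix m, int i) {
  int s = 0;
  for (int j = 0; j < m.ncol(); ++j) s += m(i, j);
  return s;
}')

rowSumInt(matrix(0:3, 2, 2), 0)  # sum of the first row (C++ indexing is 0-based)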

Let the combinatorics explode.

So the idea is this:

  • Set up a base matrix and a vector of row remainder_sums.
    • note that these sums also work as column sums due to symmetry
    • we want to manage this so that remainder_sums[i] == target_sum - already_filled_count[i]
    • so at the beginning it is simply a vector of target values that we will decrease as we go on.
  • Now to sanctify with dynamite:
    • recursion would quickly kill us, so we will use a dynamic (iterative) approach for the generation
    • so until we run out of rows we:
      • take the seed matrix for this particular solution
      • take the first row with a nonzero remainder_sum
        • its index is the row_idx
        • its value is the target
      • take the rest of the nonzero remainder_sums as our indices of interest
      • using these we will generate the seeds for the next round (and sometimes solutions)
        • this step generates all combinations of the indices of interest, where the number of indices selected is the target
        • for every such combination we do some bookkeeping:
          • set the remainder_sums[row_idx] to zero (we filled this row)
          • decrease all the affected sums (indices in the combination) by one
          • discard the combination if the row_idx matches any of the indices - this effectively prunes the solution tree when we would create an infeasible seed, although it might take some time to trigger (a possible point of optimization?)
          • update the seed matrix
          • if sum(remainder_sums) == 0, the updated matrix is one of the solutions
          • otherwise add the seed to the set for the new round in form (remainder_sums, seed_matrix)

And that's basically it.

Note that np.copy is used to bypass Python's passing of NumPy arrays by reference.

import numpy as np
import time

a = time.time()

n = 6
x = np.zeros(shape=[n, n], dtype=int)

target_sum = 2
current_sums = np.ones(shape=[n], dtype=int) * target_sum

row_partial_solutions = [(current_sums, x)]
next_row_partial_solutions = []
solution_aggregator = []
def generate_combinations(index_list, number_to_select):
    if number_to_select == 0:
        return []
    if number_to_select == 1:
        return [[x] for x in index_list]

    selected_indices = [[x] for x in range(len(index_list))]
    number_to_select -= 1
    while number_to_select > 0:
        next_level_indices = []
        for partial_solution in selected_indices:
            x = max(partial_solution)
            if x < len(index_list):
                for i in range(x + 1, len(index_list)):
                    next_level_indices.append(partial_solution + [i])
        selected_indices = next_level_indices
        number_to_select -= 1
    result = []
    for selection in selected_indices:
        index_set = [index_list[x] for x in selection]
        result.append(index_set)
    return result


for i in range(n - 1):
    # range(n - 1) because the last row is filled automatically via symmetry once all previous rows are filled
    for sums, matrix in row_partial_solutions:
        indices_to_select = []
        # scan from row i onward; the first unfilled row (first_index) is kept out of indices_to_select, so the diagonal stays zero
        first_index = -1
        for j in range(i, n):
            if sums[j] > 0:
                if first_index == -1:
                    first_index = j
                    select_count = sums[j]
                else:
                    indices_to_select.append(j)

        # we always fill the first row that isn't filled yet
        # by doing that, we subtract 1 from the affected columns/rows
        # (rows and columns filling is interchangeable due to the symmetry)
        if first_index == -1:
            break

        possible_selections = generate_combinations(indices_to_select, select_count)

        for selection in possible_selections:
            next_matrix = np.copy(matrix)
            next_sums = np.copy(sums)
            next_sums[first_index] = 0
            if first_index in selection:
                continue
            for idx in selection:
                if idx == first_index:
                    continue
                next_matrix[first_index][idx] = 1
                next_matrix[idx][first_index] = 1
                next_sums[idx] -= 1
            next_row_partial_solutions.append((next_sums, next_matrix))

    row_partial_solutions = []
    for sums, matrix in next_row_partial_solutions:
        if np.sum(sums) == 0:
            solution_aggregator.append(matrix)
        else:
            row_partial_solutions.append((sums, matrix))
    next_row_partial_solutions = []


for matrix in solution_aggregator:
    print(matrix)

print(len(solution_aggregator))

print(time.time() - a)

Rant & afterthought:

After I finished playing with this stuff I noticed that you're doing something similar or almost the same. That made me feel a bit less proud and a bit more silly. But well, it was fun. At least this enables me to give you some pointers if you decide to upgrade your code instead of adapting mine:

  • recomputing the sums over and over again is costly in the long run. Keeping them as a vector to pass around should be somewhat faster (albeit possibly not by much; memory allocation is also costly). It would be interesting to profile.
  • my code doesn't use the checker function, because it prunes the solutions during generation whenever the next seed is infeasible (or something along those lines). My discard mechanism still isn't too efficient (it leads to some possible dead ends), but at least it doesn't require checking every matrix for symmetry - the solution can't generate non-symmetric matrices.
  • that's basically the biggest part. You're overusing the checker and it's costly.
  • if in doubt, use a profiler; Rprof is a decent tool to see where you're spending the bulk of your time (a small example follows below).
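
A minimal Rprof session around the original f() from the question might look like this (an illustrative sketch; the profile file name is arbitrary):

Rprof("f_profile.out")                        # start collecting samples
invisible(f(6, 2))                            # run the enumeration being profiled
Rprof(NULL)                                   # stop profiling
head(summaryRprof("f_profile.out")$by.self)   # see where the time went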

This ain't a proof, but to showcase the possible result:

[[0 0 1 0 1 0]
 [0 0 1 0 0 1]
 [1 1 0 0 0 0]
 [0 0 0 0 1 1]
 [1 0 0 1 0 0]
 [0 1 0 1 0 0]]
[[0 0 1 0 0 1]
 [0 0 1 0 1 0]
 [1 1 0 0 0 0]
 [0 0 0 0 1 1]
 [0 1 0 1 0 0]
 [1 0 0 1 0 0]]
[[0 0 0 1 1 0]
 [0 0 0 1 0 1]
 [0 0 0 0 1 1]
 [1 1 0 0 0 0]
 [1 0 1 0 0 0]
 [0 1 1 0 0 0]]
[[0 0 0 1 1 0]
 [0 0 0 0 1 1]
 [0 0 0 1 0 1]
 [1 0 1 0 0 0]
 [1 1 0 0 0 0]
 [0 1 1 0 0 0]]
[[0 0 0 1 0 1]
 [0 0 0 1 1 0]
 [0 0 0 0 1 1]
 [1 1 0 0 0 0]
 [0 1 1 0 0 0]
 [1 0 1 0 0 0]]
[[0 0 0 1 0 1]
 [0 0 0 0 1 1]
 [0 0 0 1 1 0]
 [1 0 1 0 0 0]
 [0 1 1 0 0 0]
 [1 1 0 0 0 0]]
[[0 0 0 0 1 1]
 [0 0 0 1 1 0]
 [0 0 0 1 0 1]
 [0 1 1 0 0 0]
 [1 1 0 0 0 0]
 [1 0 1 0 0 0]]
[[0 0 0 0 1 1]
 [0 0 0 1 0 1]
 [0 0 0 1 1 0]
 [0 1 1 0 0 0]
 [1 0 1 0 0 0]
 [1 1 0 0 0 0]]
[[0 1 1 0 0 0]
 [1 0 1 0 0 0]
 [1 1 0 0 0 0]
 [0 0 0 0 1 1]
 [0 0 0 1 0 1]
 [0 0 0 1 1 0]]
[[0 1 1 0 0 0]
 [1 0 0 0 1 0]
 [1 0 0 0 0 1]
 [0 0 0 0 1 1]
 [0 1 0 1 0 0]
 [0 0 1 1 0 0]]
[[0 1 1 0 0 0]
 [1 0 0 0 0 1]
 [1 0 0 0 1 0]
 [0 0 0 0 1 1]
 [0 0 1 1 0 0]
 [0 1 0 1 0 0]]
[[0 1 0 1 0 0]
 [1 0 0 1 0 0]
 [0 0 0 0 1 1]
 [1 1 0 0 0 0]
 [0 0 1 0 0 1]
 [0 0 1 0 1 0]]
[[0 1 0 1 0 0]
 [1 0 0 0 1 0]
 [0 0 0 1 0 1]
 [1 0 1 0 0 0]
 [0 1 0 0 0 1]
 [0 0 1 0 1 0]]
[[0 1 0 1 0 0]
 [1 0 0 0 1 0]
 [0 0 0 0 1 1]
 [1 0 0 0 0 1]
 [0 1 1 0 0 0]
 [0 0 1 1 0 0]]
[[0 1 0 1 0 0]
 [1 0 0 0 0 1]
 [0 0 0 1 1 0]
 [1 0 1 0 0 0]
 [0 0 1 0 0 1]
 [0 1 0 0 1 0]]
[[0 1 0 1 0 0]
 [1 0 0 0 0 1]
 [0 0 0 0 1 1]
 [1 0 0 0 1 0]
 [0 0 1 1 0 0]
 [0 1 1 0 0 0]]
[[0 1 0 0 1 0]
 [1 0 1 0 0 0]
 [0 1 0 0 0 1]
 [0 0 0 0 1 1]
 [1 0 0 1 0 0]
 [0 0 1 1 0 0]]
[[0 1 0 0 1 0]
 [1 0 0 1 0 0]
 [0 0 0 1 0 1]
 [0 1 1 0 0 0]
 [1 0 0 0 0 1]
 [0 0 1 0 1 0]]
[[0 1 0 0 1 0]
 [1 0 0 1 0 0]
 [0 0 0 0 1 1]
 [0 1 0 0 0 1]
 [1 0 1 0 0 0]
 [0 0 1 1 0 0]]
[[0 1 0 0 1 0]
 [1 0 0 0 1 0]
 [0 0 0 1 0 1]
 [0 0 1 0 0 1]
 [1 1 0 0 0 0]
 [0 0 1 1 0 0]]
[[0 1 0 0 1 0]
 [1 0 0 0 0 1]
 [0 0 0 1 1 0]
 [0 0 1 0 0 1]
 [1 0 1 0 0 0]
 [0 1 0 1 0 0]]
[[0 1 0 0 1 0]
 [1 0 0 0 0 1]
 [0 0 0 1 0 1]
 [0 0 1 0 1 0]
 [1 0 0 1 0 0]
 [0 1 1 0 0 0]]
[[0 1 0 0 0 1]
 [1 0 1 0 0 0]
 [0 1 0 0 1 0]
 [0 0 0 0 1 1]
 [0 0 1 1 0 0]
 [1 0 0 1 0 0]]
[[0 1 0 0 0 1]
 [1 0 0 1 0 0]
 [0 0 0 1 1 0]
 [0 1 1 0 0 0]
 [0 0 1 0 0 1]
 [1 0 0 0 1 0]]
[[0 1 0 0 0 1]
 [1 0 0 1 0 0]
 [0 0 0 0 1 1]
 [0 1 0 0 1 0]
 [0 0 1 1 0 0]
 [1 0 1 0 0 0]]
[[0 1 0 0 0 1]
 [1 0 0 0 1 0]
 [0 0 0 1 1 0]
 [0 0 1 0 0 1]
 [0 1 1 0 0 0]
 [1 0 0 1 0 0]]
[[0 1 0 0 0 1]
 [1 0 0 0 1 0]
 [0 0 0 1 0 1]
 [0 0 1 0 1 0]
 [0 1 0 1 0 0]
 [1 0 1 0 0 0]]
[[0 1 0 0 0 1]
 [1 0 0 0 0 1]
 [0 0 0 1 1 0]
 [0 0 1 0 1 0]
 [0 0 1 1 0 0]
 [1 1 0 0 0 0]]
[[0 0 1 1 0 0]
 [0 0 1 0 1 0]
 [1 1 0 0 0 0]
 [1 0 0 0 0 1]
 [0 1 0 0 0 1]
 [0 0 0 1 1 0]]
[[0 0 1 1 0 0]
 [0 0 1 0 0 1]
 [1 1 0 0 0 0]
 [1 0 0 0 1 0]
 [0 0 0 1 0 1]
 [0 1 0 0 1 0]]
[[0 0 1 1 0 0]
 [0 0 0 1 1 0]
 [1 0 0 0 0 1]
 [1 1 0 0 0 0]
 [0 1 0 0 0 1]
 [0 0 1 0 1 0]]
[[0 0 1 1 0 0]
 [0 0 0 1 0 1]
 [1 0 0 0 1 0]
 [1 1 0 0 0 0]
 [0 0 1 0 0 1]
 [0 1 0 0 1 0]]
[[0 0 1 1 0 0]
 [0 0 0 0 1 1]
 [1 0 0 1 0 0]
 [1 0 1 0 0 0]
 [0 1 0 0 0 1]
 [0 1 0 0 1 0]]
[[0 0 1 1 0 0]
 [0 0 0 0 1 1]
 [1 0 0 0 1 0]
 [1 0 0 0 0 1]
 [0 1 1 0 0 0]
 [0 1 0 1 0 0]]
[[0 0 1 1 0 0]
 [0 0 0 0 1 1]
 [1 0 0 0 0 1]
 [1 0 0 0 1 0]
 [0 1 0 1 0 0]
 [0 1 1 0 0 0]]
[[0 0 1 0 1 0]
 [0 0 1 1 0 0]
 [1 1 0 0 0 0]
 [0 1 0 0 0 1]
 [1 0 0 0 0 1]
 [0 0 0 1 1 0]]
[[0 0 1 0 1 0]
 [0 0 0 1 1 0]
 [1 0 0 0 0 1]
 [0 1 0 0 0 1]
 [1 1 0 0 0 0]
 [0 0 1 1 0 0]]
[[0 0 1 0 1 0]
 [0 0 0 1 0 1]
 [1 0 0 1 0 0]
 [0 1 1 0 0 0]
 [1 0 0 0 0 1]
 [0 1 0 0 1 0]]
[[0 0 1 0 1 0]
 [0 0 0 1 0 1]
 [1 0 0 0 1 0]
 [0 1 0 0 0 1]
 [1 0 1 0 0 0]
 [0 1 0 1 0 0]]
[[0 0 1 0 1 0]
 [0 0 0 1 0 1]
 [1 0 0 0 0 1]
 [0 1 0 0 1 0]
 [1 0 0 1 0 0]
 [0 1 1 0 0 0]]
[[0 0 1 0 1 0]
 [0 0 0 0 1 1]
 [1 0 0 1 0 0]
 [0 0 1 0 0 1]
 [1 1 0 0 0 0]
 [0 1 0 1 0 0]]
[[0 0 1 0 0 1]
 [0 0 1 1 0 0]
 [1 1 0 0 0 0]
 [0 1 0 0 1 0]
 [0 0 0 1 0 1]
 [1 0 0 0 1 0]]
[[0 0 1 0 0 1]
 [0 0 0 1 1 0]
 [1 0 0 1 0 0]
 [0 1 1 0 0 0]
 [0 1 0 0 0 1]
 [1 0 0 0 1 0]]
[[0 0 1 0 0 1]
 [0 0 0 1 1 0]
 [1 0 0 0 1 0]
 [0 1 0 0 0 1]
 [0 1 1 0 0 0]
 [1 0 0 1 0 0]]
[[0 0 1 0 0 1]
 [0 0 0 1 1 0]
 [1 0 0 0 0 1]
 [0 1 0 0 1 0]
 [0 1 0 1 0 0]
 [1 0 1 0 0 0]]
[[0 0 1 0 0 1]
 [0 0 0 1 0 1]
 [1 0 0 0 1 0]
 [0 1 0 0 1 0]
 [0 0 1 1 0 0]
 [1 1 0 0 0 0]]
[[0 0 1 0 0 1]
 [0 0 0 0 1 1]
 [1 0 0 1 0 0]
 [0 0 1 0 1 0]
 [0 1 0 1 0 0]
 [1 1 0 0 0 0]]
[[0 0 0 1 1 0]
 [0 0 1 1 0 0]
 [0 1 0 0 0 1]
 [1 1 0 0 0 0]
 [1 0 0 0 0 1]
 [0 0 1 0 1 0]]
[[0 0 0 1 1 0]
 [0 0 1 0 1 0]
 [0 1 0 0 0 1]
 [1 0 0 0 0 1]
 [1 1 0 0 0 0]
 [0 0 1 1 0 0]]
[[0 0 0 1 1 0]
 [0 0 1 0 0 1]
 [0 1 0 1 0 0]
 [1 0 1 0 0 0]
 [1 0 0 0 0 1]
 [0 1 0 0 1 0]]
[[0 0 0 1 1 0]
 [0 0 1 0 0 1]
 [0 1 0 0 1 0]
 [1 0 0 0 0 1]
 [1 0 1 0 0 0]
 [0 1 0 1 0 0]]
[[0 0 0 1 1 0]
 [0 0 1 0 0 1]
 [0 1 0 0 0 1]
 [1 0 0 0 1 0]
 [1 0 0 1 0 0]
 [0 1 1 0 0 0]]
[[0 0 0 1 0 1]
 [0 0 1 1 0 0]
 [0 1 0 0 1 0]
 [1 1 0 0 0 0]
 [0 0 1 0 0 1]
 [1 0 0 0 1 0]]
[[0 0 0 1 0 1]
 [0 0 1 0 1 0]
 [0 1 0 1 0 0]
 [1 0 1 0 0 0]
 [0 1 0 0 0 1]
 [1 0 0 0 1 0]]
[[0 0 0 1 0 1]
 [0 0 1 0 1 0]
 [0 1 0 0 1 0]
 [1 0 0 0 0 1]
 [0 1 1 0 0 0]
 [1 0 0 1 0 0]]
[[0 0 0 1 0 1]
 [0 0 1 0 1 0]
 [0 1 0 0 0 1]
 [1 0 0 0 1 0]
 [0 1 0 1 0 0]
 [1 0 1 0 0 0]]
[[0 0 0 1 0 1]
 [0 0 1 0 0 1]
 [0 1 0 0 1 0]
 [1 0 0 0 1 0]
 [0 0 1 1 0 0]
 [1 1 0 0 0 0]]
[[0 0 0 0 1 1]
 [0 0 1 1 0 0]
 [0 1 0 1 0 0]
 [0 1 1 0 0 0]
 [1 0 0 0 0 1]
 [1 0 0 0 1 0]]
[[0 0 0 0 1 1]
 [0 0 1 1 0 0]
 [0 1 0 0 1 0]
 [0 1 0 0 0 1]
 [1 0 1 0 0 0]
 [1 0 0 1 0 0]]
[[0 0 0 0 1 1]
 [0 0 1 1 0 0]
 [0 1 0 0 0 1]
 [0 1 0 0 1 0]
 [1 0 0 1 0 0]
 [1 0 1 0 0 0]]
[[0 0 0 0 1 1]
 [0 0 1 0 1 0]
 [0 1 0 1 0 0]
 [0 0 1 0 0 1]
 [1 1 0 0 0 0]
 [1 0 0 1 0 0]]
[[0 0 0 0 1 1]
 [0 0 1 0 0 1]
 [0 1 0 1 0 0]
 [0 0 1 0 1 0]
 [1 0 0 1 0 0]
 [1 1 0 0 0 0]]
[[0 1 1 0 0 0]
 [1 0 0 1 0 0]
 [1 0 0 0 1 0]
 [0 1 0 0 0 1]
 [0 0 1 0 0 1]
 [0 0 0 1 1 0]]
[[0 1 1 0 0 0]
 [1 0 0 1 0 0]
 [1 0 0 0 0 1]
 [0 1 0 0 1 0]
 [0 0 0 1 0 1]
 [0 0 1 0 1 0]]
[[0 1 1 0 0 0]
 [1 0 0 0 1 0]
 [1 0 0 1 0 0]
 [0 0 1 0 0 1]
 [0 1 0 0 0 1]
 [0 0 0 1 1 0]]
[[0 1 1 0 0 0]
 [1 0 0 0 0 1]
 [1 0 0 1 0 0]
 [0 0 1 0 1 0]
 [0 0 0 1 0 1]
 [0 1 0 0 1 0]]
[[0 1 0 1 0 0]
 [1 0 1 0 0 0]
 [0 1 0 0 1 0]
 [1 0 0 0 0 1]
 [0 0 1 0 0 1]
 [0 0 0 1 1 0]]
[[0 1 0 1 0 0]
 [1 0 1 0 0 0]
 [0 1 0 0 0 1]
 [1 0 0 0 1 0]
 [0 0 0 1 0 1]
 [0 0 1 0 1 0]]
[[0 1 0 0 1 0]
 [1 0 1 0 0 0]
 [0 1 0 1 0 0]
 [0 0 1 0 0 1]
 [1 0 0 0 0 1]
 [0 0 0 1 1 0]]
[[0 1 0 0 0 1]
 [1 0 1 0 0 0]
 [0 1 0 1 0 0]
 [0 0 1 0 1 0]
 [0 0 0 1 0 1]
 [1 0 0 0 1 0]]
70
0.009359121322631836

Gormand answered 17/8, 2023 at 12:32 Comment(1)
Thanks for your effort and nice attempt, +1! I think both of us are thinking in a similar way, and I like your analysis part. I am actually not after absolute speed for finishing the task, but would like to see possible improvements in the algorithmic aspects. Anyway, cheers for your answer!Skimmia

Here is a partial answer that might help in working towards a more optimised solution.

I had a go at constructing a single matrix that satisfies your conditions using the following approach...

n <- 4
p <- 2
mat <- matrix(0, n, n)

for(Row in 1:(n-1)){
  row_sums <- rowSums(mat)
  this_row_sum <- row_sums[Row]
  avail_cols <- which(row_sums < p)           #not already filled...
  avail_cols <- avail_cols[avail_cols > Row]  #and in the upper diagonal
  if(this_row_sum < p){
    mat[Row, tail(avail_cols, p-this_row_sum)] <- 1 #fill from end
  }
  mat <- pmax(mat, t(mat))                    #keep symmetric
}

This successfully reproduces your examples above where the 1s appear in square blocks.

By trying a few examples - and seeing where the algorithm fails - it seems to me that

  • either p must divide n, or (n-p) must divide n, AND
  • n/p (or n/(n-p)) must be even

otherwise there is no way of constructing a matrix of square p-blocks that do not interfere with the zeros on the diagonal.

What I would like to then argue (but cannot prove) is that ANY matrix satisfying your conditions must be a permutation of mat as found above. Given the p-block structure, the number of distinct permutations must be n!/((p!)^(n/p) * (n/p)!) (or n!/(((n-p)!)^(n/(n-p)) * (n/(n-p))!) for the second case), i.e. the number of ways to split the n vertices into equal-sized groups. This also agrees with your results - n=4, p=2 gives 3 possible matrices, and n=3, p=2 gives one.
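
A quick numerical check of this count (count_blocks is just a throwaway helper for the formula above, only meaningful when the divisibility conditions hold):

# number of ways to split n vertices into n/p unlabelled groups of size p
count_blocks <- function(n, p) factorial(n) / (factorial(p)^(n / p) * factorial(n / p))

count_blocks(4, 2)      # 3, matching the n = 4, p = 2 example
count_blocks(3, 3 - 2)  # 1, using n - p in place of p for the n = 3, p = 2 case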

The task (assuming my conjectures above can be proved) then reduces to finding the distinct permutations and applying them to mat.

Ideograph answered 16/8, 2023 at 17:35 Comment(3)
One possible typo: I think you should use this_row_sum < p as the condition in your if statement. Generally, I would say you made a nice attempt (thus upvoted, +1!), but your conjectures are not true. You can try n <- 7 and p <- 4 with my code, which produces 465 matrices of the desired kind.Skimmia
You can refer to the comment by @Gurdwara about r-regular graphs. I don't think the problem with these constraints is as trivial as it looks.Skimmia
@Skimmia Yes - you're right about the typo - I've corrected it. Shame about the conjectures - but I'm relieved that my algorithm does produce a result for 7,4.Ideograph

If you are not tightly tied to using R, I would recommend using constraint satisfaction tools to approach this task, e.g. MiniZinc. I haven't used it for a while, but you could start from something like this:

int: h = 4;                     % matrix dimension (n in the question)
int: p = 2;                     % required row/column sum
set of int: HEIGHT = 1..h;
set of int: WIDTH = 1..h;

array[HEIGHT,WIDTH] of var 0..1: t;

% diagonal constraint
constraint forall(i in HEIGHT)( t[i,i] = 0 );

% symmetry constraint
constraint forall(i in HEIGHT, j in WIDTH where i < j)( t[i,j] = t[j,i] );

% sum constraints
constraint forall(i in HEIGHT)( sum(j in WIDTH)( t[i,j] ) = p );
constraint forall(j in WIDTH) ( sum(i in HEIGHT)( t[i,j] ) = p );

solve satisfy;

Call the solver like this:

minizinc ./test.mzn --all-solutions

I honestly believe that this is one of the best and most efficient ways to approach tasks like this. Such solvers are heavily optimized and offer you great flexibility in problem definition, settings, constraints, and so on. (NB: I finished a PhD in a closely related domain)

Hashum answered 16/8, 2023 at 8:56 Comment(4)
Seems an interesting proposal, +1! I am, actually, more curious about the algorithms applied under the hood for this problem.Skimmia
I would recommend msoos.org/cryptominisat5 as a starting point, it highlights most of the useful keywords for you :) but in the end it boils down to brute forceHashum
One small note - I've applied a similar approach to real-world problems in electronic design automation. It scales pretty well up to thousands of variables and tens of thousands of constraints.Hashum
sounds promising if it scales well.Skimmia
