Convert matrix to block matrix by permuting rows and columns

Asked 31/5, 2024 at 6:4 Answered 8/6, 2024 at 18:31

Solved algorithm matrix permutation mathematical-optimization

I have a rectangular matrix with n rows and m columns. All entries of the matrix are natural numbers (including 0).

Among the m columns, I'm given some index, j (< m). I'd like the matrix to become a block matrix as shown below.

For the first i rows (we can choose any i <= n we want), every entry to the right of j should be 0. And for the next (n-i) rows, every entry to the left of index j (and including j) should be 0.

If this is impossible, the sum of the entries in the two regions that should be zero (the areas shaded dark grey in the figure) should be as small as possible.

The only operations allowed on the original matrix are swapping rows and swapping columns. I'm interested in an efficient algorithm to achieve this.
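For reference, the quantity being minimized can be computed roughly as in the sketch below. The function name `off_block_sum` and the 0-based convention (the first `j` columns form the left block and the first `i` rows form the top block) are assumptions made for illustration; they are not part of the original question.

```python
import numpy as np

def off_block_sum(A, row_order, col_order, i, j):
    """Sum of the entries that should be zero after permuting rows and columns.

    Assumption: after reordering, the top block is rows [0, i) x columns [0, j)
    and the bottom block is rows [i, n) x columns [j, m).
    """
    B = np.asarray(A)[np.ix_(row_order, col_order)]
    top_right = B[:i, j:].sum()    # top rows, columns right of the split
    bottom_left = B[i:, :j].sum()  # bottom rows, columns left of the split
    return int(top_right + bottom_left)
```

Any candidate solution (a row order, a column order and a choice of i) can be compared against another by calling this function with the fixed j.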

Edit

Here is the CSV file for the original matrix (real world data from my application): https://github.com/ryu577/optimizn/blob/master/optimizn/ab_split/testing/arrs_orig.csv

The j is 25.

I have achieved a score of 561 on this matrix with simulated annealing. I would be interested to see if anyone can beat this.

Undercool answered 31/5, 2024 at 6:4 Comment(13)

If you view the matrix as an adjacency matrix of a graph, then the blocks correspond to the connected components. There are efficient algorithms to find those. – Myrwyn 31/5, 2024 at 6:40

But the M1 and M2 needn't be strongly connected component adjacency matrices. – Undercool 31/5, 2024 at 6:42

Also, the matrix is rectangular. – Undercool 31/5, 2024 at 6:51

You can view it as a bipartite graph, it works basically the same. – Myrwyn 31/5, 2024 at 7:6
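A minimal sketch of this idea, assuming rows and columns are the two sides of a bipartite graph with an edge wherever an entry is nonzero. Each connected component has to land entirely in one block, so an exact block form with the first `j` columns on the left exists exactly when some set of components contributes exactly `j` columns. The function name and the union-find helper are illustrative. Note that this only checks whether the exact (zero-score) form is feasible; it does not minimize the score when that form is impossible.

```python
import numpy as np

def exact_block_form_possible(A, j):
    """Return True if rows/columns can be permuted into an exact block form
    whose left block has exactly j columns (assumed convention)."""
    A = np.asarray(A)
    n, m = A.shape
    parent = list(range(n + m))  # nodes 0..n-1 are rows, n..n+m-1 are columns

    def find(x):
        # Union-find lookup with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Join each row to every column in which it has a nonzero entry
    for r, c in zip(*np.nonzero(A)):
        parent[find(int(r))] = find(n + int(c))

    # Count the columns in each component (all-zero columns form their own)
    col_counts = {}
    for c in range(m):
        root = find(n + c)
        col_counts[root] = col_counts.get(root, 0) + 1

    # Subset sum: can some set of components supply exactly j columns?
    reachable = {0}
    for count in col_counts.values():
        reachable |= {s + count for s in reachable if s + count <= j}
    return j in reachable
```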

First pass would be to count the number of 0's in each row which puts bounds on the value of j and immediately tells you if your target block form is impossible. You can do the same (less efficiently) for each column which puts bounds on k in the same way. Sorting the rows by number of zeroes in them seems a reasonable way to proceed after that. Column swaps will be more expensive so the fewer of them the better. – Encephalitis 31/5, 2024 at 8:9

If M1 and M2 are completely non zero, then the number of zeros in each row and column would be the same. Maybe that's an edge case. And if we transpose the matrix before column swaps, they become cheap as well, no? – Undercool 31/5, 2024 at 15:32

Although I generally frown on comments that convey merely hunches, I'll go out on a limb and speculate that this may be a well-studied problem in linear algebra. – Monecious 1/6, 2024 at 5:45

Not clear whether i is given or is an output of the algorithm. – Suzysuzzy 3/6, 2024 at 7:8

@Suzysuzzy Regarding "Not clear whether i is given or [an output]": the question says "we can choose any i <= n we want". – Mazarin 3/6, 2024 at 7:33

@RohitPandey What's the meaning of score (561)? – Propagandist 11/6, 2024 at 8:57

It means if I sum the entries in the two grey regions, I get 561. – Undercool 12/6, 2024 at 3:37

@RohitPandey My code slightly modified to prevent an infinite sorting loop of a sparse matrix yields a minimum of 30 at i=49 and j=166. – Propagandist 13/6, 2024 at 8:26

@SudoKoach - sorry, I should have clarified here.. the j is fixed at 25. We can move the i, but we can't move the j (the question clarifies this). – Undercool 13/6, 2024 at 21:45

Two practical approaches:

1. Create an optimization problem that minimizes the sum of squares of the entries of the target matrix that should be zero, over two permutation matrices: the matrix C of dimensions m x m (column permutation) and the matrix R of dimensions n x n (row permutation). Add the requirements RR^T=I and CC^T=I using Lagrange multipliers. Run the optimization to find the optimal C and R. Then add a regularizer to the target function that pushes those matrices toward proper permutation matrices (i.e. entries that are 0s and 1s).
2. Use genetic optimization.

Neither guarantees an optimal solution.
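To make the first approach more concrete, here is a rough sketch of the pieces an optimizer would need to evaluate. It assumes i is held fixed (the question lets i vary, so one could repeat this for each i), that the first `j` columns form the left block, and that R and C are real-valued relaxations of permutation matrices. The function name `relaxed_objective` is illustrative. How the constraint residuals are combined with the objective (Lagrange multipliers, penalties, or otherwise) is left to whichever optimizer is used.

```python
import numpy as np

def relaxed_objective(A, R, C, i, j):
    """Evaluate the relaxed problem for candidate matrices R (n x n) and C (m x m).

    Returns the sum of squares of the entries of R @ A @ C that should be zero,
    plus the Frobenius-norm residuals of the constraints R R^T = I and C C^T = I.
    """
    B = R @ A @ C  # R reorders (mixes) rows, C reorders (mixes) columns
    off_block = np.sum(B[:i, j:] ** 2) + np.sum(B[i:, :j] ** 2)
    row_residual = np.linalg.norm(R @ R.T - np.eye(R.shape[0]))
    col_residual = np.linalg.norm(C @ C.T - np.eye(C.shape[0]))
    return off_block, row_residual, col_residual
```

For true permutation matrices both residuals are zero and the first value is the sum of squares of the off-block entries of the permuted matrix.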

Pean answered 8/6, 2024 at 18:31 Comment(2)

Thanks! Didn't get the part about adding a regularizer to turn the matrices into proper permutation? – Undercool 9/6, 2024 at 6:3

Add a term to your target function that will favor zeros and ones in R and C. For example, α|(x-0.5)²-0.25|, where x is each coefficient of R and C, and α is gradually increasing in your iterative optimizer, whatever it might be. – Pean 9/6, 2024 at 8:50
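As a small illustration of that term (the function name `binary_penalty` is made up here), the penalty is zero exactly when every coefficient is 0 or 1 and positive otherwise, so increasing α gradually pushes R and C toward 0/1 matrices:

```python
import numpy as np

def binary_penalty(X, alpha):
    """alpha * sum over coefficients x of |(x - 0.5)^2 - 0.25|."""
    X = np.asarray(X, dtype=float)
    return alpha * np.sum(np.abs((X - 0.5) ** 2 - 0.25))
```

In an iterative optimizer this value would be added to the target function for both R and C, with `alpha` raised between iterations.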

This is a size-constrained min-cut problem. It's NP-hard to solve exactly.

Since you've asked the question in terms of matrices, I'm going to go out on a limb and assume that you have ready access to a library that can compute a partial singular value decomposition. If so, let me suggest a cheap spectral heuristic.

Find the m-element (right) singular vector corresponding to the largest singular value. Assuming that the elements of this vector are distinct (without loss of generality by perturbation or by using indices to break ties), select the element of order j; the elements with lesser values correspond to the first j-1 columns in the output, and the elements with greater values correspond to the last m-j columns. With the columns so arranged, each row "votes" on whether it should be in the top half or the bottom half.
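A rough sketch of the heuristic with NumPy follows. It assumes the first `j` columns of the sorted order form the left block, and it makes one addition that is not in the description above: because a singular vector is only defined up to sign, both sort directions are tried and the better one is kept. The function name `spectral_split` and the return layout are illustrative.

```python
import numpy as np

def spectral_split(A, j):
    """Order columns by the top right singular vector, then let rows vote.

    Returns (score, row_order, col_order, i), where score is the sum of the
    entries that end up outside the two blocks.
    """
    A = np.asarray(A, dtype=float)
    # Rows of vt are right singular vectors; vt[0] goes with the largest singular value
    _, _, vt = np.linalg.svd(A, full_matrices=False)
    v = vt[0]
    best = None
    for sign in (1.0, -1.0):  # the sign of v is arbitrary, so try both directions
        col_order = np.argsort(sign * v, kind="stable")
        left = A[:, col_order[:j]].sum(axis=1)   # row mass in the first j columns
        right = A[:, col_order[j:]].sum(axis=1)  # row mass in the remaining columns
        top = left >= right                      # each row votes for its heavier side
        score = np.where(top, right, left).sum() # mass left on the wrong side
        if best is None or score < best[0]:
            row_order = np.concatenate([np.flatnonzero(top), np.flatnonzero(~top)])
            best = (score, row_order, col_order, int(top.sum()))
    return best
```

Since i is free, the row vote is optimal for whatever column order was chosen: each row independently goes to the side where it leaves less mass outside its block.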

Zaratite answered 5/6, 2024 at 13:37 Comment(8)

Thanks. Isn't this the same as sorting the columns by the entries of the largest singular vector? Why would that work? And would the voting among the rows work? – Undercool 7/6, 2024 at 6:30

@RohitPandey yes, you can sort. – Zaratite 7/6, 2024 at 11:11

Sorry, and I meant to ask.. how would the voting among rows work? Maybe sort by the sums before j? – Undercool 7/6, 2024 at 16:30

@RohitPandey yep! signum([sum before j] - [sum after j]) – Zaratite 7/6, 2024 at 18:16

j is given... does this still hold? – Whitworth 8/6, 2024 at 21:0

@Whitworth the key here is that when the voting happens, we've chosen a column order. – Zaratite 8/6, 2024 at 23:22

Can you please clarify why the order of elements of the largest singular vector can be leveraged in this manner? – Undercool 9/6, 2024 at 6:7

@RohitPandey this subject not being my specialty, my intuition is kind of basic, but if M_1 and M_2 were all ones, then there would be two pairs of singular vectors: the first pair would have one vector supported in the first j columns and one vector supported in the first i rows; the second pair would have one vector supported in the other columns and one vector supported in the other rows. The SVD is indifferent to the permutations of rows and columns, which is why it's able to recover the structure. – Zaratite 9/6, 2024 at 15:19

Here is an “old school” solution for clarifying the requirements and for benchmarking. If the only operations allowed are "swapping rows and swapping columns", then the matrix can only be partitioned by sorting its rows and its columns. Rows are sorted in ascending order of their weighted average column index (the entries act as weights). Columns are sorted in ascending order of their weighted average row index.

import numpy as np

rng = np.random.default_rng()
N = 7   # number of rows
M = 15  # number of columns
A = rng.integers(0, 4, (N, M))  # random test matrix with entries in 0..3

def getWACols(A, N, M):
    # Weighted average column index of each row (the entries are the weights).
    # All-zero rows get the middle column index.
    rowsum = np.sum(A, axis=1)
    wsum = A @ np.arange(M)
    safe = np.where(rowsum != 0, rowsum, 1)  # avoid dividing by zero
    return np.where(rowsum != 0, wsum / safe, (M - 1) / 2)

def getWARows(A, N, M):
    # Weighted average row index of each column (the entries are the weights).
    # All-zero columns get the middle row index.
    colsum = np.sum(A, axis=0)
    wsum = np.arange(N) @ A
    safe = np.where(colsum != 0, colsum, 1)  # avoid dividing by zero
    return np.where(colsum != 0, wsum / safe, (N - 1) / 2)

def sortByRow(A, N, M):
    # Reorder the rows by ascending weighted average column index.
    # The flag is True when the rows were already in sorted order.
    # A stable sort keeps tied rows in place.
    order = np.argsort(getWACols(A, N, M), kind="stable")
    return A[order, :], bool(np.all(order == np.arange(N)))

def sortByCol(A, N, M):
    # Reorder the columns by ascending weighted average row index.
    # The flag is True when the columns were already in sorted order.
    order = np.argsort(getWARows(A, N, M), kind="stable")
    return A[:, order], bool(np.all(order == np.arange(M)))

def partitionMatrix(A, N, M):
    # One pass: sort the rows, then the columns.
    # Returns the new matrix and whether neither sort changed anything.
    B, rows_unchanged = sortByRow(A, N, M)
    B, cols_unchanged = sortByCol(B, N, M)
    return B, rows_unchanged and cols_unchanged

print(A)
B=np.copy(A)
bn=False
cntr=0
while not bn and cntr<100:
    B,bn=partitionMatrix(B,N,M)
    cntr+=1
print(B)

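# Estimate the split point (i, j) as the mean weighted average row and column index.
# getWACols returns one value per row and getWARows one value per column.
j=np.mean(getWACols(B,N,M))
i=np.mean(getWARows(B,N,M))

print(i,j)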

Propagandist answered 5/6, 2024 at 15:58 Comment(1)

Thanks for the solution. I tried this on my test case and it didn't seem to beat my best solution. But I put the test case in the question in case you want to confirm. – Undercool 9/6, 2024 at 17:31
