Proposing an algorithm for arbitrary shape Bit Matrix Transposition with BDD-like structure

We consider a bit matrix (n x m) to be a regular array containing n lines of integers of size m.

I have looked in Hacker's Delight and in other sources and the algorithms I found for this were rather specialized: square matrices with powers of two sizes like 8x8, 32x32, 64x64, etc. (which is normal because the machine is built that way).

I thought of a more general algorithm (for arbitrary n and m) which is, in the worse case, of the expected complexity (I think), but for matrices containing mostly similar columns, or more zeros than ones, the algorithm seems a bit more interesting (in the extreme, it is linear if the matrix contains the same line over and over). It follows a sort of Binary Decision Diagram manipulation.

The output is not a transposed matrix but a compressed transposed matrix: a list of pairs (V,L) where L is an int_m that indicates the lines of the transposed matrix (by setting the bits of the corresponding position) that should contain the int_n V. The lines of the transposed matrix not appearing in any of the pairs are filled with 0.

For example, for the matrix

1010111
1111000
0001010

having the transposed

the algorithm outputs:

(010,0100000)
(011,0001000)
(100,0000101)
(101,0000010)
(110,1010000)

and one reads the pair (100,0000101) as meaning "the value 100 is put in the 5th and the 7th line of the transposed matrix".

This is the algorithm (written in a pseudo-OCaml/C) and a picture of the progress of the algorithm on the above example.

We will run according to triples (index_of_current_line, V, L), which is of type (int, int_n, int_m), where int_n is the type of n-bit wide integers and int is just a machine integer wide enough to hold n. The function takes a list of these triples, the matrix, the number of lines and an accumulator for the output (list of pairs (int_m, int_n)) and returns, at some point, that accumulator.

list of (int_n, int_m) transpose(list of triple t, 
                                int_m[n] mat, 
                                int n,  
                                list of (int_n, int_m) acc)

The first call of the transpose function is

transpose([(0, 0, 2^m-1)], mat, n, []).

take "&", "|" "xor" to be the usual bit-wise operations

transpose(t, mat, n, acc) =
 match t with 
  | [] -> (* the list is empty, we're done *)
    return acc
  | (i, v, l)::tt -> 
    let colIn = mat[i] & l in
    (* colIn contains the positions that were set in the parent's mask "l" 
     and that are also set in the line "i" *)
    match colIn with
    |0 -> (* None of the positions are set in both, do not branch *)
     if (i<n) then (* not done with the matrix, simply move to next line *)
      transpose((i+1,v,l)::tt,mat,n,acc)
     else (* we reached the end of the matrix, we're at a leaf *)
       if (v>0) then
         transpose(tt,mat,n,(v,l)::acc)
       else  (* We ignore the null values and continue *)
         transpose(tt,mat,n,acc)
     |_ -> (* colIn is non null, ie some of the positions set at the parent
              mask "l" are also set in this line. If ALL the positions are, we 
              do not branch either. If only some of them are and some of them
              are not, we branch *)
         (* First, update v *)
        let vv = v | (2^(n-i-1)) in
         (* Then get the mask for the other branch *)
        let colOut = colIn xor l in, 
        match colOut with
        | 0 -> (* All are in, none are out, no need to branch *)
            if (i<n) then
                transpose((i+1,vv,colIn)::tt,mat,n,acc)
            else (* we reached the end of the matrix, we're at a leaf *)             
               transpose(tt,mat,n,(vv,colIn)::acc)
        | _ -> (* Some in, some out : now we branch *)
            if (i<n) then 
               transpose((i+1,vv,colIn)::(i+1,v,colOut)::tt,mat,n,acc)
            else 
             if (v>0) then
               transpose(tt,mat,n,(vv,colIn)::(v,colOut)::acc)
             else
               transpose(tt,mat,n,(vv,colIn)::acc)

Notice that if the matrix is wider than it is high, it is even faster (if n = 3 and m = 64 for instance)

My questions are :

Is this interesting and/or useful?
Am I reinventing the wheel?
Are "almost zero" matrices or "little differentiated-lines" matrices common enough for this to be interesting ?

PS: If anything doesn't seem clear, please do tell, I will rewrite whatever needs to be!

It's an interesting algorithm for matrices where there may be more zeros than ones or vice versa, however, there are many algorithms that exploit the use of numerical patterns to offer a certain degree of compression.
It could be useful for compressing but not for executing operations on it, for example:

Given a binary matrix M, it is possible to compress it into another N(A, B) -as you have indicated- but this will require a certain amount of computing power P, and the same to decompress.

M := P[N(A, B)]

Unless there are very specific patterns in the matrix, it will not be worth the computational effort P to work on its compressed form N(A, B).

However, it could be useful in the field of cryptography, since the operation P can be executed recursively on the matrix of the right block B. Which provides a way to express M in different factors of A.

M := N(A, B); B := N'(A', B'); ...
M := N(A, N'(A', B'))

In other words, with a set of A "keys" it's possible to compute M.

It may be useful in other scenarios but I don't know of any applications other than data compression and cryptography.

Recommended topics

Hot tags