Numpy ‘smart’ symmetric matrix
P

6

85

Is there a smart and space-efficient symmetric matrix in numpy which automatically (and transparently) fills the position at [j][i] when [i][j] is written to?

import numpy
a = numpy.symmetric((3, 3))
a[0][1] = 1
a[1][0] == a[0][1]
# True
print(a)
# [[0 1 0], [1 0 0], [0 0 0]]

assert numpy.all(a == a.T) # for any symmetric matrix

An automatic Hermitian would also be nice, although I won’t need that at the time of writing.

Pignus answered 3/4, 2010 at 22:39 Comment(4)
You might consider marking the answer as accepted, if it solves your problem. :)Trabeated
I wanted to wait for a better (i.e. built-in and memory-efficient) answer to come. There’s nothing wrong with your answer, of course, so I’ll accept it anyway.Pignus
I think to this day you can only subclass (no thanks) or wrap around numpy, e.g. wrapping around numpy by changing how you fill the matrix via your own setter functions, in order to get an interface that resembles that. I think you can also throw in masked arrays to avoid double downstream calculations as much as masked arrays support enough of your matrix manipulation scenarios. Nothing built in nor a generically robust way.Benedicite
numpy.all(a == a.T) doesn't seem to work for symmetric matrices with nans on the diagonal.Boaster
T
97

If you can afford to symmetrize the matrix just before doing calculations, the following should be reasonably fast:

def symmetrize(a):
    """
    Return a symmetrized version of NumPy array a.

    Values 0 are replaced by the array value at the symmetric
    position (with respect to the diagonal), i.e. if a_ij = 0,
    then the returned array a' is such that a'_ij = a_ji.

    Diagonal values are left untouched.

    a -- square NumPy array, such that a_ij = 0 or a_ji = 0, 
    for i != j.
    """
    return a + a.T - numpy.diag(a.diagonal())

This works under reasonable assumptions (such as not doing both a[0, 1] = 42 and the contradictory a[1, 0] = 123 before running symmetrize).
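As a minimal usage sketch (the matrices and values below are made up for illustration), one can fill only one triangle and symmetrize once before computing. The second half shows what happens when the assumption is violated: the two contradictory entries are simply added together.

```python
import numpy

def symmetrize(a):
    # Same formula as above: mirror the off-diagonal entries, count the diagonal once.
    return a + a.T - numpy.diag(a.diagonal())

# Fill only the upper triangle (and the diagonal).
a = numpy.zeros((3, 3))
a[0, 1] = 42
a[1, 2] = 7
a[2, 2] = 5

s = symmetrize(a)
# Now s[1, 0] mirrors a[0, 1], s[2, 1] mirrors a[1, 2], and s[2, 2] is unchanged.

# Violating the assumption: both a_01 and a_10 are set, so they get summed.
b = numpy.zeros((2, 2))
b[0, 1] = 42
b[1, 0] = 123
t = symmetrize(b)
# t is symmetric, but t[0, 1] is 42 + 123, which is probably not what was intended.
```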

If you really need a transparent symmetrization, you might consider subclassing numpy.ndarray and simply redefining __setitem__:

class SymNDArray(numpy.ndarray):
    """
    NumPy array subclass for symmetric matrices.

    A SymNDArray arr is such that doing arr[i,j] = value
    automatically does arr[j,i] = value, so that array
    updates remain symmetrical.
    """

    def __setitem__(self, (i, j), value):
        super(SymNDArray, self).__setitem__((i, j), value)                    
        super(SymNDArray, self).__setitem__((j, i), value)                    

def symarray(input_array):
    """
    Return a symmetrized version of the array-like input_array.

    The returned array has class SymNDArray. Further assignments to the array
    are thus automatically symmetrized.
    """
    return symmetrize(numpy.asarray(input_array)).view(SymNDArray)

# Example:
a = symarray(numpy.zeros((3, 3)))
a[0, 1] = 42
print a  # a[1, 0] == 42 too!

(or the equivalent with matrices instead of arrays, depending on your needs). This approach even handles more complicated assignments, like a[:, 1] = -1, which correctly sets a[1, :] elements.

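Note that Python 3 removed the possibility of writing def …(…, (i, j),…), so the code has to be slightly adapted before running with Python 3: declare the method as def __setitem__(self, indexes, value): and make (i, j) = indexes its first statement. The print statement must also become a function call, print(a).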
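For reference, a minimal Python 3 sketch of the same idea could look like the following. It keeps the class and function names used above, but as a simplifying assumption it skips the symmetrize() call and expects the input to be symmetric already.

```python
import numpy

class SymNDArray(numpy.ndarray):
    """Array subclass whose two-index assignments are mirrored across the diagonal."""

    def __setitem__(self, indexes, value):
        # Python 3: the index tuple is unpacked inside the method body.
        (i, j) = indexes
        super().__setitem__((i, j), value)
        super().__setitem__((j, i), value)

def symarray(input_array):
    """Return a SymNDArray view (assumption: input_array is already symmetric)."""
    return numpy.asarray(input_array).view(SymNDArray)

a = symarray(numpy.zeros((3, 3)))
a[0, 1] = 42      # also sets a[1, 0]
a[:, 2] = -1      # also sets row 2
print(a)
```

As with the original version, an assignment with a single index such as a[0] = [1, 2, 3] raises an error here, because a lone integer cannot be unpacked into (i, j).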

Trabeated answered 4/4, 2010 at 9:6 Comment(4)
Actually, if you do subclass it, you should not overwrite setitem, but rather getitem so that you do not cause more overhead on creating the matrix.Chocolate
This is a very interesting idea, but writing this as the equivalent __getitem__(self, (i, j)) fails when one does a simple print on a subclass instance array. The reason is that print calls __getitem__() with an integer index, so more work is required even for a simple print. The solution with __setitem__() works with print (obviously), but suffers from a similar problem: a[0] = [1, 2, 3] does not work, for the same reason (this is not a perfect solution). A __setitem__() solution has the advantage of being more robust, since the in-memory array is correct. Not too bad. :)Trabeated
your suggestion sounds like blog.sopticek.net/2016/07/24/…... Do you confirm it's almost the same ? Trouble is this optimizes the memory usage, not the computing time. I'm in search of python methods to speed-up some simple computations on symmetric matrices. please let me know if you have info.Testosterone
This answer doesn't save memory and is therefore very different from the approach in the quoted link. Now, saving time with symmetric matrices usually involves going through specialized algorithms instead of general ones, like using eigh() in NumPy instead of eig().Trabeated
L
25

The more general issue of optimal treatment of symmetric matrices in numpy bugged me too.

After looking into it, I think the answer is probably that numpy is somewhat constrained by the memory layout supported by the underlying BLAS routines for symmetric matrices.

While some BLAS routines do exploit symmetry to speed up computations on symmetric matrices, they still use the same memory structure as a full matrix, that is, n^2 space rather than n(n+1)/2. They are simply told that the matrix is symmetric and to use only the values in either the upper or the lower triangle.
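The "use only one triangle" behaviour can be observed from NumPy itself. The sketch below is only an illustration (the random test matrix and the 1e6 filler value are arbitrary choices): numpy.linalg.eigvalsh takes a UPLO argument, and with UPLO='L' whatever sits in the strict upper triangle is ignored, even though the array still occupies the full n^2 storage.

```python
import numpy

rng = numpy.random.default_rng(0)
m = rng.standard_normal((4, 4))
full = m + m.T                      # a genuinely symmetric matrix

# Keep the lower triangle and put garbage in the strict upper triangle.
garbage = numpy.triu(numpy.full((4, 4), 1e6), k=1)
half = numpy.tril(full) + garbage

# UPLO='L' tells the routine to read only the lower triangle,
# so the garbage above the diagonal does not affect the result.
w_half = numpy.linalg.eigvalsh(half, UPLO='L')
w_full = numpy.linalg.eigvalsh(full)
```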

Some of the scipy.linalg routines do accept flags (like sym_pos=True on linalg.solve, which later SciPy releases replaced with assume_a='pos') which get passed on to BLAS routines, although more support for this in numpy would be nice, in particular wrappers for routines like DSYRK (symmetric rank k update), which would allow a Gram matrix to be computed a fair bit quicker than dot(M.T, M).

(Might seem nitpicky to worry about optimising for a 2x constant factor on time and/or space, but it can make a difference to that threshold of how big a problem you can manage on a single machine...)

Lachrymator answered 26/2, 2012 at 18:6 Comment(4)
The question is about how to automatically create a symmetric matrix through the assignment of a single entry (not about how BLAS can be instructed to use symmetric matrices in its calculations or how symmetric matrices could in principle be stored more efficiently).Trabeated
The question is also about space-efficiency, so BLAS issues are on-topic.Heterogamete
@EOL, the question is not about how to automatically create a symmetric matrix through the assignment of a single entry.Caerphilly
Granted, "creating" could more appropriately be replaced by "updating". Now, since the question is explicitly about transparently setting M_ji when M_ij is set, and this answer is not about that, you understand that this is in essence the point I raised. The question is about how to efficiently do this (not about efficiently handling symmetric matrices, even though this might be the right question: something better put in the comments, or given as an answer that does solve the more general problem instead of merely discussing it).Trabeated
P
10

There are a number of well-known ways of storing symmetric matrices so they don't need to occupy n^2 storage elements. Moreover, it is feasible to rewrite common operations to access these revised means of storage. The definitive work is Golub and Van Loan, Matrix Computations, 3rd edition 1996, Johns Hopkins University Press, sections 1.2.7-1.2.9. For example, quoting them from form (1.2.2), in a symmetric matrix one only needs to store A = [a_{i,j}] for i >= j. Then, assuming the vector holding the matrix is denoted V, and that A is n-by-n, put a_{i,j} in

V[(j-1)n - j(j-1)/2 + i]

This assumes 1-indexing.
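For Python's 0-based indexing the same layout (lower triangle, stored column by column) shifts to k = j*n - j*(j+1)/2 + i for i >= j. The following is a rough sketch of that mapping; the helper names packed_index and pack are made up for this illustration and are not part of NumPy.

```python
import numpy

def packed_index(i, j, n):
    """0-based position of a_ij in the packed vector (lower triangle, column by column)."""
    if i < j:
        i, j = j, i          # symmetry: a_ij == a_ji
    return j * n - j * (j + 1) // 2 + i

def pack(a):
    """Copy the lower triangle of the symmetric matrix a into a vector of length n(n+1)/2."""
    n = a.shape[0]
    v = numpy.empty(n * (n + 1) // 2, dtype=a.dtype)
    for j in range(n):
        for i in range(j, n):
            v[packed_index(i, j, n)] = a[i, j]
    return v

a = numpy.array([[1, 2, 3],
                 [2, 4, 5],
                 [3, 5, 6]])
v = pack(a)
# Any entry, from either triangle, can now be read back through the index map:
entry = v[packed_index(0, 2, 3)]    # same value as a[0, 2]
```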

Golub and Van Loan offer an Algorithm 1.2.3 which shows how to access such a stored V to calculate y = A x + y.
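To give an idea of what such an operation looks like, here is a plain-loop sketch of a matrix-vector update computed directly from the packed vector. It is not the book's algorithm verbatim; it assumes the lower triangle is stored column by column with 0-based indices, and packed_symv is a hypothetical helper name.

```python
import numpy

def packed_symv(v, x, y):
    """Return y + A x, where v holds the lower triangle of symmetric A column by column."""
    n = len(x)
    out = numpy.array(y, dtype=float)
    k = 0                               # running position in the packed vector
    for j in range(n):
        out[j] += v[k] * x[j]           # diagonal entry a_jj
        k += 1
        for i in range(j + 1, n):
            out[i] += v[k] * x[j]       # a_ij * x_j
            out[j] += v[k] * x[i]       # a_ji * x_i, the same stored value by symmetry
            k += 1
    return out

a = numpy.array([[1., 2., 3.],
                 [2., 4., 5.],
                 [3., 5., 6.]])
v = numpy.array([1., 2., 3., 4., 5., 6.])   # lower triangle of a, column by column
x = numpy.array([1., 0., -1.])
y = numpy.array([10., 20., 30.])
result = packed_symv(v, x, y)
```

Each stored off-diagonal value is used twice, which is how the packed form avoids keeping both triangles in memory.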

Golub and Van Loan also provide a way of storing a matrix in diagonal dominant form. This does not save storage, but supports ready access for certain other kinds of operations.

Pentapody answered 3/7, 2014 at 20:47 Comment(2)
There is also the Rectangular Full Packed storage (RFP), for example Lapack ZPPTRF uses it. Is it supported by numpy?Harveyharvie
@isti_spl: No, but you could implement a wrapper that doesReading
D
1

This is plain python and not numpy, but I just threw together a routine to fill a symmetric matrix (and a test program to make sure it is correct):

import random

# Fill a symmetric matrix with costs (i.e. m[x][y] == m[y][x]).
# For demonstration purposes, this routine connects each node to all the others.
# Since a matrix stores the costs, numbers are used to represent the nodes
# so the row and column indices can represent nodes.

def fillCostMatrix(dim):        # square array of arrays
    # Create zero matrix
    new_square = [[0 for row in range(dim)] for col in range(dim)]
    # fill in main diagonal
    for v in range(0, dim):
        new_square[v][v] = random.randrange(1, 10)

    # fill upper and lower triangles symmetrically, one sub-diagonal at a time
    for v in range(1, dim):
        iterations = dim - v
        x = v
        y = 0
        while iterations > 0:
            new_square[x][y] = new_square[y][x] = random.randrange(1, 10)
            x += 1
            y += 1
            iterations -= 1
    return new_square

# sanity test
def test_symmetry(square):
    dim = len(square[0])
    isSymmetric = True
    for x in range(0, dim):
        for y in range(0, dim):
            if square[x][y] != square[y][x]:
                isSymmetric = False
    print("Matrix is symmetric" if isSymmetric else "Matrix is NOT symmetric")

def showSquare(square):
    # Print out square matrix
    columnHeader = ' '
    for i in range(len(square)):
        columnHeader += '  ' + str(i)
    print(columnHeader)

    for i, row in enumerate(square):
        print(i, row)    # print row number and data

def myMain(argv):
    if len(argv) == 1:
        nodeCount = 6
    else:
        try:
            nodeCount = int(argv[1])
        except ValueError:
            # SystemExit prints the message to stderr and stops the program
            raise SystemExit("argument must be numeric")

    # keep nodeCount <= 9 to keep the cost matrix pretty
    costMatrix = fillCostMatrix(nodeCount)
    print("Cost Matrix")
    showSquare(costMatrix)
    test_symmetry(costMatrix)   # sanity test

if __name__ == "__main__":
    import sys
    myMain(sys.argv)

# vim:tabstop=8:shiftwidth=4:expandtab
Dallis answered 5/6, 2014 at 16:53 Comment(0)
S
1

To construct an NxN matrix that is symmetric about the main diagonal, with 0's on the main diagonal, you can do:

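import numpy as np

a = np.array([1, 2, 3, 4, 5])
n = a.shape[0]
b = np.zeros(shape=(n, n))
upper = np.triu(b + a)                 # upper[i, j] = a[j] for j >= i
lower = np.tril(np.transpose(b + a))   # lower[i, j] = a[i] for j <= i
# Zero the diagonal (it was counted twice) by masking with ones minus identity.
D = (upper + lower) * (np.ones((n, n)) - np.eye(n))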

This is kind of a special case, but recently I've used this kind of matrix for network adjacency representation.

Hope that helps. Cheers.

Starvation answered 31/3, 2020 at 21:51 Comment(0)
B
0

It is trivial to Pythonically fill in [i][j] if [j][i] is filled in. The storage question is a little more interesting. One can augment the numpy array class with a packed attribute that is useful both to save storage and to later read the data.

import numpy as np

class Sym(np.ndarray):

    # Wrapper class around numpy arrays for symmetric matrices. The extra `packed` attribute
    # holds the upper triangle (row by row) to optimize storage.
    # Usage:
    # If you have a symmetric matrix A as a shape (n,n) numpy ndarray, Sym(A).packed is a
    # shape (n(n+1)/2,) numpy array that is a packed version of A. To convert it back, just
    # wrap the flat array in Sym(). Note that Sym(Sym(A).packed) reproduces A.

    def __new__(cls, input_array):
        obj = np.asarray(input_array).view(cls)

        if len(obj.shape) == 1:
            l = obj.copy()
            p = obj.copy()
            # Solve m(m+1)/2 == len(obj) for the matrix size m.
            m = int((np.sqrt(8 * len(obj) + 1) - 1) / 2)

            # The input length must be exactly the m-th triangular number.
            if m * (m + 1) // 2 == len(obj):
                A = np.zeros((m, m))
                for i in range(m):
                    A[i, i:] = l[:(m-i)]
                    A[i:, i] = l[:(m-i)]
                    l = l[(m-i):]
                obj = np.asarray(A).view(cls)
                obj.packed = p

            else:
                raise ValueError('One dimensional input length must be a triangular number.')

        elif len(obj.shape) == 2:
            if obj.shape[0] != obj.shape[1]:
                raise ValueError('Two dimensional input must be a square matrix.')
            packed_out = []
            for i in range(obj.shape[0]):
                packed_out.append(obj[i, i:])
            obj.packed = np.concatenate(packed_out)

        else:
            raise ValueError('Input array must be 1 or 2 dimensional.')

        return obj

    def __array_finalize__(self, obj):
        if obj is None: return
        self.packed = getattr(obj, 'packed', None)


Bewhiskered answered 8/6, 2017 at 21:52 Comment(0)
