Problem

I have a set of equations with variables denoted with lowercase variables and constants with uppercase variables as such

A = a + b  
B = c + d  
C = a + b + c + d + e

I'm provided the information as to the structure of these equations in a pandas DataFrame with two columns: Constants and Variables

E.g.

df = pd.DataFrame([['A','a'],['A','b'],['B','c'],['B','d'],['C','a'],['C','b'], 
['C','c'],['C','d'],['C','e']],columns=['Constants','Variables'])

I then convert this to a sparse CSC matrix by using NetworkX

table = nx.bipartite.biadjacency_matrix(nx.from_pandas_dataframe(df,'Constants','Variables')  
,df.Constants.unique(),df.Variables.unique(),format='csc')

When converted to a dense matrix, table looks like the following

matrix([[1, 1, 0, 0, 0],[0, 0, 1, 1, 0],[1, 1, 1, 1, 1]], dtype=int64)

What I want from here is to find which variables are solvable (in this example, only e is solvable) and for each solvable variable, what constants is its value dependent on (in this case, since e = C-B-A, it is dependent on A, B, and C)

Attempts at Solution

I first tried to use rref to solve for the solvable variables. I used the symbolics library sympy and the function sympy.Matrix.rref, which gave me exactly what I wanted, since any solvable variable would have its own row with almost all zeros and 1 one, which I could check for row by row.

However, this solution was not stable. Primarily, it was exceedingly slow, and didn't make use of the fact that my datasets are likely to be very sparse. Moreover, rref doesn't do too well with floating points. So I decided to move on to another approach motivated by Removing unsolvable equations from an underdetermined system, which suggested using svd

Conveniently, there is a svd function in the scipy.sparse library, namely scipy.sparse.linalg.svds. However, given my lack of linear algebra background, I don't understand the results outputted by running this function on my table, or how to use those results to get what I want.

Further Details in the Problem

The coefficient of every variable in my problem is 1. This is how the data can be expressed in the two column pandas DataFrame shown earlier
The vast majority of variables in my actual examples will not be solvable. The goal is to find the few that are solvable
I'm more than willing to try an alternate approach if it fits the constraints of this problem.

This is my first time posting a question, so I apologize if this doesn't exactly follow guidelines. Please leave constructive criticism but be gentle!

Asphyxiant answered 13/6, 2018 at 17:24 Comment(0)

The system you are solving has the form

[ 1 1 0 0 0 ] [a]   [A]
[ 0 0 1 1 0 ] [b] = [B]
[ 1 1 1 1 1 ] [c]   [C]
              [d]
              [e]

i.e., three equations for five variables a, b, c, d, e. As the answer linked in your question mentions, one can tackle such underdetermined system with the pseudoinverse, which Numpy directly provides in terms of the pinv function.

Since M has linearly independent rows, the psudoinverse has in this case the property that M.pinv(M) = I, where I denotes identity matrix (3x3 in this case). Thus formally, we can write the solution as:

v = pinv(M) . b

where v is the 5-component solution vector, and b denotes the right-hand side 3-component vector [A, B, C]. However, this solution is not unique, since one can add a vector from the so-called kernel or null space of the matrix M (i.e., a vector w for which M.w=0) and it will be still a solution:

M.(v + w) = M.v + M.w = b + 0 = b

Therefore, the only variables for which there is a unique solution are those for which the corresponding component of all possible vectors from the null space of M is zero. In other words, if you assemble the basis of the null space into a matrix (one basis vector per column), then the "solvable variables" will correspond to zero rows of this matrix (the corresponding component of any linear combination of the columns will be then also zero).

Let's apply this to your particular example:

import numpy as np
from numpy.linalg import pinv

M = [
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [1, 1, 1, 1, 1]
]

print(pinv(M))

[[ 5.00000000e-01 -2.01966890e-16  1.54302378e-16]
 [ 5.00000000e-01  1.48779676e-16 -2.10806254e-16]
 [-8.76351626e-17  5.00000000e-01  8.66819360e-17]
 [-2.60659800e-17  5.00000000e-01  3.43000417e-17]
 [-1.00000000e+00 -1.00000000e+00  1.00000000e+00]]

From this pseudoinverse, we see that the variable e (last row) is indeed expressible as - A - B + C. However, it also "predicts" that a=A/2 and b=A/2. To eliminate these non-unique solutions (equally valid would be also a=A and b=0 for example), let's calculate the null space borrowing the function from SciPy Cookbook:

print(nullspace(M))

[[ 5.00000000e-01 -5.00000000e-01]
 [-5.00000000e-01  5.00000000e-01]
 [-5.00000000e-01 -5.00000000e-01]
 [ 5.00000000e-01  5.00000000e-01]
 [-1.77302319e-16  2.22044605e-16]]

This function returns already the basis of the null space assembled into a matrix (one vector per column) and we see that, within a reasonable precision, the only zero row is indeed only the last one corresponding to the variable e.

EDIT:

For the set of equations

A = a + b, B = b + c, C = a + c

the corresponding matrix M is

[ 1 1 0 ]
[ 0 1 1 ]
[ 1 0 1 ]

Here we see that the matrix is in fact square, and invertible (the determinant is 2). Thus the pseudoinverse coincides with "normal" inverse:

[[ 0.5 -0.5  0.5]
 [ 0.5  0.5 -0.5]
 [-0.5  0.5  0.5]]

which corresponds to the solution a = (A - B + C)/2, .... Since M is invertible, its kernel / null space is empty, that's why the cookbook function returns only []. To see this, let's use the definition of the kernel - it is formed by all non-zero vectors x such that M.x = 0. However, since M^{-1} exists, x is given as x = M^{-1} . 0 = 0 which is a contradiction. Formally, this means that the found solution is unique (or that all variables are "solvable").

Gracious answered 14/6, 2018 at 23:4 Comment(3)

Thank you for the detailed reply! I like the idea of using nullspace, but I have two questions. First, do you know whether it's possible to use the sparse svd function in the body of nullspace? I tried to sub it in but I wasn't able to figure that out. And the nullspace function from the cookbook also seems to not work for the case of the equations A = a + b, B = b + c, C = a + c. I don't want to be a help vampire here, but I honestly don't get how nullspace works. – Asphyxiant 15/6, 2018 at 17:6

@RushabhMehta my pleasure! I haven't looked into the sparse version in detail, but as long as it provides the left/right singular vectors and singular values, it should be mathematically equivalent. As for the example, I update the answer... – Gracious 15/6, 2018 at 19:1

It seems as though the sparse version is truncated, and only provides <min(table.shape) singular values and left/right vectors. I'm presuming this isn't enough to calculate the nullspace, right? – Asphyxiant 17/6, 2018 at 21:31

To build on ewcz's answer, both the nullspace and pseudo-inverse can be calculated using numpy.linalg.svd. See the links below:

pseudo-inverse

nullspace

Asphyxiant answered 25/7, 2018 at 17:0 Comment(0)

Problem

Attempts at Solution

Further Details in the Problem

Recommended topics

Hot tags