The rules of this problem are fairly specific because I'm actually looking at a subset of GLn, where the row and column vectors must have a certain form (call these vectors valid -- examples below), so please bear with me. Here are the rules:
You can select a valid nonzero vector of length n uniformly at random.
Given valid vectors
v1, v2, ..., vk
, you can determine whether or not the partial columns they form are prefixes of valid vectors (e.g. whether[v1_1 v2_1 ... vk_1]
occurs as the first k entries of a valid vector) using the functionisPrefix
.Given valid vectors v1, v2, ..., vk, you can determine whether or not they are linearly dependent using the function
areIndependent
.
The goal is to sample from this subset of GLn uniformly at random. Here is a naive solution that fails:
Step 1: Select a valid v1 uniformly at random.
If isPrefix(v1) then Go to Step 2.
Otherwise repeat Step 1.
Step 2: Select a valid v2 uniformly at random.
If areIndependent(v1,v2) & isPrefix(v1,v2), then go to Step 3.
Otherwise, repeat Step 2.
...
Step n: Select a valid vn uniformly at random.
If areIndependent(v1,v2,...,vn) & isPrefix(v1,v2,...,vn), then END.
Otherwise, repeat Step n.
The problem with this would-be solution is that it can get stuck in an infinite loop in the not-too-unlikely event that areIndependent(v1,v2,...,vk) & isPrefix(v1,v2,...,vk)
correctly returns TRUE
but there is no way to complete this k-tuple to a linearly independent valid n-tuple. A common instance of this is when k=n-1
and there is a unique valid vector vn
such that isPrefix(v1,v2,...,vn)
is TRUE but this vector is not independent of the previous n-1 vectors.
Therefore I need to add in some way to back up a step or two when I'm in this loop, but I don't necessarily know when I'm in it. I'm looking for a solution along these lines: If Step k fails f(k)
times for some explicit function f
that may depend on the distribution of valid vectors, then go back to Step k-1 (or even further, in some explicit way).
Any suggestions, comments, references, etc will be greatly appreciated! Thanks!
EXAMPLES:
I'm actually looking for a general reference or guideline for how to proceed with the sampling. I have several examples of valid vectors on which I would like to do this, and it's more helpful to me to be able to do it on my own eventually than to list every example and have the SO community hash out solutions. However, to be concrete and give a sense of the difficulties involved, here is one example:
Alternating Sign Matrices: A vector is valid iff its entries are all 0, -1, 1, the nonzero entries alternate between 1s and -1s, and the sum of the entries is 1. For instance, every permutation matrix will consist of valid vectors, and also the following:
0 1 0
1 -1 1
0 1 0
Distinct Entries: A vector is valid iff it has distinct entries. This one is particularly annoying because the condition applies to rows as well as columns.
Thanks again to all the folks who've looked at this!
v1..vk
instead of generatingv(k+1)
again and again until there are all independant (which may never happen), you generate oncev(k+1)
, then (if they are not independant anymore) you choose a number at randomr
in[1:k+1]
and replacevr
with a new randomly sampled vector. Repeat untilv1..v(k+1)
are independant.I do not know if it's much better than the pure rejection sampling though :P – Gun