I have some integer value representing a bitmask, for example 154 = 0b10011010, and I want to construct a corresponding signal Vector<T> instance <0, -1, 0, -1, -1, 0, 0, -1> (note the bit order is reversed: the least significant bit maps to the first element).
Is there any more performant way than this (beyond unrolling the loop)?
using System.Linq;
using System.Numerics;

int mask = 0b10011010; // 154
// -1 = (unsigned) 0xFFFFFFFF is the "all true" value
Vector<int> maskVector = new Vector<int>(
    Enumerable.Range(0, Vector<int>.Count)
        // != 0 rather than > 0, so a set sign bit would also count as true
        .Select(i => (mask & (1 << i)) != 0 ? -1 : 0)
        .ToArray());
// <0, -1, 0, -1, -1, 0, 0, -1>
string maskVectorStr = string.Join("", maskVector);
(Note that the debugger displays Vector<T> values incorrectly, showing only half of the components and the rest as zeros, hence my use of string.Join.)
Namely, is there a way to do this with only a single instruction? It is the Single Instruction, Multiple Data (SIMD) datatype after all!
Furthermore, how can I do this when working with the generic Vector<T> version?
A signal vector, or integral mask vector, is used with the ConditionalSelect method to choose, element by element, between the values of two other vectors:
//powers of two <1, 2, 4, 8, 16, 32, 64, 128>
Vector<int> ifTrueVector = new Vector<int>(Enumerable.Range(0, Vector<int>.Count).Select(i => 1 << i).ToArray());
Vector<int> ifFalseVector = Vector<int>.Zero;// or some other vector
// <0, 2, 0, 8, 16, 0, 0, 128>
Vector<int> resultVector = Vector.ConditionalSelect(maskVector, ifTrueVector, ifFalseVector);
string resultStr = string.Join("", resultVector);
// our original mask value back
int sum = Vector.Dot(resultVector, Vector<int>.One);
The documentation of ConditionalSelect explicitly says the mask vector has integral values for every overload, but spamming Vector<T>.Zero[0] and Vector<T>.One[0] to get them is surely improper? (And you can get the T version of -1 with (-Vector<T>.One)[0].)
P.S. Would there also be a corresponding solution for populating a vector with powers of 2?
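For reference, here is a minimal sketch of one loop-free way to build the mask, under the assumption that Vector<int> is the element type in use. It is not a single instruction: it broadcasts the integer to every lane, ANDs it with a precomputed powers-of-two vector, and compares the result against that same vector with Vector.Equals, which yields -1 in matching lanes and 0 elsewhere. The names MaskVectors, PowersOfTwo and FromBitmask are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Numerics;

// Hypothetical helper class; the names are illustrative, not part of any library.
public static class MaskVectors
{
    // Builds <1, 2, 4, ...> with one power of two per lane.
    // Intended to be computed once and reused, so the loop cost is paid a single time.
    // Assumes Vector<int>.Count is below 32 so that 1 << i never overflows.
    public static Vector<int> PowersOfTwo()
    {
        int[] lanes = new int[Vector<int>.Count];
        for (int i = 0; i < lanes.Length; i++)
        {
            lanes[i] = 1 << i;
        }
        return new Vector<int>(lanes);
    }

    // Expands the low Vector<int>.Count bits of 'mask' into a signal vector:
    // lane i is -1 (all bits set) when bit i of 'mask' is set, otherwise 0.
    public static Vector<int> FromBitmask(int mask, Vector<int> powers)
    {
        Vector<int> broadcast = new Vector<int>(mask);               // mask copied to every lane
        Vector<int> isolated = Vector.BitwiseAnd(broadcast, powers); // keep only bit i in lane i
        return Vector.Equals(isolated, powers);                      // -1 where the bit was set
    }
}
```

Calling MaskVectors.FromBitmask(154, MaskVectors.PowersOfTwo()) should give the same lanes as the LINQ version above, truncated or extended to however many lanes Vector<int>.Count reports on the current hardware. The powers-of-two vector is the same one the P.S. asks about; this sketch still fills it with a loop, just only once.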
0b10011010 to <0, -1, 0, -1, -1, 0, 0, -1>? – Threegaited
ConditionalSelect could be an &, I think? That means no need for ifFalseVector. See the implementation) – Threegaited
ConditionalSelect statements from the bitmask (and sometimes two subtractions), and intrinsics only take a couple CPU cycles. – Encyclical
ConditionalSelect. – Encyclical
System.Runtime.Intrinsics.X86? If so, there's an x86-specific efficient solution – Sybyl