I have a __m256 value that holds random bits. I would like to "interpret" it to obtain another __m256 that holds float values in a uniform [0.0f, 1.0f] range.
Planning to do it using:
__m256 randomBits = /* generated random bits, uniformly distributed */;
__m256 invFloatRange = _mm256_set1_ps( std::numeric_limits<float>::min() ); // min() assumed to be the smallest increment of float precision
__m256 float01 = _mm256_mul_ps(randomBits, invFloatRange);
// float01 is now ready to be used
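As a quick check on the constant used above, here is a minimal sketch (the names `kMin`, `kEps`, `kMax` and `kWorstCase` are illustrative only, not part of the original code). In standard C++, `std::numeric_limits<float>::min()` is the smallest positive normalized float (2^-126), whereas the spacing of floats just above 1.0f is `epsilon()` (2^-23). Raw random bits reinterpreted as a float can be as large in magnitude as `max()`, so the product with `min()` can come close to 4 instead of staying within [0, 1].

```cpp
#include <limits>

// Smallest positive *normalized* float, 2^-126 (not the smallest increment).
constexpr float kMin = std::numeric_limits<float>::min();

// Gap between 1.0f and the next representable float, 2^-23.
constexpr float kEps = std::numeric_limits<float>::epsilon();

// Largest finite float; a random bit pattern can be this large in magnitude.
constexpr float kMax = std::numeric_limits<float>::max();

// Largest finite product the multiplication above could produce: just under 4.
constexpr float kWorstCase = kMax * kMin;
```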
Question 1:
However, will this cause a problem in very rare cases where randomBits has all bits set to 1 and is therefore NaN?
What can I do to protect myself from this?
I want float01 to always be a usable number.
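To make the NaN concern concrete, here is a small scalar sketch of what a single 32-bit lane looks like when its bits are reinterpreted as a float. The helper names `bits_to_float` and `exponent_all_ones` are hypothetical. In IEEE 754 single precision, a pattern is NaN or infinity whenever all 8 exponent bits are set, regardless of the remaining bits, so with uniformly random bits this affects about 1 lane in 256, not only the all-ones pattern.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit pattern as a float, as one lane of a __m256 would.
inline float bits_to_float(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// True when all 8 exponent bits are set, i.e. the pattern is NaN or infinity.
inline bool exponent_all_ones(std::uint32_t bits) {
    return (bits & 0x7F800000u) == 0x7F800000u;
}
```

For example, `bits_to_float(0xFFFFFFFFu)` is a NaN and `bits_to_float(0x7F800000u)` is positive infinity, while `0x3F800000u` is the ordinary value 1.0f.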
Question 2:
Will the [0 to 1] range remain uniform after I obtain it using the above approach? I know float has varying precision at different magnitudes.
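One way to see the uniformity problem is to count how many bit patterns fall in intervals of different widths. The sketch below is scalar, and the helper names `float_to_bits` and `patterns_between` are hypothetical. It relies on the fact that, for non-negative finite floats, the bit patterns are ordered the same way as the values. The intervals [0.25, 0.5) and [0.5, 1.0) contain the same number of patterns even though the second is twice as wide, so uniformly random bit patterns cannot give uniformly distributed values.

```cpp
#include <cstdint>
#include <cstring>

// Bit pattern of a float, as an unsigned 32-bit integer.
inline std::uint32_t float_to_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Number of distinct float bit patterns in [lo, hi).
// Assumes 0 <= lo <= hi and both finite.
inline std::uint32_t patterns_between(float lo, float hi) {
    return float_to_bits(hi) - float_to_bits(lo);
}
```

Both `patterns_between(0.25f, 0.5f)` and `patterns_between(0.5f, 1.0f)` evaluate to 2^23 = 8388608.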
randomBits as uint32 then divide by uint32 max (making sure to convert to float first)? Random bits in a floating point number won't give a uniform distribution even without the problems of NaN and infinity. – Ducks
_mm256 instructions? uint32 would have a different range (than float) from what I can see. Maybe we should use int32 and mask away the minus sign? This should also eliminate any possibility of NaN occurring. – Graz
uint32 to float, but you can convert int32 to float using _mm256_cvtepi32_ps, then multiply by pow(2,-32) and add 0.5 (using FMA, if available). This won't be perfect, especially the smallest non-zero result will be pow(2,-23). – Incertitude
pow(2,-31) (this gets numbers in [-1, +1)) and then mask away the sign bit. You will only lose 1 bit of the generated number, instead of 8. – Incertitude
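The last suggestion in the comments (convert as int32, scale by 2^-31, clear the sign bit) could be sketched roughly as follows. This is one possible reading of those comments, not a definitive implementation. The function name `random_bits_to_unit_floats`, the `TARGET_AVX` macro and the pointer-based interface are assumptions made for illustration. The sketch assumes an x86 CPU with AVX. Because `_mm256_cvtepi32_ps` rounds to the nearest float, inputs near 2^31 round up, so the result can be exactly 1.0f. The output range is therefore [0, 1], not [0, 1).

```cpp
#include <immintrin.h>
#include <cstdint>

// GCC/Clang: enable AVX for this one function without a global -mavx flag.
#if defined(__GNUC__) || defined(__clang__)
#define TARGET_AVX __attribute__((target("avx")))
#else
#define TARGET_AVX
#endif

// Converts 8 random 32-bit words into 8 floats in [0.0f, 1.0f].
TARGET_AVX inline void random_bits_to_unit_floats(const std::uint32_t* bits,
                                                  float* out) {
    // Load the raw bits as integers, so no lane is ever treated as a NaN.
    const __m256i raw =
        _mm256_loadu_si256(reinterpret_cast<const __m256i*>(bits));

    // Signed int32 -> float, value in [-2^31, 2^31] after rounding.
    const __m256 asFloat = _mm256_cvtepi32_ps(raw);

    // Scale by 2^-31 (exact, it is a power of two): value in [-1, 1].
    const __m256 scaled =
        _mm256_mul_ps(asFloat, _mm256_set1_ps(1.0f / 2147483648.0f));

    // Clear the sign bit: -0.0f has only the sign bit set.
    const __m256 signBit = _mm256_set1_ps(-0.0f);
    _mm256_storeu_ps(out, _mm256_andnot_ps(signBit, scaled));
}
```

Clearing the sign bit folds the negative half of the range onto the positive half, which is what costs the one bit mentioned in the comment. If a half-open range [0, 1) is required, the 1.0f endpoint would need separate handling.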