This pseudo-code should work:
Let M = the largest X[i].
For each i:
    Subtract M from X[i].
Let S = the sum of exp(X[i]) for all i.
For each i:
    The probability for this i is exp(X[i]) / S.
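For anyone who wants something runnable, here is a minimal Python sketch of the steps above. The names `softmax` and `xs` are just illustrative choices, and the sketch assumes the input is a non-empty list of finite floats.

```python
import math

def softmax(xs):
    """Softmax of a non-empty list of finite floats (illustrative helper)."""
    m = max(xs)                            # M = the largest X[i]
    shifted = [x - m for x in xs]          # subtract M from every X[i]
    exps = [math.exp(v) for v in shifted]  # the largest term is exp(0) = 1
    s = sum(exps)                          # S >= 1, so dividing by S is safe
    return [e / s for e in exps]

# Inputs this large would make math.exp(x) raise OverflowError without the shift.
probs = softmax([1000.0, 1001.0, 1002.0])
```

The list comprehension builds a new list rather than modifying the input in place, which is the only real departure from the pseudo-code.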
If some X[i] is much smaller than M (by more than roughly 745 in double precision), then after the subtraction step it will be so negative that exp(X[i]) evaluates to zero. However, the actual probability of such an item is so minuscule that there is no practical difference between it and zero, so it is okay that exp(X[i]) underflows to zero. Overflow cannot occur, because after the subtraction the largest X[i] is 0, so every exp(X[i]) is at most 1 and S is at least 1.
Aside from underflow and rounding errors, the probabilities should be the same after the subtraction transformation, because:
- exp(x-M) = exp(x)/exp(M).
- This division affects both the numerator and the denominator of the probability the same way, so the ratio remains the same.
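A quick way to convince yourself of this is to compare a direct implementation with the shifted one. The sketch below is only an illustration; `naive_softmax` and `shifted_softmax` are made-up names, and the inputs are arbitrary test values.

```python
import math

def naive_softmax(xs):
    # Direct formula: exp(X[i]) / sum of exp(X[j]); overflows for large inputs.
    exps = [math.exp(x) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def shifted_softmax(xs):
    # Same formula after subtracting M = max(xs) from every element.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# On small inputs both versions agree up to rounding error.
small = [1.0, 2.0, 3.0]
same = all(
    math.isclose(a, b, rel_tol=1e-12)
    for a, b in zip(naive_softmax(small), shifted_softmax(small))
)

# On large inputs the direct version fails, while the shifted one still works.
large = [1000.0, 1001.0, 1002.0]
try:
    naive_softmax(large)
    naive_overflowed = False
except OverflowError:
    naive_overflowed = True
large_probs = shifted_softmax(large)
```

Here `same` and `naive_overflowed` should both end up true: the two versions agree wherever the direct one can be computed at all, and only the shifted one survives the large inputs.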