Question
In Haskell, the base libraries and Hackage packages provide several means of converting binary IEEE-754 floating point data to and from the lifted Float and Double types. However, the accuracy, performance, and portability of these methods are unclear.
For a GHC-targeted library intended to (de)serialize a binary format across platforms, what is the best approach for handling IEEE-754 floating point data?
Approaches
These are the methods I've encountered in existing libraries and online resources.
FFI Marshaling
This is the approach used by the data-binary-ieee754 package. Since Float, Double, Word32 and Word64 are each instances of Storable, one can poke a value of the source type into an external buffer, and then peek a value of the target type:
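import qualified Foreign as F
import System.IO.Unsafe (unsafePerformIO)

-- Reinterpret the bytes of one Storable value as another type of the same size.
toFloat :: (F.Storable word, F.Storable float) => word -> float
toFloat word = unsafePerformIO $ F.alloca $ \buf -> do
    F.poke (F.castPtr buf) word
    F.peek buf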
On my machine this works, but I cringe to see allocation being performed just to accomplish the coercion. Also, although not unique to this solution, there's an implicit assumption here that IEEE-754 is actually the in-memory representation. The tests accompanying the package give it the "works on my machine" seal of approval, but this is not ideal.
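For concreteness, here is a minimal sketch of the same marshaling idea pinned to fixed-width types, so the source and target sizes always match. The function names are my own and not taken from data-binary-ieee754, and it carries the same assumption that Float and Double are stored as IEEE-754 binary32 and binary64 in memory.

```haskell
import Data.Word (Word32, Word64)
import Foreign.Marshal.Utils (with)
import Foreign.Ptr (castPtr)
import Foreign.Storable (peek)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical helper names. Each one writes the source value to a
-- temporary buffer and reads the same bytes back at the target type.
-- ASSUMPTION: the in-memory layout of Float/Double is IEEE-754.

word32ToFloat :: Word32 -> Float
word32ToFloat w = unsafePerformIO $ with w (peek . castPtr)

floatToWord32 :: Float -> Word32
floatToWord32 f = unsafePerformIO $ with f (peek . castPtr)

word64ToDouble :: Word64 -> Double
word64ToDouble w = unsafePerformIO $ with w (peek . castPtr)

doubleToWord64 :: Double -> Word64
doubleToWord64 d = unsafePerformIO $ with d (peek . castPtr)
```

Since the value is written and read on the same machine, host byte order cancels out here. Byte order only matters when splitting the Word32 or Word64 into the bytes of the serialized format.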
unsafeCoerce
With the same implicit assumption of in-memory IEEE-754 representation, the following code gets the "works on my machine" seal as well:
toFloat :: Word32 -> Float
toFloat = unsafeCoerce
This has the benefit of not performing explicit allocation like the approach above, but the documentation says "it is your responsibility to ensure that the old and new types have identical internal representations". That implicit assumption is still doing all the work, and it is even harder to justify when dealing with lifted types.
unsafeCoerce#
Stretching the limits of what might be considered "portable":
toFloat :: Word -> Float
toFloat (W# w) = F# (unsafeCoerce# w)
This seems to work, but doesn't seem practical at all since it's limited to the GHC.Exts types. It's nice to bypass the lifted types, but that's about all that can be said.
encodeFloat and decodeFloat
This approach has the nice property of bypassing anything with unsafe in the name, but doesn't seem to get IEEE-754 quite right. A previous SO answer to a similar question offers a concise approach, and the ieee754-parser package used a more general approach before being deprecated in favor of data-binary-ieee754.
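To illustrate the general shape of this approach, here is a rough sketch of decoding a binary32 bit pattern with encodeFloat. It is my own illustration (the name word32ToFloatPure is hypothetical) and is not the code from the SO answer or from ieee754-parser. The special exponent patterns have to be handled by hand, and NaN payloads are not preserved.

```haskell
import Data.Bits (shiftR, testBit, (.&.))
import Data.Word (Word32)

-- Decode an IEEE-754 binary32 bit pattern without any unsafe functions.
-- Layout: 1 sign bit, 8 exponent bits (bias 127), 23 fraction bits.
word32ToFloatPure :: Word32 -> Float
word32ToFloatPure w
  | expBits == 0xFF && fracBits /= 0 = 0 / 0          -- NaN (payload is lost)
  | expBits == 0xFF                  = sign (1 / 0)   -- infinity
  | expBits == 0                     =                -- zero or subnormal
      sign (encodeFloat fracBits (-149))
  | otherwise                        =                -- normal: implicit leading 1
      sign (encodeFloat (fracBits + 0x800000) (expBits - 150))
  where
    negative = testBit w 31
    expBits  = fromIntegral ((w `shiftR` 23) .&. 0xFF) :: Int
    fracBits = fromIntegral (w .&. 0x7FFFFF) :: Integer
    sign x   = if negative then negate x else x
```

The exponent offsets come from folding the 23 fraction bits into the exponent: a normal value is (2^23 + fraction) * 2^(exponent - 127 - 23), and a subnormal is fraction * 2^(-126 - 23).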
There's quite a bit of appeal to having code that needs no implicit assumptions about underlying representation, but these solutions rely on encodeFloat and decodeFloat, which are apparently fraught with inconsistencies. I've not yet found a way around these problems.
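As a small illustration of the kind of inconsistency I mean, the sketch below round-trips a Double through decodeFloat and encodeFloat and checks whether two special values survive. The helper names are hypothetical. As far as I can tell, encodeFloat builds its result from an Integer mantissa and an Int exponent, so it has no way to produce a NaN or a negative zero.

```haskell
-- Round-trip a value through decodeFloat and encodeFloat.
roundTrip :: Double -> Double
roundTrip = uncurry encodeFloat . decodeFloat

-- Does the sign of negative zero survive the round trip?
negativeZeroSurvives :: Bool
negativeZeroSurvives = isNegativeZero (roundTrip (-0.0))

-- Does a NaN come back as a NaN?
nanSurvives :: Bool
nanSurvives = isNaN (roundTrip (0 / 0))
```

Both checks evaluate to False under GHC, while ordinary finite values such as 1.5 round-trip unchanged. A serializer built on these functions therefore needs separate handling for at least NaN and negative zero.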