Converting IEEE 754 floating point in Haskell Word32/64 to and from Haskell Float/Double

Asked 8/8, 2011 at 0:7 Answered 9/8, 2011 at 20:49

Solved haskell floating-point ghc ieee-754

Question

In Haskell, the base libraries and Hackage packages provide several means of converting binary IEEE-754 floating point data to and from the lifted Float and Double types. However, the accuracy, performance, and portability of these methods are unclear.

For a GHC-targeted library intended to (de)serialize a binary format across platforms, what is the best approach for handling IEEE-754 floating point data?

Approaches

These are the methods I've encountered in existing libraries and online resources.

FFI Marshaling

This is the approach used by the data-binary-ieee754 package. Since Float, Double, Word32 and Word64 are each instances of Storable, one can poke a value of the source type into an external buffer, and then peek a value of the target type:

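import qualified Foreign as F
import System.IO.Unsafe (unsafePerformIO)

-- Poke the source value into a temporary buffer, then peek it back as the target type.
toFloat :: (F.Storable word, F.Storable float) => word -> float
toFloat word = unsafePerformIO $ F.alloca $ \buf -> do
    F.poke (F.castPtr buf) word
    F.peek buf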

On my machine this works, but I cringe to see allocation being performed just to accomplish the coercion. Also, although not unique to this solution, there's an implicit assumption here that IEEE-754 is actually the in-memory representation. The tests accompanying the package give it the "works on my machine" seal of approval, but this is not ideal.

`unsafeCoerce`

With the same implicit assumption of in-memory IEEE-754 representation, the following code gets the "works on my machine" seal as well:

toFloat :: Word32 -> Float
toFloat = unsafeCoerce

This has the benefit of not performing explicit allocation like the approach above, but the documentation says "it is your responsibility to ensure that the old and new types have identical internal representations". That implicit assumption is still doing all the work, and is even more tenuous when dealing with lifted types.

`unsafeCoerce#`

Stretching the limits of what might be considered "portable":

toFloat :: Word -> Float
toFloat (W# w) = F# (unsafeCoerce# w)

This seems to work, but doesn't seem practical at all since it's limited to the GHC.Exts types. It's nice to bypass the lifted types, but that's about all that can be said.

`encodeFloat` and `decodeFloat`

This approach has the nice property of bypassing anything with unsafe in the name, but doesn't seem to get IEEE-754 quite right. A previous SO answer to a similar question offers a concise approach, and the ieee754-parser package used a more general approach before being deprecated in favor of data-binary-ieee754.

There's quite a bit of appeal to having code that needs no implicit assumptions about underlying representation, but these solutions rely on encodeFloat and decodeFloat, which are apparently fraught with inconsistencies. I've not yet found a way around these problems.
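To make the trade-off concrete, here is a minimal sketch of decoding a single-precision bit pattern with encodeFloat. The name wordToFloatPortable is hypothetical and is not taken from either package mentioned above. Infinity and NaN have to be special-cased because encodeFloat cannot produce them, the NaN payload is discarded, and the sign of zero relies on negate yielding a negative zero, which is an assumption about the implementation.

```haskell
import Data.Bits (shiftR, testBit, (.&.))
import Data.Word (Word32)

-- Hypothetical helper: decode an IEEE-754 binary32 bit pattern without
-- relying on the in-memory representation of Float.
wordToFloatPortable :: Word32 -> Float
wordToFloatPortable w
  | e == 255 && m /= 0 = 0 / 0                                  -- NaN; payload bits are lost
  | e == 255           = applySign (1 / 0)                      -- infinity
  | e == 0             = applySign (encodeFloat m (-149))       -- zero or subnormal
  | otherwise          = applySign (encodeFloat (m + 0x800000) (e - 150))
  where
    e = fromIntegral ((w `shiftR` 23) .&. 0xFF) :: Int          -- biased exponent
    m = fromIntegral (w .&. 0x7FFFFF) :: Integer                -- 23 fraction bits
    applySign x = if testBit w 31 then negate x else x
```

The exponent offset of 150 is the bias (127) plus the number of fraction bits (23), since encodeFloat takes an integer mantissa.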

Stulin answered 8/8, 2011 at 0:7 Comment(0)

Simon Marlow mentions another approach in GHC bug 2209 (also linked to from Bryan O'Sullivan's answer):

You can achieve the desired effect using castSTUArray, incidentally (this is the way we do it in GHC).

I've used this option in some of my libraries in order to avoid the unsafePerformIO required for the FFI marshalling method.

{-# LANGUAGE FlexibleContexts #-}

import Control.Monad.ST (runST, ST)
import Data.Array.ST (newArray, readArray, MArray, STUArray)
import Data.Array.Unsafe (castSTUArray)  -- its home in recent versions of the array package
import Data.Word (Word32, Word64)

wordToFloat :: Word32 -> Float
wordToFloat x = runST (cast x)

floatToWord :: Float -> Word32
floatToWord x = runST (cast x)

wordToDouble :: Word64 -> Double
wordToDouble x = runST (cast x)

doubleToWord :: Double -> Word64
doubleToWord x = runST (cast x)

{-# INLINE cast #-}
cast :: (MArray (STUArray s) a (ST s),
         MArray (STUArray s) b (ST s)) => a -> ST s b
cast x = newArray (0 :: Int, 0) x >>= castSTUArray >>= flip readArray 0

I inlined the cast function because doing so causes GHC to generate much tighter core. After inlining, wordToFloat is translated to a call to runSTRep and three primops (newByteArray#, writeWord32Array#, readFloatArray#).

I'm not sure what performance is like compared to the FFI marshalling method, but just for fun I compared the core generated by both options.

Doing FFI marshalling is a fair bit more complicated in this regard. It calls unsafeDupablePerformIO and 7 primops (noDuplicate#, newAlignedPinnedByteArray#, unsafeFreezeByteArray#, byteArrayContents#, writeWord32OffAddr#, readFloatOffAddr#, touch#).

I've only just started learning how to analyse core; perhaps someone with more experience can comment on the cost of these operations?

Thielen answered 9/8, 2011 at 20:49 Comment(5)

This addresses my main concern with the FFI option. I said I didn't like allocating memory, but what I really meant was that it seems like it was employing too much extra machinery just to accomplish the cast. The core looks great for this method. – Stulin 11/8, 2011 at 5:24

I have released a package on Hackage that implements this, and tries to stay updated with the fastest implementation known: hackage.haskell.org/package/reinterpret-cast – Untrue 30/4, 2014 at 14:57

So I guess this works (and will hopefully continue to) if it's used in GHC, but why won't this give different results on machines with different endianness? Or is that irrelevant? – Francis 19/3, 2015 at 2:1

@Francis endianness isn't relevant here. Presumably your Word64 is already in a state that represents a valid Double on the current machine before you call this function. If you need to do byte swapping because you've read data from disk then you should do that prior to calling these functions. GHC has primops for that, or you can do it in a non-GHC-specific way by bit shifting. – Thielen 31/3, 2015 at 5:41
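The bit-shifting route mentioned in that comment can be sketched as follows. The helper names are hypothetical, and the final cast uses castWord64ToDouble from GHC.Float, which assumes base 4.10 or later; any of the casts discussed in this thread could be substituted. The point is that the word is assembled by value, so the host's byte order never enters the picture.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.Word (Word64, Word8)
import GHC.Float (castWord64ToDouble)  -- assumes base >= 4.10

-- Hypothetical helper: assemble a Word64 from bytes given most significant first.
word64FromBE :: [Word8] -> Word64
word64FromBE = foldl (\acc b -> (acc `shiftL` 8) .|. fromIntegral b) 0

-- Decode exactly eight big-endian bytes as a Double.
doubleFromBE :: [Word8] -> Maybe Double
doubleFromBE bytes
  | length bytes == 8 = Just (castWord64ToDouble (word64FromBE bytes))
  | otherwise         = Nothing

-- Little-endian input is the same bytes in reverse order.
doubleFromLE :: [Word8] -> Maybe Double
doubleFromLE = doubleFromBE . reverse
```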

I'm not sure I get the difference with the unsafeCoerce option. Doesn't it still rely on the memory representation being exactly the same? – Vaca 3/5, 2016 at 11:6

All modern CPUs use IEEE754 for floating point, and this seems very unlikely to change within our lifetime. So don't worry about code making that assumption.

You are very definitely not free to use unsafeCoerce or unsafeCoerce# to convert between integral and floating point types, as this can cause both compilation failures and runtime crashes. See GHC bug 2209 for details.

Until GHC bug 4092, which addresses the need for int↔fp coercions, is fixed, the only safe and reliable approach is via the FFI.

Sobriety answered 9/8, 2011 at 6:13 Comment(2)

Thanks! While researching for this question, I came across quite a few semi-relevant tickets in the GHC trac, but somehow missed that dead-on one. I'll stick with the data-binary-ieee754 approach for now, but will watch that ticket. – Stulin 9/8, 2011 at 16:52

Issue 4092 has recently been resolved. Unfortunately, the solution is not useful to users who want to decode Float and Double from bytes. – Undersize 27/8, 2018 at 18:3
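For reference, GHC.Float in base 4.10 and later exports direct cast functions between fixed-width words and floating point values. The sketch below only shows how they are called; it makes no claim about how they relate to that ticket, and the round-trip helper names are hypothetical. Input read from bytes still has to be assembled into a Word32 or Word64 first.

```haskell
import Data.Word (Word32, Word64)
import GHC.Float
  ( castDoubleToWord64
  , castFloatToWord32
  , castWord32ToFloat
  , castWord64ToDouble
  )

-- Hypothetical helpers: reinterpret a bit pattern and convert it back again.
roundTrip32 :: Word32 -> Word32
roundTrip32 = castFloatToWord32 . castWord32ToFloat

roundTrip64 :: Word64 -> Word64
roundTrip64 = castDoubleToWord64 . castWord64ToDouble
```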

I'm the author of data-binary-ieee754. It has at some point used each of the three options.

encodeFloat and decodeFloat work well enough for most cases, but the accessory code required to use them adds tremendous overhead. They do not react well to NaN or Infinity, so some GHC-specific assumptions are required for any casts based upon them.
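A minimal sketch of the encoding direction shows where the accessory code comes from. The name floatToWordPortable is hypothetical and this is not the package's actual implementation. NaN and Infinity are handled before decodeFloat is consulted, and the subnormal branch assumes decodeFloat returns a normalized 24-bit mantissa for every nonzero finite value, as the Haskell report specifies.

```haskell
import Data.Bits (shiftL, shiftR, (.|.))
import Data.Word (Word32)

-- Hypothetical helper: encode a Float as an IEEE-754 binary32 bit pattern
-- using only RealFloat methods.
floatToWordPortable :: Float -> Word32
floatToWordPortable x
  | isNaN x      = 0x7FC00000                  -- one fixed quiet NaN; sign and payload are lost
  | isInfinite x = signBit .|. 0x7F800000
  | m == 0       = signBit                     -- +0.0 or -0.0
  | biased <= 0  = signBit .|. fromIntegral (mag `shiftR` (1 - biased))  -- subnormal
  | otherwise    = signBit .|. (fromIntegral biased `shiftL` 23)
                           .|. fromIntegral (mag - 0x800000)
  where
    (m, e) = decodeFloat x    -- x == m * 2^e; only forced for finite x
    mag    = abs m            -- assumed: 2^23 <= mag < 2^24 for nonzero finite x
    biased = e + 150          -- bias (127) plus fraction bits (23)

    signBit :: Word32
    signBit = if x < 0 || isNegativeZero x then 0x80000000 else 0
```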

unsafeCoerce was an attempted replacement, to get better performance. It was really fast, but reports of other libraries having significant problems made me eventually decide to avoid it.

The FFI code has so far been the most reliable, and has decent performance. The overhead of allocation isn't as bad as it seems, likely due to the GHC memory model. And it actually doesn't depend on the internal format of floats, merely on the behavior of the Storable instance. The compiler can use whatever representation it wants as long as Storable is IEEE-754. GHC uses IEEE-754 internally anyway, and I don't worry much about non-GHC compilers any more, so it's a moot point.

Until the GHC developers see fit to bestow upon us unlifted fixed-width words, with associated conversion functions, FFI seems the best option.

Kwei answered 9/8, 2011 at 18:29 Comment(1)

This makes a lot more sense now. I mistakenly thought alloca was allocating in a C heap because I'm not too familiar with the FFI, and didn't know the conversion code would be tied to the Storable instance. Sounds like there's a good consensus for the FFI; thanks! – Stulin 9/8, 2011 at 20:24

I'd use the FFI method for conversion. But be sure to use the alignment when allocating memory so you get memory that is acceptable for load/store for both the floating point number and the integer. You should also put in some assert about the sizes of the float and word being the same so you can detect if anything goes wrong.

If allocating memory makes you cringe you should not be using Haskell. :)

Literature answered 9/8, 2011 at 7:57 Comment(6)

Touché :) Won't specializing the bit of code I pasted provide a guarantee about the sizes? Or do you have something else in mind on the C side? – Stulin 9/8, 2011 at 16:54

I just like to be on the safe side, so I'd add an assertion. If the sizes are the same it will be compiled away as always being true. – Literature 9/8, 2011 at 17:58

It's not included in the post, but data-binary-ieee754 (from which that code is copied) uses type-safe wrappers to ensure the sizes match. For example, Double can only be converted to Word64. alloca handles the alignment. – Kwei 9/8, 2011 at 18:30

How does alloca handle the alignment? You need to align to the greater of the two alignment constraints of the two types. I don't see how alloca can do that. – Literature 9/8, 2011 at 20:51

alloca aligns to whatever type is being allocated. The wrapper functions ensure both types are the same size (assuming Float is 32 bits and Double is 64 bits). – Kwei 10/8, 2011 at 3:42

Yes, I know what alloca does and it's not enough. Just because the two types have the same size does not mean they have the same alignment needs. For instance, Double might have to be aligned on 8 byte boundary and Word64 on 4 byte boundary (this is a real example). So you need to compute the maximum of the alignments and allocate on that boundary, otherwise it might fail on some platforms. – Literature 10/8, 2011 at 7:35
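The suggestion in this thread, checking the sizes and allocating on the stricter of the two alignments, could look roughly like the sketch below. The name castViaBuffer and the wrappers are hypothetical and are not the data-binary-ieee754 implementation; allocaBytesAligned comes from Foreign.Marshal.Alloc.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

import Data.Word (Word32, Word64)
import Foreign.Marshal.Alloc (allocaBytesAligned)
import Foreign.Ptr (castPtr)
import Foreign.Storable (Storable, alignment, peek, poke, sizeOf)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical helper: reinterpret the bytes of one Storable value as another.
castViaBuffer :: forall a b. (Storable a, Storable b) => a -> b
castViaBuffer x
  | sizeOf x /= sizeOf (undefined :: b) = error "castViaBuffer: size mismatch"
  | otherwise = unsafePerformIO $
      allocaBytesAligned (sizeOf x) align $ \buf -> do
        poke buf x             -- write the value as the source type
        peek (castPtr buf)     -- read the same bytes back as the target type
  where
    -- Use the stricter alignment so the buffer suits both types.
    -- sizeOf and alignment do not inspect their argument, so undefined is safe.
    align = max (alignment x) (alignment (undefined :: b))

-- Type-restricted wrappers so only matching widths can be paired.
wordToFloat :: Word32 -> Float
wordToFloat = castViaBuffer

doubleToWord :: Double -> Word64
doubleToWord = castViaBuffer
```

The size check here is a runtime error rather than an assertion; for the fixed wrapper types it is always false and merely guards against misuse of the polymorphic helper.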
