Problem with bit swapping in Haskell

B

3

5

As part of a school project I'm implementing some crypthographic algorithms in Haskell. As you probably know this involves quite a lot of low level bit fiddling. Now I am stuck on one particular sub routine which causes me a headache. The routine, which is a permutation on 256 bits, works as follows:

Input: a 256 bit block.
Then all the even numbered bits (0,2,...) in the input block are taken to be the first 128 bits in the output block. While the odd numbered bits are taken to be the 128 last bits in the output block. More specifically, the formula for the i'th bit in the output is given as (a_i is the i'th bit in the input block, and b is the output):

b_i = a_2i

b_i+2^d-1 = a_{2i + 1}

for i from 0 to 2^d-1-1, d = 8.

As a toy example, assume we used a reduced version of the routine which worked with 16 bit blocks instead of 256 bits. Then the following bitstring would be permuted as follows:

1010 1010 1010 1010 -> 1111 1111 0000 0000

I have not been able to come up with a clean implementation for this function. In particular I have been trying with a ByteString -> ByteString signature, but that sort of forces me to work on a Word8 kind of granularity. But each byte in the output bytestring is a function of bits in all the other bytes, which requires some really messy operations.

I will be really grateful for any kind of hint or advice on how to approach this problem.

Blandish answered 4/9, 2011 at 0:34 Comment(1)

haskell.org/ghc/docs/latest/html/libraries/base/… – Braden 4/9, 2011 at 10:33

G

4

If you want an efficient implementation, I don't think you can avoid working with bytes. Here is an example solution. It assumes that there is always an even number of bytes in the ByteString. I'm not very familiar with unboxing or strictness tweaking, but I think these would be necessary if you want to be very efficient.

import Data.ByteString (pack, unpack, ByteString)
import Data.Bits
import Data.Word

-- the main attraction
packString :: ByteString -> ByteString
packString = pack . packWords . unpack

-- main attraction equivalent, in [Word8]
packWords :: [Word8] -> [Word8]
packWords ws = evenPacked ++ unevenPacked
    where evenBits = map packEven ws
          unevenBits = map packUneven ws
          evenPacked = consumePairs packNibbles evenBits
          unevenPacked = consumePairs packNibbles unevenBits

-- combines 2 low nibbles (first 4 bytes) into a (high nibble, low nibble) word
-- assumes that only the low nibble of both arguments can be non-zero. 
packNibbles :: Word8 -> Word8 -> Word8
packNibbles w1 w2 = (shiftL w1 4) .|. w2 

packEven w = packBits w [0, 2, 4, 6]

packUneven w = packBits w [1, 3, 5, 7]

-- packBits 254 [0, 2, 4, 6] = 14 
-- packBits 254 [1, 3, 5, 7] = 15
packBits :: Word8 -> [Int] -> Word8
packBits w is = foldr (.|.) 0 $ map (packBit w) is

-- packBit 255 0 = 1
-- packBit 255 1 = 1
-- packBit 255 2 = 2
-- packBit 255 3 = 2
-- packBit 255 4 = 4
-- packBit 255 5 = 4
-- packBit 255 6 = 8
-- packBit 255 7 = 8
packBit :: Word8 -> Int -> Word8
packBit w i = shiftR (w .&. 2^i) ((i `div` 2) + (i `mod` 2))

-- sort of like map, but halves the list in size by consuming two elements. 
-- Is there a clearer way to write this with built-in function?
consumePairs :: (a -> a -> b) -> [a] -> [b]
consumePairs f (x : x' : xs) = f x x' : consumePairs f xs
consumePairs _ [] = []
consumePairs _ _ = error "list must contain even number of elements"

Godfearing answered 4/9, 2011 at 12:7 Comment(8)

Nice! That's a good solution. May you allow me to incorporate this into my algorithm? I noticed that you swapped the order of the bits (i.e. 1111 1111 0000 0000 turns into 0000 1111 0000 1111 as opposed to 1111 0000 1111 0000), but that's a quick fix. I agree that Chris' solution will constitute a really good test model in quickcheck. – Blandish 4/9, 2011 at 13:55

@Blandish : fixed. I didn't think clearly. Somehow I got the idea that the LSB of the first byte would be the first bit in the sequence., so I interpreted the 10101010 as 85 instead of 170. – Godfearing 4/9, 2011 at 15:37

Down in the footer of this page is a notice, that user-submitted content is licenced under a Creative Commons Attribution-ShareAlike 3.0 Unported licence, which means that you are free to use (privately and commercially), share and remix the content, on the following conditions: if you use the work, you must attribute the author and if any derivative work is released, it must be under a similar licence. I hereby waive these conditions, so you are free to use the code as you will. – Godfearing 4/9, 2011 at 15:48

I'm sorry, but I think it was I who mixed up the MSB/LSB in my example above, as compared to the reprensation in Haskell. It appears your original version was absolutely correct :) I think I have shuffled too many bits in my head the last day. But now I'm too confused to be sure of anything anymore :D – Blandish 4/9, 2011 at 15:54

@hakoja: If you use the code, you should consider any honor code your school might have. This video discusses the Stanford honor code, starting at about 34:30. – Godfearing 4/9, 2011 at 15:59

Thanks for your help, I really appreciate it. With regards to your code, I will of course not use it as-is, but I hope I could use some of the underlying algorithmic ideas when devoloping my own version. I will happily give you full credit for this work. To this end I would need your full name if you want that attributed, as opposed to your online profile only. – Blandish 4/9, 2011 at 16:34

let us continue this discussion in chat – Blandish 4/9, 2011 at 16:38

@hakoja. Actually, I still think I was wrong. I sat down and worked your 16-bit example from the comment through with pen and paper, and I think the example is correct, according to the spcification, regardless of whether the left bit in the sequence is assigned index 0 or 15. And when you interpret 8 bits as a byte, with the leftmost being the MSB, then packWords [255, 0] should be [240, 240], corresponding to 1111 0000 1111 0000. This is what the code gives now, whereas before it gave [15, 15]. – Godfearing 4/9, 2011 at 16:48

T

4

this should work:

import Data.List
import Data.Function

map fst $ sortBy (compare `on` snd) $ zip yourList $ cycle [0,1]

A bit of explanation: As sortBy preserve the original order, we can pair each value at an even position a "0" and each value at an odd position a "1", then we simply sort on the second value of the pair. So all values at even positions will be placed before the values at odd positions but their order will be kept.

Chris

Thomasson answered 4/9, 2011 at 6:24 Comment(3)

This function presumes you're working with a list of bit values, which may not be practical because of the required size (at least a Word8 to store each bit) and speed (probably much slower than bit operations). – Shafer 4/9, 2011 at 9:44

@Chris: Thank you for your answer, it's really clever. Still I don't see how I can massage a ByteString down to a list of '1's and '0's in a time and space efficient manner (as mentioned by John L). I feel that the tools for working on a bit-level of several bytes at once is somewhat lacking in Haskell. I know of the Data.Bits module as linked to further up, but I doesn't provide exactly what I need in this case. I will try to figure out som other approaches as well and see how they compare to your solution. Maybe an unboxed array of Word8 might do the trick? – Blandish 4/9, 2011 at 11:31

I like it. It is, if not very efficient, then pretty obviously correct. Depending on the application, it may not matter, if it takes a longer time to compute. Even if you need more efficiency, this is a really good model for to test against in quickcheck. – Godfearing 4/9, 2011 at 12:29

G

4

If you want an efficient implementation, I don't think you can avoid working with bytes. Here is an example solution. It assumes that there is always an even number of bytes in the ByteString. I'm not very familiar with unboxing or strictness tweaking, but I think these would be necessary if you want to be very efficient.

import Data.ByteString (pack, unpack, ByteString)
import Data.Bits
import Data.Word

-- the main attraction
packString :: ByteString -> ByteString
packString = pack . packWords . unpack

-- main attraction equivalent, in [Word8]
packWords :: [Word8] -> [Word8]
packWords ws = evenPacked ++ unevenPacked
    where evenBits = map packEven ws
          unevenBits = map packUneven ws
          evenPacked = consumePairs packNibbles evenBits
          unevenPacked = consumePairs packNibbles unevenBits

-- combines 2 low nibbles (first 4 bytes) into a (high nibble, low nibble) word
-- assumes that only the low nibble of both arguments can be non-zero. 
packNibbles :: Word8 -> Word8 -> Word8
packNibbles w1 w2 = (shiftL w1 4) .|. w2 

packEven w = packBits w [0, 2, 4, 6]

packUneven w = packBits w [1, 3, 5, 7]

-- packBits 254 [0, 2, 4, 6] = 14 
-- packBits 254 [1, 3, 5, 7] = 15
packBits :: Word8 -> [Int] -> Word8
packBits w is = foldr (.|.) 0 $ map (packBit w) is

-- packBit 255 0 = 1
-- packBit 255 1 = 1
-- packBit 255 2 = 2
-- packBit 255 3 = 2
-- packBit 255 4 = 4
-- packBit 255 5 = 4
-- packBit 255 6 = 8
-- packBit 255 7 = 8
packBit :: Word8 -> Int -> Word8
packBit w i = shiftR (w .&. 2^i) ((i `div` 2) + (i `mod` 2))

-- sort of like map, but halves the list in size by consuming two elements. 
-- Is there a clearer way to write this with built-in function?
consumePairs :: (a -> a -> b) -> [a] -> [b]
consumePairs f (x : x' : xs) = f x x' : consumePairs f xs
consumePairs _ [] = []
consumePairs _ _ = error "list must contain even number of elements"

Godfearing answered 4/9, 2011 at 12:7 Comment(8)

Nice! That's a good solution. May you allow me to incorporate this into my algorithm? I noticed that you swapped the order of the bits (i.e. 1111 1111 0000 0000 turns into 0000 1111 0000 1111 as opposed to 1111 0000 1111 0000), but that's a quick fix. I agree that Chris' solution will constitute a really good test model in quickcheck. – Blandish 4/9, 2011 at 13:55

@Blandish : fixed. I didn't think clearly. Somehow I got the idea that the LSB of the first byte would be the first bit in the sequence., so I interpreted the 10101010 as 85 instead of 170. – Godfearing 4/9, 2011 at 15:37

Down in the footer of this page is a notice, that user-submitted content is licenced under a Creative Commons Attribution-ShareAlike 3.0 Unported licence, which means that you are free to use (privately and commercially), share and remix the content, on the following conditions: if you use the work, you must attribute the author and if any derivative work is released, it must be under a similar licence. I hereby waive these conditions, so you are free to use the code as you will. – Godfearing 4/9, 2011 at 15:48

I'm sorry, but I think it was I who mixed up the MSB/LSB in my example above, as compared to the reprensation in Haskell. It appears your original version was absolutely correct :) I think I have shuffled too many bits in my head the last day. But now I'm too confused to be sure of anything anymore :D – Blandish 4/9, 2011 at 15:54

@hakoja: If you use the code, you should consider any honor code your school might have. This video discusses the Stanford honor code, starting at about 34:30. – Godfearing 4/9, 2011 at 15:59

Thanks for your help, I really appreciate it. With regards to your code, I will of course not use it as-is, but I hope I could use some of the underlying algorithmic ideas when devoloping my own version. I will happily give you full credit for this work. To this end I would need your full name if you want that attributed, as opposed to your online profile only. – Blandish 4/9, 2011 at 16:34

let us continue this discussion in chat – Blandish 4/9, 2011 at 16:38

@hakoja. Actually, I still think I was wrong. I sat down and worked your 16-bit example from the comment through with pen and paper, and I think the example is correct, according to the spcification, regardless of whether the left bit in the sequence is assigned index 0 or 15. And when you interpret 8 bits as a byte, with the leftmost being the MSB, then packWords [255, 0] should be [240, 240], corresponding to 1111 0000 1111 0000. This is what the code gives now, whereas before it gave [15, 15]. – Godfearing 4/9, 2011 at 16:48

V

3

Unless performance is critical, I'd recommend using a bit vector representation for a project like this. As you've discovered, randomly accessing individual bits is something of a pain when they're in packed form, but Data.Vector provides a wealth of functions for these sorts of tasks.

import Data.Bits
import qualified Data.Vector as V

type BitVector = V.Vector Bool

unpack :: (Bits a) => a -> BitVector
unpack w = V.generate (bitSize w) (testBit w)

pack :: (Bits a) => BitVector -> a
pack v = V.ifoldl' set 0 v
  where
    set w i True = w `setBit` i
    set w _ _    = w

mkPermutationVector :: Int -> V.Vector Int
mkPermutationVector d = V.generate (2^d) b
  where
    b i | i < 2^(d-1) = 2*i
        | otherwise   = let i' = i-2^(d-1)
                        in 2*i'+1

permute :: Int -> BitVector -> BitVector
permute d v = V.backpermute v (mkPermutationVector d)

Notice how this lets you specify the permutation by closely transcribing the mathematical description. This substantially reduces the likelihood of errors, and is more pleasant to write than bit-twiddly code.

To test with your example vector (in base 10):

*Main> import Data.Word
*Main Data.Word> let permute16 = pack . permute 4 . unpack :: Word16 -> Word16
*Main Data.Word> permute16 43690
65280

Now, by moving to bit vectors as your representation, you lose a lot of what you get for free by using Haskell types, such as Num instances. However, you can always implement the Num operations for your representation; here's a start:

plus :: BitVector -> BitVector -> BitVector
plus as bs = V.tail sums
  where
    (sums, carries) = V.unzip sumsAndCarries
    sumsAndCarries  = V.scanl' fullAdd (False, False) (V.zip as bs)
    fullAdd (_, cin) (a, b) = ((a /= b) /= cin
                              , (a && b) || (cin && (a /= b)))

You may also find Levent Erkok's sbv package useful, although I'm not sure it exposes a function as convenient as backpermute for your particular question.

Update: I thought this was a fun question to answer, so I went ahead and fleshed the code out a bit as a library: bit-vector.

Vaporous answered 4/9, 2011 at 18:27 Comment(1)

I had never really looked at Data.Vector before, but that backpermute makes this solution look really good. – Godfearing 4/9, 2011 at 18:56

Recommended topics

Hot tags