Does Python have a bitfield type?
C

12

58

I need a compact representation of an array of booleans. Does Python have a built-in bitfield type, or will I need to find an alternate solution?

Charlottetown answered 27/9, 2008 at 2:47 Comment(1)
For cases in which the term is ambiguous, I take it that you want the sorts of features available in C bit fields, or as described here? en.wikipedia.org/wiki/Bit_fieldIona
C
33

Bitarray was the best answer I found when I recently had a similar need. It's a C extension (so it is much faster than BitVector, which is pure Python) and stores its data in an actual bitfield (so it's eight times more memory efficient than a NumPy boolean array, which appears to use a byte per element).
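
The eight-fold figure follows from packing eight flags into each byte instead of spending a whole byte per flag. A rough, stdlib-only sketch of that arithmetic (it uses neither bitarray nor NumPy, and `pack_bits` is a hypothetical helper written for this illustration):

```python
def pack_bits(flags):
    """Pack an iterable of truthy/falsy values into a bytearray, 8 per byte."""
    flags = list(flags)
    packed = bytearray((len(flags) + 7) // 8)  # round up to whole bytes
    for i, flag in enumerate(flags):
        if flag:
            packed[i // 8] |= 1 << (i % 8)  # bit i lives in byte i // 8
    return packed


flags = [i % 3 == 0 for i in range(8000)]
one_byte_each = bytearray(flags)  # one byte per boolean (0 or 1)
one_bit_each = pack_bits(flags)   # one bit per boolean

print(len(one_byte_each), len(one_bit_each))
```

The packed form needs 1000 bytes where the byte-per-element form needs 8000; a C extension gives the same layout with much faster access.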

Cardie answered 27/9, 2008 at 8:20 Comment(3)
Is BitArray available for install on Windows?Pretentious
It looks like BitArray is readily available for installation on Linux but nothing on the page suggests a PIP installation for Windows. Bummer...Pretentious
Good old Christoph Gohlke I say windows bitarray build :) The site might say "Unofficial Windows Binaries for Python Extension Packages" but I've used umpteen packages and never once had a problem.Darwen
I
53

If you mainly want to be able to name your bit fields and easily manipulate them, e.g. to work with flags represented as single bits in a communications protocol, then you can use the standard Structure and Union features of ctypes, as described in the Stack Overflow question "How Do I Properly Declare a ctype Structure + Union in Python?"

For example, to work with the 4 least-significant bits of a byte individually, just name them from least to most significant in a LittleEndianStructure. You use a union to provide access to the same data as a byte or int so you can move the data in or out of the communication protocol. In this case that is done via the flags.asbyte field:

import ctypes
c_uint8 = ctypes.c_uint8

class Flags_bits(ctypes.LittleEndianStructure):
    _fields_ = [
            ("logout", c_uint8, 1),
            ("userswitch", c_uint8, 1),
            ("suspend", c_uint8, 1),
            ("idle", c_uint8, 1),
        ]

class Flags(ctypes.Union):
    _fields_ = [("b", Flags_bits),
                ("asbyte", c_uint8)]

flags = Flags()
flags.asbyte = 0xc

print(flags.b.idle)
print(flags.b.suspend)
print(flags.b.userswitch)
print(flags.b.logout)

The four bits (which I've printed here starting with the most significant, which seems more natural when printing) are 1, 1, 0, 0, i.e. 1100 in binary, which is 0xc in hexadecimal.
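
Moving data in and out of the protocol could look roughly like this. It is a sketch that assumes the same four-flag layout as above (the class is renamed `FlagsBits` here) and uses `from_buffer_copy`, which ctypes provides on structure and union types:

```python
import ctypes


class FlagsBits(ctypes.LittleEndianStructure):
    # Same layout as above: four 1-bit fields, least significant first.
    _fields_ = [
        ("logout", ctypes.c_uint8, 1),
        ("userswitch", ctypes.c_uint8, 1),
        ("suspend", ctypes.c_uint8, 1),
        ("idle", ctypes.c_uint8, 1),
    ]


class Flags(ctypes.Union):
    _fields_ = [("b", FlagsBits), ("asbyte", ctypes.c_uint8)]


# Incoming direction: build the union from a received byte.
received = Flags.from_buffer_copy(b"\x0c")
print(received.b.idle, received.b.suspend, received.b.logout)

# Outgoing direction: set named bits, then read back the packed byte.
outgoing = Flags()
outgoing.b.logout = 1
outgoing.b.suspend = 1
print(hex(outgoing.asbyte))
```

Bit-field layout is ultimately decided by the platform's C compiler conventions, so it is worth checking the packed byte on the target system before relying on it.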

Iona answered 14/7, 2012 at 6:2 Comment(1)
Struct packing can occasionally produce unexpected results: github.com/python/cpython/pull/19850#issuecomment-869410686Sternum
G
16

You should take a look at the bitstring module, which has recently reached version 4.0. The binary data is compactly stored as a byte array and can be easily created, modified and analysed.

You can create bitstring objects from binary, octal, hex, integers (big or little endian), strings, bytes, floats, files and more.

from bitstring import BitArray, BitStream, pack
a = BitArray('0xed44')
b = BitArray('0b11010010')
c = BitArray(int=100, length=14)
d = BitArray('uintle:16=55, 0b110, 0o34')
e = BitArray(bytes=b'hello')
f = pack('<2H, bin:3', 5, 17, '001')

You can then analyse and modify them with simple functions or slice notation - no need to worry about bit masks etc.

a.prepend('0b110')
if '0b11' in b:
    c.reverse()
g = a.join([b, d, e])
g.replace('0b101', '0x3400ee1')
if g[14]:
    del g[14:17]
else:
    g[55:58] = 'uint11=33, int9=-1'

There is also a concept of a bit position, so that you can treat it like a file or stream if that's useful to you. Properties are used to give different interpretations of the bit data.

g = BitStream(g)
w = g.read(10).uint
x, y, z = g.readlist('int4, int4, hex32')
if g.peek(8) == '0x00':
    g.pos += 10

Plus there's support for the standard bit-wise binary operators, packing, unpacking, endianness and more. The latest version is for Python 3.7 and later, and it is reasonably well optimised in terms of memory and speed.

Gluttonous answered 15/10, 2009 at 20:43 Comment(3)
I like that one! A little more intuitive than bitarray for me. Thanks!Plebe
You should reveal yourself as the author of the bitstring moduleLanugo
@shrewmouse: Well it's not something I'm hiding exactly - I'm using my real name on S.O. and I do mention it's my module in a number of other answers here, and of course in the module itself and its documentation, plus it's in my bio here on S.O. which means it's even in the tooltip for my profile picture in this answer.Gluttonous
D
10

Represent each of your values as a power of two:

testA = 2**0
testB = 2**1
testC = 2**3

Then to set a value true:

table = table | testB

To set a value false:

table = table & (~testC)

To test for a value:

bitfield_length = 0xff
if (table & testB & bitfield_length) != 0:
    print("Field B set")

Dig a little deeper into hexadecimal representation if this doesn't make sense to you. This is basically how you keep track of your boolean flags in an embedded C application as well (if you have limited memory).
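
The same pattern can also be written with the standard library's `enum.IntFlag`, which attaches names to the powers of two. This is an alternative sketch rather than part of the approach above; the class name `Field` and its members are hypothetical and simply mirror testA, testB and testC:

```python
from enum import IntFlag


class Field(IntFlag):
    # Hypothetical names mirroring testA / testB / testC above.
    A = 1 << 0
    B = 1 << 1
    C = 1 << 3


table = Field(0)    # start with no flags set
table |= Field.B    # set B
table |= Field.C    # set C
table &= ~Field.C   # clear C again

if Field.B in table:
    print("Field B set")
print(int(table))
```

Because `IntFlag` members are still integers, `int(table)` gives the raw value to store or transmit.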

Dentoid answered 5/11, 2008 at 15:30 Comment(1)
Great answer. I like and dislike that it's manual at the same time. There isn't a more logical way to manually construct a bitfield class though.Schooling
A
7

I use the binary bit-wise operators ~, &, |, ^, >>, and <<. They work really well and are implemented directly in the underlying C, which usually maps directly onto the underlying hardware.
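
As a quick illustration of what each operator does when a plain int is used as a bitfield (the starting value and bit positions are arbitrary choices for this sketch):

```python
flags = 0b0101

flags |= 1 << 1          # OR sets bit 1
flags &= ~(1 << 0)       # AND with an inverted mask clears bit 0
flags ^= 1 << 3          # XOR toggles bit 3
bit2 = (flags >> 2) & 1  # shift right, then mask, to read bit 2

print(bin(flags), bit2)
```

This prints `0b1110 1`: bits 1, 2 and 3 end up set, and bit 2 reads back as 1.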

Alpestrine answered 27/9, 2008 at 13:26 Comment(0)
F
5

The BitVector package may be what you need. It's not built into my Python installation, but it is easy to track down on the Python site.

https://pypi.python.org/pypi/BitVector for the current version.

Firstling answered 27/9, 2008 at 3:3 Comment(0)
I
4

NumPy has an array interface module that you can use to make a bitfield.

Immateriality answered 27/9, 2008 at 2:52 Comment(1)
The built-in array module is sufficient for a bit-array too, and more portable (across Python impls) than NumPy.Koski
L
2

If your bitfield is short, you can probably use the struct module. Otherwise I'd recommend some sort of a wrapper around the array module.

Also, the ctypes module does contain bitfields, but I've never used it myself. Caveat emptor.
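
Since struct works at byte granularity, one way to use it for a short bitfield is to keep the flags in a plain int and let struct handle only the conversion to and from bytes. A minimal sketch, with hypothetical flag names and a 16-bit big-endian word chosen as an assumption:

```python
import struct

LOGOUT = 1 << 0   # hypothetical flag positions
SUSPEND = 1 << 2
IDLE = 1 << 3

word = LOGOUT | IDLE                    # build the flags in a plain int

wire = struct.pack(">H", word)          # 16-bit big-endian bytes
(decoded,) = struct.unpack(">H", wire)  # back to an int on the other side

print(wire.hex(), bool(decoded & IDLE), bool(decoded & SUSPEND))
```

The individual bits are still read and written with masks; struct only fixes the width and byte order.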

Lowbred answered 27/9, 2008 at 8:41 Comment(1)
But it seems that the struct module represents each bit as a char or byte, so it doesn't really handle bit fields as normally defined (where bits are packed tightly together in memory).Iona
S
2

I needed a minimal, memory-efficient bitfield with no external dependencies. Here it is:

import math

class Bitfield:
    def __init__(self, size):
        self.bytes = bytearray(math.ceil(size / 8))

    def __getitem__(self, idx):
        return self.bytes[idx // 8] >> (idx % 8) & 1

    def __setitem__(self, idx, value):
        mask = 1 << (idx % 8)
        if value:
            self.bytes[idx // 8] |= mask
        else:
            self.bytes[idx // 8] &= ~mask

Use:

# if size is not a multiple of 8, actual size will be the next multiple of 8
bf = Bitfield(1000)
bf[432] # 0
bf[432] = 1
bf[432] # 1
Symposium answered 1/12, 2020 at 14:48 Comment(0)
R
1

If you want to use ints (or long ints) to represent arrays of bools (or sets of integers), take a look at http://sourceforge.net/projects/pybitop/files/

It provides insert/extract of bitfields into long ints; finding the most-significant, or least-significant '1' bit; counting all the 1's; bit-reversal; stuff like that which is all possible in pure python but much faster in C.
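
For reference, pure-Python versions of those operations might look like the sketch below. The function names are hypothetical and are not pybitop's API; they only show what each operation computes:

```python
def lowest_set_bit(x):
    """Index of the least-significant 1 bit of a positive int."""
    return (x & -x).bit_length() - 1


def highest_set_bit(x):
    """Index of the most-significant 1 bit of a positive int."""
    return x.bit_length() - 1


def count_ones(x):
    """Number of 1 bits in a non-negative int."""
    return bin(x).count("1")


def reverse_bits(x, width):
    """Reverse the lowest `width` bits of x (x must fit in `width` bits)."""
    return int(format(x, f"0{width}b")[::-1], 2)


def extract_field(x, start, width):
    """Read `width` bits of x starting at bit `start`."""
    return (x >> start) & ((1 << width) - 1)


def insert_field(x, start, width, value):
    """Return x with the `width`-bit field at `start` replaced by `value`."""
    mask = ((1 << width) - 1) << start
    return (x & ~mask) | ((value << start) & mask)


x = 0b10110100
print(lowest_set_bit(x), highest_set_bit(x), count_ones(x))
print(bin(reverse_bits(x, 8)), extract_field(x, 2, 3), bin(insert_field(x, 0, 2, 0b11)))
```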

Repetend answered 16/9, 2010 at 21:1 Comment(0)
S
0

For mostly-consecutive bits there's the https://pypi.org/project/range_set/ module, which is API-compatible with Python's built-in set. As the name implies, it stores the bits as begin/end pairs.
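
To show why begin/end pairs are compact for long runs, here is a small stdlib-only sketch of the idea. `RangeBits` is a hypothetical class written for this illustration and is not the range_set package's API:

```python
from bisect import bisect_right


class RangeBits:
    """Hypothetical sketch: set bits stored as sorted, non-overlapping,
    half-open (begin, end) runs."""

    def __init__(self, runs):
        self.runs = sorted(runs)
        self.begins = [begin for begin, _ in self.runs]

    def __contains__(self, idx):
        # Find the last run that starts at or before idx.
        pos = bisect_right(self.begins, idx) - 1
        return pos >= 0 and idx < self.runs[pos][1]

    def __len__(self):
        return sum(end - begin for begin, end in self.runs)


bits = RangeBits([(0, 1000), (5000, 5010)])
print(999 in bits, 1000 in bits, 5005 in bits, len(bits))
```

Two pairs describe 1010 set bits here, and a membership test is a binary search over the run starts.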

Scatter answered 30/12, 2018 at 21:45 Comment(0)
T
0

I had to deal with some control words / flags in a communication protocol, and my priority was that the editor suggests the flag names and jumps to a flag's definition with "F3". The code below satisfies these requirements (unfortunately, the ctypes solution by @nealmcb is not supported by the PyCharm indexer today). Suggestions welcome:

""" The following bit-manipulation functions take a tuple as input, which is provided by the Bitfield class. The construct
looks weird, but a call to setBit() looks OK and the editor (PyCharm) suggests all
possible bit names. I did not find a more elegant solution that calls the setBit() function and needs
only one argument.
Example call:
    setBit( STW1.bm01NoOff2() ) """

def setBit(TupleBitField_BitMask):
    # word = word | bit_mask
    TupleBitField_BitMask[0].word = TupleBitField_BitMask[0].word | TupleBitField_BitMask[1]


def isBit(TupleBitField_BitMask):
    # (word & bit_mask) != 0
    return (TupleBitField_BitMask[0].word & TupleBitField_BitMask[1]) !=0


def clrBit(TupleBitField_BitMask):
    #word = word & (~ BitMask)
    TupleBitField_BitMask[0].word = TupleBitField_BitMask[0].word & (~ TupleBitField_BitMask[1])


def toggleBit(TupleBitField_BitMask):
    #word = word ^ BitMask
    TupleBitField_BitMask[0].word = TupleBitField_BitMask[0].word ^ TupleBitField_BitMask[1]

""" Create a Bitfield type for each control word of the application. (e.g. 16bit length). 
Assign a name for each bit in order that the editor (e.g. PyCharm) suggests the names from outside. 
The bits are defined as methods that return the corresponding bit mask in order that the bit masks are read-only
and will not be corrupted by chance.
The return of each "bit"-function is a tuple (handle to bitfield, bit_mask) in order that they can be 
sent as arguments to the single bit manipulation functions (see above): isBit(), setBit(), clrBit(), toggleBit()
The complete word of the Bitfield is accessed from outside by xxx.word.
Examples:
    STW1 = STW1Type(0x1234) # instantiates and inits the bitfield STW1, STW1.word = 0x1234
    setBit(STW1.bm00() )    # set the bit with the name bm00(), e.g. bm00 = bitmask 0x0001
    print("STW1.word =", hex(STW1.word))
"""
class STW1Type():
    # assign names to the bit masks for each bit (these names will be suggested by PyCharm)
    #    tip: copy the application's manual description here
    def __init__(self, word):
        # word = initial value, e.g. 0x0000
        self.word = word

    # define all bits here and copy the description of each bit from the application manual. Then you can jump
    #    to this explanation with "F3"
    #    return the handle to the bitfield and the BitMask of the bit.
    def bm00NoOff1_MeansON(self):
        # 0001 0/1= ON (edge)(pulses can be enabled)
        #        0 = OFF1 (braking with ramp-function generator, then pulse suppression & ready for switching on)
        return self, 0x0001

    def bm01NoOff2(self):
        # 0002  1 = No OFF2 (enable is possible)
        #       0 = OFF2 (immediate pulse suppression and switching on inhibited)
        return self, 0x0002

    def bm02NoOff3(self):
        # 0004  1 = No OFF3 (enable possible)
        #       0 = OFF3 (braking with the OFF3 ramp p1135, then pulse suppression and switching on inhibited)
        return self, 0x0004

    def bm03EnableOperation(self):
        # 0008  1 = Enable operation (pulses can be enabled)
        #       0 = Inhibit operation (suppress pulses)
        return self, 0x0008

    def bm04RampGenEnable(self):
        # 0010  1 = Enable ramp-function generator (the ramp-function generator can be enabled)
        #       0 = Inhibit ramp-function generator (set the ramp-function generator output to zero)
        return self, 0x0010

    def b05RampGenContinue(self):
        # 0020  1 = Continue ramp-function generator
        #       0 = Freeze ramp-function generator (freeze the ramp-function generator output)
        return self, 0x0020

    def b06RampGenEnable(self):
        # 0040  1 = Enable speed setpoint
        #       0 = Inhibit setpoint (set the ramp-function generator input to zero)
        return self, 0x0040

    def b07AcknowledgeFaults(self):
        # 0080 0/1= 1. Acknowledge faults
        return self, 0x0080

    def b08Reserved(self):
        # 0100 Reserved
        return self, 0x0100

    def b09Reserved(self):
        # 0200 Reserved
        return self, 0x0200

    def b10ControlByPLC(self):
        # 0400  1 = Control by PLC
        return self, 0x0400

    def b11SetpointInversion(self):
        # 0800  1 = Setpoint inversion
        return self, 0x0800

    def b12Reserved(self):
        # 1000 Reserved
        return self, 0x1000

    def b13MotorPotiSPRaise(self):
        # 2000 1 = Motorized potentiometer setpoint raise
        return self, 0x2000

    def b14MotorPotiSPLower(self):
        # 4000 1 = Motorized potentiometer setpoint lower
        return self, 0x4000

    def b15Reserved(self):
        # 8000 Reserved
        return self, 0x8000


""" test the construction and methods """
STW1 = STW1Type(0xffff)
print("STW1.word                =", hex(STW1.word))

clrBit(STW1.bm00NoOff1_MeansON())
print("STW1.word                =", hex(STW1.word))

STW1.word = 0x1234
print("STW1.word                =", hex(STW1.word))

setBit( STW1.bm00NoOff1_MeansON() )
print("STW1.word                =", hex(STW1.word))

clrBit( STW1.bm00NoOff1_MeansON() )
print("STW1.word                =", hex(STW1.word))

toggleBit(STW1.bm03EnableOperation())
print("STW1.word                =", hex(STW1.word))

toggleBit(STW1.bm03EnableOperation())
print("STW1.word                =", hex(STW1.word))

print("STW1.bm00ON              =", isBit(STW1.bm00NoOff1_MeansON() ) )
print("STW1.bm04                =", isBit(STW1.bm04RampGenEnable()  ) )

It prints out:

STW1.word                = 0xffff
STW1.word                = 0xfffe
STW1.word                = 0x1234
STW1.word                = 0x1235
STW1.word                = 0x1234
STW1.word                = 0x123c
STW1.word                = 0x1234
STW1.bm00ON              = False
STW1.bm04                = True
Tulipwood answered 28/7, 2019 at 11:24 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.