I am fairly new to C programming, and I encountered bit masking. What is the general concept and function of bit masking?
Examples are much appreciated.
A mask defines which bits you want to keep, and which bits you want to clear.
Masking is the act of applying a mask to a value. This is accomplished with a bitwise operation such as AND, OR, or XOR.
Below is an example of extracting a subset of the bits in the value:
Mask: 00001111b
Value: 01010101b
Applying the mask to the value means that we want to clear the first (higher) 4 bits, and keep the last (lower) 4 bits. Thus we have extracted the lower 4 bits. The result is:
Mask: 00001111b
Value: 01010101b
Result: 00000101b
Masking is implemented using AND, so in C we get:
uint8_t stuff(...) {
    uint8_t mask = 0x0f;   // 00001111b
    uint8_t value = 0x55;  // 01010101b
    return mask & value;
}
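As a small sketch of the same idea, the two helpers below extract the low and the high 4 bits of a byte. The function names `low_nibble` and `high_nibble` are made up for illustration and are not part of the answer above.

```c
#include <stdint.h>

/* Hypothetical helper: keep only the low 4 bits.
   The mask 0x0f is 00001111 in binary. */
uint8_t low_nibble(uint8_t value) {
    return value & 0x0f;
}

/* Hypothetical helper: keep only the high 4 bits (mask 0xf0 is 11110000),
   then shift them down so they occupy bit positions 0..3. */
uint8_t high_nibble(uint8_t value) {
    return (value & 0xf0) >> 4;
}
```

With the example value 0x55 (01010101), both helpers yield 0x05.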
Here is a fairly common use-case: Extracting individual bytes from a larger word. We define the high-order bits in the word as the first byte. We use two operators for this, & and >> (shift right). This is how we can extract the four bytes from a 32-bit integer:
void more_stuff(uint32_t value) {             // Example value: 0x01020304
    uint32_t byte1 = (value >> 24);           // 0x01020304 >> 24 is 0x01 so
                                              // no masking is necessary
    uint32_t byte2 = (value >> 16) & 0xff;    // 0x01020304 >> 16 is 0x0102 so
                                              // we must mask to get 0x02
    uint32_t byte3 = (value >> 8) & 0xff;     // 0x01020304 >> 8 is 0x010203 so
                                              // we must mask to get 0x03
    uint32_t byte4 = value & 0xff;            // here we only mask, no shifting
                                              // is necessary
    ...
}
Notice that you could switch the order of the operators above: you could first apply the mask, then do the shift. The results are the same, but now you would have to use a different mask:
uint32_t byte3 = (value & 0xff00) >> 8;
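To make the equivalence of the two orderings concrete, here is a minimal sketch with two hypothetical helpers (the names and the 0-based `index` parameter are assumptions for this example). Index 0 is the high-order byte and index 3 the low-order byte; `index` must be in the range 0..3.

```c
#include <stdint.h>

/* Shift first, then mask: the mask is always 0xff. */
uint8_t get_byte_shift_then_mask(uint32_t value, unsigned index) {
    unsigned shift = 8u * (3u - index);
    return (uint8_t)((value >> shift) & 0xffu);
}

/* Mask first, then shift: the mask has to be moved to the byte's position. */
uint8_t get_byte_mask_then_shift(uint32_t value, unsigned index) {
    unsigned shift = 8u * (3u - index);
    uint32_t mask = (uint32_t)0xff << shift;
    return (uint8_t)((value & mask) >> shift);
}
```

For the value 0x01020304, both functions return 0x01, 0x02, 0x03, and 0x04 for indices 0 through 3.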
#define MASK 0x000000FF .... my_uint32_t &= ~MASK. – Sansculotte
b to indicate binary literal is not supported by all compilers, correct? – Gopherwood
value & 0x0f does not contain any assignment so value does not change. – Canonist
Masking means to keep, change, or remove a desired part of information. Consider an image-masking operation, for example one that removes everything in a picture that is not skin.
That example is an AND operation. The other masking operators are OR and XOR.
Bitmasking means imposing a mask over bits. Here is bitmasking with AND:
    1 1 1 0 1 1 0 1   input
(&) 0 0 1 1 1 1 0 0   mask
------------------------------
    0 0 1 0 1 1 0 0   output
So, only the middle four bits (as these bits are 1 in this mask) remain.
Let's see this with XOR:
    1 1 1 0 1 1 0 1   input
(^) 0 0 1 1 1 1 0 0   mask
------------------------------
    1 1 0 1 0 0 0 1   output
Now, the middle four bits are flipped (1 became 0, 0 became 1).
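The operations above might be written in C roughly as follows. This is only a sketch; the helper names are invented for this example, and the OR and AND-with-inverted-mask variants are added for completeness.

```c
#include <stdint.h>

/* AND keeps the bits selected by the mask and clears all others. */
uint8_t keep_bits(uint8_t value, uint8_t mask) {
    return value & mask;
}

/* XOR flips the bits selected by the mask and leaves the others alone. */
uint8_t toggle_bits(uint8_t value, uint8_t mask) {
    return value ^ mask;
}

/* OR sets the bits selected by the mask and leaves the others alone. */
uint8_t set_bits(uint8_t value, uint8_t mask) {
    return value | mask;
}

/* AND with the inverted mask clears the selected bits. */
uint8_t clear_bits(uint8_t value, uint8_t mask) {
    return value & (uint8_t)~mask;
}
```

With the input 11101101 (0xED) and the mask 00111100 (0x3C), `keep_bits` gives 00101100 (0x2C) and `toggle_bits` gives 11010001 (0xD1), matching the two diagrams.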
So, using a bitmask, we can access individual bits. Sometimes this technique is also used to improve performance. Take this for example:
bool isOdd(int i) {
    return i%2;
}
This function tells whether an integer is odd or even. We can get the same result using a bitmask, which can be more efficient, although an optimizing compiler will often generate the same code for both:
bool isOdd(int i) {
    return i&1;
}
Short Explanation: If the least significant bit of a binary number is 1 then it is odd; for 0 it will be even. So, by doing AND with 1 we are removing all other bits except for the least significant bit, i.e.:
 55 ->      0 0 1 1 0 1 1 1   input
(&) 1 ->    0 0 0 0 0 0 0 1   mask
---------------------------------------
  1 <-      0 0 0 0 0 0 0 1   output
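For comparison, here are both variants side by side as a self-contained sketch. The names `is_odd_mod` and `is_odd_mask` are made up for this example. The masked version relies on negative numbers being stored in two's complement, which is an assumption that holds on essentially all current platforms.

```c
#include <stdbool.h>

/* Remainder-based check; the explicit comparison also handles
   negative odd numbers, where i % 2 is -1. */
bool is_odd_mod(int i) {
    return i % 2 != 0;
}

/* Mask-based check: keeps only the least significant bit. */
bool is_odd_mask(int i) {
    return (i & 1) != 0;
}
```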
Bits and Bytes
In computing, numbers are internally represented in binary. This means that when you use an integer type for a variable, the value is actually represented internally as a sequence of zeros and ones.
As you might know, a single bit represents one 0 or one 1. A concatenation of eight of those bits represents a byte, e.g. 00000101, which is the number 5.
I presume you know how numbers are represented in binary.
In PHP, an integer is 4 or 8 bytes long depending on the platform, so your number actually uses 32 or 64 bits of internal storage. For simplicity, throughout this answer I will use 8-bit numbers.
Storing states in bits
Now imagine you want to create a program that holds a state, which is based on multiple values that are each one (true) or zero (false). One could store these values in different variables, whether booleans or integers. Or one could instead use a single integer variable and let each of its bits represent a different true or false value.
An example: 00000101. Here the first bit (reading from right to left) is true, which represents the first variable. The 2nd is false, which represents the 2nd variable. The third is true. And so on...
This is a very compact way of storing data and has many uses.
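A minimal C sketch of this storage scheme could look like the following. The function names are hypothetical, and the first boolean is assumed to go into the rightmost bit, as in the example above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Packs up to 8 booleans into one byte; flags[0] goes into bit 0
   (the rightmost bit). Entries beyond the eighth are ignored. */
uint8_t pack_flags(const bool flags[], unsigned count) {
    uint8_t state = 0;
    for (unsigned i = 0; i < count && i < 8; i++) {
        if (flags[i]) {
            state |= (uint8_t)(1u << i);
        }
    }
    return state;
}

/* Reads back the boolean stored at position index (0..7). */
bool flag_at(uint8_t state, unsigned index) {
    return ((state >> index) & 1u) != 0;
}
```

Packing the values true, false, true produces the state 00000101 (5).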
Bit Masking
This is where bit masking comes in. It sounds complex but actually it's very simple.
Bit masking allows you to use operations that work on bit-level.
You actually apply a mask to a value, where in our case the value is our state 00000101 and the mask is again a binary number, which indicates the bits of interest.
By performing bitwise operations on the mask and the state one could achieve the following:
If we want to set a particular value to true, we could do this by using the OR operator and the following bit mask:
Mask:   10000000b
Value:  00000101b
---- OR ---------
Result: 10000101b
Or one could select a particular value from the state by using the AND operator:
Mask:   00001100b
Value:  00000101b
---- AND --------
Result: 00000100b
I suggest you take a deeper look into it and get familiar with the jargon.
Good luck!
It is just a number, as represented in binary. For example, let's say I have 8 boolean values (true or false) that I want to store. I could store them as an array of 8 booleans, or I could store them as a single byte (8 bits), each bit of which stores one of the booleans (0 = false, 1 = true).
At this point, I can easily manipulate my byte so that I can (1) set specific bits to be on or off (true or false), and (2) check whether specific bits are on or off.
To set a bit on: mask = mask | (1 << bitIndex)
To set a bit off: mask = mask & ~(1 << bitIndex)
To check whether a bit is on: (mask & (1 << bitIndex)) != 0
All of these operations use the left-shift operator, which moves bits up from least-significant to most-significant positions.
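Wrapped into C functions, the three expressions could look like this sketch. The function names are invented for this example, a 32-bit mask is assumed, and `bitIndex` must be in the range 0..31.

```c
#include <stdbool.h>
#include <stdint.h>

/* Turns the bit at bitIndex on. */
uint32_t set_bit(uint32_t mask, unsigned bitIndex) {
    return mask | (UINT32_C(1) << bitIndex);
}

/* Turns the bit at bitIndex off. */
uint32_t clear_bit(uint32_t mask, unsigned bitIndex) {
    return mask & ~(UINT32_C(1) << bitIndex);
}

/* Reports whether the bit at bitIndex is on. */
bool test_bit(uint32_t mask, unsigned bitIndex) {
    return (mask & (UINT32_C(1) << bitIndex)) != 0;
}
```

Using an unsigned 1 for the shift avoids shifting into the sign bit of a signed int when `bitIndex` is 31.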
In essence, a bitmask is a list of boolean flags (for example isAlive, isMoving, etc.) compressed into a single field, usually an integer. It can significantly reduce JSON string size or memory footprint.
This can be significant especially in PHP where a single boolean in an array can take the same amount of RAM as an integer. There is a very simple Bitmask guide that will explain step by step everything you need to know including how and when to use it.
Edit: Here's an archive link: https://archive.is/QwCUX, since the original site seems to be down.
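As a rough illustration in C, named flags are commonly declared as constants with one bit each. The flags `IS_ALIVE` and `IS_MOVING` mirror the examples mentioned above; `IS_VISIBLE`, the `entity_flags` type, and the helper functions are assumptions made up for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Each flag occupies its own bit. */
enum {
    IS_ALIVE   = 1u << 0,
    IS_MOVING  = 1u << 1,
    IS_VISIBLE = 1u << 2   /* hypothetical extra flag */
};

typedef uint8_t entity_flags;

/* Returns f with the given flags switched on. */
entity_flags flags_with(entity_flags f, entity_flags which) {
    return (entity_flags)(f | which);
}

/* Returns f with the given flags switched off. */
entity_flags flags_without(entity_flags f, entity_flags which) {
    return (entity_flags)(f & ~which);
}

/* True only if every flag in which is set in f. */
bool flags_has(entity_flags f, entity_flags which) {
    return (f & which) == which;
}
```

Three booleans stored this way fit in one byte, and serializing the field produces a single small number instead of three named values.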
I will take a stab at the meaning of "mask" when applied in computing to data structures other than bit arrays, and will use a few examples of how masks might apply. Bit masks work the way they do because they have a fixed size, or a presumed value of 0 for missing bits, so they use AND/OR masking (see @DJanssens' great answer).
First, an analogy to illustrate: start with an image of your face. When you look in the mirror (square on), you see your face: your cheeks, forehead, jaw, eyes, etc.
Now you put on a mask (search for images of "Venetian half mask" if you don't know what a mask might look like).
What do you see in the mirror? You see the mask wherever it covers what is beneath, and you see your skin, jaw, or whatever else is not covered. A mask obscures what it covers, but the result is still the size of the base.
In computing, we can also do masking across many data types, for example objects, arrays, and trees. Masks can be applied additively (adding values from the mask that are missing in the base) or only on the intersection of keys (meaning that if a key is not in the base, the mask's entry for it is ignored).
Eg 1 Objects
Base =
{
    "A": "baseA",
    "B": "baseB",
}
Mask =
{
    "A": "maskA",
    "C": "maskC",
}
==>
{
    "A": "maskA", // Masked / overridden by the mask
    "B": "baseB", // Pass through from base because not masked
    "C": "maskC", // Added because present in the mask (null/undefined in base)
}
Eg 2 Arrays
Assume null is a canary value, not a valid value
base = ["base1", "base2"]
mask = [null, "mask2", "mask3"]
==> ["base1", "mask2", "mask3"]
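The array case can be sketched in C as follows, using NULL as the pass-through value. The function name `apply_array_mask` and its signature are assumptions for this example and do not come from any library.

```c
#include <stddef.h>

/* Overlays mask onto base and writes the result to out.
   A NULL entry in mask means "pass the base value through".
   out must have room for max(base_len, mask_len) entries.
   Returns the number of entries written. */
size_t apply_array_mask(const char *const base[], size_t base_len,
                        const char *const mask[], size_t mask_len,
                        const char *out[]) {
    size_t out_len = base_len > mask_len ? base_len : mask_len;
    for (size_t i = 0; i < out_len; i++) {
        const char *from_base = i < base_len ? base[i] : NULL;
        const char *from_mask = i < mask_len ? mask[i] : NULL;
        /* The mask wins wherever it has a value. */
        out[i] = from_mask != NULL ? from_mask : from_base;
    }
    return out_len;
}
```

This is the additive variant: entries that exist only in the mask are appended to the result.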
Eg 3 Trees
Assume null is a canary value, not a valid value
Base =
        B
      /
    A
      \
        C
Mask =
            E
          /
        D
      /
    null
Output =
            E    // Added by presence in mask
          /
        D        // Override base value because present in mask
      /
    A            // Passthrough from base because `null` in mask means pass through
      \
        C        // pass through because undefined in mask
Eg 4 Strings
My colleague informed me of another key use of the word "mask" in computing, which applies to strings, often in logs. The mask is usually not explicitly defined as an input value; it is more like a function that redacts sensitive data.
base = "[INFO] user logged in with [email protected] password=123456"
mask(base) ==> "[INFO] user logged in with [email protected] password=******"
(Sometimes the masked output deliberately does not match the length of the original, so that the length of the masked item is hidden as well.)
see: https://github.com/gwpmad/mask-deep as an example of this usage of the word mask.
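A very small C sketch of this kind of redaction is shown below. It is not how the linked library works; the function name `mask_password`, the key `password=`, and the use of a space as the end of the value are all assumptions for illustration.

```c
#include <string.h>

/* Overwrites, in place, the value that follows "password=" with '*'
   characters, up to the next space or the end of the string.
   Lines without the key are left unchanged. */
void mask_password(char *line) {
    static const char key[] = "password=";
    char *p = strstr(line, key);
    if (p == NULL) {
        return;  /* nothing to redact */
    }
    for (p += strlen(key); *p != '\0' && *p != ' '; p++) {
        *p = '*';
    }
}
```

This version keeps the length of the redacted value visible; writing a fixed number of asterisks into a separate output buffer would hide the length as well.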