Basically, the idea behind a 7-bit encoded Int32 is to reduce the number of bytes required for small values. It works like this:
- The 7 least significant bits of the original value are taken and written into the low 7 bits of the first byte.
- If the value does not fit into these 7 bits, the 8th (highest) bit of that byte is set to 1, indicating that another byte has to be read. Otherwise that bit is 0 and reading ends here.
- When reading, the low 7 bits of the next byte are shifted left by 7 bits and ORed with the previously read value to combine them. Again, the 8th bit of this byte indicates whether another byte must be read (each further byte is shifted left by 7 more bits, that is by 14, 21 and finally 28).
- This continues until a maximum of 5 bytes has been read (because even Int32.MaxValue would not require more than 5 bytes when only 1 bit is stolen from each byte). If the highest bit of the 5th byte is still set, you've read something that isn't a 7-bit encoded Int32.
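The steps above can be sketched in a few lines. This is a minimal Python illustration of the scheme as described, not the .NET source; the function names `write_7bit_encoded_int` and `read_7bit_encoded_int` are made up for this example, and bits in the 5th byte that would overflow 32 bits are simply masked off here, which is an assumption of this sketch.

```python
def write_7bit_encoded_int(value: int) -> bytes:
    """Encode a signed 32-bit integer as 1 to 5 bytes."""
    if not -2**31 <= value < 2**31:
        raise ValueError("value does not fit into an Int32")
    # Reinterpret the value as unsigned 32-bit, like a cast to uint.
    remaining = value & 0xFFFFFFFF
    out = bytearray()
    while remaining >= 0x80:
        # Low 7 bits plus the continuation bit (8th bit set).
        out.append((remaining & 0x7F) | 0x80)
        remaining >>= 7
    # Last byte: continuation bit clear.
    out.append(remaining)
    return bytes(out)


def read_7bit_encoded_int(data: bytes) -> int:
    """Decode a value produced by write_7bit_encoded_int."""
    result = 0
    for index in range(5):
        if index >= len(data):
            raise ValueError("unexpected end of data")
        byte = data[index]
        # Each byte contributes 7 bits, shifted by 0, 7, 14, 21, 28.
        result |= (byte & 0x7F) << (7 * index)
        if (byte & 0x80) == 0:
            # Assumption of this sketch: drop bits beyond 32, then
            # reinterpret the unsigned result as a signed Int32.
            result &= 0xFFFFFFFF
            return result - 2**32 if result >= 2**31 else result
    raise ValueError("not a 7-bit encoded Int32: 5th byte has its high bit set")
```

With this sketch, 300 encodes to the two bytes AC 02, and -1 encodes to the five bytes FF FF FF FF 0F.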
Note that since it is written byte-by-byte, endianness doesn't matter at all for these values. The following number of bytes are required for a given range of values:
- 1 byte: 0 to 127
- 2 bytes: 128 to 16,383
- 3 bytes: 16,384 to 2,097,151
- 4 bytes: 2,097,152 to 268,435,455
- 5 bytes: 268,435,456 to 2,147,483,647 (Int32.MaxValue) and -2,147,483,648 (Int32.MinValue) to -1
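The ranges in this list follow from each byte carrying 7 payload bits. As a rough check, here is a small Python sketch that counts the bytes for a given value; `encoded_size` is a hypothetical helper name, and negative values are treated as their unsigned 32-bit bit pattern, as described above.

```python
def encoded_size(value: int) -> int:
    """Number of bytes a 7-bit encoded Int32 needs for the given value."""
    if not -2**31 <= value < 2**31:
        raise ValueError("value does not fit into an Int32")
    # Negative values are handled via their unsigned 32-bit bit pattern.
    remaining = value & 0xFFFFFFFF
    size = 1
    while remaining >= 0x80:
        remaining >>= 7  # each byte stores 7 bits of the value
        size += 1
    return size
```

Calling it on the boundaries gives 1 for 127, 2 for 128, 4 for 268,435,455, and 5 for 268,435,456 as well as for every negative value.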
As you can see, the implementation is kinda dumb and always requires 5 bytes for negative values as the sign bit is the 32nd bit of the original value, always ending up in the 5th byte.
Thus, I do not recommend it for negative values or values above 268,435,455 (roughly 250 million). I've only seen it used internally for the length prefix of .NET strings (those you can read and write with BinaryReader.ReadString and BinaryWriter.Write), where it describes the number of encoded bytes that follow and make up the string, so it only ever holds non-negative values.
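To illustrate the string use case, here is a hedged Python sketch that reads such a length-prefixed string from a stream. `read_length_prefixed_string` is a made-up name, and the sketch assumes UTF-8 (the default encoding of BinaryWriter) and that the prefix counts encoded bytes.

```python
import io


def read_length_prefixed_string(stream, encoding: str = "utf-8") -> str:
    """Read a 7-bit encoded length prefix, then that many bytes of text."""
    length = 0
    for index in range(5):
        raw = stream.read(1)
        if not raw:
            raise EOFError("stream ended inside the length prefix")
        length |= (raw[0] & 0x7F) << (7 * index)
        if (raw[0] & 0x80) == 0:
            break
    else:
        # All 5 bytes had their continuation bit set.
        raise ValueError("length prefix is not a 7-bit encoded Int32")
    if length > 0x7FFFFFFF:
        raise ValueError("length prefix is negative as an Int32")
    payload = stream.read(length)
    if len(payload) != length:
        raise EOFError("stream ended inside the string data")
    return payload.decode(encoding)


# The prefix 0x05 says that 5 bytes of string data follow.
text = read_length_prefixed_string(io.BytesIO(b"\x05hello"))
```

Strings of up to 127 encoded bytes need only a single prefix byte, which is where the encoding pays off.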
While you can look up the original .NET source, I use different implementations in my BinaryData library.
protected until .NET 5 where they became public. – Jadeite