How do pack() and unpack() work in Ruby?

irb(main):003:0> n = [ 65, 66, 67 ]
=> [65, 66, 67]
irb(main):004:0> n.pack("ccc")
=> "ABC"
irb(main):005:0> n.pack("C")
=> "A"
irb(main):006:0> n.pack("CCC")
=> "ABC"
irb(main):007:0> n.pack("qqq")
=> "A\x00\x00\x00\x00\x00\x00\x00B\x00\x00\x00\x00\x00\x00\x00C\x00\x00\x00\x00\x00\x00\x00"
irb(main):008:0> n.pack("QQQ")
=> "A\x00\x00\x00\x00\x00\x00\x00B\x00\x00\x00\x00\x00\x00\x00C\x00\x00\x00\x00\x00\x00\x00"
irb(main):009:0> n.pack("SSS")
=> "A\x00B\x00C\x00"
irb(main):010:0> n.pack("sss")
=> "A\x00B\x00C\x00"

Your question comes down to the fundamental principles of how computers store numbers in memory. These articles are a good place to learn more:

http://en.wikipedia.org/wiki/Computer_number_format#Binary_Number_Representation
http://en.wikipedia.org/wiki/Signed_number_representations

As an example, take the difference between S and s. Both pack and unpack 16-bit numbers, but S is for unsigned integers and s is for signed integers. This matters when you want to unpack the string back into the original integers.

S: 16-bit unsigned integer, range 0 to 65535 (0 to 2^16 - 1)
s: 16-bit signed integer, range -32768 to 32767 (-(2^15) to 2^15 - 1), because one bit is used for the sign

The difference can be seen here:

# S = unsigned: negative numbers cannot be represented, so -1 wraps around to 65535
> [-1, 65535, 32767, 32768].pack('SSSS').unpack('SSSS')
=> [65535, 65535, 32767, 32768]

# s = signed: numbers outside the range -32768 to 32767 wrap around (65535 becomes -1, 32768 becomes -32768)
> [-1, 65535, 32767, 32768].pack('ssss').unpack('ssss')
=> [-1, -1, 32767, -32768]

So to understand what pack and unpack do, you have to know how numbers are represented in computer memory. A signed type uses its most significant bit to indicate the sign (negative values are stored in two's complement), which halves the positive range. An unsigned type does not need this bit and can use it for magnitude, but it cannot represent negative numbers.

These are the basics of how numbers are represented as binary in computer memory.
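To make the signed/unsigned point concrete, here is a minimal sketch showing that the packed bytes are identical and only the interpretation differs. The helper names same_bytes? and reinterpret_as_signed16 are made up for this illustration, and unpack1 assumes Ruby 2.4 or later.

```ruby
# Hypothetical helper: do a signed and an unsigned 16-bit value
# produce the same two bytes when packed?
def same_bytes?(signed_value, unsigned_value)
  [signed_value].pack('s') == [unsigned_value].pack('S')
end

# Hypothetical helper: pack as unsigned 16-bit, then read the same
# bytes back as a signed 16-bit integer.
def reinterpret_as_signed16(unsigned_value)
  [unsigned_value].pack('S').unpack1('s')
end

puts same_bytes?(-1, 65535)             # true
puts reinterpret_as_signed16(65535)     # -1
puts reinterpret_as_signed16(32768)     # -32768

# The bit patterns (big-endian, most significant bit first):
puts [-1].pack('s>').unpack1('B16')     # 1111111111111111
puts [32767].pack('s>').unpack1('B16')  # 0111111111111111
```

The leading bit is 1 for the negative value and 0 for the largest positive value, which is the "one bit used for the sign" described above.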

One reason you need packing is to send numbers as a byte stream from one computer to another (for example, over a network connection). You have to pack your integers into bytes before they can be sent over a stream. The other option is to send the numbers as strings; then you encode and decode them as strings on both ends instead of packing and unpacking.
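As a rough sketch of the byte-stream case, the code below packs a small message and reads it back. The wire format (a 2-byte id, a 4-byte length, then the payload) and the method names encode_message and decode_message are invented for this example, and a StringIO stands in for a real socket. The n and N directives are 16-bit and 32-bit unsigned integers in network (big-endian) byte order.

```ruby
require 'stringio'

# Hypothetical wire format: 2-byte id, 4-byte payload length, payload bytes.
def encode_message(id, payload)
  [id, payload.bytesize].pack('nN') + payload.b
end

# Reads one message from any IO-like object. No handling of short
# reads or end of stream; this is only a sketch.
def decode_message(io)
  id, length = io.read(6).unpack('nN')
  [id, io.read(length)]
end

wire = encode_message(7, 'hello')
p wire.bytes                          # [0, 7, 0, 0, 0, 5, 104, 101, 108, 108, 111]
p decode_message(StringIO.new(wire))  # [7, "hello"]
```

Both ends have to agree on the format string, which is why the explicit network-order directives are used here rather than the native-order S and L.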

Or let's say you need to call a C function in a system library from Ruby. System libraries written in C operate on basic integers (int, unsigned int, long, short, etc.) and C structures (struct). You will need to convert your Ruby integers into system integers or C structures before calling such functions. In those cases pack and unpack can be used to interface with them.
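A minimal sketch of the C structure case follows. The struct point layout, the POINT_FORMAT constant, and the helper names are hypothetical; the sketch assumes the compiler adds no padding, which is plausible here because every field sits at its natural alignment. It only builds and reads the byte buffer; actually passing it to a C function would need something like Fiddle or FFI, which is not shown.

```ruby
# Hypothetical C declaration this buffer is meant to match:
#   struct point { int16_t x; int16_t y; uint32_t color; };
# 's' = signed 16-bit, 'L' = unsigned 32-bit, both in native byte order.
POINT_FORMAT = 'ssL'

def pack_point(x, y, color)
  [x, y, color].pack(POINT_FORMAT)
end

def unpack_point(bytes)
  bytes.unpack(POINT_FORMAT)
end

buffer = pack_point(-5, 12, 0xFF00FF)
puts buffer.bytesize    # 8
p unpack_point(buffer)  # [-5, 12, 16711935]
```

Native byte order is used on purpose, since the C code on the same machine reads the fields in that order.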

Regarding the additional directives: they deal with endianness, that is, the byte order of the packed sequence. See here for what endianness means and how it works:

http://en.wikipedia.org/wiki/Endianness

In simplified terms, it tells the packing method in which order the bytes of each integer should be written:

# Big endian
> [34567].pack('S>').bytes.map(&:to_i)
=> [135, 7]   
# 34567 = (135 * 2^8) + 7

# Little endian
> [34567].pack('S<').bytes.map(&:to_i)
=> [7, 135]   
# 34567 = 7 + (135 * 2^8)
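The same idea extends to wider integers. The sketch below uses a made-up helper, bytes_for, to compare byte orders for a 32-bit value, and shows what happens when the reader assumes the wrong order. The n and v directives are the fixed big-endian and little-endian 16-bit equivalents of S> and S<.

```ruby
# Hypothetical helper: the bytes produced by packing one integer.
def bytes_for(value, directive)
  [value].pack(directive).bytes
end

p bytes_for(0x12345678, 'L>')  # [18, 52, 86, 120]  (0x12, 0x34, 0x56, 0x78)
p bytes_for(0x12345678, 'L<')  # [120, 86, 52, 18]

p bytes_for(34567, 'n') == bytes_for(34567, 'S>')  # true
p bytes_for(34567, 'v') == bytes_for(34567, 'S<')  # true

# Packing big-endian but unpacking little-endian gives a different number:
p [34567].pack('S>').unpack1('S<')  # 1927, i.e. 135 + (7 * 2^8)
```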
