How does GMP store its integers on an arbitrary number of bytes?

2^64 is still far from the "infinity" my RAM/hard drive can handle...

First I wonder how GMP works with memory/processor since it does some kind of shady optimisations...

I was also wondering if there is a way of storing an integer (unsigned, it's easier) on an arbitrary number of bytes. For example, on 50 bytes, I would have a cap of 2^400 - 1. The key is to handle carries properly so that the number stays consistent from one byte to the next. I have some ideas about that, but I'm really not sure it would be the fastest way to do this. I'm not even sure if I'm right.

I'm guessing GMP stores its data in a similar way, but I just want some (even little) explanation or some pointers to the theory (I don't have a doctorate, so don't be tough).

Sizable asked 13/7, 2010 at 23:11

GMP dynamically allocates space to represent numbers (and reallocates when it needs to grow).

This is described in light detail in the "Integer Internals" section of the GMP manual, which explains how the representation is chunked into "limbs" and how the limbs are stored in an array.

The description of the term "limbs" comes from GMP Basics: Nomenclature and Types:

A limb means the part of a multi-precision number that fits in a single word. (We chose this word because a limb of the human body is analogous to a digit, only larger, and containing several digits.) Normally a limb contains 32 or 64 bits. The C data type for a limb is mp_limb_t.

So, GMP represents an integer by grouping a number of limbs together to hold its magnitude. The sign is not stored as a separate bit: the size field (_mp_size) holds the number of limbs in use, and it is negated when the integer is negative.
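To make that layout concrete, here is a minimal sketch in C. It is not GMP's source: the names `bigint`, `limb_t`, `bigint_reserve` and friends are hypothetical, chosen only to mirror the description above (an allocation count, a signed size, and a pointer to a limb array that grows with `realloc`).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t limb_t;   /* hypothetical stand-in for mp_limb_t */

/* Hypothetical layout, loosely following the description above. */
typedef struct {
    int     alloc;   /* number of limbs currently allocated           */
    int     size;    /* limbs in use; negative for a negative value   */
    limb_t *d;       /* limb array, least significant limb first      */
} bigint;

/* Start as zero with no storage. */
void bigint_init(bigint *x) {
    x->alloc = 0;
    x->size = 0;
    x->d = NULL;
}

/* Make room for at least n limbs. Returns 0 on success, -1 on failure. */
int bigint_reserve(bigint *x, int n) {
    assert(n >= 0);
    if (n <= x->alloc) return 0;               /* already big enough */
    limb_t *p = realloc(x->d, (size_t)n * sizeof *p);
    if (p == NULL) return -1;
    x->d = p;
    x->alloc = n;
    return 0;
}

/* Store a small signed value: one limb of magnitude, sign kept in size. */
int bigint_set_i64(bigint *x, int64_t v) {
    if (v == 0) { x->size = 0; return 0; }     /* zero uses no limbs */
    if (bigint_reserve(x, 1) != 0) return -1;
    /* Negate in unsigned arithmetic so INT64_MIN does not overflow. */
    x->d[0] = v < 0 ? (limb_t)0 - (limb_t)v : (limb_t)v;
    x->size = v < 0 ? -1 : 1;
    return 0;
}

/* -1, 0 or +1, read straight from the sign of the size field. */
int bigint_sign(const bigint *x) {
    return (x->size > 0) - (x->size < 0);
}

/* Number of limbs in use, regardless of sign. */
int bigint_limbs(const bigint *x) {
    return x->size < 0 ? -x->size : x->size;
}

/* Release the storage and reset to zero. */
void bigint_clear(bigint *x) {
    free(x->d);
    bigint_init(x);
}
```

With this sketch, storing -5 leaves `size == -1` and `d[0] == 5`: the magnitude lives in the limbs and the sign costs no extra storage.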

What does this mean to you? Well, normally an int64 is represented in 64 bits. Done. If you package a bunch of these together, you can extend the range significantly. Put two together and you get 2^64 * 2^64 = 2^128 possible values. Add two more limbs and you get 2^256. That's a lot of numbers, stored in 4 words (plus the representation overhead).
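The carry handling the question asks about can be sketched on such a limb array. The function below is an illustration only, with a hypothetical name (`limbs_add`); it assumes 64-bit limbs stored least significant first and is not GMP's tuned routine (GMP's low-level layer has its own, such as mpn_add_n).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add two n-limb unsigned numbers, least significant limb first.
   Writes the n low limbs of the sum to r and returns the carry out
   of the top limb (0 or 1). */
uint64_t limbs_add(uint64_t *r, const uint64_t *a, const uint64_t *b, size_t n) {
    assert(n == 0 || (r != NULL && a != NULL && b != NULL));
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t s = a[i] + carry;   /* wraps modulo 2^64 on overflow   */
        uint64_t c1 = s < carry;     /* 1 if adding the carry wrapped   */
        s += b[i];
        uint64_t c2 = s < b[i];      /* 1 if adding b[i] wrapped        */
        r[i] = s;
        carry = c1 | c2;             /* at most one of the two is set   */
    }
    return carry;
}
```

For example, adding the two-limb numbers {2^64 - 1, 0} and {1, 0} gives {0, 1}, that is 2^64, with no carry out. The same loop works with `uint8_t` limbs for the 50-byte case in the question; wider limbs simply mean fewer iterations.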

Of course, the representation of floats is more complicated (see "Float Internals" in the GMP manual): a float is stored as a mantissa (consisting of a sign and magnitude) and an exponent.

Bryon answered 14/7, 2010 at 0:27. Comments (3):
"If you package a bunch of these together": by "package", do you mean just taking one left and one right word and multiplying the more significant one by 2^64? So to display a number stored in four 64-bit words, GMP multiplies the most significant word by 2^192, the next by 2^128, the next by 2^64, and adds the least significant word as is. - Sizable
@Sizable: Effectively, yes. Of course, it can't actually do the multiplication and store the result in a 64-bit word :) So, to print the decimal value of a GMP integer, you would use the gmp_printf routine, which converts those limbs into a character string. - Bryon
FWIW, note that Knuth, in his "The Art Of Computer Programming" books, which are probably quite a lot older than GMP, also calls them limbs, so I assume that he coined the term, and not the authors of GMP (who probably have read the book and taken inspiration from it). - Jacobah
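The conversion from limbs to a decimal string discussed in the comments above can be sketched with schoolbook long division by 10. This is only an illustration with hypothetical names; it uses 32-bit limbs so that each intermediate value fits in a standard 64-bit integer, and it makes no claim to match how GMP performs the conversion internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Divide an n-limb number (32-bit limbs, least significant first) by 10
   in place and return the remainder, i.e. its lowest decimal digit. */
uint32_t limbs_divrem10(uint32_t *a, size_t n) {
    uint64_t rem = 0;
    for (size_t i = n; i-- > 0; ) {            /* most significant limb first */
        uint64_t cur = (rem << 32) | a[i];     /* rem < 10, so this fits      */
        a[i] = (uint32_t)(cur / 10);
        rem = cur % 10;
    }
    return (uint32_t)rem;
}

/* Returns 1 if every limb is zero, 0 otherwise. */
int limbs_is_zero(const uint32_t *a, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (a[i] != 0) return 0;
    return 1;
}

/* Write the decimal representation of a into buf (NUL-terminated).
   Destroys a. Returns 0 on success, -1 if buf is too small. */
int limbs_to_decimal(uint32_t *a, size_t n, char *buf, size_t buflen) {
    assert(buf != NULL);
    size_t len = 0;
    do {
        if (len + 1 >= buflen) return -1;      /* need room for digit + NUL */
        buf[len++] = (char)('0' + limbs_divrem10(a, n));
    } while (!limbs_is_zero(a, n));
    /* Digits were produced least significant first, so reverse them. */
    for (size_t i = 0, j = len - 1; i < j; i++, j--) {
        char t = buf[i];
        buf[i] = buf[j];
        buf[j] = t;
    }
    buf[len] = '\0';
    return 0;
}
```

Called on the three 32-bit limbs {0, 0, 1}, which encode 2^64, this yields the string "18446744073709551616". Each digit costs one pass over all the limbs, so the sketch is quadratic in the number's length.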
