Some questions about a single-instance array in typedef
Asked Answered
S

3

10

I was perusing some code using arbitrary-length integers using the GNU Multi-Precision (GMP) library code. The type for a MP integer is mpz_t as defined in gmp.h header file.

But, I've some questions about the lower-level definition of this library-defined mpz_t type. In the header code:

/* THIS IS FROM THE GNU MP LIBRARY gmp.h HEADER FILE */
typedef struct
{
    /* SOME OTHER STUFF HERE */
} __mpz_struct;

typedef __mpz_struct mpz_t[1];

First question: Does the [1] associate with the __mpz_struct? In other words, is the typedef defining a mpz_t type as a __mpz_struct array with one occurrence?

Second question: Why the array? (And why only one occurrence?) Is this one of those struct hacks I've heard about?

Third question (perhaps indirectly related to second question): The GMP documentation for the mpz_init_set(mpz_t, unsigned long int) function says to use it as pass-by-value only, although one would assume that this function would be modifying its contents within the called function (and thus would need pass-by-reference) syntax. Refer to my code:

/* FROM MY CODE */
mpz_t fact_val;                /* declaration */
mpz_init_set_ui(fact_val, 1);  /* Initialize fact_val */

Does the single-occurrence array enable pass-by-reference automatically (due to the breakdown of array/pointer semantics in C)? I freely admit I'm kinda over-analyzing this, but I'd certainly love any discussion on this. Thanks!

Swisher answered 29/1, 2011 at 4:14 Comment(0)
M
3

*First question: Does the [1] associate with the __mpz_struct? In other words, is the typedef defining a mpz_t type as a __mpz_struct array with one occurrence?*

Yes.

Second question: Why the array? (And why only one occurrence?) Is this one of those struct hacks I've heard about?

Beats me. Don't know, but one possibility is that the author wanted to make an object that was passed by reference automatically, or, "yes", possibly the struct hack. If you ever see an mpz_t object as the last member of a struct, then "almost certainly" it's the struct hack. An allocation looking like

malloc(sizeof(struct whatever) + sizeof(mpz_t) * some_number)`

would be a dead giveaway.

Does the single-occurrence array enable pass-by-reference automatically...?

Aha, you figured it out too. "Yes", one possible reason is to simplify pass-by-reference at the expense of more complex references.

I suppose another possibility is that something changed in the data model or the algorithm, and the author wanted to find every reference and change it in some way. A change in type like this would leave the program with the same base type but error-out every unconverted reference.

Mello answered 29/1, 2011 at 4:21 Comment(1)
Interestingly enough, the last element in the __mpz_struct is indeed a pointer (to a mp_limb_t, which itself is typedef'ed as an unsigned long long int [what a mouthful!]). I eliminated the contents in the interest of brevity.Swisher
C
5

This does not appear to be a struct hack in the sense described on C2. It appears that they want mpz_t to have pointer semantics (presumably, they want people to use it like an opaque pointer). Consider the syntactic difference between the following snippets:

struct __mpz_struct data[1];

(&data[0])->access = 1;
gmp_func(data, ...);

And

mpz_t data;

data->access = 1;
gmp_func(data, ...);

Because C arrays decay into pointers, this also allows for automatic pass by reference for the mpz_t type.

It also allows you to use a pointer-like type without needing to malloc or free it.

Castello answered 29/1, 2011 at 4:29 Comment(1)
Wouldn't struct __mpz_struct data; data.access = 1; gmp_func(&data) illustrate better the intention? Because actually both code examples as now are the same apart from using the typedef one time and the underlying type the other one...Volplane
M
3

*First question: Does the [1] associate with the __mpz_struct? In other words, is the typedef defining a mpz_t type as a __mpz_struct array with one occurrence?*

Yes.

Second question: Why the array? (And why only one occurrence?) Is this one of those struct hacks I've heard about?

Beats me. Don't know, but one possibility is that the author wanted to make an object that was passed by reference automatically, or, "yes", possibly the struct hack. If you ever see an mpz_t object as the last member of a struct, then "almost certainly" it's the struct hack. An allocation looking like

malloc(sizeof(struct whatever) + sizeof(mpz_t) * some_number)`

would be a dead giveaway.

Does the single-occurrence array enable pass-by-reference automatically...?

Aha, you figured it out too. "Yes", one possible reason is to simplify pass-by-reference at the expense of more complex references.

I suppose another possibility is that something changed in the data model or the algorithm, and the author wanted to find every reference and change it in some way. A change in type like this would leave the program with the same base type but error-out every unconverted reference.

Mello answered 29/1, 2011 at 4:21 Comment(1)
Interestingly enough, the last element in the __mpz_struct is indeed a pointer (to a mp_limb_t, which itself is typedef'ed as an unsigned long long int [what a mouthful!]). I eliminated the contents in the interest of brevity.Swisher
P
3

The reason for this comes from the implementation of mpn. Specifically, if you're mathematically inclined you'll realise N is the set of natural numbers (1,2,3,4...) whereas Z is the set of integers (...,-2,-1,0,1,2,...).

Implementing a bignum library for Z is equivalent to doing so for N and taking into account some special rules for sign operations, i.e. keeping track of whether you need to do an addition or a subtraction and what the result is.

Now, as for how a bignum library is implemented... here's a line to give you a clue:

typedef unsigned int        mp_limb_t;
typedef mp_limb_t *     mp_ptr;

And now let's look at a function signature operating on that:

__GMP_DECLSPEC mp_limb_t mpn_add __GMP_PROTO ((mp_ptr, mp_srcptr, mp_size_t, mp_srcptr,mp_size_t));

Basically, what it comes down to is that a "limb" is an integer field representing the bits of a number and the whole number is represented as a huge array. The clever part is that gmp does all this in a very efficient, well optimised manner.

Anyway, back to the discussion. Basically, the only way to pass arrays around in C is, as you know, to pass pointers to those arrays which effectively enables pass by reference. Now, in order to keep track of what's going on, two types are defined, a mp_ptr which is an array of mp_limb_t big enough to store your number, and mp_srcptr which is a const version of that, so that you cannot accidentally alter the bits of the source bignums on what you are operating. The basic idea is that most of the functions follow this pattern:

func(ptr output, src in1, src in2)

etc. Thus, I suspect mpz_* functions follow this convention simply to be consistent and it is because that is how the authors are thinking.

Short version: Because of how you have to implement a bignum lib, this is necessary.

Pentadactyl answered 29/1, 2011 at 8:6 Comment(2)
Well, I was more curious about the language/syntax specifics rather than this implementation's technique, but I still like your answer. +1. Thanks! P.S. I've also perused the source code for GNU bc command-line arbitrary-precision calculator in seeing how they do big numbers programmatically, so this isn't too foreign to me.Swisher
@Swisher I did some brief experimental stuff on mpir (a gmp fork) which I never committed, which is why I know a bit about it, just thought I'd share it anyway for added info.Pentadactyl

© 2022 - 2024 — McMap. All rights reserved.