What is the reason for explicitly declaring L or UL for long values?

From an example:

unsigned long x = 12345678UL;

We have always learnt that the compiler needs to see only "long" in the above example to set aside 4 bytes (on a 32-bit system) of memory. The question is: why should we use L/UL in long constants even after declaring the variable to be a long?

Djokjakarta answered 30/10, 2012 at 8:10 Comment(4)
Do you mean why UL is used instead of L or instead of nothing?Servitude
@NikosChantziaras No. Why do we need to use L/UL in long values.Djokjakarta
@Patrick This question is for C. That question is for C++ and the accepted answer is about overloading. This is not a duplicate of that.Basipetal
Pascal is correct, this question is not a duplicate of the linked question. Nominating to reopen.Frosting
B
96

When a suffix L or UL is not used, the compiler uses the first type from a list that can contain the constant (see the C99 standard, clause 6.4.4:5, for details; for a decimal constant, the list is int, long int, long long int).

As a consequence, most of the time it is not necessary to use the suffix: it does not change the meaning of the program. It does not change the meaning of your example initialization of x for most architectures, although it would if you had chosen a number that could not be represented as a long long. See also codebauer's answer for an example where the U part of the suffix is necessary.


There are a couple of circumstances when the programmer may want to set the type of the constant explicitly. One example is when using a variadic function:

printf("%lld", 1LL); // correct, because 1LL has type long long
printf("%lld", 1);   // undefined behavior, because 1 has type int

A common reason to use a suffix is ensuring that the result of a computation doesn't overflow. Two examples are:

long x = 10000L * 4096L;
unsigned long long y = 1ULL << 36;

In both examples, without suffixes, the constants would have type int and the computation would be made as int. In each example this incurs a risk of overflow: the product 40960000 does not fit in a 16-bit int, and 1 << 36 is undefined with a 32-bit int. Using the suffixes means that the computation will be done in a larger type instead, which has sufficient range for the result.

As Lightness Races in Orbit puts it, the literal's suffix comes before the assignment. In the two examples above, simply declaring x as long and y as unsigned long long is not enough to prevent the overflow in the computation of the expressions assigned to them.


Another example is the comparison x < 12U where variable x has type int. Without the U suffix, the compiler types the constant 12 as an int, and the comparison is therefore a comparison of signed ints.

int x = -3;
printf("%d\n", x < 12); // prints 1 because it's true that -3 < 12

With the U suffix, the comparison becomes a comparison of unsigned ints. “Usual arithmetic conversions” mean that -3 is converted to a large unsigned int:

printf("%d\n", x < 12U); // prints 0 because (unsigned int)-3 is large

In fact, the type of a constant may even change the result of an arithmetic computation, again because of the way “usual arithmetic conversions” work.


Note that, for decimal constants, the list of types suggested by C99 does not contain unsigned long long. In C90, the list ended with the largest standardized unsigned integer type at the time (which was unsigned long). A consequence was that the meaning of some programs was changed by adding the standard type long long to C99: the same constant that was typed as unsigned long in C90 could now be typed as a signed long long instead. I believe this is the reason why in C99, it was decided not to have unsigned long long in the list of types for decimal constants. See this and this blog posts for an example.

Basipetal answered 30/10, 2012 at 8:28 Comment(11)
Small addition: it can also improve readability and hint about the suggested usage in some cases. E.g. you might have something like #define MY_DEFINE 123456789UL and you use MY_DEFINE later in the code. Naturally, it doesn't have type associated with it so UL addition may be of little help here.Gallice
Even though the compiler can pick the size of a numeric literal, it doesn't automatically determine whether it's signed or not. For example, 18446744073709551615 is treated as -1L on systems with a 64-bit long. You have to explicitly use UL.Servitude
@NikosChantziaras Perhaps at the same time you were writing your comment, I was expanding on the case of the list of types for decimal constants not containing any unsigned types, with a theory for the reason.Basipetal
Another fairly common scenario where type suffixes are needed is bit shifts: 1 << 36 is probably UB, 1ULL << 36 is safe. Perhaps worth adding to the list of examples.Cicisbeo
@DanielFischer I have grouped that with caf's multiplication example.Basipetal
IMHO it would be justified to mention the *_C macros in stdint.h in this answer as well. It's already quite verbose ;) E.g. UINT64_C(x) produces a literal of value x with the right suffix to make its type uint64_t - thus there is no need for specific suffixes for the stdint.h data types.Maldives
@Maldives You can put this information in your own answer if you think it is useful. Re-reading the question, I think it's a digression. There is nothing in the question that indicates that the OP wants to know about the _C macro. Thanks for your comment.Basipetal
@pascal Cuoq: I am getting correct result for this: signed long long var3 = 2147483648+2; In this case both operands in RHS are ints and result is int and must overflow right? But result is 2147483650. This was using C++11.Costate
@Costate No, 2147483648 is not an intBasipetal
@pascal Cuoq : I think I got it. Default integer type literal could be int, long or long long depending on the value of the literal. In the case of unsigned long var1=4294967299*2; 4294967299 is considered as long long. 2 is promoted to long long and hence 4294967299*2 = 8589934598. Since LHS is unsigned long, truncation occurs and hence Result = 8589934598 - 4294967296 - 4294967296 = 6. Hope I am right.Costate
@Costate You are entirely correct. On most architectures, 4294967299*2 is a well-defined expression, of type long or long long (long if long is 64-bit). On the other hand 2000000000*3 is an expression of type int that contains undefined behavior, because 2000000000 is typed as int and the multiplication overflows. It's a funny language.Basipetal
L
21

Because numerical literals are typically of type int. The UL/L suffix tells the compiler that they are not of type int, e.g. assuming a 32-bit int and a 64-bit long:

long i = 0xffff;
long j = 0xffffUL;

Here the values on the right must be converted to signed longs (32bit -> 64bit)

  1. The "0xffff", an int, would be converted to a long using sign extension, resulting in a negative value (0xffffffff)
  2. The "0xffffUL", an unsigned long, would be converted to a long, resulting in a positive value (0x0000ffff)
Lotic answered 30/10, 2012 at 8:40 Comment(6)
Never thought about the printf. I work lots with arm, and have seen some 'interesting' vargs problems...Lotic
I believe there is an example in there, but the details seem slightly off: 1) Hexadecimal constants are typed from another list that includes unsigned types 2) 0xffff is too small to set the sign bit on a 32-bit int 3) if a positive constant does not fit in a signed type without setting the sign bit, the next type in the list is tried. I tried to make a verified example, but I couldn't find the right constants.Basipetal
@PascalCuoq Disagree that this is a nice example. with C99, 0xffff is an int with the value of 65,535. Assigning that to i is not an issue. 0xffffUL is an unsigned long with the value of 65,535. Assigning that to j is also not an issue. Had this example been long i = 0xffffffff; 0xffffffff is an unsigned with the value of 4,294,967,295 and assigning that to 64-long is not an issue. Also a non-issue with long j = 0xffffffffUL; This answer's #1 "converted to a long using sign extension" is not true here.Blanding
@chux You should argue with someone who said it were then. Why did you mention me? Please stop.Basipetal
To answer your query, I mentioned you in this comment in response to your comment.Blanding
@chux “Hey, in a 2012 discussion you said that an example of weird C behavior was nice at 8:42 before relenting and explaining why the example doesn't work at 9:50. Please let me explain to you now in 2016 at great length all the things that you have already shown you understand four years ago” “No thanks”Basipetal
E
14

The question is: why should we use L/UL in long constants even after declaring the variable to be a long?

Because it's not "after"; it's "before".

First you have the literal, then it is converted to the type of the variable you're trying to squeeze it into.

They are two objects. The type of the target is designated by the unsigned long keywords, as you've said. The type of the source is designated by this suffix because that's the only way to specify the type of a literal.

Embrasure answered 30/10, 2012 at 12:47 Comment(2)
I wish we could write LUL instead of ULL :)Oslo
@Oslo the C standard really missed a golden opportunity there. Maybe C23 can change that.Neddy
B
2

Related to this question: why use a u suffix?

A reason for u is to allow an integer constant greater than LLONG_MAX in decimal form.

// Likely to generate a warning.
unsigned long long limit64bit = 18446744073709551615; // 2^64 - 1

// OK
unsigned long long limit64bit = 18446744073709551615u;
Blanding answered 26/10, 2016 at 15:8 Comment(1)
