Why would R use the "L" suffix to denote an integer?
In R, when we want to be sure we are dealing with an integer, we can specify it using the "L" suffix like this:

1L
# [1] 1

If we don't explicitly tell R we want an integer it will assume we meant to use a numeric data type...

str( 1 * 1 )
# num 1
str( 1L * 1L )
# int 1
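
A minimal sketch of the same distinction using typeof() instead of str(). The variable names x and y are only illustrative:

```r
x <- 1      # no suffix: stored as a double ("numeric")
y <- 1L     # "L" suffix: stored as an integer

typeof(x)         # "double"
typeof(y)         # "integer"
identical(x, y)   # FALSE: same value, different storage type
x == y            # TRUE: the comparison coerces the integer to double
typeof(y * 2L)    # "integer"
typeof(y * 2)     # "double": mixing in a double promotes the result
```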

Why is "L" the preferred suffix? Why not "I", for instance? Is there a historical reason?

In addition, why does R allow me to do (with warnings):

str(1.0L)
# int 1
# Warning message:
# integer literal 1.0L contains unnecessary decimal point 

But not:

str(1.1L)
# num 1.1
#Warning message:
#integer literal 1.1L contains decimal; using numeric value 

I'd expect both to return an error.

Guardafui answered 22/6, 2014 at 11:21 Comment(2)
For the first part of your question ("Why is "L" used as a suffix"): in this answer I refer to a thread where William Dunlap and Brian Ripley discuss "a possible explanation of why the letter L".Polynuclear
@Polynuclear thanks for (both) the links. Your answer there is also nice and informative.Dianoia

Why is "L" used as a suffix?

I've never seen it written down, but my theory, in short, is that there are two reasons:

  1. Because R handles complex numbers, which may be specified using the suffix "i", and an "I" suffix would be too similar.

  2. Because R's integers are 32-bit long integers and "L" therefore appears to be sensible shorthand for referring to this data type.
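
A small sketch of the first point, assuming a reasonably recent R. It shows that the lowercase "i" suffix is already taken by complex literals, while an uppercase "I" suffix is not accepted by the parser at all. The helper name parses_ok is hypothetical and exists only for this illustration:

```r
# The "i" suffix creates a complex number
z <- 1i
typeof(z)   # "complex"
z * z       # -1+0i

# Hypothetical helper: does a piece of source text parse without error?
parses_ok <- function(src) {
  !inherits(tryCatch(parse(text = src), error = function(e) e), "error")
}

parses_ok("1i")   # TRUE:  complex literal
parses_ok("1L")   # TRUE:  integer literal
parses_ok("1I")   # FALSE: a number followed by the symbol I is a syntax error
```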

The range of values a long integer can take depends on the word size. R does not natively support integers with a word length of 64 bits. Integers in R are signed and have a word length of 32 bits, so they range from −2,147,483,647 to 2,147,483,647 (the bit pattern for −2,147,483,648 is reserved for NA_integer_). Larger values are stored as double.

This wiki page has more information on common data types, their conventional names and ranges.

And also from ?integer

Note that current implementations of R use 32-bit integers for integer vectors, so the range of representable integers is restricted to about +/-2*10^9: doubles can hold much larger integers exactly.
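
The limits described above can be checked directly. This is a minimal sketch, and the variable names are only illustrative:

```r
# Largest value an R integer can hold: 2^31 - 1
int_max <- .Machine$integer.max
int_max                 # 2147483647
typeof(int_max)         # "integer"

# Integer arithmetic past the limit overflows to NA (R also warns)
overflowed <- suppressWarnings(int_max + 1L)
is.na(overflowed)       # TRUE

# The most negative usable integer is -int_max; one below it is NA,
# because that bit pattern is reserved for NA_integer_
int_min <- -int_max
below_min <- suppressWarnings(int_min - 1L)
is.na(below_min)        # TRUE

# Without the "L" suffix the same value is simply a double
big <- 2147483648
typeof(big)             # "double"
big == int_max + 1      # TRUE: adding a double 1 promotes to double first
```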


Why do 1.0L and 1.1L return different types?

1.0L and 1.1L return different data types because returning an integer for 1.1 would lose information, whilst for 1.0 it would not (but you might want to know you no longer have a floating-point numeric). Buried deep within the lexical analyser (/src/main/gram.c:4463-4485) is this code, part of the function NumericValue(), which creates an int data type from a double input that is suffixed by an ASCII "L":

/* Make certain that things are okay. */
if(c == 'L') {
    double a = R_atof(yytext);
    int b = (int) a;
    /* We are asked to create an integer via the L, so we check that the
       double and int values are the same. If not, this is a problem and we
       will not lose information and so use the numeric value.
    */
    if(a != (double) b) {
        if(GenerateCode) {
            if(seendot == 1 && seenexp == 0)
                warning(_("integer literal %s contains decimal; using numeric value"), yytext);
            else {
                /* hide the L for the warning message */
                *(yyp-2) = '\0';
                warning(_("non-integer value %s qualified with L; using numeric value"), yytext);
                *(yyp-2) = (char)c;
            }
        }
        asNumeric = 1;
        seenexp = 1;
    }
}
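
The effect of that check can be observed from R itself. The following is only a sketch: the helper literal_type is hypothetical, and it parses each literal from text so that the parser warnings can be suppressed while the resulting type is inspected:

```r
# Hypothetical helper: parse a literal from text and report its storage type.
# The warnings discussed above are raised at parse time, so parse() is wrapped too.
literal_type <- function(src) {
  typeof(suppressWarnings(eval(parse(text = src))))
}

literal_type("1L")     # "integer"
literal_type("1.0L")   # "integer": the value survives the double -> int round trip
literal_type("1.1L")   # "double":  the round trip would lose the .1
literal_type("1e3L")   # "integer": 1000 is a whole number
literal_type("1e-3L")  # "double":  0.001 is not
```

The two "double" cases correspond to the two warning() branches in the C code: the first for a literal with a decimal point and no exponent, the second for everything else.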
Guardafui answered 22/6, 2014 at 11:23 Comment(3)
The relevant section of the R language definition doesn't seem to say anything more enlightening.Messalina
I think 64-bit integers are supported in the new version of R.Furey
@Furey I don't think so. I quoted from R 3.1.0. These would be double types as stated in the answer. See e.g. here. An integer uses the C int type, which is 32 bits long. Larger whole numbers are now also representable in double-precision vectors using the double data type.Dianoia

Probably because R is written in C, and L is used for a (long) integer in C.

Algorism answered 13/4, 2022 at 0:17 Comment(0)
