C++ int vs long long on a 64-bit machine
Asked Answered
M

1

27

My computer has a 64-bit processor, and when I check sizeof(int), sizeof(long), and sizeof(long long), it turns out that int and long are 32 bits while long long is 64 bits. I researched the reason, and the popular assumption that int in C++ matches the machine's word size appears to be wrong. As I understand it, it is up to the compiler to define the size, and mine is MinGW-w64.

The reason for my research was to understand whether using types smaller than the word size is beneficial for speed (for instance, short vs int) or whether it has a negative effect. On a 32-bit system, one popular opinion is that because the word size equals int, a short will be converted to int, causing additional bit shifts and so on, thus leading to worse performance. The opposing opinion is that there is a benefit at the cache level (I didn't go deep into it), and that using short is useful for saving virtual memory.

In addition to the confusion between these two views, I face another problem. My system is 64-bit, so whether I use int or short, either is smaller than the word size, and I start to wonder whether it would be more efficient to use a 64-bit long long, since that is the width the system is designed for. I also read that there is another constraint: the data model used by the OS and compiler (ILP64, LP64) defines the type sizes. Under ILP64 the default int is 64 bits, in contrast to LP64; would using an OS with ILP64 support speed up a program?

Once I started asking which type I should use to speed up my C++ program, I ran into deeper topics in which I have no expertise, and some explanations seem to contradict each other. Can you please explain:

1) Is it best practice to use long long on x64 to achieve maximum performance, even for 1-4 byte data?

2) What is the trade-off in using a type smaller than the word size (memory win vs additional operations)?

3) Can an x64 computer, where the word and int size is 64 bits, process a short using a 16-bit word size through so-called backward compatibility? Or must it place the 16-bit value into a 64-bit container, the fact that this is possible being what defines the system as backward compatible?

4) Can we force the compiler to make int 64 bits?

5) How can ILP64 be used on a PC that uses LP64?

6) What are the possible problems of using code adapted to the above issues with other compilers, OSes, and architectures (e.g. a 32-bit processor)?

Mantra answered 29/9, 2016 at 20:55 Comment(2)
Don't ever rely on standard data types to have a specific size. C++11 has fixed-width integer types for this. (Before C++11 there were compiler-specific types for that.)Ipswich
You could have a 512-bit CPU and a 16-bit int would still be 100% standard-compliant. Why anyone would do this is beyond me, but it's still legal.Massage
A
52

1) Is it best practice to use long long on x64 to achieve maximum performance, even for 1-4 byte data?

No, and it will probably in fact make your performance worse. For example, if you use 64-bit integers where 32-bit integers would have sufficed, you have just doubled the amount of data that must be sent between the processor and memory, and memory is orders of magnitude slower. All of your caches and memory buses will fill up twice as fast.

2) What is the trade-off in using a type smaller than the word size (memory win vs additional operations)?

Generally, the dominant driver of performance in a modern machine is going to be how much data needs to be stored in order to run a program. You are going to see significant performance cliffs once the working set size of your program exceeds the capacity of your registers, L1 cache, L2 cache, L3 cache, and RAM, in that order.

In addition, using a smaller data type can be a win if your compiler is smart enough to figure out how to use your processor's vector instructions (aka SSE instructions). Modern vector processing units are smart enough to cram eight 16-bit short integers into the same space as two 64-bit long long integers, so you can do four times as many operations at once.

3) Can an x64 computer, where the word and int size is 64 bits, process a short using a 16-bit word size through so-called backward compatibility? Or must it place the 16-bit value into a 64-bit container, the fact that this is possible being what defines the system as backward compatible?

I'm not sure what you're asking here. In general, 64-bit machines are capable of executing 32-bit and 16-bit executable files because those earlier executable files use a subset of the 64-bit machine's potential.

Hardware instruction sets are generally backwards compatible, meaning that processor designers tend to add capabilities, but rarely if ever remove capabilities.

4) Can we force the compiler to make int 64 bits?

There is no standard way to change the size of int itself; that is fixed by the compiler's data model. However, all modern compilers support standard fixed-bit-size types: the header stdint.h (<cstdint> in C++) declares types such as int64_t, uint64_t, etc.

5) How can ILP64 be used on a PC that uses LP64?

https://software.intel.com/en-us/node/528682

6) What are the possible problems of using code adapted to the above issues with other compilers, OSes, and architectures (e.g. a 32-bit processor)?

Generally the compilers and systems are smart enough to figure out how to execute your code on any given system. However, 32-bit processors are going to have to do extra work to operate on 64-bit data. In other words, correctness should not be an issue, but performance will be.

But it's generally the case that if performance is really critical to you, then you need to program for a specific architecture and platform anyway.

Clarification Request: Thanks a lot! I wanted to clarify question no. 1. You say that it is bad for memory. Let's take the example of a 32-bit int. When you send it to memory on a 64-bit system, for a desired integer 0xEEEEEEEE, won't it become 0xEEEEEEEE plus 32 other bits? How can a processor send 32 bits when the word size is 64 bits? The 32 bits are the desired value, but won't they be combined with 32 unused bits and sent that way? If my assumption is true, then there is no difference for memory.

There are two things to discuss here.

First, the situation you describe does not occur. A processor does not need to "promote" a 32-bit value to a 64-bit value in order to use it. This is because modern processors have different access modes that are capable of dealing with different data sizes appropriately.

For example, a 64-bit Intel processor has a 64-bit register named RAX. However, this same register can be used in 32-bit mode by referring to it as EAX, and even in 16-bit and 8-bit modes. I stole a diagram from here:

x86_64 registers rax/eax/ax/al overwriting full register contents

1122334455667788
================ rax (64 bits)
        ======== eax (32 bits)
            ====  ax (16 bits)
            ==    ah (8 bits)
              ==  al (8 bits)

Between the compiler and assembler, the correct code is generated so that a 32-bit value is handled appropriately.

Second, when we're talking about memory overhead and performance we should be more specific. Modern memory systems are composed of a disk, then main memory (RAM), and typically two or three caches (e.g. L3, L2, and L1). The smallest quantity of data transferred between disk and memory is called a page, and page sizes are usually 4096 bytes (though they don't have to be). Likewise, the smallest quantity of data transferred between memory and the caches is called a cache line, which is usually much larger than 32 or 64 bits; on my computer the cache line size is 64 bytes. The processor is the only place where data is actually transferred and addressed at the word level and below.

So if you want to change one 64-bit word in a file that resides on disk, then, on my computer, this actually requires that you load 4096 bytes from the disk into memory, and then 64 bytes from memory into the L3, L2, and L1 caches, and then the processor takes a single 64-bit word from the L1 cache.

The result is that the word size means nothing for memory bandwidth. However, you can fit 16 of those 32-bit integers in the same space you can pack 8 of those 64-bit integers. Or you could even fit 32 16-bit values or 64 8-bit values in the same space. If your program uses a lot of different data values you can significantly improve performance by using the smallest data type necessary.

Adagietto answered 29/9, 2016 at 21:11 Comment(10)
Thanks a lot! I wanted to clarify question no. 1. You say that it is bad for memory. Let's take the example of a 32-bit int. When you send it to memory on a 64-bit system, for a desired integer 0xEEEEEEEE, won't it become 0xEEEEEEEE plus 32 other bits? How can a processor send 32 bits when the word size is 64 bits? The 32 bits are the desired value, but won't they be combined with 32 unused bits and sent that way? If my assumption is true, then there is no difference for memory.Mantra
@Mantra You have a 64 bit register and a 64 bit bus, but the cache and RAM are still just rows of bits. If you specify 32 bits, you use 32 bits.Massage
The effort that went into this answer far exceeds the effort that went into the question. Bravo.Barna
Perhaps it was a complicated question that required a lot of information to be complete.Sanctimonious
What I still do not understand is why do people generally recommend using 4-byte integers over 2-byte short integers with a justification of 4-byte being the processor's natural word size and thus generally produces the most optimal performance. However you state above that in today's world of 64-bit processors you do not want to use an 8-byte integer everywhere. How do you reconcile these two notions? Note here I am talking purely about performance not memory footprint.Update
@SiddharthaGandhi I don't think it is generally recommended to use 4-byte integers. It just so happens to be the default integer size for many compilers. For programs with just a few variables the considerations in this answer are not relevant. However programmers who work with large amounts of data absolutely do care about and select the appropriate size numeric types for their application. I think 4-byte variables are common because 2-byte variables tend to be too small for many purposes, having a limit of 0-65,000 or -32,000-32,000 depending on signedness.Adagietto
E.g. here: https://mcmap.net/q/258386/-performance-of-built-in-types-char-vs-short-vs-int-vs-float-vs-double or https://mcmap.net/q/505886/-when-to-use-short-over-int . I'm asking from the vantage point of a developer whose applications are performance-sensitive first and foremost and where absolute memory consumption is generally immaterial (editorializing for a second, I don't think my perspective is unique in that regard).Update
Neither of those links say that you should use 4-byte integers, they say that the choice between 2-byte and 4-byte integers is mostly immaterial except if it makes a difference in memory performance. Absolute memory consumption is virtually never immaterial, as memory use is the dominating factor when it comes to performance on modern systems. There is virtually no difference in the time it takes a processor to operate on different size integers as long as the size is less than the word size. Any program using more than 64 bytes of data will incur some cache penalty on a modern system.Adagietto
> "...then you have just doubled the amount of data that must be sent between the processor and memory..." But the data is sent in parallel, what's the difference then? It still is done in one tact so there's no difference. Or I'm wrong?Clichy
@HeilProgrammierung If you're sending a lot of data then it matters. More than a cache line of data incurs multiple memory transfers to the CPU.Adagietto
