C++ performance vs. Java/C#

31

119

My understanding is that C/C++ produces native code to run on a particular machine architecture. Conversely, languages like Java and C# run on top of a virtual machine which abstracts away the native architecture. Logically it would seem impossible for Java or C# to match the speed of C++ because of this intermediate step. However, I've been told that the latest compilers ("HotSpot") can attain this speed or even exceed it.

Perhaps this is more of a compiler question than a language question, but can anyone explain in plain English how it is possible for one of these virtual machine languages to perform better than a native language?

Ema answered 28/9, 2008 at 3:17 Comment(2)

Java and C# can make optimisation based on how the application is actually run using the code as it is available at runtime. e.g. it can inline code in a shared library which can actually change while the program is running and still be correct. – Brynnbrynna 16/5, 2009 at 19:57

Some actual measurements to check before reading a lot of very flaky theory in these answers: shootout.alioth.debian.org/u32/… – Huffy 16/9, 2011 at 22:49

178

Generally, C# and Java can be just as fast or faster because the JIT compiler -- a compiler that compiles your IL the first time it's executed -- can make optimizations that a C++ compiled program cannot, because it can query the machine. It can determine whether the machine is Intel or AMD; Pentium 4, Core Solo, or Core Duo; or whether it supports SSE4, etc.

A C++ program has to be compiled beforehand usually with mixed optimizations so that it runs decently well on all machines, but is not optimized as much as it could be for a single configuration (i.e. processor, instruction set, other hardware).
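To make "querying the machine" concrete, here is a minimal sketch of doing the same thing by hand in C++: detect a CPU feature once at startup, then call through a function pointer. The names sum_baseline, sum_wide, cpu_has_sse42 and pick_sum are hypothetical, and the detection assumes a GCC or Clang toolchain on x86 (it falls back to the baseline elsewhere). The "wide" kernel is only hand-unrolled here; a real one would presumably use SIMD intrinsics.

```cpp
#include <cstddef>
#include <vector>

// Portable kernel that every target can run.
long long sum_baseline(const std::vector<int>& v) {
    long long total = 0;
    for (int x : v) total += x;
    return total;
}

// Stand-in for a kernel tuned for a newer CPU (hand-unrolled, not real SIMD).
long long sum_wide(const std::vector<int>& v) {
    long long a = 0, b = 0;
    std::size_t i = 0;
    for (; i + 1 < v.size(); i += 2) {
        a += v[i];
        b += v[i + 1];
    }
    if (i < v.size()) a += v[i];  // odd element left over
    return a + b;
}

// Assumption: __builtin_cpu_supports is a GCC/Clang builtin available on x86.
bool cpu_has_sse42() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("sse4.2") != 0;
#else
    return false;  // unknown platform: stay on the baseline kernel
#endif
}

using SumFn = long long (*)(const std::vector<int>&);

// Chooses a kernel at run time, which is roughly the decision a JIT makes for you.
SumFn pick_sum() {
    return cpu_has_sse42() ? sum_wide : sum_baseline;
}
```

Calling pick_sum() once and keeping the returned pointer avoids repeating the check on every call. The point of the sketch is only that a statically compiled program has to carry both versions, while a JIT can emit just the one that fits.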

Additionally, certain language features allow the compiler in C# and Java to make assumptions about your code that let it optimize away certain parts, which just isn't safe for the C/C++ compiler to do. When you have access to pointers, there are a lot of optimizations that just aren't safe.
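One commonly cited instance of this is pointer aliasing. The sketch below uses a hypothetical function, scale_twice, to show why a C++ compiler cannot simply keep *factor in a register: if both pointers refer to the same int, the first store changes the value the second multiplication has to read.

```cpp
// Multiplies *value by *factor twice.
// The compiler must re-read *factor for the second multiplication, because
// value and factor are allowed to point at the same object.
void scale_twice(int* value, const int* factor) {
    *value *= *factor;
    *value *= *factor;
}
```

With two distinct ints holding 3 and 3, the result is 27. With both pointers aimed at the same int holding 3, the result is 81, so caching the factor would produce a wrong answer.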

Also, Java and C# can do heap allocations more efficiently than C++ because the layer of abstraction between the garbage collector and your code allows it to do all of its heap compaction at once (a fairly expensive operation).

Now I can't speak for Java on this next point, but I know that C# for example will actually remove methods and method calls when it knows the body of the method is empty. And it will use this kind of logic throughout your code.

So as you can see, there are lots of reasons why certain C# or Java implementations will be faster.

Now this all said, specific optimizations can be made in C++ that will blow away anything that you could do with C#, especially in the graphics realm and anytime you're close to the hardware. Pointers do wonders here.

So depending on what you're writing I would go with one or the other. But if you're writing something that isn't hardware dependent (unlike, say, a driver or a video game), I wouldn't worry about the performance of C# (again, I can't speak about Java). It'll do just fine.

On the Java side, @Swati points out a good article:

https://www.ibm.com/developerworks/library/j-jtp09275

Piedadpiedmont answered 28/9, 2008 at 3:17 Comment(19)

Your reasoning is bogus - C++ programs get built for their target architecture, they don't need to switch at runtime. – Huffy 16/9, 2011 at 22:48

@Huffy The best your c++ compiler will offer for different architectures is usually x86, x64, ARM and whatnot. Now you can tell it to use specific features (say SSE2) and if you're lucky it'll even generate some backup code if that feature isn't available, but that's about as fine-grained as one can get. Certainly no specialization depending on cache sizes and whatnot. – Nashom 17/9, 2011 at 22:46

See shootout.alioth.debian.org/u32/… for examples of this theory not happening. – Huffy 18/9, 2011 at 20:27

To be honest, this is one of the worst answers. It is so unfounded, I could just invert it. Too much generalisation, too much unknowledge (optimizing away empty functions is really just the tip of the iceberg). One luxury C++ compilers have: Time. Another luxury: No checking is enforced. But find more in https://mcmap.net/q/75460/-c-performance-vs-java-c/… . – Seroka 4/10, 2011 at 16:8

you are defending C++ because you know everyone is switching to JAVA. All academic universities, research institutes have java as their preferred simulation/development environment, and if you go to production level development, again C# is most widely used specially for embedded systems. Servers are shifting to Java EE, C++ would probably remain the choice for desktop application developers only targeting naive end users. – Compliant 25/1, 2012 at 11:23

@Huffy I went there and looked at the C# Mono code, which looked like a C++ programmer went and wrote C# code in C++ style. Of course it's not going to behave or even be as maintainable as C# code. But I can write a lot of really complicated stuff that's relatively efficient in only a few lines of code in C# because it offers those high-level features. Besides, most of the C# code out there isn't going to need to do very fast math processing or manual register manipulation. All these comparisons feel weighted to what C++ does well and what the other languages do poorly. The tests are biased. – Piedadpiedmont 25/1, 2012 at 17:1

There seems to be a misconception or lack of knowledge of when the C# compiler does a lot of its work. The C# compiler does most of the heavy lifting at build time. IL is incredibly easy to work with and optimize, which is why the supposed benefit of infinite time for the C++ compilers just isn't borne out as much as some would claim. In the end you do lose some speed for protection features (e.g. index bounds checking), but the really powerful structures you gain out of it, along with the simple nature of the language, usually pay off in increased productivity and targeted optimization. – Piedadpiedmont 25/1, 2012 at 17:8

SO @OrionAdrian - 1. The C# code in the tests is wrong, or 2. C# doesn't need to be fast. Which is it? – Huffy 25/1, 2012 at 18:28

@Huffy Pure raw performance isn't the only consideration, otherwise we'd all be hand-coding perfect assembly. The question for me has always been, given X time to develop and Y-sized task, what will give me the best performance Z. And depending on what I'm doing, different languages will have the best Z. C# doesn't try to directly compete with C++ for being the fastest at heavy math processing -- why would it, that's what C++ is for. C# has a decent core, but adds fast, safe metadata programming, reflection, database access, ORM, MVC and easy to use Windows controls. It does those fast. – Piedadpiedmont 27/1, 2012 at 19:27

@OrionAdrian The idea that it's easier or more appropriate to use other languages is irrelevant to the question. – Huffy 28/1, 2012 at 21:51

@Huffy The question asks how can code run faster with the abstraction layer of the VM. For one, key, basic algorithms can be simpler in the C# world because you get to make assumptions about a lot of things, like memory (de-)allocation. In general, the more assumptions you can make about something the faster you can make it. Algorithm choice is going to have a larger effect on performance than anything and the benefit to having so many core parts of the experience done for you is that those can be super-optimized. The C# allocator is really fast along with strings and file I/O and others. – Piedadpiedmont 30/1, 2012 at 15:27

@OrionAdrian ok we're full circle now ... See shootout.alioth.debian.org/u32/… for examples of this theory not happening. In other words, show us that your theory can be proven correct before making vague speculative statements. – Huffy 30/1, 2012 at 19:13

We're all speaking very conceptually though. In practice, one would be hard-pressed to find something like an efficient raytracer implemented in managed languages like Java or C#, and there's a reason mobile devices are turning back to C/C++ for native applications (Android switching more to C/C++ instead of Java, MS from C# to C/C++, etc). Embedded systems are very much in the same vein. That said, these are all valid points about JIT performance, but we also have to keep in mind the language-level overhead of managed languages. For example, in C/C++, my user-defined types can be allocated... – Nanci 28/2, 2012 at 18:15

... on the stack (and often are except for variable-sized aggregates or objects that need to persist outside of some immediate scope). Regardless of the fact that Java and C# have that additional level of indirection where the GC is allowed to compress the heap, we're in turn generally using the heap a lot more in those languages and a heap allocation is still generally hundreds of cycles compared to a few for the stack. We're also paying a heavier cost for the indirection and super performance-critical apps generally need everything to fit in the cache, e.g. - not something easy in Java. – Nanci 28/2, 2012 at 18:18

We rarely see people dealing with managed languages talking about things like the cache line or memory access patterns because everything is all over the heap. Efficient code of that sort is generally impractical for anything but arrays of plain old data. We can't do it with, say, a matrix library or hand-coded raster operations, pixel filters, raytracers, physics simulators, particle engines, realtime motion tracking, video compositing, audio processing, etc. On the other hand, if it's business applications we're talking about, e.g., or database middleware, I wouldn't be surprised to find... – Nanci 28/2, 2012 at 18:31

... the average Java or C# or even Python application outperforming the average C++ equivalent since, 1) most of these are actually implemented in C/C++ anyway (the JNI libraries, native implementations of .NET, or Python C modules involved), and 2) the reasons cited above about the benefits of JIT compilation and 3) these languages optimize productivity above all else, and the programmer who gets things done faster just has more time to spend optimizing and fixing bugs. – Nanci 28/2, 2012 at 18:35

@Nanci DotNet uses the stack quite effectively actually. Registers too. Any custom type that uses a struct will also be placed on the stack (though that's an implementation issue). DotNet does keep track of cachelines and the JIT compiler is smart about these kinds of resources. – Piedadpiedmont 1/3, 2012 at 15:0

@OrionAdrian Efficient cache line usage comes down to contiguous memory. If you have an ArrayList of particles, a fundamental aspect of whether that can be efficiently traversed is whether it is, indeed, contiguous. By definition, if each particle is created with operator new, it is on the managed heap - efficiency goes out the door. Only types inheriting from System.ValueType have this special CLR characteristic of not being allocated on the managed heap. That's, by far, not the majority of types, and I've seen too many people trying to create something like particle instances on the heap. – Nanci 1/3, 2012 at 23:18

@OrionAdrian That's not to disagree entirely about conceptual differences and implementation-defined issues. But the fact that operator new creates data on the managed heap (not the hardware stack) is not implementation-defined, that's how new is defined in .NET. And this is a fundamental point of where managed languages tend to perform poorly in comparison to something like C or FORTRAN. Those latter languages excel at working with large, contiguous buffers of data in performance-critical loops: areas like video processing. – Nanci 1/3, 2012 at 23:20

197

JIT vs. Static Compiler

As already said in the previous posts, the JIT can compile IL/bytecode into native code at runtime. The cost of that was mentioned, but not followed through to its conclusion:

The JIT has one massive problem: it can't compile everything. JIT compiling takes time, so the JIT will compile only some parts of the code, whereas a static compiler will produce a full native binary. For some kinds of programs, the static compiler will simply and easily outperform the JIT.

Of course, it is usually faster to produce a viable and robust solution in C# (or Java, or VB) than in C++ (if only because C++ has complex semantics, and the C++ standard library, while interesting and powerful, is quite poor when compared with the full scope of the standard library from .NET or Java). So usually, the difference between C++ and the .NET or Java JIT won't be visible to most users, and for those binaries that are critical, well, you can still call C++ processing from C# or Java (even if this kind of native call can be quite costly in itself)...

C++ metaprogramming

Note that usually, you are comparing C++ runtime code with its equivalent in C# or Java. But C++ has one feature that can outperform Java/C# out of the box: template metaprogramming. The code processing is done at compilation time (thus vastly increasing compilation time), resulting in zero (or almost zero) runtime cost.

I have yet to see a real-life effect of this (I only played with the concepts, but even then the difference was seconds of execution for the JIT and zero for C++), but it is worth mentioning, alongside the fact that template metaprogramming is not trivial...
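As a toy illustration of work moved to compile time (a sketch, not the kind of code I played with back then), here is the classic recursive factorial template. The name Factorial is only for illustration, and static_assert and constexpr assume a C++11 compiler.

```cpp
// Factorial<N>::value is computed by the compiler through template recursion.
template <unsigned N>
struct Factorial {
    static constexpr unsigned long long value = N * Factorial<N - 1>::value;
};

// Explicit specialization that ends the recursion.
template <>
struct Factorial<0> {
    static constexpr unsigned long long value = 1;
};

// Checked during compilation: no multiplication is left for run time.
static_assert(Factorial<10>::value == 3628800ULL,
              "10! is evaluated at compile time");
```

Using Factorial<10>::value in the program costs the same as writing the literal 3628800, at the price of longer compilation.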

Edit 2011-06-10: In C++, playing with types is done at compile time, meaning producing generic code which calls non-generic code (e.g. a generic parser from string to type T, calling standard library API for types T it recognizes, and making the parser easily extensible by its user) is very easy and very efficient, whereas the equivalent in Java or C# is painful at best to write, and will always be slower and resolved at runtime even when the types are known at compile time, meaning your only hope is for the JIT to inline the whole thing.
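A minimal sketch of the kind of generic parser meant here, assuming C++11 for std::stoi. The function name parse is hypothetical: the generic version goes through operator>>, while the specializations are picked by the compiler for the types it recognizes. A user can extend it to a new type either by providing operator>> or by adding a specialization.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Generic version: works for any T that supports operator>> on a stream.
template <typename T>
T parse(const std::string& text) {
    std::istringstream in(text);
    T value{};
    if (!(in >> value)) {
        throw std::invalid_argument("cannot parse: " + text);
    }
    return value;
}

// Specialization selected at compile time: int goes to the standard library.
template <>
inline int parse<int>(const std::string& text) {
    return std::stoi(text);
}

// Specialization for strings: keep the whole text, including spaces.
template <>
inline std::string parse<std::string>(const std::string& text) {
    return text;
}
```

A call such as parse<double>("2.5") uses the generic path and parse<int>("42") uses the specialization. Which one runs is decided when the code is compiled, so there is no type test at run time.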

...

Edit 2011-09-20: The team behind Blitz++ (Homepage, Wikipedia) went that way, and apparently their goal is to reach FORTRAN's performance on scientific calculations by moving as much as possible from runtime execution to compilation time, via C++ template metaprogramming. So the real-life effect I wrote above that I had "yet to see" apparently does exist.

Native C++ Memory Usage

C++ uses memory differently from Java/C#, and thus has different advantages and flaws.

No matter the JIT optimization, nothing will go as fast as direct pointer access to memory (let's ignore processor caches, etc. for a moment). So, if you have contiguous data in memory, accessing it through C++ pointers (i.e. C pointers... Let's give Caesar its due) will go several times faster than in Java/C#. And C++ has RAII, which makes a lot of processing much easier than in C# or even in Java. C++ does not need using to scope the existence of its objects. And C++ does not have a finally clause. This is not an error.

:-)
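To show why no finally clause is needed, here is a small sketch (hypothetical names ScopedResource and run_with_failure, C++11 assumed): the destructor runs when the scope is left, including when an exception unwinds the stack.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical resource wrapper: logs when it is acquired and released.
class ScopedResource {
public:
    ScopedResource(std::vector<std::string>& log, std::string name)
        : log_(log), name_(std::move(name)) {
        log_.push_back("acquire " + name_);
    }
    ~ScopedResource() { log_.push_back("release " + name_); }

    // Non-copyable, so the resource is released exactly once.
    ScopedResource(const ScopedResource&) = delete;
    ScopedResource& operator=(const ScopedResource&) = delete;

private:
    std::vector<std::string>& log_;
    std::string name_;
};

// Throws while two resources are alive and returns what happened, in order.
std::vector<std::string> run_with_failure() {
    std::vector<std::string> log;
    try {
        ScopedResource a(log, "a");
        ScopedResource b(log, "b");
        throw std::runtime_error("boom");
    } catch (const std::exception&) {
        log.push_back("caught");
    }
    return log;
}
```

The returned log reads: acquire a, acquire b, release b, release a, caught. Both resources are released in reverse order before the handler runs, without any cleanup code at the call site.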

And despite C#'s primitive-like structs, C++ "on the stack" objects cost nothing at allocation and destruction, and need no GC working in an independent thread to do the cleaning.

As for memory fragmentation, memory allocators in 2008 are not the old memory allocators from 1980 that are usually compared with a GC. C++ allocations can't be moved in memory, true, but then, as on a Linux filesystem: who needs hard disk defragmenting when fragmentation does not happen? Using the right allocator for the right task should be part of the C++ developer's toolkit. Now, writing allocators is not easy, most of us have better things to do, and for most of us, RAII or GC is more than good enough.

Edit 2011-10-04: For examples of efficient allocators: on Windows platforms, since Vista, the Low Fragmentation Heap is enabled by default. (For previous versions, the LFH can be activated by calling the WinAPI function HeapSetInformation.) On other OSes, alternative allocators are provided (see https://secure.wikimedia.org/wikipedia/en/wiki/Malloc for a list).
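As a sketch of "the right allocator for the right task", the snippet below uses std::pmr::monotonic_buffer_resource, which postdates this answer and assumes a C++17 toolchain. The helper names vector_lives_in and demo_arena are hypothetical. The vector takes its storage from a fixed buffer on the stack, so nothing is requested from the general-purpose heap and nothing can fragment it.

```cpp
#include <cstddef>
#include <memory_resource>
#include <vector>

// True when the vector's storage lies inside the caller-supplied buffer.
bool vector_lives_in(const std::pmr::vector<int>& v,
                     const std::byte* buffer, std::size_t size) {
    const std::byte* begin = reinterpret_cast<const std::byte*>(v.data());
    const std::byte* end = begin + v.size() * sizeof(int);
    return begin >= buffer && end <= buffer + size;
}

// Fills a vector backed by a stack buffer and reports where its data ended up.
bool demo_arena() {
    alignas(std::max_align_t) std::byte buffer[1024];
    // Bump allocator: hands out slices of buffer, releases them all at once.
    std::pmr::monotonic_buffer_resource arena(buffer, sizeof buffer);

    std::pmr::vector<int> values(&arena);
    values.reserve(16);  // one 64-byte request, well within the buffer
    for (int i = 0; i < 16; ++i) values.push_back(i);

    return vector_lives_in(values, buffer, sizeof buffer);
}
```

If the buffer ran out, the resource would fall back to its upstream allocator (the default heap here), so sizing the buffer for the task is part of the design.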

Now, the memory model is becoming somewhat more complicated with the rise of multicore and multithreading technology. In this field, I guess .NET has the advantage, and Java, I was told, holds the upper ground. It's easy for some "on the bare metal" hacker to praise his "near the machine" code. But now, it is much more difficult to produce better assembly by hand than to let the compiler do its job. For C++, the compiler has usually been better than the hacker for a decade. For C# and Java, this is even easier.

Still, the new standard C++0x will impose a simple memory model on C++ compilers, which will standardize (and thus simplify) effective multiprocessing/parallel/threading code in C++, and make optimizations easier and safer for compilers. But then, we'll see in a couple of years whether its promises hold true.

C++/CLI vs. C#/VB.NET

Note: In this section, I am talking about C++/CLI, that is, the C++ hosted by .NET, not the native C++.

Last week, I had a training on .NET optimization, and discovered that the static compiler is very important anyway. As important as the JIT.

The very same code compiled in C++/CLI (or its ancestor, Managed C++) could be several times faster than the same code produced in C# (or VB.NET, whose compiler produces the same IL as C#'s), because the C++ static compiler was a lot better at producing already-optimized code than C#'s.

For example, function inlining in .NET is limited to functions whose bytecode is at most 32 bytes long. So, some code in C# will produce a 40-byte accessor, which will never be inlined by the JIT. The same code in C++/CLI will produce a 20-byte accessor, which will be inlined by the JIT.

Another example is temporary variables, which are simply compiled away by the C++ compiler while still being present in the IL produced by the C# compiler. C++ static compilation optimization results in less code, which again allows more aggressive JIT optimization.

The reason for this was speculated to be that the C++/CLI compiler profited from the vast optimization techniques of the native C++ compiler.

Conclusion

I love C++.

But as far as I can see, C# or Java are all in all a better bet. Not because they are faster than C++, but because when you add up their qualities, they end up being more productive, needing less training, and having more complete standard libraries than C++. And for most programs, the speed differences (one way or the other) will be negligible...

Edit (2011-06-06)

My experience on C#/.NET

I now have 5 months of almost exclusive professional C# coding (which adds to a CV already full of C++ and Java, and a touch of C++/CLI).

I played with WinForms (Ahem...) and WCF (cool!), and WPF (Cool!!!! Both through XAML and raw C#. WPF is so easy I believe Swing just cannot compare to it), and C# 4.0.

The conclusion is that while it's easier/faster to produce code that works in C#/Java than in C++, it's a lot harder to produce strong, safe and robust code in C# (and even harder in Java) than in C++. Reasons abound, but they can be summarized as:

Generics are not as powerful as templates (try to write an efficient generic Parse method (from string to T), or an efficient equivalent of boost::lexical_cast in C# to understand the problem)
RAII remains unmatched (a GC can still leak (yes, I had to handle that problem) and will only handle memory. Even C#'s using is not as easy and powerful, because writing a correct Dispose implementation is difficult)
C# readonly and Java final are nowhere near as useful as C++'s const (There's no way you can expose read-only complex data (a Tree of Nodes, for example) in C# without tremendous work, while it's a built-in feature of C++. Immutable data is an interesting solution, but not everything can be made immutable, so it's not even enough, by far).
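To make the const point concrete, here is a minimal sketch of a tree of nodes exposed read-only. The Node class and count_nodes are hypothetical and assume C++11. Note that const does not propagate through pointers by itself: it is the const overload of child() returning const Node& that carries the read-only view down the tree.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

class Node {
public:
    explicit Node(std::string name) : name_(std::move(name)) {}

    // Read-only interface, callable through a const Node&.
    const std::string& name() const { return name_; }
    std::size_t child_count() const { return children_.size(); }
    const Node& child(std::size_t i) const { return *children_[i]; }

    // Mutating interface, rejected by the compiler on a const Node&.
    Node& child(std::size_t i) { return *children_[i]; }
    void rename(std::string name) { name_ = std::move(name); }
    Node& add_child(std::string name) {
        children_.push_back(std::unique_ptr<Node>(new Node(std::move(name))));
        return *children_.back();
    }

private:
    std::string name_;
    std::vector<std::unique_ptr<Node>> children_;
};

// Takes the tree by const reference: it can walk every node but change none.
std::size_t count_nodes(const Node& root) {
    std::size_t total = 1;
    for (std::size_t i = 0; i < root.child_count(); ++i) {
        total += count_nodes(root.child(i));
    }
    return total;
}
```

Inside count_nodes, a call such as root.child(0).rename("x") does not compile, so handing the whole tree to a caller as const Node& costs nothing and needs no defensive copy or wrapper types.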

So, C# remains a pleasant language as long as you want something that works, but a frustrating language the moment you want something that always and safely works.

Java is even more frustrating, as it has the same problems as C#, and more: lacking the equivalent of C#'s using keyword, a very skilled colleague of mine spent too much time making sure his resources were correctly freed, whereas the equivalent in C++ would have been easy (using destructors and smart pointers).

So I guess C#/Java's productivity gain is visible for most code... until the day you need the code to be as perfect as possible. That day, you'll know pain. (You won't believe what's asked of our server and GUI apps...)

About Server-side Java and C++

I kept contact with the server teams (I worked 2 years among them, before getting back to the GUI team), at the other side of the building, and I learned something interesting.

In recent years, the trend was for the Java server apps to replace the old C++ server apps, as Java has a lot of frameworks/tools, and is easy to maintain, deploy, etc.

...Until the problem of low latency reared its ugly head in the last few months. Then, the Java server apps, no matter the optimization attempted by our skilled Java team, simply and cleanly lost the race against the old, not really optimized C++ server.

Currently, the decision is to keep the Java servers for common use where performance, while still important, is not subject to the low-latency target, and to aggressively optimize the already faster C++ server applications for low-latency and ultra-low-latency needs.

Conclusion

Nothing is as simple as expected.

Java, and even more C#, are cool languages, with extensive standard libraries and frameworks, where you can code fast and get results very soon.

But when you need raw power, powerful and systematic optimizations, strong compiler support, powerful language features and absolute safety, Java and C# make it difficult to win the last missing but critical percent of quality you need to remain above the competition.

It's as if you needed less time and less experienced developers in C#/Java than in C++ to produce average quality code, but on the other hand, the moment you needed excellent to perfect quality code, it was suddenly easier and faster to get the results right in C++.

Of course, this is my own perception, perhaps limited to our specific needs.

But still, it is what happens today, both in the GUI teams and the server-side teams.

Of course, I'll update this post if something new happens.

Edit (2011-06-22)

"We find that in regards to performance, C++ wins out by a large margin. However, it also required the most extensive tuning efforts, many of which were done at a level of sophistication that would not be available to the average programmer.

[...] The Java version was probably the simplest to implement, but the hardest to analyze for performance. Specifically the effects around garbage collection were complicated and very hard to tune."

Edit (2011-09-20)

"The going word at Facebook is that 'reasonably written C++ code just runs fast,' which underscores the enormous effort spent at optimizing PHP and Java code. Paradoxically, C++ code is more difficult to write than in other languages, but efficient code is a lot easier [to write in C++ than in other languages]."

– Herb Sutter at //build/, quoting Andrei Alexandrescu

Pilfer answered 28/9, 2008 at 3:17 Comment(6)

Your edit after 5 months of C# describes exactly my own experience (templates better, const better, RAII). +1. Those three remain my personal killer features for C++ (or D, which I haven't had the time for, yet). – Seroka 4/10, 2011 at 15:59

"The code processing will be done at compilation time". Hence template metaprogramming only works if the program is available at compile time, which is often not the case, e.g. it is impossible to write a competitively performant regular expression library in vanilla C++ because it is incapable of run-time code generation (an important aspect of metaprogramming). – Wadmal 17/1, 2012 at 10:8

"playing with types is done at compile time...the equivalent in Java or C# is painful at best to write, and will always be slower and resolved at runtime even when the types are known at compile time". In C#, that is only true of reference types and is not true for value types. – Wadmal 17/1, 2012 at 10:10

"No matter the JIT optimization, nothing will go as fast as direct pointer access to memory...if you have contiguous data in memory, accessing it through C++ pointers (i.e. C pointers... Let's give Caesar its due) will go several times faster than in Java/C#". People have observed Java beating C++ on the SOR test from the SciMark2 benchmark precisely because pointers impede aliasing-related optimizations. blogs.oracle.com/dagastine/entry/sun_java_is_faster_than – Wadmal 17/1, 2012 at 10:16

Also worth noting that .NET does type specialization of generics across dynamically-linked libraries after linking whereas C++ cannot because templates must be resolved before linking. And obviously the big advantage generics have over templates is comprehensible error messages. – Wadmal 17/1, 2012 at 10:37

RAII and Generic programming are the critical missing pieces in Java and C#. Your answer matches my experience. Multi pass full code optimization is also very hard to beat. – Senary 1/2, 2012 at 10:49