Impact of Java streams from GC perspective or handling short-lived objects by the GC
There are some articles available online that mention some of the cons of using Streams over traditional loops.

But is there any impact from the GC perspective? I assume (is that correct?) that every stream call creates some short-lived objects under the hood. If a code fragment that uses streams is called frequently by the underlying system, could that eventually cause a performance issue or put extra pressure on the GC? Or is the impact minimal and ignorable most of the time?

Are there any articles covering this more in detail?
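For context, here is a minimal (hypothetical) example of the kind of call site I mean. The stream version builds a small pipeline of short-lived objects on every invocation, while the loop version allocates only the result list:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class StreamAllocation {
    // Each call builds a short-lived pipeline: the Stream object itself,
    // the stage objects for filter/map, and the Collector's state holder.
    static List<String> withStream(List<String> names) {
        return names.stream()
                    .filter(n -> n.length() > 3)
                    .map(String::toUpperCase)
                    .collect(Collectors.toList());
    }

    // The loop allocates only the result list (plus a temporary Iterator).
    static List<String> withLoop(List<String> names) {
        List<String> out = new ArrayList<>();
        for (String n : names) {
            if (n.length() > 3) {
                out.add(n.toUpperCase());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> in = List.of("ant", "bison", "cheetah");
        System.out.println(withStream(in)); // [BISON, CHEETAH]
        System.out.println(withLoop(in));   // [BISON, CHEETAH]
    }
}
```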

Gastrectomy answered 13/1, 2020 at 14:19 Comment(1)
You mean the calls to build the pipeline? So a stream with five stages creates five objects (in that order of magnitude)? That's rarely an issue. See this answer for the impact of a small number of objects. A large number of objects, as when using a Stream<Integer> instead of an IntStream, may be a different matter. Likewise, nobody cares about the temporary Iterator instance when using a for loop.Libnah
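The boxing point from the comment above can be illustrated with a small, hypothetical sketch: a Stream<Integer> allocates one Integer object per element (beyond the small cache), while an IntStream stays on primitives:

```java
import java.util.stream.IntStream;

public class BoxingDemo {
    // boxed() wraps every int in an Integer: N short-lived objects for N elements.
    static long sumBoxed(int n) {
        return IntStream.rangeClosed(1, n)
                        .boxed()                      // allocates an Integer per element
                        .mapToLong(Integer::longValue)
                        .sum();
    }

    // The primitive specialization performs no per-element boxing.
    static long sumPrimitive(int n) {
        return IntStream.rangeClosed(1, n).asLongStream().sum();
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1_000));     // 500500
        System.out.println(sumPrimitive(1_000)); // 500500
    }
}
```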
To be fair, it's hard to add much when Holger has already linked to the main idea in his answer; still, I will try.

Extra pressure on the GC? Maybe. Extra time for a GC cycle to execute? Most probably not. Ignorable? I'd say totally. In the end, what you care about from a GC is that it takes little time to reclaim a lot of space, preferably with super tiny stop-the-world pauses.

Let's talk about the potential overhead in the GC's two main phases: mark and evacuation/relocation (Shenandoah/ZGC). First, the mark phase, where the GC finds out what is garbage (by actually identifying what is alive).

If the objects created by the Stream internals are not reachable, they will never be scanned (zero overhead here), and if they are reachable, scanning them will be extremely fast. The other side of the story: when you create an object that the GC might touch while the mark phase is running, the slow path of a load barrier (in the case of Shenandoah) becomes active. This adds, I assume, some tens of nanoseconds to the total time of that particular phase, as well as some space in the SATB queues. Aleksey Shipilev said in one talk that he tried to measure the overhead of executing a single barrier and could not, so he measured three, and the time was in the region of tens of nanoseconds. I don't know the exact details of ZGC, but it has a load barrier in place too.

The main point is that the mark phase runs concurrently with the application, so your application will still run perfectly fine. Even if some GC code is triggered to do specific work (the load barrier), it will be extremely fast and completely transparent to you.

The second phase is "compaction", or making space for future allocations. What the GC does is move live objects from the regions with the most garbage (in Shenandoah, for sure) to regions that are empty. But only live objects. So if a certain region has 100 objects and only 1 is alive, only that 1 will be moved, and then the entire region is marked as free. So if the Stream implementation generated only garbage (i.e., objects that are no longer alive), it is a "free lunch" for the GC: it will not even know they existed.
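To make that concrete, here is a small hypothetical sketch: every call below creates a handful of pipeline objects that are unreachable by the time the method returns, so a concurrent collector never has to mark or move them:

```java
import java.util.stream.IntStream;

public class ShortLivedDemo {
    // Each call creates a few short-lived objects (spliterator, stream stages,
    // a lambda instance). All of them are dead as soon as sum() returns, so the
    // GC's mark and relocation phases never touch them.
    static long sumOfSquares(int n) {
        return IntStream.rangeClosed(1, n)
                        .mapToLong(i -> (long) i * i)
                        .sum();
    }

    public static void main(String[] args) {
        long total = 0;
        // A hot call site: pipeline garbage on every call, all of it dying young.
        for (int i = 0; i < 100_000; i++) {
            total += sumOfSquares(10);
        }
        System.out.println(total); // 100_000 * 385 = 38_500_000
    }
}
```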

The bigger picture is that this phase is also done concurrently. To keep the concurrency going, the GC needs to know how much was allocated from the start to the end of a GC cycle. That amount is the minimum "extra" headroom you need on top of the Java heap for the GC to be happy.

So overall, you are looking at a super tiny impact, if any at all.
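If you want to look at the allocation side empirically rather than argue from first principles, HotSpot exposes a per-thread allocation counter. This is a HotSpot-specific sketch (com.sun.management.ThreadMXBean is not part of the Java SE API, and the numbers are approximate):

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

public class AllocationProbe {
    // HotSpot-specific: cast the platform ThreadMXBean to the com.sun.management
    // variant to read the bytes allocated by the current thread so far.
    static long allocatedBytes() {
        com.sun.management.ThreadMXBean mx =
                (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        return mx.getThreadAllocatedBytes(Thread.currentThread().getId());
    }

    public static void main(String[] args) {
        List<String> data = List.of("alpha", "beta", "gamma", "delta");
        long before = allocatedBytes();
        List<Integer> lengths = data.stream()
                                    .map(String::length)
                                    .collect(Collectors.toList());
        long after = allocatedBytes();
        // Rough figure: includes the pipeline objects and the result list.
        System.out.println("approx bytes allocated by one pipeline: " + (after - before));
        System.out.println(lengths);
    }
}
```

On a typical run this prints a figure in the low hundreds of bytes per pipeline, which is the kind of allocation a modern collector absorbs without noticeable effort.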

Enfeoff answered 15/1, 2020 at 4:47 Comment(6)
I’m not a native speaker either, but I think “compaction” is better than “compactation”.Libnah
Well, I "could not measure the single barrier", because it rather hard to construct a program that does only one barrier on the hot path. And yes, single barrier fastpath takes just a few CPU cycles, and can even be hidden in the pipelining. But it is the cost nevertheless, and measurable across the entire application, may routinely take single-digit percents off the throughput.Swinford
The problem with "dead objects are free" argument is that while it is technically true, it misses the other part of the story: allocations force GCs to act, which makes them deal with live objects more frequently. So, in fully-young workload, the GC cost is indeed marginal, but once you have live data in the heap, allocations force GC hand quite a bit.Swinford
Here is a happy thought: worrying about GC before seeing the problem empirically is a fool's errand. The performance model for GC-managed code is quite complicated, and you are as likely to be at the point where GC costs are ignorable as at the point where GC costs are the deal-breaker. Arguing where you would end up from first principles works only if you know many exact things about the application and the GC in use.Swinford
@AlekseyShipilev WOW. Aleksey himself commented on my answer; this made my month. Spasibo. I really like the argument that allocations force the GC to act; extremely well put.Enfeoff
@AlekseyShipilev 👍😊 You have very interesting videos on YouTube about the Java memory model; in particular, you can be trusted to know what you are talking about.Camelliacamelopard

© 2022 - 2024 — McMap. All rights reserved.