How to use JMH properly? Example with ArrayList

In my example, the two methods should in theory perform about the same. The first fills a plain array; the second fills an ArrayList created with the required capacity.

The results are as follows:

LessonBenchmark2.capacityTestArray avgt 5 1,354 ± 0,057 ms/op

LessonBenchmark2.capacityTestArrayListEnsured avgt 5 32,018 ± 81,911 ms/op

Here the array appears to be much faster (1.354 vs 32.018 ms/op). My JMH benchmark settings may be incorrect. How do I set this up properly?

Also, if I use @Setup(Level.Invocation), the results are close (1.405 vs 1.496 ms/op):

LessonBenchmark.capacityTestArray avgt 5 1,405 ± 0,143 ms/op

LessonBenchmark.capacityTestArrayListEnsured avgt 5 1,496 ± 0,104 ms/op

However, the documentation says to use Level.Invocation with care, and Level.Iteration seems logically right to me.

Here is the code:

public static void main(String[] args) throws Exception {
    org.openjdk.jmh.Main.main(args);
}

static final int iter = 5;
static final int fork = 1;
static final int warmIter = 5;

@State(Scope.Benchmark)
public static class Params {
    public int length = 100_000;
    public Person[] people;
    public ArrayList<Person> peopleArrayListEnsure;

    // before each iteration of the benchmark
    @Setup(Level.Iteration)
    public void setup() {
        people = new Person[length];
        peopleArrayListEnsure = new ArrayList<>(length);
    }
}

@Benchmark
@Warmup(iterations = warmIter)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Fork(value = fork)
@Measurement(iterations = iter)
public void capacityTestArray(Params p) {
    for (int i = 0; i < p.length; i++) {
        p.people[i] = new Person(i, new Address(i, i), new Pet(i, i));
    }
}

@Benchmark
@Warmup(iterations = warmIter)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Fork(value = fork)
@Measurement(iterations = iter)
public void capacityTestArrayListEnsured(Params p) {
    for (int i = 0; i < p.length; i++) {
        p.peopleArrayListEnsure.add(new Person(i, new Address(i, i), new Pet(i, i)));
    }
}

public static class Person {
    private int id;
    private Address address;
    private Pet pet;

    public Person(int id, Address address, Pet pet) {
        this.id = id;
        this.address = address;
        this.pet = pet;
    }
}

public static class Address {
    private int countryId;
    private int cityId;

    public Address(int countryId, int cityId) {
        this.countryId = countryId;
        this.cityId = cityId;
    }
}

public static class Pet {
    private int age;
    private int typeId;

    public Pet(int age, int typeId) {
        this.age = age;
        this.typeId = typeId;
    }
}
Irfan answered 17/2, 2021 at 13:39 Comment(1)
It's worth noting that this benchmark chiefly measures the performance of memory allocation, since 3 objects are allocated for each array element written. Unless the JVM succeeds in stack-allocating these objects (which I doubt, considering they're reachable from a field of a long-lived object), you're actually benchmarking the garbage collector here, which can cause all kinds of nasty interference. Not knowing JMH well, I can't say how robust it is with respect to garbage collector interference, so this may or may not explain your measurement, but it is something to check.Noctambulism

As soon as you understand the difference between Trial, Iteration and Invocation, your question becomes very easy to answer. And what better place to understand these than the JMH samples themselves.

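Invocation is a single execution of the benchmark method. Let's say there are 3 threads and each executes this benchmark method 100 times. This means Invocation == 300. With Level.Invocation, setup() runs before every one of those calls, so each call starts with a fresh, empty ArrayList. That is why you get very similar results using this as the set-up level.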

Iteration would be 3 from the example above.

Trial would be 1, when all the threads execute all their methods.

Level.Invocation, despite its scary documentation, has its uses, for example with a sorted data structure; I've used it in various other places too. The notion of an operation can also be "altered" with @OperationsPerInvocation, which is another sharp tool.
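
To make the three levels concrete, here is a rough plain-Java model of when the setup would run. This is not JMH code: SetupLevelSketch, finalListSize and the fixed invocation count are made up for illustration, and in a real run the number of invocations per iteration is whatever fits into the iteration time rather than a fixed number.

```java
import java.util.ArrayList;
import java.util.List;

// Rough model of the JMH setup levels; hypothetical names, NOT the JMH API.
class SetupLevelSketch {
    enum Level { TRIAL, ITERATION, INVOCATION }

    // Returns the size of the list after the very last invocation.
    static int finalListSize(Level level, int iterations, int invocations, int length) {
        // Trial-level setup: runs once for the whole run.
        List<Integer> list = new ArrayList<>(length);
        for (int it = 0; it < iterations; it++) {
            if (level == Level.ITERATION) {
                list = new ArrayList<>(length);   // setup before each iteration
            }
            for (int inv = 0; inv < invocations; inv++) {
                if (level == Level.INVOCATION) {
                    list = new ArrayList<>(length);   // setup before each call
                }
                // Stand-in for the benchmark body: add `length` elements.
                for (int i = 0; i < length; i++) {
                    list.add(i);
                }
            }
        }
        return list.size();
    }

    public static void main(String[] args) {
        // 2 iterations, 3 invocations per iteration, 10 elements per invocation.
        for (Level level : Level.values()) {
            System.out.println(level + " -> " + finalListSize(level, 2, 3, 10));
        }
    }
}
```

With these toy numbers the list ends up with 60 elements for TRIAL, 30 for ITERATION and 10 for INVOCATION. Only the per-invocation setup keeps the list at the size it was presized for.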


Armed with this, the answer is easy. When you use Level.Iteration, your ArrayList keeps growing across invocations, which internally means repeated System.arraycopy calls, while your array is simply overwritten.

Once you figure this out, read the samples and you will see that your second problem is that your @Benchmark methods return void. And, contrary to the other answer, I would not suggest bundling everything into the test method itself, but this raises the question of what you want to test to begin with. Do not forget that these are just numbers; in the end, you need to reason about what they mean and how to set up a JMH test properly.

Infernal answered 18/2, 2021 at 4:33 Comment(1)
Thanks a lot. This answer clears up the situation.Irfan

The test is badly designed. Because the ArrayList is created only once for multiple invocations, the array-based code just overwrites the same array a bunch of times, whereas the ArrayList version keeps adding more and more elements and needs to grow.

One trivial fix is to clear the list first. Another fix is to stop using state here and make the creation of the object (be it the 100k Person array or the Person ArrayList presized for 100k persons) part of the measured test. Once you take care of this, the results are the same within the margin of error; there is no performance difference at all between arrays and ArrayLists for this.

MyBenchmark.capacityTestArray             avgt    5  1,325 ± 0,059  ms/op
MyBenchmark.capacityTestArrayListEnsured  avgt    5  1,287 ± 0,157  ms/op

I simplified by removing the Params state entirely, and making the creation of the list and array part of each test's outlay:

    static final int LEN = 100_000;
    
    public void capacityTestArray() {
        Person[] people = new Person[LEN];
        for (int i = 0; i < LEN; i++) {
            people[i] = new Person(i, new Address(i, i), new Pet(i, i));
        }
    }

    public void capacityTestArrayListEnsured() {
        List<Person> p = new ArrayList<Person>(LEN);
        for (int i = 0; i < LEN; i++) {
            p.add(new Person(i, new Address(i, i), new Pet(i, i)));
        }
    }

(keeping all annotations and the Person, Address, etc classes the same).

Alternatively, take your existing code and just toss a list.clear() at the top of the benchmark method.
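
A minimal plain-Java sketch of the clear() variant, without the JMH annotations so it runs on its own. The names ClearBeforeFillSketch, shared and fill are hypothetical stand-ins for the state field and the benchmark method, and Integer elements replace Person to keep it short. It relies on OpenJDK's ArrayList.clear() emptying the list while keeping the backing array, so the presized capacity is reused on every call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the benchmark class; not JMH code.
class ClearBeforeFillSketch {
    static final int LEN = 100_000;

    // Plays the role of the shared peopleArrayListEnsure state field.
    static final List<Integer> shared = new ArrayList<>(LEN);

    // Plays the role of the benchmark method body.
    static List<Integer> fill() {
        shared.clear();                  // reset size to 0 before refilling
        for (int i = 0; i < LEN; i++) {
            shared.add(i);
        }
        return shared;                   // returning the list keeps the work observable
    }

    public static void main(String[] args) {
        // Repeated calls no longer accumulate elements.
        for (int call = 0; call < 3; call++) {
            System.out.println("size after call " + call + ": " + fill().size());
        }
    }
}
```

Every call reports a size of 100000, so the list never has to grow beyond its initial capacity.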

Luigi answered 17/2, 2021 at 15:1 Comment(3)
I should stop assuming things before testing them....you and @Noctambulism were right about thisPiane
Thanks a lot. You are absolutely right that the ArrayList grows and grows beyond 100_000 elements. Am I right that in this case it is correct to use a state object with the "Invocation" level on the setup() method? At this point I do not understand why a setup() method with the "Iteration" level does not reinitialize the ArrayList before every iteration. Maybe an iteration is something different and not every run of the method?Irfan
I'm not sure why the creation of the data structure is being isolated; the only point of that setup method is [A] to remove the running cost of that from the stats that JMH generates, and [B] to perhaps not do it as often as each iteration. Neither is warranted here. As the warnings state, using per-invocation is bad, because you're really just timing the locking system and the CPU lines for asking for the time, which dwarfs what you are trying to measure. Do exactly what I did: Either remove the setup entirely, or, just .clear() the list within the benchmark method.Luigi

I initially thought this was a natural performance difference, but the comments below were right.


As commented below, the difference is indeed higher than expected.

The only scenario in which add() goes from O(1) to O(n) is when the list has to grow. Could it be that the tests are reusing the same ArrayList (as a result of setup not being called more than once)? This would only affect the ArrayList test, as the array test would just overwrite the values.
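
To get a feel for how much copying a reused list causes, here is a small simulation. It does not touch a real ArrayList: GrowthSketch and simulate are hypothetical names, and the growth rule (new capacity = old capacity + old capacity / 2) is an assumption based on how OpenJDK's ArrayList currently grows, which is an implementation detail rather than a documented guarantee.

```java
// Models an ArrayList-like backing array that grows by about 1.5x when full.
// Hypothetical sketch; assumes initialCapacity is at least 2.
class GrowthSketch {
    // Returns {number of grow operations, total elements copied while growing}.
    static long[] simulate(int initialCapacity, int totalAdds) {
        long capacity = initialCapacity;
        long grows = 0;
        long copied = 0;
        for (long size = 0; size < totalAdds; size++) {
            if (size == capacity) {
                copied += size;              // all existing elements move to the new array
                capacity += capacity >> 1;   // assumed 1.5x growth policy
                grows++;
            }
            // the add itself is O(1) once there is room
        }
        return new long[] { grows, copied };
    }

    public static void main(String[] args) {
        long[] once = simulate(100_000, 100_000);        // one fill of a presized list
        long[] tenTimes = simulate(100_000, 1_000_000);  // ten fills of the same list
        System.out.println("one pass:   grows=" + once[0] + ", copied=" + once[1]);
        System.out.println("ten passes: grows=" + tenTimes[0] + ", copied=" + tenTimes[1]);
    }
}
```

Under these assumptions a single pass never grows the list, while ten passes over the same list trigger 6 grow operations that copy 2078125 elements in total. A real iteration would run many more invocations than ten, so the list and the copies get correspondingly larger.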

Just to be sure the arraylist isn't growing:

public void capacityTestArrayListEnsured(Params p) 
{
    p.peopleArrayListEnsure = new ArrayList<>(p.length); //or clear()?
    for (int i = 0; i < p.length; i++) 
        p.peopleArrayListEnsure.add(new Person(i, new Address(i, i), new Pet(i, i)));
}

In order to make it fair, you could also initialize the array in the other method so the elapsed times are equally added:

public void capacityTestArray(Params p)  
{
    p.people = new Person[p.length];
    for (int i = 0; i < p.length; i++) 
        p.people[i] = new Person(i, new Address(i, i), new Pet(i, i));
}
Piane answered 17/2, 2021 at 14:1 Comment(6)
A factor of 30 for checking the array size and modcount? Compared with the existing work of that loop, which allocated 3 objects per array element written?Noctambulism
The private add method has explicit javadoc indicating it is designed and intended to be inlined by hotspot. The only additional 'work' done here is to increment 2 fields, and to perform one comparison which is presumably jump-optimized by hotspot as well, as it 100% branches the same way for this loop. array[i] = e itself (shared by both loops) internally still does an index check, so these 2 styles feel like they could be closer in performance.Luigi
FWIW, checked on java14 on x64, very similar results.Luigi
Yes, tested on java8 and they're not as far, you both are right. May it be that setup is not called at each iteration, forcing the arraylist to grow after the first test? That's the only reasonable thing happening here for such a difference..Piane
Figured it out - see my answer. Performance is in fact 100% the same between these two modes. The gap isn't a 30x factor. It's nothing.Luigi
or, you can properly understand what those @Setups actually do.Infernal
