Java 8 streams vs iterator performance
Asked Answered
E

2

5

I'm comparing 2 ways to filter lists, with and without using streams. It turns out that the method without using streams is faster for a list of 10,000 items. I'm interested in understanding why is it so. Can anyone explain the results please?

public static int countLongWordsWithoutUsingStreams(
        final List<String> words, final int longWordMinLength) {
    words.removeIf(word -> word.length() <= longWordMinLength);

    return words.size();
}

public static int countLongWordsUsingStreams(final List<String> words, final int longWordMinLength) {
    return (int) words.stream().filter(w -> w.length() > longWordMinLength).count();
}

Microbenchmark using JMH:

@Benchmark
@BenchmarkMode(Throughput)
@OutputTimeUnit(MILLISECONDS)
public void benchmarkCountLongWordsWithoutUsingStreams() {
    countLongWordsWithoutUsingStreams(nCopies(10000, "IAmALongWord"), 3);
}

@Benchmark
@BenchmarkMode(Throughput)
@OutputTimeUnit(MILLISECONDS)
public void benchmarkCountLongWordsUsingStreams() {
    countLongWordsUsingStreams(nCopies(10000, "IAmALongWord"), 3);
}

public static void main(String[] args) throws RunnerException {
    final Options opts = new OptionsBuilder()
        .include(PracticeQuestionsCh8Benchmark.class.getSimpleName())
        .warmupIterations(5).measurementIterations(5).forks(1).build();

    new Runner(opts).run();
}

java -jar target/benchmarks.jar -wi 5 -i 5 -f 1

Benchmark
Mode Cnt Score Error Units
PracticeQuestionsCh8Benchmark.benchmarkCountLongWordsUsingStreams thrpt 5 10.219 ± 0.408 ops/ms
PracticeQuestionsCh8Benchmark.benchmarkCountLongWordsWithoutUsingStreams thrpt 5 910.785 ± 21.215 ops/ms

Edit: (as someone deleted the update posted as an answer)

public class PracticeQuestionsCh8Benchmark {
    private static final int NUM_WORDS = 10000;
    private static final int LONG_WORD_MIN_LEN = 10;

    private final List<String> words = makeUpWords();

    public List<String> makeUpWords() {
        List<String> words = new ArrayList<>();
        final Random random = new Random();

        for (int i = 0; i < NUM_WORDS; i++) {
            if (random.nextBoolean()) {
                /*
                 * Do this to avoid string interning. c.f.
                 * http://en.wikipedia.org/wiki/String_interning
                 */
                words.add(String.format("%" + LONG_WORD_MIN_LEN + "s", i));
            } else {
                words.add(String.valueOf(i));
            }
        }

        return words;
    }

    @Benchmark
    @BenchmarkMode(AverageTime)
    @OutputTimeUnit(MILLISECONDS)
    public int benchmarkCountLongWordsWithoutUsingStreams() {
        return countLongWordsWithoutUsingStreams(words, LONG_WORD_MIN_LEN);
    }

    @Benchmark
    @BenchmarkMode(AverageTime)
    @OutputTimeUnit(MILLISECONDS)
    public int benchmarkCountLongWordsUsingStreams() {
        return countLongWordsUsingStreams(words, LONG_WORD_MIN_LEN);
    }
}
public static int countLongWordsWithoutUsingStreams(
    final List<String> words, final int longWordMinLength) {
    final Predicate<String> p = s -> s.length() >= longWordMinLength;

    int count = 0;

    for (String aWord : words) {
        if (p.test(aWord)) {
            ++count;
        }
    }

    return count;
}

public static int countLongWordsUsingStreams(final List<String> words,
    final int longWordMinLength) {
    return (int) words.stream()
    .filter(w -> w.length() >= longWordMinLength).count();
}
Extrusive answered 6/2, 2015 at 0:53 Comment(7)
You need to return the value: return countLongWordxxx(); from your benchmark methods.Schnitzler
@Schnitzler You're right, see my updated response below.Extrusive
Guys, instead of playing a guessing game in this post, would anyone please profile both benchmarks and figure out why those are different? This is not House M.D., you don't need to conduct more experiments to produce the differential diagnosis, you can actually vivisect both tests and see how they are different. Performance profiling should be a basic skill, go!Illyes
When making comparisons, it's also helpful to compare things that are equivalent. The streams code actually has to process and count items, but asking the collection returned by Collections.nCopies(n, obj) for its size will simply return n!Labial
@StuartMarks You missed my edit.Extrusive
@AbhijitSarkar The edit improves things, but my point still stands. Others pointed out some problems with the original, including removeIf not working on an immutable collection, and not returning a value from the benchmark. Even if these were fixed, my point about size() doing different work from counting makes the comparison invalid. Now, the edit avoids nCopies and fixes these problems, but there are still differences between the two benchmarks.Labial
@AbhijitSarkar Specifically, one assigns a lambda to a local variable whereas the other uses it inline; and one counts using int values whereas the other counts using long values. It seems unlikely to me that these would make a significant difference, but the point is, we don't know, and they affect the validity of the comparison. Only when you have two benchmarks whose only difference is what you're measuring can you proceed with the analysis that Aleksey Shipilev suggested.Labial
T
5

Whenever your benchmark says that some operation over 10000 elements takes 1ns (edit: 1µs), you probably found a case of clever JVM figuring out that your code doesn't actually do anything.

Collections.nCopies doesn't actually make a list of 10000 elements. It makes a sort of a fake list with 1 element and a count of how many times it's supposedly there. That list is also immutable, so your countLongWordsWithoutUsingStreams would throw an exception if there was something for removeIf to do.

Theoretician answered 6/2, 2015 at 1:54 Comment(10)
Your comment about immutable list is right, I'll change the code so that there's actually something to remove and see what happens. As far as JVM optimizations go, it still has to iterate over the list to know the length of each element, right? BTW, where do you see 1ns, I see 10.219 ms and 910.785 ms?Extrusive
You are right. That's 1 microsecond, not nanosecond. Still way too fast to actually do 10000 of anything.Theoretician
@Abhijit Sarkar: First of all, you are not using the result value, therefore the JVM might optimize away the entire computation. Second, it’s heavily implementation-dependent whether it actually need to iterate. Both methods may use the knowledge that they actually consist of a single element to perform only one predicate test. However, since the nCopies list doesn’t support mutation, it’s unlikely to have an optimized removeIf. But if you are going to use a real mutable list, it’s clear that mutating a List is more expensive than simply counting the matches.Winifield
@Winifield Makes sense but oddly that's not what is happening, see my updated response below.Extrusive
@Abhijit Sarkar: You seemed to have missed the last sentence of my comment: “But if you are going to use a real mutable list, it’s clear that mutating a List is more expensive than simply counting the matches.” That’s especially true for ArrayList where removing isn’t cheap. So what’s surprising here? Using removeIf(…) to count occurrences is a bad idea. Why don’t you compare with an ordinary counting loop?Winifield
@Winifield here's my implementation, streams still better (score 0.209 vs 0.355) public static int countLongWordsWithoutUsingStreams( final List<String> words, final int longWordMinLength) { final LongAdder count = new LongAdder(); words.forEach(s -> { if (s.length() >= longWordMinLength) { count.increment(); } }); return count.intValue(); }Extrusive
@Winifield I tested with your implementation too, same relative result. Have you tried running the tests yourself?Extrusive
@Abhijit Sarkar: I have no problems with streams being faster in some scenarios. You may want to read this statementWinifield
@Winifield That thread doesn't offer much explanation of why's one faster than the other, which's what I'm looking for. The top answer essentially rewrites the original code and comes up with a conclusion exactly opposite to what I'm seeing - streams are slower for their code.Extrusive
@Abhijit Sarkar: I didn’t link to a thread nor answer but a particular comment from Brian Goetz.Winifield
B
2

You do not return any values from your benchmark methods, thus, JMH has no chance to escape the computed values and your benchmark suffers dead code elimination. You compute how long it takes to do nothing. See the JMH page for further guidance.

Saying this, streams can be slower in some cases: Java 8: performance of Streams vs Collections

Barrel answered 6/2, 2015 at 6:42 Comment(1)
Probably due to my inexact wording on streams being always slower what is of course not true. Its corrected now.Barrel

© 2022 - 2024 — McMap. All rights reserved.