Java 8: Applying Stream map and filter in one go

I am writing a parser for a file in Java 8. The file is read using Files.lines and returns a sequential Stream<String>.

Each line is mapped to a data object Result like this:

Result parse(String _line) {
  // ... code here
  Result _result = new Result();
  if (/* line is not needed */) {
    return null;
  } else {
    /* parse line into result */
    return _result;
  }
}

Now we can map each line in the stream to its corresponding result:

public Stream<Result> parseFile(Path _file) throws IOException {
  Stream<String> _stream = Files.lines(_file);
  Stream<Result> _resultStream = _stream.map(this::parse);
  return _resultStream;
}

However, the stream now contains null values, which I want to remove:

parseFile(_file).filter(v -> v != null);

How can I combine the map and filter operations into a single step, given that parse (called from _stream.map) already knows whether the result is needed?

Solo answered 6/11, 2014 at 12:37 Comment(6)
I don't get it. What's wrong with return Files.lines(_file).map(this::parse).filter(v -> v != null); ? – Sitter
Well, I assume the stream has to be processed twice, once for map and once for filter. I want to discard the unnecessary elements within the map operation, which should be faster in any case. – Solo
The stream will be processed in one run, and only if you use a terminal operation that requires full iteration (e.g. forEach, collect, reduce). – Tirewoman
@Solo See #23696817 – Sitter
See also: https://mcmap.net/q/64655/-stream-and-lazy-evaluation – Saltation
Your assumption about multiple passes is incorrect. Filtering and mapping are processed in a single pass. (In general, the entire pipeline is processed in one pass, unless there are operations like sorting that must see all the data before yielding any data.) – Heaton
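The single-pass behaviour described in the comments can be observed directly. The following is a minimal sketch, not part of the original thread: the class SinglePassDemo, its String-based parse stand-in (lines starting with '#' count as "not needed"), and the trace helper are all hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SinglePassDemo {
    // Hypothetical stand-in for parse(): lines starting with '#' are "not needed".
    static String parse(String line) {
        return line.startsWith("#") ? null : line.toUpperCase();
    }

    // Records the order in which each element reaches map, filter and the terminal operation.
    static List<String> trace(List<String> lines) {
        List<String> log = new ArrayList<>();
        lines.stream()
             .map(line -> { log.add("map " + line); return parse(line); })
             .filter(r -> { log.add("filter " + r); return r != null; })
             .forEach(r -> log.add("result " + r));
        return log;
    }

    public static void main(String[] args) {
        trace(Arrays.asList("a", "#b", "c")).forEach(System.out::println);
    }
}
```

For the input a, #b, c the log reads: map a, filter A, result A, map #b, filter null, map c, filter C, result C. Each element travels through map and filter before the next element is touched, so the source is traversed only once.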

As already pointed out in the comments, the stream is processed in a single pass, so there isn't really a need to change anything. For what it's worth, you could use flatMap and have parse return a stream:

Stream<Result> parse(String _line) {
  // ... code here
  Result _result = new Result();
  if (/* line is not needed */) {
    return Stream.empty();
  } else {
    /* parse line into result */
    return Stream.of(_result);
  }
}

public Stream<Result> parseFile(Path _file) {
  return Files.lines(_file)
              .flatMap(this::parse);
}

That way you won't have any null values in the first place.
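As a runnable illustration of the flatMap idea, here is a small sketch under stated assumptions: Integer stands in for the Result type, blank lines and lines starting with '#' are treated as "not needed", and the class name FlatMapDemo and the parseAll helper are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FlatMapDemo {
    // Hypothetical parse(): returns an empty stream for skipped lines,
    // otherwise a one-element stream holding the parsed value.
    static Stream<Integer> parse(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty() || trimmed.startsWith("#")) {
            return Stream.empty();
        }
        return Stream.of(Integer.parseInt(trimmed));
    }

    // flatMap merges the per-line streams, so skipped lines simply contribute nothing.
    static List<Integer> parseAll(List<String> lines) {
        return lines.stream()
                    .flatMap(FlatMapDemo::parse)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(parseAll(Arrays.asList("1", "# comment", "", " 2 ")));
    }
}
```

With the input above, parseAll returns a list containing 1 and 2; the comment line and the blank line never appear in the result.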

Scandalmonger answered 6/11, 2014 at 16:7 Comment(1)
It should be noted that this approach is measurably slower than map(..).filter(..) – Pertinacity

Updating for Java 9:

Using Stream<Result> seems like the wrong return type for the parse() function. A stream can contain many, many values, so the user of parse() either has to assume there will be at most one value in the stream, or use something like collect to extract and use the results of the parse() operation. If the function and its usage are only separated by a few lines of code, this may be fine, but if the distance increases, such as in a completely different file for JUnit testing, the interface contract isn't clear from the return value.

Instead of returning a Stream, it would be a better interface contract to return an empty Optional when the line is not needed.

Optional<Result> parse(String _line) {
   // ... code here
   Result _result = null;
   if (/* line needed */) {
      /* parse line into result */
   }
   return Optional.ofNullable(_result);
}

Unfortunately, now _stream.map(this::parse) returns a stream of Optional values, so with Java 8, again you'd need to filter and map this with .filter(Optional::isPresent).map(Optional::get), and the question was looking for a solution which could do this "in one go".

This question was posted 3 years ago. With Java 9, we now have the option (pun intended) of using the Optional::stream method, so we can instead write:

public Stream<Result> parseFile(Path _file) throws IOException {
  return Files.lines(_file)
      .map(this::parse)
      .flatMap(Optional::stream);
}

to transform the stream of Optional values into a stream of Result values, without any of the empty optionals.
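A self-contained sketch of this pattern follows. It assumes Java 9 or later, uses Integer in place of Result, and treats blank lines and lines starting with '#' as "not needed"; the class OptionalStreamDemo and the parseAll helper are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class OptionalStreamDemo {
    // Hypothetical parse(): an empty Optional signals "line not needed".
    static Optional<Integer> parse(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty() || trimmed.startsWith("#")) {
            return Optional.empty();
        }
        return Optional.of(Integer.parseInt(trimmed));
    }

    // Optional::stream (Java 9+) yields zero or one element per Optional,
    // so flatMap drops the empty ones and unwraps the rest.
    static List<Integer> parseAll(List<String> lines) {
        return lines.stream()
                    .map(OptionalStreamDemo::parse)
                    .flatMap(Optional::stream)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(parseAll(Arrays.asList("10", "# skip", "20")));
    }
}
```

Callers of parse still get a clear "zero or one value" contract from the Optional return type, while the pipeline ends up with plain values.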

Margarito answered 28/1, 2018 at 22:21 Comment(1)
In that case you're replacing map & filter with map & flatMap. There's not much difference compared to using null directly and then filtering it out. – Hotspur

Update for Java 16:

Use Stream.mapMulti() to do it with a single Stream operation as intended by OP.

    public Stream<Result> parseFile(Path _file) throws IOException {
        return Files.lines(_file)
                        .mapMulti((_line, acc) -> {
                            Result result = parse(_line);
                            if (result != null) acc.accept(result);
                        });
    }

Changing the return type of parse() to Optional<Result> would make it possible to simplify it:

    public Stream<Result> parseFile(Path _file) throws IOException {
        return Files.lines(_file)
                        .mapMulti((_line, acc) -> parse(_line).ifPresent(acc));
    }
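
A runnable sketch of the Optional-based variant, assuming Java 16 or later. Integer stands in for Result, and the class MapMultiDemo, its parse stand-in, and parseAll are hypothetical names. One detail worth noting: when mapMulti is followed by further operations rather than being returned directly, the compiler usually cannot infer the element type, so the sketch passes an explicit type witness (<Integer>).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class MapMultiDemo {
    // Hypothetical parse(): an empty Optional signals "line not needed".
    static Optional<Integer> parse(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty() || trimmed.startsWith("#")) {
            return Optional.empty();
        }
        return Optional.of(Integer.parseInt(trimmed));
    }

    // mapMulti (Java 16+) hands each element a sink; calling the sink zero times
    // drops the element, calling it once maps it. The <Integer> witness is needed
    // here because the call is chained rather than returned directly.
    static List<Integer> parseAll(List<String> lines) {
        return lines.stream()
                    .<Integer>mapMulti((line, sink) -> parse(line).ifPresent(sink))
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(parseAll(Arrays.asList("7", "", "# skip", "8")));
    }
}
```

Unlike flatMap, this does not create a temporary stream per element, which is the main reason to prefer it when most elements map to zero or one result.
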
Churchyard answered 20/6, 2024 at 14:41 Comment(0)
