Read all text from a file
Java 11 added the readString() method to read small files as a String
, preserving line terminators:
String content = Files.readString(path, encoding);
For versions between Java 7 and 11, here's a compact, robust idiom, wrapped up in a utility method:
static String readFile(String path, Charset encoding)
throws IOException
{
byte[] encoded = Files.readAllBytes(Paths.get(path));
return new String(encoded, encoding);
}
Read lines of text from a file
Java 7 added a convenience method to read a file as lines of text, represented as a List<String>
. This approach is "lossy" because the line separators are stripped from the end of each line.
List<String> lines = Files.readAllLines(Paths.get(path), encoding);
Java 8 added the Files.lines()
method to produce a Stream<String>
. Again, this method is lossy because line separators are stripped. If an IOException
is encountered while reading the file, it is wrapped in an UncheckedIOException
, since Stream
doesn't accept lambdas that throw checked exceptions.
try (Stream<String> lines = Files.lines(path, encoding)) {
lines.forEach(System.out::println);
}
This Stream
does need a close()
call; this is poorly documented on the API, and I suspect many people don't even notice Stream
has a close()
method. Be sure to use an ARM-block as shown.
If you are working with a source other than a file, you can use the lines()
method in BufferedReader
instead.
Memory utilization
If your file is small enough relative to your available memory, reading the entire file at once might work fine. However, if your file is too large, reading one line at a time, processing it, and then discarding it before moving on to the next could be a better approach. Stream processing in this way can eliminate the total file size as a factor in your memory requirement.
Character encoding
One thing that is missing from the sample in the original post is the character encoding. This encoding generally can't be determined from the file itself, and requires meta-data such as an HTTP header to convey this important information.
The StandardCharsets
class defines some constants for the encodings required of all Java runtimes:
String content = readFile("test.txt", StandardCharsets.UTF_8);
The platform default is available from the Charset
class itself:
String content = readFile("test.txt", Charset.defaultCharset());
There are some special cases where the platform default is what you want, but they are rare. You should be able justify your choice, because the platform default is not portable. One example where it might be correct is when reading standard input or writing standard output.
Note: This answer largely replaces my Java 6 version. The utility of Java 7 safely simplifies the code, and the old answer, which used a mapped byte buffer, prevented the file that was read from being deleted until the mapped buffer was garbage collected. You can view the old version via the "edited" link on this answer.
byte[] Files.readAllBytes(file);
To those, who suggest the 'one-line' Scanner solution: Don't yo need to close it? – Nombles