How to remove line breaks from a file in Java?

How can I remove all line breaks from a string in Java in a way that works on both Windows and Linux (i.e. with no OS-specific problems with carriage return/line feed/newline, etc.)?

I've tried (note readFileAsString is a function that reads a text file into a String):

String text = readFileAsString("textfile.txt");
text.replace("\n", "");

but this doesn't seem to work.

How can this be done?

Michaella answered 29/1, 2010 at 15:44 Comment(3)
Do you want to eliminate all line breaks, or do you want to normalize them to one standard form?Interpol
Oh, if you want to delete all linefeeds, remove all \n AND all \r (because Windows linebreak is \r\n).Interpol
Hey, FYI: if you want to replace consecutive line breaks with a single line break, you can use myString.trim().replaceAll("[\n]{2,}", "\n"), or replace them with a single space: myString.trim().replaceAll("[\n]{2,}", " ")Which
H
516

You need to assign the result of text.replace() back to text:

String text = readFileAsString("textfile.txt");
text = text.replace("\n", "").replace("\r", "");

This is necessary because Strings are immutable -- calling replace doesn't change the original String, it returns a new one that's been changed. If you don't assign the result to text, then that new String is lost and garbage collected.

As for getting the newline String for any environment -- that is available by calling System.getProperty("line.separator").
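
A minimal sketch of the point above, using an inline sample string in place of the question's readFileAsString helper. The class and method names (ImmutableReplaceDemo, stripLineBreaks) are made up for illustration.

```java
public class ImmutableReplaceDemo {
    // Removes every line feed and carriage return, regardless of platform.
    static String stripLineBreaks(String text) {
        return text.replace("\n", "").replace("\r", "");
    }

    public static void main(String[] args) {
        String text = "first\r\nsecond\nthird";

        // The returned String is discarded here, so text is unchanged.
        text.replace("\n", "");
        System.out.println(text.contains("\n")); // prints "true"

        // Assigning the result keeps the modified copy.
        text = stripLineBreaks(text);
        System.out.println(text); // prints "firstsecondthird"
    }
}
```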

Haire answered 29/1, 2010 at 15:47 Comment(11)
+1, correct. As to the reason: String is immutable. The replace() method returns the desired result. Also see the API docs: java.sun.com/javase/6/docs/api/java/lang/… Edit: ah you already edited that yourself in afterwards :)Eley
Perhaps text = text.replace("\r\n", " ").replace("\n", " "); is a better solution: otherwise words will be "glued" to each other (without the single-space replacement).Teepee
Yeah, that's possible. It all depends on what type of data you're trying to modify. Sometimes (for data such as COBOL copybooks) you don't want there to be any spaces between the lines.Haire
True, it all depends on what the OP is trying to do.Teepee
You could also use square brackets to match newlines properly for any OS: .replaceAll("[\\r\\n]+", "")Falcongentle
As the question is asking for replacing ALL occurrences, the solution is rather text = text.replaceAll("\n", "").replaceAll("\r", "");Burk
@Burk replaceAll takes a regex and replace takes a literal string; both replace all occurrences.Cymogene
be careful creating a System dependency, it might be what you want but I had a unit test with a static String that was passing on local Windows machine but failing on Continuous Integration build machine (Linux)Purpura
When the input text is like this : C:\SomeFolder\AnotherFolder , we get the output as : C:SomeFolderAnotherFolder. Means, it removes the backslashes, although, it was not intended to do so.Disequilibrium
You won't believe what this fixedBillings
I found this useful to handle the ASCII value combo for "\n" at times: content.replace("\\" + (char)92 + (char)110, "replacement");Holdover
U
274

As noted in other answers, your code is not working primarily because String.replace(...) does not change the target String. (It can't - Java strings are immutable!) What replace actually does is to create and return a new String object with the characters changed as required. But your code then throws away that String ...


Here are some possible solutions. Which one is most correct depends on what exactly you are trying to do.

// #1
text = text.replace("\n", "");

Simply removes all the newline characters. This does not cope with Windows or Mac line terminations.

// #2
text = text.replace(System.getProperty("line.separator"), "");

Removes all line terminators for the current platform. This does not cope with the case where you are trying to process (for example) a UNIX file on Windows, or vice versa.

// #3
text = text.replaceAll("\\r|\\n", "");

Removes all Windows, UNIX or Mac line terminators. However, if the input file is text, this will concatenate words; e.g.

Goodbye cruel
world.

becomes

Goodbye cruelworld.

So you might actually want to do this:

// #4
text = text.replaceAll("\\r\\n|\\r|\\n", " ");

which replaces each line terminator with a space [1]. Since Java 8 you can also do this:

// #5
text = text.replaceAll("\\R", " ");

And if you want to replace each run of consecutive line terminators with a single space:

// #6
text = text.replaceAll("\\R+", " ");

1 - Note there is a subtle difference between #3 and #4. The sequence \r\n represents a single (Windows) line terminator, so we need to be careful not to replace it with two spaces.
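
To make the differences concrete, here is a small sketch that runs options #3, #4 and #6 on one string with mixed line endings. The class name, method names and sample text are illustrative only.

```java
public class LineBreakOptionsDemo {
    // #3: delete every \r and \n character.
    static String removeAll(String s) {
        return s.replaceAll("\\r|\\n", "");
    }

    // #4: one space per line terminator; \r\n is matched first and counts as one.
    static String spacePerTerminator(String s) {
        return s.replaceAll("\\r\\n|\\r|\\n", " ");
    }

    // #6: one space per run of consecutive line terminators (Java 8+).
    static String spacePerRun(String s) {
        return s.replaceAll("\\R+", " ");
    }

    public static void main(String[] args) {
        // A Windows terminator, followed later by two Unix terminators in a row.
        String mixed = "Goodbye cruel\r\nworld.\n\nThe end.";

        System.out.println(removeAll(mixed));          // Goodbye cruelworld.The end.
        System.out.println(spacePerTerminator(mixed)); // Goodbye cruel world.  The end.
        System.out.println(spacePerRun(mixed));        // Goodbye cruel world. The end.
    }
}
```

Note that \R matches more than \r and \n: it also covers vertical tab, form feed, NEL (U+0085) and the Unicode line and paragraph separators, which may or may not be what you want.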

Unreserved answered 29/1, 2010 at 16:7 Comment(5)
This is an EXCELLENT answer. Kudos for the Java 8 examples. Thank you for the help!Giselagiselbert
Thanks this worked for me... btw can u explain text = text.replaceAll("\\r\\n|\\r|\\n", " ");Dunkirk
Option 4: A \r will normally not be alone. If there is a \r there is a \n.Clause
@Dunkirk It's a regex. | means or. It will replace the first block that matches. So if there is \r\n, it will be replaced with one space. If there is a \r but no \n or the other way around, it will also be one space. He does it this way to prevent replacing \r and \n by a space and ending up with 2 spaces.Clause
@Clause - Prior to MacOS 9, a \r without an \n was the line separator; see en.wikipedia.org/wiki/Newline. And on other old systems.Unreserved
I
41

This function normalizes down all whitespace, including line breaks, to single spaces. Not exactly what the original question asked for, but likely to do exactly what is needed in many cases:

import org.apache.commons.lang3.StringUtils;

final String cleansedString = StringUtils.normalizeSpace(rawString);
Intercollegiate answered 28/4, 2017 at 22:38 Comment(0)
S
26

If you want to remove only line terminators that are valid on the current OS, you could do this:

text = text.replaceAll(System.getProperty("line.separator"), "");

If you want to make sure you remove any line separators, you can do it like this:

text = text.replaceAll("\\r|\\n", "");

Or, slightly more verbose, but less regexy:

text = text.replaceAll("\\r", "").replaceAll("\\n", "");
Sardonyx answered 29/1, 2010 at 15:52 Comment(1)
To avoid gluing words together (as discussed in comments to Kaleb's answer) the regex approach could be modified to text.replaceAll("(\\r|\\n)+", " ") and (assuming greedy is default in Java?) you will have a solution with just one space for each sequence of new line chars.Faqir
O
14

This would be efficient I guess

String s;
s = "try this\n try me.";
s = s.replaceAll("[\\r\\n]+", "");
Odele answered 9/3, 2013 at 17:16 Comment(1)
Make sure you have the exact same code, rather than losing the "\n" chars while pasting. Because it should work. Maybe it's because I forgot the last semicolon (;) at the end.Odele
G
14
str = str.replaceAll("\\r\\n|\\r|\\n", " ");

Worked perfectly for me after searching a lot, having failed with every other line.

Gertiegertrud answered 25/7, 2014 at 16:38 Comment(1)
I was trying to do it individually, not sure why it was not working, this one works like charm.Limbus
E
7

Line breaks are not the same on Windows, Linux and Mac. You should use System.getProperty("line.separator") to get the separator for the current platform.

Equities answered 29/1, 2010 at 15:49 Comment(0)
I
5

In Kotlin, strings have a lines() function that returns a list of the lines in a multi-line string. Since Java 11, String also has a lines() method, which returns a Stream<String>. You can get all the lines and then merge them into a single string.

With Kotlin it will be as simple as

str.lines().joinToString("")
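
For the Java side, a rough equivalent might look like the sketch below. It assumes Java 11 or later, where String.lines() returns a Stream<String>, so a collector is needed to join the pieces. The class and method names are illustrative.

```java
import java.util.stream.Collectors;

public class LinesJoinDemo {
    // Splits on \n, \r or \r\n and joins the lines with the given separator.
    static String joinLines(String text, String separator) {
        return text.lines().collect(Collectors.joining(separator));
    }

    public static void main(String[] args) {
        String mixed = "a\r\nb\nc\rd";
        System.out.println(joinLines(mixed, ""));  // prints "abcd"
        System.out.println(joinLines(mixed, " ")); // prints "a b c d"
    }
}
```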
Incorruption answered 29/6, 2021 at 18:18 Comment(6)
This is not at all useful for what was asked.Spitzer
The question states: "replace all line breaks from a string" - and the solution does exactly that, in a simple, clean and reliable way.Incorruption
They asked almost 12 years ago about Java - your answer about Kotlin is not of any useSpitzer
My answer contains information about a Java method: ...since Java 11, String has lines() method.... Kotlin example is a bonus.Incorruption
I don't know about Kotlin and haven't done Java recently. If it's the case that you aren't providing a complete solution for Java but you are for Kotlin, then I couldn't say you've provided a full answer to the question that was asked.Hepburn
Well, the question was asked more than 10 years ago. Welcome to the year 2022, absolute majority of professional Java developers heard about Kotlin and a lot of them are using Kotlin now. So this answer is relevant for a lot of people.Incorruption
O
4
String text = readFileAsString("textfile.txt").replaceAll("\n", "");

The Oracle documentation defines trim() as "Returns a copy of the string, with leading and trailing whitespace omitted."

It does not spell out that leading and trailing newline characters count as whitespace and are removed as well.

In short, if the only line breaks are at the start or end of the text, String text = readFileAsString("textfile.txt").trim(); will also work for you; line breaks in the middle of the string are left in place. (Checked with Java 6)
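
A short sketch of what trim() does and does not remove, with made-up names and sample text. It combines trim() with the \R+ replacement (Java 8+) for the interior line breaks.

```java
public class TrimDemo {
    // trim() strips only leading and trailing characters up to U+0020,
    // which includes \r and \n.
    static String edgesOnly(String text) {
        return text.trim();
    }

    // Interior line breaks still need an explicit replacement.
    static String edgesAndInterior(String text) {
        return text.trim().replaceAll("\\R+", " ");
    }

    public static void main(String[] args) {
        String text = "\r\nfirst\nsecond\r\n";
        System.out.println(edgesOnly(text).contains("\n")); // prints "true"
        System.out.println(edgesAndInterior(text));         // prints "first second"
    }
}
```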

Optician answered 31/8, 2012 at 4:27 Comment(0)
D
3
String text = readFileAsString("textfile.txt").replace("\n","");

replace returns a new string; strings in Java are immutable.

Diandrous answered 29/1, 2010 at 15:49 Comment(0)
M
3

You may want to read your file with a BufferedReader. This class can break input into individual lines, which you can assemble at will. The way BufferedReader operates recognizes line ending conventions of the Linux, Windows and MacOS worlds automatically, regardless of the current platform.

Hence:

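// FileReader uses the platform default charset; wrap a FileInputStream in an
// InputStreamReader instead if you need to name the charset explicitly.
BufferedReader br = new BufferedReader(
    new FileReader("textfile.txt"));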
StringBuilder sb = new StringBuilder();
for (;;) {
    String line = br.readLine();
    if (line == null)
        break;
    sb.append(line);
    sb.append(' ');   // SEE BELOW
}
String text = sb.toString();

Note that readLine() does not include the line terminator in the returned string. The code above appends a space to avoid gluing together the last word of a line and the first word of the next line.
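
The same idea can be sketched with try-with-resources so the reader is always closed. This version takes any Reader (a StringReader stands in for the file here), inserts the space only between lines, and rethrows I/O failures unchecked. Those are design choices of this sketch, not part of the answer above, and the names are illustrative.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class JoinLinesDemo {
    // Reads every line from the source and joins them with single spaces.
    static String joinLines(Reader source) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(source)) {
            boolean first = true;
            String line;
            while ((line = br.readLine()) != null) {
                if (!first) {
                    sb.append(' '); // separator between lines, none at the end
                }
                sb.append(line);
                first = false;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Mixed Windows, Unix and classic Mac line endings.
        String sample = "one\r\ntwo\nthree\rfour";
        System.out.println(joinLines(new StringReader(sample))); // prints "one two three four"
    }
}
```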

Mota answered 29/1, 2010 at 16:57 Comment(0)
F
1

I find it odd that (Apache) StringUtils wasn't covered here yet.

You can remove all newlines (or any other occurrences of a substring, for that matter) from a string using the replace method:

StringUtils.replace(myString, "\n", "");

This line will replace all newlines with the empty string.

Because a newline is a single character, you can alternatively use the replaceChars method; passing an empty replacement string deletes the character:

StringUtils.replaceChars(myString, "\n", "");
Foreworn answered 2/7, 2016 at 13:45 Comment(1)
StringUtils.replaceEachRepeatedly(myString, new String[]{"\n", "\t"}, new String[]{StringUtils.EMPTY, StringUtils.EMPTY});Hypocoristic
E
0

You can use Apache Commons IO's IOUtils to iterate through the lines and append each line to a StringBuilder. Don't forget to close the InputStream.

StringBuilder sb = new StringBuilder();
FileInputStream fin=new FileInputStream("textfile.txt");
LineIterator lt=IOUtils.lineIterator(fin, "utf-8");
while(lt.hasNext())
{
  sb.append(lt.nextLine());
}
String text = sb.toString();
IOUtils.closeQuietly(fin);
Eunaeunice answered 21/1, 2016 at 21:22 Comment(0)
W
0

FYI: if you want to replace consecutive line breaks with a single line break, you can use

myString.trim().replaceAll("[\n]{2,}", "\n")

Or replace with a single space

myString.trim().replaceAll("[\n]{2,}", " ")
Which answered 19/9, 2016 at 13:44 Comment(0)
D
0

You can use generic methods to replace any char with any char.

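// Returns a copy of str with every occurrence of replaceChar changed to replaceWith.
public static String removeWithAnyChar(String str, char replaceChar,
        char replaceWith) {
    char[] chrs = str.toCharArray();
    for (int i = 0; i < chrs.length; i++) {
        if (chrs[i] == replaceChar) {
            chrs[i] = replaceWith;
        }
    }
    // Strings are immutable, so the modified array is returned as a new String.
    return new String(chrs);
}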
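
For comparison, the standard library already covers the single-character case with String.replace(char, char). A minimal sketch, with an illustrative wrapper name:

```java
public class ReplaceCharDemo {
    // Delegates to the built-in String.replace(char, char), which returns a new String.
    static String replaceChar(String str, char target, char replacement) {
        return str.replace(target, replacement);
    }

    public static void main(String[] args) {
        System.out.println(replaceChar("a\nb\nc", '\n', ' ')); // prints "a b c"
    }
}
```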
Dine answered 31/3, 2017 at 5:0 Comment(0)
H
-1

org.apache.commons.lang.StringUtils#chopNewline

Hanks answered 22/5, 2018 at 5:42 Comment(1)
-1 because Deprecated and only removes at the end of the string.Mayce
A
-2

Try doing this:

 textValue= textValue.replaceAll("\n", "");
 textValue= textValue.replaceAll("\t", "");
 textValue= textValue.replaceAll("\\n", "");
 textValue= textValue.replaceAll("\\t", "");
 textValue= textValue.replaceAll("\r", "");
 textValue= textValue.replaceAll("\\r", "");
 textValue= textValue.replaceAll("\r\n", "");
 textValue= textValue.replaceAll("\\r\\n", "");
Addlepated answered 9/11, 2012 at 10:18 Comment(1)
If you replace \n, there is no \r\n anymore. If you replace \n and there is an \\n, it will be replaced, so only the \ will remain.Livengood
