How to remove empty lines from a formatted string

11

42

How can I remove empty lines in a string in C#?

I am generating some text files in C# (Windows Forms), and for some reason they contain empty lines. How can I remove them after the string is generated (using StringBuilder and TextWriter)?

Example text file:

THIS IS A LINE



THIS IS ANOTHER LINE AFTER SOME EMPTY LINES!
Learnt answered 4/10, 2011 at 12:14 Comment(4)
Is removing the lines after generation really what you want to do? I think you should look at why you are generating extra lines. If you use the WriteLine(...) methods they will write the new line for you. The Write(...) methods do not write a new line sequence.Tisiphone
Well, it is not my fault. I am extracting text from some text files, and that is the problem!Learnt
#4974024Ledger
#4141223Quinine
M
106

If you also want to remove lines that only contain whitespace, use

resultString = Regex.Replace(subjectString, @"^\s+$[\r\n]*", string.Empty, RegexOptions.Multiline);

^\s+$ will remove everything from the first blank line to the last (in a contiguous block of empty lines), including lines that only contain tabs or spaces.

[\r\n]* will then remove the last CRLF (or just LF which is important because the .NET regex engine matches the $ between a \r and a \n, funnily enough).
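
A minimal sketch of how the pattern above might be wrapped and called. The class and method names are hypothetical and only there for illustration; the pattern itself is the one from this answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper, not part of the original answer.
public static class BlankLineRegexSketch
{
    public static string RemoveBlankLines(string subjectString)
    {
        // ^\s+$ spans a contiguous block of blank or whitespace-only lines;
        // [\r\n]* then consumes the line break left after that block.
        return Regex.Replace(subjectString, @"^\s+$[\r\n]*", string.Empty, RegexOptions.Multiline);
    }
}
```

For example, calling RemoveBlankLines("A\r\n\r\n\r\n\r\nB") should give "A\r\nB", and a block that includes a whitespace-only line, such as "A\n \t\n\nB", should give "A\nB".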

Metopic answered 4/10, 2011 at 12:17 Comment(9)
This almost works, however I have one problem : The last line is empty and it isn't removed. I'm lousy at regex so I'm not sure why?Kriemhild
@RobinRye: This is because it requires at least one whitespace character to match. If you change the \s+ to \s*, then it should also remove the last line.Metopic
Thanks Tim, I thought so too after researching Regex a bit but it didn't help. Changed to \s* but the last line was still left in the result string. I used str.Trim() to get rid of it.Kriemhild
This removes the last empty line too: Regex.Replace(subjectString, @"[\r\n]*^\s*$[\r\n]*", "", RegexOptions.Multiline);Cheffetz
@Diana: This might have a side effect. In some cases, too many newlines are removed with this method.Veneaux
@RobinRye Using str.Trim() will remove space and tab characters from the beginning of the first line of text. You may want to use str.TrimEnd() instead. If you also want to preserve spaces/tabs at the end of the last line of text, use str.TrimEnd('\r','\n').Polyclinic
@RicardoFontana: Can you elaborate how it‘s not working? This answer is rather specific to .NET regexes - how are you using that under Unix?Metopic
@TimPietzcker I wrote a test method: [Theory] [InlineData("\nText sample")] // Unix line break [InlineData("\r\nText sample")] // Windows line break public void RemoveBlankLinesInLinuxAndWindows(string text) { var resultString = Regex.Replace(text, @"^\s+$[\r\n]*", string.Empty, RegexOptions.Multiline); Assert.Equal("Text sample", resultString); }Anglo
@TimPietzcker Changing the regex to @"^\s*$\n|\r", as oobe did, made it work.Anglo
R
22

Tim Pietzcker - it is not working for me. I had to change it a little bit, but thanks!

Ehhh, C# Regex... I had to change it again, but this is working well:

private string RemoveEmptyLines(string lines)
{
  return Regex.Replace(lines, @"^\s*$\n|\r", string.Empty, RegexOptions.Multiline).TrimEnd();
}

Example: http://regex101.com/r/vE5mP1/2

Rodolforodolph answered 23/7, 2014 at 11:27 Comment(0)
U
12

You could try String.Replace("\n\n", "\n");

Unijugate answered 4/10, 2011 at 12:17 Comment(4)
well thanks but this is not a general solution, would not include tabs, spaces and stuff like thatLearnt
Your question didn't say anything about that. In fact you specifically said "empty lines."Unijugate
I tacked Trim() on as well. But still, it won't work in cases of \n\n\n.Leery
Well, that does not actually remove all empty lines. I faced a situation where a variable number of line breaks appeared in a row. In that case we need to iterate over the text several times.Chide
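
The repeated pass mentioned in the comments above could be sketched as follows. The class and method names are hypothetical, and the sketch assumes LF-only line breaks and truly empty lines (no spaces or tabs), which is all that the String.Replace approach covers.

```csharp
using System;

// Hypothetical helper illustrating the "iterate several times" idea.
public static class RepeatedReplaceSketch
{
    public static string CollapseEmptyLines(string text)
    {
        // Each pass halves a run of consecutive "\n", so loop until none is left.
        while (text.Contains("\n\n"))
        {
            text = text.Replace("\n\n", "\n");
        }

        return text;
    }
}
```

With this loop, "a\n\n\n\nb" becomes "a\n\nb" after the first pass and "a\nb" after the second.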
D
4

Try this

Regex.Replace(subjectString, @"^\r?\n?$", "", RegexOptions.Multiline);
Devout answered 4/10, 2011 at 12:16 Comment(0)
M
3
private string remove_space(string st)
{
    String final = "";

    char[] b = new char[] { '\r', '\n' };
    String[] lines = st.Split(b, StringSplitOptions.RemoveEmptyEntries);
    foreach (String s in lines)
    {
        if (!String.IsNullOrWhiteSpace(s))
        {
            final += s;
            final += Environment.NewLine;
        }
    }

    return final;
}
Mahon answered 16/8, 2015 at 18:32 Comment(3)
please add descriptionPeroration
You have a performance issue here. Consider testing your method with a string that contains 1 million \n characters, and consider using StringBuilder instead of string concatenation with +. I also think calling your function RemoveEmptyLines makes more sense.Slicker
An explanation would be in order.Real
T
1

I found a simple answer to this problem:

YourradTextBox.Lines = YourradTextBox.Lines.Where(p => p.Length > 0).ToArray();

Adapted from Marco Minerva [MCPD] at Delete Lines from multiline textbox if it's contain certain string - C#

Tyrannize answered 11/3, 2018 at 20:53 Comment(0)
T
1
private static string RemoveEmptyLines(string text)
{
    var lines = text.Split(new[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);

    var sb = new StringBuilder(text.Length);

    foreach (var line in lines)
    {
        sb.AppendLine(line);
    }

    return sb.ToString();
}
Thelma answered 8/1, 2019 at 10:56 Comment(2)
AppendLine appends an empty line at the end of the returned string.Crinkle
@thomasgalliker, That is the intention. split removes the newline from end of the line, thus you will need to add it back, otherwise all your lines are going to garble into one line! The only issue is Environment.NewLine is a string and cannot fit into char arraySlicker
C
1

Based on Evgeny Sobolev's code, I wrote this extension method, which also trims the last (obsolete) line break using TrimEnd(TrimNewLineChars):

public static class StringExtensions
{
    private static readonly char[] TrimNewLineChars = Environment.NewLine.ToCharArray();

    public static string RemoveEmptyLines(this string str)
    {
        if (str == null)
        {
            return null;
        }

        var lines = str.Split(TrimNewLineChars, StringSplitOptions.RemoveEmptyEntries);

        var stringBuilder = new StringBuilder(str.Length);

        foreach (var line in lines)
        {
            stringBuilder.AppendLine(line);
        }

        return stringBuilder.ToString().TrimEnd(TrimNewLineChars);
    }
}
Crinkle answered 25/5, 2019 at 16:35 Comment(5)
Your extension only works if the string in question originated on the same system. If it is transferred between systems, such as from Linux or the web to Windows, it won't work at all. Consider changing TrimNewLineChars to an explicit array.Slicker
I don't know what you mean. Can you post a sample string where it won't work? I will write a unit test with it. Thanks.Crinkle
Try it on text files where the end-of-line sequence is CR + LF (Windows), LF (Linux), and CR (classic Mac, before Mac OS X). CR = ASCII 13. LF = ASCII 10.Real
That is what AaA hinted at. Environment.NewLine only works if the file was created with the default line-end sequence of the current system. Most advanced text editors can handle, set, and save in these formats (in Visual Studio Code, a somewhat hidden feature lets you click the displayed setting (e.g., "LF") for a given file in the lower right and change it right there).Real
Please read the question carefully before voting everyone down.Crinkle
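
The line-ending concern raised in the comments above could be addressed by splitting on all three sequences explicitly instead of on Environment.NewLine. This is only a sketch under that assumption; the class name and the separator array are hypothetical and not taken from any answer here.

```csharp
using System;
using System.Linq;

// Hypothetical helper, shown for illustration only.
public static class LineEndingSketch
{
    // "\r\n" is listed first so a Windows line break is treated as one separator.
    private static readonly string[] LineSeparators = { "\r\n", "\r", "\n" };

    public static string RemoveEmptyLines(string text)
    {
        if (text == null)
        {
            return null;
        }

        // Drop lines that are empty or contain only whitespace.
        var lines = text
            .Split(LineSeparators, StringSplitOptions.None)
            .Where(line => !string.IsNullOrWhiteSpace(line));

        // The result uses the current system's line break and has no trailing break.
        return string.Join(Environment.NewLine, lines);
    }
}
```

Note that this normalizes every line break to Environment.NewLine, which may or may not be what you want if the output has to keep the original file's convention.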
O
1

None of the methods mentioned here helped me all the way, but I found a workaround.

  1. Split text to lines - collection of strings (with or without empty strings, also Trim() each string).

  2. Add these lines to multiline string.

     public static IEnumerable<string> SplitToLines(this string inputText, bool removeEmptyLines = true)
     {
         if (inputText == null)
         {
             yield break;
         }
    
         using (StringReader reader = new StringReader(inputText))
         {
             string line;
             while ((line = reader.ReadLine()) != null)
             {
                 // Skip empty or whitespace-only lines when requested.
                 if (removeEmptyLines && string.IsNullOrWhiteSpace(line))
                     continue;

                 yield return line.Trim();
             }
         }
     }
    
     public static string ToMultilineText(this string text)
     {
         var lines = text.SplitToLines();
    
         return string.Join(Environment.NewLine, lines);
     }
    
Ornie answered 22/7, 2020 at 7:54 Comment(0)
V
0

I tried the previous answers, but some of the regex-based ones do not work correctly.

If you use a regex to find the empty lines, you can't use the same match for deleting, because it will also erase the line breaks of lines that are not empty.

You have to use regex groups for this replacement.

Some of the other answers here that do not use regex can have performance issues.

    private string remove_empty_lines(string text) {
        StringBuilder text_sb = new StringBuilder(text);
        Regex rg_spaces = new Regex(@"(\r\n|\r|\n)([\s]+\r\n|[\s]+\r|[\s]+\n)");
        Match m = rg_spaces.Match(text_sb.ToString());
        while (m.Success) {
            text_sb = text_sb.Replace(m.Groups[2].Value, "");
            m = rg_spaces.Match(text_sb.ToString());
        }
        return text_sb.ToString().Trim();
    }
Vaccination answered 5/12, 2019 at 1:58 Comment(0)
H
-1

This pattern works perfectly to remove empty lines and lines with only spaces and/or tabs.

s = Regex.Replace(s, @"^\s*(\r\n|\Z)", "", RegexOptions.Multiline);
Huang answered 30/9, 2016 at 14:29 Comment(0)
