XDocument adds carriage return when generating final xml string

I need to generate XML that contains line breaks (\n) but no carriage returns (\r) before posting it to an API.

In C#, though, it seems that XDocument automatically adds carriage returns in its ToString() method:

var inputXmlString = "<root>Some text without carriage return\nthis is the new line</root>";

// inputXmlString: <root>Some text without carriage return\nthis is the new line</root>

var doc = XDocument.Parse(inputXmlString);

var xmlString = doc.Root.ToString();

// xmlString: <root>Some text without carriage return\r\nthis is the new line</root>

In doc.Root.ToString(), \r\n pairs are added between elements for indentation, which does not matter for the receiver's interpretation of the XML message as a whole. However, ToString() also adds \r inside the actual text content, where I need to preserve standalone line breaks (\n with no \r before it).

I know I could do a final string replace, removing all carriage returns from the string just before the HTTP POST, but this seems wrong.

The issue is the same when constructing the XML document from XElement objects instead of using XDocument.Parse. It also persists if I wrap the text in a CDATA section.

Can anyone explain whether I am doing something wrong, or whether there is a clean way to achieve this?

Mandate answered 29/9, 2016 at 15:25 Comment(0)

XNode.ToString is a convenience that uses an XmlWriter under the covers - you can see the code in the reference source.

Per the documentation for XmlWriterSettings.NewLineHandling:

The Replace setting tells the XmlWriter to replace new line characters with \r\n, which is the new line format used by the Microsoft Windows operating system. This helps to ensure that the file can be correctly displayed by the Notepad or Microsoft Word applications. This setting also replaces new lines in attributes with character entities to preserve the characters. This is the default value.

So this is why you're seeing this when you convert your element back to a string. If you want to change this behaviour, you'll have to create your own XmlWriter with your own XmlWriterSettings:

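// Requires: using System.IO; using System.Xml; using System.Xml.Linq;
var settings = new XmlWriterSettings
{
    OmitXmlDeclaration = true,
    // Write new lines exactly as they appear in the document (no \n -> \r\n)
    NewLineHandling = NewLineHandling.None
};

string xmlString;

using (var sw = new StringWriter())
{
    // Disposing the XmlWriter flushes it before the string is read
    using (var xw = XmlWriter.Create(sw, settings))
    {
        doc.Root.WriteTo(xw);
    }
    xmlString = sw.ToString();
}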
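
To see the effect of the setting side by side, here is a minimal sketch written as a C# 9+ top-level program. The helper names Serialize and Escape are hypothetical and not part of the answer above. NewLineChars is pinned to "\r\n" as an assumption, so that the Replace result does not depend on the platform default.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper: serialize an element with an explicit NewLineHandling.
// NewLineChars is set explicitly (an assumption for this sketch) so that the
// Replace case behaves the same on every platform.
static string Serialize(XElement element, NewLineHandling handling)
{
    var settings = new XmlWriterSettings
    {
        OmitXmlDeclaration = true,
        NewLineChars = "\r\n",
        NewLineHandling = handling
    };

    using (var sw = new StringWriter())
    {
        // The inner using flushes the XmlWriter before the string is read.
        using (var xw = XmlWriter.Create(sw, settings))
        {
            element.WriteTo(xw);
        }
        return sw.ToString();
    }
}

// Hypothetical helper: make control characters visible when printing.
static string Escape(string s) => s.Replace("\r", "\\r").Replace("\n", "\\n");

var root = XDocument.Parse("<root>line one\nline two</root>").Root;

string replaced = Serialize(root, NewLineHandling.Replace);
string untouched = Serialize(root, NewLineHandling.None);

Console.WriteLine(Escape(replaced));   // <root>line one\r\nline two</root>
Console.WriteLine(Escape(untouched));  // <root>line one\nline two</root>
```

With NewLineHandling.None the text content is written as it is held in the tree, so no string replace is needed afterwards.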
Kordofanian answered 29/9, 2016 at 16:1 Comment(0)

Have you tried the approach from "how to remove carriage returns, newlines, spaces from a string"?

string result = XElement.Parse(input).ToString(SaveOptions.DisableFormatting);
Console.WriteLine(result);
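
As far as I can tell from the reference source, SaveOptions.DisableFormatting only turns off indentation between elements and leaves the writer's default new-line handling in place. The following sketch (a C# 9+ top-level program, with a hypothetical Escape helper) lets you check what your own platform produces. The serialized output depends on the default new-line characters of the runtime, so no expected output is given for it.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical helper: make control characters visible when printing.
static string Escape(string s) => s.Replace("\r", "\\r").Replace("\n", "\\n");

var element = XElement.Parse("<root>line one\nline two</root>");

// Serialized without indentation; new lines inside the text may still be
// rewritten by the underlying XmlWriter, depending on the platform default.
string unformatted = element.ToString(SaveOptions.DisableFormatting);

// Value reads the text straight from the tree, with no writer involved.
string value = element.Value;

Console.WriteLine(Escape(unformatted));
Console.WriteLine(Escape(value));       // line one\nline two
```

If the first line still shows \r\n on your system, DisableFormatting alone is not enough and an XmlWriter with NewLineHandling.None is the more reliable route.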
Kalynkam answered 29/9, 2016 at 15:34 Comment(2)
Doesn't make a difference. - Hemotherapy
That fixed my issue: xdoc.Root.Value kept all the formatting, but xdoc.Root?.ToString() was removing the carriage return. Thanks! - Pankey

The other answer didn't work for me (after I converted it to VB), but this did:

Return xDoc.ToString(SaveOptions.DisableFormatting)

Iphlgenia answered 8/2, 2019 at 0:39 Comment(0)
