How can I force XDocument to output "UTF-8" in the declaration line?

The following code produces this output:

<?xml version="1.0" encoding="utf-16" standalone="yes"?>
<customers>
  <customer>
    <firstName>Jim</firstName>
    <lastName>Smith</lastName>
  </customer>
</customers>

How can I get it to produce encoding="utf-8" instead of encoding="utf-16"?

using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Linq;

namespace test_xml2
{
    class Program
    {
        static void Main(string[] args)
        {
            List<Customer> customers = new List<Customer> {
                new Customer {FirstName="Jim", LastName="Smith", Age=27},
                new Customer {FirstName="Hank", LastName="Moore", Age=28},
                new Customer {FirstName="Jay", LastName="Smythe", Age=44},
                new Customer {FirstName="Angie", LastName="Thompson", Age=25},
                new Customer {FirstName="Sarah", LastName="Conners", Age=66}
            };

            Console.WriteLine(BuildXmlWithLINQ(customers));

            Console.ReadLine();

        }
        private static string BuildXmlWithLINQ(List<Customer> customers)
        {
            XDocument xdoc =
                new XDocument(
                    new XDeclaration("1.0", "utf-8", "yes"),
                    new XElement("customers",
                        new XElement("customer",
                            new XElement("firstName", "Jim"),
                            new XElement("lastName", "Smith")
                        )
                    )
                );

            var wr = new StringWriter();
            xdoc.Save(wr);

            return wr.GetStringBuilder().ToString();
        }
    }

    public class Customer
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public int Age { get; set; }

        public string Display()
        {
            return String.Format("{0}, {1} ({2})", LastName, FirstName, Age);
        }
    }
}
Pled answered 20/7, 2010 at 8:44 Comment(0)
19

This is not a bug in .NET. It happens because you are using a StringWriter as the target for your XDocument. A StringWriter reports UTF-16 as its encoding (.NET strings are UTF-16 internally), so the declaration written to it says utf-16 regardless of what the XDeclaration specifies. If you save the XDocument to a stream or a file, it will use UTF-8 as instructed.

For more information, see MSDN information about StringWriter.Encoding:

This property is necessary for some XML scenarios where a header must be written containing the encoding used by the StringWriter. This allows the XML code to consume an arbitrary StringWriter and generate the correct XML header.
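
Because the declaration is taken from the writer's Encoding property, one common workaround is to subclass StringWriter and override that property. The following is a minimal sketch under that assumption; the names Utf8StringWriter and Utf8WriterDemo are hypothetical helpers made up for illustration, not framework types.

```csharp
using System.IO;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper: a StringWriter that reports UTF-8 instead of UTF-16.
public sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding => Encoding.UTF8;
}

public static class Utf8WriterDemo
{
    public static string BuildXml()
    {
        var xdoc = new XDocument(
            new XDeclaration("1.0", "utf-8", "yes"),
            new XElement("customers",
                new XElement("customer",
                    new XElement("firstName", "Jim"),
                    new XElement("lastName", "Smith"))));

        using (var writer = new Utf8StringWriter())
        {
            // Save asks the writer for its Encoding when emitting the declaration.
            xdoc.Save(writer);
            return writer.ToString();
        }
    }
}
```

Note that the returned value is still an ordinary .NET string; only the declaration text changes. If the string is later written to disk, it should be encoded as UTF-8 so that the bytes match the declaration.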

Hale answered 6/6, 2014 at 5:47 Comment(0)
17

Allow me to answer my own question; this seems to work:

private static string BuildXmlWithLINQ()
{
    XDocument xdoc = new XDocument
    (
        new XDeclaration("1.0", "utf-8", "yes"),
        new XElement("customers",
            new XElement("customer",
                new XElement("firstName", "Jim"),
                new XElement("lastName", "Smith")
            )
        )
    );
    return xdoc.Declaration.ToString() + Environment.NewLine + xdoc.ToString();
}
Pled answered 20/7, 2010 at 8:54 Comment(3)
Seems like a bug in the API to me that the serializer ignores this value in your XDeclaration. - Planer
@KirkWoll Not really: the default encoding for XML is UTF-8, so the encoding can be omitted. That raises the question of why UTF-8 needs to be written explicitly there. Probably the problem was that UTF-16 appeared there, not that UTF-8 was missing. - Hale
See also the thread "How to print <?xml version="1.0"?> using XDocument". - Bouley
0

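You can use the following code as an example. It saves the document to a stream, which keeps the declared UTF-8 encoding, and then writes the bytes to a file: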

XDocument doc = GetXmlDoc();
using (var stream = new MemoryStream())
{
    doc.Save(stream, SaveOptions.DisableFormatting);
    var docBytes = stream.ToArray();
    File.WriteAllBytes("fileName.xml", docBytes);
}
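
If the XML is needed as a string rather than a file, the same stream approach can be adapted. This is only a sketch: StreamDemo and SaveToUtf8String are hypothetical names, and it assumes the document should be serialized with default formatting.

```csharp
using System.IO;
using System.Text;
using System.Xml.Linq;

public static class StreamDemo
{
    // Hypothetical helper: serialize through a stream so the declaration says utf-8.
    public static string SaveToUtf8String(XDocument doc)
    {
        using (var stream = new MemoryStream())
        {
            doc.Save(stream);
            stream.Position = 0;

            // StreamReader detects and skips a UTF-8 byte order mark if one was written.
            using (var reader = new StreamReader(stream, Encoding.UTF8))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
```

Reading the bytes back through a StreamReader, rather than calling Encoding.UTF8.GetString on the raw array, avoids leaving a byte order mark character at the start of the string.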
Fillagree answered 9/11, 2019 at 10:42 Comment(0)
