Deserialize Xml with empty elements
Consider the following XML:

<a>
    <b>2</b>
    <c></c>
</a>  

I need to deserialize this XML to an object, so I wrote the following class.

public class A
{
    [XmlElement("b", Namespace = "")]
    public int? B { get; set; }

    [XmlElement("c", Namespace = "")]
    public int? C { get; set; }

}

Since I'm using nullables, I was expecting that deserializing the above XML would give me an object A with a null C property.

Instead, I get an exception saying that there is an error in the XML document.

Kuching answered 12/3, 2012 at 18:51 Comment(1)
<c></c> is not null. It is a zero-length string. - Polanco

There's a difference between a missing element and a null element.

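A missing element, <a><b>2</b></a>. Here C is never assigned, so it keeps whatever default value the class itself gives it (for example in a constructor or property initializer), or null if there's no explicit default. As far as I know, the DefaultValue attribute on its own does not assign a value during deserialization.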

A null element, <a xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'><b>2</b><c xsi:nil='true'/></a>. Here you will get null.

When you do <a><b>2</b><c></c></a>, the XML serializer will try to parse string.Empty as an integer and will correctly fail.
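The three cases can be checked with a minimal sketch like the one below. It assumes a class shaped like the one in the question; the `NilDemo` helper and its method names are made up for illustration and are not part of any library.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("a")]
public class A
{
    [XmlElement("b")]
    public int? B { get; set; }

    [XmlElement("c")]
    public int? C { get; set; }
}

// Hypothetical helper used only for this illustration.
public static class NilDemo
{
    // Deserializes an XML string into an instance of A.
    public static A Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(A));
        using (var reader = new StringReader(xml))
        {
            return (A)serializer.Deserialize(reader);
        }
    }

    // Returns true when XmlSerializer rejects the document.
    public static bool Fails(string xml)
    {
        try
        {
            Parse(xml);
            return false;
        }
        catch (InvalidOperationException)
        {
            // XmlSerializer wraps the underlying parse error in this exception type.
            return true;
        }
    }
}
```

With this in place, `NilDemo.Parse("<a><b>2</b></a>").C` and the `xsi:nil='true'` variant should both give null, while `NilDemo.Fails("<a><b>2</b><c></c></a>")` should report a failure.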

Since your provider is generating XML that doesn't match the expected type, you will need to do this if you are using the XmlSerializer:

[XmlRoot(ElementName = "a")]
public class A
{
    [XmlElement(ElementName = "b")]
    public int? B { get; set; }

    [XmlElement(ElementName = "c")]
    public string _c { get; set; }

    public int? C
    {
        get
        {
            int retval;

            return !string.IsNullOrWhiteSpace(_c) && int.TryParse(_c, out retval) ? (int?) retval : null;
        }
    }
}

or, slightly better, using the DataContractSerializer:

[DataContract(Name = "a", Namespace = "")] // empty namespace so the contract matches the un-namespaced <a> element
public class A1
{
    [DataMember(Name = "b")]
    public int? B { get; set; }

    [DataMember(Name = "c")]
    private string _c { get; set; }

    public int? C
    {
        get
        {
            int retval;

            return !string.IsNullOrWhiteSpace(_c) && int.TryParse(_c, out retval) ? (int?)retval : null;
        }
    }
}

Note, however, that the DataContractSerializer doesn't support XML attributes, if that's a problem.

Capsicum answered 12/3, 2012 at 19:3 Comment(5)
Thanks for your post. I'm aware of what you are telling me, but I don't have control of the XML data (it's coming from an external service). The service provider is returning <c></c> when in fact it should return xsi:nil='true', so I have to deal with this. - Whensoever
You will need to make C a string and then have a wrapper property that parses the deserialized string to an int? as required. - Capsicum
Thanks. Shouldn't your test be "String.IsNullOrEmpty" instead of "IsNullOrWhiteSpace"? You don't need to perform the cast (int?)retval. However, to avoid compile errors, we need to cast (int?)null. - Whensoever
I guess you don't need string.IsNullOrWhiteSpace or string.IsNullOrEmpty at all, since TryParse is enough. Yes, you need the (int?) cast when using the ?: operator so that both sides evaluate to int?. - Capsicum
It may be worth noting that if your class properties' names exactly match your XML element names (above, the lower-case element "c" was mapped to the upper-case property "C" through attributes), redirecting an element (with the XmlElement or DataMember attributes) can cause a conflict if your intended target already matched implicitly by name. Adding an [XmlIgnore] attribute to the target property can clear this up. - Wiggs

To deserialize empty tags like 'c' in your example:

    <foo>
        <b>2</b>
        <c></c>
    </foo>

I used this approach. First it removes the null or empty elements from the XML file using LINQ, and then it deserializes the new document, without the null or empty tags, to the Foo class.

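    using System.Linq;
    using System.Xml.Linq;
    using System.Xml.Serialization;

    public static Foo ReadXML(string file)
    {
        XDocument xdoc = XDocument.Load(file);

        // Drop every element that has no text content, e.g. <c></c> or <c/>.
        xdoc.Descendants().Where(e => string.IsNullOrEmpty(e.Value)).Remove();

        XmlSerializer xmlSer = new XmlSerializer(typeof(Foo));
        Foo foo;
        using (var reader = xdoc.Root.CreateReader())
        {
            // The using block disposes the reader, so no explicit Close() is needed.
            foo = (Foo)xmlSer.Deserialize(reader);
        }

        // Fall back to an empty Foo if nothing was deserialized.
        return foo ?? new Foo();
    }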

Which will give you default values on the missing properties.

    foo.b = 2;
    foo.c = 0; //for example, if it's an integer
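One thing to keep in mind is that `e.Value` concatenates the text of all descendants, so a parent whose children are all empty is removed as well. If that is not wanted, a variant along these lines could restrict the filter to leaf elements. This is only a sketch: the `Foo` shape, the `EmptyTagStripper` class and the `ReadXmlString` method are assumptions made for the example, and it reads from a string rather than a file.

```csharp
using System.Linq;
using System.Xml.Linq;
using System.Xml.Serialization;

// Assumed shape of Foo, matching the <foo> example above.
[XmlRoot("foo")]
public class Foo
{
    [XmlElement("b")]
    public int b { get; set; }

    [XmlElement("c")]
    public int c { get; set; }
}

// Hypothetical helper used only for this illustration.
public static class EmptyTagStripper
{
    public static Foo ReadXmlString(string xml)
    {
        XDocument xdoc = XDocument.Parse(xml);

        // Remove only leaf elements without text; containers with children stay.
        xdoc.Descendants()
            .Where(e => !e.HasElements && string.IsNullOrWhiteSpace(e.Value))
            .Remove();

        var serializer = new XmlSerializer(typeof(Foo));
        using (var reader = xdoc.Root.CreateReader())
        {
            return (Foo)serializer.Deserialize(reader);
        }
    }
}
```

Calling `EmptyTagStripper.ReadXmlString("<foo><b>2</b><c></c></foo>")` should then leave `c` at its default value. Note that a leaf element carrying only attributes would also be dropped by this filter.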

I combined information from these links:

Remove empty XML tags

Use XDocument as the source for XmlSerializer.Deserialize?

Gibeon answered 13/4, 2017 at 11:26 Comment(5)
Why not xdoc.Descendants().Where(e => e.IsEmpty || String.IsNullOrWhiteSpace(e.Value))? - Guideboard
You can also do using (TextReader reader = new StringReader(xdoc.ToString())) instead. - Guideboard
You sir saved my day! - Denver
Great job. Still valid in 2022. Thank you. - Stanzel
Still valid in 2024. After looking for a solution for hours, this one is the most efficient way to handle empty tags like <amounttip/>, as I did not want to change the .cs file generated by the xsd tool. `xdoc.Descendants().Where(e => string.IsNullOrEmpty(e.Value)).Remove();` removed the empty tags for me efficiently. - Hemangioma

null and empty are two different things.

You need to use two properties:

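[XmlElement("c")]
public string CAsString { get; set; } = "";

public int? C 
{
    get 
    {
        // An empty or whitespace-only element is treated as "no value".
        if (string.IsNullOrWhiteSpace(CAsString)) return null;
        return int.Parse(CAsString);
    }
}

With XmlSerializer the first one has to be public, because the serializer only handles public members. With DataContractSerializer (and [DataMember] instead of [XmlElement]) it can be kept private so that it does not pollute your API.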

The second one is read-only so it will be ignored by the serialization process.

Of course, this only covers reading. If you also need to write, you can add a setter to the C property; in that case mark C with [XmlIgnore] so that it is not serialized as an element of its own.
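A read/write version of the two-property approach might look like the sketch below. The class name `ReadWriteA`, the `ReadWriteDemo` helper and the use of the invariant culture are assumptions made for the example, and C is marked `[XmlIgnore]` because a public setter would otherwise make XmlSerializer emit it as a separate element.

```csharp
using System.Globalization;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("a")]
public class ReadWriteA
{
    // Raw text of <c>; XmlSerializer requires this to be public.
    [XmlElement("c")]
    public string CAsString { get; set; }

    // Typed view over CAsString; ignored by the serializer.
    [XmlIgnore]
    public int? C
    {
        get
        {
            if (string.IsNullOrWhiteSpace(CAsString)) return null;
            return int.Parse(CAsString, CultureInfo.InvariantCulture);
        }
        set
        {
            // A null value leaves CAsString null, so <c> is omitted on write.
            CAsString = value?.ToString(CultureInfo.InvariantCulture);
        }
    }
}

// Hypothetical helper used only for this illustration.
public static class ReadWriteDemo
{
    public static ReadWriteA Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(ReadWriteA));
        using (var reader = new StringReader(xml))
        {
            return (ReadWriteA)serializer.Deserialize(reader);
        }
    }

    public static string Write(ReadWriteA value)
    {
        var serializer = new XmlSerializer(typeof(ReadWriteA));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, value);
            return writer.ToString();
        }
    }
}
```

Unlike the getter-only version, this one lets callers assign `C` directly and have the change show up in the serialized `<c>` element.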

Rubric answered 15/11, 2023 at 16:8 Comment(2)
How is this answer different from the accepted answer? (accepted answer is: https://mcmap.net/q/906708/-deserialize-xml-with-empty-elements ) - Sheerness
It's clearer and provides more information. - Rubric
