.NET XML Deserialization ignore namespaces
S

4

12

I have thousands of XML files that all follow the same schema/structure. I implemented IXmlSerializable, so I read the elements and attributes myself.

My problem is that each of these files uses a different phony namespace. The files come from another source, so I cannot change that :D Also, there are too many of these namespaces for me to build an array of them and pass it along to the XmlSerializer.

Right now, if I don't specify a namespace, it throws a [xmlns:ns0="http://tempuri.org/abcd.xsd" was not expected] error.

I would like to tell the serializer to simply ignore the namespace when deserializing my object and just call ReadXml, or at least to accept any "http://tempuri.org/" namespace.

Is that possible?

I would like to avoid modifying the files as much as possible.

Thank you!

Stinkhorn answered 25/9, 2012 at 20:24 Comment(3)
Have you considered loading the XML first in order to get the namespace so that you could then pass that into the XmlSerializer? - Brendanbrenden
@StevenDoggart Yes I did, but I would like to know if there's a more "appropriate" way to do this before I start working around it. It just seems silly that you can't ignore namespaces without getting an exception :S - Stinkhorn
Yes, it's a very good question and I'm very curious if there is an answer to it as well. - Brendanbrenden
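
For reference, the workaround suggested in the first comment could look roughly like the C# sketch below. It peeks at the root element to find the namespace and then passes that value to the XmlSerializer(Type, String) constructor as the default namespace. The Record type and the RootNamespaceSniffer helper are hypothetical names used only for illustration, and the sketch assumes a plain attribute-free class rather than the IXmlSerializable type from the question. With thousands of distinct namespaces, each one probably results in a separately generated serializer, so this may be costly.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical payload type, used only for this illustration.
public class Record
{
    public string Name { get; set; }
}

public static class RootNamespaceSniffer
{
    // Returns the namespace URI of the document's root element.
    public static string ReadRootNamespace(string xml)
    {
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent();   // skip the declaration, comments and whitespace
            return reader.NamespaceURI;
        }
    }

    // Deserializes a Record, using whatever namespace the root element
    // declares as the serializer's default namespace.
    public static Record Deserialize(string xml)
    {
        var serializer = new XmlSerializer(typeof(Record), ReadRootNamespace(xml));
        using (var reader = new StringReader(xml))
        {
            return (Record)serializer.Deserialize(reader);
        }
    }
}
```

The document is parsed twice here (once to find the namespace, once to deserialize), which keeps the sketch simple at the cost of some extra work per file.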
R
2

Yes, it is possible. When you call the Deserialize method of your XmlSerializer, you can specify an XmlTextReader instance.

This answer by Cheeso on a related C# question shows how to create an XmlTextReader that ignores any namespaces occurring in the XML file. I have taken the liberty of translating his idea to VB and creating a simple proof-of-concept example based on your requirements:

Imports System.IO
Imports System.Text
Imports System.Xml
Imports System.Xml.Serialization

' Helper class
Class NamespaceIgnorantXmlTextReader
    Inherits XmlTextReader

    Public Sub New(stream As Stream)
        MyBase.New(stream)
    End Sub

    Public Overrides ReadOnly Property NamespaceURI As String
        Get
            Return ""
        End Get
    End Property
End Class

' Serializable class
Public Class ExampleClass
    Public Property MyProperty As String
End Class

' Example
Module Module1
    Sub Main()
        Dim testXmlStream = New MemoryStream(Encoding.UTF8.GetBytes(
            "<ExampleClass xmlns=""http://tempuri.org/SomePhonyNamespace1.xsd"" 
                           xmlns:ph2=""http://tempuri.org/SomePhonyNamespace2.xsd"">
                 <ph2:MyProperty>SomeValue</ph2:MyProperty>
             </ExampleClass>"))

        Dim serializer As New XmlSerializer(GetType(ExampleClass))
        Dim reader As New NamespaceIgnorantXmlTextReader(testXmlStream)
        Dim example = DirectCast(serializer.Deserialize(reader), ExampleClass)

        Console.WriteLine(example.MyProperty)   ' prints SomeValue
    End Sub
End Module
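
For readers working in C#, the same helper can be sketched as follows. This is a straightforward translation of the VB proof of concept above, not a tested drop-in for any particular project; the NamespaceIgnorantDemo wrapper class is a hypothetical name added only to make the example callable.

```csharp
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

// Reader that reports an empty namespace for every node, so that
// XmlSerializer matches elements by local name only.
public class NamespaceIgnorantXmlTextReader : XmlTextReader
{
    public NamespaceIgnorantXmlTextReader(Stream stream) : base(stream) { }

    public override string NamespaceURI => "";
}

// Serializable class
public class ExampleClass
{
    public string MyProperty { get; set; }
}

// Hypothetical wrapper, added for illustration only.
public static class NamespaceIgnorantDemo
{
    // Deserializes an ExampleClass from XML text, ignoring all namespaces.
    public static ExampleClass Deserialize(string xml)
    {
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
        using (var reader = new NamespaceIgnorantXmlTextReader(stream))
        {
            var serializer = new XmlSerializer(typeof(ExampleClass));
            return (ExampleClass)serializer.Deserialize(reader);
        }
    }
}
```

Calling NamespaceIgnorantDemo.Deserialize with the test document from the VB example should yield an object whose MyProperty is SomeValue.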

Note: If only the document's default namespace differs (i.e., the individual tags don't have different namespaces), a standard XmlTextReader with the Namespaces property set to False suffices.

Imports System.IO
Imports System.Text
Imports System.Xml
Imports System.Xml.Serialization

' Serializable Class
Public Class ExampleClass
    Public Property MyProperty As String
End Class

' Example
Module Module1
    Sub Main()
        Dim testXmlStream = New MemoryStream(Encoding.UTF8.GetBytes(
            "<ExampleClass xmlns=""http://tempuri.org/SomePhonyNamespace1.xsd"">
                 <MyProperty>SomeValue</MyProperty>
             </ExampleClass>"))

        Dim serializer As New XmlSerializer(GetType(ExampleClass))
        Dim reader As New XmlTextReader(testXmlStream)
        reader.Namespaces = False
        Dim example = DirectCast(serializer.Deserialize(reader), ExampleClass)

        Console.WriteLine(example.MyProperty)   ' prints SomeValue
    End Sub
End Module
Recension answered 13/10, 2016 at 21:9 Comment(0)
L
0

This is not an answer to your question about how to tell the XmlSerializer to ignore namespaces, but a workaround. You can use an XSLT transform to strip the namespaces from the XML before deserializing it.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/|comment()|processing-instruction()">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="*">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
  </xsl:template>

  <xsl:template match="@*">
    <xsl:attribute name="{local-name()}">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </xsl:template>

</xsl:stylesheet>

Here are a couple of extension methods that help with this. It may be a bit tricky to include every helper they depend on, but I'll try:

/// <summary>
/// Transforms the xmldocument to remove all namespaces using xslt
/// https://mcmap.net/q/186692/-how-to-remove-all-namespaces-from-xml-with-c
/// http://msdn.microsoft.com/en-us/library/42d26t30.aspx
/// </summary>
/// <param name="xmlDocument"></param>
/// <param name="indent"></param>
public static XmlDocument RemoveXmlNameSpaces(this XmlDocument xmlDocument, bool indent = true)
{
    return xmlDocument.ApplyXsltTransform(Properties.Resources.RemoveNamespaces, indent);
}

public static XmlDocument ApplyXsltTransform(this XmlDocument xmlDocument, string xsltString, bool indent = true)
{
    var xslCompiledTransform = new XslCompiledTransform();
    // Fall back to DefaultEncoding (a member of the enclosing static class, not shown)
    // when the document does not declare an encoding.
    Encoding encoding = xmlDocument.GetEncoding() ?? DefaultEncoding;
    using (var xmlTextReader = xsltString.GetXmlTextReader())
    {
        xslCompiledTransform.Load(xmlTextReader);
    }
    XPathDocument xPathDocument = null;
    using (XmlTextReader xmlTextReader = xmlDocument.OuterXml.GetXmlTextReader())
    {
        xPathDocument = new XPathDocument(xmlTextReader);
    }
    using (var memoryStream = new MemoryStream())
    {
        using (XmlWriter xmlWriter = XmlWriter.Create(memoryStream, new XmlWriterSettings()
            {
                Encoding = encoding,
                Indent = indent
            }))
        {
            xslCompiledTransform.Transform(xPathDocument, xmlWriter);
        }
        memoryStream.Position = 0;
        using (var streamReader = new StreamReader(memoryStream, encoding))
        {
            string readToEnd = streamReader.ReadToEnd();
            return readToEnd.ToXmlDocument();
        }
    }
}

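public static Encoding GetEncoding(this XmlDocument xmlDocument)
{
    XmlDeclaration xmlDeclaration = xmlDocument.GetXmlDeclaration();
    // No declaration, or a declaration without an encoding attribute
    // (Encoding is then an empty string): let the caller pick a default.
    if (xmlDeclaration == null || string.IsNullOrEmpty(xmlDeclaration.Encoding))
        return null;
    return Encoding.GetEncoding(xmlDeclaration.Encoding);
}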

public static XmlDeclaration GetXmlDeclaration(this XmlDocument xmlDocument)
{
    XmlDeclaration xmlDeclaration = null;
    if (xmlDocument.HasChildNodes)
        xmlDeclaration = xmlDocument.FirstChild as XmlDeclaration;
    return xmlDeclaration;
}

public static XmlTextReader GetXmlTextReader(this string xml)
{
    return new XmlTextReader(new StringReader(xml));
}
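
If the XmlDocument round trip is not needed, the same idea can be sketched more compactly. The version below is an illustration under a few assumptions, not the code from this answer: it embeds the stylesheet above as a string constant instead of a project resource, works on XML text rather than XmlDocument, and uses the hypothetical names NamespaceStripper and Strip.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical helper, added for illustration only.
public static class NamespaceStripper
{
    // The namespace-stripping stylesheet shown above, embedded as a string.
    private const string Stylesheet = @"<xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:template match=""/|comment()|processing-instruction()"">
    <xsl:copy><xsl:apply-templates/></xsl:copy>
  </xsl:template>
  <xsl:template match=""*"">
    <xsl:element name=""{local-name()}"">
      <xsl:apply-templates select=""@*|node()""/>
    </xsl:element>
  </xsl:template>
  <xsl:template match=""@*"">
    <xsl:attribute name=""{local-name()}""><xsl:value-of select="".""/></xsl:attribute>
  </xsl:template>
</xsl:stylesheet>";

    // Returns the input XML with every element and attribute moved to no namespace.
    public static string Strip(string xml)
    {
        var transform = new XslCompiledTransform();
        using (var xsltReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            transform.Load(xsltReader);
        }

        var output = new StringWriter();
        using (var input = XmlReader.Create(new StringReader(xml)))
        using (var writer = XmlWriter.Create(output, transform.OutputSettings))
        {
            transform.Transform(input, writer);
        }
        return output.ToString();
    }
}
```

The returned string can then be handed to XmlSerializer.Deserialize through a StringReader. Compiling the stylesheet on every call is wasteful for thousands of files, so in practice the loaded XslCompiledTransform would probably be cached.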
Lumber answered 11/10, 2012 at 16:15 Comment(0)
S
0

To expound on Heinzi's answer, I needed to change the default namespace (technically, the namespace of the root element) so I could deserialize documents using XmlAttributeOverrides applied to a class hierarchy I didn't control. As part of that, I had to assign an XmlRootAttribute to the first class. The problem is that XmlSerializer expects the namespace of the document's root element to match the XmlRootAttribute's Namespace value in order to deserialize the document, which cannot be guaranteed.

With the following class, derived from XmlReader, the root element's namespace can be assigned a known value for the deserializer. The reader's namespace can be forced to match the XmlRootAttribute attribute's namespace (even if it is an empty string).

The solution is simplified by using the XmlWrappingReader from Alterant's answer to the Stack Overflow question "How do I create a XmlTextReader that ignores Namespaces and does not check characters".

/// <summary>
/// XML document reader replaces the namespace of the root element.
/// </summary>
public class MyXmlReader : Mvp.Xml.Common.XmlWrappingReader
{
    // Namespace of the document's root element. Read from document.
    private string rootNamespace = "";

    /// <summary>
    /// Get or set the target namespace to use when deserializing.
    /// </summary>
    public string TargetNamespace { get; set; }

    /// <summary>
    /// Initialize a new instance of the MyXmlReader class.
    /// </summary>
    /// <param name="reader">XmlReader instance to modify.</param>
    public MyXmlReader(XmlReader reader) : base(reader)
    {
        TargetNamespace = "";
    }

    /// <summary>
    /// Return the namespace of the XML node. Substitute the target namespace if it matches the namespace of the root element.
    /// </summary>
    public override string NamespaceURI
    {
        get
        {
            if (Depth == 0 && NodeType == XmlNodeType.Element)
            {
                // Save the namespace from the document's root element.
                rootNamespace = base.NamespaceURI;
            }

            if (base.NamespaceURI == rootNamespace)
            {
                // Substitute the schema's targetNamespace for the root namespace.
                return TargetNamespace;
            }

            // Use the native namespace of the XML node.
            return base.NamespaceURI;
        }
    }
}

I instantiated MyXmlReader and used it to deserialize to an object marked with XmlRootAttribute(ElementName = "DocumentRoot", Namespace = "http://my.target.namespace"):

var reader = new MyXmlReader(XmlReader.Create(stream));
reader.TargetNamespace = "http://my.target.namespace";

// Deserialize using the defined XML attribute overrides that can
// supply XML serialization attributes to types at runtime.
Type t = typeof(SomeDeserializedObject);
var xo = SomeDeserializedObject.GetXmlAttributeOverrides();
XmlSerializer serializer = new XmlSerializer(t, xo);
SomeDeserializedObject o = (SomeDeserializedObject)serializer.Deserialize(reader);

Now, if the XML document I import has a different root namespace, or doesn't specify one at all, I can still deserialize it.

Speedwriting answered 12/3, 2020 at 19:43 Comment(0)
E
-1

You can remove the namespaces from the XML file you write by using this code:

using (FileStream stream = new FileStream("FilePath", FileMode.Create))
{
    XmlSerializer serializer = new XmlSerializer(typeof(YourClass));
    XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
    ns.Add("", "");   // empty prefix and namespace: no xmlns:xsi/xmlns:xsd declarations are written
    serializer.Serialize(stream, yourObject, ns);   // yourObject: the YourClass instance to serialize
}
Ethiopia answered 4/2, 2015 at 20:3 Comment(0)
