.NET: Prevent XmlDocument.LoadXml from retrieving DTD
I have the following C# code. It takes too long to run and then throws an exception:

new XmlDocument().LoadXml("<?xml version='1.0' ?><!DOCTYPE note SYSTEM 'http://someserver/dtd'><note></note>");

I understand why it does that. My question is: how do I make it stop? I don't care about DTD validation. I suppose I could just regex-replace the DOCTYPE, but I am looking for a more elegant solution.

Background:
The actual XML is received from a web site I do not own. When the site is undergoing maintenance, it returns XML with a DOCTYPE that points to a DTD that is not available during maintenance. So my service becomes unnecessarily slow, because it tries to fetch the DTD for every XML document I need to parse.

Here is the exception stack trace:

Unhandled Exception: System.Net.WebException: The remote name could not be resolved: 'someserver'
at System.Net.HttpWebRequest.GetResponse()
at System.Xml.XmlDownloadManager.GetNonFileStream(Uri uri, ICredentials credentials)
at System.Xml.XmlDownloadManager.GetStream(Uri uri, ICredentials credentials)
at System.Xml.XmlUrlResolver.GetEntity(Uri absoluteUri, String role, Type ofObjectToReturn)
at System.Xml.XmlTextReaderImpl.OpenStream(Uri uri)
at System.Xml.XmlTextReaderImpl.DtdParserProxy_PushExternalSubset(String systemId, String publicId)
at System.Xml.XmlTextReaderImpl.DtdParserProxy.System.Xml.IDtdParserAdapter.PushExternalSubset(String systemId, String publicId)
at System.Xml.DtdParser.ParseExternalSubset()
at System.Xml.DtdParser.ParseInDocumentDtd(Boolean saveInternalSubset)
at System.Xml.DtdParser.Parse(Boolean saveInternalSubset)
at System.Xml.XmlTextReaderImpl.DtdParserProxy.Parse(Boolean saveInternalSubset)
at System.Xml.XmlTextReaderImpl.ParseDoctypeDecl()
at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
at System.Xml.XmlTextReaderImpl.Read()
at System.Xml.XmlLoader.LoadDocSequence(XmlDocument parentDoc)
at System.Xml.XmlLoader.Load(XmlDocument doc, XmlReader reader, Boolean preserveWhitespace)
at System.Xml.XmlDocument.Load(XmlReader reader)
at System.Xml.XmlDocument.LoadXml(String xml)
at ConsoleApplication36.Program.Main(String[] args) in c:\Projects\temp\ConsoleApplication36\Program.cs:line 11
Boorish answered 14/12, 2010 at 23:30 Comment(2)
It is also important to mitigate this for XXE: owasp.org/index.php/XML_External_Entity_(XXE)_Processing - Sepulcher
objXML.validateOnParse=false and objXML.resolveExternals=false work for me - Mi

Well, in .NET 4.0 XmlTextReader has a property called DtdProcessing. When set to DtdProcessing.Ignore it should disable DTD processing.
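
As a minimal sketch of the approach described above, the XML string can be fed through an XmlTextReader with DtdProcessing set to Ignore and then handed to XmlDocument.Load instead of calling LoadXml directly. The class and method names (DtdIgnoringLoader, LoadIgnoringDtd) are made up for illustration, and setting XmlResolver to null on both the reader and the document is an extra precaution, not something the answer requires.

```csharp
using System.IO;
using System.Xml;

public static class DtdIgnoringLoader
{
    // Hypothetical helper (not from the original post): parses an XML string
    // while skipping the DOCTYPE declaration, so no DTD is downloaded.
    public static XmlDocument LoadIgnoringDtd(string xml)
    {
        using (var reader = new XmlTextReader(new StringReader(xml)))
        {
            // DtdProcessing is available from .NET 4.0 onwards.
            reader.DtdProcessing = DtdProcessing.Ignore;
            // Extra precaution (assumption): never resolve external resources.
            reader.XmlResolver = null;

            var doc = new XmlDocument();
            doc.XmlResolver = null;
            doc.Load(reader);
            return doc;
        }
    }
}
```

Calling DtdIgnoringLoader.LoadIgnoringDtd with the XML from the question should return a document whose DocumentElement is the note element, without any request to someserver. Note that with DtdProcessing.Ignore the resulting document has no DocumentType node.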

Bourgeon answered 16/12, 2010 at 15:4 Comment(1)
You should try setting XmlReader.Settings.ValidationType to ValidationType.None. Alternatively, I think setting XmlReader.Settings.XmlResolver to null could also do the trick - Bourgeon
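
The settings-based variant suggested in this comment might look roughly like the sketch below. It assumes the settings are supplied through an XmlReaderSettings instance passed to XmlReader.Create, since a reader's settings are fixed once it has been created. DtdProcessing.Parse is set explicitly on the assumption that the default in .NET 4.0 and later (Prohibit) would otherwise reject the DOCTYPE. The names NoResolverLoader and LoadWithoutResolver are hypothetical.

```csharp
using System.IO;
using System.Xml;

public static class NoResolverLoader
{
    // Hypothetical helper (not from the original post): allows the DOCTYPE
    // but gives the parser no resolver, so the external DTD is not fetched.
    public static XmlDocument LoadWithoutResolver(string xml)
    {
        var settings = new XmlReaderSettings
        {
            // Assumption: needed because the default (Prohibit) throws on a DOCTYPE.
            DtdProcessing = DtdProcessing.Parse,
            // No DTD or schema validation.
            ValidationType = ValidationType.None,
            // Without a resolver, external resources are not opened.
            XmlResolver = null
        };

        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            var doc = new XmlDocument();
            doc.XmlResolver = null;
            doc.Load(reader);
            return doc;
        }
    }
}
```

Compared with DtdProcessing.Ignore, this variant still reads the DOCTYPE declaration itself and only skips the external subset, which may matter if the document relies on an internal subset.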

In .NET 4.5.1, I had no luck setting doc.XmlResolver to null.

The easiest fix for me was to use a string replacement to change "xmlns=" to "ignore=" before calling LoadXml(), e.g.

var responseText = await response.Content.ReadAsStringAsync();
responseText = responseText.Replace("xmlns=", "ignore=");
try
{
    var doc = new XmlDocument();
    doc.LoadXml(responseText);
    ...
}
Socket answered 6/7, 2016 at 17:42 Comment(0)
