Why "Data at the root level is invalid. Line 1, position 1." for XML Document?

I am using a third-party DLL which transmits an XML document over the internet.

Why would the DLL be throwing the following exception?

Data at the root level is invalid. Line 1, position 1. (see below for full exception text.)

Here are the first few lines of the XML Document:

<?xml version="1.0" encoding="utf-8"?>
<REQUEST>
  <HEADER>
    <REQUESTID>8a5f6d56-d56d-4b7b-b7bf-afcf89cd970d</REQUESTID>
    <MESSAGETYPE>101</MESSAGETYPE>
    <MESSAGEVERSION>3.0.2</MESSAGEVERSION>

Exception:

System.ApplicationException was caught
      Message=Unexpected exception.
      Source=FooSDK
      StackTrace:
           at FooSDK.RequestProcessor.Send(String SocketServerAddress, Int32 port)
           at Foo.ExecuteRequest(Int32 messageID, IPayload payload, Provider prov)
           at Foo.SendOrder(Int32 OrderNo)
      InnerException: System.Xml.XmlException
           LineNumber=1
           LinePosition=1
           Message=Data at the root level is invalid. Line 1, position 1.
           Source=System.Xml
           SourceUri=""
           StackTrace:
                at System.Xml.XmlTextReaderImpl.Throw(Exception e)
                at System.Xml.XmlTextReaderImpl.Throw(String res, String arg)
                at System.Xml.XmlTextReaderImpl.ParseRootLevelWhitespace()
                at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
                at System.Xml.XmlTextReaderImpl.Read()
                at System.Xml.XmlLoader.Load(XmlDocument doc, XmlReader reader, Boolean preserveWhitespace)
                at System.Xml.XmlDocument.Load(XmlReader reader)
                at System.Xml.XmlDocument.LoadXml(String xml)
                at XYZ.RequestProcessor.GetObjectFromXML(String xmlResult)
                at XYZ.RequestProcessor.Send(String SocketServerAddress, Int32 port)
           InnerException:
Pantry answered 30/7, 2013 at 12:38 Comment(2)
How is the XML file transmitted over the internet? HTTP? If so, check whether a) the file has a BOM, and b) the HTTP header also specifies a non-UTF-8 charset. - Reticle
Looking at the stack trace, I think this is the issue: https://mcmap.net/q/206142/-xmldocument-load-vs-xmldocument-loadxml - Ferocious

I eventually figured out that the string started with a byte order mark (BOM), and removed it using this code:

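    // The UTF-8 preamble (EF BB BF) decodes to the single character U+FEFF.
    string _byteOrderMarkUtf8 = Encoding.UTF8.GetString(Encoding.UTF8.GetPreamble());
    // Use an ordinal comparison; a culture-sensitive StartsWith can ignore U+FEFF.
    if (xml.StartsWith(_byteOrderMarkUtf8, StringComparison.Ordinal))
    {
        // The second argument to Remove is the number of characters to remove.
        xml = xml.Remove(0, _byteOrderMarkUtf8.Length);
    }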
Duplicator answered 1/7, 2014 at 14:51 Comment(3)
This is a zero-based char array, so I had to use xml = xml.Remove(0, _byteOrderMarkUtf8.Length-1); - Nuke
Surely it should be xml = xml.Remove(0, _byteOrderMarkUtf8.Length); as the second argument to String.Remove(Int32, Int32) is the number of characters to remove, starting at startIndex. See the MSDN docs. - Luby
To make this code work on Windows Server 2012 as well as Windows 7, I had to use if (xml.StartsWith(_byteOrderMarkUtf8, StringComparison.Ordinal)). Check out https://mcmap.net/q/205140/-startswith-change-in-windows-server-2012 for details. - Luby

I can offer two suggestions:

  1. You appear to be calling the "LoadXml" method rather than "Load". In some cases, switching to "Load" has helped me.
  2. You may have an encoding problem. Could you check the encoding of the XML file and post it here?
Variolite answered 30/7, 2013 at 12:48 Comment(1)
Just saw the comment. Yes, try setting the file encoding to UTF-8 without BOM. - Variolite
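
If you control the code that produces the XML, one way to get "UTF-8 without BOM" output is to hand the writer a UTF8Encoding that is told not to emit the preamble. The following is only a minimal sketch under that assumption; the class and method names (NoBomWriter, Serialize) are made up for illustration and are not part of the third-party DLL.

```csharp
using System.IO;
using System.Text;
using System.Xml;

static class NoBomWriter
{
    // Serializes the document as UTF-8 bytes without the EF BB BF preamble.
    public static byte[] Serialize(XmlDocument doc)
    {
        var settings = new XmlWriterSettings
        {
            // false = do not emit the UTF-8 byte order mark
            Encoding = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false)
        };

        using (var stream = new MemoryStream())
        {
            using (var writer = XmlWriter.Create(stream, settings))
            {
                doc.Save(writer);
            }
            // The writer has been flushed and disposed, so the buffer is complete.
            return stream.ToArray();
        }
    }
}
```

The first byte of the result should then be '<' (0x3C), the start of the XML declaration. By contrast, passing Encoding.UTF8 itself in the settings writes the BOM first.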

Remove everything before <?xml version="1.0" encoding="utf-8"?>

Sometimes there is an "invisible" character in front of it (one that not every text editor displays). Some programs add it when saving the file.

It is called a byte order mark (BOM); you can read more about it here: https://en.wikipedia.org/wiki/Byte_order_mark#Representations_of_byte_order_marks_by_encoding

Painty answered 7/12, 2016 at 20:49 Comment(0)
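
To see what is actually sitting in front of the declaration, and to drop it, something along these lines may help. It is a rough sketch, not a general XML sanitizer: XmlPrefix, DescribePrefix and TrimToFirstTag are hypothetical names, and the code simply assumes the document proper starts at the first '<'.

```csharp
using System;
using System.Linq;

static class XmlPrefix
{
    // Lists the code points that precede the first '<', e.g. "U+FEFF" for a BOM.
    public static string DescribePrefix(string xml)
    {
        int start = xml.IndexOf('<');
        string prefix = start < 0 ? xml : xml.Substring(0, start);
        return string.Join(" ", prefix.Select(c => "U+" + ((int)c).ToString("X4")));
    }

    // Drops everything before the first '<'.
    // Returns the input unchanged if it already starts with '<' or contains none.
    public static string TrimToFirstTag(string xml)
    {
        int start = xml.IndexOf('<');
        return start > 0 ? xml.Substring(start) : xml;
    }
}
```

Logging DescribePrefix(xml) before parsing makes an otherwise invisible leading character show up by its code point.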

The main culprit for this error is the logic that determines the encoding when converting a Stream or byte[] to a .NET string.

A StreamReader created with its second constructor parameter, detectEncodingFromByteOrderMarks, set to true will determine the proper encoding and produce a string that does not break the XmlDocument.LoadXml method.

public string GetXmlString(string url)
{
    using var stream = GetResponseStream(url);
    using var reader = new StreamReader(stream, true);
    return reader.ReadToEnd(); // no exception on `LoadXml`
}

A common mistake is to blindly decode the stream or byte[] as UTF-8. The code below produces a string that looks valid when inspected in the Visual Studio debugger, or when copy-pasted elsewhere, but it will raise the exception when used with Load or LoadXml if the file is encoded as anything other than UTF-8 without a BOM.

public string GetXmlString(string url)
{
    byte[] bytes = GetResponseByteArray(url);
    return System.Text.Encoding.UTF8.GetString(bytes); // potentially exception on `LoadXml`
}

So your third-party library probably uses the second approach to decode the XML stream into a string, hence the exception.

Jeopardous answered 19/10, 2020 at 15:11 Comment(0)
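
The difference between the two approaches can be reproduced without any network access. The sketch below uses a made-up seven-byte payload (a UTF-8 BOM followed by <a/>) in place of a real response; DecodeDemo and its members are illustrative names only.

```csharp
using System.IO;
using System.Text;

static class DecodeDemo
{
    // Hypothetical payload: UTF-8 BOM (EF BB BF) followed by "<a/>".
    public static readonly byte[] Payload = { 0xEF, 0xBB, 0xBF, 0x3C, 0x61, 0x2F, 0x3E };

    // Second approach from the answer: the BOM survives as a leading U+FEFF character.
    public static string DecodeNaively(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }

    // First approach from the answer: StreamReader detects and skips the BOM.
    public static string DecodeWithReader(byte[] bytes)
    {
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, detectEncodingFromByteOrderMarks: true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

The string from DecodeWithReader starts with '<', while the one from DecodeNaively starts with U+FEFF, which is the kind of input that makes LoadXml fail at line 1, position 1.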

If you are using XDocument.Load(url); to fetch XML from another domain, it is possible that the host will reject the request and return an unexpected (non-XML) result, which causes the XmlException above.

See my solution to this eventuality here: XDocument.Load(feedUrl) returns "Data at the root level is invalid. Line 1, position 1."

Zapateado answered 12/11, 2014 at 7:33 Comment(0)
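
Independently of the linked solution (which is not reproduced here), one defensive option is to download the body as text first and only parse it if it plausibly is XML. This is a sketch under that assumption; FeedParser.TryParse is a hypothetical helper, and how the body is fetched is left out.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

static class FeedParser
{
    // Returns the parsed document, or null when the body is empty,
    // does not start with '<', or is not well-formed XML.
    public static XDocument TryParse(string body)
    {
        if (string.IsNullOrWhiteSpace(body))
        {
            return null;
        }

        // Ignore a leading BOM and whitespace before checking the first character.
        string trimmed = body.TrimStart('\uFEFF', ' ', '\t', '\r', '\n');
        if (!trimmed.StartsWith("<", StringComparison.Ordinal))
        {
            return null;   // e.g. a plain-text error message from the host
        }

        try
        {
            return XDocument.Parse(trimmed);
        }
        catch (XmlException)
        {
            return null;   // e.g. an HTML error page that is not well-formed XML
        }
    }
}
```

When TryParse returns null, logging the first few hundred characters of the body usually shows what the host actually sent back.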