There is no Unicode byte order mark. Cannot switch to Unicode

I am writing an XML validator with XSD.

Below is what I have so far. When the validator reaches the line while (list.Read()), it throws this error:

There is no Unicode byte order mark. Cannot switch to Unicode.

Can anybody help me fix it?

public class Validator
    {
        public void Validate(string xmlString)
        {
            Boolean bRet = true;
            string xmlPath = @"C:\x.xml";
            string xsdPath = @"C:\general.xsd";

            XmlReaderSettings Settings = new XmlReaderSettings();
            Settings.Schemas.Add("", xsdPath);
            Settings.ValidationType = ValidationType.Schema;
            Settings.ValidationEventHandler += 
               new ValidationEventHandler(SettingsValidationEventHandler);

            XmlReader list = XmlReader.Create(xmlPath, Settings);
            //StringBuilder output = new StringBuilder();
            while (list.Read()) 
            {
            }
            //File.WriteAllText(@"D:\Output.xml", output.ToString());
        }
        static void SettingsValidationEventHandler(object sender,
                                                   ValidationEventArgs e)
        {
            if (e.Severity == XmlSeverityType.Warning)
            {
                MessageBox.Show( "WARNING: ");
                MessageBox.Show(e.Message);
            }
            else if (e.Severity == XmlSeverityType.Error)
            {
                MessageBox.Show("ERROR: ");
                MessageBox.Show(e.Message);
            }
        }
    }

XML

<?xml version="1.0" encoding="utf-16"?>
<FlashList xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns:xsd="http://www.w3.org/2001/XMLSchema" vin="xxxxxxxxxxxxx">
  <flash ECUtype="xxx" />
</FlashList>

XSD

<?xml version="1.0" encoding="utf-16"?>
<xs:schema attributeFormDefault="unqualified" 
           elementFormDefault="qualified"
           xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="FlashList">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="flash" maxOccurs="unbounded" minOccurs="0">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:string">
                <xs:attribute type="xs:string" name="ECUtype" use="optional"/>
              </xs:extension>
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
        <xs:element name="Error" maxOccurs="unbounded" minOccurs="0">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:string">
                <xs:attribute type="xs:byte" name="code" use="optional" />
              </xs:extension>
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute type="xs:string" name="vin"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
Drumfish answered 28/4, 2015 at 9:25 Comment(2)
Are you sure the "physical" file x.xml is properly encoded? Open it with a text editor such as Sublime or jEdit to check the actual encoding. - Pollitt
Yes. I generated this XML file on the server side using the C# class generated from the same XSD file, and it is well formed. This code runs on the client side, where I just want to validate the received XML file against the same XSD. - Drumfish

The reality of your file's encoding appears to conflict with that specified by your XML declaration. If your file actually uses one-byte characters, declaring encoding="utf-16" won't change it to use two-byte characters, for example.

Try removing the conflicting encoding from the XML declaration. Replace

<?xml version="1.0" encoding="utf-16"?>

with

<?xml version="1.0"?>

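As a workaround, you may also be able to read the file into a string and load it with XmlDocument.LoadXml(). A string is already decoded text, so the parser does not have to switch encodings based on the declaration.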
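
The string-based workaround can be combined with the schema validation from the question. The following is a minimal sketch, not the asker's code: the class and method names are made up for illustration, and it assumes both the XML and the XSD are already available as strings.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class StringValidationSketch
{
    // Hypothetical helper: validates XML text against XSD text and
    // returns how many validation problems the reader reported.
    public static int CountValidationProblems(string xmlText, string xsdText)
    {
        int problems = 0;

        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };

        // Passing null lets the schema set use the target namespace declared in the XSD.
        using (var xsdReader = XmlReader.Create(new StringReader(xsdText)))
        {
            settings.Schemas.Add(null, xsdReader);
        }

        settings.ValidationEventHandler += (sender, e) => problems++;

        // The reader gets characters, not bytes, so it never has to decode
        // the input according to the encoding named in the XML declaration.
        using (var reader = XmlReader.Create(new StringReader(xmlText), settings))
        {
            while (reader.Read()) { }
        }

        return problems;
    }
}
```

For a file on disk, xmlText could come from File.ReadAllText(xmlPath), which is reasonable for small files but loads the whole document into memory.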

Haga answered 28/4, 2015 at 11:39 Comment(5)
FWIW: <?xml version="1.0" encoding="utf-8"?> might do the trick too. - Retrogress
Yes, because utf-8 is the default encoding. - Haga
After encountering a similar error, this answer helped me solve my own problem. In my case, I was first creating the XML programmatically, then reading and writing to it at a later point. If you want to remove or change the encoding in the XML declaration using XmlWriter, use writer.WriteProcessingInstruction("xml", "version='1.0'"); (with writer being an instance of XmlWriter). See the MSDN doc. - Oosperm
The workaround "You may also be able to load the file into a string as a work-around using LoadXML()." worked for me. - Mezzorilievo
But the question is whether the workaround is safe to implement. - Lawmaker

This error is thrown when the XML declaration says encoding="utf-16" but the file is not actually saved in that encoding.

You can check with Windows Notepad: click Save As and look at the encoding selected at the bottom of the dialog (it is probably still UTF-8 rather than UTF-16).

Screenshot of notepad encoding setting
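
The same check can be done in code by looking at the first bytes of the file. This is only a rough sketch: the class and method names are hypothetical, it recognises just the UTF-8 and UTF-16 byte order marks, and a UTF-32 little-endian file (which also starts with FF FE) would be misreported.

```csharp
using System;
using System.IO;

public static class BomSniffer
{
    // Hypothetical helper: names the byte order mark found at the start
    // of the given bytes, or returns "none" if no known mark is present.
    public static string DescribeBom(byte[] head)
    {
        if (head.Length >= 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF)
            return "UTF-8 BOM";
        if (head.Length >= 2 && head[0] == 0xFF && head[1] == 0xFE)
            return "UTF-16 LE BOM";
        if (head.Length >= 2 && head[0] == 0xFE && head[1] == 0xFF)
            return "UTF-16 BE BOM";
        return "none";
    }

    // Reads up to three bytes from the file and classifies them.
    public static string DescribeBomOfFile(string path)
    {
        var head = new byte[3];
        using (var stream = File.OpenRead(path))
        {
            int read = stream.Read(head, 0, head.Length);
            Array.Resize(ref head, read); // keep only the bytes actually read
        }
        return DescribeBom(head);
    }
}
```

If this reports "none" or "UTF-8 BOM" for a file whose declaration says encoding="utf-16", the declaration and the bytes on disk disagree, which is the situation that produces the error.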

Lawmaker answered 1/7, 2021 at 9:56 Comment(0)

If you are not able to change the XML declaration to

<?xml version="1.0"?>

you can instead read the XML content as a string and parse that, rather than loading the file by its path.

XmlReader.Create(new StringReader(File.ReadAllText(fileName)));

If you use XmlDocument:

var xmlDoc = new XmlDocument();
xmlDoc.LoadXml(File.ReadAllText(filePath));
Mulligatawny answered 13/5, 2020 at 12:44 Comment(2)
Do not use File.ReadAllText. Always create a StreamReader and FileStream. Never allocate file-sized chunks in memory. - Scop
@Mr.TA If it is a known, small file, like settings or whatever, File.ReadAllText is perfectly OK. - Spacesuit

You can use a StreamReader to set the encoding:

  return (TReport)xmlSerializer.Deserialize(
      new StreamReader(
          new FileStream(filename, FileMode.Open, FileAccess.Read), Encoding.UTF8));

Depending on your application, it might not be optimal to pass the XML as a string; consider a stream instead.
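
The same idea can be applied to an XmlReader, which keeps the file streaming instead of loading it into a string. This is a sketch under assumptions: the class and method names are invented for illustration, and the caller has to know the encoding the file was really saved in.

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class StreamReadSketch
{
    // Hypothetical helper: reads the whole document through a StreamReader
    // that decodes with the given encoding, and returns the number of nodes read.
    public static int ReadWithEncoding(string xmlPath, Encoding actualEncoding,
                                       XmlReaderSettings settings)
    {
        int nodes = 0;

        // The using statements dispose the stream and both readers when done.
        using (var stream = new FileStream(xmlPath, FileMode.Open, FileAccess.Read))
        using (var text = new StreamReader(stream, actualEncoding))
        using (var reader = XmlReader.Create(text, settings))
        {
            while (reader.Read())
            {
                nodes++;
            }
        }

        return nodes;
    }
}
```

To validate while reading, pass XmlReaderSettings configured with the schema and a ValidationEventHandler, as in the question. A call might look like StreamReadSketch.ReadWithEncoding(xmlPath, Encoding.UTF8, Settings).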

Apteryx answered 24/11, 2022 at 17:53 Comment(0)
