.NET binary XML with pre-shared dictionary

I'm using XmlDictionaryWriter to serialize objects to a database with DataContractSerializer. It works great: both size and speed are 2 times better than with text/xml.

However, I'll have to deal with an enormous number of records in my database, where every extra byte translates directly into gigabytes of DB size. That's why I'd love to reduce the size further by using an XML dictionary.

How do I do that?

I see that the static method XmlDictionaryWriter.CreateBinaryWriter accepts a second parameter of type IXmlDictionary. MSDN describes it as "The XmlDictionary to use as the shared dictionary".

First I tried the system-supplied implementation:

XmlDictionary dict = new XmlDictionary();
string[] dictEntries = new string[]
{
    "http://schemas.datacontract.org/2004/07/MyContracts",
    "http://www.w3.org/2001/XMLSchema-instance",
    "MyElementName1",
    "MyElementName2",
    "MyElementName3",
};
foreach ( string s in dictEntries )
    dict.Add( s );

The result is that the .NET Framework completely ignores the dictionary and still writes the above strings as plain text instead of referencing the corresponding dictionary entries.

Then I created my own implementation of IXmlDictionary:
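One way to check whether the binary writer uses the dictionary at all, independent of DataContractSerializer, is to write an element by hand. The sketch below is only an experiment under assumptions: the class and method names (DictionaryProbe, WriteWithStrings, WriteWithDictionaryStrings) are made up for illustration, and it simply compares the byte counts of a document written with plain string names against one written with XmlDictionaryString values taken from the same XmlDictionary that is passed to the writer.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical probe, not part of the original code.
static class DictionaryProbe
{
    const string Ns = "http://schemas.datacontract.org/2004/07/MyContracts";

    // Writes one element using plain strings and no shared dictionary.
    static long WriteWithStrings()
    {
        using (var stream = new MemoryStream())
        using (var writer = XmlDictionaryWriter.CreateBinaryWriter(stream))
        {
            writer.WriteStartElement("MyElementName1", Ns);
            writer.WriteString("x");
            writer.WriteEndElement();
            writer.Flush(); // without Flush the stream may still be empty
            return stream.Length;
        }
    }

    // Writes the same element, but with names that belong to the shared dictionary.
    static long WriteWithDictionaryStrings()
    {
        var dict = new XmlDictionary();
        XmlDictionaryString ns = dict.Add(Ns);
        XmlDictionaryString name = dict.Add("MyElementName1");

        using (var stream = new MemoryStream())
        using (var writer = XmlDictionaryWriter.CreateBinaryWriter(stream, dict))
        {
            writer.WriteStartElement(name, ns);
            writer.WriteString("x");
            writer.WriteEndElement();
            writer.Flush();
            return stream.Length;
        }
    }

    static void Main()
    {
        Console.WriteLine("plain strings:      " + WriteWithStrings());
        Console.WriteLine("dictionary strings: " + WriteWithDictionaryStrings());
    }
}
```

If the second number is clearly smaller, the writer does honor the dictionary when it is handed XmlDictionaryString instances from that same dictionary, which would point at how the serializer supplies its names rather than at the writer itself.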

class MyDictionary : IXmlDictionary
{
    Dictionary<int, string> values = new Dictionary<int, string>();
    Dictionary<string, int> keys = new Dictionary<string, int>();

    MyDictionary()
    {
        string[] dictEntries = new string[]
        {
            "http://schemas.datacontract.org/2004/07/MyContracts",
            "http://www.w3.org/2001/XMLSchema-instance",
            "MyElementName1",
            "MyElementName2",
            "MyElementName3",
        };

        foreach ( var s in dictEntries )
            this.Add( s );
    }

    static IXmlDictionary s_instance = new MyDictionary();
    public static IXmlDictionary instance { get { return s_instance; } }

    void Add( string val )
    {
        if ( keys.ContainsKey( val ) )
            return;
        int id = values.Count + 1;
        values.Add( id, val );
        keys.Add( val, id );
    }

    bool IXmlDictionary.TryLookup( XmlDictionaryString value, out XmlDictionaryString result )
    {
        if ( value.Dictionary == this )
        {
            result = value;
            return true;
        }
        return this.TryLookup( value.Value, out result );
    }

    bool IXmlDictionary.TryLookup( int key, out XmlDictionaryString result )
    {
        string res;
        if ( !values.TryGetValue( key, out res ) )
        {
            result = null;
            return false;
        }
        result = new XmlDictionaryString( this, res, key );
        return true;
    }

    public bool /* IXmlDictionary. */ TryLookup( string value, out XmlDictionaryString result )
    {
        int key;
        if ( !keys.TryGetValue( value, out key ) )
        {
            result = null;
            return false;
        }

        result = new XmlDictionaryString( this, value, key );
        return true;
    }
}

The result: my TryLookup methods are called as expected, but DataContractSerializer.WriteObject produces an empty document.

How do I use a pre-shared dictionary?

Thanks in advance!

P.S. I don't want to mess with XmlBinaryReaderSession/XmlBinaryWriterSession: I don't have "sessions"; instead I have a 10 GB+ database accessed by many threads at once. All I want is a static, pre-defined dictionary.

Update: OK, I've figured out that I just need to call XmlDictionaryWriter.Flush. The only remaining question is: why doesn't the system-supplied IXmlDictionary implementation work as expected?
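A small experiment may help narrow this down. The sketch below rests on an assumption that is not confirmed here: that the serializer hands the writer XmlDictionaryString instances owned by its own internal dictionary rather than by the one passed to CreateBinaryWriter. It checks how XmlDictionary.TryLookup treats such a "foreign" string compared with one of its own entries. LookupProbe is a made-up name.

```csharp
using System;
using System.Xml;

// Hypothetical probe, not part of the original code.
static class LookupProbe
{
    static void Main()
    {
        var shared = new XmlDictionary();
        XmlDictionaryString own = shared.Add("MyElementName1");

        // Same text, but owned by a different dictionary instance. This stands in
        // for whatever a serializer with its own dictionary might pass to the writer
        // (an assumption, see the text above).
        var other = new XmlDictionary();
        XmlDictionaryString foreign = other.Add("MyElementName1");

        XmlDictionaryString result;
        Console.WriteLine("own entry:     " + shared.TryLookup(own, out result));
        Console.WriteLine("foreign entry: " + shared.TryLookup(foreign, out result));
        Console.WriteLine("by value:      " + shared.TryLookup("MyElementName1", out result));
    }
}
```

If the "foreign entry" lookup fails while the lookup by value succeeds, that would be consistent with the behavior described above: the custom MyDictionary falls back to a lookup by value.Value, whereas the built-in class may not.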

Autostability answered 14/3, 2011 at 18:55

For the XmlDictionaryWriter, you need to use a session.
Example:

private static Stream SerializeBinaryWithDictionary(Person person, DataContractSerializer serializer)
{
    var stream = new MemoryStream();
    var dictionary = new XmlDictionary();
    var session = new XmlBinaryWriterSession();
    var key = 0;
    session.TryAdd(dictionary.Add("FirstName"), out key);
    session.TryAdd(dictionary.Add("LastName"), out key);
    session.TryAdd(dictionary.Add("Birthday"), out key);
    session.TryAdd(dictionary.Add("Person"), out key);
    session.TryAdd(dictionary.Add("http://www.friseton.com/Name/2010/06"), out key);
    session.TryAdd(dictionary.Add("http://www.w3.org/2001/XMLSchema-instance"), out key);

    var writer = XmlDictionaryWriter.CreateBinaryWriter(stream, dictionary, session);
    serializer.WriteObject(writer, person);
    writer.Flush();
    return stream;
}
Woods answered 20/3, 2011 at 23:55

The only way I was able to replicate the issue of the IXmlDictionary not being used was when my class wasn't decorated with a DataContract attribute. The following app displays the difference in serialized size between the decorated and undecorated classes.

using System;
using System.Runtime.Serialization;
using System.Xml;

namespace XmlPresharedDictionary
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Serialized sizes");
            Console.WriteLine("-------------------------");
            TestSerialization<MyXmlClassUndecorated>("Undecorated: ");
            TestSerialization<MyXmlClassDecorated>("Decorated:   ");
            Console.ReadLine();
        }

        private static void TestSerialization<T>(string lineComment) where T : new()
        {
            XmlDictionary xmlDict = new XmlDictionary();
            xmlDict.Add("MyElementName1");

            DataContractSerializer serializer = new DataContractSerializer(typeof(T));

            using (System.IO.MemoryStream stream = new System.IO.MemoryStream())
            using (var writer = XmlDictionaryWriter.CreateBinaryWriter(stream, xmlDict))
            {
                serializer.WriteObject(writer, new T());
                writer.Flush();
                Console.WriteLine(lineComment + stream.Length.ToString());
            }
        }
    }

    //[DataContract]
    public class MyXmlClassUndecorated
    {
        public MyElementName1[] MyElementName1 { get; set; }

        public MyXmlClassUndecorated()
        {
            MyElementName1 = new MyElementName1[] { new MyElementName1("A A A A A"), new MyElementName1("A A A A A") };
        }
    }

    [DataContract]
    public class MyXmlClassDecorated
    {
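        // Note: there is no [DataMember] here. In a [DataContract] class only members
        // marked [DataMember] are serialized, so this property is not written out.
        public MyElementName1[] MyElementName1 { get; set; }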

        public MyXmlClassDecorated()
        {
            MyElementName1 = new MyElementName1[] { new MyElementName1("A A A A A"), new MyElementName1("A A A A A") };
        }
    }

    [DataContract]
    public class MyElementName1
    {
        [DataMember]
        public string Value { get; set; }

        public MyElementName1(string value) { Value = value; }
    }
}
Futurism answered 6/3, 2016 at 20:27