DataContractSerializer vs BinaryFormatter performance

I was reading articles to understand more about the DataContractSerializer and BinaryFormatter serializers. Based on what I have read so far, I was under the impression that BinaryFormatter output should have a smaller footprint than DataContractSerializer output, because DataContractSerializer serializes to an XML infoset while BinaryFormatter serializes to a proprietary binary format.

Here is the test:

    [Serializable]
    [DataContract]
    public class Packet
    {
        [DataMember]
        public DataSet Data { get; set; }
        [DataMember]
        public string Name { get; set; }
        [DataMember]
        public string Description { get; set; }
    }

The DataSet was populated with 121317 rows from the [AdventureWorks].[Sales].[SalesOrderDetail] table.

    using (var fs = new FileStream("test1.txt", FileMode.Create))
    {
        var dcs = new DataContractSerializer(typeof(Packet));
        dcs.WriteObject(fs, packet);
        Console.WriteLine("Total bytes with dcs = " + fs.Length);
    }



    using (var fs = new FileStream("test2.txt", FileMode.Create))
    {
        var bf = new BinaryFormatter();
        bf.Serialize(fs, packet);
        Console.WriteLine("Total bytes with binaryformatter = " + fs.Length);
    }


Results:

    Total bytes with dcs = 57133023
    Total bytes with binaryformatter = 57133984

Question: Why is the byte count for BinaryFormatter larger than for DataContractSerializer? Shouldn't it be much smaller?

Adversative answered 20/1, 2011 at 0:36 Comment(0)

DataSet has a bad habit: it implements ISerializable and then, by default, serializes its contents as a string of XML, even when passed to a BinaryFormatter. This is why the two streams are nearly identical in size. If you set its RemotingFormat property to SerializationFormat.Binary, it still takes over its own serialization, but does so by creating a new BinaryFormatter, dumping itself into a MemoryStream, and then storing the resulting byte array as a value in the outer BinaryFormatter's stream.
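As a rough illustration of the RemotingFormat switch described above, here is a minimal sketch that measures both settings. It assumes .NET Framework, where BinaryFormatter is still available; the `RemotingFormatDemo` class, the `SerializedSize` helper, and the small stand-in table are hypothetical and are not the AdventureWorks data from the question.

```csharp
using System;
using System.Data;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

static class RemotingFormatDemo
{
    // Hypothetical helper: serializes the DataSet with BinaryFormatter
    // under the given RemotingFormat and returns the stream length in bytes.
    static long SerializedSize(DataSet ds, SerializationFormat format)
    {
        ds.RemotingFormat = format;
        using (var ms = new MemoryStream())
        {
            new BinaryFormatter().Serialize(ms, ds);
            return ms.Length;
        }
    }

    static void Main()
    {
        // Stand-in data (an assumption for the sketch, not the table from the question).
        var table = new DataTable("Rows");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Qty", typeof(int));
        for (int i = 0; i < 10000; i++)
        {
            table.Rows.Add(i, i % 7);
        }

        var ds = new DataSet("Demo");
        ds.Tables.Add(table);

        // Default behaviour: the DataSet writes itself as XML inside the binary stream.
        Console.WriteLine("RemotingFormat.Xml    = " + SerializedSize(ds, SerializationFormat.Xml));
        // Opt-in behaviour: the DataSet uses a binary representation instead.
        Console.WriteLine("RemotingFormat.Binary = " + SerializedSize(ds, SerializationFormat.Binary));
    }
}
```

In the question's code, the equivalent change would be setting `packet.Data.RemotingFormat = SerializationFormat.Binary;` before calling `bf.Serialize(fs, packet)`.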

Beyond that, BinaryFormatter carries more information about types, such as the full name of the assembly they came from, and it adds per-object overhead on top of the XML for the DataSet. That accounts for its output being slightly larger here (961 bytes in your results).

If you're trying to compare the behavior of the two serializers, DataSet is a poor choice because it overrides too much.
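For a comparison that reflects the serializers themselves, a plain class with no custom serialization logic is a fairer test. The following is a minimal sketch under that assumption; the `OrderLine` type and its sample values are hypothetical, and it again assumes .NET Framework, where BinaryFormatter is available.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical plain type: no ISerializable, so each serializer
// uses its own default behaviour.
[Serializable]
[DataContract]
public class OrderLine
{
    [DataMember] public int OrderId { get; set; }
    [DataMember] public int Quantity { get; set; }
    [DataMember] public decimal UnitPrice { get; set; }
}

static class SerializerComparison
{
    static void Main()
    {
        // Made-up sample data for the sketch.
        var lines = new List<OrderLine>();
        for (int i = 0; i < 10000; i++)
        {
            lines.Add(new OrderLine { OrderId = i, Quantity = i % 5 + 1, UnitPrice = 9.99m });
        }

        using (var ms = new MemoryStream())
        {
            var dcs = new DataContractSerializer(typeof(List<OrderLine>));
            dcs.WriteObject(ms, lines);
            Console.WriteLine("Total bytes with dcs = " + ms.Length);
        }

        using (var ms = new MemoryStream())
        {
            var bf = new BinaryFormatter();
            bf.Serialize(ms, lines);
            Console.WriteLine("Total bytes with binaryformatter = " + ms.Length);
        }
    }
}
```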

Trilemma answered 20/1, 2011 at 1:24 Comment(2)
Thanks for the insight. I am stuck with using DataSet. Too many issues! - Adversative
@stackoverflowuser: I've had excellent luck with just running the serialized stream through a deflater, if size is a big issue. - Trilemma
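The deflater suggestion in the comment above could look roughly like this: a sketch that wraps the target stream in a DeflateStream before handing it to DataContractSerializer. The `DeflateDemo` class, the `CompressedSize` helper, and the sample string array are hypothetical names and data introduced for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Runtime.Serialization;

static class DeflateDemo
{
    // Hypothetical helper: serializes the object graph through a DeflateStream
    // and returns the compressed size in bytes.
    static long CompressedSize<T>(T graph)
    {
        using (var ms = new MemoryStream())
        {
            // leaveOpen: true keeps the MemoryStream usable after the
            // DeflateStream is disposed, which flushes the final compressed block.
            using (var deflate = new DeflateStream(ms, CompressionMode.Compress, true))
            {
                new DataContractSerializer(typeof(T)).WriteObject(deflate, graph);
            }
            return ms.Length;
        }
    }

    static void Main()
    {
        // Repetitive stand-in data (an assumption for the sketch).
        var sample = new string[1000];
        for (int i = 0; i < sample.Length; i++)
        {
            sample[i] = "SalesOrderDetail row " + (i % 10);
        }

        Console.WriteLine("Compressed bytes = " + CompressedSize(sample));
    }
}
```

With the question's types, the call would be `CompressedSize(packet)`; reading the data back requires wrapping the source stream in a DeflateStream with `CompressionMode.Decompress` before calling `ReadObject`.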
