What are the differences between the XmlSerializer and BinaryFormatter?
5

51

I spent a good portion of time last week working on serialization. During that time I found many examples utilizing either the BinaryFormatter or XmlSerializer. Unfortunately, what I did not find were any examples comprehensively detailing the differences between the two.

The genesis of my curiosity lies in why the BinaryFormatter is able to deserialize directly to an interface whilst the XmlSerializer is not. Jon Skeet in an answer to "casting to multiple (unknown types) at runtime" provides an example of direct binary serialization to an interface. Stan R. provided me with the means of accomplishing my goal using the XmlSerializer in his answer to "XML Object Deserialization to Interface."

Beyond the obvious point that the BinaryFormatter uses binary serialization whilst the XmlSerializer uses XML, I'd like to understand the fundamental differences more fully: when to use one or the other, and the pros and cons of each.

Uranic answered 20/7, 2009 at 15:18 Comment(0)
101

A binary formatter is able to deserialize directly to an interface type because, when an object is originally serialized to a binary stream, metadata containing type and assembly information is stored alongside the object data. This means that when the binary formatter deserializes the object, it knows its type and builds the correct object, and you can then cast that to any interface type the object implements.

The XML serializer, on the other hand, just serializes to a schema: it writes only the public fields and properties of the object and no type information beyond that (e.g. the interfaces the type implements).
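To illustrate the XmlSerializer side of this, here is a minimal sketch. The `IAnimal` and `Dog` types are hypothetical and not from the original question. The serializer has to be told the concrete type up front, because the XML itself carries no .NET type or assembly information; the interface only appears at the call site.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical interface and implementation, for illustration only.
public interface IAnimal
{
    string Name { get; set; }
}

// XmlSerializer needs a public type with a public parameterless constructor.
public class Dog : IAnimal
{
    public string Name { get; set; }
}

public static class InterfaceRoundTrip
{
    // Serialize the concrete type to an XML string.
    public static string ToXml(Dog dog)
    {
        var serializer = new XmlSerializer(typeof(Dog));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, dog);
            return writer.ToString();
        }
    }

    // The caller must name the concrete type (Dog) when building the serializer;
    // only afterwards can the result be handed out as the interface.
    public static IAnimal FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(Dog));
        using (var reader = new StringReader(xml))
        {
            return (Dog)serializer.Deserialize(reader);
        }
    }
}
```

Calling `InterfaceRoundTrip.FromXml(InterfaceRoundTrip.ToXml(new Dog { Name = "Rex" }))` gives back an `IAnimal`, but only because the code, not the stream, supplied the concrete type.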

Here is a good post, .NET Serialization, comparing the BinaryFormatter, SoapFormatter, and XmlSerializer. I recommend you look at the following table which in addition to the previously mentioned serializers includes the DataContractSerializer, NetDataContractSerializer and protobuf-net.

Serialization Comparison

Enneahedron answered 20/7, 2009 at 16:4 Comment(6)
Good table. I always found the lack of generics in SOAP annoying. - Vasomotor
I believe the "best performance" classification is wrong. Binary formatter is the worst performing serializer in .NET (except maybe the SOAP formatter). At least that's what most benchmarks show: blogs.msdn.com/b/youssefm/archive/2009/07/10/…, james.newtonking.com/archive/2010/01/01/…, techmikael.blogspot.com/2010/01/… - Regularize
@Regularize I haven't clicked through to read the referenced articles, but if it's not the binary formatter, which serializer has the best performance? - Uranic
@Uranic In this particular list it would be protobuf-net first and then the DataContractSerializer. - Regularize
@CallumRogers The SOAP formatter was obsoleted with .NET 2.0 in favour of XmlSerializer, so it was never updated for generics. - Crankpin
@Regularize Best performance always depends on usage. XML doesn't reuse elements, so any data set with lots of shared references will be written out with lots of duplication, which reduces performance; tests with flat objects will favour XML. - Crankpin
6

Just to weigh in...

The obvious difference between the two is "binary vs xml", but it does go a lot deeper than that:

  • fields (BinaryFormatter=bf) vs public members (typically properties) (XmlSerializer=xs)
  • type-metadata based (bf) vs contract-based (xs)
  • version-brittle (bf) vs version-tolerant (xs)
  • "graph" (bf) vs "tree" (xs)
  • .NET specific (bf) vs portable (xs)
  • opaque (bf) vs human-readable (xs)
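The "graph" vs "tree" point can be sketched as follows. This is a minimal example with hypothetical `Address` and `Household` types: two properties point at the same object, and after an XmlSerializer round trip they point at two separate copies, because the XML tree has no notion of a shared reference.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical types, for illustration only.
public class Address
{
    public string City { get; set; }
}

public class Household
{
    public Address Home { get; set; }
    public Address Billing { get; set; }
}

public static class TreeDemo
{
    // Serialize to an XML string and read it straight back.
    public static Household RoundTrip(Household input)
    {
        var serializer = new XmlSerializer(typeof(Household));
        string xml;
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, input);
            xml = writer.ToString();
        }
        using (var reader = new StringReader(xml))
        {
            return (Household)serializer.Deserialize(reader);
        }
    }
}
```

If `Home` and `Billing` are the same `Address` instance before the round trip, `ReferenceEquals(result.Home, result.Billing)` is false afterwards: the data is preserved, the object identity is not.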

As a discussion of why BinaryFormatter can be brittle, see here.

It is impossible to say in general which is bigger; all the type metadata in BinaryFormatter can make it bigger, and XmlSerializer output can work very well with compression like gzip.
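As a rough sketch of the compression point, the following serializes a list with XmlSerializer and gzips the bytes using `System.IO.Compression.GZipStream`. The `Reading` type and the helper names are hypothetical; actual ratios depend entirely on the data.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Xml.Serialization;

// Hypothetical payload type, for illustration only.
public class Reading
{
    public int Id { get; set; }
    public double Value { get; set; }
}

public static class XmlSizeDemo
{
    // Plain XmlSerializer output as bytes.
    public static byte[] ToXmlBytes(List<Reading> readings)
    {
        var serializer = new XmlSerializer(typeof(List<Reading>));
        using (var buffer = new MemoryStream())
        {
            serializer.Serialize(buffer, readings);
            return buffer.ToArray();
        }
    }

    // Gzip an arbitrary byte array.
    public static byte[] Gzip(byte[] raw)
    {
        using (var output = new MemoryStream())
        {
            // The GZipStream must be disposed before reading the result,
            // otherwise the final compressed block is not flushed.
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            return output.ToArray();
        }
    }
}
```

Because element names repeat for every item, comparing `ToXmlBytes(list).Length` with `Gzip(ToXmlBytes(list)).Length` on a list of any real size shows the repetitive markup compressing away.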

However, it is possible to take the strengths of each; for example, Google have open-sourced their own data serialization format, "protocol buffers". This is:

  • contract-based
  • portable (see list of implementations)
  • version-tolerant
  • tree-based
  • opaque (although there are tools to show data when combined with a .proto)
  • typically "contract first", but some implementations allow implicit contracts based on reflection

But importantly, it is very dense data (no type metadata, pure binary representation, short tags, tricks like variable-length base-128 "varint" encoding, which stores 7 bits of the value per byte), and very efficient to process (no complex XML structure, no strings to match to members, etc).
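The variable-length integer trick mentioned above can be sketched in a few lines. This is an illustrative stand-alone implementation with a hypothetical `Varint` class, not code from any protocol buffers library: each byte carries 7 bits of the value, least significant group first, and the high bit signals that more bytes follow.

```csharp
using System;
using System.Collections.Generic;

public static class Varint
{
    // Encode an unsigned integer, 7 bits per byte, least significant group first.
    public static byte[] Encode(ulong value)
    {
        var bytes = new List<byte>();
        while (value >= 0x80)
        {
            // Low 7 bits of the value, with the continuation bit set.
            bytes.Add((byte)((value & 0x7F) | 0x80));
            value >>= 7;
        }
        bytes.Add((byte)value); // final byte: continuation bit clear
        return bytes.ToArray();
    }

    // Decode a single varint from the start of the buffer.
    public static ulong Decode(byte[] bytes)
    {
        ulong result = 0;
        int shift = 0;
        foreach (byte b in bytes)
        {
            result |= (ulong)(b & 0x7F) << shift;
            if ((b & 0x80) == 0) break; // last byte of this varint
            shift += 7;
        }
        return result;
    }
}
```

Small numbers therefore cost one byte, and `Varint.Encode(300)` yields the two bytes `0xAC 0x02` rather than a fixed four or eight.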

I might be a little biased; I maintain one of the implementations (including several suitable for C#/.NET), but you'll note I haven't linked to any specific implementation; the format stands under its own merits ;-p

Boxboard answered 16/8, 2009 at 21:2 Comment(0)
2

The XML Serializer produces XML and also an XML Schema (implicitly). It will produce XML that conforms to this schema.

One implication is that it will not serialize anything which cannot be described in XML Schema. For instance, there is no way to distinguish between a list and an array in XML Schema, so the XML Schema produced by the serializer can be interpreted either way.
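The list/array point can be seen directly with a small sketch (the `ListVsArray` helper is hypothetical): XML written from a `List<int>` can be read back as an `int[]`, because nothing in the XML records which of the two collection types produced it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public static class ListVsArray
{
    // Write a List<int> with a serializer built for List<int>.
    public static string ListToXml(List<int> values)
    {
        var serializer = new XmlSerializer(typeof(List<int>));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, values);
            return writer.ToString();
        }
    }

    // Read the same XML with a serializer built for int[].
    public static int[] XmlToArray(string xml)
    {
        var serializer = new XmlSerializer(typeof(int[]));
        using (var reader = new StringReader(xml))
        {
            return (int[])serializer.Deserialize(reader);
        }
    }
}
```

The receiving side chooses the collection type; the document only says "a sequence of int".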

Runtime serialization (which the BinaryFormatter is part of) serializes the actual .NET types to the other side, so if you send a List<int>, the other side will get a List<int>.

That obviously works better if the other side is running .NET.

Compression answered 20/7, 2009 at 15:35 Comment(2)
Hi John, I was just wondering: if handling a List<T> is not an issue, would you still prefer BinaryFormatter? - Spinal
No. List was just an example. The two have totally different use cases. BTW, see SoapFormatter. Runtime serialization (the two formatters) is totally different from XML Serialization, which is very different from Data Contract serialization. - Compression
1

The XmlSerializer serializes the type by reading all the type's properties that have both a public getter and a public setter (and also any public fields). In this sense the XmlSerializer serializes/deserializes the "public view" of the instance.

The binary formatter, by contrast, serializes a type by serializing the instance's "internals", i.e. its fields. Any fields that are not marked as [NonSerialized] will be serialized to the binary stream. The type itself must be marked as [Serializable], as must the types of any fields that are also to be serialized.
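The "public view" behaviour of the XmlSerializer can be checked with a short sketch. The `Account` type is hypothetical: it has a private field, a read/write property and a getter-only property, and only the read/write property ends up in the XML. Only the XmlSerializer half of the comparison is shown here.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type, for illustration only.
public class Account
{
    // Private state: invisible to XmlSerializer.
    private string secret = "hunter2";

    // Public getter and setter: serialized.
    public string Owner { get; set; }

    // Public getter only: skipped, since it could not be set on deserialization.
    public int ReadOnlyId { get { return 42; } }

    public bool CheckSecret(string candidate)
    {
        return candidate == secret;
    }
}

public static class PublicViewDemo
{
    public static string ToXml(Account account)
    {
        var serializer = new XmlSerializer(typeof(Account));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, account);
            return writer.ToString();
        }
    }
}
```

The string returned by `PublicViewDemo.ToXml(new Account { Owner = "alice" })` contains an `<Owner>` element and nothing about `secret` or `ReadOnlyId`.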

Faradmeter answered 20/7, 2009 at 15:26 Comment(0)
0

I guess one of the most important differences is that binary serialization can serialize both public and private members, whereas the XmlSerializer works only with public ones.

The post below provides a very helpful comparison of the two in terms of size. Size is an important issue, because you might send your serialized object to a remote machine.

http://www.nablasoft.com/alkampfer/index.php/2008/10/31/binary-versus-xml-serialization-size/

Spinal answered 20/7, 2009 at 15:42 Comment(1)
Link is dead and not archived. - Fertility