Fast and compact object serialization in .NET

I want to use object serialization to communicate over the network between a Mono server and Silverlight clients. Serialization needs to be space-efficient and fast, since the server is going to host multiple real-time games.

What technique should I use? The BinaryFormatter adds a lot of overhead to serialized classes (Version, culture, class name, property names, etc.) that is not required within this application.

What can I do to make this more space efficient?

Enravish answered 14/2, 2009 at 13:40 Comment(1)
D'oh! Arrived an hour too late ;-pWeichsel

You can use Protocol Buffers. I'm changing all my serialization code from BinaryFormatter with compression to Protocol Buffers and obtaining very good results. It's more efficient in both time and space.

There are two .NET implementations by Jon Skeet and Marc Gravell.

Update: Official .NET implementation can be found here.
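To give a feel for the protobuf-net flavour, here is a minimal round-trip sketch. The GamePacket type and its fields are hypothetical; Serializer.Serialize/Deserialize are the library's actual entry points:

    using System.IO;
    using ProtoBuf;

    // Hypothetical message type; the attributes assign compact numeric wire tags.
    [ProtoContract]
    public class GamePacket
    {
        [ProtoMember(1)] public int PlayerId { get; set; }
        [ProtoMember(2)] public float X { get; set; }
        [ProtoMember(3)] public float Y { get; set; }
    }

    public static class PacketCodec
    {
        public static byte[] Encode(GamePacket packet)
        {
            using (var ms = new MemoryStream())
            {
                // Writes tagged binary only: no type names, culture or version metadata.
                Serializer.Serialize(ms, packet);
                return ms.ToArray();
            }
        }

        public static GamePacket Decode(byte[] data)
        {
            using (var ms = new MemoryStream(data))
                return Serializer.Deserialize<GamePacket>(ms);
        }
    }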

Gallicism answered 14/2, 2009 at 14:3 Comment(9)
Thanks. These implementations also seem to work with Silverlight!Enravish
@Jorge - btw, did you realise that protobuf-net can hook directly into BinaryFormatter if you want to reduce the changes? You can implement ISerializable on your root objects, and just call into Serializer.Serialize/Serializer.MergeWeichsel
@Jorge - out of curiosity, which framework did you go with? I won't be offended if the answer is "Jon's" - I'm just interested... I'm glad it is working for you, whichever it is.Weichsel
@Marc - Yes, I know. My major changes are removing compression, it was adding too much processing work with little benefits. Thanks for your work!Gallicism
@Marc - I'm using yours. It seems more flexible and better integrated with the .NET Framework.Gallicism
Well, surely Marc's version is better suited for .NET development; Jon's version generates a 1,200-line class for the 3-line .proto file used in the tutorial, which could have been written as a 3-line C# class. Though I was unable to use Marc's version in a real-life project due to an unsupported scenario for my model. #7794027Jeopardous
The Protocol Buffers link without the language identifier (using the newer URL format).Suburbanite
Is it possible to use protobuf-net for an arbitrary collection of objects? I'm trying to use it to implement a session-state provider that stores SessionStateItemCollection in couchbase. It seemed to serialize it fine, but then errored when attempting to deserialize.Revitalize
I have just tried ProtoBuf and found it is over 30x slower than the BinaryFormatter, but it gets much better compression... here are the outputs from each for the same object: Size (protoBuf): 847 bytes, time: 101ms; Size (binFormatter): 4981 bytes, time: 3ms.Wells

I have some benchmarks of the leading .NET serializers, based on the Northwind dataset.

Northwind .NET serialization benchmarks

@marcgravell's binary protobuf-net is the fastest implementation benchmarked, about 7x faster than Microsoft's fastest serializer in the BCL (the XML DataContractSerializer).

I also maintain some open-source, high-performance .NET text serializers:

  • JSV TypeSerializer, a compact, clean, JSON+CSV-like format that's 3.1x quicker than the DataContractSerializer
  • as well as a JsonSerializer that's 2.6x quicker (both are sketched below).
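For anyone unfamiliar with the library, a minimal round-trip sketch of both formats; the Customer type is a hypothetical DTO, and the calls follow ServiceStack.Text's public API:

    using ServiceStack.Text; // TypeSerializer (JSV) and JsonSerializer

    // Hypothetical DTO for illustration.
    public class Customer
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    public static class TextFormatsDemo
    {
        public static void RoundTrip()
        {
            var customer = new Customer { Id = 1, Name = "Alfreds Futterkiste" };

            // JSV: a terse JSON+CSV-like text format
            string jsv = TypeSerializer.SerializeToString(customer);
            Customer fromJsv = TypeSerializer.DeserializeFromString<Customer>(jsv);

            // JSON from the same library
            string json = JsonSerializer.SerializeToString(customer);
            Customer fromJson = JsonSerializer.DeserializeFromString<Customer>(json);
        }
    }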
Reimers answered 18/8, 2010 at 4:16 Comment(6)
Just wondering about the fact that the JSON serializer's file size is so much smaller than the BinaryFormatter's: how did they test this? I wrote my own serializer which is very similar to the BinaryFormatter and the results are very different; the range is from 1:1 to 1:7.87 in file size for my formatter. "Don't ever trust statistics that you haven't falsified yourself"Gamp
@FelixK. the linked benchmarks already include a notice and a reference to the benchmark source code used to generate the tests, i.e. see: code.google.com/p/servicestack/source/browse/trunk/Common/…Reimers
After taking a look at the benchmark code, I don't think the file-size results are representative for all cases; the BinaryFormatter's file size is much smaller when using complex objects. Serializing a bunch of small objects one at a time with the formatter doesn't make sense (it writes a lot of metadata on each Serialize call); when using an array instead, the file size is much smaller.Gamp
The benchmark does what it says on the tin, i.e. serializes a DB row from every table of Northwind X times. Submit your own benchmarks + source code if you want something more representative.Reimers
the "benchmarks for the leading .NET serializers" link seems dead.. do you have an updated url?Airdrop
@Reimers can you also include netserializer github.com/tomba/netserializer and see how it fares..Mountaineering

As the author, I would invite you to try protobuf-net; it ships with binaries for both Mono 2.0 and Silverlight 2.0, and is fast and efficient. If you have any problems whatsoever, just drop me an e-mail (see my Stack Overflow profile); support is free.

Jon's version (see the earlier accepted answer) is also very good, but IMO the protobuf-net version is more idiomatic for C# - Jon's would be ideal if you were talking C# to Java, so you could have a similar API at both ends.
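If the protobuf-specific attributes feel intrusive, protobuf-net also recognises the standard WCF attributes (see the comments below); a minimal sketch, with a hypothetical PlayerState type:

    using System.IO;
    using System.Runtime.Serialization;
    using ProtoBuf;

    // Plain WCF-style contract; protobuf-net uses Order as the wire-format tag.
    [DataContract]
    public class PlayerState
    {
        [DataMember(Order = 1)] public int Health { get; set; }
        [DataMember(Order = 2)] public string Name { get; set; }
    }

    public static class PlayerStateCodec
    {
        public static void Write(Stream output, PlayerState state)
        {
            Serializer.Serialize(output, state);
        }

        public static PlayerState Read(Stream input)
        {
            return Serializer.Deserialize<PlayerState>(input);
        }
    }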

Weichsel answered 14/2, 2009 at 15:36 Comment(9)
I am already experimenting with protobuf-net, really great work, thanks a lot!Enravish
Is adding the ProtoContract attribute to classes and ProtoMember attribute to the members necessary for using your library?Mountaineering
@Binoj - with "v1" (the current pre-compiled downloadable dll) "yes". However, "v2" fixes this (see marcgravell.blogspot.com/2010/02/protobuf-net-v2-on-iphone.html for full "v2" details). It isn't feature-complete yet (and to play with it you'd have to compile from the trunk), but the existing "v2" code works fine for a range of simple messages.Weichsel
@Binoj - as a separate note; it doesn't have to be ProtoContract - it will also work with [XmlType]+[XmlElement(Order=n)] or [DataContract]+[DataMember(Order=n)].Weichsel
@MarcGravell How do you handle WeakReference objects? I've created some serializers too, and I recently noticed that I'm not handling WeakReference yet, because it works with the special constructor for deserialization.Gamp
@Felix I don't think I've added specific handling for that, in part because "object" is a bit rare for me (I prefer properly typed DTO models)Weichsel
You can get this from NuGet (protobuf-net-data). It is crazy fast (2x faster than ReadXml in my testing) and super easy to use: rdingwall.github.io/protobuf-net-dataGastrostomy
@Marc NetSerializer (github.com/tomba/netserializer) claims to be more efficient than protobuf, any thoughts?Mountaineering
@BinojAntony I haven't evaluated itWeichsel

I had a similar problem, although I'm just using .NET. I wanted to send data over the Internet as quickly and easily as possible. I didn't find anything that would be optimized enough, so I made my own serializer, named NetSerializer.

NetSerializer has its limitations, but they didn't affect my use case. I haven't run benchmarks for a while, but it was much, much faster than anything else I found.

I haven't tried it on Mono or Silverlight. I'd bet it works on Mono, but I'm not sure what the level of support is for DynamicMethods on Silverlight.
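A rough sketch of how NetSerializer is used; the exact API has shifted between versions, so treat the member names below as assumptions based on the project README rather than gospel:

    using System;
    using System.IO;
    using NetSerializer; // github.com/tomba/netserializer

    [Serializable] // NetSerializer serializes the fields of [Serializable] types
    public class Snapshot
    {
        public int Tick;
        public string Payload;
    }

    public static class NetSerializerDemo
    {
        // All root types are registered up front so the serialization
        // code (DynamicMethods) can be generated once.
        static readonly Serializer serializer =
            new Serializer(new[] { typeof(Snapshot) });

        public static byte[] Encode(Snapshot snapshot)
        {
            using (var ms = new MemoryStream())
            {
                serializer.Serialize(ms, snapshot);
                return ms.ToArray();
            }
        }

        public static Snapshot Decode(byte[] data)
        {
            using (var ms = new MemoryStream(data))
                return (Snapshot)serializer.Deserialize(ms);
        }
    }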

Overstuffed answered 13/6, 2011 at 10:24 Comment(1)
Wow this is really awesome. Thank you. It looks like this is much faster if you can deal with less stuff (versioning, etc). In this case less is more :)Brockbrocken

You could try using JSON. It's not as bandwidth-efficient as Protocol Buffers, but it's a lot easier to monitor messages with tools like Wireshark, which helps when debugging problems. .NET 3.5 comes with a JSON serializer.
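That serializer is DataContractJsonSerializer; a minimal sketch, with a hypothetical Move type:

    using System.IO;
    using System.Runtime.Serialization;
    using System.Runtime.Serialization.Json; // shipped with .NET 3.5

    [DataContract]
    public class Move // hypothetical message type
    {
        [DataMember] public int PlayerId { get; set; }
        [DataMember] public string Action { get; set; }
    }

    public static class JsonCodec
    {
        static readonly DataContractJsonSerializer serializer =
            new DataContractJsonSerializer(typeof(Move));

        public static byte[] Encode(Move move)
        {
            using (var ms = new MemoryStream())
            {
                serializer.WriteObject(ms, move); // human-readable JSON bytes
                return ms.ToArray();
            }
        }

        public static Move Decode(byte[] data)
        {
            using (var ms = new MemoryStream(data))
                return (Move)serializer.ReadObject(ms);
        }
    }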

Sting answered 14/2, 2009 at 14:45 Comment(0)

You could pass the data through a DeflateStream or GZipStream to compress it prior to transmission. These classes live in the System.IO.Compression namespace.
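A small sketch of the idea; the write/read delegates stand in for whichever serializer you pair it with:

    using System;
    using System.IO;
    using System.IO.Compression; // DeflateStream, GZipStream

    public static class Compressed
    {
        public static byte[] Compress(Action<Stream> write)
        {
            using (var ms = new MemoryStream())
            {
                // 'true' leaves the MemoryStream open after the deflate stream
                // is closed (closing it flushes the compressed data).
                using (var deflate = new DeflateStream(ms, CompressionMode.Compress, true))
                    write(deflate); // e.g. s => Serializer.Serialize(s, packet)
                return ms.ToArray();
            }
        }

        public static T Decompress<T>(byte[] data, Func<Stream, T> read)
        {
            using (var ms = new MemoryStream(data))
            using (var deflate = new DeflateStream(ms, CompressionMode.Decompress))
                return read(deflate);
        }
    }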

Martguerita answered 14/2, 2009 at 13:48 Comment(2)
Thanks! Do you know how badly this will impact de/serialization speed?Enravish
In my experience they're not great for huge streams of data, but for most other cases the impact should be minor; you'll need to try it and measure the time implications yourself. It's only a few lines of code to add around the serialization calls, so it's easy to try out.Martguerita

I had a very similar problem - saving to a file. But the following can also be used over a network as it was actually designed for remoting.

The solution is to use Simon Hewitt's library - see Optimizing Serialization in .NET - part 2.

Part 1 of the article states: "... If you've ever used .NET remoting for large amounts of data, you will have found that there are problems with scalability. For small amounts of data, it works well enough, but larger amounts take a lot of CPU and memory, generate massive amounts of data for transmission, and can fail with Out Of Memory exceptions. There is also a big problem with the time taken to actually perform the serialization - large amounts of data can make it unfeasible for use in apps ...."

I got a similar result in my particular application: saving became 40 times faster and loading 20 times faster (from minutes to seconds). The size of the serialised data was also much reduced; I don't remember exactly, but it was at least a factor of 2-3.

It is quite easy to get started. However, there is one gotcha: only use .NET serialisation for the very highest-level data structure (to get serialisation/deserialisation started), and then call the serialisation/deserialisation functions directly for the fields of that top-level structure; otherwise there will not be any speed-up. For instance, if a particular data structure (say Generic.List) is not supported by the library, then .NET serialisation will be used instead, and this is a no-no; serialise the list in client code (or similar) instead. For an example, see near "This is our own encoding" in the same function as listed below; a hedged sketch of the overall pattern also follows the reference below.

For reference: code from my application - see near "Note: this is the only place where we use the built-in .NET ...".
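To illustrate the gotcha above, a sketch of the top-level pattern. The SerializationWriter/SerializationReader names come from the article, but the exact member names here are assumptions; check them against the library:

    using System;
    using System.Runtime.Serialization;
    // SerializationWriter/SerializationReader are from Simon Hewitt's library;
    // the member names below are assumed from the article, not verified.

    [Serializable]
    public class GameState : ISerializable
    {
        public int Tick;
        public string MapName;

        public GameState() { }

        // .NET serialisation is used only at this top level; the fields are
        // written directly, bypassing BinaryFormatter's per-object metadata.
        public void GetObjectData(SerializationInfo info, StreamingContext context)
        {
            var writer = new SerializationWriter();
            writer.Write(Tick);
            writer.Write(MapName);
            info.AddValue("data", writer.ToArray()); // assumed helper
        }

        protected GameState(SerializationInfo info, StreamingContext context)
        {
            var reader = new SerializationReader(
                (byte[])info.GetValue("data", typeof(byte[])));
            Tick = reader.ReadInt32();
            MapName = reader.ReadString();
        }
    }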

Collar answered 10/8, 2009 at 16:41 Comment(0)

You can try BOIS, which focuses on packed data size and provides the best packing so far. (I haven't seen better optimization yet.)

https://github.com/salarcode/Bois
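A hedged sketch of using it; the BoisSerializer type and the Serialize/Deserialize calls follow the project README, but treat the exact signatures as assumptions:

    using System.IO;
    using Salar.Bois; // BoisSerializer

    public static class BoisDemo
    {
        public static byte[] Encode<T>(T value)
        {
            var serializer = new BoisSerializer();
            using (var ms = new MemoryStream())
            {
                serializer.Serialize(value, ms); // packed, schema-less binary
                return ms.ToArray();
            }
        }

        public static T Decode<T>(byte[] data)
        {
            var serializer = new BoisSerializer();
            using (var ms = new MemoryStream(data))
                return serializer.Deserialize<T>(ms);
        }
    }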

Beulabeulah answered 22/3, 2013 at 6:13 Comment(6)
+1, but MPL? For real? Who does that? I was holding my breath for the speed boost and, bah, it was ruined.Remsen
What part of the MPL do you have a problem with? It lets you use the components in both open-source and commercial software, while keeping the developer's right to access the modified version of the component in that software (if there is any).Beulabeulah
That's fine then. I imagined it gave access to everything.Remsen
Very slow to deserialize. Dataset with 1 table, 50000 rows, 50 columns, containing programmatically generated string data (all columns are of type string). 2 seconds to serialize into file, 15(!) seconds to deserialize. BinaryFormatter manages to finish in 12 seconds total, so BOIS appears to be slower, and hence of no practical use.Waste
@Neolisk You should know that the DataSet support is only available for desktop, not other platforms. It is there only for backward compatibility; no optimization has been done for it, and none ever will be. You should reconsider your data storage strategy.Beulabeulah
@SalarKhalilzadeh: Yes, I'm working with desktop solutions for Windows. This isn't likely to change in the near future. "Reconsider your data storage strategy" <-- do you know that some companies still need mainframe developers? My point is that sometimes you cannot change how legacy code is written under the hood. It's like an old car vs. a new one: yes, buying a Maserati will fix your problem, but do you have the extra $200K, if all that is required is an engine rebuild for $3K?Waste
