OnSerializing/OnSerialized methods not always called

Here is a structure I serialize in my project:

[Serializable]
class A : List<B> //root object being serialized

[Serializable]
class B
  + [A few serializable fields]
  + C customList

[Serializable]
class C : List<D>

[Serializable]
class D
  + [several serializable fields]
  |
  + [NonSerialized] nonserializable3rdPartyClass data
  + string xmlOf3rdPartyData
  |
  + [OnSerializing]
  + private void OnSerializing(StreamingContext context)
  |
  + [OnSerialized]
  + private void OnSerialized(StreamingContext context)
  |
  + [OnDeserialized]
  + private void OnDeserialized(StreamingContext context)

The nonserializable3rdPartyClass, although not marked as [Serializable], provides .ToXml and .FromXml methods, which I use in my .OnSerializing and .OnDeserialized methods, respectively, to store and retrieve the XML string in xmlOf3rdPartyData.
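As a rough illustration of the arrangement described above, here is a minimal sketch of what D could look like. ThirdPartyData is a hypothetical stand-in for nonserializable3rdPartyClass, and its ToXml/FromXml signatures are assumptions, since the real ones aren't shown. Clearing the XML string in OnSerialized is also an assumption about what that callback is used for.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical stand-in for nonserializable3rdPartyClass (not marked [Serializable]).
public class ThirdPartyData
{
    public string Value { get; set; }

    // Assumed signatures; the real class's ToXml/FromXml may differ.
    public string ToXml() { return "<data>" + Value + "</data>"; }

    public static ThirdPartyData FromXml(string xml)
    {
        int start = "<data>".Length;
        int length = xml.Length - start - "</data>".Length;
        return new ThirdPartyData { Value = xml.Substring(start, length) };
    }
}

[Serializable]
public class D
{
    [NonSerialized]
    private ThirdPartyData data;      // skipped by the formatter

    private string xmlOf3rdPartyData; // serialized in place of 'data'

    public ThirdPartyData Data { get { return data; } set { data = value; } }

    [OnSerializing]
    private void OnSerializing(StreamingContext context)
    {
        // Capture the third-party object as XML just before serialization.
        xmlOf3rdPartyData = data == null ? null : data.ToXml();
    }

    [OnSerialized]
    private void OnSerialized(StreamingContext context)
    {
        // Assumption: the XML copy is only needed while serializing.
        xmlOf3rdPartyData = null;
    }

    [OnDeserialized]
    private void OnDeserialized(StreamingContext context)
    {
        // Rebuild the third-party object from the stored XML.
        data = xmlOf3rdPartyData == null ? null : ThirdPartyData.FromXml(xmlOf3rdPartyData);
        xmlOf3rdPartyData = null;
    }
}
```

With this layout, any D whose OnSerializing callback is skipped is written out with xmlOf3rdPartyData still null, which matches the symptom described below.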

I've recently come across an issue where, under circumstances I can't yet pin down (so far I can only reproduce it with a serialized data file from the client who first reported it), my .OnSerializing and .OnSerialized methods are called only 57 times out of 160 (160 being the total number of D objects in the structure) when using a BinaryFormatter to serialize to a file. That leaves 103 D objects with xmlOf3rdPartyData set to null. When cloning the structure using the method described here (which is basically the same as what I use to serialize to a file), I see the same results for .OnSerializing/.OnSerialized, but my .OnDeserialized method is called the full 160 times.

This bit of code has been in use for months without issue (at least, as far as I know), and I'm still trying to determine why this is happening now and not earlier. I'm not seeing any first-chance exceptions while debugging, and my breakpoints at the start of the methods are simply not being hit more than 57 times. Any ideas on why this would occur or how to fix it?

Civics answered 10/7, 2012 at 20:53 Comment(1)
TL;DR Within a single call of BinaryFormatter.Serialize, my OnSerializing/OnSerialized marked methods are only being called ~35% of the time, but only in this one scenario that I can't yet qualify/find a reason for. – Civics

After a few days of digging, I discovered that the problem was both my fault and a possible bug in the .NET Framework.

The .NET half of the problem

While poking around in the stack trace for my OnSerializing method, I came across the RegisterObject method in System.Runtime.Serialization.SerializationObjectManager, which determines whether to call any OnSerializing methods in the object being serialized. It determines this by asking two questions (this is based on decompiled code from .NET Reflector):

  1. Does the class have any OnSerializing methods to call?
  2. Is this a previously unseen object (within this call to BinaryFormatter.Serialize)?

Number 2 is the problem child. It tracks objects that have already been seen by storing them as an object/bool pair in a Hashtable (which identifies keys using GetHashCode and Equals, of course). If the answer to either question is no, the object's OnSerializing methods are not called. This apparently works fine in the vast majority of situations (otherwise Microsoft would have fixed it at some point, right?), except for the one I seem to have stumbled upon.
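To make the effect of that check concrete, here is a small simulation of a Hashtable-based "seen" table. This is not the framework's source; SeenTracker, Item, and SeenTableDemo are hypothetical names, and the counter merely stands in for invoking OnSerializing.

```csharp
using System;
using System.Collections;

// Hypothetical key type whose equality ignores one field.
public class Item
{
    public int Id;
    public string Ignored; // left out of Equals/GetHashCode

    public override bool Equals(object obj)
    {
        Item other = obj as Item;
        return other != null && other.Id == Id;
    }

    public override int GetHashCode() { return Id.GetHashCode(); }
}

// Simulation only: mirrors the "previously unseen object" check described above.
public class SeenTracker
{
    private readonly Hashtable seen = new Hashtable();
    public int CallbacksInvoked;

    public void Register(object obj)
    {
        if (seen[obj] == null)   // lookup goes through GetHashCode and Equals
        {
            seen[obj] = true;
            CallbacksInvoked++;  // stands in for invoking OnSerializing
        }
    }
}

public static class SeenTableDemo
{
    public static int Run()
    {
        var tracker = new SeenTracker();
        tracker.Register(new Item { Id = 1, Ignored = "first" });
        tracker.Register(new Item { Id = 1, Ignored = "second" }); // distinct instance, "equal" key
        tracker.Register(new Item { Id = 2, Ignored = "third" });
        return tracker.CallbacksInvoked; // 2, although 3 objects were registered
    }
}
```

The second Item is a different instance, but because the table compares keys by value rather than by reference, it is treated as already seen and its callback is skipped.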

My half of the problem

Simply enough, I forgot to include the non-serializable field in GetHashCode for my D class, so I was getting collisions: distinct D objects were being treated as the same, already-seen object. Stupid mistake, I know; I don't know how I missed it.
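For illustration, a sketch of the kind of fix this implies: letting equality and the hash code cover every field, including the one excluded from serialization. Record and DistinctKeysDemo are hypothetical names, and the hash-combining constants are just a common convention, not anything taken from my actual D class.

```csharp
using System;
using System.Collections;

// Hypothetical class: Equals/GetHashCode now include the non-serialized field.
[Serializable]
public class Record
{
    public int Id;
    [NonSerialized] public string Payload;

    public override bool Equals(object obj)
    {
        Record other = obj as Record;
        return other != null && other.Id == Id && other.Payload == Payload;
    }

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + Id.GetHashCode();
            hash = hash * 31 + (Payload == null ? 0 : Payload.GetHashCode());
            return hash;
        }
    }
}

public static class DistinctKeysDemo
{
    public static int Run()
    {
        var table = new Hashtable();
        table[new Record { Id = 1, Payload = "a" }] = true;
        table[new Record { Id = 1, Payload = "b" }] = true; // no longer equal to the first
        return table.Count; // 2: both instances are tracked separately
    }
}
```

Note that this only helps when the objects actually differ in some field. Two instances that are equal in every field would still share one entry in such a table.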

But wait...

...wouldn't that mean that it's not .NET's fault at all, just my own? No, and here's why. I expect OnSerializing and OnSerialized methods to be called 100% of the time, no matter what. Nowhere in the docs does it say otherwise. When that doesn't happen, my objects aren't serialized correctly, and I end up spending way more time than I'd like trying to solve mysteries. Even if two identical objects are being purposefully serialized, they apparently don't end up pointing to the same binary data/location in the Stream, so they don't deserialize the same. I'd consider this a bug, not a feature.

I've written up a test case that demonstrates all this. If I'm doing anything blatantly wrong, I'd appreciate feedback saying so; otherwise I'll probably post this on the MSDN forums or as a Connect bug. And before anyone suggests it, I've planned on switching away from BinaryFormatter for some time now for all the various reasons posted elsewhere on SO; I've just had more important things to deal with.

Edit: Apparently this bug was filed over a year and a half ago.

Civics answered 13/7, 2012 at 1:9 Comment(1)
I think the "feature" has now been documented, though to me it was incomprehensible other than as a warning; your issue/explanation helped. Thanks: learn.microsoft.com/en-us/dotnet/api/… "When using binary serialization, the binary serialization process omits the call to the onSerializing method when the serialized objects are equal." – Balmuth
