I created a binary serialization alternative, which is fully compatible with the IFormatter infrastructure, including the ISerializable, IDeserializationCallback and IObjectReference interfaces, the (de)serialization event methods, surrogate selectors and binders.
🔐 Security note: Please read the security notes in the Remarks section of the BinarySerializationFormatter class. Make sure you enable SafeMode, which eliminates a lot of security issues that BinaryFormatter suffers from (starting with v8.0.0 it's enabled by default).
You can find the NuGet package here and an online demo here, which also demonstrates how compact the result can be compared to BinaryFormatter.
Update for clarifying the security questions:
Does your package solve the security issue which is with the BinaryFormatter?
TL;DR: Yes, if you use only the types natively supported by the serializer and enable SafeMode on deserialization.
Elaborated answer:
BinaryFormatter is dangerous at multiple levels. Whereas some of these threats can be cured by a reimplementation (e.g. we can prevent auto-loading assemblies referenced by the serialization stream, we can use pessimistic allocation for arrays, strings and collections and increase the capacity dynamically to prevent OutOfMemoryException attacks, etc.), others come from the IFormatter infrastructure itself. For example, serializable types that can have an invalid state in terms of their fields usually do not validate the deserialized data in their deserialization events or in the special constructor. So it is partly the implementer's responsibility to ensure security completely.
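The "pessimistic allocation" idea mentioned above can be sketched roughly as follows. This is only an illustration of the technique, not the actual implementation of any serializer: the `PessimisticReader` class, the `ReadBytes` method and the 4096-byte chunk size are all hypothetical names and values chosen for the example.

```csharp
using System;
using System.IO;

public static class PessimisticReader
{
    // Hypothetical chunk size, chosen only for this illustration.
    private const int ChunkSize = 4096;

    // Reads a byte array whose length was declared by the (untrusted) stream.
    // The buffer grows only as real data arrives, so a forged length
    // cannot force a huge up-front allocation.
    public static byte[] ReadBytes(Stream source, int declaredLength)
    {
        if (declaredLength < 0)
            throw new InvalidDataException("Negative length in stream.");

        // Start with at most one chunk of capacity, whatever the stream claims.
        using var result = new MemoryStream(Math.Min(declaredLength, ChunkSize));
        var buffer = new byte[ChunkSize];
        int remaining = declaredLength;
        while (remaining > 0)
        {
            int read = source.Read(buffer, 0, Math.Min(remaining, buffer.Length));
            if (read == 0)
                throw new EndOfStreamException("Stream ended before the declared length was reached.");
            result.Write(buffer, 0, read);
            remaining -= read;
        }

        return result.ToArray();
    }
}
```

With this approach, a three-byte stream that claims to contain `int.MaxValue` bytes fails with an `EndOfStreamException` after reading three bytes, instead of attempting a 2 GB allocation first.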
The main reason BinaryFormatter "cannot be made secure" is that it's a polymorphic serializer, meaning that an object field can hold any serializable type: just name it in the serialization stream and it will be resolved. It is a common misunderstanding that it's insecure because it uses a binary format, whereas JSON serialization is safe. No, polymorphic JSON serialization (e.g. Newtonsoft's JSON.NET) is also insecure if you allow the type names to be dumped and resolved. That's why System.Text.Json does not support polymorphism automatically, and applying some polymorphism to it can be a pain.
And vice versa, Google's ProtoBuf is safe even though it has a binary format, because it only uses a few primitive types that are resolved not by name but from a closed set of identifiers. The most complex thing you can encode is a list of key-value pairs. The trade-off is that it's really hard to serialize a nested object graph with ProtoBuf.
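To illustrate what "a closed set of identifiers" means in contrast to resolving type names from the stream, here is a minimal sketch. The `ClosedSetReader` class and its one-byte `WireType` codes are hypothetical and do not reflect the wire format of ProtoBuf or any other real serializer.

```csharp
using System;
using System.IO;

public static class ClosedSetReader
{
    // Hypothetical one-byte type codes, invented for this illustration.
    private enum WireType : byte { Int32 = 1, String = 2, Boolean = 3 }

    // The stream can only select one of the cases below. It never supplies
    // a type name, so there is nothing to resolve or load by name.
    public static object ReadValue(BinaryReader reader)
    {
        var code = (WireType)reader.ReadByte();
        switch (code)
        {
            case WireType.Int32:
                return reader.ReadInt32();
            case WireType.String:
                return reader.ReadString();
            case WireType.Boolean:
                return reader.ReadBoolean();
            default:
                throw new InvalidDataException($"Unknown type code: {(byte)code}");
        }
    }
}
```

A polymorphic serializer would instead read something like an assembly-qualified type name at this point and try to resolve it, which is exactly where a manipulated stream gets its leverage.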
BinarySerializationFormatter attempts to minimize the risks by supporting a lot of types (including collections) natively. These types are encoded without any assembly identity, so it ensures both safety and a very compact payload if you don't use any custom types. As long as you use only these types and enable SafeMode on deserialization, you are completely safe.
If you are using a custom type (even if it's just a simple enum), the assembly qualified name of the type must be stored, which can be manipulated. In SafeMode, type resolving does not actually load any types but selects the matching type from the collection of expected types that you must pass as a parameter for the deserialization. But this can be safe only if your expected types cannot be exploited by any security hole. For example, if you target .NET Framework, the TempFileCollection class can be exploited to delete files (this specific attack is now prevented by special handling in SafeMode, so it cannot be deserialized even if it's an expected type).
I have already serialized the data using binaryformatter and written to the file. now I want to use some other alternative which can read this file and deserialize the file content.
The stream formats are not compatible. So BinarySerializationFormatter can only assure that if you were able to serialize your objects with BinaryFormatter, then it will also work with my serializer, but the binary stream will be different (in fact, much more compact).
However, if you really need to use the stream serialized by BinaryFormatter, a small step towards security can be using a serialization binder (e.g. this one), which can also be used with BinaryFormatter. It can ensure that only the expected types are allowed to be deserialized, and if its SafeMode is true, then unexpected type names will not be resolved even if the serialization stream contains a manipulated assembly identity with a potentially harmful module initializer, a type with a malicious constructor, etc.
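The general idea of such a binder can be sketched with the standard SerializationBinder base class. Note that this is not the binder linked above: `AllowListBinder` is a hypothetical, hand-written example that matches types by full name only and ignores the assembly name from the stream, so it never loads anything. It is deliberately simplistic; for instance, constructed generic types may not match, because their full names embed assembly identities.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.Serialization;

// Hypothetical allow-list binder, written only to illustrate the idea.
public sealed class AllowListBinder : SerializationBinder
{
    private readonly Dictionary<string, Type> allowed = new Dictionary<string, Type>();

    public AllowListBinder(params Type[] expectedTypes)
    {
        foreach (Type type in expectedTypes)
        {
            if (type.FullName != null)
                allowed[type.FullName] = type;
        }
    }

    // Called by the formatter with the names found in the stream.
    // Only already known types are returned; the assembly name from the
    // stream is ignored, so no assembly is loaded on its behalf.
    public override Type BindToType(string assemblyName, string typeName)
    {
        if (allowed.TryGetValue(typeName, out Type type))
            return type;

        throw new SerializationException($"Unexpected type in stream: {typeName}");
    }
}
```

On platforms where BinaryFormatter is still available, it could be assigned as `new BinaryFormatter { Binder = new AllowListBinder(typeof(MyData)) }`, where `MyData` stands for your own type. Even then, this only narrows the set of resolvable types; it does not fix the other issues described above.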
… technical constraints? What is stored in that file? There are several duplicate questions, beyond the options described in the deprecation docs. Parquet, Arrow, Protobuf, MessagePack, ORC, HDF5 etc are all options that can be read by both .NET code and other applications. – Wendish
… Dictionary<string,DateTime> you can store it into a two-column table or text file. Using BinaryFormatter was already wasteful, storing more data than necessary – Wendish
… [ with < before outputting the file and call it kson :D – Althing
… BinaryFormatter class itself. That file contains type names and signatures, not just values. It's dangerous because reading from it will try to use whatever type or method matches the stored signature, whether it's the original or not. It will even create any missing types, which could be used instead of the application's own dynamically loaded types. That's why BinaryFormatter is unfixable – Wendish
… BinaryFormatter.Deserialize has no type parameter so it will return whatever is stored in that file. – Wendish
… Converter.exe would be possible, that gets executed in a sandbox e.g. stackoverflow.com/questions/3029214 - does this eliminate the security risks? – Althing
… Deserialize would still load whatever was in that file, even if it was e.g. a 1GB array, generating any missing types. A better option, one taken by .NET itself, is to migrate the data and gradually remove BinaryFormatter entirely. After all, the application is already at risk by using it. A gradual migration won't increase the risk. – Wendish