Alternatives to BinaryFormatter in C#?

I have a file in which I am writing the content using the below C# code.

ConcurrentDictionary<string, DateTime> _jobsAck;
public void SaveToDisk()
{
    var binaryFormatter = new BinaryFormatter();
    using (var stream = File.Open(BINARY_FILENAME, FileMode.OpenOrCreate))
    {
        binaryFormatter.Serialize(stream, _jobsAck);
    }
}

I am reading this file and deserializing using the below C# code.

public void LoadFromDisk()
{
    if (!File.Exists(BINARY_FILENAME)) return;

    var binaryFormatter = new BinaryFormatter();
    using (var stream = File.Open(BINARY_FILENAME, FileMode.Open, FileAccess.Read))
    {
        var deserializedStream = binaryFormatter.Deserialize(stream);
        _jobsAck = deserializedStream as ConcurrentDictionary<string, DateTime>;
        if (_jobsAck == null)
        {
            _jobsAck = new ConcurrentDictionary<string, DateTime>();
            if (!(deserializedStream is Dictionary<string, DateTime> ackDict)) return;
            foreach (var pair in ackDict)
            {
                _jobsAck.TryAdd(pair.Key, pair.Value);
            }
        }
    }
}

We have been asked not to use BinaryFormatter because of its security issues. Is there an alternative that can read and write in a binary format?

.NET Framework version: 4.7.2

Villenage answered 22/8, 2023 at 12:16 Comment(16)
Use JSON serializer/deserializer.Depot
nuget.org/packages/MessagePackGaitan
Looks like it depends what data you are handling: learn.microsoft.com/en-us/dotnet/standard/serialization/…Becki
@Depot I cannot use json or xml serializer due to some technical constraints in this case.Villenage
Could you please express your constraints within your question so that we don't have to guess?Damalus
Alternatives based on what criteria? What are the technical constraints? What is stored in that file? There are several duplicate questions, beyond the options described in the deprecation docs. Parquet, Arrow, Protobuf, MessagePack, ORC, HDF5 etc are all options that can be read by both .NET code and other applications.Wendish
You could even use SQLite or even a plain old CSV file. If the data is a Dictionary<string,DateTime> you can store it into a two-column table or text file. Using BinaryFormatter was already wasteful, storing more data than necessaryWendish
Without knowing the constraints, I suggest using a JSON serializer internally, but replace [ with < before outputting the file and call it kson :DAlthing
@PanagiotisKanavos the problem here is I need to read the binary file first and then I need to write the file later. so my binary file already has some data and I don't want to lose it.Villenage
Well, the obvious answer would be not to overwrite the original file. But why would you lose the data at all? If you want to replace it, check the format of the file: if it is in the BinaryFormatter format, read it with BinaryFormatter and "transform" it to whatever.Althing
@RandRandom I want to completely get rid of BinaryFormatter due to security issues.Villenage
But you apparently can't if you still need to be able to read data formatted with BinaryFormatter. IMHO you can't have it both ways: stay backward compatible with old data written with BinaryFormatter and at the same time get rid of it. At the very least you need to be able to read the data; you can get rid of writing data with BinaryFormatter.Althing
@VivekNuna the security issue is the contents of the file, not the BinaryFormatter class itself. That file contains type names and signatures, not just values. It's dangerous because reading from it will try to use whatever type or method matches the stored signature, whether it's the original or not. It will even create any missing types, which could be used instead of the application's own dynamically loaded types. That's why BinaryFormatter is unfixableWendish
@VivekNuna if that file referred to a class in your target application, deserializing it would create an instance and run its constructor, before your code had a chance to check that data. BinaryFormatter.Deserialize has no type parameter so it will return whatever is stored in that file.Wendish
@PanagiotisKanavos - do you happen to know if a Converter.exe would be possible, that gets executed in a sandbox eg. stackoverflow.com/questions/3029214 - does this eliminate the security risks?Althing
A minimal console application that only deserialized the data and wrote it to a file would be simpler and just as safe. Deserialize would still load whatever was in that file, even if it was eg a 1GB array, generating any missing types. A better option, one taken by .NET itself, is to migrate the data and gradually remove BinaryFormatter entirely. After all, the application is already at risk by using it. A gradual migration won't increase the risk.Wendish

If your customers already have data saved with BinaryFormatter, you need to keep it for reading files, regardless of its security issues, until you have migrated all, or most, of your customers to some new format. There is to my knowledge no published specification of the BinaryFormatter format, nor any other compatible libraries. And even if there were, I'm not sure it could solve the security problems, since the problems are inherent to the format itself.

So the first step should be to create a new format, using some well-designed serialization library. I mostly use JSON and protobuf (.NET), but there are plenty of good alternatives; see https://softwarerecs.stackexchange.com/ if you want recommendations. Just about anything should be better than BinaryFormatter.
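For data as simple as the string-to-DateTime map in the question, a new binary format does not even need a library. The following is a minimal sketch, not a recommendation over the libraries above: the class name `JobAckFile`, the `JACK` marker and the version number are all made up for illustration, and only `BinaryWriter`/`BinaryReader` from the base class library are used.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class JobAckFile
{
    // Hypothetical 4-byte marker and version number identifying the new format.
    private static readonly byte[] Magic = { (byte)'J', (byte)'A', (byte)'C', (byte)'K' };
    private const int Version = 1;

    public static void Save(string path, KeyValuePair<string, DateTime>[] entries)
    {
        // FileMode.Create truncates an existing file, so no stale bytes remain.
        using (var stream = File.Open(path, FileMode.Create, FileAccess.Write))
        using (var writer = new BinaryWriter(stream, Encoding.UTF8))
        {
            writer.Write(Magic);
            writer.Write(Version);
            writer.Write(entries.Length);
            foreach (var pair in entries)
            {
                writer.Write(pair.Key);               // length-prefixed string
                writer.Write(pair.Value.ToBinary());  // ticks and Kind in one Int64
            }
        }
    }

    public static Dictionary<string, DateTime> Load(string path)
    {
        using (var stream = File.Open(path, FileMode.Open, FileAccess.Read))
        using (var reader = new BinaryReader(stream, Encoding.UTF8))
        {
            byte[] magic = reader.ReadBytes(Magic.Length);
            if (magic.Length != Magic.Length)
                throw new InvalidDataException("File is too short.");
            for (int i = 0; i < Magic.Length; i++)
            {
                if (magic[i] != Magic[i])
                    throw new InvalidDataException("Not a job acknowledgement file.");
            }
            if (reader.ReadInt32() != Version)
                throw new InvalidDataException("Unsupported format version.");

            int count = reader.ReadInt32();
            if (count < 0)
                throw new InvalidDataException("Negative entry count.");

            // Capacity is not taken from the file, so a forged count cannot
            // trigger a huge allocation up front.
            var jobs = new Dictionary<string, DateTime>();
            for (int i = 0; i < count; i++)
            {
                string key = reader.ReadString();
                jobs[key] = DateTime.FromBinary(reader.ReadInt64());
            }
            return jobs;
        }
    }
}
```

With the field from the question, saving could look like `JobAckFile.Save(path, _jobsAck.ToArray())`, since `ConcurrentDictionary.ToArray()` returns a snapshot. Only strings, integers and dates are ever read back, so no type names are stored in the file and nothing is resolved by name.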

You should then update your application so that it can no longer save files using BinaryFormatter, only in your new format. Depending on your exact use case, you might be able to convert saved data as soon as the new version is installed; in other cases you might only be able to do so when a user explicitly saves a file.
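A rough sketch of the "convert on first load" variant, under the assumption that the new format is stored under a different file name than the legacy one. `JobAckMigration` and its parameters are hypothetical names; the three delegates stand for whatever reader and writer you pick for the new format and for your existing BinaryFormatter-based loader.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class JobAckMigration
{
    public static Dictionary<string, DateTime> LoadAndUpgrade(
        string newPath,
        string legacyPath,
        Func<string, Dictionary<string, DateTime>> loadNew,
        Func<string, Dictionary<string, DateTime>> loadLegacy,
        Action<string, Dictionary<string, DateTime>> saveNew)
    {
        // Once a file in the new format exists, the legacy reader is never used.
        if (File.Exists(newPath))
            return loadNew(newPath);

        // Fresh installation: nothing to load or convert.
        if (!File.Exists(legacyPath))
            return new Dictionary<string, DateTime>();

        // One-time conversion. This is the only place the legacy
        // (BinaryFormatter-based) reader should be reachable from.
        var jobs = loadLegacy(legacyPath);

        // Write to a temporary file first, then move it into place, so an
        // interrupted conversion does not leave a half-written new file.
        string tempPath = newPath + ".tmp";
        File.Delete(tempPath); // no-op if the file does not exist
        saveNew(tempPath, jobs);
        File.Move(tempPath, newPath);

        // The legacy file is left untouched as a backup.
        return jobs;
    }
}
```

Keeping the legacy file around means a failed or buggy conversion loses nothing. Whether and when to delete it is a support-policy decision rather than a technical one.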

Once your updated application with support for the new format has been out for a while you can start thinking about removing support for BinaryFormatter. Users of older versions might be forced to update to an intermediate version and convert their files. Or you might publish a separate tool that only does conversions between the old format and the new format. You could also add a security warning when opening a file in the old format, to at least warn the user of the risk.

The main point here is that the sooner you introduce a new format, the sooner you can drop support for the old format. The length of this process will largely depend on your support commitments to customers, and willingness to make breaking changes.

Mid answered 22/8, 2023 at 14:1 Comment(6)
I am getting runtime error when using protobuf Could not load file or assembly 'System.Memory, Version=4.0.1.2, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51' or one of its dependencies. The located assembly's manifest definition does not match the assembly reference. (Exception from HRESULT: 0x80131040)Villenage
@VivekNuna That is likely because you are missing a reference to "'System.Memory, Version=4.0.1.2" or one of its dependencies. Nuget sometimes has problems installing transitive dependencies for projects in the old format, so you might need to install such dependencies yourself.Mid
does that version even exist? nuget.org/packages/System.Memory#versions-body-tabAlthing
@RandRandom It at least existed at one point in time. But as long as the major version is the same, newer versions should be backward compatible, so using the latest 4.x should work.Mid
@Mid that is why I am wondering. I already have 4.5.5 version of System.Memory. I have raised bug here as well github.com/protobuf-net/protobuf-net/issues/1092 . lets seeVillenage
@VivekNuna I have no real idea. Normally, assembly bindings in the app.config file should instruct the CLR to load the newer version if an older version is requested. But there are plenty of ways for dependencies to go wrong.Mid

I created a binary serialization alternative, which is fully compatible with the IFormatter infrastructure, including the ISerializable, IDeserializationCallback, IObjectReference interfaces, the (de)serialization event methods, surrogate selectors and binders.

🔐 Security note: Please read the security notes at the Remarks section of the BinarySerializationFormatter class. Make sure you enable SafeMode, which eliminates a lot of security issues that BinaryFormatter suffers from (starting with v8.0.0 it's enabled by default).

You can find the NuGet package here and an online demo here, which also demonstrates how compact the result can be compared to BinaryFormatter.


Update for clarifying the security questions:

Does your package solve the security issue which is with the BinaryFormatter?

TL;DR: Yes, if you use only the natively supported types by the serializer and enable SafeMode on deserialization.

Elaborated answer:

BinaryFormatter is dangerous at multiple levels. Some of these threats can be cured by a reimplementation (e.g. we can prevent auto-loading of assemblies referred to by the serialization stream, we can use pessimistic allocation for arrays, strings and collections and increase the capacity dynamically to prevent OutOfMemoryException attacks, etc.), but others come from the IFormatter infrastructure itself. For example, serializable types that can have an invalid state in terms of their fields usually do not validate the deserialized data in their deserialization events or in the special constructor. So it is partly the implementer's responsibility to ensure security completely.

The main reason BinaryFormatter "cannot be made secure" is that it's a polymorphic serializer, meaning that an object field can hold any serializable type: just name it in the serialization stream and it will be resolved. It is a common misunderstanding that it's insecure because it uses a binary format, whereas JSON serialization is safe. No, polymorphic JSON serialization (e.g. Newtonsoft's Json.NET) is also insecure if you allow the type names to be dumped and resolved. That's why System.Text.Json does not support polymorphism automatically, and applying some polymorphism to it can be a pain.

And vice versa, Google's ProtoBuf is safe even though it has a binary format, because it only uses a few primitive types that are not resolved by name but from a closed set of identifiers. The most complex thing you can encode is a list of key-value pairs. In return, it's really hard to serialize a nested object graph with ProtoBuf.

BinarySerializationFormatter attempts to minimize the risks by supporting a lot of types (including collections) natively. These types are encoded without any assembly identity so it ensures both safety and a very compact payload if you don't use any custom types. As long as you use only these types and enable SafeMode on deserialization you are completely safe.

If you are using a custom type (even if it's just a simple enum), the assembly qualified name of the type must be stored, and that name can be manipulated. In SafeMode, type resolving does not actually load any types but selects the matching type from the collection of expected types that you must pass as a parameter for the deserialization. But this can be safe only if your expected types cannot be exploited by any security hole. For example, if you target .NET Framework, the TempFileCollection class can be exploited to delete files (this specific attack is now blocked by special handling in SafeMode, so it cannot be deserialized even if it's an expected type).

I have already serialized the data using binaryformatter and written to the file. now I want to use some other alternative which can read this file and deserialize the file content.

The stream formats are not compatible. So BinarySerializationFormatter can only assure that if you were able to serialize your objects with BinaryFormatter, then they will also work with my serializer, but the binary stream will be different (in fact, much more compact).

However, if you really need to use the stream serialized by BinaryFormatter, a small step towards security can be using a serialization binder (eg. this one), which can be used also with BinaryFormatter. It can ensure that only the expected types are allowed to be deserialized, and if its SafeMode is true, then unexpected type names will not be resolved even if the serialization stream contains a manipulated assembly identity with a potentially harmful module initializer or a type with malicious constructor, etc.
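To illustrate the binder idea with nothing but the base class library, here is a minimal allow-list binder. It is a sketch, not the binder linked above: `AllowListBinder` is a hypothetical name, and the set of types that has to be allowed is an assumption you need to verify against your own files.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.Serialization;

public sealed class AllowListBinder : SerializationBinder
{
    private readonly Dictionary<string, Type> _allowed = new Dictionary<string, Type>();

    public AllowListBinder(params Type[] allowedTypes)
    {
        foreach (var type in allowedTypes)
            _allowed[type.FullName] = type;
    }

    public override Type BindToType(string assemblyName, string typeName)
    {
        // The assembly name from the stream is deliberately ignored: the Type
        // returned always comes from the allow list, never from an assembly
        // named by the file.
        Type type;
        if (_allowed.TryGetValue(typeName, out type))
            return type;

        // Throw instead of returning null: a null result makes the formatter
        // fall back to its default type resolution.
        throw new SerializationException("Unexpected type in stream: " + typeName);
    }
}
```

It would be attached with something like `new BinaryFormatter { Binder = new AllowListBinder(typeof(ConcurrentDictionary<string, DateTime>), typeof(Dictionary<string, DateTime>)) }`. A ConcurrentDictionary stream will probably reference a few helper types as well (key-value pairs, a comparer), so expect to extend the list after testing with real files. Note also that Microsoft's guidance is that a binder narrows the attack surface but does not make BinaryFormatter safe, so this only makes sense as a stopgap during migration.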

Stitch answered 22/8, 2023 at 15:52 Comment(4)
Thank you for your answer. But the problem remains the same, Please refer to the comments in the question. let me summarize for you. I have already serialized the data using binaryformatter and written to the file. now I want to use some other alternative which can read this file and deserialize the file content. Note: The file was written using BinaryFormatter.Villenage
one more question, Does your package solve the security issue which is with the BinaryFormatter?Villenage
Also, I tried to deserialise my file using your library and it threw an exception. You can reproduce it as well: just create a dictionary, serialise it to a file using BinaryFormatter, then deserialise it using your library.Villenage
@VivekNuna: "I tried to deserialise my file using your library" - I've just updated my answer. Of course, that will not work. And if it did, it would just reintroduce the security issues of BinaryFormatter. As it supports only a very few types natively, it goes for recursive serialization even for a decimal, DateTime or List<T>. Meaning, it stores the assembly qualified identity for them that can be manipulated. See the details in the updated answer.Modred
