.NET Binary serialization metadata
A week ago I had to read a binary-serialized object produced by somebody else's application. I only had the someSerializedData.bin file, so I tried to recreate the class definition for the unknown object by hand, and I was able to do so thanks to the metadata in the serialized file. Oddly, I couldn't find any tool for this on Google.

Q1: Why is there no tool that recreates the class definition from a binary serialized file/data?

That leads to my second question:

Q2: Is there any case in which it's impossible to restore the class definition from the serialized data? (Assume it is not encrypted or obfuscated in any way. I'm interested in cases involving the "default" .NET BinaryFormatter, for example settings that reduce the type information and metadata it includes.)

Competitive answered 14/8, 2013 at 17:34 Comment(5)
Please don't just downvote, tell me what's wrong, so I can improve the question.Competitive
I think what you are looking for is called reflection (taking machine code and reversing it back to C#), am I right?Faulkner
Do you have a copy of the app that made the .bin file? If so, you can decompile it and look at the code. See dotPeek: jetbrains.com/decompilerSubteen
Maybe this link can help you: msdn.microsoft.com/en-us/library/cc236844.aspxVenturesome
Binary serialization is often customized by the class implementing ISerializable. Lots of .NET classes do this, for example. You cannot recover that from the binary data.Columelliform
L
1

No such tool exists because a type that only contains the data is often not enough. The methods are often just as important as the data, especially with properties that do more than set their private variables, and nobody can recover those methods from the serialized file.

With that said, it may be useful to have a tool that is at least able to generate a type to hold the data. Maybe you'll be the first one to create such a tool?
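As a rough idea of what the "type to hold the data" part of such a tool might emit, here is a minimal Python sketch. The function name `render_holder_class` and the sample type `MyApp.Person` are made up for illustration; it assumes the type name and the field names and types have already been recovered from the file, and it only produces fields, no methods or property logic.

```python
def render_holder_class(full_name, members):
    """Render a data-only C# class from a type name and (field, type) pairs.

    Hypothetical helper: it only prints fields, so any behaviour the
    original class had is not reproduced.
    """
    namespace, _, class_name = full_name.rpartition(".")
    indent = "    " if namespace else ""
    lines = []
    if namespace:
        lines += [f"namespace {namespace}", "{"]
    lines += [
        f"{indent}[Serializable]",
        f"{indent}public class {class_name}",
        f"{indent}{{",
    ]
    for field_name, field_type in members:
        lines.append(f"{indent}    public {field_type} {field_name};")
    lines.append(f"{indent}}}")
    if namespace:
        lines.append("}")
    return "\n".join(lines)


# Example input; the names are invented, not taken from any real file.
print(render_holder_class("MyApp.Person", [("name", "string"), ("age", "int")]))
```

The output is only a holder for the data. Whether it is useful depends on how much of the original type's behaviour you need.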

Laporte answered 2/9, 2013 at 22:43 Comment(0)
J
2

It is impossible to deserialize binary data without knowing what's in it. The only way around this is to serialize with a format such as JSON or XML instead. An example to illustrate:

Your name "Casual" can be serialized in this way: 67,97,115,117,97,108. In case you didn't notice, this uses the ASCII encoding (if I didn't make any mistakes). Now imagine you don't know that ASCII was used: who says this is not just an array of numbers? Or 3 arrays of 2 numbers? Or an object with ID 67 and an object with ID 117? Nobody knows, so your task is impossible.
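The numbers in this example can be checked with a few lines of Python. This is only a sketch of the raw-bytes argument made here. It says nothing about what the .NET BinaryFormatter actually writes.

```python
name = "Casual"

# Encode the text as ASCII and look at the byte values.
codes = list(name.encode("ascii"))
print(codes)  # [67, 97, 115, 117, 97, 108]

# The same six bytes read under a different, equally arbitrary assumption:
# three arrays of two numbers each.
pairs = [codes[i:i + 2] for i in range(0, len(codes), 2)]
print(pairs)  # [[67, 97], [115, 117], [97, 108]]

# Once the encoding is known, the original text comes back.
print(bytes(codes).decode("ascii"))  # Casual
```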

The only option is to contact the person who originally serialized it and ask how it was done and which objects are serialized in this binary data.

Kind regards

Julenejulep answered 15/8, 2013 at 10:34 Comment(2)
We are talking about .NET binary serialization, not customized serialization. My question wasn't whether I can deserialize an unknown object. (By the way, it's possible thanks to the metadata included --> questions/17996701/binary-deserialization-without-object-definition) My question was whether you can serialize with the BinaryFormatter (not customized!) without that much metadata, the kind that would help anyone restore the original class definition of the object. Thank you for the answer anyway :)Competitive
@Casual "My question was if you can serialize with the Binaryformatter (not customized!) without that much metadata, that would aid anyone to restore the original class definition of the object": no, that isn't what you asked. It isn't the same as "Is there such case when it's impossible to restore the class definition from the serialized data?"Canty
S
2

Q1: Why is there no tool that recreates the class definition from a binary serialized file/data?

My guess is that very few people need this. To start with, binary serialization isn't as popular as XML, JSON and other formats which are standardized and are supported virtually anywhere.

There's no documentation on the binary format. One needs to dig into .NET Framework sources to understand it. It's not fun.

Q2: Is there such case when it's impossible to restore the class definition from the serialized data?

It looks like the binary format contains enough data. If you absolutely need a tool to reverse engineer the original classes and their fields from the serialized files, you can start by reading the sources of System.Runtime.Serialization.Formatters.Binary.BinaryFormatter, System.Runtime.Serialization.Formatters.Binary.ObjectReader and other classes from mscorlib.
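To give a feel for how much the stream reveals, here is a minimal Python sketch that pulls the first class definition out of a BinaryFormatter stream. The function names are made up. The record layout and numeric codes follow my reading of the MS-NRBF format specification (the MSDN link in the comments under the question, as far as I can tell), so treat them as assumptions and check them against the spec. It only handles a header, optional BinaryLibrary records and one ClassWithMembersAndTypes record, and it gives up on anything else.

```python
import io
import struct

# Primitive type codes as I read them in MS-NRBF; an assumption to verify.
PRIMITIVES = {1: "bool", 2: "byte", 3: "char", 5: "decimal", 6: "double",
              7: "short", 8: "int", 9: "long", 10: "sbyte", 11: "float",
              12: "TimeSpan", 13: "DateTime", 14: "ushort", 15: "uint",
              16: "ulong", 18: "string"}


def read_int32(stream):
    return struct.unpack("<i", stream.read(4))[0]


def read_string(stream):
    """Length-prefixed string: 7-bit encoded length, then UTF-8 bytes."""
    length = shift = 0
    while True:
        byte = stream.read(1)[0]
        length |= (byte & 0x7F) << shift
        if byte < 0x80:
            return stream.read(length).decode("utf-8")
        shift += 7


def read_member_type(stream, binary_type):
    """Turn a binary type code (plus any extra info) into a C# type name."""
    if binary_type == 0:        # Primitive: one byte with the primitive code
        return PRIMITIVES.get(stream.read(1)[0], "object")
    if binary_type == 3:        # SystemClass: type name follows
        return read_string(stream)
    if binary_type == 4:        # Class: type name and library id follow
        type_name = read_string(stream)
        read_int32(stream)
        return type_name
    if binary_type == 7:        # PrimitiveArray: element primitive code
        return PRIMITIVES.get(stream.read(1)[0], "object") + "[]"
    return {1: "string", 2: "object", 5: "object[]", 6: "string[]"}[binary_type]


def first_class_definition(data):
    """Return (class name, [(member name, member type), ...])."""
    stream = io.BytesIO(data)
    if stream.read(1) != b"\x00":
        raise ValueError("missing SerializationHeaderRecord")
    stream.read(16)             # RootId, HeaderId, MajorVersion, MinorVersion
    while True:
        record = stream.read(1)
        if record == b"\x0c":   # BinaryLibrary: library id and name
            read_int32(stream)
            read_string(stream)
        elif record == b"\x05":  # ClassWithMembersAndTypes
            read_int32(stream)  # object id
            name = read_string(stream)
            count = read_int32(stream)
            names = [read_string(stream) for _ in range(count)]
            # All type codes come first, then the extra info in the same order.
            codes = list(stream.read(count))
            types = [read_member_type(stream, code) for code in codes]
            return name, list(zip(names, types))
        else:
            raise ValueError(f"record type {record!r} not handled in this sketch")
```

With the file from the question it could be called as `first_class_definition(open("someSerializedData.bin", "rb").read())`. A real tool would also have to walk the member values, nested classes, arrays and object references that follow, which this sketch leaves out.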

However, if the application which produced the files isn't obfuscated, I suggest trying to decompile it first. It will likely be much easier.

P.S. Don't forget to consult your lawyer.

Saccharose answered 3/9, 2013 at 14:2 Comment(2)
#Athari What do you mean by "Don't forget to consult your lawyer." ? :PCompetitive
@Casual Decompiling software you don't own is illegal in many countries. And by owning commercial software I mean actually owning it, not having bought a copy for use, as many EULAs disallow decompiling. There are some countries, including Australia and Russia, which always allow decompiling for specific purposes, overriding all EULAs. I'm not sure about reverse-engineering file formats and protocols. On the whole, if you're not sure, it's a good idea to consult a lawyer. P.S. It's "@", not "#". And notifications about comments are always sent to the question's and answer's authors.Saccharose
C
1

I am not sure there's enough information in the metadata to re-create the type. Imagine complex (for example nested) object graphs. In your previous question, member types (String vs int) were an issue.

Regarding your second question, I am not sure what you are trying to achieve. I don't know whether you can make the BinaryFormatter output data in a way that is hard to reverse engineer, but other methods should be simple to implement.

Christenachristendom answered 3/9, 2013 at 6:20 Comment(1)
string vs int was only an issue because he made a guess at the types instead of reading the actual binary data.Tahmosh
