Java serialization, Kryo and the object graph

Let's say I have an array arr of objects of type A in memory, each of which has a reference field pointing to the same object B.

Illustration:

A_1  A_2  A_3 ... A_N
 |    |    |       |
 |    |    V       |
 \--->\--> B <-----/

Note that the reference field in every object of type A points to the same object of type B.

Now, I serialize the array arr containing objects of type A to an ObjectOutputStream. I then deserialize the bytes obtained in this way.

I get a new array arr1.

1) Does the array arr1 have objects of type A such that they all point to the same object of type B? (I don't mean the same object before serialization, but a unique newly created object of type B)

2) In other words, does calling serialize/deserialize in Java retain the same object graph as it was before serialization? (i.e. is the newly deserialized object graph isomorphic to the old one)

3) Where is this documented? (i.e. please provide a citation)

4) The same questions 1-3, but applied to the Kryo serialization framework for Java.

Thank you.

Farandole answered 16/9, 2012 at 12:26 Comment(0)

http://docs.oracle.com/javase/6/docs/api/java/io/ObjectOutputStream.html

The default serialization mechanism for an object writes the class of the object, the class signature, and the values of all non-transient and non-static fields. References to other objects (except in transient or static fields) cause those objects to be written also. Multiple references to a single object are encoded using a reference sharing mechanism so that graphs of objects can be restored to the same shape as when the original was written.

As I understand the specification, object references are shared as long as the instances in question go through the same ObjectOutputStream.

So when you serialize the arr array (or an object containing it), each object written to the stream gets an ID, and every later reference to that object that passes through the stream is written as that ID only. The deserialized graph in that case has the same shape as the original graph, i.e. it is isomorphic to it.
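To make this concrete, here is a minimal sketch of the round trip described in the question. The class names A, B and SharedReferenceDemo and the helper roundTrip are hypothetical, chosen only to mirror the question; they are not taken from any library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SharedReferenceDemo {
    // Hypothetical stand-ins for the question's types A and B.
    static class B implements Serializable {
        private static final long serialVersionUID = 1L;
        int value = 42;
    }

    static class A implements Serializable {
        private static final long serialVersionUID = 1L;
        final B b;
        A(B b) { this.b = b; }
    }

    // Writes the array to an in-memory ObjectOutputStream and reads it back.
    static A[] roundTrip(A[] arr) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(arr);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (A[]) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        B shared = new B();
        A[] arr = { new A(shared), new A(shared), new A(shared) };
        A[] arr1 = roundTrip(arr);

        // All deserialized A objects point to one and the same new B.
        System.out.println(arr1[0].b == arr1[1].b); // true
        System.out.println(arr1[1].b == arr1[2].b); // true
        // That B is a fresh copy, not the original instance.
        System.out.println(arr1[0].b == shared);    // false
    }
}
```

The identity comparisons (==) are what matter here: they show that the shape of the graph is preserved, not merely the field values.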

I am sorry, but I cannot help with the Kryo library's own serialization mechanism; I would be very happy to learn from someone who has used it.

EDIT about Kryo:

Some documentation I found:

  • By default, each appearance of an object in the graph after the first is stored as an integer ordinal. This allows multiple references to the same object and cyclic graphs to be serialized. This has a small amount of overhead and can be disabled to save space if it is not needed: kryo.setReferences(false);

  • This (github) is the contract of the reference resolver; two implementations are given: ArrayList-based for small object graphs, Map-based for larger ones

  • This is the implementation of the default object array (de)serializer

  • Classes need to be registered for (de)serialization; each registered class can be coupled with a serializer (among which, the default Java serialization mechanism)

Dyer answered 16/9, 2012 at 12:35 Comment(2)
Through the same ObjectOutputStream, or additionally through the same invocation of the writeObject method? - Farandole
Just through the same stream; think of two objects referencing each other. You write the first (and thus the reference to the second) to the stream. When you then write the second to the same stream, which has IDs for both, the references get shared. - Dyer
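
The scenario in the last comment can be sketched as follows, assuming a hypothetical Node class with a single reference field. The names SameStreamDemo, Node, writeAll and readAll are illustrative only. Two mutually referencing objects are written with two separate writeObject calls, first to one stream and then, for contrast, to two different streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SameStreamDemo {
    // Hypothetical class: two Nodes will reference each other.
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        Node other;
    }

    // Writes each object with its own writeObject call to ONE stream.
    static byte[] writeAll(Object... objects) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                for (Object o : objects) {
                    out.writeObject(o);
                }
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads `count` objects back from one stream.
    static Object[] readAll(byte[] data, int count) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            Object[] result = new Object[count];
            for (int i = 0; i < count; i++) {
                result[i] = in.readObject();
            }
            return result;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    static Node[] linkedPair() {
        Node first = new Node();
        Node second = new Node();
        first.other = second;
        second.other = first;
        return new Node[] { first, second };
    }

    // Two writeObject calls, same stream: identities are shared.
    static boolean sharedOnSameStream() {
        Node[] pair = linkedPair();
        Object[] back = readAll(writeAll(pair[0], pair[1]), 2);
        Node f = (Node) back[0];
        Node s = (Node) back[1];
        return f.other == s && s.other == f;
    }

    // One stream per object: each stream produces its own copy of the pair.
    static boolean sharedAcrossStreams() {
        Node[] pair = linkedPair();
        Node f = (Node) readAll(writeAll(pair[0]), 1)[0];
        Node s = (Node) readAll(writeAll(pair[1]), 1)[0];
        return f.other == s;
    }

    public static void main(String[] args) {
        System.out.println(sharedOnSameStream());  // true
        System.out.println(sharedAcrossStreams()); // false
    }
}
```

Note that the sharing depends on the stream keeping its record of already-written objects: calling reset() on the ObjectOutputStream between the writes, or using writeUnshared, would give up that sharing.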
