What are the pros and cons of Java serialization vs Kryo serialization?

In Spark, Java serialization is the default. If Kryo is so much more efficient, why isn't it the default? Are there downsides to using Kryo, and in which scenarios should we use Kryo rather than Java serialization?

Radiotherapy answered 20/11, 2019 at 4:59 Comment(0)

Here is what the documentation says:

Kryo is significantly faster and more compact than Java serialization (often as much as 10x), but does not support all Serializable types and requires you to register the classes you’ll use in the program in advance for best performance.

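So it is not used by default because:

  1. Not every java.io.Serializable type is supported out of the box. A custom class that implements Serializable is not guaranteed to work well with Kryo as-is; it may need to be registered or given a custom serializer.
  2. You need to register your custom classes to get the best performance. Per the documentation, Kryo still works with unregistered classes, but it then has to store the full class name with each object, which is wasteful.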

Note that, according to the documentation:

Spark automatically includes Kryo serializers for the many commonly-used core Scala classes covered in the AllScalaRegistrar from the Twitter chill library.
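
As a rough illustration of what switching to Kryo and registering classes looks like, here is a minimal spark-defaults.conf sketch. The property names are standard Spark configuration keys; the class names com.example.Point and com.example.Segment are hypothetical placeholders for your own classes.

```properties
# Use Kryo instead of the default Java serializer.
spark.serializer                 org.apache.spark.serializer.KryoSerializer
# Comma-separated list of custom classes to register (hypothetical names).
spark.kryo.classesToRegister     com.example.Point,com.example.Segment
# Optional: fail fast when an unregistered class is serialized (default is false).
spark.kryo.registrationRequired  true
```

The same keys can be passed with --conf on spark-submit or set on a SparkConf. Leaving spark.kryo.registrationRequired at its default keeps unregistered classes working, at the cost of writing their full class names.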

Xerophthalmia answered 20/11, 2019 at 9:3 Comment(4)
"Not every serializable is supported": I didn't understand this part, @Vladislav Varslavans. Could you give an example? Doesn't registering the class make it serializable? - Radiotherapy
I have updated my answer. Hopefully it brings clarity. - Xerophthalmia
Is registering a class an overhead? I mean, as long as I am getting an improvement in memory and time, I have no problem registering 10 classes. Or did I misunderstand something? - Brunabrunch
I don't think it's an overhead. At least, the documentation does not mention anything about that. - Xerophthalmia

Kryo pros: memory consumption is low.
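
To get a feel for where the size difference comes from on the Java side, here is a small sketch that uses only the JDK (no Spark or Kryo). It serializes a hypothetical two-int Point class with plain Java serialization and checks that the stream carries the class name alongside the 8 bytes of field data. The class and method names are made up for illustration, and this measures only Java serialization, not Kryo.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

public class JavaSerializationOverhead {
    // Hypothetical record type: two ints, i.e. 8 bytes of actual field data.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Serialize one object with plain Java serialization and return the bytes.
    static byte[] javaSerialize(Object value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    // True if the serialized stream embeds the given class name as text.
    static boolean containsClassName(byte[] bytes, String className) {
        // ISO-8859-1 maps every byte to one char, so ASCII names survive intact.
        String asText = new String(bytes, StandardCharsets.ISO_8859_1);
        return asText.contains(className);
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = javaSerialize(new Point(1, 2));
        System.out.println("field data: 8 bytes, serialized: " + bytes.length + " bytes");
        System.out.println("class name embedded: "
                + containsClassName(bytes, Point.class.getName()));
    }
}
```

The class descriptor written by Java serialization is the kind of per-object metadata that Kryo avoids when a class is registered, since a registered class is identified by a small numeric ID instead of its name.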

The one time Kryo didn't work for me as-is was when I was dealing with Google protobufs. That's when I first had to register the proto class.

https://mvnrepository.com/artifact/de.javakaffee/kryo-serializers/0.45

Inaction answered 20/11, 2019 at 17:1 Comment(0)
