The accepted answer from 2013 proposes a library that is no longer maintained. There are many similar questions on Stack Overflow, but I couldn't find a good answer that meets the following criteria:
- serialization/deserialization should be fast
- high-performance data exchange over the wire, where you only encode as much metadata as you need
- supports schema evolution, so that changing the serialized object (e.g. a case class) doesn't break deserialization of previously written data
I recommend against using low-level JDK SerDes (like ByteArrayOutputStream and ByteArrayInputStream). Supporting schema evolution becomes a pain, and it's difficult to make it work with external services (e.g. Thrift), since you have no control over whether the data sent back was written with the same type of stream.
Some people use JSON, with libraries like json4s, but it is not well suited to message transfer in distributed computing. It marshals data as a JSON string, so it is both slower and less storage-efficient: every character in the string takes at least 8 bits, and field names are repeated in every message.
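To make the storage point concrete, here is a minimal sketch that uses only the JDK (no JSON or MessagePack library). The object name SizeComparison and the sample value are hypothetical, chosen purely for illustration; the JSON document is built by hand rather than with json4s.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

object SizeComparison {
  // Hypothetical sample value, picked only to illustrate the size difference.
  val number: Long = 1234567890123L

  // JSON carries the field name and the number as text: one byte per ASCII character in UTF-8.
  val jsonBytes: Array[Byte] =
    s"""{"number":$number}""".getBytes(StandardCharsets.UTF_8)

  // A fixed-width binary encoding stores any Long in exactly 8 bytes.
  val binaryBytes: Array[Byte] =
    ByteBuffer.allocate(java.lang.Long.BYTES).putLong(number).array()

  def main(args: Array[String]): Unit = {
    println(s"JSON text: ${jsonBytes.length} bytes")
    println(s"binary:    ${binaryBytes.length} bytes")
  }
}
```

For this value the JSON text takes 24 bytes and the fixed-width binary form takes 8. For very small numbers plain text can actually be shorter than a fixed 8 bytes, which is one reason binary formats tend to use variable-length integer encodings.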
I highly recommend the MessagePack binary serialization format, and reading the spec to understand the encoding specifics. It has implementations in many different languages. Here's a generic example I wrote for a Scala case class that you can copy-paste into your code.
import java.nio.ByteBuffer
import java.util.concurrent.TimeUnit

import org.msgpack.core.MessagePack

case class Data(message: String, number: Long, timeUnit: TimeUnit, price: Long)

object Data extends App {

  // Pack the fields in a fixed order; deserialize must read them back in the same order.
  def serialize(data: Data): ByteBuffer = {
    val packer = MessagePack.newDefaultBufferPacker
    packer
      .packString(data.message)
      .packLong(data.number)
      .packString(data.timeUnit.toString)
      .packLong(data.price)
    packer.close()
    ByteBuffer.wrap(packer.toByteArray)
  }

  def deserialize(data: ByteBuffer): Data = {
    val unpacker = MessagePack.newDefaultUnpacker(convertDataToByteArray(data))
    val newdata = Data.apply(
      message = unpacker.unpackString(),
      number = unpacker.unpackLong(),
      timeUnit = TimeUnit.valueOf(unpacker.unpackString()),
      price = unpacker.unpackLong()
    )
    unpacker.close()
    newdata
  }

  // Copy the remaining bytes via duplicate(), so the caller's buffer position is left untouched.
  def convertDataToByteArray(data: ByteBuffer): Array[Byte] = {
    val buffer = Array.ofDim[Byte](data.remaining())
    data.duplicate().get(buffer)
    buffer
  }

  println(deserialize(serialize(Data("Hello world!", 1L, TimeUnit.DAYS, 3L))))
}
It will print:
Data(Hello world!,1,DAYS,3)
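
The example above writes its fields positionally, which is compact but means that adding, removing, or reordering a field changes what older readers and writers expect. One possible way to get the schema evolution listed in the criteria is to pack each record as a MessagePack map keyed by field name, skip keys the reader doesn't know, and fall back to defaults for keys that are missing. The sketch below is an assumption about how that could look, not part of the original example: Quote and QuoteCodec are hypothetical names, and the "currency" field stands in for a field added in a later version. It relies only on the msgpack-core calls packMapHeader, unpackMapHeader, and skipValue.

```scala
import org.msgpack.core.MessagePack

// Hypothetical record: `currency` is assumed to have been added in a later version.
final case class Quote(message: String, price: Long, currency: String = "USD")

object QuoteCodec {

  // Write the record as a map of field name -> value instead of a positional sequence.
  def serialize(q: Quote): Array[Byte] = {
    val packer = MessagePack.newDefaultBufferPacker()
    packer.packMapHeader(3)
    packer.packString("message").packString(q.message)
    packer.packString("price").packLong(q.price)
    packer.packString("currency").packString(q.currency)
    packer.close()
    packer.toByteArray
  }

  // Read whatever keys are present: unknown keys are skipped, missing keys keep their defaults.
  def deserialize(bytes: Array[Byte]): Quote = {
    val unpacker = MessagePack.newDefaultUnpacker(bytes)
    try {
      var message = ""
      var price = 0L
      var currency = "USD"
      val entries = unpacker.unpackMapHeader()
      for (_ <- 0 until entries) {
        unpacker.unpackString() match {
          case "message"  => message = unpacker.unpackString()
          case "price"    => price = unpacker.unpackLong()
          case "currency" => currency = unpacker.unpackString()
          case _          => unpacker.skipValue() // field written by a newer version
        }
      }
      Quote(message, price, currency)
    } finally unpacker.close()
  }
}
```

The trade-off is that field names now travel with every message, which works against the "only as much metadata as you need" criterion. Short keys or small integer tags would reduce that overhead while keeping the same skip-unknown, default-missing behaviour.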