Fastest possible JavaScript object serialization with Google V8
I need to serialize moderately complex objects with tens to hundreds of mixed-type properties.

JSON was used originally, then I switched to BSON which is marginally faster.

Encoding 10000 sample objects

JSON:        1807 ms
BSON:        1687 ms
MessagePack: 2644 ms (JS, modified for BinaryF)

I want an order-of-magnitude improvement; serialization is having a ridiculously bad impact on the rest of the system.

Part of the motivation for moving to BSON is the requirement to encode binary data, so JSON is (now) unsuitable. And because JSON simply skips the binary data present in the objects, it is "cheating" in those benchmarks.
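For reference, numbers like these can be reproduced with a tiny harness. This is only a sketch (Node.js, with a made-up sample object, not the asker's actual data) timing `JSON.stringify` over 10000 iterations:

```javascript
// Minimal serialization micro-benchmark (Node.js).
// The sample object shape is illustrative only.
const sample = {
  param1: "name",
  param2: { paramA: 1, paramB: [0, 1, 2], paramC: Buffer.from([0xde, 0xad]) },
};

function bench(label, encode, iterations = 10000) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) encode(sample);
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(`${label}: ${ms.toFixed(0)} ms`);
  return ms;
}

bench("JSON", (obj) => JSON.stringify(obj));
```

Swapping the `encode` callback for a BSON or MessagePack encoder gives directly comparable figures on the same payload.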

Profiled BSON performance hot-spots

  • (unavoidable?) conversion of UTF16 V8 JS strings to UTF8.
  • malloc and string ops inside the BSON library
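If the UTF-16 to UTF-8 transcode really is the hot spot, one experiment (sketched here with Node's `Buffer`; whether it pays off depends on the consumer accepting UTF-16) is to copy strings out in their native encoding instead:

```javascript
const s = "héllo wörld";

// UTF-8 path: every UTF-16 code unit must be transcoded.
const utf8 = Buffer.from(s, "utf8");

// UTF-16LE path: closer to a raw copy of V8's internal representation
// (for strings V8 stores as two-byte), so it can avoid the transcode.
const utf16 = Buffer.from(s, "utf16le");

console.log(utf8.length, utf16.length); // byte lengths differ
console.log(utf16.toString("utf16le") === s); // round-trips
```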

The BSON encoder is based on the Mongo BSON library.

A native V8 binary serializer might be wonderful, yet as JSON is native and already quick to serialize, I fear even that might not provide the answer. Perhaps my best bet is to optimize the heck out of the BSON library, or to write my own, plus figure out a far more efficient way to pull strings out of V8. One tactic might be to add UTF-16 support to BSON.

So I'm here for ideas, and perhaps a sanity check.

Edit

Added MessagePack benchmark. This was modified from the original JS to use BinaryF.

The C++ MessagePack library may offer further improvements, I may benchmark it in isolation to compare directly with the BSON library.

Cappella answered 2/6, 2011 at 18:13 Comment(8)
Maybe you could provide a jsperf.com test case to aid in understanding the type of data you need to storeAmberjack
Just standard JS objects: {param1:"name",param2:{paramA:1,paramB:[0x0,0x1,0x2],paramC:<BINARY>}} with up to 100 properties, arbitrarily nested, some of which will contain byte arrays using CommonJS BinaryF. Without BinaryF and a BSON serializer, it is impossible to make any useful comparisons.Cappella
Do you have any link/references to what you used for BSON, MsgPack, etc?Gosney
How applicable are those benchmarks as of 2013? PS: You have been able to pack binary objects into JSON via window.btoa for a long time.Triptych
I have not run these tests for a long time. AFAIK window.btoa/atob are base64 so would need additional processing (like JSON.stringify/parse) to get to/from base64 compatible data, so it would be even slower.Cappella
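To make the base64 overhead from that comment concrete (a Node `Buffer` sketch; `window.btoa`/`atob` in a browser behave similarly), the binary must be expanded by roughly a third and then still passed through JSON:

```javascript
const binary = Buffer.from([0x00, 0x01, 0x02, 0xff]);

// Extra pass 1: binary -> base64 text (~4/3 size growth).
const b64 = binary.toString("base64");

// Extra pass 2: the base64 text still goes through JSON.stringify.
const json = JSON.stringify({ paramC: b64 });

// Decoding needs the reverse two passes:
const back = Buffer.from(JSON.parse(json).paramC, "base64");
console.log(back.equals(binary)); // true
```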
@McTrousers It'd be interesting to see what solution you came up with, eventually. I'm looking for a way to persist V8's objects as well and have not found a good enough solution so far.Osi
In the five years since this was asked and the benchmarks in the question were run, v8 and other JS engines have done a ton of optimization to JSON de/serialization. JSON is now about 8x faster than BSON, and the fastest msgpack lib is slightly slower than JSON.Jaquesdalcroze
And in the two years since I last commented... the fastest msgpack lib AFAIK is now notepack.io, which is faster than JSON.stringify in v8 in many cases.Jaquesdalcroze
T
10

For serialization / deserialization, protobuf is pretty tough to beat. I don't know if you can switch out the transport protocol, but if you can, protobuf should definitely be considered.

Take a look at all the answers to Protocol Buffers versus JSON or BSON.

The accepted answer chooses thrift. It is however slower than protobuf. I suspect it was chosen for ease of use (with Java) not speed. These Java benchmarks are very telling.
Of note

  • MongoDB-BSON 45042
  • protobuf 6539
  • protostuff/protobuf 3318

The benchmarks are Java; I'd imagine that you can achieve speeds near the protostuff implementation of protobuf, i.e. 13.5 times faster. Worst case (if for some reason Java is just better at serialization) you can do no worse than the plain unoptimized protobuf implementation, which runs 6.8 times faster.
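Much of protobuf's size/speed advantage comes from its wire format: integers are varint-encoded, 7 payload bits per byte with the high bit as a continuation flag. A hand-rolled sketch of just that encoding (not a protobuf library API, only the idea):

```javascript
// Encode a non-negative integer as a protobuf-style varint:
// low 7 bits per byte, MSB set on every byte except the last.
function encodeVarint(n) {
  const bytes = [];
  do {
    let byte = n & 0x7f;
    n >>>= 7;
    if (n !== 0) byte |= 0x80;
    bytes.push(byte);
  } while (n !== 0);
  return Buffer.from(bytes);
}

console.log(encodeVarint(1));   // <Buffer 01>  -- one byte
console.log(encodeVarint(300)); // <Buffer ac 02> -- two bytes
```

Small flag- and counter-like values, which dominate many real payloads, thus cost a single byte each.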

Thieve answered 14/6, 2011 at 20:21 Comment(2)
Thank you for the benchmarks. The data is user-generated, so the PB protocol would need to carry key/value pairs, undermining some of its performance benefits, but I expect it would still come out on top by a large margin. It doesn't support UTF16, though adding that type shouldn't be difficult.Cappella
I have stuck with BSON and optimize it incrementally as and when. But I may well return to protobuf at some point. Thanks.Cappella

I made a recent (2020) article and benchmark comparing binary serialization libraries in JavaScript.

The following formats and libraries are compared:

  • Protocol Buffer: protobuf-js, pbf, protons, google-protobuf
  • Avro: avsc
  • BSON: bson
  • BSER: bser
  • JSBinary: js-binary

Based on the current benchmark results I would rank the top libraries in the following order (higher values are better, measurements are given as x times faster than JSON):

  1. avsc: 10x encoding, 3-10x decoding
  2. js-binary: 2x encoding, 2-8x decoding
  3. protobuf-js: 0.5-1x encoding, 2-6x decoding
  4. pbf: 1.2x encoding, 1.0x decoding
  5. bser: 0.5x encoding, 0.5x decoding
  6. bson: 0.5x encoding, 0.7x decoding

I did not include msgpack in the benchmark as it is currently slower than the built-in JSON library according to its NPM description.

For details, see the full article.
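The "x times faster than JSON" figures above are just each library's throughput divided by a `JSON.stringify`/`JSON.parse` baseline on the same payload. A minimal way to compute such ratios (the competing encoder is stubbed here as a plain function):

```javascript
// Measure encoder throughput in operations per second.
function opsPerSec(encode, payload, iterations = 5000) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) encode(payload);
  const seconds = Number(process.hrtime.bigint() - start) / 1e9;
  return iterations / seconds;
}

const payload = { id: 1, tags: ["a", "b"], nested: { ok: true } };
const jsonOps = opsPerSec((p) => JSON.stringify(p), payload);

// Any other encoder is then reported relative to the JSON baseline;
// JSON measured against itself naturally lands near 1.0.
const ratio = opsPerSec((p) => JSON.stringify(p), payload) / jsonOps;
console.log(ratio.toFixed(1));
```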

Earwig answered 29/7, 2020 at 14:17 Comment(0)

Take a look at MessagePack. It's compatible with JSON. From the docs:

Fast and Compact Serialization

MessagePack is a binary-based efficient object serialization library. It enables to exchange structured objects between many languages like JSON. But unlike JSON, it is very fast and small.

Typical small integer (like flags or error code) is saved only in 1 byte, and typical short string only needs 1 byte except the length of the string itself. [1,2,3] (3 elements array) is serialized in 4 bytes using MessagePack as follows:
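That 4-byte claim is easy to verify against the MessagePack spec by hand: small arrays get a one-byte "fixarray" tag (0x90 | length), and integers 0–127 encode as themselves:

```javascript
// Hand-encode [1, 2, 3] per the MessagePack spec:
// fixarray tag 0x90 | 3, then three positive fixints.
const encoded = Buffer.from([0x90 | 3, 1, 2, 3]);

console.log(encoded.length); // 4 bytes, as the docs claim
console.log(encoded); // <Buffer 93 01 02 03>

// The same array as JSON text is nearly twice the size:
console.log(JSON.stringify([1, 2, 3]).length); // 7
```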

Foreandafter answered 3/6, 2011 at 8:24 Comment(3)
Fantastic! Definitely worth testing! I'll give the JS implementation a shot first.Cappella
Test MessagePack again now vs JSON :) JSON completely destroys it, same for BSON (even native BSON with C++ bindings). Yes, I know this question is roughly 5 years old, but man, JSON is pretty hard to beat right now. I do give MsgPack props for acknowledging defeat on their GitHub page though, ahaha. -- It's not even closeWipe
I just compared the serialized string returned by MessagePack vs JSON in one of my tests: JSON.stringify generated a string of length 1011, compared to MessagePack's length of 2231.Kirst

If you are more interested in de-serialization speed, take a look at the JBB (Javascript Binary Bundles) library. It is faster than BSON or MsgPack.

From the Wiki, page JBB vs BSON vs MsgPack:

...

  • JBB is about 70% faster than Binary-JSON (BSON) and about 30% faster than MsgPack on decoding speed, even with one negative test-case (#3).
  • JBB creates files that (even their compressed versions) are about 61% smaller than Binary-JSON (BSON) and about 55% smaller than MsgPack.

...

Unfortunately, it's not a streaming format, meaning that you must pre-process your data offline. However, there is a plan to convert it into a streaming format (check the milestones).

Chambers answered 26/7, 2016 at 10:37 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.