
2020-12-31

Golang Serialization Benchmark 2020 Edition

These benchmark results, taken from alecthomas' repo, are quite interesting because they include several new serialization formats. Let's look at the cost of serialization first, fastest on top:

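| Format | type | ns/op | bytes/op | allocs/op | ns/alloc |
|---|---|---|---|---|---|
| Mum | ser | 97 | 48 | 0 | 0.00 |
| GencodeUnsafe | ser | 98 | 46 | 48 | 2.05 |
| Colfer | ser | 124 | 51 | 64 | 1.94 |
| Bebop | ser | 124 | 55 | 64 | 1.94 |
| Gotiny | ser | 130 | 48 | 0 | 0.00 |
| GotinyNoTime | ser | 136 | 48 | 0 | 0.00 |
| Gogoprotobuf | ser | 147 | 53 | 64 | 2.30 |
| XDR2 | ser | 159 | 60 | 64 | 2.48 |
| Msgp | ser | 174 | 97 | 128 | 1.36 |
| Gencode | ser | 186 | 53 | 80 | 2.33 |
| FlatBuffers | ser | 298 | 95 | 0 | 0.00 |
| Goprotobuf | ser | 317 | 53 | 64 | 4.95 |
| CapNProto | ser | 386 | 96 | 56 | 6.89 |
| CapNProto2 | ser | 586 | 96 | 244 | 2.40 |
| Hprose2 | ser | 604 | 85 | 0 | 0.00 |
| Ikea | ser | 670 | 55 | 72 | 9.31 |
| ShamatonArrayMsgpack | ser | 758 | 50 | 176 | 4.31 |
| Protobuf | ser | 801 | 52 | 152 | 5.27 |
| ShamatonMapMsgpack | ser | 819 | 92 | 208 | 3.94 |
| Gob | ser | 834 | 63 | 48 | 17.38 |
| Hprose | ser | 915 | 85 | 402 | 2.28 |
| GoAvro2Binary | ser | 950 | 47 | 488 | 1.95 |
| VmihailencoMsgpack | ser | 1116 | 100 | 400 | 2.79 |
| Bson | ser | 1171 | 110 | 392 | 2.99 |
| Binary | ser | 1364 | 61 | 320 | 4.26 |
| UgorjiCodecMsgpack | ser | 1422 | 91 | 1312 | 1.08 |
| UgorjiCodecBinc | ser | 1468 | 95 | 1328 | 1.11 |
| EasyJson | ser | 1789 | 149 | 895 | 2.00 |
| XDR | ser | 1836 | 92 | 456 | 4.03 |
| Json | ser | 2212 | 150 | 208 | 10.63 |
| JsonIter | ser | 2392 | 139 | 248 | 9.65 |
| GoAvro | ser | 2828 | 47 | 1008 | 2.81 |
| Sereal | ser | 2936 | 132 | 904 | 3.25 |
| GoAvro2Text | ser | 2975 | 134 | 1320 | 2.25 |
| SSZNoTimeNoStringNoFloatA | ser | 4922 | 55 | 440 | 11.19 |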

As we can see, Mum, Gencode, Colfer, Bebop, Gotiny, XDR2, and MsgPack win in terms of serialization speed (serialized size is a separate cost, listed in the bytes/op column). Now let's check deserialization performance, fastest on top:

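| Format | type | ns/op | bytes/op | allocs/op | ns/alloc |
|---|---|---|---|---|---|
| Bebop | des | 104 | 55 | 32 | 3.25 |
| XDR2 | des | 131 | 60 | 32 | 4.09 |
| GencodeUnsafe | des | 161 | 46 | 96 | 1.68 |
| Colfer | des | 197 | 50 | 112 | 1.76 |
| Mum | des | 216 | 48 | 80 | 2.70 |
| Gencode | des | 222 | 53 | 112 | 1.98 |
| Gogoprotobuf | des | 230 | 53 | 96 | 2.40 |
| GotinyNoTime | des | 241 | 48 | 96 | 2.51 |
| FlatBuffers | des | 265 | 95 | 112 | 2.37 |
| Gotiny | des | 267 | 48 | 112 | 2.38 |
| Msgp | des | 314 | 97 | 112 | 2.80 |
| CapNProto | des | 443 | 96 | 200 | 2.21 |
| Goprotobuf | des | 481 | 53 | 168 | 2.86 |
| ShamatonArrayMsgpack | des | 483 | 50 | 144 | 3.35 |
| Hprose2 | des | 609 | 85 | 144 | 4.23 |
| ShamatonMapMsgpack | des | 738 | 92 | 144 | 5.12 |
| CapNProto2 | des | 778 | 96 | 320 | 2.43 |
| Protobuf | des | 790 | 52 | 192 | 4.11 |
| Ikea | des | 871 | 55 | 160 | 5.44 |
| Gob | des | 900 | 63 | 112 | 8.04 |
| GoAvro2Binary | des | 1092 | 47 | 560 | 1.95 |
| Hprose | des | 1195 | 85 | 319 | 3.75 |
| UgorjiCodecMsgpack | des | 1398 | 91 | 496 | 2.82 |
| Binary | des | 1511 | 61 | 320 | 4.72 |
| UgorjiCodecBinc | des | 1587 | 95 | 656 | 2.42 |
| Bson | des | 1694 | 110 | 232 | 7.30 |
| VmihailencoMsgpack | des | 1722 | 100 | 416 | 4.14 |
| EasyJson | des | 1724 | 150 | 288 | 5.99 |
| JsonIter | des | 1874 | 139 | 264 | 7.10 |
| XDR | des | 2255 | 90 | 235 | 9.60 |
| GoAvro2Text | des | 2826 | 134 | 799 | 3.54 |
| Sereal | des | 3377 | 132 | 1008 | 3.35 |
| Json | des | 4574 | 149 | 391 | 11.70 |
| GoAvro | des | 6962 | 47 | 3328 | 2.09 |
| SSZNoTimeNoStringNoFloatA | des | 7694 | 55 | 1392 | 5.53 |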

In this part, Bebop, XDR2, Gencode, Colfer, Mum, Gogoprotobuf, Gotiny, FlatBuffers, and MsgPack win. So if your bandwidth is unlimited, you can choose any of these formats as your serialization format. You can access the reformatted spreadsheet here. Here is the combined result: ns/op is the sum of the serialization and deserialization times, and bytes is the sum of the bytes/op values of the serialization and deserialization rows.

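| Format | ns/op | bytes |
|---|---|---|
| Bebop | 228 | 110 |
| GencodeUnsafe | 259 | 92 |
| XDR2 | 290 | 120 |
| Mum | 313 | 96 |
| Colfer | 321 | 101 |
| GotinyNoTime | 377 | 96 |
| Gogoprotobuf | 377 | 106 |
| Gotiny | 397 | 96 |
| Gencode | 408 | 106 |
| Msgp | 488 | 194 |
| FlatBuffers | 563 | 190 |
| Goprotobuf | 798 | 106 |
| CapNProto | 829 | 192 |
| Hprose2 | 1,213 | 170 |
| ShamatonArrayMsgpack | 1,241 | 100 |
| CapNProto2 | 1,364 | 192 |
| Ikea | 1,541 | 110 |
| ShamatonMapMsgpack | 1,557 | 184 |
| Protobuf | 1,591 | 104 |
| Gob | 1,734 | 126 |
| GoAvro2Binary | 2,042 | 94 |
| Hprose | 2,110 | 170 |
| UgorjiCodecMsgpack | 2,820 | 182 |
| VmihailencoMsgpack | 2,838 | 200 |
| Bson | 2,865 | 220 |
| Binary | 2,875 | 122 |
| UgorjiCodecBinc | 3,055 | 190 |
| EasyJson | 3,513 | 299 |
| XDR | 4,091 | 182 |
| JsonIter | 4,266 | 278 |
| GoAvro2Text | 5,801 | 268 |
| Sereal | 6,313 | 264 |
| Json | 6,786 | 299 |
| GoAvro | 9,790 | 94 |
| SSZNoTimeNoStringNoFloatA | 12,616 | 110 |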
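
For a rough feel of how a combined ns/op figure like the ones above can be measured, here is a minimal sketch using Go's standard `testing.Benchmark` with `encoding/json` as a stand-in. The `record` struct, its fields, and the `roundTrip` and `benchRoundTrip` helpers are hypothetical names made up for this sketch; they are not the types or harness used by the benchmark repo, and the numbers it prints will differ from the tables.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// record is a hypothetical payload for this sketch only.
type record struct {
	Name  string  `json:"name"`
	Phone string  `json:"phone"`
	Money float64 `json:"money"`
}

// roundTrip serializes and then deserializes one record,
// which corresponds to one "ser + des" operation.
func roundTrip(in record) (record, error) {
	data, err := json.Marshal(in)
	if err != nil {
		return record{}, err
	}
	var out record
	err = json.Unmarshal(data, &out)
	return out, err
}

// benchRoundTrip runs roundTrip b.N times and reports timing and allocations.
func benchRoundTrip() testing.BenchmarkResult {
	in := record{Name: "alice", Phone: "555-0100", Money: 12.5}
	return testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			if _, err := roundTrip(in); err != nil {
				b.Fatal(err)
			}
		}
	})
}

func main() {
	testing.Init() // registers testing flags when running outside "go test"
	res := benchRoundTrip()
	fmt.Printf("%d ns/op, %d B/op, %d allocs/op\n",
		res.NsPerOp(), res.AllocedBytesPerOp(), res.AllocsPerOp())
}
```

Swapping the body of `roundTrip` for another library's encode and decode calls would give a comparable figure for that library on the same machine.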

Here are links to the high-performing libraries used:
  • Bebop codegen from .bop (you need to write a schema, like protobuf)
  • Gencode codegen from .schema (similar to Go syntax)
  • XDR2 codegen version (the other libs use reflection/automatic), unmaintained
  • Mum manual (you must write the serialize and deserialize methods yourself), new name is enkodo
  • Colfer codegen from .colf (similar to Go syntax)
  • Gogoprotobuf codegen from .proto
  • Gotiny codegen, not recommended for production
  • MsgPack codegen version (the other libs use reflection/automatic), generated from a .go source file using go:generate, supported in many languages
  • FlatBuffers codegen, can be used in gRPC
What is the difference between codegen, automatic, and manual?
  • codegen means there is a step that generates Go functions: you write a schema definition file, then run a program that converts it into an implementation for a specific programming language. Some libraries use their own schema format (so you cannot add custom tags to the generated .go file); others use a Go struct with tags (with the codegen version of msgpack above you label each property with `msg:"foo"`, so you can also add your own custom tags to the struct property, e.g. `json:"bar,omitempty" form:"baz" bson:"blabla"`).
  • manual means you must write the serialization and deserialization yourself for each property of the struct (the library is only a helper). This allows a highly flexible system: for example, you read 1 byte first, then if its value is 1 you read a string, if 2 you read an int32, and so on. This can also be useful for parsing network packets, as long as the endianness is the same.
  • automatic means you don't need to write any schema; it uses reflection, so it should be slower than a codegen version.