2021-11-22

Kafka vs RedPanda Benchmark (also Tarantool and Clickhouse as queue)

Using the default settings from their docker-compose examples, today we're going to benchmark some of the popular MQ/PubSub software. I've never used an MQ extensively before (only NATS, Google PubSub, ActiveMQ, and Amazon SQS); usually a standard database that stores the events is sufficient (the consumer pulls, tailing from the last primary key counter, and if fan-out is needed, just use multiple goroutines and channels), because my projects have never been latency-sensitive applications.
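That pull pattern looks roughly like this minimal sketch (the table name, columns, driver, and poll interval are just illustrations, not from any real project):

// Minimal sketch of the "database as queue" pull pattern described above.
// The events table, its columns, and the poll interval are illustrative only.
package main

import (
	"database/sql"
	"log"
	"time"

	_ "github.com/lib/pq" // any database/sql driver works; Postgres is just an example
)

func main() {
	db, err := sql.Open("postgres", "postgres://localhost/app?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}

	events := make(chan string, 1024)
	// fan-out: multiple worker goroutines reading from the same channel
	for w := 0; w < 4; w++ {
		go func() {
			for payload := range events {
				_ = payload // process the event here
			}
		}()
	}

	var lastID int64 // tail from the last consumed primary key
	for {
		rows, err := db.Query(
			`SELECT id, payload FROM events WHERE id > $1 ORDER BY id LIMIT 1000`, lastID)
		if err != nil {
			log.Println("poll failed:", err)
			time.Sleep(time.Second)
			continue
		}
		for rows.Next() {
			var id int64
			var payload string
			if err := rows.Scan(&id, &payload); err != nil {
				break
			}
			lastID = id
			events <- payload
		}
		rows.Close()
		time.Sleep(200 * time.Millisecond) // simple polling; fine when latency is not critical
	}
}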

Some issues:
  1. the benchmark has locking (atomic counters, sync.Map, etc.), so the consumers might not utilize all CPU cores.
  2. Confluent's Kafka docker image always errored on startup because /var/lib/kafka/data was not writable, so I bind /var/lib/kafka instead. Clickhouse also always failed to start when binding /var/lib/clickhouse/data, so I don't bind a volume for Clickhouse at all.
  3. RedPanda failed to start because of fs.aio-max-nr even when it was already ~1 million (originally only 64K), so I set it to 4194304 (see the snippet after this list).
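
For reference, the fs.aio-max-nr workaround from issue 3 can be applied like this (same value as mentioned above; persisting it via /etc/sysctl.conf is the usual approach):

sudo sysctl -w fs.aio-max-nr=4194304                            # apply immediately
echo 'fs.aio-max-nr = 4194304' | sudo tee -a /etc/sysctl.conf   # persist across reboots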
Benchmarking 1000 goroutines publishing 2000 messages each, with 100 goroutines consuming in parallel.
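
The harness roughly has this shape (just a sketch, not the actual code from the repo; an in-memory channel stands in for the broker client):

// Rough shape of the benchmark harness (a sketch, not the actual repo code):
// 1000 producers publishing 2000 messages each, 100 consumers, shared atomic
// counters, and a sync.Map to detect double consumption. An in-memory channel
// stands in for the broker client.
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

func main() {
	const producers, perProducer, consumers = 1000, 2000, 100
	queue := make(chan int64, 1<<16)
	var produced, consumed, doubles int64
	var seen sync.Map

	start := time.Now()
	var prodWG, consWG sync.WaitGroup

	for c := 0; c < consumers; c++ {
		consWG.Add(1)
		go func() {
			defer consWG.Done()
			for id := range queue {
				if _, dup := seen.LoadOrStore(id, true); dup {
					atomic.AddInt64(&doubles, 1) // DoubleConsume
				}
				atomic.AddInt64(&consumed, 1)
			}
		}()
	}

	for p := 0; p < producers; p++ {
		prodWG.Add(1)
		go func(p int) {
			defer prodWG.Done()
			for m := 0; m < perProducer; m++ {
				queue <- int64(p*perProducer + m) // unique message id
				atomic.AddInt64(&produced, 1)
			}
		}(p)
	}

	prodWG.Wait()
	close(queue)
	consWG.Wait()
	fmt.Println("produced:", produced, "consumed:", consumed,
		"doubles:", doubles, "total:", time.Since(start))
}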

REDPANDA version: v21.10.1 (rev e7b6714)

=== redpanda single:

FailProduce:  0
FailConsume:  0
DoubleConsume:  0
Produced (ms):  2387
MaxLatency (ms):  2125
AvgLatency (ms):  432
Total (s) 3.457646367s

FailProduce:  0
FailConsume:  0
DoubleConsume:  0
Produced (ms):  2408
MaxLatency (ms):  2663
AvgLatency (ms):  490
Total (s) 3.459949739s

=== redpanda multi:

FailProduce:  0
FailConsume:  0
DoubleConsume:  0
Produced (ms):  4187
MaxLatency (ms):  12146
AvgLatency (ms):  9701
Total (s) 13.610533861s 

# ^ weird, maybe the startup hadn't completed yet?
# re-created the docker-compose setup and retried; the 1st run is always slow,
# but the 2nd run is always fast:

FailProduce:  0
FailConsume:  0
DoubleConsume:  0
Produced (ms):  2413
MaxLatency (ms):  2704
AvgLatency (ms):  467
Total (s) 3.545496041s


KAFKA version: 7.0.0-ccs (Commit:c6d7e3013b411760)
equivalent to Kafka 3.0.0

=== kafka single:

FailProduce:  0
FailConsume:  0
DoubleConsume:  0
Produced (ms):  6634
MaxLatency (ms):  12052
AvgLatency (ms):  8579
Total (s) 13.722706977s

FailProduce:  0
FailConsume:  0
DoubleConsume:  0
Produced (ms):  6380
MaxLatency (ms):  11856
AvgLatency (ms):  8636
Total (s) 13.625928209s

=== kafka multi:

FailProduce:  0
FailConsume:  0
DoubleConsume:  0
Produced (ms):  6596
MaxLatency (ms):  11932
AvgLatency (ms):  8523
Total (s) 13.659630863s

FailProduce:  0
FailConsume:  0
DoubleConsume:  0
Produced (ms):  6535
MaxLatency (ms):  11903
AvgLatency (ms):  8588
Total (s) 13.677644818s

These benchmarks use the default settings found in the docker examples, except for SMP (I set it to the same number of cores as the benchmark server, to be fair to Kafka, whose JVM can utilize all cores by default -- apparently this had an insignificant impact). The current conclusion is that RedPanda is way faster than Kafka, both in publishing speed (around ~1μs per message, 477K-837K msg/s) and consume latency (432 ms average up to 2.7 s max), while Kafka publishes at around ~3μs per message (301K-313K msg/s) with 8.5 s average up to 12 s max consume latency. RAM usage is another story though: RedPanda uses 12GB per node (10% of the server's RAM), while Kafka only uses 355MB, 375MB, and 788MB for its nodes, plus 120MB for ZooKeeper. The repo to reproduce this benchmark is here, in the 2021mq directory.

Btw, if you're looking for a Kafka/RedPanda GUI, try Kowl; it's way more beautiful than ActiveMQ's default Web UI.

Bonus rounds: using one of the fastest OLTP databases (Tarantool) and one of the fastest OLAP databases (Clickhouse) as a queue, leveraging a sequence (auto increment) or an internal function to generate one. The differences are that there's only one consumer group (you have to manually fan out using goroutines, sketched below), and there's no JSON encode/decode since these are structured databases.
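
Since there's only one consumer, the fan-out part looks roughly like this (just a sketch; the Row struct, worker count, and shard-by-id choice are illustrative, not the actual repo code):

// Sketch of manual fan-out for the single-consumer setups above: one puller
// feeds rows to N workers, sharded by id so each id keeps its ordering.
// The Row struct and worker count are illustrative only.
package main

import (
	"fmt"
	"sync"
)

type Row struct {
	ID      int64
	Payload string // structured columns instead of a JSON blob
}

func fanOut(rows <-chan Row, workers int) {
	shards := make([]chan Row, workers)
	var wg sync.WaitGroup
	for i := range shards {
		shards[i] = make(chan Row, 256)
		wg.Add(1)
		go func(in <-chan Row) {
			defer wg.Done()
			for r := range in {
				_ = r // process the row here
			}
		}(shards[i])
	}
	for r := range rows {
		shards[r.ID%int64(workers)] <- r // same id always goes to the same worker
	}
	for _, s := range shards {
		close(s)
	}
	wg.Wait()
}

func main() {
	rows := make(chan Row, 1024)
	go func() { // stand-in for the goroutine that pulls rows from Tarantool/Clickhouse
		for i := int64(0); i < 10; i++ {
			rows <- Row{ID: i, Payload: fmt.Sprintf("event-%d", i)}
		}
		close(rows)
	}()
	fanOut(rows, 4)
}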


TARANTOOL version: 2.8.2

=== tarantool single (memtx):

FailProduce:  0
FailConsume:  0
DoubleConsume:  0
Produced (ms):  11238
MaxLatency (ms):  1071
AvgLatency (ms):  101
Total (s) 11.244551225s

FailProduce:  0
FailConsume:  0
DoubleConsume:  0
Produced (ms):  9596
MaxLatency (ms):  816
AvgLatency (ms):  61
Total (s) 9.957516119s

=== tarantool single (vinyl):

FailProduce:  0
FailConsume:  0
DoubleConsume:  0
Produced (ms):  11383
MaxLatency (ms):  1076
AvgLatency (ms):  157
Total (s) 11.388865281s

FailProduce:  0
FailConsume:  0
DoubleConsume:  0
Produced (ms):  9104
MaxLatency (ms):  102
AvgLatency (ms):  13
Total (s) 9.196549551s


CLICKHOUSE version: 21.11.4.14

=== clickhouse single:

FailProduce:  0
FailConsume:  0
DoubleConsume:  0
Produced (ms):  2052
MaxLatency (ms):  2078
AvgLatency (ms):  1461
Total (s) 3.570767491s

FailProduce:  0
FailConsume:  0
DoubleConsume:  0
Produced (ms):  2057
MaxLatency (ms):  2008
AvgLatency (ms):  1445
Total (s) 3.536277427s

The result recap table (ms = millisecond, us = microsecond, ns = nanosecond):

only best of 2 runs         | RedPanda single | RedPanda multi | Kafka single | Kafka multi | Tarantool memtx | Tarantool vinyl | Clickhouse single
Publish (ms)                |           2,387 |          2,413 |        6,380 |       6,535 |           9,596 |           9,104 |             2,052
Sub Max Latency (ms)        |           2,125 |          2,704 |       11,856 |      11,903 |             816 |             102 |             2,008
Sub Avg Latency (ms)        |             490 |            467 |        8,636 |       8,523 |              61 |              13 |             1,445
Pub Throughput (msg/s)      |         837,872 |        828,844 |      313,480 |     306,044 |         208,420 |         219,684 |           974,659
est. Pub Latency (ns)       |           1,194 |          1,207 |        3,190 |       3,268 |           4,798 |           4,552 |             1,026
est. Sub Throughput (msg/s) |       4,081,633 |      4,282,655 |      231,589 |     234,659 |      32,786,885 |     153,846,154 |         1,384,083
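
For clarity, the estimated rows appear to be derived from the 2,000,000 total messages (1,000 goroutines x 2,000 messages each); a quick check against the RedPanda single column:

// Derivation of the estimated table rows, checked against the RedPanda single
// column (2,000,000 total messages = 1,000 producers x 2,000 messages).
package main

import "fmt"

func main() {
	const totalMsgs = 2_000_000.0
	publishMs := 2387.0   // "Publish (ms)" row
	avgLatencyMs := 490.0 // "Sub Avg Latency (ms)" row

	fmt.Println(totalMsgs / (publishMs / 1000))    // Pub Throughput ≈ 837,872 msg/s
	fmt.Println(publishMs * 1e6 / totalMsgs)       // est. Pub Latency ≈ 1,194 ns/msg
	fmt.Println(totalMsgs / (avgLatencyMs / 1000)) // est. Sub Throughput ≈ 4,081,633 msg/s
}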

Conclusion: Tarantool is probably the only single-node database that can compete with Kafka for the queue use case (it can run as a multi-master replica set, but that's not recommended; it's better to use a master-slave config where the slave is used as a failover). Other databases, especially RDBMSes that persist to disk, can pretty surely only do ~50K tps. Clickhouse can be multi-master, and the last time I checked it could do ~600K inserts per second (this time it's around 1M inserts per second). I simulate the atomic counter on Clickhouse using TimeStamp64Milli; queries are limited to 100 per second, but that's good enough for the pub-sub use case. The benefit of using a database as MQ/PubSub is that you can do very flexible queries (SQL support), you mostly get better tooling (especially with Clickhouse), and you can update records for new consumers. The cons are that you must do the notify/fan-out yourself (for example using a NATS broadcast that only pushes a signal for the workers to pull, sketched below), and track the ack/retries and the read offsets of the workers yourself (pull).
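
A rough sketch of that notify-then-pull pattern, using NATS only as a wake-up signal (the subject name, SQL, and offset handling are illustrative assumptions, not an actual implementation):

// Sketch of the notify/fan-out approach mentioned above: the producer inserts
// into the database and broadcasts an empty NATS message; a worker wakes up on
// the signal and pulls everything after its own tracked offset.
// Subject name, SQL, and offset persistence are illustrative assumptions.
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq"
	"github.com/nats-io/nats.go"
)

func main() {
	db, err := sql.Open("postgres", "postgres://localhost/app?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}

	var offset int64 // each worker tracks (and should persist) its own read offset
	_, err = nc.Subscribe("events.new", func(_ *nats.Msg) {
		rows, err := db.Query(
			`SELECT id, payload FROM events WHERE id > $1 ORDER BY id`, offset)
		if err != nil {
			log.Println("pull failed:", err)
			return
		}
		defer rows.Close()
		for rows.Next() {
			var id int64
			var payload string
			if rows.Scan(&id, &payload) == nil {
				offset = id // ack/retry bookkeeping is up to you in this model
			}
		}
	})
	if err != nil {
		log.Fatal(err)
	}
	select {} // keep the worker running
}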
