Programming Rants: consul

2022-04-04

Automatic Load Balancer Registration/Deregistration with NATS or FabioLB

Today we're gonna test 2 alternative for automatic load balancing (previously I always use Caddy or NginX (because most of my projects is single server -- the bottleneck is always the database not the backend/compute part), with manual reverse proxy configuration, but today we're gonna test 2 possible way to high-availability load balance strategy (without kubernetes of course), first is using NATS, second one is using standard load balancer, in this case FabioLB.

To use NATS, we're gonna use this strategy:
first one we deploy is the our custom reverse proxy, that should able to convert any query string, form body with any kind of content-type, and any header if needed, we can use any serialization format (json, msgpack, protobuf, etc), but in this case we're just gonna use normal string, we call this service "apiproxy". The apiproxy will send the serialized payload (from map/object) into NATS using request-reply mechanism. Another service is our backend "worker"/handler, that could be anything, but in this case is our real handler that would contain our business logic, so we need to subscribe and return a reply to the apiproxy and it would be deserialized back to the client with any serizaliation format and protocol (gRPC/Websocket/HTTP-REST/JSONP/etc). Here's the benchmark result of normal Fiber without any proxy, apiproxy-nats-worker with single nats vs multi nats instance

# no proxy
go run main.go apiserver
hey -n 1000000 -c 255 http://127.0.0.1:3000
Average:      0.0011 secs
Requests/sec: 232449.1716

# single nats
go run main.go apiproxy
go run main.go # worker
hey -n 1000000 -c 255 http://127.0.0.1:3000
Average:      0.0025 secs
Requests/sec: 100461.5866

# 2 worker
Average:      0.0033 secs
Requests/sec: 76130.4079

# 4 worker
Average:      0.0051 secs
Requests/sec: 50140.6288

# limit the apiserver CPU
GOMAXPROCS=2 go run main.go apiserver
Average:      0.0014 secs
Requests/sec: 184234.0106

# apiproxy 2 core
# 1 worker 2 core each
Average:      0.0025 secs
Requests/sec: 103007.4516

# 2 worker 2 core each
Average:      0.0029 secs
Requests/sec: 87522.6801

# 4 worker 2 core each
Average:      0.0037 secs
Requests/sec: 67714.5851

# seems that the bottleneck is spawning the producer's NATS
# spawning 8 connections using round-robin

# 1 worker 2 core each
Average:      0.0021 secs
Requests/sec: 121883.4324

# 4 worker 2 core each
Average:      0.0030 secs
Requests/sec: 84289.4330

# seems also the apiproxy is hogging all the CPU cores
# limiting to 8 core for apiproxy
# now synchronous handler changed into async/callback version
GOMAXPROCS=8 go run main.go apiserver

# 1 worker 2 core each
Average:      0.0017 secs
Requests/sec: 148298.8623

# 2 worker 2 core each
Average:      0.0017 secs
Requests/sec: 143958.4056

# 4 worker 2 core each
Average:      0.0029 secs
Requests/sec: 88447.5352

# limiting the NATS to 4 core using go run on the source
# 1 worker 2 core each
Average:      0.0013 secs
Requests/sec: 194787.6327

# 2 worker 2 core each
Average:      0.0014 secs
Requests/sec: 176702.0119

# 4 worker 2 core each
Average:      0.0022 secs
Requests/sec: 116926.5218

# same nats core count, increase worker core count
# 1 worker 4 core each
Average:      0.0013 secs
Requests/sec: 196075.4366

# 2 worker 4 core each
Average:      0.0014 secs
Requests/sec: 174912.7629

# 4 worker 4 core each
Average:      0.0021 secs
Requests/sec: 121911.4473 --> see update below

Could be better if it was tested in multiple server, but it seems the bottleneck is on NATS connection when have many subscriber, they could not scale linearly (16-66% overhead for a single API proxy) IT's A BUG ON MY SIDE, SEE UPDATE BELOW. Next we're gonna try FabioLB with Consul, Consul used for service registry (it's a synchronous-consistent "database" like Zookeeper or Etcd). To install all of it use this commands:

# setup:
curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -
sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"
sudo apt install consul
go install github.com/fabiolb/fabio@latest

# start:
sudo consul agent -dev --data-dir=/tmp/consul
fabio
go run main.go -addr 172.17.0.1:5000 -name svc-a -prefix /foo -consul 127.0.0.1:8500

# benchmark:
# without fabio
Average:      0.0013 secs
Requests/sec: 197047.9124

# with fabio 1 backend
Average:      0.0038 secs
Requests/sec: 65764.9021

# with fabio 2 backend
go run main.go -addr 172.17.0.1:5001 -name svc-a -prefix /foo -consul 127.0.0.1:8500

# the bottleneck might be the cores, so we limit the cores to 2 for each worker
# with fabio 1 backend 2 core each
Average:      0.0045 secs
Requests/sec: 56339.5518

# with fabio 2 backend 2 core each
Average:      0.0042 secs
Requests/sec: 60296.9714

# what if we limit also the fabio
GOMAXPROCS=8 fabio

# with fabio 8 core, 1 backend 2 core each
Average:      0.0042 secs
Requests/sec: 59969.5206

# with fabio 8 core, 2 backend 2 core each
Average:      0.0041 secs
Requests/sec: 62169.2256

# with fabio 8 core, 4 backend 2 core each
Average:      0.0039 secs
Requests/sec: 64703.8253

All CPU cores utilized around 50% of 32-core server 128GB RAM, can't find which part the bottleneck for now, but for sure both strategy have around 16% vs 67% overhead compared for non proxies (which is make sense because adding more layer will add more transport and more things to copy/transfer and transform/serialize-deserialize). The code used in this benchmark is here, on 2022mid directory, and the code for fabio-consul registration copied from ebay's github repository.

Why even we need to do this? If we're using api gateway pattern (one of the pattern that being used in my past company, but with Kubernetes on worker part), we could deploy independently and communicate between service using the gateway (proxy) without knowing the IP address or domain name of the service itself, as long as it have proper route and payload it can be handled wherever the service being deployed. What if you want to do canary or blue green deployment? you can just register a handler in nats or consul with different route name (especially for communication between services, not public to service), and wait for all traffic to be moved there before killing previous deployment.

So what should you choose? both strategy requires 3 moving part (apiproxy-nats-worker, fabio-consul-worker) but NATS strategy simpler in the development and can give better performance (especially if you make the apiproxy to be as flexible as possible), but it needs to have better serialization, since in this benchmark the serialization not measured, if you need better performance on serialization you must use codegen, which may require you to deploy 2 times (one for apiproxy, one for worker, unless you split the raw response meta with jsonparser or use map only for apiproxy). FabioLB strategy have more features, also you can use consul for service discovery (contacting other services directly by name without have to go thru FabioLB). NATS strategy have some benefit in terms of security, which is the NATS cluster can be inside DMZ, and worker can be on the different subnet without ability to connect each other and it would still works, where if you use consul to connect directly to another service, they should have route or connection to access each other. The bad part about NATS is that you should not use it for file upload, or it would hogging a lot of resource, so it should handled by apiproxy directly, then the reference of the uploaded file should be forwarded as payload to NATS. You can check NATS traffic statistics using nats-top.

What's next? Maybe we can try traefik, which is a service registry combined with load balancer in one binary, it can also use consul.

UPDATE: by changing the code from Subscribe (broadcast/fan-out) to QueueSubscribe (load balance), it have similar performance on 1/2/4 subscribers, so we can NATS for high availability/fault tolerance in api gateway pattern with cost of 16% overhead.

TL:DR

no LB: 232K rps
-> LB with NATS request-reply: 196K rps (16% overhead)
no LB: 197K rps
-> LB with Fabio+Consul: 65K rps (67% overhead)

2021-08-05

You don't need Kubernetes

Everyone using kubernetes, even they don't have to, which adds additional complexity. In this article we will discuss when we need Kubernetes, and when we don't, and what's the alternative?

When you need Kubernetes?

If you have lot of teams (>3, not 3 person, but at least 3 teams) and services, it's better to use Kubernetes or any other orchestration service to utilize your servers
Why? because that way they can deploy each services independently, and service mesh management is far easier if using orchestration.
Alternative? use Nomad or standard rsync with autoreloader (eg. overseer if you are using Golang) or if you are using dynamic languages (or old CGI/FastCGI) that are not creating own web server usually you don't need an autoreloader since source code on the server anyway and will be autoreloaded by the web server
If you have more than 3+ powerful servers, it's better to use Kubernetes or any other orchestration service
Why? with orchestration service we can binpack and fully utilize those servers better than creating bunch of VMs inside the servers
Alternative? use Ansible to setup those servers, or Nomad to orchestate, or create a script to rsync to multiple servers
If you need canary or A/B deployment
Why? creating canary-like deployment manually (setting up load balancer to send a percentage of request to specific nodes is a bit chore)
Alternative? use Nomad and Consul service registry
When you are using slow programming language implementation that the first bottleneck is not database but your API/Web requests implementation
Why? normally if you code using compiled programming language, the bottleneck is not the network/number of hits, but the database, which usually solved with caching if it just read, or CDN if static files. But when you are using slow programming languages (eg. Ruby, Python, PHP5, etc), you mostly will hit bottleneck on the API/Web requests side first before the database. So you'll need to spawn additional pod to handle those requests.
Alternative? Amazon AutoScaling, Google Cloud AutoScaling Group, Azure AutoScale, or just do a capacity planning and scale up/out manually before the peak season (eg. Black Friday, Christmas, discount day, etc) and scale down/in after peak season ends.

When you don't need Kubernetes?

When you only have a little <=3 or bunch of least powerful servers that are utilized well.
Seriously you don't need it, just a simple rsync script (either manual or in CI/CD pipeline) or Ansible script would be sufficient
When you are alone or in a small team (no other team), writing modular monolith is fine until it's painful to build, test, and deploy, then that's the moment that you will need to split to multiple services.
When you only have production and test environment without A/B or canary deployment.
Again, Kubernetes will be totally overkill in this case.

always cache reads
prepare for nth server when it's already 70% load (you need to provision a new node anyway if kubernetes also hit the load), either manually or use cloud providers' autoscaler
put slow tasks/computation as background job, so web services could always respond fast. Even when you have autoscaler for APIs/web services, if the bottleneck is the database or I/O (which is more likely than the processor/computation), it would be totally useless.

When you don't want Kubernetes to be the 30% overhead your servers, use lightweight orchestration and service discovery instead, like Nomad with Consul.

Well, for the last part, you are mostly don't need Kubernetes, don't be eaten by the hype, especially if you deploy databases on the pod, it's unstable AF, you don't need to automatically scale out/in the databases anyway (usually scale up if the database product is quite hassle to scale out), it would be a crazy thing to move around gigabytes of data because the automatic scaling. I hope this article could help people that started to build their startup to start provision with proper capacity planning and releasing instead of prematurely optimize their architecture and infrastructure with Kubernetes.

FAQ

So, what are your qualification saying this?

I worked for small companies (<7 engineers) and huge companies (700+ engineers) that are both using Kubernetes, and I realized the difference why one is a must, and why the other one is just decision based on hype.

EDIT

I found something that even simpler than nomad/waypoint/kubernetes, jelastic! see the demo video here.