
2023-11-24

mTLS using Golang Fiber

In this demo we're going to set up mTLS with Go and Fiber. To create certificates that can be used for mutual authentication, all you need is the OpenSSL program (or simplecert, or mkcert as in the previous natsmtls1 example): create a CA (certificate authority), a server certificate, and a client certificate, something like this:

# generate CA Root
openssl req -newkey rsa:2048 -new -nodes -x509 -days 3650 -out ca.crt -keyout ca.key -subj "/C=SO/ST=Earth/L=MyLocation/O=MyOrganiz/OU=MyOrgUnit/CN=localhost"

# generate Server Certs
openssl genrsa -out server.key 2048
# generate server Cert Signing request
openssl req -new -key server.key -out server.csr -subj "/C=SO/ST=Earth/L=MyLocation/O=MyOrganiz/OU=MyOrgUnit/CN=localhost"
# sign with CA Root
openssl x509 -req -in server.csr -extfile <(printf "subjectAltName=DNS:localhost") -CA ca.crt -CAkey ca.key -days 3650 -sha256 -CAcreateserial -out server.crt

# generate Client Certs
openssl genrsa -out client.key 2048
# generate client Cert Signing request
openssl req -new -key client.key -out client.csr -subj "/C=SO/ST=Earth/L=MyLocation/O=MyOrganiz/OU=MyOrgUnit/CN=localhost"
# sign with CA Root
openssl x509 -req -in client.csr -extfile <(printf "subjectAltName=DNS:localhost") -CA ca.crt -CAkey ca.key -out client.crt -days 3650 -sha256 -CAcreateserial
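
Before wiring these into Go, you can sanity-check that both leaf certificates verify against the CA:

# both commands should print "OK"
openssl verify -CAfile ca.crt server.crt
openssl verify -CAfile ca.crt client.crt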


You will get at least 2 files related to the CA, 3 files related to the server, and 3 files related to the client, but all you really need are the CA certificate (ca.crt), the server key pair (server.key and server.crt), and the client key pair (client.key and client.crt). If you need to generate another client or roll over the server keys, you will still need the CA's private key, so don't erase it.

Next, now that you have those 5 files, you need to load the CA certificate and the server key pair and use them in Fiber, something like this:

caCertFile, _ := os.ReadFile(in.CaCrt)
caCertPool := x509.NewCertPool()
caCertPool.AppendCertsFromPEM(caCertFile)

serverCerts, _ := tls.LoadX509KeyPair(in.ServerCrt, in.ServerKey)

tlsConfig := &tls.Config{
    ClientCAs:        caCertPool,
    ClientAuth:       tls.RequireAndVerifyClientCert,
    MinVersion:       tls.VersionTLS12,
    CurvePreferences: []tls.CurveID{tls.CurveP521, tls.CurveP384, tls.CurveP256},
    CipherSuites: []uint16{
        tls.TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,
        tls.TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,
        tls.TLS_RSA_WITH_AES_256_GCM_SHA384,
        tls.TLS_RSA_WITH_AES_256_CBC_SHA,
        tls.TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,
        tls.TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,
    },
    Certificates: []tls.Certificate{serverCerts},
}

// attach the certs to TCP socket, and start Fiber server
app := fiber.New(fiber.Config{
    Immutable: true,
})
app.Get("/", func(c *fiber.Ctx) error {
    return c.SendString(`secured string`)
})
ln, _ := tls.Listen("tcp", `:1443`, tlsConfig)
app.Listener(ln)


Next, on the client side, you just need to load the CA certificate and the client key pair, something like this:

caCertFile, _ := os.ReadFile(in.CaCrt)
caCertPool := x509.NewCertPool()
caCertPool.AppendCertsFromPEM(caCertFile)
certificate, _ := tls.LoadX509KeyPair(in.ClientCrt, in.ClientKey)

httpClient := &http.Client{
    Timeout: time.Minute * 3,
    Transport: &http.Transport{
        TLSClientConfig: &tls.Config{
            RootCAs:      caCertPool,
            Certificates: []tls.Certificate{certificate},
        },
    },
}

r, _ := httpClient.Get(`https://localhost:1443`)
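
To confirm the handshake actually succeeded, read the response back (a small extension of the snippet above; the io and fmt imports are assumed):

body, _ := io.ReadAll(r.Body)
r.Body.Close()
fmt.Println(string(body)) // prints: secured string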


That's it, that's how you secure client-server communication between a Go client and server with mTLS; the code can be found here.

2022-06-07

How to profile your Golang Fiber server

Usually you need to load test your webserver to find out where the memory leak or the bottleneck is, and Golang already provides a tool for that, called pprof. What you need to do depends on the framework you use, but it's all similar; most frameworks already have a middleware that you can import and use, for example in Fiber there's the pprof middleware. To use it:

// import
  "github.com/gofiber/fiber/v2/middleware/pprof"

// use
  app.Use(pprof.New())
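
Wired into a minimal server, it looks like this (a sketch; the root handler exists just so there is something to profile):

package main

import (
    "log"

    "github.com/gofiber/fiber/v2"
    "github.com/gofiber/fiber/v2/middleware/pprof"
)

func main() {
    app := fiber.New()
    app.Use(pprof.New()) // exposes the /debug/pprof routes
    app.Get("/", func(c *fiber.Ctx) error {
        return c.SendString("hello")
    })
    log.Fatal(app.Listen(":3000"))
}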

It creates routes under /debug/pprof that you can use: just start the server, then open that path. To profile the CPU or check the heap you just need to click the profile/heap link; the CPU profile blocks while it samples (30 seconds by default, tunable with the ?seconds=N query parameter), and while it waits you must hit other endpoints to generate traffic/function calls. Once done, it shows a download dialog to save your CPU or heap profile. From that file, you can run a command (similar to how you would use gops), for example to generate an svg or a web view of your profile:

pprof -web /tmp/profile # or
pprof -svg /tmp/profile # <-- file that you just downloaded

It would generate something like this:



So you can find out which function took most of the CPU time (or, if it's a heap profile, which function allocates the most memory). In my case the bottleneck was the default built-in pretty logger: it limited the server to ~9K rps at concurrency 255 on a database-write benchmark; after removing the built-in logging and replacing it with zerolog, the server handled ~57K rps on the same benchmark.
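
You can also skip the click-and-download flow and point pprof straight at the live endpoint (assuming the server listens on port 3000; run the load generator in another terminal while it samples):

# sample the CPU for 10 seconds directly from the running server
go tool pprof -seconds 10 http://localhost:3000/debug/pprof/profile
# meanwhile, in another terminal, generate some traffic
hey -n 100000 -c 255 http://localhost:3000/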

2022-04-04

Automatic Load Balancer Registration/Deregistration with NATS or FabioLB

Today we're gonna test 2 alternatives for automatic load balancing (previously I always used Caddy or NginX with manual reverse-proxy configuration, because most of my projects are single server -- the bottleneck is always the database, not the backend/compute part). We're gonna test 2 possible strategies for high-availability load balancing (without Kubernetes, of course): the first one uses NATS, the second one uses a standard load balancer, in this case FabioLB.

To use NATS, we're gonna use this strategy:
the first thing we deploy is our custom reverse proxy, which should be able to convert any query string, form body of any content-type, and any header if needed; we can use any serialization format (JSON, msgpack, protobuf, etc), but in this case we're just gonna use a plain string. We call this service "apiproxy". The apiproxy sends the serialized payload (from a map/object) into NATS using the request-reply mechanism. The other service is our backend "worker"/handler, which could be anything; in this case it's the real handler that contains our business logic, so it needs to subscribe and return a reply to the apiproxy, which then deserializes it back to the client in whatever serialization format and protocol is needed (gRPC/WebSocket/HTTP-REST/JSONP/etc).
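
A minimal sketch of that request-reply flow with nats.go (the subject name api.req and the echo handler are made up for illustration; the real apiproxy forwards the whole serialized request):

package main

import (
    "fmt"
    "log"
    "time"

    "github.com/nats-io/nats.go"
)

func main() {
    nc, err := nats.Connect(nats.DefaultURL)
    if err != nil {
        log.Fatal(err)
    }
    defer nc.Drain()

    // worker side: a queue subscription load-balances messages across workers
    // (a plain Subscribe would broadcast to all of them, see the UPDATE below)
    nc.QueueSubscribe("api.req", "workers", func(m *nats.Msg) {
        m.Respond([]byte("handled: " + string(m.Data))) // business logic goes here
    })

    // apiproxy side: forward the serialized request and wait for one reply
    resp, err := nc.Request("api.req", []byte("GET /foo?a=1"), 3*time.Second)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(string(resp.Data))
}

Here's the benchmark result of plain Fiber without any proxy, and apiproxy-nats-worker with a single vs multiple NATS instances: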

# no proxy
go run main.go apiserver
hey -n 1000000 -c 255 http://127.0.0.1:3000
  Average:      0.0011 secs
  Requests/sec: 232449.1716

# single nats
go run main.go apiproxy
go run main.go # worker
hey -n 1000000 -c 255 http://127.0.0.1:3000
  Average:      0.0025 secs
  Requests/sec: 100461.5866

# 2 worker
  Average:      0.0033 secs
  Requests/sec: 76130.4079

# 4 worker
  Average:      0.0051 secs
  Requests/sec: 50140.6288

# limit the apiserver CPU
GOMAXPROCS=2 go run main.go apiserver
  Average:      0.0014 secs
  Requests/sec: 184234.0106

# apiproxy 2 core
# 1 worker 2 core each
  Average:      0.0025 secs
  Requests/sec: 103007.4516

# 2 worker 2 core each
  Average:      0.0029 secs
  Requests/sec: 87522.6801

# 4 worker 2 core each
  Average:      0.0037 secs
  Requests/sec: 67714.5851

# seems that the bottleneck is spawning the producer's NATS
# spawning 8 connections using round-robin

# 1 worker 2 core each
  Average:      0.0021 secs
  Requests/sec: 121883.4324

# 4 worker 2 core each
  Average:      0.0030 secs
  Requests/sec: 84289.4330

# seems also the apiproxy is hogging all the CPU cores
# limiting to 8 core for apiproxy
# now synchronous handler changed into async/callback version
GOMAXPROCS=8 go run main.go apiserver

# 1 worker 2 core each
  Average:      0.0017 secs
  Requests/sec: 148298.8623

# 2 worker 2 core each
  Average:      0.0017 secs
  Requests/sec: 143958.4056

# 4 worker 2 core each
  Average:      0.0029 secs
  Requests/sec: 88447.5352

# limiting the NATS to 4 core using go run on the source
# 1 worker 2 core each
  Average:      0.0013 secs
  Requests/sec: 194787.6327

# 2 worker 2 core each
  Average:      0.0014 secs
  Requests/sec: 176702.0119

# 4 worker 2 core each
  Average:      0.0022 secs
  Requests/sec: 116926.5218

# same nats core count, increase worker core count
# 1 worker 4 core each
  Average:      0.0013 secs
  Requests/sec: 196075.4366

# 2 worker 4 core each
  Average:      0.0014 secs
  Requests/sec: 174912.7629

# 4 worker 4 core each
  Average:      0.0021 secs
  Requests/sec: 121911.4473 --> see update below


It could be better if this were tested on multiple servers, but it seems the bottleneck is the NATS connection when there are many subscribers; they could not scale linearly (16-66% overhead for a single API proxy) -- IT'S A BUG ON MY SIDE, SEE UPDATE BELOW. Next we're gonna try FabioLB with Consul; Consul is used as the service registry (it's a synchronous, consistent "database", like ZooKeeper or etcd). To install all of it, use these commands:

# setup:
curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -
sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"
sudo apt install consul
go install github.com/fabiolb/fabio@latest

# start:
sudo consul agent -dev --data-dir=/tmp/consul
fabio
go run main.go -addr 172.17.0.1:5000 -name svc-a -prefix /foo -consul 127.0.0.1:8500

# benchmark:
# without fabio
  Average:      0.0013 secs
  Requests/sec: 197047.9124

# with fabio 1 backend
  Average:      0.0038 secs
  Requests/sec: 65764.9021

# with fabio 2 backend
go run main.go -addr 172.17.0.1:5001 -name svc-a -prefix /foo -consul 127.0.0.1:8500

# the bottleneck might be the cores, so we limit the cores to 2 for each worker
# with fabio 1 backend 2 core each
  Average:      0.0045 secs
  Requests/sec: 56339.5518

# with fabio 2 backend 2 core each
  Average:      0.0042 secs
  Requests/sec: 60296.9714

# what if we limit also the fabio
GOMAXPROCS=8 fabio

# with fabio 8 core, 1 backend 2 core each
  Average:      0.0042 secs
  Requests/sec: 59969.5206

# with fabio 8 core, 2 backend 2 core each
  Average:      0.0041 secs
  Requests/sec: 62169.2256

# with fabio 8 core, 4 backend 2 core each
  Average:      0.0039 secs
  Requests/sec: 64703.8253

All CPU cores were utilized at around 50% on a 32-core, 128GB RAM server; I can't find which part is the bottleneck for now, but for sure both strategies have around 16% vs 67% overhead compared to no proxy (which makes sense, because adding more layers adds more transport and more things to copy/transfer and transform/serialize-deserialize). The code used in this benchmark is here, in the 2022mid directory, and the code for the fabio-consul registration is copied from eBay's GitHub repository.
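
The registration part boils down to adding a urlprefix- tag to the service in Consul, which Fabio watches to build its routing table. Here's a minimal sketch using the official Consul API client (the service ID, health-check path, and interval are illustrative; Fabio only routes to instances with a passing check):

package main

import (
    "log"

    consul "github.com/hashicorp/consul/api"
)

func main() {
    client, err := consul.NewClient(consul.DefaultConfig()) // 127.0.0.1:8500 by default
    if err != nil {
        log.Fatal(err)
    }
    err = client.Agent().ServiceRegister(&consul.AgentServiceRegistration{
        ID:      "svc-a-1",
        Name:    "svc-a",
        Address: "172.17.0.1",
        Port:    5000,
        Tags:    []string{"urlprefix-/foo"}, // Fabio routes /foo to this service
        Check: &consul.AgentServiceCheck{
            HTTP:     "http://172.17.0.1:5000/health",
            Interval: "10s",
        },
    })
    if err != nil {
        log.Fatal(err)
    }
}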

Why would we even need to do this? If we're using the API gateway pattern (one of the patterns used in my past company, but with Kubernetes on the worker part), we can deploy independently and communicate between services through the gateway (proxy) without knowing the IP address or domain name of the service itself; as long as the request has the proper route and payload, it can be handled wherever the service is deployed. What if you want to do a canary or blue-green deployment? You can just register a handler in NATS or Consul under a different route name (especially for communication between services, not public-to-service), and wait for all traffic to move there before killing the previous deployment.

So what should you choose? Both strategies require 3 moving parts (apiproxy-nats-worker, fabio-consul-worker), but the NATS strategy is simpler in development and can give better performance (especially if you make the apiproxy as flexible as possible), though it needs better serialization: serialization was not measured in this benchmark, and if you need better serialization performance you must use codegen, which may require you to deploy 2 times (once for the apiproxy, once for the worker, unless you split the raw response metadata with jsonparser or use maps only in the apiproxy). The FabioLB strategy has more features, and you can also use Consul for service discovery (contacting other services directly by name without having to go through FabioLB). The NATS strategy has some benefits in terms of security: the NATS cluster can sit inside the DMZ, and workers can be on different subnets without the ability to connect to each other and it would still work, whereas if you use Consul to connect directly to another service, they must have a route or connection to reach each other. The bad part about NATS is that you should not use it for file uploads, or it would hog a lot of resources; uploads should be handled by the apiproxy directly, and only a reference to the uploaded file forwarded as payload through NATS. You can check NATS traffic statistics using nats-top.

What's next? Maybe we can try Traefik, a load balancer with built-in backend discovery in one binary; it can also use Consul.

UPDATE: by changing the code from Subscribe (broadcast/fan-out) to QueueSubscribe (load balanced), it has similar performance with 1/2/4 subscribers, so we can use NATS for high availability/fault tolerance in the API gateway pattern at the cost of ~16% overhead.
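
The change is literally one call, using the same names as the sketch above:

// before: every worker receives every message (fan-out)
nc.Subscribe("api.req", handler)
// after: each message is delivered to exactly one worker in the "workers" group
nc.QueueSubscribe("api.req", "workers", handler)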

TL;DR

no LB: 232K rps
-> LB with NATS request-reply: 196K rps (16% overhead)
no LB: 197K rps
-> LB with Fabio+Consul: 65K rps (67% overhead)

 



2021-08-04

Dockerfile Template (React, Express, Vue, Nest, Angular, GoFiber, Svelte, Django, Laravel, ASP.NET Core, Kotlin, Deno)

These are Docker templates for deploying common applications (either using Kubernetes, Nomad, or locally using docker-compose); this post is mostly copied from the ScalableScripts YouTube channel and the Docker docs, and the gist for the nginx config is here.


ReactJS

FROM node:15.4 as build1
WORKDIR /app1
COPY package*.json .
RUN npm install
COPY . .
RUN npm run build

FROM nginx:1.19
COPY ./nginx.conf /etc/nginx/nginx.conf
COPY --from=build1 /app1/build /usr/share/nginx/html

To build it, use docker build -t react1 .
To run it, use docker run -p 8001:80 react1


ExpressJS

FROM node:15.4 
WORKDIR /app
COPY package*.json .
RUN npm install
COPY . .
CMD node index.js


VueJS

FROM node:15.4 as build1
WORKDIR /app1
COPY package*.json .
RUN npm install
COPY . .
RUN npm run build

FROM nginx:1.19
COPY ./nginx.conf /etc/nginx/nginx.conf
COPY --from=build1 /app1/dist /usr/share/nginx/html

The only difference from React is that the build output directory is not build/ but dist/.


NestJS 

FROM node:15.4 as build1
WORKDIR /app1
COPY package*.json .
RUN npm install
COPY . .
RUN npm run build

FROM node:15.4
WORKDIR /app
COPY package.json .
RUN npm install --only=production
COPY --from=build1 /app1/dist ./dist
CMD npm run start:prod


AngularJS

FROM node:15.4 as build1
WORKDIR /app1
COPY package*.json .
RUN npm install
COPY . .
RUN npm run build -- --prod

FROM nginx:1.19
COPY ./nginx.conf /etc/nginx/nginx.conf
COPY --from=build1 /app1/dist/PROJECT_NAME /usr/share/nginx/html


Fiber (Golang)

FROM golang:1.16-alpine as build1
WORKDIR /app1
COPY go.mod .
COPY go.sum .
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o app1.exe

FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /
COPY --from=build1 /app1/app1.exe .
CMD ./app1.exe
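
To build it, use docker build -t fiber1 .
To run it, use docker run -p 8002:3000 fiber1 (assuming your Fiber app listens on :3000)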

You don't need the COPY go.mod and go mod download steps if you have a vendor/ directory; alternatively, cache /go/pkg/mod and reuse it instead of redownloading the whole dependency tree (this can really speed things up in the CI/CD pipeline, especially if you live in a 3rd world country). The ca-certificates package is only needed if you hit HTTPS endpoints; if you don't, you can skip that step.


Svelte

FROM node:15.4 as build1
WORKDIR /app1
COPY package*.json .
RUN npm install
COPY . .
RUN npm run build

FROM nginx:1.19
COPY ./nginx.conf /etc/nginx/nginx.conf
COPY --from=build1 /app1/public /usr/share/nginx/html


Django

FROM python:3.9-alpine as build1
ENV PYTHONUNBUFFERED 1
WORKDIR /app1
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD python manage.py runserver 0.0.0.0:80


Laravel

FROM php:7.4-fpm
RUN apt-get update && apt-get install -y git curl libpng-dev libonig-dev libxml2-dev zip unzip
RUN curl -sS https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin --filename=composer
RUN docker-php-ext-install pdo_mysql mbstring
WORKDIR /app1
COPY composer.json .
RUN composer install --no-scripts
COPY . .
CMD php artisan serve --host=0.0.0.0 --port=80


ASP.NET Core

FROM mcr.microsoft.com/dotnet/sdk:5.0 as build1
WORKDIR /app1
COPY *.csproj .
RUN dotnet restore
COPY . .
RUN dotnet publish -c Release -o out

FROM mcr.microsoft.com/dotnet/aspnet:5.0
WORKDIR /app
COPY --from=build1 /app1/out .
ENTRYPOINT ["dotnet", "PROJECT_NAME.dll"]


Kotlin

FROM gradle:7-jdk8 as build1
WORKDIR /app1
COPY . .
RUN ./gradlew build --stacktrace

FROM openjdk
WORKDIR /app
EXPOSE 80
COPY --from=build1 /app1/build/libs/PROJECT_NAME-VERSION-SNAPSHOT.jar .
CMD java -jar PROJECT_NAME-VERSION-SNAPSHOT.jar


Deno

FROM denoland/deno:1.11.0
WORKDIR /app1
COPY . .
RUN ["--run","--allow-net","app.ts"]


Deployment

For deployment you can use AWS (Elastic Container Registry, then Elastic Container Service, optionally with Fargate), Azure (Azure Container Registry and Azure Container Instances), Google Cloud (push to Container Registry and deploy with Cloud Run), or just push it to a Docker registry and pull it on the server.