2023-12-22

Benchmarking LLM models

Today we're gonna try to use ollama to generate some code. It's quite easy: you just need to install it using curl like this, then pull some models:

curl https://ollama.ai/install.sh | sh

# these models were already predownloaded before the first run
time ollama run codellama 'show me inplace mergesort using golang'
#time ollama run deepseek-coder 'show me inplace mergesort using golang'
time ollama run llama2-uncensored 'show me inplace mergesort using golang'
time ollama run mistral 'show me inplace mergesort using golang'

#time ollama run starcoder 'show me inplace mergesort using golang'
time ollama run codeup 'show me inplace mergesort using golang'
time ollama run codebooga 'show me inplace mergesort using golang'
time ollama run magicoder 'show me inplace mergesort using golang'

ollama list
NAME                     ID           SIZE    MODIFIED
codebooga:latest         05b83c5673dc 19 GB   3 seconds ago
codellama:latest         8fdf8f752f6e 3.8 GB  4 minutes ago
deepseek-coder:latest    140a485970a6 776 MB  10 minutes ago
llama2-uncensored:latest 44040b922233 3.8 GB  10 minutes ago
mistral:latest           4d9f4b269c33 4.1 GB  2 days ago
codeup:latest            54289661f7a9 7.4 GB  4 minutes ago
magicoder:latest         8007de06f5d9 3.8 GB  4 minutes ago
starcoder:latest         847e5a7aa26f 1.8 GB  10 minutes ago



Here's the result for each of them. The benchmark runs on a 32-core CPU, since my GPU is still unsupported (gfx1032):

codellama - 62.59s - will overwrite "left" since the subslices share the same backing array, but at least the structure is roughly correct

 Sure! Here is an example of how to implement in-place merge sort using Go:
```
package main

import (
        "fmt"
)

func merge(arr []int, low int, mid int, high int) {
        left := arr[low : mid+1]
        right := arr[mid+1 : high+1]

        for i := low; i <= high; i++ {
                if left[i-low] > right[i-mid] {
                        arr[i] = left[i-low]
                } else {
                        arr[i] = right[i-mid]
                }
        }
}

func inplaceMergeSort(arr []int, low int, high int) {
        if low >= high {
                return
        }

        mid := (low + high) / 2
        inplaceMergeSort(arr, low, mid)
        inplaceMergeSort(arr, mid+1, high)
        merge(arr, low, mid, high)
}

func main() {
        arr := []int{5, 3, 7, 9, 2, 4, 6, 8}
        inplaceMergeSort(arr, 0, len(arr)-1)
        fmt.Println("Sorted array:", arr)
}
```
This code sorts the input array in place using the merge sort algorithm. The `merge` function is used to combine two sorted subarrays
into a single sorted array. The `inplaceMergeSort` function recursively splits the array into smaller subarrays until each subarray
contains only one element, at which point the `merge` function is called to combine the subarrays.

The `main` function demonstrates how to use the `inplaceMergeSort` function on an example input array.

deepseek-coder - 23.53s - fast, but messy indentation, and it spouts quicksort instead of mergesort

 
Here's an in-place version of merge sort using Golang:

```go
package main
 
import (
    "fmt"
)     
    
func partition(array []int, low int , high int ) int {
        pivot := array[high] // Picks the last element as a pivot. This is done arbitrarily here for simplicity of code but in
real-world cases it should be chosen by an external algorithm like quicksort or similar with better average case performance then
mergesorts which pick random elements
        i := (low - 1)    // Index of smaller element  
        for j:=  low ;j <= high-1; j++ {     
            if array[j] < pivot{      
                i++                     // Swap arr[i+1] and arra[high], then increment i.  This is the part that makes sure all elements sma
smaller than Pivot go to left of it, while rest are in right side    // The key process is to check if current element 'array'[] has
value less
                //than pivot'. If yes,"then do nothing", else swap values and increment i.
                }                          }      return (i + 1)            }
            func merge( array []int, low int , mid int,high int ) { // The main function that sorts arr[l..r] using auxiliary space
a[]={0...n} and i is index of start element in left subarray 和 right.   pivot such as to be inserted
                // elements into correct position for final sorted array, then call itself recursively      return }            public func m
main() {          arr :=  [7,12,45689,30,-2 ,-5]         fmt.Println(array)    mergeSort(&arr[..])     // Calling the function to sort
array using mergesort
                }` `go run main.go 输出结果为[-5 -2  12  7 30 45689 ]] ) and go playground https://play.golang.org/p/-Ni_kJvKlf for running the code
```     Sorry, I can't provide you with a full in-place merge sort because it depends on your requirements (whether to print or not),
as well complexity analysis and other factors which may affect performance of an algorithm. The above is just how one might implement
such functionality within Go language using Goroutines if necessary for multi threading/parallelism, but would need a different
approach otherwise due the nature of merge sort itself being inherently sequential in operation (each element has to be processed
before it can go on). 
 

llama2-uncensored - 50.87s - err, it uses the builtin sort XD

Sure, here's an implementation of in-place mergesort using Go:
```go
package main
import "fmt"
func mergeSort(arr []int) {
 if len(arr) <= 1 {
 return
 }
 mid := len(arr) / 2
 left := arr[:mid]
 right := arr[mid:]
 fmt.Println("Merging", left, "and", right)
 sort.Ints(left)
 sort.Ints(right)
 mergeSort(left)
 merge(left, right)
}
func merge(left []int, right []int) []int {
 res := make([]int, len(left)+len(right))
 for i := 0; i < len(res); i++ {
 if left[i] <= right[i-1] {
 res[i] = left[i]
 } else {
 res[i] = right[i-1]
 }
 }
 return res
}
func main() {
 arr := []int{5, 3, 2, 1, 7, 8, 4, 6}
 fmt.Println("Original array:", arr)
 mergeSort(arr)
 fmt.Println("Merged array: ", arr)
}
```
This implementation uses the built-in `sort` package to sort the left and right halves of the original array separately, and then
merges them using a recursive function called `merge`. Finally, it applies the sorted array to the original one using the `mergeSort`
function.

mistral - 96.77s - not really in-place XD, it creates copies

 Here's an example of in-place MergeSort implementation in Golang. In this approach, we divide the given slice into smaller sub-arrays
until each sub-array contains only one element. Then we merge these sub-arrays back together to sort the original slice.

```go
func merge(arr []int, left int, mid int, right int) { // Merge function
    n1 := mid - left + 1
    n2 := right - mid

    Left := make([]int, n1)
    Right := make([]int, n2)

    for i := 0; i < n1; i++ {
        Left[i] = arr[left + i]
    }
    for j := 0; j < n2; j++ {
        Right[j] = arr[mid+1+j]
    }

    i := 0
    j := 0
    k := left

    for i < n1 && j < n2 {
        if Left[i] <= Right[j] {
            arr[k] = Left[i]
            i++
        } else {
            arr[k] = Right[j]
            j++
        }
        k++
    }

    for i < n1 { // Copy the rest of Left
        arr[k] = Left[i]
        i++
        k++
    }

    for j < n2 { // Copy the rest of Right
        arr[k] = Right[j]
        j++
        k++
    }
}

func mergeSort(arr []int, left int, right int) { // Merge sort function
    if len(arr) <= 1 {
        return
    }

    mid := (left + right) / 2

    mergeSort(arr, left, mid)
    mergeSort(arr, mid+1, right)
    merge(arr, left, mid, right)
}

func main() {
    arr := []int{12, 11, 13, 5, 6, 7}
    fmt.Println("Unsorted array:")
    printArray(arr)
    mergeSort(arr, 0, len(arr)-1)
    fmt.Println("Sorted array:")
    printArray(arr)
}

func printArray(arr []int) {
    for _, a := range arr {
        fmt.Println(a, " ")
    }
    fmt.Println()
}
```

In this example, the `mergeSort` function is called recursively on the array until each sub-array contains only one element. After
that, the merge function merges these sub-arrays back together, sorting the entire array in place. 

starcoder - 71.65s - BAD, spouting python nonsense

# golang
go run -race./cmd/test
# python
python3.7 -m pytest tests_unit --cov=src
-- spouting nonsense python code --

magicoder - 91.89s - not really in-place

 Here is a simple implementation of the in-place merge sort algorithm in Golang. It's very similar to the traditional merge sort
algorithm but works in-place without needing an additional array space:

```go
package main

import (
        "fmt"
)

func printArray(arr []int) {
        for _, value := range arr {
                fmt.Print(value, " ")
        }
        fmt.Println()
}

func mergeSort(arr []int) {
        var helper [100]int // declare an array to hold the helper values
        mergeSortRecursive(&helper[0], arr, 0, len(arr)-1)
}

func mergeSortRecursive(aux *[100]int, src []int, lo, hi int) {
        if lo >= hi {
                return // base case: array is sorted
        }

        // sort the left half
        mid := (lo + hi) / 2
        mergeSortRecursive(aux, src, lo, mid)

        // sort the right half
        mergeSortRecursive(aux, src, mid+1, hi)

        // merge the sorted halves together
        merge(&helper[0], aux, src, lo, mid, hi)
}

func merge(dst *[100]int, aux *[100]int, src []int, lo, mid, hi int) {
        i := lo
        j := mid + 1

        for k := lo; k <= hi; k++ {
                if i > mid {
                        dst[k] = src[j]
                        j++
                } else if j > hi {
                        dst[k] = src[i]
                        i++
                } else if src[i] < src[j] {
                        dst[k] = src[i]
                        i++
                } else {
                        dst[k] = src[j]
                        j++
                }
        }

        // copy back the sorted values to our original array
        for i, value := range dst[:hi-lo+1] {
                src[i+lo] = value
        }
}

func main() {
        arr := []int{37, 49, 38, 50, 3, 13}
        printArray(arr) // before sorting
        mergeSort(arr)
        printArray(arr) // after sorting
}
```
This code prints the input array, sorts it using merge sort algorithm and prints again to check if the sorting is successful.

codeup - 127.32s - uses a temporary variable, so this is just a normal mergesort; quite slow

Here is an example of an in-place mergesort implementation in Go:
```
func mergeSort(arr []int, low, high int) {
    if low < high {
        // divide the array into two halves
        mid := (low + high) / 2
        mergeSort(arr, low, mid)
        mergeSort(arr, mid+1, high)
        merge(arr, low, mid, high)
    }
}

func merge(arr []int, low, mid, high int) {
    var result []int
    left := arr[:mid]
    right := arr[mid:]
    for i := 0; i < len(left); i++ {
        result = append(result, left[i])
    }
    for j := 0; j < len(right); j++ {
        result = append(result, right[j])
    }
    return result
}
```
This code sorts the input array in place using the mergesort algorithm. The `mergeSort` function divides the array into two halves, sorts each half recursively, and then merges the two sorted halves back together using the `merge` function. The `merge` function takes four
integer parameters: the left and right indices of the two arrays being merged, and the low and high indices of the resulting array. It appends the elements of the left and right arrays to the resulting array in the correct order.

To use this code, you can simply call the `mergeSort` function with the address of the input array as an argument, like this:
```
arr := []int{1, 2, 3, 4, 5, 6}
mergeSort(arr)
fmt.Println(arr) // [1, 2, 3, 4, 5, 6]
```
This will sort the input array in place and print the resulting sorted array. Note that this implementation of mergesort has a time complexity of O(n log n), which means it will take longer to sort larger arrays.

codebooga - 490.72s - uses a temporary variable, so this is just a normal mergesort; slow AF

Here is an implementation of in-place merge sort using Go:

```go
package main

import (
        "fmt"
)

func merge(arr []int, low int, mid int, high int) {
        n1 := mid - low + 1
        n2 := high - mid

        L := make([]int, n1)
        R := make([]int, n2)

        for i := 0; i < n1; i++ {
                L[i] = arr[low+i]
        }
        for j := 0; j < n2; j++ {
                R[j] = arr[mid+1+j]
        }

        i, j, k := 0, 0, low

        for i < n1 && j < n2 {
                if L[i] <= R[j] {
                        arr[k] = L[i]
                        i++
                } else {
                        arr[k] = R[j]
                        j++
                }
                k++
        }

        for i < n1 {
                arr[k] = L[i]
                i++
                k++
        }

        for j < n2 {
                arr[k] = R[j]
                j++
                k++
        }
}

func mergeSort(arr []int, low int, high int) {
        if low < high {
                mid := (low + high) / 2
                mergeSort(arr, low, mid)
                mergeSort(arr, mid+1, high)
                merge(arr, low, mid, high)
        }
}

func inPlaceMergeSort(arr []int) {
        if len(arr) > 1 {
                mergeSort(arr, 0, len(arr)-1)
        }
}

func main() {
        arr := []int{12, 11, 13, 5, 6, 7}
        fmt.Println("Given array is: ", arr)
        inPlaceMergeSort(arr)
        fmt.Println("Sorted array is: ", arr)
}
```

This program sorts the given array using in-place merge sort algorithm. The `merge` function takes three parameters, an array `arr`, and two indices `low` and `high`. It merges the two sorted subarrays of `arr`, one starting from index `low` to `mid` (inclusive) and another from
`mid+1` to `high` (inclusive), into a single sorted subarray. The `mergeSort` function is a recursive function that sorts the array using merge sort algorithm by dividing it into two halves until each half has only one element, then calling the `merge` function to combine them
back in order. Finally, the `inPlaceMergeSort` function calls `mergeSort` on the input array and sorts it in place. The `main` function demonstrates how to use this implementation by sorting an example array.

Conclusion

So as you can see, the model size determines how fast it resolves the answer, and for now none of them are good enough to implement the in-place merge part.
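
For reference, a correct in-place merge is possible by rotating elements instead of copying to a buffer (trading extra moves for O(1) additional space). Here's a minimal sketch of that idea (my own example, not output from any of the models above):

package main

import "fmt"

// mergeInPlace merges the sorted runs arr[lo..mid] and arr[mid+1..hi]
// using O(1) extra space by rotating elements instead of using a buffer.
func mergeInPlace(arr []int, lo, mid, hi int) {
    i, j := lo, mid+1
    for i <= mid && j <= hi {
        if arr[i] <= arr[j] {
            i++
            continue
        }
        // arr[j] belongs before arr[i]: rotate arr[i..j] right by one
        v := arr[j]
        copy(arr[i+1:j+1], arr[i:j]) // copy handles overlapping slices
        arr[i] = v
        i++
        mid++
        j++
    }
}

func mergeSortInPlace(arr []int, lo, hi int) {
    if lo >= hi {
        return
    }
    mid := (lo + hi) / 2
    mergeSortInPlace(arr, lo, mid)
    mergeSortInPlace(arr, mid+1, hi)
    mergeInPlace(arr, lo, mid, hi)
}

func main() {
    arr := []int{5, 3, 7, 9, 2, 4, 6, 8}
    mergeSortInPlace(arr, 0, len(arr)-1)
    fmt.Println(arr) // [2 3 4 5 6 7 8 9]
}

The rotation makes the worst case O(n²) element moves, which is the usual price for dropping the auxiliary buffer; truly O(n log n) in-place merges exist but are far more involved.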

2023-11-27

Install Ruby 2.7 on Ubuntu 22.04

So I have a past project from 2014 that is still running until now (5 independent services: 3 written in Golang (2 of them read-only), 2 write-only services written in Ruby, 1 PostgreSQL, 1 Redis, ~45GB database / 2.7GB daily backup, 1.1GB cache, 8GB logs, <800MB compressed), and the server's RAID is already near death (the server has been alive without issue for 12 years). The new server has a freshly installed Ubuntu 22.04, and the old service always fails when run with Ruby 3.x because of library compilation issues (eg. it still depends on the outdated postgres12-dev). Here's how you can install Ruby 2.7 without rvm:

# add old brightbox PPA (must be focal, not jammy)
echo 'deb https://ppa.launchpadcontent.net/brightbox/ruby-ng/ubuntu focal main
deb-src https://ppa.launchpadcontent.net/brightbox/ruby-ng/ubuntu focal main ' > /etc/apt/sources.list.d/brightbox-ng-ruby-2.7.list

# add the signing key
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys F5DA5F09C3173AA6

# add the focal security for libssl1.1
echo "deb http://security.ubuntu.com/ubuntu focal-security main" | sudo tee /etc/apt/sources.list.d/focal-security.list

# update and install
sudo apt update
sudo apt install -y ruby2.7 ruby2.7-dev

 

that's it, now you can run gem install (with sudo for a global install) without any compilation issues as long as you have the correct C library dependencies (eg. postgresql-server-dev-12).

2023-11-24

mTLS using Golang Fiber

In this demo we're going to set up mTLS using Go and Fiber. To create certificates that can be used for mutual authentication, all you need is the OpenSSL program (or simplecert, or mkcert like in the previous natsmtls1 example): create a CA (certificate authority), server certs, and client certs, something like this:

# generate CA Root
openssl req -newkey rsa:2048 -new -nodes -x509 -days 3650 -out ca.crt -keyout ca.key -subj "/C=SO/ST=Earth/L=MyLocation/O=MyOrganiz/OU=MyOrgUnit/CN=localhost"

# generate Server Certs
openssl genrsa -out server.key 2048
# generate server Cert Signing request
openssl req -new -key server.key -days 3650 -out server.csr -subj "/C=SO/ST=Earth/L=MyLocation/O=MyOrganiz/OU=MyOrgUnit/CN=localhost"
# sign with CA Root
openssl x509  -req -in server.csr -extfile <(printf "subjectAltName=DNS:localhost") -CA ca.crt -CAkey ca.key -days 3650 -sha256 -CAcreateserial -out server.crt

# generate Client Certs
openssl genrsa -out client.key 2048
# generate client Cert Signing request
openssl req -new -key client.key -days 3650 -out client.csr -subj "/C=SO/ST=Earth/L=MyLocation/O=$O/OU=$OU/CN=localhost"
# sign with CA Root
openssl x509  -req -in client.csr -extfile <(printf "subjectAltName=DNS:localhost") -CA ca.crt -CAkey ca.key -out client.crt -days 3650 -sha256 -CAcreateserial


You will get at least 2 files related to the CA, 3 files related to the server, and 3 files related to the client, but what you really need is just the CA public key, the server private and public key (key pair), and the client private and public key (key pair). If you need to generate another client cert or roll over the server keys, you will still need the CA's private key, so don't erase it.

Next, now that you already have those 5 files, you will need to load the CA public key and the server key pair and use them in fiber, something like this:

caCertFile, _ := os.ReadFile(in.CaCrt)
caCertPool := x509.NewCertPool()
caCertPool.AppendCertsFromPEM(caCertFile)

serverCerts, _ := tls.LoadX509KeyPair(in.ServerCrt, in.ServerKey)

tlsConfig := &tls.Config{
    ClientCAs:        caCertPool,
    ClientAuth:       tls.RequireAndVerifyClientCert,
    MinVersion:       tls.VersionTLS12,
    CurvePreferences: []tls.CurveID{tls.CurveP521, tls.CurveP384, tls.CurveP256},
    CipherSuites: []uint16{
        tls.TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,
        tls.TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,
        tls.TLS_RSA_WITH_AES_256_GCM_SHA384,
        tls.TLS_RSA_WITH_AES_256_CBC_SHA,
        tls.TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,
        tls.TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,
    },
    Certificates: []tls.Certificate{serverCerts},
}

// attach the certs to TCP socket, and start Fiber server
app := fiber.New(fiber.Config{
    Immutable: true,
})
app.Get("/", func(c *fiber.Ctx) error {
    return c.String(`secured string`)
})
ln, _ := tls.Listen("tcp", `:1443`, tlsConfig)
app.Listener(ln)


next on the client side, you just need to load the CA public key and the client key pair, something like this:

caCertFile, _ := os.ReadFile(in.CaCrt)
caCertPool := x509.NewCertPool()
caCertPool.AppendCertsFromPEM(caCertFile)
certificate, _ := tls.LoadX509KeyPair(in.ClientCrt, in.ClientKey)

httpClient := &http.Client{
    Timeout: time.Minute * 3,
    Transport: &http.Transport{
        TLSClientConfig: &tls.Config{
            RootCAs:      caCertPool,
            Certificates: []tls.Certificate{certificate},
        },
    },
}

r, _ := httpClient.Get(`https://localhost:1443`)
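
To sanity-check that the server really enforces mutual TLS, a client that only trusts the CA but presents no client key pair should fail; a small sketch (the exact error message depends on the TLS version negotiated):

plainClient := &http.Client{
    Transport: &http.Transport{
        TLSClientConfig: &tls.Config{
            RootCAs: caCertPool, // trusts the server, but presents no client certificate
        },
    },
}
_, err := plainClient.Get(`https://localhost:1443`)
// err should be non-nil here, since the server is configured with RequireAndVerifyClientCert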


that's it, that's how you secure client-server communication between a Go client and server with mTLS; this code can be found here.

2023-10-14

Benchmarking docker-volume vs mount-fs vs tmpfs

So today we're gonna benchmark docker-volume (bind to a docker-managed volume), bind/mount-fs (binding to the host filesystem), and tmpfs. Which one is the fastest? Here's the docker compose file:

version: "3.7"
services:
  web:
    image: ubuntu
    command: "sleep 3600"
    volumes:
        - ./temp1:/temp1 # mountfs
        - temp2:/temp2   # dockvol
        - temp3:/temp3   # tmpfs

volumes:
  temp2:
  temp3:
    driver_opts:
      type: tmpfs
      device: tmpfs


The docker compose file is in a sibling directory of docker's data-root to ensure the same SSD is used. For the first benchmark we're gonna clone from this repository, then copy it, create 100 x 10MB files, compile, then do 2 sequential writes (small and large). Here's the result of those (some steps are not pasted below, eg. removing files when running a benchmark twice):

apt install git g++ make time
alias time='/usr/bin/time -f "\nCPU: %Us\tReal: %es\tRAM: %MKB"'

cd /temp3 # tmpfs
git clone https://github.com/nikolausmayer/file-IO-benchmark.git

### copy small files

time cp -R /temp3/file-IO-benchmark /temp2 # dockvol
CPU: 0.00s      Real: 1.02s     RAM: 2048KB

time cp -R /temp3/file-IO-benchmark /temp1 # bindfs
CPU: 0.00s      Real: 1.00s     RAM: 2048KB

### create 100 x 10MB files

cd /temp3/file*
time make data # tmpfs
CPU: 0.41s      Real: 0.91s     RAM: 3072KB

cd /temp2/file*
time make data # dockvol
CPU: 0.44s      Real: 1.94s     RAM: 2816KB

cd /temp1/file*
time make data # mountfs
CPU: 0.51s      Real: 1.83s     RAM: 2816KB

### compile

cd /temp3/file*
time make # tmpfs
CPU: 2.93s  Real: 3.23s RAM: 236640KB

cd /temp2/file*
time make # dockvol
CPU: 2.94s  Real: 3.22s RAM: 236584KB

cd /temp1/file*
time make # mountfs
CPU: 2.89s  Real: 3.13s RAM: 236300KB

### sequential small

cd /temp3 # tmpfs
time dd if=/dev/zero of=./test.img count=10 bs=200M

2097152000 bytes (2.1 GB, 2.0 GiB) copied, 0.910784 s, 2.3 GB/s

cd /temp2 # dockvol
time dd if=/dev/zero of=./test.img count=10 bs=200M

2097152000 bytes (2.1 GB, 2.0 GiB) copied, 2.26261 s, 927 MB/s

cd /temp1 # mountfs
time dd if=/dev/zero of=./test.img count=10 bs=200M
2097152000 bytes (2.1 GB, 2.0 GiB) copied, 2.46954 s, 849 MB/s

### sequential large

cd /temp3 # tmpfs
time dd if=/dev/zero of=./test.img count=10 bs=1G
10737418240 bytes (11 GB, 10 GiB) copied, 4.95956 s, 2.2 GB/s

cd /temp2 # dockvol
time dd if=/dev/zero of=./test.img count=10 bs=1G
10737418240 bytes (11 GB, 10 GiB) copied, 81.8511 s, 131 MB/s
10737418240 bytes (11 GB, 10 GiB) copied, 44.2367 s, 243 MB/s
# ^ running twice because I'm not sure why it's so slow

cd /temp1 # mountfs
time dd if=/dev/zero of=./test.img count=10 bs=1G
10737418240 bytes (11 GB, 10 GiB) copied, 12.7516 s, 842 MB/s

The conclusion is: docker volume is a bit faster (+10%) for small sequential writes, but significantly slower (-72% to -84%) for large sequential writes compared to bind/mount-fs; for the other cases there seems to be no noticeable difference. I always prefer bind/mount-fs over docker volume because of safety, for example if you accidentally run docker volume rm $(docker volume ls -q) it would delete all your docker volumes (I did this multiple times on my own dev PC); also you can easily backup/rsync/copy/manage files when using bind/mount-fs. For the other case, where you don't care about losing files and need high performance (as long as your RAM is enough), just use tmpfs.

2023-10-07

NATS: at-most once Queue / simpler networking

As you might already know from a past article, I'm a huge fan of NATS; it is one of the fastest at-most-once delivery, non-persistent message brokers. Unlike rabbitmq/lavinmq/kafka/redpanda, NATS is not a persistent queue; the persistent version is called NATS-Jetstream (but it's better to use rabbitmq/lavinmq/kafka/redpanda since they are more popular and have more integrations than NATS-Jetstream, or use a cloud provider's queue like GooglePubSub/AmazonSQS).

Features of NATS:

1. embeddable, you can embed nats directly in a go application, so you don't have to run a dedicated instance

ns, err := server.NewServer(opts)
go ns.Start()

if !ns.ReadyForConnections(4 * time.Second) {
      log.Fatal("not ready for connections")
}
nc, err := nats.Connect(ns.ClientURL())


2. high performance, old benchmark (not apples-to-apples because the rest mostly have persistence (at-least-once delivery) by default)
3. wildcard topics, a topic can contain wildcards, so you can subscribe to a topic like foo.*.bar (0-9a-zA-Z, separated with a dot . for subtopics); see the sketch after this list
4. autoreconnect (use Reconnect: -1), unlike some amqp client libraries where you have to handle reconnection manually
5. built-in top-like monitoring: nats-top -s serverHostname -m port
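
A minimal sketch of a wildcard subscription, reusing the nc connection from the snippet above (topic names are just examples):

_, err = nc.Subscribe(`sensor.*.temperature`, func(m *nats.Msg) {
    // matches sensor.room1.temperature, sensor.kitchen.temperature, etc.
    fmt.Println(m.Subject, string(m.Data))
})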

Gotchas

Since NATS does not support persistence, if there's no subscriber the message will be lost (unlike other message queues), so it behaves like Redis PubSub, not like a Redis queue; the difference is that Redis PubSub is fan-out/broadcast only, while NATS can do both broadcast and non-persistent queueing.

Use Case of NATS

1. publish/subscribe, for example to give a signal to other services. This assumes broadcast, since there's no queue.
A common pattern for this: there is an API to list items changed after a certain updatedAt, and service A wants to signal another service B that there are new/updated items, so A just broadcasts this message to NATS, and any subscriber of that topic fetches from A when it gets the signal, and periodically as a fallback.

// on publisher
err := nc.Publish(strTopic, bytMsg)

// on subscriber
_, err := nc.Subscribe(strTopic, func(m *nats.Msg) {
    _ = m.Subject
    _ = m.Data
})

// sync version
sub, err := nc.SubscribeSync(strTopic)
for {
  msg, err := sub.NextMsg(5 * time.Second)
  if err != nil { // nats.ErrTimeout if no message
    break
  }
  _ = msg.Subject
  _ = msg.Data
}

 

2. request/reply, for example to load-balance requests/autoscaling, so you can deploy anywhere, and the "worker" will catch up.
We can use QueueSubscribe with the same queueName to make sure only 1 worker handles each message; if you have 2 different queueNames, the same message will be processed by 2 different workers that subscribe to the different queueNames.

// on requester
msg, err := nc.Request(strTopic, bytMsg, 5*time.Second)
_ = msg.Data // contains []byte(`reply`) from the worker/responder

// on worker/responder
_, err := nc.QueueSubscribe(strTopic, queueName, func(m *nats.Msg) {
    _ = m.Subject
    _ = m.Data
    m.Respond([]byte("reply")) // send back to requester
})

You can see more examples here, an example of how to add otel tracing on NATS here, and how to create mTLS here.

2023-09-28

Chisel: Ngrok local-tunnel Alternative

So today we're gonna learn a tool called chisel, from the same creator as overseer, which I usually use to gracefully restart production services.

What can chisel do? It can forward traffic from a private network to a public network, for example if you have a service/port that is only accessible from your internal network and you want it to be exposed/tunneled to a server that has a public IP/is accessible from outside. This is for the case when you cannot use a reverse proxy, because the reverse proxy cannot access your private server (eg. because it is protected by a firewall or doesn't have a public IP at all).

internet --> public server <-- internet <-- private server/localhost
                           <--> tunnel <-->


You can first install chisel by running this command

go install github.com/jpillora/chisel@latest 

or download the binary directly. 

Then in the public server (server with public IP), you can do something like this:

chisel server --port 3229 --reverse

this will listen on port 3229 for tunnel requests.

On the client/private network that you want to expose to the public, you can run this command:

chisel client http://publicServerIP:3229 R:3006:127.0.0.1:3111

The command above means that the server will listen on port 3006, and any traffic that goes to that port will be forwarded to the client on port 3111.

After that you can add https, for example using caddy (don't forget to add the DNS record first so letsencrypt can get the proper certificate):

https://myWebSite.com {
  reverse_proxy localhost:3006
}

Another alternative is cloudflare tunnel, but it requires you to set up the network and other stuff on their website (not sure what they will charge you for excess traffic); there's also ngrok (the original, but now a paid service) and localtunnel (but it always dies after a few requests).

More alternatives and resources here:

2023-08-08

Free VPN on Linux

Usually I use an extension in firefox or chrome, like UrbanVPN, but now I know that Cloudflare provides a free VPN. I have a problem where my ISP always blocks DNS queries, while my work is mostly heavy on Web, DNS, Storage, and other cloud-related stuff. Normally I use DNSSec/DNSCrypt-proxy so I can bypass those restrictions, but now I know that Cloudflare warp is available on Linux; all you need to do is install it:

curl https://pkg.cloudflareclient.com/pubkey.gpg | sudo gpg --yes --dearmor --output /usr/share/keyrings/cloudflare-warp-archive-keyring.gpg
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/cloudflare-warp-archive-keyring.gpg] https://pkg.cloudflareclient.com/ $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/cloudflare-client.list
sudo apt-get update && sudo apt-get install cloudflare-warp

# setup
warp-cli register
warp-cli connect

curl https://www.cloudflare.com/cdn-cgi/trace/
# make sure warp=on

# if you no longer need it
warp-cli disconnect

I guess that's it. :3

The con is that the warp-taskbar behaves like a virus: it cannot be killed in any way, and you have to uninstall cloudflare-warp to stop it from spamming disconnect logs when you disable the service.

2023-07-29

Using Vault with Go

So today we're gonna use vault to keep the configuration of an application in memory. This makes debugging harder (since it's in memory, not on disk), but a bit more secure (if you get hacked, the attacker has to read memory to learn the credentials).

The flow of doing this is something like this:

1. Set up Vault service in separate directory (vault-server/Dockerfile):

FROM hashicorp/vault

RUN apk add --no-cache bash jq

COPY reseller1-policy.hcl /vault/config/reseller1-policy.hcl
COPY terraform-policy.hcl /vault/config/terraform-policy.hcl
COPY init_vault.sh /init_vault.sh

EXPOSE 8200

ENTRYPOINT [ "/init_vault.sh" ]

HEALTHCHECK \
    --start-period=5s \
    --interval=1s \
    --timeout=1s \
    --retries=30 \
        CMD [ "/bin/sh", "-c", "[ -f /tmp/healthy ]" ]

2. The reseller1 ("user" for the app) policy and the terraform policy (just a name, we don't use terraform here; this could be any tool that provisions/deploys the app, eg. any CD pipeline) are something like this:

# terraform-policy.hcl
path "auth/approle/role/dummy_role/secret-id" {
  capabilities = ["update"]
}

path "secret/data/dummy_config_yaml/*" {
  capabilities = ["create","update","read","patch","delete"]
}

path "secret/dummy_config_yaml/*" { # v1
  capabilities = ["create","update","read","patch","delete"]
}

path "secret/metadata/dummy_config_yaml/*" {
  capabilities = ["list"]
}

# reseller1-policy.hcl
path "secret/data/dummy_config_yaml/reseller1/*" {
  capabilities = ["read"]
}

path "secret/dummy_config_yaml/reseller1/*" { # v1
  capabilities = ["read"]
}

3. Then we need to create an init script for docker (init_vault.sh), so it can set up the required permissions when docker starts (insert policies, create the AppRole, reset the token for the provisioner), something like this:

set -e

export VAULT_ADDR='http://127.0.0.1:8200'
export VAULT_FORMAT='json'
sleep 1s
vault login -no-print "${VAULT_DEV_ROOT_TOKEN_ID}"
vault policy write terraform-policy /vault/config/terraform-policy.hcl
vault policy write reseller1-policy /vault/config/reseller1-policy.hcl
vault auth enable approle

# configure AppRole
vault write auth/approle/role/dummy_role \
    token_policies=reseller1-policy \
    token_num_uses=0 \
    secret_id_ttl="32d" \
    token_ttl="32d" \
    token_max_ttl="32d"

# overwrite token for provisioner
vault token create \
    -id="${TERRAFORM_TOKEN}" \
    -policy=terraform-policy \
    -ttl="32d"

# keep container alive
tail -f /dev/null & trap 'kill %1' TERM ; wait

4. Now that everything has been set up, we can create a docker compose file (docker-compose.yaml) to start everything with proper environment variable injection, something like this:

version: '3.3'
services:
  testvaultserver1:
    build: ./vault-server/
    cap_add:
      - IPC_LOCK
    environment:
      VAULT_DEV_ROOT_TOKEN_ID: root
      APPROLE_ROLE_ID:         dummy_app
      TERRAFORM_TOKEN:         dummyTerraformToken
    ports:
      - "8200:8200"

# run with: docker compose up 

5. Now that the vault server is already up, we can run a script (should be run by the provisioner/CD) to retrieve an AppSecret and write it to /tmp/secret, and write our app configuration (config.yaml) to the vault path with key dummy_config_yaml/reseller1/region99, something like this:

TERRAFORM_TOKEN=`cat docker-compose.yml | grep TERRAFORM_TOKEN | cut -d':' -f2 | xargs echo -n`
VAULT_ADDRESS="127.0.0.1:8200"

# retrieve secret for appsecret so dummy app can load the /tmp/secret
curl \
   --request POST \
   --header "X-Vault-Token: ${TERRAFORM_TOKEN}" \
      "${VAULT_ADDRESS}/v1/auth/approle/role/dummy_role/secret-id" > /tmp/debug

cat /tmp/debug | jq -r '.data.secret_id' > /tmp/secret

# check appsecret exists
cat /tmp/debug
cat /tmp/secret

VAULT_DOCKER=`docker ps| grep vault | cut -d' ' -f 1`

echo 'put secret'
cat config.yaml | docker exec -i $VAULT_DOCKER vault -v kv put -address=http://127.0.0.1:8200 -mount=secret dummy_config_yaml/reseller1/region99 raw=-

echo 'check secret length'
docker exec -i $VAULT_DOCKER vault -v kv get -address=http://127.0.0.1:8200 -mount=secret dummy_config_yaml/reseller1/region99 | wc -l

6. Next, we just need to create an application that will read the AppSecret (/tmp/secret) and retrieve the application config from the vault mount secret at path dummy_config_yaml/reseller1/region99, something like this:

secretID := readFile(`/tmp/secret`) // the AppSecret written by the provisioner
config := vault.DefaultConfig()
config.Address = address
client, err := vault.NewClient(config)
appRoleAuth, err := approle.NewAppRoleAuth(
    AppRoleID, // injected at compile time = `dummy_app`
    &approle.SecretID{FromString: secretID})
_, err = client.Auth().Login(context.Background(), appRoleAuth)
const configPath = `secret/data/dummy_config_yaml/reseller1/region99`
secret, err := client.Logical().Read(configPath)
data := secret.Data[`data`]
m, ok := data.(map[string]interface{})
raw, ok := m[`raw`]
rawStr, ok := raw.(string)
// error/ok handling elided for brevity

the content of rawStr read from vault will be exactly the same as config.yaml.
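
Since rawStr holds the raw YAML text, the app can unmarshal it directly; a minimal sketch, assuming gopkg.in/yaml.v3 and a hypothetical listenPort key inside config.yaml:

import (
    "fmt"
    "log"

    "gopkg.in/yaml.v3"
)

func parseConfig(rawStr string) map[string]any {
    conf := map[string]any{}
    if err := yaml.Unmarshal([]byte(rawStr), &conf); err != nil {
        log.Fatal(err)
    }
    fmt.Println(conf[`listenPort`]) // hypothetical key, depends on what config.yaml contains
    return conf
}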

This way, if a hacker already got into the system/OS/docker, they can only learn the secretId; to learn the AppRoleID and the config.yaml content they would have to analyze the process memory. Full source code can be found here.

2023-07-02

KEDA Kubernetes Event-Driven Autoscaling

Autoscaling is mostly useless if the number of hosts/nodes/hypervisors is limited, eg. we only have N nodes and we try to autoscale the services inside them, so by default we already waste a lot of unused resources (especially if the billing criteria is not like Jelastic's, ie. you pay for whatever you allocate, not whatever you utilize). Autoscaling is also quite useless if your problem is I/O-bound, not CPU-bound, for example if you don't use an autoscaled database (or whatever the I/O bottleneck is). The CPU is rarely the bottleneck in my past experience. But today we're gonna try to use KEDA to autoscale a kubernetes service. First we need to install the fastest way to get a local kube cluster (minikube):

# install minikube for local kubernetes cluster
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64 
sudo install minikube-linux-amd64 /usr/local/bin/minikube
minikube start # --driver=kvm2 or --driver=virtualbox
minikube kubectl
alias k='minikube kubectl --'
k get pods --all-namespaces

# if not using --driver=docker (default)
minikube addons configure registry-creds
Do you want to enable AWS Elastic Container Registry? [y/n]: n
Do you want to enable Google Container Registry? [y/n]: n
Do you want to enable Docker Registry? [y/n]: y
-- Enter docker registry server url: https://hub.docker.com/
-- Enter docker registry username: kokizzu
-- Enter docker registry password:
Do you want to enable Azure Container Registry? [y/n]: n
✅  registry-creds was successfully configured

 

Next we need to create a dummy container image for the pod that we want to autoscale:

# build example docker image
docker login -u kokizzu # replace with your docker hub username
docker build -t pf1 .
docker image ls pf1
# REPOSITORY   TAG       IMAGE ID       CREATED         SIZE
# pf1          latest    204670ee86bd   2 minutes ago   89.3MB

# run locally for testing
docker run -it -p 3000:3000 pf1

# tag and upload
docker image tag pf1 kokizzu/pf1:v0001
docker image push kokizzu/pf1:v0001
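
For reference, the pf1 image above wraps a tiny Fiber app that exposes Prometheus metrics on /metrics (the http_requests_total counter is what the KEDA trigger below will query). A minimal sketch of what promfiber.go might look like (this is an assumption, not the actual file from the repo):

package main

import (
    "github.com/gofiber/adaptor/v2"
    "github.com/gofiber/fiber/v2"
    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promauto"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

// counter scraped by prometheus and used by the KEDA prometheus trigger
var httpRequests = promauto.NewCounter(prometheus.CounterOpts{
    Name: "http_requests_total",
    Help: "total number of HTTP requests",
})

func main() {
    app := fiber.New()
    app.Get("/", func(c *fiber.Ctx) error {
        httpRequests.Inc()
        return c.SendString("hello")
    })
    app.Get("/metrics", adaptor.HTTPHandler(promhttp.Handler()))
    _ = app.Listen(":3000")
}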


Create deployment terraform file, something like this:

# main.tf
terraform {
  required_version = ">= 1.3.0"
  required_providers {
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "= 2.20.0"
    }
  }
  backend "local" {
    path = "/tmp/pf1.tfstate"
  }
}
provider "kubernetes" {
  config_path    = "~/.kube/config"
  # from k config view | grep -A 3 minikube | grep server:
  host           = "https://240.1.0.2:8443"
  config_context = "minikube"
}
provider "helm" {
  kubernetes {
    config_path    = "~/.kube/config"
    config_context = "minikube"
  }
}
resource "kubernetes_namespace_v1" "pf1ns" {
  metadata {
    name        = "pf1ns"
    annotations = {
      name = "deployment namespace"
    }
  }
}
resource "kubernetes_deployment_v1" "promfiberdeploy" {
  metadata {
    name      = "promfiberdeploy"
    namespace = kubernetes_namespace_v1.pf1ns.metadata.0.name
  }
  spec {
    selector {
      match_labels = {
        app = "promfiber"
      }
    }
    replicas = "1"
    template {
      metadata {
        labels = {
          app = "promfiber"
        }
        annotations = {
          "prometheus.io/path"   = "/metrics"
          "prometheus.io/scrape" = "true"
          "prometheus.io/port"   = 3000
        }
      }
      spec {
        container {
          name  = "pf1"
          image = "kokizzu/pf1:v0001" # from promfiber.go
          port {
            container_port = 3000
          }
        }
      }
    }
  }
}
resource "kubernetes_service_v1" "pf1svc" {
  metadata {
    name      = "pf1svc"
    namespace = kubernetes_namespace_v1.pf1ns.metadata.0.name
  }
  spec {
    selector = {
      app = kubernetes_deployment_v1.promfiberdeploy.spec.0.template.0.metadata.0.labels.app
    }
    port {
      port        = 33000 # no effect in minikube, will be forwarded to a random port anyway
      target_port = kubernetes_deployment_v1.promfiberdeploy.spec.0.template.0.spec.0.container.0.port.0.container_port
    }
    type = "NodePort"
  }
}
resource "kubernetes_ingress_v1" "pf1ingress" {
  metadata {
    name        = "pf1ingress"
    namespace   = kubernetes_namespace_v1.pf1ns.metadata.0.name
    annotations = {
      "kubernetes.io/ingress.class" = "nginx"
    }
  }
  spec {
    rule {
      host = "pf1svc.pf1ns.svc.cluster.local"
      http {
        path {
          path = "/"
          backend {
            service {
              name = kubernetes_service_v1.pf1svc.metadata.0.name
              port {
                number = kubernetes_service_v1.pf1svc.spec.0.port.0.port
              }
            }
          }
        }
      }
    }
  }
}
resource "kubernetes_config_map_v1" "prom1conf" {
  metadata {
    name      = "prom1conf"
    namespace = kubernetes_namespace_v1.pf1ns.metadata.0.name
  }
  data = {
    # from https://github.com/techiescamp/kubernetes-prometheus/blob/master/config-map.yaml
    "prometheus.yml" : <<EOF
global:
  scrape_interval: 15s
  evaluation_interval: 15s
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
rule_files:
  #- /etc/prometheus/prometheus.rules
scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "pf1"
    static_configs:
      - targets: [
          "${kubernetes_ingress_v1.pf1ingress.spec.0.rule.0.host}:${kubernetes_service_v1.pf1svc.spec.0.port.0.port}"
        ]
EOF
    # need to delete stateful set if this changed after terraform apply
    # or kubectl rollout restart statefulset prom1stateful -n pf1ns
    # because statefulset pod not restarted automatically when changed
    # if configmap set as env or config file
  }
}
resource "kubernetes_persistent_volume_v1" "prom1datavol" {
  metadata {
    name = "prom1datavol"
  }
  spec {
    access_modes = ["ReadWriteOnce"]
    capacity     = {
      storage = "1Gi"
    }
    # do not add storage_class_name or it would stuck
    persistent_volume_source {
      host_path {
        path = "/tmp/prom1data" # mkdir first?
      }
    }
  }
}
resource "kubernetes_persistent_volume_claim_v1" "prom1dataclaim" {
  metadata {
    name      = "prom1dataclaim"
    namespace = kubernetes_namespace_v1.pf1ns.metadata.0.name
  }
  spec {
    # do not add storage_class_name or it would stuck
    access_modes = ["ReadWriteOnce"]
    resources {
      requests = {
        storage = "1Gi"
      }
    }
  }
}
resource "kubernetes_stateful_set_v1" "prom1stateful" {
  metadata {
    name      = "prom1stateful"
    namespace = kubernetes_namespace_v1.pf1ns.metadata.0.name
    labels    = {
      app = "prom1"
    }
  }
  spec {
    selector {
      match_labels = {
        app = "prom1"
      }
    }
    template {
      metadata {
        labels = {
          app = "prom1"
        }
      }
      # example: https://github.com/mateothegreat/terraform-kubernetes-monitoring-prometheus/blob/main/deployment.tf
      spec {
        container {
          name  = "prometheus"
          image = "prom/prometheus:latest"
          args  = [
            "--config.file=/etc/prometheus/prometheus.yml",
            "--storage.tsdb.path=/prometheus/",
            "--web.console.libraries=/etc/prometheus/console_libraries",
            "--web.console.templates=/etc/prometheus/consoles",
            "--web.enable-lifecycle",
            "--web.enable-admin-api",
            "--web.listen-address=:10902"
          ]
          port {
            name           = "http1"
            container_port = 10902
          }
          volume_mount {
            name       = kubernetes_config_map_v1.prom1conf.metadata.0.name
            mount_path = "/etc/prometheus/"
          }
          volume_mount {
            name       = "prom1datastorage"
            mount_path = "/prometheus/"
          }
          #security_context {
          #  run_as_group = "1000" # because /tmp/prom1data is owned by 1000
          #}
        }
        volume {
          name = kubernetes_config_map_v1.prom1conf.metadata.0.name
          config_map {
            default_mode = "0666"
            name         = kubernetes_config_map_v1.prom1conf.metadata.0.name
          }
        }
        volume {
          name = "prom1datastorage"
          persistent_volume_claim {
            claim_name = kubernetes_persistent_volume_claim_v1.prom1dataclaim.metadata.0.name
          }
        }
      }
    }
    service_name = ""
  }
}
resource "kubernetes_service_v1" "prom1svc" {
  metadata {
    name      = "prom1svc"
    namespace = kubernetes_namespace_v1.pf1ns.metadata.0.name
  }
  spec {
    selector = {
      app = kubernetes_stateful_set_v1.prom1stateful.spec.0.template.0.metadata.0.labels.app
    }
    port {
    port        = 10902 # no effect in minikube, will be forwarded to a random port anyway
      target_port = kubernetes_stateful_set_v1.prom1stateful.spec.0.template.0.spec.0.container.0.port.0.container_port
    }
    type = "NodePort"
  }
}
resource "helm_release" "pf1keda" {
  name       = "pf1keda"
  repository = "https://kedacore.github.io/charts"
  chart      = "keda"
  namespace  = kubernetes_namespace_v1.pf1ns.metadata.0.name
  # uninstall: https://keda.sh/docs/2.11/deploy/#helm
}
# run with this commented first, then uncomment
## from: https://www.youtube.com/watch?v=1kEKrhYMf_g
#resource "kubernetes_manifest" "scaled_object" {
#  manifest = {
#    "apiVersion" = "keda.sh/v1alpha1"
#    "kind"       = "ScaledObject"
#    "metadata"   = {
#      "name"      = "pf1scaledobject"
#      "namespace" = kubernetes_namespace_v1.pf1ns.metadata.0.name
#    }
#    "spec" = {
#      "scaleTargetRef" = {
#        "apiVersion" = "apps/v1"
#        "name"       = kubernetes_deployment_v1.promfiberdeploy.metadata.0.name
#        "kind"       = "Deployment"
#      }
#      "minReplicaCount" = 1
#      "maxReplicaCount" = 5
#      "triggers"        = [
#        {
#          "type"     = "prometheus"
#          "metadata" = {
#            "serverAddress" = "http://prom1svc.pf1ns.svc.cluster.local:10902"
#            "threshold"     = "100"
#            "query"         = "sum(irate(http_requests_total[1m]))"
#            # with or without {service=\"promfiber\"} is the same since 1 service 1 pod in our case
#          }
#        }
#      ]
#    }
#  }
#}


terraform init # download dependencies
terraform plan # check changes
terraform apply # deploy
terraform apply # run again after uncommenting the scaled_object part above

k get pods --all-namespaces -w # check deployment
NAMESPACE NAME                          READY STATUS  RESTARTS AGE
keda-admission-webhooks-xzkp4          1/1    Running 0        2m2s
keda-operator-r6hsh                    1/1    Running 1        2m2s
keda-operator-metrics-apiserver-xjp4d  1/1    Running 0        2m2s
promfiberdeploy-868697d555-8jh6r       1/1    Running 0        3m40s
prom1stateful-0                        1/1    Running 0        22s

k get services --all-namespaces
NAMESP NAME     TYPE     CLUSTER-IP     EXTERNAL-IP PORT(S) AGE
pf1ns  pf1svc   NodePort 10.111.141.44  <none>      33000:30308/TCP 2s
pf1ns  prom1svc NodePort 10.109.131.196 <none>      10902:30423/TCP 6s

minikube service list
|------------|--------------|-------------|------------------------|
|  NAMESPACE |    NAME      | TARGET PORT |          URL           |
|------------|--------------|-------------|------------------------|
| pf1ns      | pf1svc       |       33000 | http://240.1.0.2:30308 |
| pf1ns      | prom1service |       10902 | http://240.1.0.2:30423 |
|------------|--------------|-------------|------------------------|

 

To debug if something goes wrong, you can use something like this:


# debug inside container, replace with pod name
k exec -it pf1deploy-77bf69d7b6-cqqwq -n pf1ns -- bash

# or use dedicated debug pod
k apply -f debug-pod.yml
k exec -it debug-pod -n pf1ns -- bash
# delete if done using
k delete pod debug-pod -n pf1ns


# install debugging tools
apt update
apt install curl iputils-ping dnsutils net-tools # dig and netstat

 

To check the metrics that we want to use for the autoscaler, we can check from multiple places:

# check metrics inside pod
curl http://pf1svc.pf1ns.svc.cluster.local:33000/metrics

# check metrics from outside
curl http://240.1.0.2:30308/metrics

# or open from prometheus UI:
http://240.1.0.2:30423

# get metrics
k get --raw "/apis/external.metrics.k8s.io/v1beta1"
{"kind":"APIResourceList","apiVersion":"v1","groupVersion":"external.metrics.k8s.io/v1beta1","resources":[{"name":"externalmetrics","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]}]}

# get scaled object
k get scaledobject pf1keda -n pf1ns
NAME      SCALETARGETKIND      SCALETARGETNAME   MIN   MAX   TRIGGERS     AUTHENTICATION   READY   ACTIVE   FALLBACK   PAUSED    AGE
pf1keda   apps/v1.Deployment   pf1deploy         1     5     prometheus                    True    False    False      Unknown   3d20h

# get metric name
k get scaledobject pf1keda -n pf1ns -o 'jsonpath={.status.externalMetricNames}'
["s0-prometheus-prometheus"]

 

Next we can do loadtest while watching pods:

# do loadtest
hey -c 100 -n 100000 http://240.1.0.2:30308

# check with kubectl get pods -w -n pf1ns, it would spawn:
promfiberdeploy-96qq9  0/1     Pending             0  0s
promfiberdeploy-j5qw9  0/1     Pending             0  0s
promfiberdeploy-96qq9  0/1     Pending             0  0s
promfiberdeploy-76pvt  0/1     Pending             0  0s
promfiberdeploy-76pvt  0/1     Pending             0  0s
promfiberdeploy-j5qw9  0/1     Pending             0  0s
promfiberdeploy-96qq9  0/1     ContainerCreating   0  0s
promfiberdeploy-76pvt  0/1     ContainerCreating   0  0s
promfiberdeploy-j5qw9  0/1     ContainerCreating   0  0s
promfiberdeploy-96qq9  1/1     Running             0  1s
promfiberdeploy-j5qw9  1/1     Running             0  1s
promfiberdeploy-76pvt  1/1     Running             0  1s
...
promfiberdeploy-j5qw9  1/1     Terminating         0  5m45s
promfiberdeploy-96qq9  1/1     Terminating         0  5m45s
promfiberdeploy-gt2h5  1/1     Terminating         0  5m30s
promfiberdeploy-76pvt  1/1     Terminating         0  5m45s

# all events includes scale up event
k get events -n pf1ns -w
21m    Normal ScalingReplicaSet deployment/promfiberdeploy Scaled up replica set promfiberdeploy-868697d555 to 1
9m20s  Normal ScalingReplicaSet deployment/promfiberdeploy Scaled up replica set promfiberdeploy-868697d555 to 4 from 1
9m5s   Normal ScalingReplicaSet deployment/promfiberdeploy Scaled up replica set promfiberdeploy-868697d555 to 5 from 4
3m35s  Normal ScalingReplicaSet deployment/promfiberdeploy Scaled down replica set promfiberdeploy-868697d555 to 1 from 5

That's it, that's how you use KEDA and terraform to autoscale deployments. The key parts of the .tf file are:

  • terraform - needed to let terraform know which plugins are being used on terraform init
  • kubernetes and helm - needed to know which config is being used, and which cluster is being contacted
  • kubernetes_namespace_v1 - to create a namespace (eg. per tenant)
  • kubernetes_deployment_v1 - to set which pod is being used and which docker container to be used
  • kubernetes_service_v1 - to expose a port on the node (in this case only NodePort), to load-balance between pods
  • kubernetes_ingress_v1 - should be used to route requests to the proper services, but since we only have 1 service and we use minikube (which has its own forwarding) this one is not used in our case
  • kubernetes_config_map_v1 - used to bind a config file (volume) for the prometheus deployment; this sets where to scrape the service. This is NOT the proper way to do this, the proper way is in the latest commit of that repository, using PodMonitor from prometheus-operator:
    • kubernetes_service_v1 - to expose the global prometheus (that monitors the whole kubernetes cluster, not per namespace)
    • kubernetes_service_account_v1 - creates a service account so prometheus in the namespace can retrieve the pods list
    • kubernetes_cluster_role_v1 - role to allow listing pods
    • kubernetes_cluster_role_binding_v1 - binds the service account with the role above
    • kubernetes_manifest - creates the PodMonitor kubernetes manifest, these are the generated rules for prometheus in the namespace to match specific pods
    • kubernetes_manifest - creates the Prometheus manifest that deploys prometheus in the specific namespace
  • kubernetes_persistent_volume_v1 and kubernetes_persistent_volume_claim_v1 - used to bind the data directory (volume) to the prometheus deployment
  • kubernetes_stateful_set_v1 - to deploy prometheus; since it's not a stateless service, we have to bind a data volume to prevent data loss
  • kubernetes_service_v1 - to expose the prometheus port to the outside
  • helm_release - to deploy keda
  • kubernetes_manifest - to create a custom manifest, since ScaledObject is not supported by the kubernetes terraform provider; this configures which service is able to be autoscaled

If you need the source code, you can take a look at the terraform1 repo; the latest version is using PodMonitor.


2023-06-14

Simple Websocket Echo Benchmark

Today we're gonna benchmark nodejs/bun+uWebSockets against golang+nbio. The code is here, both taken from their own examples. The benchmark plan is: create 10k clients/connections, send both text/binary strings, receive them back from the server, then sleep for 1s, 100ms, 10ms, or 1ms; the result is as expected:
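
For context, the server side under test is essentially just a websocket echo handler. Here's a minimal sketch of the idea using gorilla/websocket (not the actual nbio/uWebSockets/Bun code used in the benchmark):

package main

import (
    "log"
    "net/http"

    "github.com/gorilla/websocket"
)

var upgrader = websocket.Upgrader{CheckOrigin: func(r *http.Request) bool { return true }}

func echo(w http.ResponseWriter, r *http.Request) {
    c, err := upgrader.Upgrade(w, r, nil)
    if err != nil {
        return
    }
    defer c.Close()
    for {
        mt, msg, err := c.ReadMessage() // text or binary frame from the client
        if err != nil {
            return
        }
        if err := c.WriteMessage(mt, msg); err != nil { // echo it back unchanged
            return
        }
    }
}

func main() {
    http.HandleFunc("/ws", echo)
    log.Fatal(http.ListenAndServe(":3000", nil))
}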

go 1.20.5 nbio 1.3.16
rps: 19157.16 avg/max latency = 1.66ms/319.88ms elapsed 10.2s
102 MB 0.6 core usage
rps: 187728.05 avg/max latency = 0.76ms/167.76ms elapsed 10.2s
104 MB 5 core usage
rps: 501232.80 avg/max latency = 12.48ms/395.01ms elapsed 10.1s
rps: 498869.28 avg/max latency = 12.67ms/425.04ms elapsed 10.1s
134 MB 15 core usage

bun 0.6.9
rps: 17420.17 avg/max latency = 5.57ms/257.61ms elapsed 10.1s
48 MB 0.2 core usage
rps: 95992.29 avg/max latency = 29.93ms/242.74ms elapsed 10.4s
rps: 123589.91 avg/max latency = 40.67ms/366.15ms elapsed 10.2s
rps: 123171.42 avg/max latency = 62.74ms/293.29ms elapsed 10.1s
55 MB 1 core usage

node 18.16.0
rps: 18946.51 avg/max latency = 6.64ms/229.28ms elapsed 10.3s
59 MB 0.2 core usage
rps: 97032.08 avg/max latency = 44.06ms/196.41ms elapsed 11.1s
rps: 114449.91 avg/max latency = 72.62ms/295.33ms elapsed 10.3s
rps: 109512.05 avg/max latency = 79.27ms/226.03ms elapsed 10.2s
59 MB 1 core usage


The first through fourth lines are with 1s, 100ms, 10ms, and 1ms delay before the next request. Since Golang/nbio can utilize multiple cores by default, it can handle ~50 rps per client, while Bun/Nodejs handle 11-12 rps per client. If you find a bug, or want to contribute another language (or create a better client), just create a pull request on the github link above.

2023-05-17

Dockerfile vs Nixpacks vs ko

A Dockerfile is quite simple: first we need to pick the base image for the build phase (only if you want to build inside docker; if you already have CI/CD that builds it outside, you just need to copy the executable binary directly), put in the build step commands, choose a runtime image for the run stage (popular ones like ubuntu/debian have a bunch of debugging tools, alpine/busybox for a stripped-down one), copy the binary to that layer, and done.

 
FROM golang:1.20 as build1
WORKDIR /app1
# if you don't use go mod vendor
#COPY go.mod .
#COPY go.sum .
#RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o app1.exe

FROM busybox:latest
WORKDIR /
COPY --from=build1 /etc/ssl/certs /etc/ssl/certs
COPY --from=build1 /app1/app1.exe .
CMD ./app1.exe
 

then run the docker build and docker run command:

# build
docker build . -t app0
[+] Building 76.2s (15/15) FINISHED -- first time, without vendor
[+] Building 9.5s (12/12) FINISHED -- changing code, rebuild, with go mod vendor

# run
docker run -it app0

with nixpacks you just need to run this without having to create a Dockerfile (as long as there's a main.go file):

# install nixpack
curl -sSL https://nixpacks.com/install.sh | bash

# build
nixpacks build . --name app1
[+] Building 315.7s (19/19) FINISHED -- first time build
[+] Building 37.2s (19/19) FINISHED -- changing code, rebuild

# run
docker run -it app1

With ko

# install ko
go install github.com/google/ko@latest

# build
time ko build -L -t app2
CPU: 0.84s      Real: 5.05s     RAM: 151040KB

# run (have to do this since the image name is hashed)
docker run -it `docker image ls | grep app2 | cut -d ' ' -f 1`

How about container image size? The Dockerfile with busybox only uses 14.5MB, with ubuntu 82.4MB, debian 133MB, alpine 15.2MB; with nixpacks it uses 99.2MB, and with ko it only takes 11.5MB but it only supports Go (and you cannot debug inside it, eg. for testing connectivity to a 3rd party dependency using a shell inside the container). So is it better to use nixpacks? I don't think so, both build speed and image size in this case are inferior compared to a normal Dockerfile with busybox, or ko.

2023-04-26

GeoSearch Database Benchmark

So today we're gonna benchmark databases that can store latitude-longitude (GPS coordinates). The benchmark spec: unbatched INSERT of 100K records of (id, lat, long) tuples; search 200K times (or until the deadline is reached) for the 500 nearest points, returning id and distance; move 100 points to another location 50 times; all benchmarks done with 16 threads (so the other 16 threads can be used by the database).

The contender that already benchmarked are:

  1. Redis GEOSEARCH (2024: also add KeyDB 4 core)
  2. PostgreSQL cube/earthdistance
  3. Tarantool RTREE
  4. TypeSense geosearch
  5. MeiliSearch _geo

Other databases were attempted but failed because they are not truly redis-compatible: DragonFlyDB, Garnet, KVRocks

Here's the result for unbatched insert:

REDIS         9.4 sec,  10639.7 rps
KEYDB         1.3 sec,  76546.2 rps
POSTGRES     10.5 sec,   9523.7 rps
TARANTOOL     0.8 sec, 126801.9 rps
TYPESENSE    96.7 sec,   1023.8 rps
MEILISEARCH 365.0 sec,    271.2 rps


This benchmark is totally unfair for MeiliSearch since their API expects inserts to be batched, just like Clickhouse.

Next, the 500-nearest points-of-interest search benchmark:

REDIS        30676  (15.3%) in 50.0 sec,  613.5 rps
KEYDB         7381   (3.7%) in 50.0 sec,  147.6 rps
POSTGRES      7778   (3.9%) in 50.0 sec,  155.6 rps
TARANTOOL   200000 (100.0%) in 35.0 sec, 5716.3 rps
TYPESENSE     2177   (1.1%) in 50.0 sec,   43.5 rps


PostgreSQL is actually quite fast, at Tarantool's level, if the distance ordering is removed (random order, only fetch the first 500); I believe the search scope is too wide (a 4 km square), causing too many points to be sorted by distance (and the database config is default, not tuned at all, which contributes to the slowdown). TypeSense cannot search with more than a 250-row limit, so that result is for 250; MeiliSearch in this case always returns 0 results, not sure what's wrong.
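
For context, the kind of cube/earthdistance query this benchmark exercises on PostgreSQL looks roughly like the sketch below; the points table, its lat/lon columns, and the connection string are assumptions for illustration, not the actual benchmark code. The ORDER BY over every candidate inside the wide bounding box is the part that gets expensive:

package main

import (
  "database/sql"
  "fmt"
  "log"

  _ "github.com/lib/pq" // assumption: any PostgreSQL driver would do
)

func main() {
  // assumption: a points(id, lat, lon) table, with the cube + earthdistance
  // extensions enabled and a GiST index on ll_to_earth(lat, lon)
  db, err := sql.Open("postgres", "postgres://user:pass@localhost/geo?sslmode=disable")
  if err != nil {
    log.Fatal(err)
  }
  defer db.Close()

  const qLat, qLon, radiusM = -6.2, 106.8, 4000.0 // ~4 km box around the query point

  // earth_box narrows the candidates via the index,
  // then ORDER BY exact distance sorts everything left inside the box
  rows, err := db.Query(`
    SELECT id, earth_distance(ll_to_earth(lat, lon), ll_to_earth($1, $2)) AS dist
    FROM points
    WHERE earth_box(ll_to_earth($1, $2), $3) @> ll_to_earth(lat, lon)
    ORDER BY dist
    LIMIT 500`, qLat, qLon, radiusM)
  if err != nil {
    log.Fatal(err)
  }
  defer rows.Close()

  for rows.Next() {
    var id int64
    var dist float64
    if err := rows.Scan(&id, &dist); err != nil {
      log.Fatal(err)
    }
    fmt.Println(id, dist)
  }
}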

The last benchmark is about moving 10 points 50 times, so the index needs to be updated periodically; the results are:

REDIS        0.1 sec,  55088.4 rps
KEYDB        0.2 sec,  28926.9 rps
POSTGRES     0.6 sec,   7954.3 rps
TARANTOOL    0.0 sec, 137424.8 rps
TYPESENSE    9.7 sec,    515.2 rps
MEILISEARCH  8.8 sec,    569.7 rps

So I guess I'll use Tarantool for this case, since it's the fastest for geo datapoints with persistence. Other possible databases to benchmark in the future are: MySQL, CockroachDB, ElasticSearch, TiDB, CouchBase, PostGIS, but I'm not sure whether their indexes can beat Tarantool's. If you want to contribute to this benchmark you can create a PR to the hugedbbench repo.


2023-04-18

How to use DNS SDK in Golang

So we're gonna try to manipulate DNS records using a Go SDK (not the REST API directly). I went through the first 2 pages of Google search results, and the companies providing an SDK for Go were:

  1. IBM networking-go-sdk - 161.26.0.10 and 161.26.0.11 - timed out resolving their own website
  2. AWS route53 - 169.254.169.253 - timed out resolving their own website
  3. DNSimple dnsimple-go - 162.159.27.4 and 199.247.155.53 - 160-180ms and 70-75ms from SG
  4. Google googleapis - 8.8.8.8 and 8.8.4.4 - 0ms for both from SG
  5. GCore gcore-dns-sdk-go - 199.247.155.53 and 2.56.220.2 - 0ms and 0-171ms (171ms on first hit only, the rest is 0ms) from SG

I've used the Google SDK before for non-DNS stuff; it's a bit too raw and has so many required steps. You have to create a project, enable the API, create a service account, set permissions for that account, download credentials.json, then hit it using their SDK -- not really straightforward. So today we're gonna try G-Core's DNS; apparently it's very easy, you just need to visit their website and sign up, then Profile > API Tokens > Create Token, and copy it to some file (for example: a .token file).

This is an example of how you can create a zone, add an A record, and delete everything:

package main

import (
  "context"
  _ "embed"
  "strings"
  "time"

  "github.com/G-Core/gcore-dns-sdk-go"
  "github.com/kokizzu/gotro/L"
)

//go:embed .token
var apiToken string

func main() {
  apiToken = strings.TrimSpace(apiToken)

  // init SDK
  sdk := dnssdk.NewClient(dnssdk.PermanentAPIKeyAuth(apiToken), func(client *dnssdk.Client) {
    client.Debug = true
  })
  ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
  defer cancel()

  const zoneName = `benalu2.dev`

  // create zone
  _, err := sdk.CreateZone(ctx, zoneName)
  if err != nil && !strings.Contains(err.Error(), `already exists`) {
    L.PanicIf(err, `sdk.CreateZone`)
  }

  // get zone
  zoneResp, err := sdk.Zone(ctx, zoneName)
  L.PanicIf(err, `sdk.Zone`)
  L.Describe(zoneResp)

  // add A record
  err = sdk.AddZoneRRSet(ctx,
    zoneName,        // zone
    `www.`+zoneName, // name
    `A`,             // rrtype
    []dnssdk.ResourceRecord{
      {
        // https://apidocs.gcore.com/dns#tag/rrsets/operation/CreateRRSet
        Content: []any{
          `194.233.65.174`,
        },
      },
    },
    120, // TTL
  )
  L.PanicIf(err, `AddZoneRRSet`)

  // get A record
  rr, err := sdk.RRSet(ctx, zoneName, `www.`+zoneName, `A`)
  L.PanicIf(err, `sdk.RRSet`)
  L.Describe(rr)

  // delete A record
  err = sdk.DeleteRRSet(ctx, zoneName, `www.`+zoneName, `A`)
  L.PanicIf(err, `sdk.DeleteRRSet`)

  // delete zone
  err = sdk.DeleteZone(ctx, zoneName)
  L.PanicIf(err, `sdk.DeleteZone`)
}

The full source code repo is here. Apparently it's very easy to manipulate DNS records using their SDK; after adding records programmatically, all I need to do is delegate (set the authoritative nameservers) to their NS: ns1.gcorelabs.net and ns2.gcdn.services.

In my case, because I bought the domain name on Google Domains, I just need to change this:

 
Then just wait for it to be delegated properly (until all DNS servers that are still caching the old authoritative NS have cleared up), and I guess that's it.
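
To check whether the delegation has already propagated (at least from the point of view of the resolver your machine uses), a minimal Go sketch like the one below works; it just prints whatever NS records the resolver currently returns for the example zone above:

package main

import (
  "fmt"
  "net"
)

func main() {
  // ask the configured resolver which nameservers it returns for the zone
  nss, err := net.LookupNS(`benalu2.dev`)
  if err != nil {
    fmt.Println(`lookup failed:`, err)
    return
  }
  for _, ns := range nss {
    // expect ns1.gcorelabs.net. and ns2.gcdn.services. once delegated
    fmt.Println(ns.Host)
  }
}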

2023-02-05

Lua Tutorial, Example, Cheatsheet

Lua is one of the most popular embeddable languages (other than JavaScript, which is already embedded in the browser). There are a lot of products that embed Lua as their scripting language, probably because the language and the C API are simple, the VM size is small, and there's a project called LuaJIT that is currently the fastest JIT implementation for a scripting language. The current version of Lua is 5.4, but LuaJIT only supports 5.1 and parts of 5.2. There are a lot of products that are built with Lua embedded inside.

Variables, Data Types, Comments, basic stdlib Functions

 
Today we're going to learn a bit about Lua syntax. To run a Lua script, just run lua yourscript.lua on the terminal, or using luajit is also fine:

--[[ this is multiline comment
  types in Lua: boolean, number, string
    table, function, nil, thread, userdata
]]
-- this is single line comment, just like SQL

-- create variable
local foo = 'test' -- create a string, can escape \ same as with ""
local bar = 1 -- create a number

-- random value
math.randomseed(os.time()) -- seed the random number generator
math.random() -- get random float between 0 and 1
math.random(10) -- get random int between 1 and 10

-- print, concat
print(#foo) -- get length of string, same as string.len(foo)

print(foo .. 123) -- concat string, will convert non-string to string
print(1 + 2 ^ 3 * 4, 5) -- print 33 5 separated with tab

-- type and conversion
type(bar) -- get type (number)
tostring(1) -- convert to string
tonumber("123") -- convert to number

-- multiline string
print([[foo
bar]]) -- print a string, but preserve newlines, will not escape \

-- string functions
string.upper('abc') -- return uppercased string
string.sub('abcde', 2,4) -- get 2nd to 4th char (bcd)
string.char(65) -- return the character for a byte value ('A')
string.byte('A') -- return the byte value of a character (65)
string.rep('x', 5, ' ') -- repeat x 5 times separated with space
string.format('%.2f %d %i', math.pi, 1, 4) -- same as C's sprintf
local s, e = string.find('haystack', 'st') -- find start and end index of needle in haystack, nil if not found ('end' is a reserved word, so use another name)
string.gsub('abc','b','d') -- global substitution

Decision and Loop

There's one syntax for decisions (if), and 3 syntaxes for loops (for, while-do, repeat-until) in Lua:

-- decision
if false == true then
  print(1)
elseif 15 then -- truthy
  print( 15 ~= 0 ) -- not equal, in another language: !=
else
  print('aaa')
end

-- can be on a single line (x must be defined first, comparing nil with a number errors)
local x = 1
if (x > 0) or (x < 0) then print(not true) end

-- counter loop
for i = 1, 5 do print(i) end -- 1 2 3 4 5
for i = 1, 4, 2 do print(i) end -- 1 3
for i = 4, 1, -2 do print(i) end -- 4 2

-- iterating array
local arr = {1, 4, 9} -- table
for i = 1, #arr do print(arr[i]) end

-- top-checked loop
while true do
  break -- break loop
end

-- bottom-checked loop
repeat
until false

IO, File, and OS

IO is for input/output, OS is for operating system functions:

-- IO
local line = io.read() -- read until newline ('in' is a reserved word, so use another name)
io.write('foo') -- like print() but without newline
io.output('bla.txt') -- set bla.txt as stdout
io.close() -- make sure stdout is closed
io.input('foo.txt') -- set foo.txt as stdin
io.read(4) -- read 4 characters from stdin
io.read('*number') -- read as number
io.read('*line') -- read until newline
io.read('*all') -- read everything
local f = io.open('file.txt', 'w') -- create for write, a=append, r=read
f:write('bla') -- write to file
f:close() -- flush and close file

-- OS
os.time({year = 2023, month = 2, day = 4, hour = 12, min = 23, sec = 34})
os.getenv('PATH') -- get environment variable's value
os.rename('a.txt','b.txt') -- rename a.txt to b.txt
os.remove('a.txt') -- erase file
os.execute('echo 1')  -- execute shell command
os.clock() -- CPU time used by the program in seconds (float)
os.exit() -- exit lua script

Table

A table is a combination of list/dictionary/set/record/object; it can store anything, including a function.

-- as list
local arr = {1, true, "yes"} -- arr[4] is nil, #arr is 3

-- mutation functions
table.sort(arr) -- error: cannot sort if elements are of different types
table.insert(arr, 2, 3.4) -- insert on 2nd position, shifting the rest
table.remove(arr, 3) -- remove 3rd element, shifting remaining left

-- non mutating
table.concat(arr, ' ') -- return elements joined with spaces (errors on boolean/nil elements)

-- nested
local mat = {
  {1,2,3},
  {4,5,6}, -- can end with comma
}

-- table as dictionary and record
local user = {
  name = 'Yui',
  age = 23,
}
user['salutations'] = 'Ms.' -- or user.salutations

Function

A function is a block of code that contains logic.

-- function can receive parameter
local function bla(x)
  x = x or 'the default value'
  local y = 1 -- local scope variable
  return x
end

-- function can receive multivalue
local function foo()
  return 1,2
end
local x,y = foo()
local x = foo() -- only receive first one

-- function with internal state
local function lmd()
  local counter = 0
  return function()
    counter = counter + 1
    return counter
  end
end
local x = lmd()
print(x()) -- 1
print(x()) -- 2

-- variadic arguments
local function vargs(...)
  for k, v in pairs({...}) do print(k, v) end
end
vargs('a','b') --  1 a, 2 b 

Coroutine

A coroutine is a resumable function:

local rot1 = coroutine.create(function()
  for i = 1, 10 do
    print(i)
    if i == 5 then coroutine.yield() end -- suspend
  end
end)
if coroutine.status(rot1) == 'suspended' then
  -- running, [suspended], normal, dead
  coroutine.resume(rot1) -- starts/continues the coroutine; here it runs until the yield at i == 5
end

Module

To organize code, we can use modules:

-- module, eg. mod1.lua
_G.mod1 = {}
function mod1.add(x,y)
  return x + y
end
return mod1

-- import a module
local mod1 = require('mod1')
local x = mod1.add(1,2)

Basic OOP

To simulate OOP-like features we can do things like in JavaScript, except that to call a method of an object we can use a colon instead of a dot (so the self parameter will be passed automatically):

-- constructor
local function NewUser(name, age)
  return {
    name = name,
    age = age,
    show = function(self) -- method example
      print(self.name, self.age)
    end
  }
end
local me = NewUser('Tzuyu', 21)
me:show() -- short version of: me.show(me)

-- inheritance, overriding
local function NewAdmin(name, age)
  local usr = NewUser(name, age)
  usr.isAdmin = true
  usr.setPerm = function(self, perms)
     self.perms = perms
  end
  local parent_show = usr.show
  usr.show = function(self)
    -- override parent method
    -- parent_show(self) to call parent
  end
  return usr
end
local adm = NewAdmin('Kis', 37)
adm:setPerm({'canErase','canAddNewAdmin'})
adm:show() -- does nothing

-- override operator
local obj = {whatever = 1}
setmetatable(obj, {
  __add = function(x, y) -- overrides the + operator
    return x.whatever + tonumber(y)
  end,
  -- __sub, __mul, __div, __mod, __pow,
  -- __concat, __len, __eq, __lt, __le, __gt, __ge
})
print(obj + "123")

For more information about modules you can visit https://luarocks.org/ 

For more information about the syntax you can check their doc: https://devdocs.io/lua~5.2/ or https://learnxinyminutes.com/docs/lua/ or this 220 minutes video

But as usual, don't use a dynamically typed language for a large project; it will be painful to maintain in the long run, especially if the IDE cannot jump properly to the correct method or cannot suggest the correct type hints/methods to call.

2022-12-24

CockroachDB Benchmark on Different Disk Types

Today we're going to benchmark CockroachDB, one of the databases that I used this year to create an embedded application. I use CockroachDB because I don't want to use SQLite or any other embedded database that lacks tooling or cannot be accessed by multiple programs at the same time. With CockroachDB I only need to distribute my application binary and the cockroach binary, and that's it; the offline backup is also quite simple, I just need to rsync the directory, or do a manual row export like with other PostgreSQL-like databases. Scaling out is also quite simple.

Here's the result:

Disk Type        Ins Dur (s) Upd Dur (s) Sel Dur (s) Many Dur (s) Insert Q/s Update Q/s Select1 Q/s SelMany Row/s SelMany Q/s
TMPFS (RAM)              1.3         2.1         4.9          1.5      31419      19275       81274       8194872       20487
NVME DA 1TB              2.7         3.7         5.0          1.5      15072      10698       80558       8019435       20048
NVMe Team 1TB            3.8         3.7         4.9          1.5      10569      10678       81820       8209889       20524
SSD GALAX 250GB          8.0         7.1         5.0          1.5       4980       5655       79877       7926162       19815
HDD WD 8TB              32.1        31.7         4.9          3.9       1244       1262       81561       3075780        7689

From the table we can see that TMPFS (RAM, obviously) is the fastest in all cases, especially the insert and update benchmarks, NVMe is faster than SSD, and the standard magnetic HDD is the slowest. But the query part doesn't differ much across disk types, probably because the dataset is so small that all of it fits in the cache.

The test was done with 100 goroutines and 400 records inserted/updated per goroutine; each record is only an integer and a string. Queries were done 10x for select and 300x for select-many. Sending small queries reaches a limit of around 80K rps, inserts can reach 31K rps, and multirow queries/updates can reach ~20K rps.
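
For illustration, the unbatched insert part of such a benchmark could be sketched roughly like below (the bench1 table, the DSN, and the single-statement inserts are assumptions to keep the sketch short; the real benchmark code is in the repository linked below):

package main

import (
  "database/sql"
  "fmt"
  "log"
  "sync"
  "time"

  _ "github.com/lib/pq" // CockroachDB speaks the PostgreSQL wire protocol
)

func main() {
  // assumption: a local single-node cockroach listening on the default port
  db, err := sql.Open("postgres", "postgresql://root@localhost:26257/bench?sslmode=disable")
  if err != nil {
    log.Fatal(err)
  }
  db.SetMaxOpenConns(100)

  _, err = db.Exec(`CREATE TABLE IF NOT EXISTS bench1 (id INT PRIMARY KEY, val STRING)`)
  if err != nil {
    log.Fatal(err)
  }

  const goroutines = 100
  const perGoroutine = 400
  start := time.Now()
  var wg sync.WaitGroup
  for g := 0; g < goroutines; g++ {
    wg.Add(1)
    go func(g int) {
      defer wg.Done()
      for i := 0; i < perGoroutine; i++ {
        id := g*perGoroutine + i
        _, err := db.Exec(`INSERT INTO bench1 (id, val) VALUES ($1, $2)`, id, fmt.Sprintf(`row-%d`, id))
        if err != nil {
          log.Println(err)
        }
      }
    }(g)
  }
  wg.Wait()
  dur := time.Since(start)
  total := goroutines * perGoroutine
  fmt.Printf("inserted %d rows in %s (%.1f rps)\n", total, dur, float64(total)/dur.Seconds())
}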

The repository is here if you want to run the benchmark on your own machine.