Showing posts with label golang. Show all posts

2024-02-20

Writing UDF for Clickhouse using Golang

Today we're going to create a UDF (user-defined function) in Golang that can be run inside a Clickhouse query. The function parses a UUID v1 and returns its timestamp, since Clickhouse doesn't have a built-in function for this yet. It is inspired by the Python version and uses the TabSeparated format (since it's the easiest to parse): a Clickhouse UDF reads its input line by line, where each line is one row and each tab-separated value is one column/cell:

package main
import (
    "bufio"
    "encoding/binary"
    "encoding/hex"
    "fmt"
    "os"
    "strings"
    "time"
)
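func main() {
    scanner := bufio.NewScanner(os.Stdin)
    scanner.Split(bufio.ScanLines)
    for scanner.Scan() {
        id, err := FromString(scanner.Text())
        if err != nil {
            // invalid input: still print one (empty) line for this row
            fmt.Println("")
            continue
        }
        fmt.Println(id.Time())
    }
}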
func (me UUID) Nanoseconds() int64 {
    time_low := int64(binary.BigEndian.Uint32(me[0:4]))
    time_mid := int64(binary.BigEndian.Uint16(me[4:6]))
    time_hi := int64((binary.BigEndian.Uint16(me[6:8]) & 0x0fff))
    return int64((((time_low) + (time_mid << 32) + (time_hi << 48)) - epochStart) * 100)
}
func (me UUID) Time() time.Time {
    nsec := me.Nanoseconds()
    return time.Unix(nsec/1e9, nsec%1e9).UTC()
}
// code below Copyright (C) 2013 by Maxim Bublis <b@codemonkey.ru>
// see https://github.com/satori/go.uuid
// Difference in 100-nanosecond intervals between
// UUID epoch (October 15, 1582) and Unix epoch (January 1, 1970).
const epochStart = 122192928000000000
// UUID representation compliant with specification
// described in RFC 4122.
type UUID [16]byte
// FromString returns UUID parsed from string input.
// Following formats are supported:
// "6ba7b810-9dad-11d1-80b4-00c04fd430c8",
// "{6ba7b810-9dad-11d1-80b4-00c04fd430c8}",
// "urn:uuid:6ba7b810-9dad-11d1-80b4-00c04fd430c8"
func FromString(input string) (u UUID, err error) {
    s := strings.Replace(input, "-", "", -1)
    if len(s) == 41 && s[:9] == "urn:uuid:" {
        s = s[9:]
    } else if len(s) == 34 && s[0] == '{' && s[33] == '}' {
        s = s[1:33]
    }
    if len(s) != 32 {
        err = fmt.Errorf("uuid: invalid UUID string: %s", input)
        return
    }
    b := []byte(s)
    _, err = hex.Decode(u[:], b)
    return
}
// Returns canonical string representation of UUID:
// xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.
func (u UUID) String() string {
    return fmt.Sprintf("%x-%x-%x-%x-%x",
        u[:4], u[4:6], u[6:8], u[8:10], u[10:])
}
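
The function in this post takes a single argument, so each input line holds exactly one value. As a minimal sketch of how the same loop could look for more than one column, assuming a hypothetical two-argument UDF, each line can be split on the tab character. The name `handleRow` and the empty-string fallback for malformed rows are illustrative choices, not something Clickhouse prescribes:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// handleRow receives one TabSeparated input line and returns one output value.
// Hypothetical example: it expects two columns and joins them with a colon.
func handleRow(line string) string {
	cols := strings.Split(line, "\t")
	if len(cols) != 2 {
		return "" // assumption: an empty string marks a malformed row
	}
	return cols[0] + ":" + cols[1]
}

func main() {
	scanner := bufio.NewScanner(os.Stdin)
	for scanner.Scan() {
		// one output line per input line, so rows stay aligned
		fmt.Println(handleRow(scanner.Text()))
	}
	if err := scanner.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "read error:", err)
	}
}
```

Keeping the per-row logic in its own function makes it easy to test without piping data through stdin.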


Compile it and put the binary at /var/lib/clickhouse/user_scripts/uuid2timestr with the proper owner and permissions, then create /etc/clickhouse-server/uuid2timestr_function.xml (the file name must have the proper suffix, here _function.xml) containing:

<functions>
    <function>
        <type>executable</type>
        <name>uuid2timestr</name>
        <return_type>String</return_type>
        <argument>
            <type>String</type>
        </argument>
        <format>TabSeparated</format>
        <command>uuid2timestr</command>
        <lifetime>0</lifetime>
    </function>
</functions>


After that, restart Clickhouse with sudo systemctl restart clickhouse-server or sudo clickhouse restart, depending on how you installed it (apt or binary setup).

Usage


To make sure it's loaded, look for this line in the log:

<Trace> ExternalUserDefinedExecutableFunctionsLoader: Loading config file '/etc/clickhouse-server/uuid2timestr_function.xml

Then just run a query using that function:

SELECT uuid2timestr('51038948-97ea-11ee-b7e0-52de156a77d8')

┌─uuid2timestr('51038948-97ea-11ee-b7e0-52de156a77d8')─┐
│ 2023-12-11 05:58:33.2391752 +0000 UTC                │
└──────────────────────────────────────────────────────┘
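
The result above can be cross-checked outside Clickhouse. The sketch below is a condensed, standalone version of the same arithmetic (the helper name `v1Time` and the version check are additions made for this example): it reads the 60-bit timestamp from the first 8 bytes of the UUID, subtracts the offset between the UUID epoch and the Unix epoch, and converts the 100-nanosecond ticks to a time.Time.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"strings"
	"time"
)

// 100-nanosecond intervals between 1582-10-15 and 1970-01-01.
const uuidEpochOffset = 122192928000000000

// v1Time extracts the timestamp of a version 1 UUID given in canonical form.
func v1Time(s string) (time.Time, error) {
	raw, err := hex.DecodeString(strings.ReplaceAll(s, "-", ""))
	if err != nil {
		return time.Time{}, err
	}
	if len(raw) != 16 {
		return time.Time{}, fmt.Errorf("invalid UUID length: %q", s)
	}
	if raw[6]>>4 != 1 {
		return time.Time{}, fmt.Errorf("not a version 1 UUID: %q", s)
	}
	low := int64(binary.BigEndian.Uint32(raw[0:4]))         // time_low
	mid := int64(binary.BigEndian.Uint16(raw[4:6]))         // time_mid
	hi := int64(binary.BigEndian.Uint16(raw[6:8]) & 0x0fff) // time_hi without version bits
	ticks := low | mid<<32 | hi<<48
	nsec := (ticks - uuidEpochOffset) * 100
	return time.Unix(0, nsec).UTC(), nil
}

func main() {
	t, err := v1Time("51038948-97ea-11ee-b7e0-52de156a77d8")
	if err != nil {
		panic(err)
	}
	fmt.Println(t) // 2023-12-11 05:58:33.2391752 +0000 UTC
}
```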

2023-11-24

mTLS using Golang Fiber

In this demo we're going to set up mTLS using Go and Fiber. To create certificates that can be used for mutual authentication, all you need is the OpenSSL program (or simplecert, or mkcert like in the previous natsmtls1 example). Create a CA (certificate authority), server certs, and client certs, something like this:

# generate CA Root
openssl req -newkey rsa:2048 -new -nodes -x509 -days 3650 -out ca.crt -keyout ca.key -subj "/C=SO/ST=Earth/L=MyLocation/O=MyOrganiz/OU=MyOrgUnit/CN=localhost"

# generate Server Certs
openssl genrsa -out server.key 2048
# generate server Cert Signing request
openssl req -new -key server.key -days 3650 -out server.csr -subj "/C=SO/ST=Earth/L=MyLocation/O=MyOrganiz/OU=MyOrgUnit/CN=localhost"
# sign with CA Root
openssl x509  -req -in server.csr -extfile <(printf "subjectAltName=DNS:localhost") -CA ca.crt -CAkey ca.key -days 3650 -sha256 -CAcreateserial -out server.crt

# generate Client Certs
openssl genrsa -out client.key 2048
# generate client Cert Signing request ($O and $OU must be set in the shell beforehand)
openssl req -new -key client.key -days 3650 -out client.csr -subj "/C=SO/ST=Earth/L=MyLocation/O=$O/OU=$OU/CN=localhost"
# sign with CA Root
openssl x509  -req -in client.csr -extfile <(printf "subjectAltName=DNS:localhost") -CA ca.crt -CAkey ca.key -out client.crt -days 3650 -sha256 -CAcreateserial


You will get at least 2 files related to the CA, 3 files related to the server, and 3 files related to the client, but what you really need is just the CA certificate (public key), the server key pair (private key and certificate), and the client key pair. If you need to generate another client certificate or roll over the server keys, you will still need the CA's private key, so don't erase it.

Next, now that you have those 5 files, load the CA certificate and the server key pair and use them in Fiber, something like this:

caCertFile, _ := os.ReadFile(in.CaCrt)
caCertPool := x509.NewCertPool()
caCertPool.AppendCertsFromPEM(caCertFile)

serverCerts, _ := tls.LoadX509KeyPair(in.ServerCrt, in.ServerKey)

tlsConfig := &tls.Config{
    ClientCAs:        caCertPool,
    ClientAuth:       tls.RequireAndVerifyClientCert,
    MinVersion:       tls.VersionTLS12,
    CurvePreferences: []tls.CurveID{tls.CurveP521, tls.CurveP384, tls.CurveP256},
    CipherSuites: []uint16{
        tls.TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,
        tls.TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,
        tls.TLS_RSA_WITH_AES_256_GCM_SHA384,
        tls.TLS_RSA_WITH_AES_256_CBC_SHA,
        tls.TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,
        tls.TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,
    },
    Certificates: []tls.Certificate{serverCerts},
}

// attach the certs to TCP socket, and start Fiber server
app := fiber.New(fiber.Config{
    Immutable: true,
})
app.Get("/", func(c *fiber.Ctx) error {
    return c.SendString(`secured string`)
})
ln, _ := tls.Listen("tcp", `:1443`, tlsConfig)
app.Listener(ln)


Next, on the client side, you just need to load the CA certificate and the client key pair, something like this:

caCertFile, _ := os.ReadFile(in.CaCrt)
caCertPool := x509.NewCertPool()
caCertPool.AppendCertsFromPEM(caCertFile)
certificate, _ := tls.LoadX509KeyPair(in.ClientCrt, in.ClientKey)

httpClient := &http.Client{
    Timeout: time.Minute * 3,
    Transport: &http.Transport{
        TLSClientConfig: &tls.Config{
            RootCAs:      caCertPool,
            Certificates: []tls.Certificate{certificate},
        },
    },
}

r, _ := httpClient.Get(`https://localhost:1443`)


That's it, that's how you secure client-server communication between a Go client and server with mTLS. This code can be found here.

2023-06-14

Simple Websocket Echo Benchmark

Today we're gonna benchmark nodejs/bun+uwebsocket against golang+nbio. The code is here, both taken from their own examples. The benchmark plan is to create 10k clients/connections, each of which sends both a text and a binary string, receives them back from the server, and sleeps for 1s, 100ms, 10ms, or 1ms. The result is as expected:

go 1.20.5 nbio 1.3.16
rps: 19157.16 avg/max latency = 1.66ms/319.88ms elapsed 10.2s
102 MB 0.6 core usage
rps: 187728.05 avg/max latency = 0.76ms/167.76ms elapsed 10.2s
104 MB 5 core usage
rps: 501232.80 avg/max latency = 12.48ms/395.01ms elapsed 10.1s
rps: 498869.28 avg/max latency = 12.67ms/425.04ms elapsed 10.1s
134 MB 15 core usage

bun 0.6.9
rps: 17420.17 avg/max latency = 5.57ms/257.61ms elapsed 10.1s
48 MB 0.2 core usage
rps: 95992.29 avg/max latency = 29.93ms/242.74ms elapsed 10.4s
rps: 123589.91 avg/max latency = 40.67ms/366.15ms elapsed 10.2s
rps: 123171.42 avg/max latency = 62.74ms/293.29ms elapsed 10.1s
55 MB 1 core usage

node 18.16.0
rps: 18946.51 avg/max latency = 6.64ms/229.28ms elapsed 10.3s
59 MB 0.2 core usage
rps: 97032.08 avg/max latency = 44.06ms/196.41ms elapsed 11.1s
rps: 114449.91 avg/max latency = 72.62ms/295.33ms elapsed 10.3s
rps: 109512.05 avg/max latency = 79.27ms/226.03ms elapsed 10.2s
59 MB 1 core usage


For each runtime, the first to fourth rps lines are with 1s, 100ms, 10ms, and 1ms delay before the next request. Since Golang/nbio can utilize multiple cores by default, it can handle ~50 rps per client, while Bun/Nodejs handle 11-12 rps per client. If you find a bug or want to contribute another language (or create a better client), just create a pull request on the GitHub link above.
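
The per-client numbers are simply the peak aggregate rps divided by the 10k connections. A small sketch of that arithmetic, with the peak figures copied from the results above (the helper name `perClient` is made up for this example):

```go
package main

import "fmt"

// perClient divides an aggregate request rate by the number of connections.
func perClient(totalRPS float64, clients int) float64 {
	return totalRPS / float64(clients)
}

func main() {
	const clients = 10000 // connections in the benchmark plan
	// highest rps line of each runtime, taken from the results above
	peaks := []struct {
		name string
		rps  float64
	}{
		{"go/nbio", 501232.80},
		{"bun", 123589.91},
		{"node", 114449.91},
	}
	for _, p := range peaks {
		fmt.Printf("%-8s %.1f rps per client\n", p.name, perClient(p.rps, clients))
	}
}
```

This gives roughly 50.1, 12.4, and 11.4 rps per client respectively.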

2023-05-17

Dockerfile vs Nixpacks vs ko

A Dockerfile is quite simple. First pick the base image for the build phase (only if you want to build inside Docker; if you already have CI/CD that builds outside, you just need to copy the executable binary directly) and list the build step commands. Then choose the runtime image for the run stage (popular ones like ubuntu/debian come with a bunch of debugging tools, alpine/busybox are stripped down), copy the binary to that layer, and done.

 
FROM golang:1.20 as build1
WORKDIR /app1
# if you don't use go mod vendor
#COPY go.mod .
#COPY go.sum .
#RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o app1.exe

FROM busybox:latest
WORKDIR /
COPY --from=build1 /etc/ssl/certs /etc/ssl/certs
COPY --from=build1 /app1/app1.exe .
CMD ./app1.exe
 

Then run the docker build and docker run commands:

# build
docker build . -t app0
[+] Building 76.2s (15/15) FINISHED -- first time, without vendor
[+] Building 9.5s (12/12) FINISHED -- changing code, rebuild, with go mod vendor

# run
docker run -it app0

With nixpacks you just need to run this, without having to create a Dockerfile (as long as there's a main.go file):

# install nixpacks
curl -sSL https://nixpacks.com/install.sh | bash

# build
nixpacks build . --name app1
[+] Building 315.7s (19/19) FINISHED -- first time build
[+] Building 37.2s (19/19) FINISHED -- changing code, rebuild

# run
docker run -it app1

With ko

# install ko
go install github.com/google/ko@latest

# build
time ko build -L -t app2
CPU: 0.84s      Real: 5.05s     RAM: 151040KB

# run (have to do this since the image name is hashed)
docker run -it `docker image ls | grep app2 | cut -d ' ' -f 1`

How about container image size? The Dockerfile build with busybox uses only 14.5MB, with ubuntu 82.4MB, debian 133MB, and alpine 15.2MB; nixpacks uses 99.2MB; and ko takes only 11.5MB, but it only supports Go (and you cannot debug inside it, e.g. to test connectivity to a 3rd party dependency using a shell inside the container). So is it better to use nixpacks? I don't think so: both build speed and image size in this case are inferior compared to a normal Dockerfile with busybox, or to ko.

2022-12-24

CockroachDB Benchmark on Different Disk Types

Today we're going to benchmark CockroachDB, one of the databases that I used this year to create an embedded application. I use CockroachDB because I don't want to use SQLite or any other embedded database that lacks tooling or cannot be accessed by multiple programs at the same time. With CockroachDB I only need to distribute my application binary and the cockroachdb binary, and that's it. Offline backup is also quite simple: just rsync the directory, or do a manual row export like with other PostgreSQL-like databases. Scaling out is also quite simple.

Here's the result:

| Disk Type | Ins Dur (s) | Upd Dur (s) | Sel Dur (s) | Many Dur (s) | Insert Q/s | Update Q/s | Select1 Q/s | SelMany Row/s | SelMany Q/s |
|---|---|---|---|---|---|---|---|---|---|
| TMPFS (RAM) | 1.3 | 2.1 | 4.9 | 1.5 | 31419 | 19275 | 81274 | 8194872 | 20487 |
| NVME DA 1TB | 2.7 | 3.7 | 5.0 | 1.5 | 15072 | 10698 | 80558 | 8019435 | 20048 |
| NVMe Team 1TB | 3.8 | 3.7 | 4.9 | 1.5 | 10569 | 10678 | 81820 | 8209889 | 20524 |
| SSD GALAX 250GB | 8.0 | 7.1 | 5.0 | 1.5 | 4980 | 5655 | 79877 | 7926162 | 19815 |
| HDD WD 8TB | 32.1 | 31.7 | 4.9 | 3.9 | 1244 | 1262 | 81561 | 3075780 | 7689 |

From the table we can see that TMPFS (RAM, obviously) is the fastest in all cases, especially in the insert and update benchmarks; NVMe is faster than SSD, and the standard magnetic HDD is the slowest. The disk type doesn't have much effect on the query part, probably because the dataset is so small that it all fits in the cache.

The test was done with 100 goroutines and 400 record inserts/updates per goroutine; the records contain only an integer and a string. Queries were done 10x for select and 300x for select-many. Sending small queries reaches the limit of about 80K rps, inserts can reach 31K rps, and multi-row queries/updates can reach ~20K rps.
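
The shape of such a run can be sketched as follows. This is not the code from the repository, only a minimal harness under the same parameters (100 goroutines, 400 operations each), with a placeholder where a real INSERT or UPDATE would go; `runConcurrent` is a hypothetical helper name. With 100 × 400 = 40,000 statements and the rounded 1.3 s insert duration on TMPFS, the rate comes out at roughly 30.8K per second, in line with the table.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runConcurrent calls op perWorker times from each of workers goroutines
// and returns the number of completed calls and the elapsed wall time.
func runConcurrent(workers, perWorker int, op func(worker, i int)) (int64, time.Duration) {
	var done int64
	var wg sync.WaitGroup
	start := time.Now()
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				op(w, i)
				atomic.AddInt64(&done, 1)
			}
		}(w)
	}
	wg.Wait()
	return atomic.LoadInt64(&done), time.Since(start)
}

func main() {
	// Placeholder operation; a real run would execute an INSERT/UPDATE here.
	noop := func(worker, i int) {}
	total, elapsed := runConcurrent(100, 400, noop)
	fmt.Printf("%d ops in %v (%.0f ops/s)\n", total, elapsed, float64(total)/elapsed.Seconds())
}
```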

The repository is here if you want to run the benchmark on your own machine.

Map to Struct and Struct to Map Golang Benchmark 2022 Edition

Sometimes we want to convert from map to struct or struct to map (dictionary in other languages), or even struct to struct. There are some libraries that can help us do this, for example structs, mapstructure, copier, or smapping. We could also utilize serialization and deserialization libraries to do this, with the caveat that some serialization formats (e.g. JSON) don't allow integers larger than 2^53.
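
As a minimal sketch of the serialization approach, using only the standard library's encoding/json (the `User` type and the `convert` helper are made up for this example), a value is marshalled and then unmarshalled into the target type. It also shows one way the 2^53 caveat surfaces: when decoding into a map[string]any, encoding/json stores numbers as float64, so 2^53+1 comes back as 2^53.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// User is a hypothetical record type for this example.
type User struct {
	ID   int64  `json:"id"`
	Name string `json:"name"`
}

// convert copies src into dst by encoding to JSON and decoding again.
func convert(src, dst any) error {
	raw, err := json.Marshal(src)
	if err != nil {
		return err
	}
	return json.Unmarshal(raw, dst)
}

func main() {
	// map to struct
	var u User
	if err := convert(map[string]any{"id": 42, "name": "ann"}, &u); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", u) // {ID:42 Name:ann}

	// struct to map: numbers decode as float64, so 2^53+1 loses its last bit
	var m map[string]any
	if err := convert(User{ID: 1<<53 + 1}, &m); err != nil {
		panic(err)
	}
	fmt.Println(int64(m["id"].(float64))) // 9007199254740992, not 9007199254740993
}
```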

Here's the benchmark that I ran this morning:

| map to struct | total | ns/op | B/op | allocs/op |
|---|---|---|---|---|
| M2S_GoccyGoJson_MarshalUnmarshal-32 | 6,661,932 | 517 | 80 | 3 |
| M2S_JsonIteratorGo_MarshalUnmarshal-32 | 4,892,611 | 724 | 196 | 8 |
| M2S_VmihailencoMspackV5_MarhsalUnmarshal-32 | 4,572,597 | 741 | 188 | 5 |
| M2S_FxamackerCbor_MarshalUnmarshal-32 | 4,418,558 | 799 | 120 | 8 |
| M2S_SurrealdbCork_EncodeDecode-32 | 3,080,282 | 1,080 | 1,217 | 6 |
| M2S_GopkgInMgoV2Bson_MarshalUnmarshal-32 | 3,227,905 | 1,092 | 232 | 13 |
| M2S_ShamatonMsgpackV2_MarshalUnmarshal-32 | 3,062,677 | 1,161 | 956 | 15 |
| M2S_MitchellhMapstructure_Decode-32 | 2,487,428 | 1,395 | 720 | 18 |
| M2S_MongoDriverBson_MarshalUnmarshal-32 | 2,477,983 | 1,459 | 414 | 14 |
| M2S_KokizzuJson5b_MarshalUnmarshal-32 | 1,987,240 | 1,711 | 632 | 16 |
| M2S_EncodingJson_MarshalUnmarshal-32 | 2,056,944 | 1,780 | 600 | 16 |
| M2S_EtNikBinngo_MarshalUnmarshal-32 | 1,985,595 | 1,857 | 425 | 39 |
| M2S_PquernaFfjson_MarshalUnmarshal-32 | 1,739,968 | 1,986 | 609 | 16 |
| M2S_UngorjiGocodec_BincEncodeDecode-32 | 1,401,453 | 2,582 | 4,340 | 23 |
| M2S_UngorjiGoCodec_CborEncodeDecode-32 | 1,304,828 | 2,636 | 4,340 | 23 |
| M2S_PelletierGoTomlV2_MarshalUnmarshal-32 | 1,284,037 | 2,787 | 1,600 | 27 |
| M2S_UngorjiGocodec_SimpleEncodeDecode-32 | 1,295,926 | 2,810 | 4,340 | 23 |
| M2S_UngorjiGocodec_JsonEncodeDecode-32 | 1,000,000 | 3,028 | 4,956 | 25 |
| M2S_IchibanTnetstrings_MarshalUnmarshal-32 | 749,947 | 5,056 | 9,329 | 48 |
| M2S_BurntSushiToml_EncodeUnmarshal-32 | 425,335 | 8,065 | 7,958 | 71 |
| M2S_HjsonHjsonGoV4_MarshalUnmarshal-32 | 355,784 | 10,870 | 3,936 | 78 |
| M2S_GopkgInYamlV3_MarshalUnmarshal-32 | 271,190 | 13,524 | 14,112 | 80 |
| M2S_DONUTSLz4Msgpack_MarshalUnmarshal-32 | 240,619 | 15,498 | 1,264 | 16 |
| M2S_GoccyGoYaml_MarshalUnmarshal-32 | 214,776 | 16,192 | 7,821 | 214 |
| M2S_GhodssYaml_MarshalUnmarshal-32 | 156,412 | 23,347 | 21,378 | 161 |
| M2S_NaoinaToml_MarshalUnmarshal-32 | 57,607 | 58,331 | 398,544 | 77 |





| struct to map | total | ns/op | B/op | allocs/op |
|---|---|---|---|---|
| S2M_MitchellhMapstructure_Decode-32 | 5,055,402 | 716 | 536 | 12 |
| S2M_GoccyGoJson_MarshalUnmarshal-32 | 4,660,224 | 747 | 522 | 12 |
| S2M_JsonIteratorGo_MarshalUnmarshal-32 | 4,283,262 | 835 | 505 | 14 |
| S2M_VmihailencoMspackV5_MarhsalUnmarshal-32 | 4,009,863 | 908 | 607 | 12 |
| S2M_FxamackerCbor_MarshalUnmarshal-32 | 3,562,352 | 1,023 | 452 | 11 |
| S2M_ShamatonMsgpackV2_MarshalUnmarshal-32 | 3,180,010 | 1,089 | 556 | 15 |
| S2M_GopkgInMgoV2Bson_MarshalUnmarshal-32 | 3,047,396 | 1,145 | 528 | 15 |
| S2M_SurrealdbCork_EncodeDecode-32 | 2,976,328 | 1,196 | 1,611 | 12 |
| S2M_EncodingJson_MarshalUnmarshal-32 | 1,914,165 | 1,782 | 688 | 18 |
| S2M_PquernaFfjson_MarshalUnmarshal-32 | 1,911,950 | 1,845 | 697 | 18 |
| S2M_EtNikBinngo_MarshalUnmarshal-32 | 1,948,802 | 1,859 | 768 | 45 |
| S2M_KokizzuJson5b_MarshalUnmarshal-32 | 1,888,774 | 1,884 | 960 | 20 |
| S2M_MongoDriverBson_MarshalUnmarshal-32 | 1,857,649 | 1,995 | 759 | 18 |
| S2M_PelletierGoTomlV2_MarshalUnmarshal-32 | 1,244,012 | 2,864 | 1,800 | 31 |
| S2M_UngorjiGocodec_BincEncodeDecode-32 | 1,000,000 | 3,234 | 4,888 | 34 |
| S2M_UngorjiGoCodec_CborEncodeDecode-32 | 989,671 | 3,358 | 4,888 | 34 |
| S2M_UngorjiGocodec_SimpleEncodeDecode-32 | 1,000,000 | 3,400 | 4,888 | 34 |
| S2M_UngorjiGocodec_JsonEncodeDecode-32 | 912,512 | 3,639 | 5,504 | 36 |
| S2M_IchibanTnetstrings_MarshalUnmarshal-32 | 776,796 | 4,744 | 9,561 | 46 |
| S2M_BurntSushiToml_EncodeUnmarshal-32 | 447,216 | 8,538 | 8,231 | 73 |
| S2M_HjsonHjsonGoV4_MarshalUnmarshal-32 | 389,476 | 9,416 | 3,868 | 66 |
| S2M_GopkgInYamlV3_MarshalUnmarshal-32 | 315,939 | 13,338 | 14,400 | 81 |
| S2M_DONUTSLz4Msgpack_MarshalUnmarshal-32 | 242,330 | 14,298 | 744 | 16 |
| S2M_GoccyGoYaml_MarshalUnmarshal-32 | 230,042 | 14,919 | 7,580 | 202 |
| S2M_GhodssYaml_MarshalUnmarshal-32 | 151,023 | 22,682 | 21,441 | 161 |
| S2M_NaoinaToml_MarshalUnmarshal-32 | 60,916 | 52,047 | 398,112 | 80 |





| struct to struct | total | ns/op | B/op | allocs/op |
|---|---|---|---|---|
| S2S_GoccyGoJson_MarshalUnmarshal-32 | 12,046,497 | 317 | 112 | 4 |
| S2S_ShamatonMsgpackV2_MarshalUnmarshal-32 | 7,897,488 | 458 | 148 | 6 |
| S2S_JsonIteratorGo_MarshalUnmarshal-32 | 7,853,592 | 494 | 92 | 6 |
| S2S_FxamackerCbor_MarshalUnmarshal-32 | 7,038,808 | 511 | 80 | 5 |
| S2S_GopkgInMgoV2Bson_MarshalUnmarshal-32 | 5,105,343 | 715 | 144 | 9 |
| S2S_VmihailencoMspackV5_MarhsalUnmarshal-32 | 4,549,700 | 818 | 213 | 6 |
| S2S_MongoDriverBson_MarshalUnmarshal-32 | 3,560,946 | 1,019 | 321 | 8 |
| S2S_EncodingJson_MarshalUnmarshal-32 | 2,731,051 | 1,313 | 304 | 9 |
| S2S_PquernaFfjson_MarshalUnmarshal-32 | 2,734,357 | 1,330 | 304 | 9 |
| S2S_KokizzuJson5b_MarshalUnmarshal-32 | 2,594,728 | 1,343 | 504 | 9 |
| S2S_SurrealdbCork_EncodeDecode-32 | 2,555,745 | 1,397 | 1,241 | 7 |
| S2S_EtNikBinngo_MarshalUnmarshal-32 | 1,995,468 | 1,840 | 400 | 41 |
| S2S_PelletierGoTomlV2_MarshalUnmarshal-32 | 1,460,683 | 2,459 | 1,440 | 23 |
| S2S_UngorjiGocodec_SimpleEncodeDecode-32 | 1,205,648 | 2,919 | 4,364 | 24 |
| S2S_UngorjiGoCodec_CborEncodeDecode-32 | 1,290,734 | 2,920 | 4,364 | 24 |
| S2S_UngorjiGocodec_BincEncodeDecode-32 | 1,207,327 | 3,007 | 4,364 | 24 |
| S2S_UngorjiGocodec_JsonEncodeDecode-32 | 1,000,000 | 3,223 | 4,980 | 26 |
| S2S_IchibanTnetstrings_MarshalUnmarshal-32 | 722,493 | 4,950 | 9,289 | 47 |
| S2S_BurntSushiToml_EncodeUnmarshal-32 | 398,366 | 8,458 | 7,918 | 72 |
| S2S_HjsonHjsonGoV4_MarshalUnmarshal-32 | 304,369 | 11,189 | 4,578 | 79 |
| S2S_DONUTSLz4Msgpack_MarshalUnmarshal-32 | 282,505 | 12,215 | 237 | 7 |
| S2S_GopkgInYamlV3_MarshalUnmarshal-32 | 279,332 | 12,541 | 14,016 | 76 |
| S2S_GoccyGoYaml_MarshalUnmarshal-32 | 211,952 | 15,542 | 7,982 | 208 |
| S2S_GhodssYaml_MarshalUnmarshal-32 | 160,660 | 23,148 | 21,073 | 154 |
| S2S_NaoinaToml_MarshalUnmarshal-32 | 64,468 | 59,672 | 399,065 | 83 |

The repository and the always updated result is here; feel free to add your own serialization/deserialization library. As we can see, goccy-gojson is the fastest overall; too bad that it gives wrong results if you store an int64 larger than 2^53. So it's better to use the second best all-rounder, vmihailenco-msgpack, or mapstructure for the specific use case of struct to map/struct.

Here's the top ranking:

| Ser/Deser | M2S | S2M | S2S |
|---|---|---|---|
| GoccyGoJson | 1 | 2 | 2 |
| JsonIteratorGo | 2 | 3 | 4 |
| MitchellhMapstructure | 8 | 1 | 1 |
| VmihailencoMspackV5 | 3 | 4 | 7 |
| FxamackerCbor | 4 | 5 | 5 |
| ShamatonMsgpackV2 | 7 | 6 | 3 |
| SurrealdbCork | 5 | 8 | 12 |