Programming Rants: redis

Showing posts with label redis. Show all posts

2023-04-26

GeoSearch Database Benchmark

So today we're gonna benchmark database that can store latitude-longitude (GPS coordinate), the benchmark spec is unbatched INSERT 100K records of (id, lat, long) tuple, search 200K times (or until deadline reached) 500 nearest point, id and the distance, move 100 points to another location 50 times, all benchmark done in 16 threads (so other 16 threads can be used by the database).

The contender that already benchmarked are:

Redis GEOSEARCH (2024: also add KeyDB 4 core)
PostgreSQL cube/earthdistance
Tarantool RTREE
TypeSense geosearch
MeiliSearch _geo

Other database attempted but failed because not truly redis-compatible: DragonFlyDB, Garnet, KVRocks

Here's the result for unbatched insert:

REDIS         9.4 sec, 10639.7 rps
KEYDB 1.3 sec, 76546.2 rps
POSTGRES    10.5 sec, 9523.7 rps
TARANTOOL    0.8 sec, 126801.9 rps
TYPESENSE 96.7 sec, 1023.8 rps
MEILISEARCH 365.0 sec,    271.2 rps

This benchmark is totally unfair for Meilisearch since, their API expect it to be batched, just like Clickhouse.

Next the 500 nearby point of interest search benchmark:

REDIS        30676 (15.3%) in 50.0 sec, 613.5 rps
KEYDB         7381   (3.7%) in 50.0 sec, 147.6 rps
POSTGRES      7778 (3.9%) in 50.0 sec, 155.6 rps
TARANTOOL   200000 (100.0%) in 35.0 sec, 5716.3 rps
TYPESENSE     2177 (1.1%) in 50.0 sec,   43.5 rps

PostgreSQL actually quite fast in Tarantool level if the distance ordering removed (random order, only fetch first 500), I believe the search scope is too wide (4km square) causing too many points to be sorted by distance (and database config is default, not tuned at all causing this slowdown). TypeSense cannot search more than 250 rows limit, so that result is for 250, MeiliSearch in this case always return 0 result, not sure what's wrong.

Last benchmark is about moving 10 points 50 times, so it need to update the index periodically, the results are:

REDIS        0.1 sec, 55088.4 rps
KEYDB    0.2 sec, 28926.9 rps
POSTGRES     0.6 sec,   7954.3 rps
TARANTOOL    0.0 sec, 137424.8 rps
TYPESENSE    9.7 sec,    515.2 rps
MEILISEARCH 8.8 sec,    569.7 rps

So I guess I'll use Tarantool for this case, since it's the fastest for geo datapoints with persistence. Other possible database to benchmarked in the future are: MySQL, CockroachDB, ElasticSearch, TiDB, CouchBase, PostGIS, but I'm not sure whether their index can beat Tarantool's. If you want to contribute to this benchmark you can create PR to hugedbbench repo.

2022-05-07

Getting started with Trino

Trino is a distributed query engine, that allows you to JOIN from multiple datasources (databases like mysql, postgresql, bigquery, cassandra, mongodb, redis, prometheus, elasticsearch, csv file, google sheets, s3, etc). It's like Clickhouse but without high-tech (merge-tree) storage ability, so it cannot do blazing fast analytics query like in Clickhouse, but it can be as fast as the connected database that it uses, eg. if it uses Clickhouse connected, then it can be as fast as Clickhouse. It was developed by Facebook (previously named Presto). List of database connectors can be seen here. To use Trino, you can use dockerized version or manual:

# Docker
docker run -d -p 8080:8080 --name trino1 trinodb/trino
# web UI only for monitoring, use random username
docker exec -it trino1 trino

# Ubuntu 22.04
java --version
python3 --version
# download and extract from https://trino.io/download.html
mkdir ./trino-server-379/etc

cd trino-server-379
SRCURL=https://raw.githubusercontent.com/trinodb/trino-the-definitive-guide/master/single-installation/etc
wget -c $SRCURL/jvm.config
wget -c $SRCURL/log.properties
wget -c $SRCURL/node.properties
echo '
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8081
query.max-memory=5GB
query.max-memory-per-node=1GB
discovery.uri=http://127.0.0.1:8081
' > config.properties
echo '
node.data-dir=/tmp/
' >> node.properties
mkdir catalog
echo '

connector.name=cassandra

cassandra.contact-points=127.0.0.1
# more here https://trino.io/docs/current/connector/cassandra.html

' > catalog/localscylla.properties

cd ..
python3 ./bin/launcher.py run # to run in background: start

# CLI/Client
EXELOC=/usr/bin/trino
curl -O $EXELOC https://repo1.maven.org/maven2/io/trino/trino-cli/379/trino-cli-379-executable.jar
chmod a+x $EXELOC
trino --server http://localhost:8081

These are the list of commands can be used in trino (other than standard SQL):

SHOW CATALOGS;

SHOW SCHEMAS FROM/IN __CATALOG__; # eg. localscylla

SHOW TABLES FROM/IN __CATALOG__.__SCHEMA__;

DESCRIBE __CATALOG__.__SCHEMA__.__TABLE__;

EXPLAIN SELECT * FROM __CATALOG__.__SCHEMA__.__TABLE__;

That's it, you can add more databases connection by creating more etc/catalog/*.properties file with proper configuration (username, password, port, etc).

2017-05-26

GotRo Framework Tutorial: Go, Redis and PostgreSQL

Gotro is opinionated colleciton of libraries and framework for Go, it's a rewrite of Gokil framework that specially built for Go+Redis+PostgreSQL web application development. Previously Gokil written using the infamous julienschmidt's httprouter, the fastest on that time (2014), but a year later (2015) forked as buaazp's fasthttprouter that built based on faster valyala's fasthttp. In this tutorial, we will learn how to use this framework with Redis and PostgreSQL to create a SoHo or even medium-enterprise-class web projects.

Each folder on the directory/package uses 1 letter that contains specific commonly used functions, that are:

A - Array
B - Boolean
C - Character (or Rune)
D - Database
F - Floating Point
L - Logging
M - Map
I - Integer
S - String
T - Time (and Date)
W - Web (the "framework") -- deprecated, use W2 instead
X - Anything (aka interface{})
Z - Z-Template Engine, that has syntax similar to ruby string interpolation #{foo} with additional other that javascript friendly syntax {/* foo */}, [/* bar */], /*! bar */

To use this libraries, run this command on the console:

go get -u -v github.com/kokizzu/gotro

To start a new project, copy a directory from W/example-simplified folder to your $GOPATH/src, that's the base of your project, that should contain something like this:

├── public
│ └── lib
│ └── jquery.js
├── start_dev.sh
├── server.go
└── views
├── error.html
├── layout.html
├── login_example.html
├── named_params_example.html
├── post_values_example.html
└── query_string_example.html

To start the development server, run ./start_dev.sh, it would show something like this:

set ownership of $GOROOT..
remove $GOPATH/pkg if go upgraded/downgraded..
precompile all dependencies..
hello1
starting gin..
[gin] listening on port 3000
2017-05-26 10:55:59.835 StartServer ▶ Gotro Example [DEVELOPMENT] server with 6 route(s) on :3001
Work Directory: /home/asd/go/src/hello1/

If you got an error, probably because you haven't installed Redis, that being used in this example to store sessions, to do that on Ubuntu, you can type:

sudo apt-get install redis-server
sudo systemctl enable redis-server
sudo systemctl start redis-server

To see the example, open your browser at http://localhost:3000

Port 3000 is the proxy port for gin program that auto-recompile if the source code changed, the server itself listens on port 3001, if you change it on the source code, you must also change the gin target port on start_dev.sh file, by replacing -a 3001 and -p 3000

Next we see the example on the server.go file:

redis_conn := Rd.NewRedisSession(``, ``, 9, `session::`)
global_conn := Rd.NewRedisSession(``, ``, 10, `session::`)
W.InitSession(`Aaa`, 2*24*time.Hour, 1*24*time.Hour, *redis_conn, *global_conn)
W.Mailers = map[string]*W.SmtpConfig{
``: {
Name: `Mailer Daemon`,
Username: `test.test`,
Password: `123456`,
Hostname: `smtp.gmail.com`,
Port: 587,
},
}
W.Assets = ASSETS
W.Webmasters = WEBMASTER_EMAILS
W.Routes = ROUTERS
W.Filters = []W.Action{AuthFilter}
// web engine
server := W.NewEngine(DEBUG_MODE, false, PROJECT_NAME+VERSION, ROOT_DIR)
server.StartServer(LISTEN_ADDR)

There are 2 redis connection, one for storing local session, one for storing global session (used cross app communication).
You must call W.InitSession to tell the framework name of the cookie, default expiration (how long until a cookie expired, and every how long we should renew). On the next line, we set the mailer W.Mailers, connection that we use to send if there are panic or any other critical error within your web server.
W.Assets is the assets file, should contain any css or javascript script that will be included on every page, the assets should be saved on the public/css/ or public/js/ directory. This is the example how to fill them:

var ASSETS = [][2]string{
//// http://api.jquery.com/ 1.11.1
{`js`, `jquery`},
////// http://hayageek.com/docs/jquery-upload-file.php
{`css`, `uploadfile`},
{`js`, `jquery.form`},
{`js`, `jquery.uploadfile`},
//// https://vuejs.org/v2/guide/ 2.0
{`js`, `vue`},
//// http://momentjs.com/ 2.17.1
{`js`, `moment`},
//// github.com/kokizzu/semantic-ui-daterangepicker
{`css`, `daterangepicker`},
{`js`, `daterangepicker`},
//// http://semantic-ui.com 2.2 // should be below `js` and `css` items
{`/css`, `semantic/semantic`},
{`/js`, `semantic/semantic`},
//// global, helpers, project specific
{`/css`, `global`},
{`/js`, `global`},
}

If you start the type of the file with slash, it means it would locate the file in absolute path starting from public/. Currently only js and css files supported.

Next we must set the W.Webmasters, that is the hardcoded superadmin, one that will be receiving the error emails and could be accessed through ctx.IsWebMaster() that matching the ctx.Session.GetStr(`email`) variable with those values.

Next initialization phase, you must set the route W.Routes, which is used to assign an URL path to a handler function, for example:

var ROUTERS = map[string]W.Action{
``: LoginExample,
`login_example`: LoginExample,
`post_values_example`: PostValuesExample,
`named_params_example/:test1`: NamedParamsExample,
`query_string_example`: QueryStringExample,
}

In this example, there are five routes with four different handler function (you can put them on a package, normally you separate them on different package based on access level), on the fourth route we capture the :value as string, that can be anything and can be retrieved by calling ctx.ParamStr(`test1`). Here's some example how to separate the handler based on first segment:

`accounting/acct_payments`: fAccounting.AcctPayments,

`accounting/acct_invoices`: fAccounting.AcctInvoices,

`employee/attendance_list`: fEmployee.AttendanceList,

`employee/business_trip`: fEmployee.BusinessTrip,

`human_resource/business_trip`: fHumanResource.BusinessTrip,

`human_resource/employee/profile/:id`: fHumanResource.EmployeeProfileEdit,

`human_resource/employees`: fHumanResource.Employees,

A handler function should have exactly one parameter with type *W.Context, for example:

func PostValuesExample(ctx *W.Context) {

if ctx.IsAjax() {

ajax := AjaxResponse()

value := ctx.Posts().GetStr(`test2`)

ajax.Set(`test3`, value)

ctx.AppendJson(ajax.SX)

return

}

ctx.Render(`view1`, M.SX{ // <-- locals of the view

`title`: `Post example`,

`map`: M.SI{`test1`:1,`test4`:4},

`arr`: []int{1,2,3,4},

})

}

On above function, we check if the request method is POST or not, if it's so, we assume that it's sent from AJAX, something like this if using jQuery:

var data = {test2: 'foo'};

$.post('', data, function(res) {

alert("Value: " + res.test3);

}).fail(function(xhr, textStatus, errorThrown ) {

alert(textStatus + '\n' + xhr.status);

});

On above javascript snippet, we send to current page through AJAX HTTP POST method, sending a value with key test2 that filled with string foo. The server later will capture it and sending back to client that sending that string as an object with key test3, not that anything you put on it will be converted to JSON. The javascript will retrieve that value through callback (third line on the javascript snippet).

But if client's request is not a POST method, the server will call ctx.Render that will load a file view1.html from view/ directory, if you need to pass anything to that view, put them on a M.SX that is a map with string key and any value type, note that everything you put in this map will be rendered as json. But what's the syntax? This template engine called Z-Template engine, that designed for simplicity and compatibility with javascript syntax, unlike any other template engine, the syntax will not interfere with Javascript IDE's autocomplete feature, here's the example to render values above:

<h1>#{title}</h1>

<h2>#{something that not exists}</h2>

<script>

var title = '#{title}'; // 'Post example'

var a_map = {/* map */}; // {"test1":1,"test4":4}

var an_arr = [/* arr */]; // [1,2,3,4]

</script>

Different from any other template engine, any value given to the Render method that not being used will show a warning, and any key used in the template that not provided in render function will render the key itself (eg: something that not exists).

Wait, in PHP you can retrieve query parameter using $_GET variable, how to do that in this framework?

// this is Go

ctx.QueryParams().GetInt(`theKey`) // equal to $_GET['theKey']

Now back to the handler function, the ctx parameter can be used to control the output, normally when you call Render method, it would also wrap the rendered view with view/layout.html, but if you did not want that, you can call this:

ctx.NoLayout = true

ctx.Buffer.Reset() // to clear rendered things if you already call Render method

ctx.Title = `something` // to set the title, if you use the layout

Layout view have some provided values (locals), that are: title, project_name, assets (the js and css you give on the assets), is_superadmin (if the current logged in person is a webmaster), debug_mode (always true if you didn't update VERSION variable on compile time).

You can see other methods and properties available, you can see them by control-click the W.Context type from your IDE (Gogland, Wide, Visual Studio Code, etc).

Now how to connect to the database? First you must install the database, for example PostgreSQL 9.6 in Ubuntu:

sudo apt-get install postgresql
sudo systemctl enable postgresql
hba=/etc/postgresql/9.6/main/pg_hba.conf
sudo sed -i 's|local all all peer|local all all trust|g' $hba
sudo sed -i 's|host all all 127.0.0.1/32 md5|host all all 127.0.0.1/32 trust|g' $hba
sudo sed -i 's|host all all ::1/128 md5|host all all ::1/128 trust|g' $hba
echo 'local all test1 trust' sudo tee -a $hba # if needed
sudo systemctl start postgresql
sudo su - postgres <<EOF
createuser test1
createdb test1
psql -c 'GRANT ALL PRIVILEGES ON DATABASE test1 TO test1;'

EOF

After testing if your database created correctly, you must create a directory, for example model/ then create a file inside it, for example conn.go with these content:

package model

import (

"github.com/kokizzu/gotro/D/Pg"

_ "github.com/lib/pq"

)

var PG_W, PG_R *Pg.RDBMS

func init() {

PG_W = Pg.NewConn(`test1`, `test1`)

// ^ later when scaling we replace this one

PG_R = Pg.NewConn(`test1`, `test1`)

}

On the code above we create 2 connection, writer and reader, this is the recommended way to scale the reader through multiple servers, if you need better writer (but didn't support join, you can use ScyllaDB or Redis). Next, we create a program to initialize our tables, for example in go/init.go:

package main

import "hello1/model"

func main() {

model.PG_W.CreateBaseTable(`users`, `users`)

model.PG_W.CreateBaseTable(`todos`, `users`) // 2nd table

}

You must execute the gotro/D/Pg/functions.sql using psql before running the code above, it would create 2 tables with indexes with 2 log tables, triggers and some indexes, you can check it inside psql -U test1 using \dt+ or \d users command, that would show something like this:

This is our generic table, what if we need more columns? You don't need to alter table, we use PostgreSQL's JSONB column data. JSONB is very powerful, it can be indexed, queried using arrow operator, greater than its competitor. Using these exact table design, we can store the old and updated value on the log, everytime somebody changed the value.

Ok, now let's create a real model from users table, create a package and file mUsers/m_users.go with content:

package mUsers
import (
"Billions/sql"
"github.com/kokizzu/gotro/A"
"github.com/kokizzu/gotro/D/Pg"
"github.com/kokizzu/gotro/I"
"github.com/kokizzu/gotro/M"
"github.com/kokizzu/gotro/S"
"github.com/kokizzu/gotro/T"
"github.com/kokizzu/gotro/W"
)
const TABLE = `users`
var TM_MASTER Pg.TableModel
var SELECT = ``
var Z func(string) string
var ZZ func(string) string
var ZJ func(string) string
var ZB func(bool) string
var ZI func(int64) string
var ZLIKE func(string) string
var ZT func(...string) string
var PG_W, PG_R *Pg.RDBMS
func init() {
Z = S.Z
ZB = S.ZB
ZZ = S.ZZ
ZJ = S.ZJ
ZI = S.ZI
ZLIKE = S.ZLIKE
ZT = S.ZT
PG_W = sql.PG_W
PG_R = sql.PG_R
TM_MASTER = Pg.TableModel{
CacheName: TABLE + `_USERS_MASTER`,
Fields: []Pg.FieldModel{
{Key: `id`},
{Key: `is_deleted`},
{Key: `modified_at`},
{Label: `E-Mail(s)`, Key: `emails`, CustomQuery: `emails_join(data)`, Type: `emails`, FormTooltip: `separate with comma`},
{Label: `Phone`, Key: `phone`, Type: `phone`, FormHide: true},
{Label: `Full Name`, Key: `full_name`},
},
}
SELECT = TM_MASTER.Select()
}
func One_ByID(id string) M.SX {
ram_key := ZT(id)
query := ram_key + `
SELECT ` + SELECT + `
FROM ` + TABLE + ` x1
WHERE x1.id::TEXT = ` + Z(id)
return PG_R.CQFirstMap(TABLE, ram_key, query)
}
func Search_ByQueryParams(qp *Pg.QueryParams) {
qp.RamKey = ZT(qp.Term)
if qp.Term != `` {
qp.Where += ` AND (x1.data->>'name') LIKE ` + ZLIKE(qp.Term)
}
qp.From = `FROM ` + TABLE + ` x1`
qp.OrderBy = `x1.id`
qp.Select = SELECT
qp.SearchQuery_ByConn(PG_W)

}
/* accessed through: {"order":["-col1","+col2"],"filter":{"is_deleted":false,"created_at":">isodate"},"limit":10,"offset":5}
this will retrieve record 6-15 order by col1 descending, col2 ascending, filtered by is_deleted=false and created_at > isodate
*/

If the example above too complex for you, you can also do manually, see gotro/D/Pg/_example for simpler example. The example above we create a query model, that query from a single table. If you need multiple table (join), you can extend the fields, something like this:

{Label: `Admin`, Key: `admin`, CustomQuery: `x2.data->>'full_name'`},

And the query params something like this:

qp.From = `FROM ` + TABLE + ` x1 LEFT JOIN ` + mAdmin.TABLE + ` x2 ON (x1.data->>'admin_id') = x2.id::TEXT `

You can also do something like this:

func All_ByStartID_ByLimit_IsAsc_IsIncl(id string, limit int64, is_asc, is_incl bool) A.MSX { sign := S.IfElse(is_asc, `>`, `<`) + S.If(is_incl, `=`) ram_key := ZT(id, I.ToS(limit), sign) where := `` if id != `` { where = `AND x1.id ` + sign + Z(id) } query := ram_key + ` SELECT ` + SELECT + ` FROM ` + TABLE + ` x1 WHERE x1.is_deleted = false ` + where + ` ORDER BY x1.id ` + S.If(!is_asc, `DESC`) + ` LIMIT ` + I.ToS(limit) return PG_R.CQMapArray(table, ram_key, query) } ` // accessed through: {"limit":10} // this will retrieve last 10 records

Or query a single row:

func API_Backoffice_Form(rm *W.RequestModel) { rm.Ajax.SX = One_ByID(rm.Id) } // accessed through: {a:'form',id:'123'} // this will retreive all columns on this record

Or create a save/delete/restore function:

func API_Backoffice_SaveDeleteRestore(rm *W.RequestModel) { PG_W.DoTransaction(func(tx *Pg.Tx) string { dm := Pg.NewRow(tx, TABLE, rm) // NewPostlessData emails := rm.Posts.GetStr(`emails`) // rm is the requestModel, values provided by http req dm.Set_UserEmails(emails) // dm is the dataModel, row we want to update // we can call dm.Get* to retrieve old record values dm.SetStr(`full_name`) dm.UpsertRow() if !rm.Ajax.HasError() { dm.WipeUnwipe(rm.Action) } return rm.Ajax.LastError() }) } // accessed through: {a:'save',full_name:'foo',id:'1'} // update // if without id, it would insert

Then you can call them on a handler or package-internal function, something like:

func API_Backoffice_FormLimit(rm *W.RequestModel) { id := rm.Posts.GetStr(`id`) limit := rm.Posts.GetInt(`limit`) is_asc := rm.Posts.GetBool(`asc`) is_incl := rm.Posts.GetBool(`incl`) result := All_ByStartID_ByLimit_IsAsc_IsIncl(id, limit, is_asc, is_incl) rm.Ajax.Set(`result`, result) } func API_Backoffice_Search(rm *W.RequestModel) { qp := Pg.NewQueryParams(rm.Posts, &TM_MASTER) Search_ByQueryParams(qp) qp.ToMap(rm.Ajax) }

And call those two APIs function inside a handler something like this:

func PrepareVars(ctx *W.Context, title string) { user_id := ctx.Session.GetStr(`id`) rm = &W.RequestModel{ Actor: user_id, DbActor: user_id, Level: ctx.Session.SX, Ctx: ctx, } ctx.Title = title is_ajax := ctx.IsAjax() if is_ajax { rm.Ajax = NewAjaxResponse() } page := rm.Level.GetMSB(`page`) first_segment := ctx.FirstPath() // validate if this user may access this first segment // check their access level, if it's not ok, set rm.Ok to false // then render an error, something like this: /* if is_ajax { rm.Ajax.Error(sql.ERR_403_MUST_LOGIN_HIGHER) ctx.AppendJson(rm.Ajax.SX) return } ctx.Error(403, sql.ERR_403_MUST_LOGIN_HIGHER) return */ if !is_ajax { // render menu based on privilege } else { // prepare variables required for ajax response rm.Posts = ctx.Posts() rm.Action = rm.Posts.GetStr(`a`) id := rm.Posts.GetStr(`id`) rm.Id = S.IfElse(id == `0`, ``, id) } } func Users(ctx *W.Context) { rm := PrepareVars(ctx, `Users`) if !rm.Ok { return } if rm.IsAjax() { // handle ajax switch rm.Action { case `search`: // @API mUsers.API_Backoffice_Search(rm) case `form_limit`: // @API mUsers.API_Backoffice_FormLimit(rm) case `form`: // @API mUsers.API_Backoffice_Form(rm) case `save`, `delete`, `restore`: // @ffPI mUsers.API_Backoffice_SaveDeleteRestore(rm) default: // @API-END handler.ErrorHandler(rm.Ajax, rm.Action) } ctx.AppendJson(rm.Ajax.SX) return } locals := W.Ajax{SX: M.SX{ `title`: ctx.Title, }} qp := Pg.NewQueryParams(nil, &mUsers.TM_MASTER) mUsers.Search_ByQueryParams(qp) qp.ToMap(locals) ctx.Render(`backoffice/users`, locals.SX) }

Now that we're done creating the backend API server, all that's left is create the systemd service hello1.service:

[Unit] Description=My Hello1 Service After=network-online.target postgresql.service Wants=network-online.target systemd-networkd-wait-online.service [Service] Type=simple Restart=on-failure User=yourusername Group=users WorkingDirectory=/home/yourusername/web ExecStart=/home/yourusername/web/run_production.sh ExecStop=/usr/bin/killall Hello1 LimitNOFILE=2097152 LimitNPROC=65536 ProtectSystem=full NoNewPrivileges=true [Install] WantedBy=multi-user.target

Create the run_production.sh shell script

#!/usr/bin/env bash ofile=logs/access_`date +%F_%H%M%S`.log echo Logging into: `pwd`/$ofile unbuffer time ./Hello1 | tee $ofile

Then compile the binary (you: can also set the VERSION here, to make it production):

go build -ldflags " -X main.LISTEN_ADDR=:${SUB_PORT} " -o /tmp/Subscriber

Copy the binary, the script above, and whole public/ and views/ directory to the server /home/yourusername/web, copy the service file to the /usr/lib/systemd/system/ then reload the systemd service on the server:

sudo systemctl daemon-reload sudo systemctl enable hello1 sudo systemctl start hello1

you're good to go, you can check the service status using journalctl -f hello1

of course you can automate the hassle above using scp or rsync command.

Well, that's all for now, you can see the complete example on W/example-complex directory, if you have any question, you can contact me through telegram @kokizzu, for frontend stuff, I recommend to learn about VueJS or Weex for mobile.

2017-05-22

Go-Redis vs RediGo (also Aerospike)

This is an old benchmark result that test Redis and Aerospike, both are in-memory database, I did this about December last year, that I used to test Redis agains Aerospike for cases of storing random session per request:

Redis (redigo)
Transactions: 52343 hits
Availability: 100.00 %
Elapsed time: 9.16 secs
Data transferred: 17.02 MB
Response time: 0.04 secs
Transaction rate: 5714.30 trans/sec (1654 worst >1M uQ sess)
Throughput: 1.86 MB/sec
Concurrency: 252.55
Successful transactions: 52343
Failed transactions: 0
Longest transaction: 1.13
Shortest transaction: 0.00

Aerospike (aerospike-client-go)
Transactions: 80806 hits
Availability: 100.00 %
Elapsed time: 9.71 secs
Data transferred: 26.28 MB
Response time: 0.03 secs
Transaction rate: 8321.94 trans/sec (8999 best, 7769 worst)
Throughput: 2.71 MB/sec
Concurrency: 251.91
Successful transactions: 80806
Failed transactions: 0
Longest transaction: 1.17
Shortest transaction: 0.00

Redis (go-redis)
Transactions: 91187 hits
Availability: 100.00 %
Elapsed time: 9.95 secs
Data transferred: 29.65 MB
Response time: 0.03 secs
Transaction rate: 9164.52 trans/sec (3536 worst >1M uQ sess)
Throughput: 2.98 MB/sec
Concurrency: 252.70
Successful transactions: 91187
Failed transactions: 0
Longest transaction: 0.20
Shortest transaction: 0.00

The bad part about Redis (that uses SkipList), more data we store, it slows down faster, in this case 1 million sessions stored slows Redis down by more than 60%, while Aerospike only slowed down by 10%).

Redis 3.2.1 vs ScyllaDB 1.7RC2

Since Scylla still have no secondary index, all I can do to use Scylla is to replace Redis for storing user login sessions, this benchmark only test read queries (queries that always returns zero request because the record does not exists) and the result are:

# Redis
$ hey -c 255 -n 255000 http://localhost:3001
3089 requests done.
7217 requests done.
11691 requests done.
*snip*
241822 requests done.
246305 requests done.
250697 requests done.
All requests done.

Summary:
Total: 29.5162 secs
Slowest: 0.1647 secs
Fastest: 0.0003 secs
Average: 0.0294 secs
Requests/sec: 8639.3205
Total data: 2732835000 bytes
Size/request: 10717 bytes

Status code distribution:
[200] 255000 responses

Response time histogram:
0.000 [1] |
0.017 [4084] |∎
0.033 [194502]|∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
0.050 [54311] |∎∎∎∎∎∎∎∎∎∎∎
0.066 [1812] |
0.082 [65] |
0.099 [7] |
0.115 [122] |
0.132 [48] |
0.148 [25] |
0.165 [23] |

Latency distribution:
10% in 0.0231 secs
25% in 0.0255 secs
50% in 0.0286 secs
75% in 0.0325 secs
90% in 0.0370 secs
95% in 0.0404 secs
99% in 0.0487 secs

# ScyllaDB best response time
$ hey -c 255 -n 255000 http://localhost:3001
2114 requests done.
4874 requests done.
7714 requests done.
*snip*
247202 requests done.
249898 requests done.
252610 requests done.
All requests done.

Summary:
Total: 48.5436 secs
Slowest: 0.2649 secs
Fastest: 0.0013 secs
Average: 0.0483 secs
Requests/sec: 5253.0127
Total data: 2732835000 bytes
Size/request: 10717 bytes

Status code distribution:
[200] 255000 responses

Response time histogram:
0.001 [1] |
0.028 [6804] |∎∎
0.054 [176673]|∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
0.080 [66748] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
0.107 [3728] |∎
0.133 [470] |
0.159 [250] |
0.186 [46] |
0.212 [79] |
0.239 [144] |
0.265 [57] |

Latency distribution:
10% in 0.0334 secs
25% in 0.0399 secs
50% in 0.0466 secs
75% in 0.0552 secs
90% in 0.0636 secs
95% in 0.0699 secs

99% in 0.0899 secs

# ScyllaDB best Req/s
$ hey -c 255 -n 255000 http://localhost:3001
2188 requests done.
4910 requests done.
7019 requests done.
*snip*
244547 requests done.
249813 requests done.
254894 requests done.
All requests done.

Summary:
Total: 42.0725 secs
Slowest: 8.0907 secs
Fastest: 0.0002 secs
Average: 0.0418 secs
Requests/sec: 6060.9647
Total data: 2732835000 bytes
Size/request: 10717 bytes

Status code distribution:
[200] 255000 responses

Response time histogram:
0.000 [1] |
0.809 [254744]|∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
1.618 [0] |
2.427 [0] |
3.236 [0] |
4.045 [0] |
4.854 [0] |
5.664 [0] |
6.473 [0] |
7.282 [0] |
8.091 [255] |

Latency distribution:
10% in 0.0069 secs
25% in 0.0183 secs
50% in 0.0347 secs
75% in 0.0470 secs
90% in 0.0573 secs
95% in 0.0640 secs
99% in 0.0843 secs

This is the example code of the main function using gotro framework that used to do this benchmark:

//// Testing Redis:
//login_conn := Rd.NewRedisSession(``, ``, 1, `session::`)
//global_conn := Rd.NewRedisSession(``, ``, 3, `global::`)

// Testing Scylla:
login_conn:= Sc.NewScyllaSession(`127.0.0.1`, `session`, `login`, ``, ``)
global_conn := Sc.NewScyllaSession(`127.0.0.1`, `session`, `global`, ``, ``)

// see example
W.InitSession(`SK`, 12*time.Hour, 6*time.Hour, *login_conn, *global_conn)
W.Mailers = ...
W.Assets = ...
W.Webmasters = ...
W.Routes = ...
server := W.NewEngine(DEBUG_MODE, false, `test`, ROOT_DIR)
server.StartServer(LISTEN_ADDR)

And just like before, after doing this intensive benchmark, the Ubuntu showed an error for Scylla:

The bad part about Redis is the scalability stuck on single core, if you add more server the write will not scale, so Scylla is better replacement if you want to do horizontal scaling.

2017-01-01

Redis GUI

Redis is one of full featured in-memory database with optional persistence and replication. Redis support 5 kind of data types: key-value (SET, GET), hashtable (HSET, HGET), linkedlist (L/RPUSH, L/RPOP), sets (SADD, SREM) and sorted/scored-sets (ZADD, ZREM).

Looking for GUI for redis? Thy these apps, from the best to the least:

Redis React

Built using Mono + ReactJS. Just download and Run (install Mono first if you are using Linux).

Redis Desktop Manager

Built using C++ and Qt5.

yaourt --needed --noconfirm -S --force redis-desktop-manager

FastoRedis

A fork of FastoNoSql, I don't know what this one built with, since the github repository doesn't have the source.

Redis Commander

Built using NodeJS

sudo npm install -g redis-commander

Rebrow

Built using Python2 and Flask Framework. Just clone the repository, install its dependencies and run.

Btw, happy new year :3

2016-02-28

Unfair Database Benchmark

So today we will benchmark some popular database, PostgreSQL, MySQL (Percona), Aerospike, Couchbase, MongoDB, Redis under 64-bit Linux (i7-4700MQ, 16GB RAM, non SSD harddisk) and Go using certain library. The settings for each database should have maximum 4GB in working memory, each insert should be done in transaction (if possible), index must be created (if possible), searching 40 times of insertion number in random.

Disclaimer: this is not apple to apple comparison, you can look for the source (and sent a pull request) on my github or dropbox. Let's get started!

PostgreSQL (9.5.1)

using JSONB data type

sudo pacman -S postgresql

sudo su - postgres <<EOF
initdb --locale en_CA.UTF-8 -E UTF8 -D '/var/lib/postgres/data'
EOF

echo "
max_connections = 64
superuser_reserved_connections = 3
shared_buffers = 2GB
work_mem = 32MB
dynamic_shared_memory_type = posix
effective_cache_size = 4GB
logging_collector = off
datestyle = 'iso, dmy'
timezone = 'Asia/Jakarta'
lc_messages = 'en_CA.UTF-8'
lc_monetary = 'en_CA.UTF-8'
lc_numeric = 'en_CA.UTF-8'
lc_time = 'en_CA.UTF-8'
default_text_search_config = 'pg_catalog.english'
" > /var/lib/postgresql/data/postgresql.conf

sudo systemctl start postgresql

sudo su - postgres <<EOF
createdb test2
EOF

go get -u -v gitlab.com/kokizzu/gokil

The result:

initial database size: 7192 kB
INSERT micro 100 983.62 ms
INSERT tiny 1000 13223.53 ms
INSERT small 10000 119006.15 ms
final database size: 10 MB
SEARCH micro 100 354.34 ms
SEARCH tiny 1000 3499.50 ms
SEARCH small 10000 35499.98 ms

MySQL (Percona 5.7.10)

using InnoDB, current JSON data type doesn't support indexing

sudo pacman -S percona-server percona-server-clients

echo '[client]
port = 3306
socket = /run/mysqld/mysqld.sock
[mysqld]
port = 3306
socket = /run/mysqld/mysqld.sock
datadir = /var/lib/mysql
symbolic-links=0
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
key_buffer_size = 800M
max_allowed_packet = 16M
table_open_cache = 64
sort_buffer_size = 32M
net_buffer_length = 4M
read_buffer_size = 16M
read_rnd_buffer_size = 16M
myisam_sort_buffer_size = 128M
innodb_buffer_pool_size = 4G
[mysqld_safe]
log-error = /var/log/mysqld.log
pid-file = /run/mysqld/mysqld.pid

' > /etc/mysql/my.cnf

sudo systemctl start mysqld

echo 'CREATE DATABASE test2' | mysql -u root

The result:

initial database size: 0
INSERT micro 100 3258.85 ms
INSERT tiny 1000 32297.04 ms
INSERT small 10000 341951.99 ms
final database size: 2.02 MB
SEARCH micro 100 282.32 ms
SEARCH tiny 1000 2843.62 ms
SEARCH small 10000 28404.53 ms

MongoDB (3.2.3)

using struct and map[string]interface{} inside the struct

sudo pacman -S mongodb

go get -u -v gopkg.in/mgo.v2

The result:

initial database size: 78 MB
INSERT micro 100 10.19 ms
INSERT tiny 1000 83.58 ms
INSERT small 10000 894.40 ms
final database size: 78 MB
SEARCH micro 100 321.75 ms
SEARCH tiny 1000 3183.75 ms
SEARCH small 10000 33069.50 ms

CouchBase (4.0.0)

invalid implementation, always got panic: {MCResponse status=0x20 keylen=0, extralen=0, bodylen=12}

sudo yaourt -S couchbase-server-community

go get -u -v github.com/couchbase/go-couchbase

sudo systemctl restart couchbase-server

Visit http://127.0.0.1:8091 to configure, I use 3000MB data cache and 1000MB index cache.

The result (incorrect):

initial database size: 4.06 MB
INSERT micro 100 132.29 ms
INSERT tiny 1000 1317.61 ms
INSERT small 10000 13082.86 ms
final database size: -
SEARCH micro 100 - ms
SEARCH tiny 1000 69020.08 ms
SEARCH small 10000 - ms

AeroSpike (3.7.4)

without secondary index, since I could not found source or binary package for their command line tool for creating the index)

sudo yourt -S aerospike-server-ce
# change this two line on PKGBUILD:
# source=("http://aerospike.com/download/server/latest/artifact/tgz")
# sha512sums=('e9a574e5020db9542de51ad0a1d8da28b8f89d10505848c406656cf113654c0135dfd63fe0aaaafb06f50938124a00275f1b2a0a5ead7058b7b53964c046f3a1')

go get -u -v github.com/aerospike/aerospike-client-go

cd /

sudo aerospike init

echo '

namespace test2 {

replication-factor 2

memory-size 4G

default-ttl 0 # 30 days, use 0 to never expire/evict.

storage-engine device {

file /opt/aerospike/data

filesize 1G

}

}' >> /etc/aerospike.conf

sudo mkdir /opt/aerospike

sudo aerospike start

The result:

initial database size: 1 GB
INSERT micro 100 4.63 ms
INSERT tiny 1000 40.75 ms
INSERT small 10000 426.20 ms
final database size: 1 GB
SEARCH micro 100 113.93 ms
SEARCH tiny 1000 1086.70 ms
SEARCH small 10000 11259.24 ms

Redis (3.0.7)

using gob for serialization and deserialization.

sudo yourt -S redis

go get -u -v github.com/garyburd/redigo/redis

sudo systemctl start redis

The result:

initial database size: 18 bytes
INSERT micro 100 3.94 ms
INSERT tiny 1000 38.28 ms
INSERT small 10000 377.24 ms
final database size: 3.9 MB
SEARCH micro 100 424.79 ms
SEARCH tiny 1000 4263.79 ms
SEARCH small 10000 42904.11 ms

Here's the recap:

Subscribe to: Posts ( Atom )