2017-05-26

GotRo Framework Tutorial: Go, Redis and PostgreSQL

GotRo is an opinionated collection of libraries and a framework for Go; it's a rewrite of the Gokil framework, built specifically for Go+Redis+PostgreSQL web application development. Gokil was written using the infamous julienschmidt's httprouter, the fastest router at that time (2014), which a year later (2015) was forked as buaazp's fasthttprouter, built on top of the even faster valyala's fasthttp. In this tutorial, we will learn how to use this framework with Redis and PostgreSQL to create SOHO or even medium-enterprise-class web projects.

Each directory/package is named with 1 letter and contains commonly used functions for a specific concern:

  • A - Array
  • B - Boolean
  • C - Character (or Rune)
  • D - Database
  • F - Floating Point
  • L - Logging
  • M - Map
  • I - Integer
  • S - String
  • T - Time (and Date)
  • W - Web (the "framework") -- deprecated, use W2 instead
  • X - Anything (aka interface{})
  • Z - Z-Template Engine, which has a syntax similar to Ruby string interpolation #{foo}, plus other JavaScript-friendly syntaxes: {/* foo */}, [/* bar */], and /*! bar */
To use these libraries, run this command in the console:

go get -u -v github.com/kokizzu/gotro
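
For a quick taste of these one-letter packages, here's a minimal sketch using only helpers that appear later in this tutorial (I.ToS, S.IfElse, M.SX):

package main

import (
	"fmt"

	"github.com/kokizzu/gotro/I"
	"github.com/kokizzu/gotro/M"
	"github.com/kokizzu/gotro/S"
)

func main() {
	n := int64(42)
	str := I.ToS(n)                           // convert int64 to string: `42`
	label := S.IfElse(n > 10, `big`, `small`) // ternary-style helper
	m := M.SX{`n`: n, `label`: label}         // map with string keys, any value type
	fmt.Println(str, m)
}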

To start a new project, copy a directory from the W/example-simplified folder to your $GOPATH/src; that's the base of your project, and it should contain something like this:

├── public
│   └── lib
│       └── jquery.js
├── start_dev.sh
├── server.go
└── views
    ├── error.html
    ├── layout.html
    ├── login_example.html
    ├── named_params_example.html
    ├── post_values_example.html
    └── query_string_example.html

To start the development server, run ./start_dev.sh; it should show something like this:

set ownership of $GOROOT..
remove $GOPATH/pkg if go upgraded/downgraded..
precompile all dependencies..
hello1
starting gin..
[gin] listening on port 3000
2017-05-26 10:55:59.835 StartServer ▶ Gotro Example [DEVELOPMENT] server with 6 route(s) on :3001
  Work Directory: /home/asd/go/src/hello1/

If you get an error, it's probably because you haven't installed Redis, which is used in this example to store sessions. To install it on Ubuntu, type:

sudo apt-get install redis-server
sudo systemctl enable redis-server
sudo systemctl start redis-server

To see the example, open your browser at http://localhost:3000
Port 3000 is the proxy port for gin, a program that auto-recompiles the server when the source code changes; the server itself listens on port 3001. If you change the port in the source code, you must also change the gin target port in the start_dev.sh file, by replacing -a 3001 and -p 3000.

Next, let's look at the example in the server.go file:

redis_conn := Rd.NewRedisSession(``, ``, 9, `session::`)
global_conn := Rd.NewRedisSession(``, ``, 10, `session::`)
W.InitSession(`Aaa`, 2*24*time.Hour, 1*24*time.Hour, *redis_conn, *global_conn)
W.Mailers = map[string]*W.SmtpConfig{
	``: {
		Name:     `Mailer Daemon`,
		Username: `test.test`,
		Password: `123456`,
		Hostname: `smtp.gmail.com`,
		Port:     587,
	},
}
W.Assets = ASSETS
W.Webmasters = WEBMASTER_EMAILS
W.Routes = ROUTERS
W.Filters = []W.Action{AuthFilter}
// web engine
server := W.NewEngine(DEBUG_MODE, false, PROJECT_NAME+VERSION, ROOT_DIR)
server.StartServer(LISTEN_ADDR)

There are 2 Redis connections: one for storing the local session and one for storing the global session (used for cross-app communication).
You must call W.InitSession to tell the framework the cookie name, the default expiration (how long until a cookie expires), and the renewal interval (how often it should be renewed). On the next line we set the mailers, W.Mailers, the connections used to send e-mail if there is a panic or any other critical error within your web server.
W.Assets lists the asset files; it should contain any CSS or JavaScript file that will be included on every page, and the assets should be saved in the public/css/ or public/js/ directory. This is an example of how to fill them:

var ASSETS = [][2]string{
//// http://api.jquery.com/ 1.11.1
{`js`, `jquery`},
////// http://hayageek.com/docs/jquery-upload-file.php
{`css`, `uploadfile`},
{`js`, `jquery.form`},
{`js`, `jquery.uploadfile`},
//// https://vuejs.org/v2/guide/ 2.0
{`js`, `vue`},
//// http://momentjs.com/ 2.17.1
{`js`, `moment`},
//// github.com/kokizzu/semantic-ui-daterangepicker
{`css`, `daterangepicker`},
{`js`, `daterangepicker`},
//// http://semantic-ui.com 2.2 // should be below `js` and `css` items
{`/css`, `semantic/semantic`},
{`/js`, `semantic/semantic`},
//// global, helpers, project specific
{`/css`, `global`},
{`/js`, `global`},
}

If you start the file's type with a slash, the file will be located by absolute path starting from public/. Currently only js and css files are supported.

Next we must set W.Webmasters, the hardcoded superadmins: the ones who will receive the error e-mails. This can be checked through ctx.IsWebMaster(), which matches the ctx.Session.GetStr(`email`) value against that list.
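
For example, a minimal sketch (these addresses are placeholders, not from the example project):

var WEBMASTER_EMAILS = []string{
	`webmaster@example.com`,
	`admin@example.com`,
}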

In the next initialization phase, you must set the routes, W.Routes, which assign a URL path to a handler function, for example:

var ROUTERS = map[string]W.Action{
	``:                            LoginExample,
	`login_example`:               LoginExample,
	`post_values_example`:         PostValuesExample,
	`named_params_example/:test1`: NamedParamsExample,
	`query_string_example`:        QueryStringExample,
}

In this example there are five routes with four different handler functions (you can put them in a package; normally you separate them into different packages based on access level). In the fourth route we capture the :test1 named parameter, which can be any string and can be retrieved by calling ctx.ParamStr(`test1`). Here's an example of separating handlers based on the first path segment:

`accounting/acct_payments`:            fAccounting.AcctPayments,
`accounting/acct_invoices`:            fAccounting.AcctInvoices,
`employee/attendance_list`:            fEmployee.AttendanceList,
`employee/business_trip`:              fEmployee.BusinessTrip,
`human_resource/business_trip`:        fHumanResource.BusinessTrip,
`human_resource/employee/profile/:id`: fHumanResource.EmployeeProfileEdit,
`human_resource/employees`:            fHumanResource.Employees,
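
For the named-params route above, the captured segment can be read with ctx.ParamStr; here's a sketch (the view name and locals are illustrative):

func NamedParamsExample(ctx *W.Context) {
	test1 := ctx.ParamStr(`test1`) // captures :test1 from the URL path
	ctx.Render(`named_params_example`, M.SX{
		`title`: `Named params example`,
		`test1`: test1,
	})
}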

A handler function should have exactly one parameter with type *W.Context, for example:

func PostValuesExample(ctx *W.Context) {
	if ctx.IsAjax() {
		ajax := AjaxResponse()
		value := ctx.Posts().GetStr(`test2`)
		ajax.Set(`test3`, value)
		ctx.AppendJson(ajax.SX)
		return
	}
	ctx.Render(`view1`, M.SX{ // <-- locals of the view
		`title`: `Post example`,
		`map`:   M.SI{`test1`: 1, `test4`: 4},
		`arr`:   []int{1, 2, 3, 4},
	})
}

In the function above, we check whether the request is an AJAX request (i.e. whether the request method is POST); if so, we assume it was sent by something like this jQuery snippet:

var data = {test2: 'foo'};
$.post('', data, function(res) {
	alert("Value: " + res.test3);
}).fail(function(xhr, textStatus, errorThrown) {
	alert(textStatus + '\n' + xhr.status);
});

In the JavaScript snippet above, we send a value with key test2, filled with the string foo, to the current page through an AJAX HTTP POST. The server captures it and sends that string back as an object with key test3; note that anything you put in the response will be converted to JSON. The JavaScript then retrieves that value through the callback (third line of the snippet).

But if the client's request is not a POST, the server will call ctx.Render, which loads the file view1.html from the views/ directory. If you need to pass anything to that view, put it in an M.SX, a map with string keys and values of any type; note that everything you put in this map will be rendered as JSON. But what's the syntax? This template engine is called the Z-Template engine, designed for simplicity and compatibility with JavaScript syntax; unlike other template engines, its syntax does not interfere with a JavaScript IDE's autocomplete feature. Here's an example that renders the values above:

<h1>#{title}</h1>
<h2>#{something that not exists}</h2>
<script>
  var title = '#{title}'; // 'Post example'
  var a_map = {/* map */}; // {"test1":1,"test4":4}
  var an_arr = [/* arr */]; // [1,2,3,4]
</script>

Unlike other template engines, any value given to the Render method that is not used will show a warning, and any key used in the template that is not provided to the Render function will render the key itself (e.g.: something that not exists).

Wait, in PHP you can retrieve a query parameter using the $_GET variable; how do you do that in this framework?

// this is Go
ctx.QueryParams().GetInt(`theKey`) // equal to $_GET['theKey']
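
A full handler using it could look like this (a sketch; the view name and locals are illustrative):

func QueryStringExample(ctx *W.Context) {
	// e.g. GET /query_string_example?theKey=42
	theKey := ctx.QueryParams().GetInt(`theKey`)
	ctx.Render(`query_string_example`, M.SX{
		`title`:  `Query string example`,
		`theKey`: theKey,
	})
}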

Now back to the handler function: the ctx parameter can be used to control the output. Normally, when you call the Render method, it also wraps the rendered view with views/layout.html; if you don't want that, you can call this:

ctx.NoLayout = true
ctx.Buffer.Reset() // to clear rendered things if you already call Render method
ctx.Title = `something` // to set the title, if you use the layout

The layout view has some provided values (locals): title, project_name, assets (the js and css you list in the assets), is_superadmin (whether the currently logged-in person is a webmaster), and debug_mode (always true unless you update the VERSION variable at compile time).

You can see the other available methods and properties by ctrl-clicking the W.Context type in your IDE (Gogland, Wide, Visual Studio Code, etc.).

Now, how do we connect to the database? First you must install the database, for example PostgreSQL 9.6 on Ubuntu:

sudo apt-get install postgresql
sudo systemctl enable postgresql
hba=/etc/postgresql/9.6/main/pg_hba.conf
sudo sed -i 's|local   all             all                                     peer|local all all trust|g' $hba
sudo sed -i 's|host    all             all             127.0.0.1/32            md5|host all all 127.0.0.1/32 trust|g' $hba
sudo sed -i 's|host    all             all             ::1/128                 md5|host all all ::1/128 trust|g' $hba
echo 'local all test1 trust' | sudo tee -a $hba # if needed
sudo systemctl start postgresql 
sudo su - postgres <<EOF
createuser test1
createdb test1
psql -c 'GRANT ALL PRIVILEGES ON DATABASE test1 TO test1;'

EOF
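
Before wiring the database into the framework, you can verify it is reachable from Go with plain lib/pq (a sketch; the connection parameters assume the trust setup above):

package main

import (
	"database/sql"
	"fmt"

	_ "github.com/lib/pq"
)

func main() {
	// connect as the test1 user to the test1 database created above
	db, err := sql.Open(`postgres`, `user=test1 dbname=test1 sslmode=disable`)
	if err != nil {
		panic(err)
	}
	if err = db.Ping(); err != nil {
		panic(err)
	}
	fmt.Println(`connected`)
}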

After verifying that your database was created correctly, create a directory, for example model/, then create a file inside it, for example conn.go, with this content:

package model

import (
	"github.com/kokizzu/gotro/D/Pg"
	_ "github.com/lib/pq"
)

var PG_W, PG_R *Pg.RDBMS

func init() {
	PG_W = Pg.NewConn(`test1`, `test1`)
	// ^ later when scaling we replace this one
	PG_R = Pg.NewConn(`test1`, `test1`)
}

In the code above we create two connections, a writer and a reader; this is the recommended way to scale, spreading reads across multiple servers. If you need faster writes (but without join support), you can use ScyllaDB or Redis. Next, we create a program to initialize our tables, for example in go/init.go:

package main
import "hello1/model"
func main() {
  model.PG_W.CreateBaseTable(`users`, `users`)
  model.PG_W.CreateBaseTable(`todos`, `users`) // 2nd table
}

You must execute gotro/D/Pg/functions.sql using psql before running the code above. It creates the 2 tables with their indexes, plus 2 log tables with triggers. You can check them inside psql -U test1 using the \dt+ or \d users command, which shows something like this:

                          Table "public.users" 
  Column    |           Type           |                     Modifiers 
------------+--------------------------+-----------------------------------------
id          | bigint                   | not null default nextval('users_id_seq'::regclass) 
unique_id   | character varying(4096)  | 
created_at  | timestamp with time zone | default now() 
updated_at  | timestamp with time zone | 
deleted_at  | timestamp with time zone | 
restored_at | timestamp with time zone | 
modified_at | timestamp with time zone | default now() 
created_by  | bigint                   | 
updated_by  | bigint                   | 
deleted_by  | bigint                   | 
restored_by | bigint                   | 
is_deleted  | boolean                  | default false 
data        | jsonb                    |

This is our generic table. What if we need more columns? You don't need to alter the table; we use PostgreSQL's JSONB data column. JSONB is very powerful: it can be indexed and queried using the arrow operators, giving it an edge over its competitors. Using this exact table design, we can store the old and updated values in the log table every time somebody changes a value.

Ok, now let's create a real model for the users table: create a package and file mUsers/m_users.go with this content:

package mUsers
import (
 "Billions/sql"
 "github.com/kokizzu/gotro/A"
 "github.com/kokizzu/gotro/D/Pg"
 "github.com/kokizzu/gotro/I"
 "github.com/kokizzu/gotro/M"
 "github.com/kokizzu/gotro/S"
 "github.com/kokizzu/gotro/T"
 "github.com/kokizzu/gotro/W"
)
const TABLE = `users`
var TM_MASTER Pg.TableModel
var SELECT = ``
var Z func(string) string
var ZZ func(string) string
var ZJ func(string) string
var ZB func(bool) string
var ZI func(int64) string
var ZLIKE func(string) string
var ZT func(...string) string
var PG_W, PG_R *Pg.RDBMS
func init() {
 Z = S.Z
 ZB = S.ZB
 ZZ = S.ZZ
 ZJ = S.ZJ
 ZI = S.ZI
 ZLIKE = S.ZLIKE
 ZT = S.ZT
 PG_W = sql.PG_W
 PG_R = sql.PG_R
 TM_MASTER = Pg.TableModel{
  CacheName: TABLE + `_USERS_MASTER`,
  Fields: []Pg.FieldModel{
   {Key: `id`},
   {Key: `is_deleted`},
   {Key: `modified_at`},
   {Label: `E-Mail(s)`, Key: `emails`, CustomQuery: `emails_join(data)`, Type: `emails`, FormTooltip: `separate with comma`},
   {Label: `Phone`, Key: `phone`, Type: `phone`, FormHide: true},
   {Label: `Full Name`, Key: `full_name`},
  },
 }
 SELECT = TM_MASTER.Select()
}
func One_ByID(id string) M.SX {
 ram_key := ZT(id)
 query := ram_key + `
SELECT ` + SELECT + `
FROM ` + TABLE + ` x1
WHERE x1.id::TEXT = ` + Z(id)
 return PG_R.CQFirstMap(TABLE, ram_key, query)
}
func Search_ByQueryParams(qp *Pg.QueryParams) {
 qp.RamKey = ZT(qp.Term)
 if qp.Term != `` {
  qp.Where += ` AND (x1.data->>'name') LIKE ` + ZLIKE(qp.Term)
 }
 qp.From = `FROM ` + TABLE + ` x1`
 qp.OrderBy = `x1.id`
 qp.Select = SELECT
 qp.SearchQuery_ByConn(PG_W)
}
/* accessed through: {"order":["-col1","+col2"],"filter":{"is_deleted":false,"created_at":">isodate"},"limit":10,"offset":5}
   this will retrieve records 6-15, ordered by col1 descending then col2 ascending, filtered by is_deleted = false and created_at > isodate
*/
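
Since the row data lives in JSONB, an ad-hoc lookup on a JSON field follows the same pattern as One_ByID, just with the arrow operator in the WHERE clause; a sketch (One_ByEmail is a hypothetical helper, not part of the example project):

func One_ByEmail(email string) M.SX {
	ram_key := ZT(email)
	query := ram_key + `
SELECT ` + SELECT + `
FROM ` + TABLE + ` x1
WHERE (x1.data->>'email') = ` + Z(email)
	return PG_R.CQFirstMap(TABLE, ram_key, query)
}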

If the example above is too complex for you, you can also do it manually; see gotro/D/Pg/_example for a simpler example. In the example above we created a query model that queries from a single table. If you need multiple tables (a join), you can extend the fields, something like this:

 {Label: `Admin`, Key: `admin`, CustomQuery: `x2.data->>'full_name'`},

And set the query params to something like this:

qp.From = `FROM ` + TABLE + ` x1 LEFT JOIN ` + mAdmin.TABLE + ` x2 ON (x1.data->>'admin_id') = x2.id::TEXT `

You can also do something like this:

func All_ByStartID_ByLimit_IsAsc_IsIncl(id string, limit int64, is_asc, is_incl bool) A.MSX {
	sign := S.IfElse(is_asc, `>`, `<`) + S.If(is_incl, `=`)
	ram_key := ZT(id, I.ToS(limit), sign)
	where := ``
	if id != `` {
		where = `AND x1.id ` + sign + Z(id)
	}
	query := ram_key + `
SELECT ` + SELECT + `
FROM ` + TABLE + ` x1
WHERE x1.is_deleted = false
	` + where + `
ORDER BY x1.id ` + S.If(!is_asc, `DESC`) + `
LIMIT ` + I.ToS(limit)
	return PG_R.CQMapArray(TABLE, ram_key, query)
}
// accessed through: {"limit":10}
// this will retrieve the last 10 records

Or query a single row:

func API_Backoffice_Form(rm *W.RequestModel) {
	rm.Ajax.SX = One_ByID(rm.Id)
}
// accessed through: {a:'form',id:'123'}
// this will retrieve all columns of this record

Or create a save/delete/restore function:

func API_Backoffice_SaveDeleteRestore(rm *W.RequestModel) {
	PG_W.DoTransaction(func(tx *Pg.Tx) string {
		dm := Pg.NewRow(tx, TABLE, rm)      // NewPostlessData
		emails := rm.Posts.GetStr(`emails`) // rm is the requestModel, values provided by the http request
		dm.Set_UserEmails(emails)           // dm is the dataModel, the row we want to update
		// we can call dm.Get* to retrieve old record values
		dm.SetStr(`full_name`)
		dm.UpsertRow()
		if !rm.Ajax.HasError() {
			dm.WipeUnwipe(rm.Action)
		}
		return rm.Ajax.LastError()
	})
}
// accessed through: {a:'save',full_name:'foo',id:'1'} // update
// if without id, it would insert

Then you can call them from a handler or a package-internal function, something like:

func API_Backoffice_FormLimit(rm *W.RequestModel) {
	id := rm.Posts.GetStr(`id`)
	limit := rm.Posts.GetInt(`limit`)
	is_asc := rm.Posts.GetBool(`asc`)
	is_incl := rm.Posts.GetBool(`incl`)
	result := All_ByStartID_ByLimit_IsAsc_IsIncl(id, limit, is_asc, is_incl)
	rm.Ajax.Set(`result`, result)
}

func API_Backoffice_Search(rm *W.RequestModel) {
	qp := Pg.NewQueryParams(rm.Posts, &TM_MASTER)
	Search_ByQueryParams(qp)
	qp.ToMap(rm.Ajax)
}

And call those two API functions inside a handler, something like this:

func PrepareVars(ctx *W.Context, title string) *W.RequestModel {
	user_id := ctx.Session.GetStr(`id`)
	rm := &W.RequestModel{
		Actor:   user_id,
		DbActor: user_id,
		Level:   ctx.Session.SX,
		Ctx:     ctx,
	}
	ctx.Title = title
	is_ajax := ctx.IsAjax()
	if is_ajax {
		rm.Ajax = NewAjaxResponse()
	}
	page := rm.Level.GetMSB(`page`)
	first_segment := ctx.FirstPath()
	_, _ = page, first_segment // placeholders: used by your access checks below
	// validate whether this user may access this first segment:
	// check their access level; if it's not ok, set rm.Ok to false,
	// then render an error, something like this:
	/*
		if is_ajax {
			rm.Ajax.Error(sql.ERR_403_MUST_LOGIN_HIGHER)
			ctx.AppendJson(rm.Ajax.SX)
			return rm
		}
		ctx.Error(403, sql.ERR_403_MUST_LOGIN_HIGHER)
		return rm
	*/
	if !is_ajax {
		// render menu based on privilege
	} else {
		// prepare variables required for the ajax response
		rm.Posts = ctx.Posts()
		rm.Action = rm.Posts.GetStr(`a`)
		id := rm.Posts.GetStr(`id`)
		rm.Id = S.IfElse(id == `0`, ``, id)
	}
	return rm
}

func Users(ctx *W.Context) {
	rm := PrepareVars(ctx, `Users`)
	if !rm.Ok {
		return
	}
	if rm.IsAjax() {
		// handle ajax
		switch rm.Action {
		case `search`: // @API
			mUsers.API_Backoffice_Search(rm)
		case `form_limit`: // @API
			mUsers.API_Backoffice_FormLimit(rm)
		case `form`: // @API
			mUsers.API_Backoffice_Form(rm)
		case `save`, `delete`, `restore`: // @API
			mUsers.API_Backoffice_SaveDeleteRestore(rm)
		default: // @API-END
			handler.ErrorHandler(rm.Ajax, rm.Action)
		}
		ctx.AppendJson(rm.Ajax.SX)
		return
	}
	locals := W.Ajax{SX: M.SX{
		`title`: ctx.Title,
	}}
	qp := Pg.NewQueryParams(nil, &mUsers.TM_MASTER)
	mUsers.Search_ByQueryParams(qp)
	qp.ToMap(locals)
	ctx.Render(`backoffice/users`, locals.SX)
}

Now that we're done creating the backend API server, all that's left is to create the systemd service hello1.service:

[Unit]
Description=My Hello1 Service
After=network-online.target postgresql.service
Wants=network-online.target systemd-networkd-wait-online.service

[Service]
Type=simple
Restart=on-failure
User=yourusername
Group=users
WorkingDirectory=/home/yourusername/web
ExecStart=/home/yourusername/web/run_production.sh
ExecStop=/usr/bin/killall Hello1
LimitNOFILE=2097152
LimitNPROC=65536
ProtectSystem=full
NoNewPrivileges=true

[Install]
WantedBy=multi-user.target

Create the run_production.sh shell script:

#!/usr/bin/env bash
ofile=logs/access_`date +%F_%H%M%S`.log
echo Logging into: `pwd`/$ofile
unbuffer time ./Hello1 | tee $ofile

Then compile the binary (you can also set the VERSION here, to mark it as production):

go build -ldflags " -X main.LISTEN_ADDR=:${SUB_PORT} " -o /tmp/Subscriber

Copy the binary, the script above, and the whole public/ and views/ directories to /home/yourusername/web on the server, copy the service file to /usr/lib/systemd/system/, then reload the systemd services on the server:

sudo systemctl daemon-reload
sudo systemctl enable hello1
sudo systemctl start hello1

You're good to go. You can check the service status using journalctl -u hello1 -f.
Of course, you can automate the hassle above using the scp or rsync command.

Well, that's all for now. You can see the complete example in the W/example-complex directory. If you have any questions, you can contact me through Telegram @kokizzu. For frontend stuff, I recommend learning VueJS, or Weex for mobile.

2017-05-22

Go-Redis vs RediGo (also Aerospike)

This is an old benchmark result testing Redis and Aerospike, both in-memory databases. I did this around December last year, to compare Redis against Aerospike for the case of storing one random session per request:

Redis (redigo)
Transactions:                  52343 hits 
Availability:                 100.00 % 
Elapsed time:                   9.16 secs 
Data transferred:              17.02 MB 
Response time:                  0.04 secs 
Transaction rate:            5714.30 trans/sec  (1654 worst >1M uQ sess)
Throughput:                     1.86 MB/sec 
Concurrency:                  252.55 
Successful transactions:       52343 
Failed transactions:               0 
Longest transaction:            1.13 
Shortest transaction:           0.00

Aerospike (aerospike-client-go)
Transactions:                  80806 hits
Availability:                 100.00 %
Elapsed time:                   9.71 secs
Data transferred:              26.28 MB
Response time:                  0.03 secs
Transaction rate:            8321.94 trans/sec (8999 best, 7769 worst)
Throughput:                     2.71 MB/sec
Concurrency:                  251.91
Successful transactions:       80806
Failed transactions:               0
Longest transaction:            1.17
Shortest transaction:           0.00

Redis (go-redis)
Transactions:                  91187 hits 
Availability:                 100.00 % 
Elapsed time:                   9.95 secs 
Data transferred:              29.65 MB 
Response time:                  0.03 secs 
Transaction rate:            9164.52 trans/sec (3536 worst >1M uQ sess)
Throughput:                     2.98 MB/sec 
Concurrency:                  252.70 
Successful transactions:       91187 
Failed transactions:               0 
Longest transaction:            0.20 
Shortest transaction:           0.00 

The bad part about Redis (which uses a SkipList): the more data we store, the faster it slows down. In this case 1 million stored sessions slowed Redis down by more than 60%, while Aerospike only slowed down by 10%.

Redis 3.2.1 vs ScyllaDB 1.7RC2

Since Scylla still has no secondary index, all I can do with it is replace Redis for storing user login sessions. This benchmark only tests read queries (queries that always return zero rows because the record does not exist), and the results are:

# Redis 
$ hey -c 255 -n 255000 http://localhost:3001
3089 requests done.
7217 requests done.
11691 requests done.
*snip*
241822 requests done.
246305 requests done.
250697 requests done.
All requests done.

Summary:
  Total:        29.5162 secs
  Slowest:      0.1647 secs
  Fastest:      0.0003 secs
  Average:      0.0294 secs
  Requests/sec: 8639.3205
  Total data:   2732835000 bytes
  Size/request: 10717 bytes

Status code distribution:
  [200] 255000 responses

Response time histogram:
  0.000 [1]     |
  0.017 [4084]  |∎
  0.033 [194502]|∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.050 [54311] |∎∎∎∎∎∎∎∎∎∎∎
  0.066 [1812]  |
  0.082 [65]    |
  0.099 [7]     |
  0.115 [122]   |
  0.132 [48]    |
  0.148 [25]    |
  0.165 [23]    |

Latency distribution:
  10% in 0.0231 secs
  25% in 0.0255 secs
  50% in 0.0286 secs
  75% in 0.0325 secs
  90% in 0.0370 secs
  95% in 0.0404 secs
  99% in 0.0487 secs

# ScyllaDB best response time
$ hey -c 255 -n 255000 http://localhost:3001
2114 requests done.
4874 requests done.
7714 requests done.
*snip*
247202 requests done.
249898 requests done.
252610 requests done.
All requests done.

Summary:
  Total:        48.5436 secs
  Slowest:      0.2649 secs
  Fastest:      0.0013 secs
  Average:      0.0483 secs
  Requests/sec: 5253.0127
  Total data:   2732835000 bytes
  Size/request: 10717 bytes

Status code distribution:
  [200] 255000 responses

Response time histogram:
  0.001 [1]     |
  0.028 [6804]  |∎∎
  0.054 [176673]|∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.080 [66748] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.107 [3728]  |∎
  0.133 [470]   |
  0.159 [250]   |
  0.186 [46]    |
  0.212 [79]    |
  0.239 [144]   |
  0.265 [57]    |

Latency distribution:
  10% in 0.0334 secs
  25% in 0.0399 secs
  50% in 0.0466 secs
  75% in 0.0552 secs
  90% in 0.0636 secs
  95% in 0.0699 secs

  99% in 0.0899 secs

# ScyllaDB best Req/s
$ hey -c 255 -n 255000 http://localhost:3001
2188 requests done.
4910 requests done.
7019 requests done.
*snip*
244547 requests done.
249813 requests done.
254894 requests done.
All requests done.

Summary:
  Total:        42.0725 secs
  Slowest:      8.0907 secs
  Fastest:      0.0002 secs
  Average:      0.0418 secs
  Requests/sec: 6060.9647
  Total data:   2732835000 bytes
  Size/request: 10717 bytes

Status code distribution:
  [200] 255000 responses

Response time histogram:
  0.000 [1]     |
  0.809 [254744]|∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  1.618 [0]     |
  2.427 [0]     |
  3.236 [0]     |
  4.045 [0]     |
  4.854 [0]     |
  5.664 [0]     |
  6.473 [0]     |
  7.282 [0]     |
  8.091 [255]   |

Latency distribution:
  10% in 0.0069 secs
  25% in 0.0183 secs
  50% in 0.0347 secs
  75% in 0.0470 secs
  90% in 0.0573 secs
  95% in 0.0640 secs
  99% in 0.0843 secs

This is the example main-function code, using the gotro framework, that was used to run this benchmark:


//// Testing Redis:
//login_conn := Rd.NewRedisSession(``, ``, 1, `session::`)
//global_conn := Rd.NewRedisSession(``, ``, 3, `global::`)

// Testing Scylla:
login_conn := Sc.NewScyllaSession(`127.0.0.1`, `session`, `login`, ``, ``)
global_conn := Sc.NewScyllaSession(`127.0.0.1`, `session`, `global`, ``, ``)

// see example
W.InitSession(`SK`, 12*time.Hour, 6*time.Hour, *login_conn, *global_conn)
W.Mailers = ...
W.Assets = ...
W.Webmasters = ...
W.Routes = ...
server := W.NewEngine(DEBUG_MODE, false, `test`, ROOT_DIR)
server.StartServer(LISTEN_ADDR)

And just like before, after doing this intensive benchmark, Ubuntu showed an error report for Scylla.

The bad part about Redis is that its scalability is stuck on a single core: if you add more servers, the writes will not scale. So Scylla is the better replacement if you want to do horizontal scaling.

2017-05-19

PostgreSQL 9.6.2 vs ScyllaDB 1.7RC2

Since CockroachDB 1.0 is not yet performant (and I ain't got time to wait), today we're gonna test PostgreSQL 9.6.2 vs ScyllaDB 1.7RC2 on Ubuntu, on an XFS filesystem.

test1: postgresql
INSERT: 34.667685316s (3.47 ms/op)
UPDATE: 35.117617526s (3.51 ms/op)
SELECT: 47.529755777s (0.73 ms/op)
CPU: 35.14s     Real: 117.64s   RAM: 58 544 KB

test2: postgresql jsonb
INSERT: 33.861673279s (3.39 ms/op)
UPDATE: 34.038996914s (3.40 ms/op)
SELECT: 45.340834079s (0.70 ms/op)
CPU: 33.62s     Real: 113.58s   RAM: 58 140 KB

test4: scylladb
INSERT: 2.133985799s (0.21 ms/op)
UPDATE: 2.167973712s (0.22 ms/op)
SELECT: 2m24.804415353s (2.22 ms/op)
CPU: 41.29s     Real: 152.57s   RAM: 79 708 KB

Hmm, this is weird, because the differences between this and the previous benchmark are:

Label                  | This                                      | Previous
-----------------------+-------------------------------------------+------------------------
Operating System       | Ubuntu 17.04                              | Manjaro
Kernel                 | 4.10.0-19-generic, tuned by scylla_setup  | aufs_friendly 4.10.13-1
PostgreSQL config*     | ubuntu default: 4G/128M/4M                | modified: 2G/2G/16M
PostgreSQL total time  | 113.58s                                   | 116.34s
ScyllaDB               | official deb 1.7RC2                       | official docker 1.6.4
ScyllaDB config        | 8GB / 8 / XFS                             | 4GB / 4 / EXT4
ScyllaDB total time    | 152.57s                                   | 141.34s

*) effective_cache / shared_buffers / work_mem

Probably the kernel factor? Oh yeah, you can get the source on GitHub (you can create a PR if there's a bug). Now let's test it in parallel with one third of the data (showing only the first 3 and the last 3 results):

test1: postgresql
I-18: (6.17 ms/op: 101)
I-12: (6.16 ms/op: 101)
I-14: (6.19 ms/op: 101)
I-32: (157.80 ms/op: 101)
I-05: (167.31 ms/op: 101)
I-31: (168.23 ms/op: 101)
INSERT: 16.99355453s (5.10 ms/op)
U-13: (5.92 ms/op: 101)
U-16: (5.97 ms/op: 101)
U-04: (6.00 ms/op: 101)
U-21: (1312.46 ms/op: 101)
U-28: (1333.33 ms/op: 101)
U-20: (1333.60 ms/op: 101)
UPDATE: 2m14.695128106s (40.41 ms/op)
S-24: (14.56 ms/op: 139)
S-69: (17.86 ms/op: 115)
S-78: (24.18 ms/op: 88)
S-06: (18.14 ms/op: 556)
S-46: (18.54 ms/op: 556)
S-37: (1427.27 ms/op: 91)
SELECT: 2m9.888985893s (5.98 ms/op: 21716)
CPU: 13.07s     Real: 281.90s   RAM: 59 072 KB

test2: postgresql jsonb
I-03: (6.23 ms/op: 101)
I-10: (6.28 ms/op: 101)
I-04: (6.30 ms/op: 101)
I-08: (157.54 ms/op: 101)
I-29: (225.82 ms/op: 101)
I-22: (314.39 ms/op: 101)
INSERT: 31.754559744s (9.53 ms/op)
U-23: (6.45 ms/op: 101)
U-13: (6.51 ms/op: 101)
U-04: (6.54 ms/op: 101)
U-05: (1287.63 ms/op: 101)
U-01: (1287.77 ms/op: 101)
U-26: (1953.67 ms/op: 101)
UPDATE: 3m17.321405467s (59.20 ms/op)
S-27: (27.54 ms/op: 124)
S-21: (21.89 ms/op: 159)
S-29: (30.55 ms/op: 115)
S-48: (323.69 ms/op: 417)
S-05: (202.41 ms/op: 667)
S-03: (121.61 ms/op: 1111)
SELECT: 2m15.109162326s (6.22 ms/op: 21716)
CPU: 13.78s     Real: 364.50s   RAM: 56 688 KB

test4: scylladb
I-17: (1.59 ms/op: 101)
I-12: (1.59 ms/op: 101)
I-30: (1.60 ms/op: 101)
I-13: (1.65 ms/op: 101)
I-22: (1.65 ms/op: 101)
I-06: (1.65 ms/op: 101)
INSERT: 166.992399ms (0.05 ms/op)
U-06: (1.60 ms/op: 101)
U-12: (1.61 ms/op: 101)
U-29: (1.62 ms/op: 101)
U-31: (1.68 ms/op: 101)
U-30: (1.68 ms/op: 101)
U-11: (1.68 ms/op: 101)
UPDATE: 170.240627ms (0.05 ms/op)
S-79: (52.05 ms/op: 86)
S-74: (50.06 ms/op: 98)
S-75: (53.78 ms/op: 96)
S-03: (12.18 ms/op: 1111)
S-02: (8.43 ms/op: 1666)
S-42: (8.53 ms/op: 1666)
SELECT: 14.678651323s (0.68 ms/op: 21716)
CPU: 16.92s     Real: 18.08s    RAM: 76 824 KB

WTF! These numbers are blazing fast!
Too bad gocql, or probably the scylla-server, sometimes refused to connect:

2017/05/19 14:16:28 gocql: unable to create session: unable to discover protocol version: dial tcp 127.0.0.1:9042: i/o timeout
2017/05/19 14:16:33 gocql: unable to create session: unable to discover protocol version: dial tcp 127.0.0.1:9042: i/o timeout
2017/05/19 14:20:30 gocql: unable to create session: unable to discover protocol version: dial tcp 127.0.0.1:9042: i/o timeout

2017/05/19 14:18:28 gocql: unable to create session: unable to discover protocol version: dial tcp 127.0.0.1:9042: getsockopt: connection refused
2017/05/19 14:18:44 gocql: unable to dial control conn 127.0.0.1: dial tcp 127.0.0.1:9042: i/o timeout
2017/05/19 14:18:44 gocql: unable to create session: control: unable to connect to initial hosts: dial tcp 127.0.0.1:9042: i/o timeout

2017/05/19 14:20:34 gocql: unable to create session: unable to discover protocol version: dial tcp 127.0.0.1:9042: i/o timeout
2017/05/19 14:20:37 gocql: unable to create session: unable to discover protocol version: dial tcp 127.0.0.1:9042: i/o timeout 

2017/05/19 14:27:46 gocql: unable to create session: unable to discover protocol version: dial tcp 127.0.0.1:9042: i/o timeout

Or when using cqlsh 127.0.0.1:

Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(None, "Tried connecting to [('127.0.0.1', 9042)]. Last error: timed out")})
Connection error: ('Unable to complete the operation against any hosts', {})
Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(4, "Tried connecting to [('127.0.0.1', 9042)]. Last error: Interrupted system call")})

Other than those issues, I think ScyllaDB is freakingly awesome!

EDIT: this issue only happened right after doing these intensive benchmarks; after restarting, it no longer happened

2017-05-15

PostgreSQL 9.6.2 vs CockroachDB 1.0 vs ScyllaDB 1.6.4

A new kid on the block, a multi-master database, recently released 1.0; here are some microbenchmark results:

N = 999

test1: postgresql
INSERT: 3.442695947s (3.45 ms/op)
UPDATE: 3.912135754s (3.92 ms/op)
SELECT: 3.408927374s (0.52 ms/op)
CPU: 2.62s      Real: 11.07s    RAM: 57 864 KB

test2: postgresql jsonb
INSERT: 3.270218052s (3.27 ms/op)
UPDATE: 3.796453051s (3.80 ms/op)
SELECT: 3.209289448s (0.49 ms/op)
CPU: 2.33s      Real: 10.57s    RAM: 58 680 KB

test3: cockroachdb
INSERT: 7.495245970s (7.50 ms/op)
UPDATE: 8.249719113s (8.26 ms/op)
SELECT: 16m8.372273781s (148.34 ms/op)
CPU: 2.50s      Real: 986.17s   RAM: 58 340 KB

test4: scylladb
INSERT: 150.117719ms (0.15 ms/op)
UPDATE: 147.339553ms (0.15 ms/op)
SELECT: 5.422713068s (0.83 ms/op)
CPU: 4.06s      Real: 7.76s     RAM: 76 764 KB

N = 9999

test2: postgresql jsonb
INSERT: 36.012436525s (3.60 ms/op)
UPDATE: 35.902222429s (3.59 ms/op)
SELECT: 44.119970723s (0.68 ms/op)
CPU: 32.30s     Real: 116.34s   RAM: 58 632 KB

test4: scylladb
INSERT: 1.518285796s (0.15 ms/op)
UPDATE: 1.542542984s (0.15 ms/op)
SELECT: 2m16.29325852s (2.09 ms/op)
CPU: 41.55s     Real: 141.34s   RAM: 76 712 KB

It's shocking to me that CockroachDB is 2x-19x slower than PostgreSQL, so I filed a bug report, and one for Scylla too (slow queries on larger datasets).
This benchmark was performed on 64-bit ArchLinux, i7-4720HQ, 16GB RAM, 256GB Samsung SSD. You can get the source here. Note that ScyllaDB requires XFS, but I used the EXT4 filesystem.

PostgreSQL
+ battletested for 20 years
+ schema-free (via JSONB, it has indexes :3)
+ triggers, joins, language extensions (eg. pl/v8, pl/go, pl/ruby, etc)
- no multimaster replication, except if you use PostgresXL or Postgres-X2

CockroachDB
+ survived Aphyr's Jepsen tests
+ autoscaling, autohealing, autobalancing
+ only 1 binary file (Go power ^^)
- seriously slow on every part of this benchmark
- no BLOB! (as per 2017-05-15)

ScyllaDB
+ Cassandra rewritten in C++
+ autoscaling, autohealing, autobalancing
+ blazing fast for insert and update benchmark (not sure if it's persisted to disk though)
- no secondary index and serial/auto-increment (as per 2017-05-15)
- only supports Ubuntu, Debian, RHEL (a bit challenging to compile on another OS because it depends on old thrift and boost libraries)
- communication is unauthenticated by default, which is bad if you don't have a private network (e.g. hosting on a public cloud); you must enable internode encryption and put a firewall in place so only certain hosts can access the exposed ports.

2017-05-11

TechEmpower Framework Benchmark Round 14

New benchmark results are out; as usual, the important part is the data-update benchmark:


In that chart, the top-ranking languages are: Kotlin, C, Java, C++, Go, Perl, Javascript, Scala, and C#; and for the databases: MySQL, PostgreSQL, MongoDB.

Also, the other benchmark that reflects a real-world use case is multiple-queries:
In that benchmark, the top-performing programming languages are: Dart, C++, Java, C, Go, Kotlin, Javascript, Scala, Ruby, and Ur; and the databases: MongoDB, PostgreSQL, MySQL. You can see the previous results here, and here.


2017-02-01

Elixir vs Golang

Rather than a debate between newbies and experts in only one language, let's find out the pros and cons of Elixir and Go:
  1. The Syntax and Learning Curve
    In Go you can start after about 1 day of study, since the syntax is really similar to C (most universities teach a C-family language); you can feel productive right away.
    In Elixir you'll need more than just 1 day (and obviously exponentially more to get the feel of Erlang, unless you've learned Prolog and LISP before); the syntax is somewhat similar to Ruby, but you're also required to learn FP concepts (just like other functional languages: Haskell, LISP, Clojure, F#), which could make you a better programmer.
  2. Concurrency and Deployment
    In Go you can achieve faster concurrency on a single machine, but at the cost of memory usage for the same number of unprocessed light threads (about 2-2.6x; see the edit history of the previous link). If you need more than one machine, you must handle distribution manually (but it's easy since Go is statically linked: a simple scp and executing a service script would do).
    In Elixir you can have distributed concurrency; as described by many Erlang experts, BEAM is a 30-year-old battle-tested virtual machine that has these built-in advantages:
    1. Lightweight user-space threads (goroutines require more memory)
    2. Built-in distribution and failure detection (not sure what the comparable library in Go is)
    3. Reliability-oriented standard library (in Go you must check every error)
    4. Hot code swapping (use endless in Go to achieve zero downtime)
    You'll definitely need time to master it.
  3. Raw Performance
    In Go you get raw performance, similar to Java but more memory-efficient; for any CPU-bound task, you should prefer Go over anything that currently has a slower implementation (Javascript, PHP, Python, Perl, Ruby, Erlang/Elixir); see the 16k concurrent users column.
    In Elixir, or any other BEAM language, since the light threads have smaller memory usage, you can handle more processes at the same time.
  4. Hiring
    Since Go is relatively more popular (because it's easier to learn) than the BEAM-based languages (especially Erlang) in terms of the number of job postings I've encountered, the TIOBE index (Jan 2017: Go #13, Erlang #44, Elixir #66), GitHub popularity (go vs elixir), and IEEE Spectrum (Go #10, Erlang #35), if you are a PM/VPE with a tight deadline, I believe Go is the better choice at this moment.
So what should you use for your next project? It always depends on the use case (the right tool/person for the right job) and the deadline; there is no silver bullet. And no, I don't intend to start a flame war.