2017-05-15

PostgreSQL 9.6.2 vs CockroachDB 1.0 vs ScyllaDB 1.6.4

CockroachDB, the new kid on the block, is a multi-master database that recently released version 1.0. Here are some microbenchmark results:

N = 999

test1: postgresql
INSERT: 3.442695947s (3.45 ms/op)
UPDATE: 3.912135754s (3.92 ms/op)
SELECT: 3.408927374s (0.52 ms/op)
CPU: 2.62s      Real: 11.07s    RAM: 57 864 KB

test2: postgresql jsonb
INSERT: 3.270218052s (3.27 ms/op)
UPDATE: 3.796453051s (3.80 ms/op)
SELECT: 3.209289448s (0.49 ms/op)
CPU: 2.33s      Real: 10.57s    RAM: 58 680 KB

test3: cockroachdb
INSERT: 7.495245970s (7.50 ms/op)
UPDATE: 8.249719113s (8.26 ms/op)
SELECT: 16m8.372273781s (148.34 ms/op)
CPU: 2.50s      Real: 986.17s   RAM: 58 340 KB

test4: scylladb
INSERT: 150.117719ms (0.15 ms/op)
UPDATE: 147.339553ms (0.15 ms/op)
SELECT: 5.422713068s (0.83 ms/op)
CPU: 4.06s      Real: 7.76s     RAM: 76 764 KB

N = 9999

test2: postgresql jsonb
INSERT: 36.012436525s (3.60 ms/op)
UPDATE: 35.902222429s (3.59 ms/op)
SELECT: 44.119970723s (0.68 ms/op)
CPU: 32.30s     Real: 116.34s   RAM: 58 632 KB

test4: scylladb
INSERT: 1.518285796s (0.15 ms/op)
UPDATE: 1.542542984s (0.15 ms/op)
SELECT: 2m16.29325852s (2.09 ms/op)
CPU: 41.55s     Real: 141.34s   RAM: 76 712 KB

I was shocked that CockroachDB is 2x-19x slower than PostgreSQL, so I filed a bug report for it, and another one for ScyllaDB (slow queries on larger datasets).
This benchmark was performed on 64-bit Arch Linux with an i7-4720HQ, 16GB RAM, and a 256GB Samsung SSD. You can get the source here. Note that ScyllaDB requires XFS, but I used an EXT4 filesystem.

PostgreSQL
+ battle-tested for 20 years
+ schema-free (via JSONB, it has indexes :3)
+ triggers, joins, language extensions (eg. pl/v8, pl/go, pl/ruby, etc)
- no multimaster replication, except if you use PostgresXL or Postgres-X2

CockroachDB
+ survived Aphyr's Jepsen tests
+ autoscaling, autohealing, autobalancing
+ only 1 binary file (Go power ^^)
- seriously slow on every part of this benchmark
- no BLOB! (as of 2017-05-15)

ScyllaDB
+ Cassandra rewritten in C++
+ autoscaling, autohealing, autobalancing
+ blazing fast in the insert and update benchmarks (not sure whether it's persisted to disk, though)
- no secondary indexes and no serial/auto-increment (as of 2017-05-15)
- only supports Ubuntu, Debian, and RHEL (a bit challenging to compile on other OSes because it depends on old Thrift and Boost libraries)
- communication is unauthenticated by default, which is bad if you don't have a private network (e.g. when hosting on a public cloud); you must enable inter-node encryption and set up a firewall so that only certain hosts can access the exposed ports.

2017-05-11

TechEmpower Framework Benchmark Round 14

The new benchmark results are out. As usual, the important part is the data-update benchmark:


In that chart, the top-ranking languages are: Kotlin, C, Java, C++, Go, Perl, Javascript, Scala, C#; and the databases: MySQL, PostgreSQL, MongoDB.

The other benchmark that reflects real-world cases is multiple-queries:
In that benchmark, the top-performing programming languages are: Dart, C++, Java, C, Go, Kotlin, Javascript, Scala, Ruby, and Ur; and the databases: MongoDB, PostgreSQL, MySQL. You can see the previous results here, and here.


2017-02-01

Elixir vs Golang

Rather than have a debate between newbies and experts who each know only one language, let's look at the pros and cons of Elixir and Go:
  1. The Syntax and Learning Curve
    In Go you can start after about 1 day of study, since the syntax is really similar to C (most universities teach a C-family language), so you can feel productive right away.
    In Elixir you'll need more than 1 day (and obviously much more to get a feel for Erlang, unless you've learned Prolog and LISP before). The syntax is somewhat similar to Ruby, but you are also required to learn FP concepts (just like in other functional languages: Haskell, LISP, Clojure, F#), which could make you a better programmer.
  2. Concurrency and Deployment
    In Go you can achieve faster concurrency on a single machine, but at the cost of higher memory usage for the same number of unprocessed light threads (about 2-2.6x, see the edit history of the previous link). If you need more than one machine, you must distribute the work manually (but it's easy since Go binaries are statically linked: a simple scp and running a service script will do).
    In Elixir you get distributed concurrency. As described by many Erlang experts, BEAM is a 30-year-old battle-tested virtual machine with these built-in advantages:
    1. Lightweight user-space threads (goroutines require more memory)
    2. Built-in distribution and failure detection (not sure what the comparable library in Go is)
    3. Reliability-oriented standard library (in Go you must check every error)
    4. Hot code swapping (use endless in Go to achieve zero downtime)
    You'll definitely need time to master it.
  3. Raw Performance
    In Go you get raw performance similar to Java, but with better memory efficiency. For any CPU-bound task, you should prefer Go over anything that currently has a slower implementation (Javascript, PHP, Python, Perl, Ruby, Erlang/Elixir); see the 16k concurrent user column.
    In Elixir or any other BEAM language, since the light threads use less memory, you can handle more processes at the same time.
  4. Hiring
    Go is relatively more popular (because it's easier to learn) than the BEAM-based languages (especially Erlang), judging by the number of job postings I've encountered, the TIOBE index (Jan 2017: Go #13, Erlang #44, Elixir #66), GitHub popularity (go vs elixir), and Spectrum (Go #10, Erlang #35). If you are a PM/VPE with a tight deadline, I believe Go is the better choice at the moment.
So what should you use for your next project? It always depends on the use case (the right tool/person for the right job) and the deadline; there is no silver bullet. And no, I don't intend to start a flame war.

2017-01-02

If Programming Language were Humans

Taken from here and nixCraft page (click there to see larger text).


  • Python: my formatting is my syntax!
  • Javascript: I'm technically functional
  • PHP: can I GET you guys anything?
  • Haskell: (lambda x.x am pure)(i)
  • C/C++: If you're not allocating memory, you're not living!
  • C#: I am nothing like my father.
  • Erlang: Destroy one of my processes and I will only grow stronger
  • Elixir: I Draw my Eldritch Power From all my process-chans
  • Rust: I'll be designated driver tonight!
    Programmer: we know
    Rust: Don't talk while eating, you could choke and die haha
  • Fortran: I am unparalleled in numerical ^ scientific computation!
  • Matlab: Come with me if you want to research
  • Lua: I'll take things from here
  • Lisp Dialects: Trust in the recursion
    Racket: learning is fun

Ah there are some updates!


2017-01-01

Redis GUI

Redis is a full-featured in-memory database with optional persistence and replication. Redis supports 5 kinds of data types: key-value (SET, GET), hash table (HSET, HGET), linked list (L/RPUSH, L/RPOP), sets (SADD, SREM), and sorted/scored sets (ZADD, ZREM).



Looking for a GUI for Redis? Try these apps, from the best to the least:

Redis React

Built using Mono + ReactJS. Just download and run (install Mono first if you are using Linux).

Redis Desktop Manager

Built using C++ and Qt5.

yaourt --needed --noconfirm -S --force redis-desktop-manager

FastoRedis

A fork of FastoNoSql. I don't know what this one is built with, since the GitHub repository doesn't have the source.

Redis Commander

Built using NodeJS

sudo npm install -g redis-commander

Rebrow

Built using Python2 and Flask Framework. Just clone the repository, install its dependencies and run.

Btw, happy new year :3