2015-02-27

Old String Set/Map Data Structure Benchmark (2012)

I just want to share some tables from the initial chapters of my thesis (early 2012), which was about modifying the HAT-Trie (adding lots of compression) for DNS suffix blocking. These tables are from chapter IV, since it's the only exciting part of it .__.)/||. At that time I didn't know about Cedar yet (of course, it's from 2013 XD). These are the benchmarked data structures:


The data structures whose names are marked with a "*" sign are those that can be nested per subdomain (that is, they must be a map/associative array). The string types tested are C++'s std::string, Qt's QString, csubdom (compressed subdomain), strcsubdom (compressed subdomain, stored in std::string), clabels (compressed full domain), and strclables (compressed full domain name). "Compressed" here means packed characters (from 8 bits down to 5 or 6 bits each, so they use less memory). The benchmark was performed on 64-bit Ubuntu with Linux 3.2, GCC 4.6.3, an AMD Phenom X9850, 8GB RAM, and a non-SSD disk, with compile flags: g++ -c -m64 -pipe -O2 -Wall -W.
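To make the "packed characters" idea concrete, here is a minimal sketch in Python. It is not the encoding used in the thesis: the 38-symbol alphabet, the fixed 6-bit width, and the names pack/unpack are all assumptions chosen for illustration.

```python
# Hypothetical alphabet (an assumption, not the thesis encoding):
# 38 symbols, so every character fits in 6 bits (2**6 = 64).
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-."
BITS = 6
CODE = {ch: i for i, ch in enumerate(ALPHABET)}


def pack(name: str) -> bytes:
    """Pack each character into BITS bits, concatenated MSB-first."""
    acc = 0
    for ch in name:
        acc = (acc << BITS) | CODE[ch]
    nbits = len(name) * BITS
    pad = -nbits % 8  # zero bits needed to reach a byte boundary
    return (acc << pad).to_bytes((nbits + pad) // 8, "big")


def unpack(data: bytes, length: int) -> str:
    """Inverse of pack(); the character count must be stored separately."""
    nbits = length * BITS
    acc = int.from_bytes(data, "big") >> (len(data) * 8 - nbits)
    mask = (1 << BITS) - 1
    chars = []
    for i in range(length):
        shift = (length - 1 - i) * BITS
        chars.append(ALPHABET[(acc >> shift) & mask])
    return "".join(chars)


packed = pack("example.com")
print(len("example.com"), "chars ->", len(packed), "bytes")
print(unpack(packed, len("example.com")))
```

With this assumed alphabet an 11-character name takes 9 bytes instead of 11; a 5-bit variant would save more but can only represent 32 symbols.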


Notes about the table headers: "insert" is a benchmark that inserts 2.1 million blacklisted domain names; once that completes, the data structure is erased and the insert operation is repeated until 30 seconds have passed. The "misses" benchmark checks 68.4k domains that don't exist in the blacklist, repeated until 2 seconds have passed. The "exists" benchmark rechecks the blacklisted domain names in sorted order, repeated until 8 seconds have passed. The "random" benchmark checks random blacklisted items, repeated until 20 seconds have passed. The values are the number of milliseconds required per domain name. The last column of the table is the average number of bytes required to store one domain name.
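The "repeat until N seconds have passed, then report milliseconds per domain name" scheme can be sketched like this in Python. The original benchmark was written in C++; the function name ms_per_item and the tiny stand-in blacklist below are assumptions for illustration only.

```python
import time


def ms_per_item(operation, items, budget_seconds):
    """Run operation over all items repeatedly until the time budget
    has elapsed, then return the average milliseconds per item."""
    start = time.perf_counter()
    processed = 0
    while True:
        for item in items:
            operation(item)
        processed += len(items)
        elapsed = time.perf_counter() - start
        if elapsed >= budget_seconds:
            return elapsed * 1000.0 / processed


# Stand-in for the real blacklist: a plain Python set.
blacklist = {"ads.example", "tracker.example"}
misses = ["good.example", "fine.example"]
print(ms_per_item(blacklist.__contains__, misses, 0.05))
```

Note that a full pass over the items always finishes before the clock is checked, so a run can overshoot the budget slightly.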

2015-02-26

Docker: The Software Container

Docker is operating-system-level virtualization: a software container that enables a sysadmin or software developer to deploy an isolated, distributed Linux application almost anywhere without a hypervisor (though both can be combined). Docker is more resource-friendly (efficient) than hardware virtualization solutions, with faster startup and shutdown times and lower hardware requirements (it works as long as you have a Linux kernel that supports LXC). Docker can run on Mac OS X and Windows via boot2docker (or with Vagrant or any virtualization software). To install it on ArchLinux, type:

# install stable version
$ yaourt --needed --noconfirm -S --force docker

# or latest git version
$ yaourt --needed --noconfirm -S --force docker-git

# start and enable the service
$ sudo systemctl enable docker
$ sudo systemctl start docker

# allow your user to access docker, refresh session
$ sudo gpasswd -a `whoami` docker
$ newgrp docker

# show information
$ docker info
Containers: 0
Images: 0
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 0
Execution Driver: native-0.2
Kernel Version: 3.18.7-1-ARCH
Operating System: ArchLinux
CPUs: 4
Total Memory: 15.49 GiB
Name: zzz
ID: 5SDJ:LPNU:UAR4:ULRJ:REZF:4V3W:6ES6:KJTW:DETH:765Y:XP4I:IZZZ

WARNING: No swap limit support

The docker service will create a network bridge interface (usually docker0). You can use your own base image or download a pre-built one. Make sure you have a lot of disk space in your /var/lib/docker directory, since docker stores the images there. To get an ArchLinux base image, use any of these repositories, for example:

$ docker pull l3iggs/archlinux
$ docker pull kampka/archlinux
$ docker pull codekoala/arch

$ docker pull logankoester/archlinux 
Pulling repository logankoester/archlinux
88d601db3077: Download complete 
511136ea3c5a: Download complete 
9b0516337e5a: Download complete 
dce0559daa1b: Download complete 
ff4d9d90bf08: Download complete 
7207641fe7f8: Download complete 
Status: Downloaded newer image for logankoester/archlinux:latest

To list all docker images, type docker images and find the image's REPOSITORY or IMAGE ID; then you can run any command in a container created from that image using docker run, for example:

$ docker run 88d601db3077 ls -al
...

$ docker run -t -i logankoester/archlinux /bin/bash
exit

$ docker run logankoester/archlinux pacman -Rdd --noconfirm dirmngr

Packages (1): dirmngr-1.1.1-2

Total Removed Size:   0.49 MiB

:: Do you want to remove these packages? [Y/n] 

removing dirmngr...

$ docker run logankoester/archlinux pacman -Syu --noconfirm
:: Synchronizing package databases...
downloading core.db...
downloading extra.db...
downloading community.db...
:: Starting full system upgrade...
:: Replace dirmngr with core/gnupg? [Y/n] 
:: Replace lzo2 with core/lzo? [Y/n] 
resolving dependencies...
looking for inter-conflicts...

Packages (77): archlinux-keyring-20150212-1  bash-4.3.033-1  ca-certificates-20140923-9  ca-certificates-cacert-20140824-2  ca-certificates-mozilla-3.17.4-1  ca-certificates-utils-20140923-9  coreutils-8.23-1  cracklib-2.9.1-1  curl-7.40.0-1  db-5.3.28-2  dbus-1.8.16-2  device-mapper-2.02.116-1  dhcpcd-6.7.1-1  dirmngr-1.1.1-2 [removal]  e2fsprogs-1.42.12-1  expat-2.1.0-4  file-5.22-1  filesystem-2015.02-1  gcc-libs-4.9.2-3  gettext-0.19.4-1  glib2-2.42.1-1  glibc-2.21-2  gmp-6.0.0-2  gnupg-2.1.2-1  gnutls-3.3.12-1  gpgme-1.5.3-1  grep-2.21-1  hwids-20150129-1  inetutils-1.9.2-2  iproute2-3.18.0-1  kbd-2.0.2-1  kmod-19-1  krb5-1.13.1-1  less-471-1  libarchive-3.1.2-8  libassuan-2.1.3-1  libcap-2.24-2  libdbus-1.8.16-2  libffi-3.2.1-1  libgcrypt-1.6.2-1  libgpg-error-1.18-1  libidn-1.29-1  libksba-1.3.2-1  libldap-2.4.40-2  libsystemd-218-2  libtasn1-4.2-1  libtirpc-0.2.5-1  libunistring-0.9.4-1  libutil-linux-2.25.2-1  linux-api-headers-3.18.5-1  logrotate-3.8.8-2  lz4-127-1  lzo-2.09-1  lzo2-2.08-1 [removal]  mpfr-3.1.2.p11-1  ncurses-5.9-7  netctl-1.10-1  nettle-2.7.1-1  npth-1.1-1  openresolv-3.6.1-1  openssl-1.0.2-1  p11-kit-0.22.1-3  pacman-4.2.1-1  pacman-mirrorlist-20150205-1  pcre-8.36-2  perl-5.20.2-1  pinentry-0.9.0-1  procps-ng-3.3.10-1  shadow-4.2.1-2  systemd-218-2  systemd-sysvcompat-218-2  tar-1.28-1  texinfo-5.2-3  tzdata-2015a-1  usbutils-008-1  util-linux-2.25.2-1  xz-5.2.0-1

Total Download Size:    62.40 MiB
Total Installed Size:   264.78 MiB
Net Upgrade Size:       26.52 MiB


:: Proceed with installation? [Y/n] 

:: Retrieving packages ...
...

The changes made by each run are not saved until you call docker commit. Find the ID of the last run first, then commit it:

$ docker ps -l 
CONTAINER ID        IMAGE                           COMMAND                CREATED             STATUS                     PORTS               NAMES
6d67ee44e7f5        logankoester/archlinux:latest   "pacman -Syu --nocon   11 minutes ago      Exited (0) 2 minutes ago                       stoic_meitner 

# docker commit ID your_username/your_repository
$ docker commit 6d67ee44e7f5 kokizzu/archlinux
5ab1562ea89959c54b8da4462abf086c91434524ae741769dab869b8263d7c1b

To see more information about an image, use docker inspect followed by the image ID:

$ docker images 
REPOSITORY               TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
kokizzu/archlinux        latest              5ab1562ea899        28 seconds ago      640.6 MB
logankoester/archlinux   latest              88d601db3077        24 hours ago        282.9 MB
...

# docker inspect ID
$ docker inspect 5ab1562ea899


After you verify that your image works, you can share it with others (create a repository on your dashboard first), for example:

# docker push your_username/your_repository

You can find more information on the cheatsheet and the documentation, and if you're tempted to install sshd read this first.


Numeric CombSort Benchmark update!

As I've written before, CombSort is quite a good sorting algorithm. Let's compare implementations of it in various programming languages. The benchmark should not use any built-in functions other than array generation and printing. The benchmark machine is an AMD A8-6600K with 16GB RAM and a non-SSD disk.
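For readers who haven't seen the algorithm, here is a minimal comb sort sketch in Python. It is not one of the benchmarked sources (those are in the dropbox folder), and the shrink factor of 1.3 is the commonly used value, assumed here rather than taken from them.

```python
def comb_sort(a):
    """Sort the list a in place with comb sort and return it.

    Like bubble sort, but compares elements that are `gap` apart;
    the gap shrinks by a factor of 1.3 each pass until it reaches 1.
    """
    gap = len(a)
    swapped = True
    while gap > 1 or swapped:
        gap = max(1, int(gap / 1.3))
        swapped = False
        for i in range(len(a) - gap):
            if a[i] > a[i + gap]:
                a[i], a[i + gap] = a[i + gap], a[i]
                swapped = True
    return a


print(comb_sort([5, 1, 4, 2, 3]))  # [1, 2, 3, 4, 5]
```

The loop only stops after a full pass with gap 1 and no swaps, which is what guarantees the result is sorted.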

$ alias | grep 'alias time'
alias time='/usr/bin/time -f "\nCPU: %Us\tReal: %es\tRAM: %MKB"'
$ time --version
GNU time 1.7

$ g++ --version
g++ (GCC) 4.9.2 20141224 (prerelease)
$ time g++ comb.cpp
CPU: 0.05s      Real: 0.12s     RAM: 19428KB
$ time ./a.out
CPU: 1.94s      Real: 1.97s     RAM: 79804KB
$ time g++ -O2 comb.cpp
CPU: 0.07s      Real: 0.11s     RAM: 21260KB
$ time ./a.out
CPU: 0.88s      Real: 0.90s     RAM: 79804KB

$ clang --version
clang version 3.5.1 (tags/RELEASE_351/final)
$ time clang++ comb.cpp
CPU: 0.05s      Real: 0.08s     RAM: 33564KB
$ time ./a.out
CPU: 1.83s      Real: 1.86s     RAM: 79764KB
$ time clang++ -O2 comb.cpp
CPU: 0.08s      Real: 0.14s     RAM: 37860KB
$ time ./a.out
CPU: 0.89s      Real: 0.91s     RAM: 79804KB

$ java -version
java version "1.7.0_71" 
$ time javac comb.java
CPU: 1.05s      Real: 0.73s     RAM: 65952KB
$ time java comb
CPU: 1.32s      Real: 1.32s     RAM: 110488KB

$ php --version
PHP 5.6.4 (cli) (built: Dec 17 2014 21:45:04)
$ time php comb.php
CPU: 102.69s    Real: 104.20s   RAM: 2497508KB

$ hhvm --version
HipHop VM 3.5.0 (rel)
$ time hhvm -v Eval.Jit=true comb.php 
CPU: 12.56s     Real: 14.83s    RAM: 362488KB

$ ruby --version
ruby 2.2.0p0 (2014-12-25 revision 49005) [x86_64-linux]
$ time ruby comb.rb
CPU: 52.87s     Real: 53.02s    RAM: 87892KB

$ rbx --version
rubinius 2.5.2 (2.1.0 7a5b05b1 2015-01-30 3.5.1 JI) [x86_64-linux-gnu]
$ time rbx comb.rb
CPU: 74.89s     Real: 74.30s    RAM: 135320KB

$ node --version
v0.10.35
$ time node comb1.js
CPU: 2.64s      Real: 2.64s     RAM: 92240KB
$ time node comb2.js
CPU: 2.68s      Real: 2.72s     RAM: 140612KB

$ rhino < /dev/null
Rhino 1.7 release 4 2014 07 01
$ time rhino comb2.js
CPU: 87.39s     Real: 61.16s    RAM: 1993848KB

$ pacman -Qo `which jsc-3`
/usr/bin/jsc-3 is owned by webkitgtk 2.4.8-1
$ time jsc-3 comb1.js
CPU: 23.74s     Real: 23.93s    RAM: 93740KB
$ time jsc-3 comb2.js
CPU: 18.99s     Real: 19.16s    RAM: 181644KB

$ js24 --help | grep Version
Version: JavaScript-C24.2.0
$ time js24 --ion-eager comb1.js
CPU: 2.13s      Real: 2.15s     RAM: 89688KB
$ time js24 --ion-eager comb2.js
CPU: 1.53s      Real: 1.58s     RAM: 92384KB

$ go version
go version go1.4.1 linux/amd64
$ time go build comb.go 
CPU: 0.14s      Real: 0.17s     RAM: 31568KB
$ time ./comb
CPU: 1.10s      Real: 1.14s     RAM: 79824KB

$ rustc --version
rustc 1.0.0-dev
$ time rustc comb.rs
CPU: 0.39s      Real: 0.49s     RAM: 106844KB
$ time ./comb
CPU: 10.62s     Real: 10.71s    RAM: 86020KB
$ time rustc -O comb.rs
CPU: 0.41s      Real: 0.49s     RAM: 110204KB
$ time ./comb
CPU: 0.97s      Real: 0.99s     RAM: 86108KB

$ scala -version
Scala code runner version 2.11.5 -- Copyright 2002-2013, LAMP/EPFL
$ time scala comb.scala
CPU: 5.43s      Real: 6.30s     RAM: 206088KB
$ time scalac comb.scala
CPU: 10.62s     Real: 7.00s     RAM: 143460KB
$ time scala Comb
CPU: 5.49s      Real: 5.05s     RAM: 206300KB

$ python --version
Python 3.4.2
$ time python comb1.py
CPU: 90.47s     Real: 90.83s    RAM: 403192KB
$ time python comb2.py
CPU: 106.82s    Real: 107.26s   RAM: 87248KB

$ pypy --version
Python 2.7.8 (c6ad44ecf5d8, Nov 18 2014, 18:04:31) [PyPy 2.4.0 with GCC 4.9.2]
$ time pypy comb1.py
CPU: 5.34s      Real: 5.40s     RAM: 136764KB
$ time pypy comb2.py
CPU: 5.85s      Real: 6.04s     RAM: 204588KB

$ mcs --version
Mono C# compiler version 3.12.0.0
$ time mcs -o+ comb.cs
CPU: 0.44s      Real: 0.47s     RAM: 45908KB
$ time ./comb.exe
CPU: 1.38s      Real: 1.41s     RAM: 90472KB

$ lua -v
Lua 5.2.3  Copyright (C) 1994-2013 Lua.org, PUC-Rio
$ time lua comb.lua
CPU: 65.64s     Real: 65.81s    RAM: 264096KB

$ luajit -v
LuaJIT 2.0.3 -- Copyright (C) 2005-2014 Mike Pall.
$ time luajit comb.lua
CPU: 6.30s      Real: 6.34s     RAM: 132964KB

$ dart --version
Dart VM version: 1.8.5 (Tue Jan 13 12:44:14 2015) on "linux_x64"
$ time dart scomb.dart
CPU: 2.12s      Real: 2.24s     RAM: 93392KB

The code can be found on my dropbox (folder: num-comb), and here's the summary:

Compiler / Interpreter   Language    Compile Duration (ms)  Compile RAM (KB)  Runtime Duration (ms)  Runtime RAM (KB)  Total Duration (ms)
g++ (debug)              C++         50                     19428             1940                   79804             1990
g++ (-O2)                C++         70                     21260             880                    79804             950
clang++ (debug)          C++         50                     33564             1830                   79764             1880
clang++ (-O2)            C++         80                     37860             890                    79804             970
javac, java              Java        1050                   65952             1320                   110488            2370
php                      PHP         -                      -                 102690                 2497508           102690
hhvm                     PHP         -                      -                 12560                  362488            12560
ruby                     Ruby        -                      -                 52870                  87892             52870
rbx                      Ruby        -                      -                 74890                  135320            74890
node (typed array)       Javascript  -                      -                 2640                   92240             2640
node (untyped array)     Javascript  -                      -                 2680                   140612            2680
rhino (untyped array)    Javascript  -                      -                 87390                  1993848           87390
jsc-3 (typed array)      Javascript  -                      -                 23740                  93740             23740
jsc-3 (untyped array)    Javascript  -                      -                 18990                  181644            18990
js24 (typed array)       Javascript  -                      -                 2130                   89688             2130
js24 (untyped array)     Javascript  -                      -                 1530                   92384             1530
go                       Go          140                    31568             1100                   79824             1240
rustc (debug)            Rust        390                    106844            10620                  86020             11010
rustc (-O2)              Rust        410                    110204            970                    86108             1380
scala                    Scala       -                      -                 5430                   206088            5430
python3                  Python 3    -                      -                 90470                  403192            90470
python3 (array)          Python 3    -                      -                 106820                 87248             106820
pypy                     Python 2    -                      -                 5340                   136764            5340
pypy (array)             Python 2    -                      -                 5850                   204588            5850
mcs                      C#          440                    45908             1380                   90472             1820
lua                      Lua         -                      -                 65640                  264096            65640
luajit                   Lua         -                      -                 6300                   132964            6300
dart                     Dart        -                      -                 2120                   93392             2120
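The summary figures appear to be the CPU times from the transcripts above converted to milliseconds, with the total being compile plus runtime. A small Python sketch of that conversion follows; the helper names parse_time_line and summarize are hypothetical, and the regular expression assumes the exact "CPU: ...s Real: ...s RAM: ...KB" format produced by the time alias.

```python
import re

# Matches lines such as "CPU: 1.94s      Real: 1.97s     RAM: 79804KB"
TIME_LINE = re.compile(
    r"CPU:\s*([\d.]+)s\s+Real:\s*([\d.]+)s\s+RAM:\s*(\d+)KB"
)


def parse_time_line(line):
    """Return (cpu_ms, real_ms, ram_kb) parsed from one time output line."""
    m = TIME_LINE.search(line)
    if m is None:
        raise ValueError("not a time output line: %r" % line)
    cpu_ms = round(float(m.group(1)) * 1000)
    real_ms = round(float(m.group(2)) * 1000)
    return cpu_ms, real_ms, int(m.group(3))


def summarize(compile_line, run_line):
    """Build one summary row from a compile line and a run line."""
    c_cpu, _, c_ram = parse_time_line(compile_line)
    r_cpu, _, r_ram = parse_time_line(run_line)
    return {
        "compile_ms": c_cpu,
        "compile_ram_kb": c_ram,
        "runtime_ms": r_cpu,
        "runtime_ram_kb": r_ram,
        "total_ms": c_cpu + r_cpu,
    }


row = summarize(
    "CPU: 0.05s      Real: 0.12s     RAM: 19428KB",
    "CPU: 1.94s      Real: 1.97s     RAM: 79804KB",
)
print(row["total_ms"])  # 1990
```

The two sample lines are the g++ debug build and run from the transcript, and the resulting total matches the first row of the summary.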

Write down your opinion in the comment section (or a pastie link if you found a bug in these sources, or if you want to add more language implementations) ^^)b

Note #1: Opal (0.6.8) and JRuby (both 1.7.18 and 9.0.0pre1) failed to run this benchmark (they exceeded the 300s runtime limit even with the -J-Xmx3000M -J-Djruby.compile.mode=FORCE flags).

Note #2: Yes, it's unfair to compare array of integer and array of double, life is unfair by design, get over it...

String CombSort Benchmark update!

Previously, we benchmarked the CombSort algorithm implemented in various programming languages for arrays of numbers. Now let's compare the same algorithm with an added integer-to-string conversion, using each language's built-in string library. The benchmark should not use any built-in functions other than string handling, integer conversion, array generation, and printing. The benchmark machine is an AMD A8-6600K with 16GB RAM and a non-SSD disk.
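As a rough illustration of what the string variant does, the Python sketch below converts integers to strings and comb-sorts them with the language's own string comparison. It is not the benchmarked scomb source; the sample numbers and the 1.3 shrink factor are assumptions.

```python
def comb_sort(items):
    """In-place comb sort using the element type's own comparison."""
    gap = len(items)
    swapped = True
    while gap > 1 or swapped:
        gap = max(1, int(gap / 1.3))
        swapped = False
        for i in range(len(items) - gap):
            if items[i] > items[i + gap]:
                items[i], items[i + gap] = items[i + gap], items[i]
                swapped = True
    return items


numbers = [9, 100, 10, 25]
strings = [str(n) for n in numbers]  # integer-to-string conversion
comb_sort(strings)
print(strings)  # ['10', '100', '25', '9']
```

The strings end up in lexicographic order, not numeric order, and every comparison now walks characters instead of comparing two machine words, which is one reason the string runs take so much longer than the numeric ones.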

$ alias | grep 'alias time'
alias time='/usr/bin/time -f "\nCPU: %Us\tReal: %es\tRAM: %MKB"'
$ time --version
GNU time 1.7

$ g++ --version
g++ (GCC) 4.9.2 20141224 (prerelease)
$ time g++ -std=c++11 scomb.cpp
CPU: 0.18s      Real: 0.20s     RAM: 35868KB
$ time ./a.out
CPU: 19.32s     Real: 19.60s    RAM: 548912KB
$ time g++ -std=c++11 -O2 scomb.cpp
CPU: 0.20s      Real: 0.24s     RAM: 38184KB
$ time ./a.out
CPU: 13.76s     Real: 14.05s    RAM: 548816KB

$ clang --version
clang version 3.5.1 (tags/RELEASE_351/final)
$ time clang++ -std=c++11 scomb.cpp
CPU: 0.15s      Real: 0.20s     RAM: 42240KB
$ time ./a.out
CPU: 18.89s     Real: 19.21s    RAM: 548868KB
$ time clang++ -std=c++11 -O2 scomb.cpp
CPU: 0.20s      Real: 0.23s     RAM: 45824KB
$ time ./a.out
CPU: 13.87s     Real: 14.15s    RAM: 548820KB

$ javac -version
javac 1.7.0_71
$ time javac scomb.java
CPU: 1.13s      Real: 0.80s     RAM: 65324KB
$ time java scomb
CPU: 48.96s     Real: 27.59s    RAM: 906652KB

$ hhvm --version
HipHop VM 3.5.0 (rel)
$ time hhvm -v Eval.Jit=true scomb.php
CPU: 89.92s     Real: 90.38s    RAM: 877468KB

$ ruby --version
ruby 2.2.0p0 (2014-12-25 revision 49005) [x86_64-linux]
$ time ruby scomb.rb
CPU: 114.67s    Real: 115.21s   RAM: 870612KB

$ node --version
v0.10.35
$ time node scomb.js
CPU: 17.44s     Real: 17.41s    RAM: 411144KB

$ pacman -Qo `which jsc-3`
/usr/bin/jsc-3 is owned by webkitgtk 2.4.8-1
$ time jsc-3 scomb.js
CPU: 60.08s     Real: 43.66s    RAM: 834744KB

$ js24 --help | grep Version
Version: JavaScript-C24.2.0
$ time js24 scomb.js
CPU: 19.27s     Real: 19.62s    RAM: 735556KB

$ go version
go version go1.4.1 linux/amd64
$ time go build scomb.go
CPU: 0.14s      Real: 0.17s     RAM: 30428KB
$ time ./scomb
CPU: 11.45s     Real: 11.54s    RAM: 251628KB

$ scala -version
Scala code runner version 2.11.5 -- Copyright 2002-2013, LAMP/EPFL
$ time scala -J-Xmx3000M scomb.scala
CPU: 83.42s     Real: 46.10s    RAM: 1093924KB

$ python --version
Python 3.4.2
$ time python scomb.py
CPU: 121.33s    Real: 122.06s   RAM: 641844KB

$ pypy --version
Python 2.7.8 (c6ad44ecf5d8, Nov 18 2014, 18:04:31) [PyPy 2.4.0 with GCC 4.9.2]
$ time pypy scomb.py
CPU: 14.72s     Real: 14.97s    RAM: 522080KB

$ lua -v
Lua 5.2.3  Copyright (C) 1994-2013 Lua.org, PUC-Rio
$ time lua scomb.lua
CPU: 98.11s     Real: 98.54s    RAM: 863880KB

$ luajit -v
LuaJIT 2.0.3 -- Copyright (C) 2005-2014 Mike Pall.
$ time luajit scomb.lua
CPU: 21.30s     Real: 21.61s    RAM: 511268KB

$ dart --version
Dart VM version: 1.8.5 (Tue Jan 13 12:44:14 2015) on "linux_x64"
$ time dart scomb.dart
CPU: 11.92s     Real: 12.15s    RAM: 497788KB

The code can be found on my dropbox (folder: str-comb), and here's the summary:

Compiler / Interpreter   Language    Compile Duration (ms)  Compile RAM (KB)  Runtime Duration (ms)  Runtime RAM (KB)  Total Duration (ms)
g++ (debug)              C++         180                    35868             19320                  548912            19500
g++ (-O2)                C++         200                    38184             13760                  548816            13960
clang++ (debug)          C++         150                    42240             18890                  548868            19040
clang++ (-O2)            C++         200                    45824             13870                  548820            14070
javac, java              Java        1130                   65324             48960                  906652            50090
hhvm                     PHP         -                      -                 89920                  877468            89920
ruby                     Ruby        -                      -                 114670                 870612            114670
node                     Javascript  -                      -                 17440                  411144            17440
jsc-3                    Javascript  -                      -                 60080                  834744            60080
js24                     Javascript  -                      -                 19270                  735556            19270
go                       Go          140                    30428             11450                  251628            11590
scala                    Scala       -                      -                 83420                  1093924           83420
python3                  Python 3    -                      -                 121330                 641844            121330
pypy                     Python 2    -                      -                 14720                  522080            14720
lua                      Lua         -                      -                 98110                  863880            98110
luajit                   Lua         -                      -                 21300                  511268            21300
dart                     Dart        -                      -                 11920                  497788            11920

Write down your opinion in the comment section (or a pastie link if you found a bug in these sources, or if you want to add more language implementations) ^^)b

Note #1: PHP 5.6.4, Rubinius 2.5.2, JRuby 9.0.0a1, Rhino 1.7, and MCS 3.12 failed to finish executing within the approx. 120s timeout.

2015-02-25

Techempower Framework Benchmark 10 Preview

As you can see from this link, the preliminary results have been published on their mailing list. C++ is still dominating; JavaScript (NodeJS), Scala, and Dart have gotten better; Go, PHP (HHVM), and Java are still in the top tier.


One more point worth noting: PostgreSQL is above MySQL in the single query benchmark (after SQLite), I'm so happy ^_^]b


As usual, this is just a benchmark; a happy programming experience is also important: available libraries, ease of coding (writing and reading other people's source code), available editors/tools, great documentation, and start-up/compile/test duration.

2015-02-24

Command line to power off portable harddisk / usb drive on Linux

Sometimes we plug in and mount a portable harddisk or USB drive on a Linux server and want to unplug it using the command line :3 script ninja! You can use the udisks command to do this. First find which device your portable disk is attached as, for example using the lsblk, fdisk, or df command:

$ lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 931.5G  0 disk 
└─sda1   8:1    0 931.5G  0 part 
sdb      8:16   0 458.6G  0 disk 
├─sdb1   8:17   0     1G  0 part /boot
├─sdb2   8:18   0     2G  0 part [SWAP]
├─sdb3   8:19   0    64G  0 part /
└─sdb4   8:20   0 391.6G  0 part /home
sr0     11:0    1  1024M  0 rom  

$ fdisk -l 
Disk /dev/sdb: 458.6 GiB, 492387172352 bytes, 961693696 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xc5900613

Device     Boot     Start       End   Sectors   Size Id Type
/dev/sdb1  *           63   2104514   2104452     1G 83 Linux
/dev/sdb2         2104515   6313544   4209030     2G 82 Linux swap / Solaris
/dev/sdb3         6313545 140536619 134223075    64G 83 Linux
/dev/sdb4       140536620 961683029 821146410 391.6G 83 Linux

Disk /dev/sda: 931.5 GiB, 1000204885504 bytes, 1953525167 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 33553920 bytes
Disklabel type: dos
Disk identifier: 0x9a70439c

Device     Boot Start        End    Sectors   Size Id Type
/dev/sda1        2048 1953521663 1953519616 931.5G  7 HPFS/NTFS/exFAT

$ df | grep -v tmpfs
Filesystem     Type     1M-blocks  Used Available Use% Mounted on
/dev/sdb3      ext4         64381 33387     27701  55% /
/dev/sdb1      ext2          1012    40       921   5% /boot
/dev/sdb4      ext4        394531 59129    315339  16% /home

$ cat /proc/partitions 
major minor  #blocks  name
   8       16  480846848 sdb
   8       17    1052226 sdb1
   8       18    2104515 sdb2
   8       19   67111537 sdb3
   8       20  410573205 sdb4
  11        0    1048575 sr0
   8        0  976762583 sda
   8        1  976759808 sda1

Then just call udisks with the --detach flag to safely remove the device, for example:

$ sudo udisks --detach /dev/sda

Or you can use this script: download and save it, then execute it, for example:

$ sudo sh suspend-usb-device.sh -v /dev/sda

Now your portable disk can be removed safely.

Note: if you're using a kernel newer than 2.6.32, there will be an error "line 180: echo: write error: Invalid argument"; to fix it, change suspend to auto on line 180 of that script. And of course, if you have Nemo installed (or maybe Nautilus too, but not Thunar), you can always right-click the drive and safely remove it without the command line.