Showing posts with label terraform. Show all posts

2023-07-02

KEDA Kubernetes Event-Driven Autoscaling

Autoscaling is mostly useless if the number of hosts/nodes/hypervisors is limited, eg. we only have N nodes and we try to autoscale the services inside them, so by default we already waste a lot of unused resources (especially if the billing criteria is not like Jelastic's, ie. you pay for whatever you allocate, not whatever you utilize). Autoscaling is also quite useless if your problem is I/O-bound, not CPU-bound, for example when you don't use an autoscaled database (or whatever the I/O bottleneck is). CPU is rarely the bottleneck in my past experience. But today we're gonna try to use KEDA to autoscale a kubernetes service. First we need to install a local kubernetes, minikube:

# install minikube for local kubernetes cluster
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64 
sudo install minikube-linux-amd64 /usr/local/bin/minikube
minikube start # --driver=kvm2 or --driver=virtualbox
minikube kubectl
alias k='minikube kubectl --'
k get pods --all-namespaces

# if not using --driver=docker (default)
minikube addons configure registry-creds
Do you want to enable AWS Elastic Container Registry? [y/n]: n
Do you want to enable Google Container Registry? [y/n]: n
Do you want to enable Docker Registry? [y/n]: y
-- Enter docker registry server url: https://hub.docker.com/
-- Enter docker registry username: kokizzu
-- Enter docker registry password:
Do you want to enable Azure Container Registry? [y/n]: n
✅  registry-creds was successfully configured

 

Next we need to create a dummy container image for the pod that we want to autoscale:

# build example docker image
docker login -u kokizzu # replace with your docker hub username
docker build -t pf1 .
docker image ls pf1
# REPOSITORY   TAG       IMAGE ID       CREATED         SIZE
# pf1          latest    204670ee86bd   2 minutes ago   89.3MB

# run locally for testing
docker run -it -p 3000:3000 pf1

# tag and upload
docker image tag pf1 kokizzu/pf1:v0001
docker image push kokizzu/pf1:v0001
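
While the container from the test run above is still up, it's worth confirming that it actually serves Prometheus metrics on port 3000 (a quick sanity check; the exact metric names depend on what promfiber.go registers, http_requests_total is the one the ScaledObject trigger below will query):

# in another terminal, while `docker run -it -p 3000:3000 pf1` is running
curl -s http://localhost:3000/metrics | grep http_requests_total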


Create a deployment terraform file, something like this:

# main.tf
terraform {
  required_version = ">= 1.3.0"
  required_providers {
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "= 2.20.0"
    }
  }
  backend "local" {
    path = "/tmp/pf1.tfstate"
  }
}
provider "kubernetes" {
  config_path    = "~/.kube/config"
  # from k config view | grep -A 3 minikube | grep server:
  host           = "https://240.1.0.2:8443"
  config_context = "minikube"
}
provider "helm" {
  kubernetes {
    config_path    = "~/.kube/config"
    config_context = "minikube"
  }
}
resource "kubernetes_namespace_v1" "pf1ns" {
  metadata {
    name        = "pf1ns"
    annotations = {
      name = "deployment namespace"
    }
  }
}
resource "kubernetes_deployment_v1" "promfiberdeploy" {
  metadata {
    name      = "promfiberdeploy"
    namespace = kubernetes_namespace_v1.pf1ns.metadata.0.name
  }
  spec {
    selector {
      match_labels = {
        app = "promfiber"
      }
    }
    replicas = "1"
    template {
      metadata {
        labels = {
          app = "promfiber"
        }
        annotations = {
          "prometheus.io/path"   = "/metrics"
          "prometheus.io/scrape" = "true"
          "prometheus.io/port"   = 3000
        }
      }
      spec {
        container {
          name  = "pf1"
          image = "kokizzu/pf1:v0001" # from promfiber.go
          port {
            container_port = 3000
          }
        }
      }
    }
  }
}
resource "kubernetes_service_v1" "pf1svc" {
  metadata {
    name      = "pf1svc"
    namespace = kubernetes_namespace_v1.pf1ns.metadata.0.name
  }
  spec {
    selector = {
      app = kubernetes_deployment_v1.promfiberdeploy.spec.0.template.0.metadata.0.labels.app
    }
    port {
      port        = 33000 # no effect in minikube, it will be forwarded to a random port anyway
      target_port = kubernetes_deployment_v1.promfiberdeploy.spec.0.template.0.spec.0.container.0.port.0.container_port
    }
    type = "NodePort"
  }
}
resource "kubernetes_ingress_v1" "pf1ingress" {
  metadata {
    name        = "pf1ingress"
    namespace   = kubernetes_namespace_v1.pf1ns.metadata.0.name
    annotations = {
      "kubernetes.io/ingress.class" = "nginx"
    }
  }
  spec {
    rule {
      host = "pf1svc.pf1ns.svc.cluster.local"
      http {
        path {
          path = "/"
          backend {
            service {
              name = kubernetes_service_v1.pf1svc.metadata.0.name
              port {
                number = kubernetes_service_v1.pf1svc.spec.0.port.0.port
              }
            }
          }
        }
      }
    }
  }
}
resource "kubernetes_config_map_v1" "prom1conf" {
  metadata {
    name      = "prom1conf"
    namespace = kubernetes_namespace_v1.pf1ns.metadata.0.name
  }
  data = {
    # from https://github.com/techiescamp/kubernetes-prometheus/blob/master/config-map.yaml
    "prometheus.yml" : <<EOF
global:
  scrape_interval: 15s
  evaluation_interval: 15s
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
rule_files:
  #- /etc/prometheus/prometheus.rules
scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "pf1"
    static_configs:
      - targets: [
          "${kubernetes_ingress_v1.pf1ingress.spec.0.rule.0.host}:${kubernetes_service_v1.pf1svc.spec.0.port.0.port}"
        ]
EOF
    # need to delete the stateful set if this changes after terraform apply,
    # or run: kubectl rollout restart statefulset prom1stateful -n pf1ns
    # because statefulset pods are not restarted automatically when a
    # configmap mounted as env or config file changes
  }
}
resource "kubernetes_persistent_volume_v1" "prom1datavol" {
  metadata {
    name = "prom1datavol"
  }
  spec {
    access_modes = ["ReadWriteOnce"]
    capacity     = {
      storage = "1Gi"
    }
    # do not add storage_class_name or it would get stuck
    persistent_volume_source {
      host_path {
        path = "/tmp/prom1data" # mkdir first?
      }
    }
  }
}
resource "kubernetes_persistent_volume_claim_v1" "prom1dataclaim" {
  metadata {
    name      = "prom1dataclaim"
    namespace = kubernetes_namespace_v1.pf1ns.metadata.0.name
  }
  spec {
    # do not add storage_class_name or it would get stuck
    access_modes = ["ReadWriteOnce"]
    resources {
      requests = {
        storage = "1Gi"
      }
    }
  }
}
resource "kubernetes_stateful_set_v1" "prom1stateful" {
  metadata {
    name      = "prom1stateful"
    namespace = kubernetes_namespace_v1.pf1ns.metadata.0.name
    labels    = {
      app = "prom1"
    }
  }
  spec {
    selector {
      match_labels = {
        app = "prom1"
      }
    }
    template {
      metadata {
        labels = {
          app = "prom1"
        }
      }
      # example: https://github.com/mateothegreat/terraform-kubernetes-monitoring-prometheus/blob/main/deployment.tf
      spec {
        container {
          name  = "prometheus"
          image = "prom/prometheus:latest"
          args  = [
            "--config.file=/etc/prometheus/prometheus.yml",
            "--storage.tsdb.path=/prometheus/",
            "--web.console.libraries=/etc/prometheus/console_libraries",
            "--web.console.templates=/etc/prometheus/consoles",
            "--web.enable-lifecycle",
            "--web.enable-admin-api",
            "--web.listen-address=:10902"
          ]
          port {
            name           = "http1"
            container_port = 10902
          }
          volume_mount {
            name       = kubernetes_config_map_v1.prom1conf.metadata.0.name
            mount_path = "/etc/prometheus/"
          }
          volume_mount {
            name       = "prom1datastorage"
            mount_path = "/prometheus/"
          }
          #security_context {
          #  run_as_group = "1000" # because /tmp/prom1data is owned by 1000
          #}
        }
        volume {
          name = kubernetes_config_map_v1.prom1conf.metadata.0.name
          config_map {
            default_mode = "0666"
            name         = kubernetes_config_map_v1.prom1conf.metadata.0.name
          }
        }
        volume {
          name = "prom1datastorage"
          persistent_volume_claim {
            claim_name = kubernetes_persistent_volume_claim_v1.prom1dataclaim.metadata.0.name
          }
        }
      }
    }
    service_name = ""
  }
}
resource "kubernetes_service_v1" "prom1svc" {
  metadata {
    name      = "prom1svc"
    namespace = kubernetes_namespace_v1.pf1ns.metadata.0.name
  }
  spec {
    selector = {
      app = kubernetes_stateful_set_v1.prom1stateful.spec.0.template.0.metadata.0.labels.app
    }
    port {
      port        = 10902 # no effect in minikube, it will be forwarded to a random port anyway
      target_port = kubernetes_stateful_set_v1.prom1stateful.spec.0.template.0.spec.0.container.0.port.0.container_port
    }
    type = "NodePort"
  }
}
resource "helm_release" "pf1keda" {
  name       = "pf1keda"
  repository = "https://kedacore.github.io/charts"
  chart      = "keda"
  namespace  = kubernetes_namespace_v1.pf1ns.metadata.0.name
  # uninstall: https://keda.sh/docs/2.11/deploy/#helm
}
# apply with this block commented out first, then uncomment it and apply again
## from: https://www.youtube.com/watch?v=1kEKrhYMf_g
#resource "kubernetes_manifest" "scaled_object" {
#  manifest = {
#    "apiVersion" = "keda.sh/v1alpha1"
#    "kind"       = "ScaledObject"
#    "metadata"   = {
#      "name"      = "pf1scaledobject"
#      "namespace" = kubernetes_namespace_v1.pf1ns.metadata.0.name
#    }
#    "spec" = {
#      "scaleTargetRef" = {
#        "apiVersion" = "apps/v1"
#        "name"       = kubernetes_deployment_v1.promfiberdeploy.metadata.0.name
#        "kind"       = "Deployment"
#      }
#      "minReplicaCount" = 1
#      "maxReplicaCount" = 5
#      "triggers"        = [
#        {
#          "type"     = "prometheus"
#          "metadata" = {
#            "serverAddress" = "http://prom1svc.pf1ns.svc.cluster.local:10902"
#            "threshold"     = "100"
#            "query"         = "sum(irate(http_requests_total[1m]))"
#            # with or without {service=\"promfiber\"} is the same since 1 service 1 pod in our case
#          }
#        }
#      ]
#    }
#  }
#}


terraform init # download dependencies
terraform plan # check changes
terraform apply # deploy
terraform apply # run again after uncommenting the scaled_object block above
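
The two applies are needed because the ScaledObject CRD only exists after the KEDA helm release is installed; between them you can confirm the CRD is registered (the CRD name comes from KEDA itself):

# after the first apply, check that the KEDA CRDs and pods are up
k get crd scaledobjects.keda.sh
k get pods -n pf1ns | grep keda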

k get pods --all-namespaces -w # check deployment
NAMESPACE NAME                          READY STATUS  RESTARTS AGE
keda-admission-webhooks-xzkp4          1/1    Running 0        2m2s
keda-operator-r6hsh                    1/1    Running 1        2m2s
keda-operator-metrics-apiserver-xjp4d  1/1    Running 0        2m2s
promfiberdeploy-868697d555-8jh6r       1/1    Running 0        3m40s
prom1stateful-0                        1/1    Running 0        22s

k get services --all-namespaces
NAMESP NAME     TYPE     CLUSTER-IP     EXTERNAL-IP PORT(S) AGE
pf1ns  pf1svc   NodePort 10.111.141.44  <none>      33000:30308/TCP 2s
pf1ns  prom1svc NodePort 10.109.131.196 <none>      10902:30423/TCP 6s

minikube service list
|------------|--------------|-------------|------------------------|
|  NAMESPACE |    NAME      | TARGET PORT |          URL           |
|------------|--------------|-------------|------------------------|
| pf1ns      | pf1svc       |       33000 | http://240.1.0.2:30308 |
| pf1ns      | prom1svc     |       10902 | http://240.1.0.2:30423 |
|------------|--------------|-------------|------------------------|

 

To debug if something goes wrong, you can use something like this:


# debug inside container, replace with pod name
k exec -it pf1deploy-77bf69d7b6-cqqwq -n pf1ns -- bash

# or use dedicated debug pod
k apply -f debug-pod.yml
k exec -it debug-pod -n pf1ns -- bash
# delete if done using
k delete pod debug-pod -n pf1ns


# install debugging tools
apt update
apt install curl iputils-ping dnsutils net-tools # ping, dig, and netstat
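
If you don't have a debug-pod.yml around, a throwaway pod can also be started directly with kubectl run (a sketch; the image and pod name are arbitrary):

# disposable debug pod in the same namespace
k run debug-pod --image=ubuntu:22.04 -n pf1ns --restart=Never -- sleep infinity
k exec -it debug-pod -n pf1ns -- bash
k delete pod debug-pod -n pf1ns # when finished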

 

To check the metric that we want to use for autoscaling, we can check it from multiple places:

# check metrics inside pod
curl http://pf1svc.pf1ns.svc.cluster.local:33000/metrics

# check metrics from outside
curl http://240.1.0.2:30308/metrics

# or open from prometheus UI:
http://240.1.0.2:30423

# get metrics
k get --raw "/apis/external.metrics.k8s.io/v1beta1"
{"kind":"APIResourceList","apiVersion":"v1","groupVersion":"external.metrics.k8s.io/v1beta1","resources":[{"name":"externalmetrics","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]}]}

# get scaled object
k get scaledobject pf1keda -n pf1ns
NAME      SCALETARGETKIND      SCALETARGETNAME   MIN   MAX   TRIGGERS     AUTHENTICATION   READY   ACTIVE   FALLBACK   PAUSED    AGE
pf1keda   apps/v1.Deployment   pf1deploy         1     5     prometheus                    True    False    False      Unknown   3d20h

# get metric name
k get scaledobject pf1keda -n pf1ns -o 'jsonpath={.status.externalMetricNames}'
["s0-prometheus-prometheus"]

 

Next we can do loadtest while watching pods:

# do loadtest
hey -c 100 -n 100000 http://240.1.0.2:30308

# check with kubectl get pods -w -n pf1ns, it would spawn:
promfiberdeploy-96qq9  0/1     Pending             0  0s
promfiberdeploy-j5qw9  0/1     Pending             0  0s
promfiberdeploy-96qq9  0/1     Pending             0  0s
promfiberdeploy-76pvt  0/1     Pending             0  0s
promfiberdeploy-76pvt  0/1     Pending             0  0s
promfiberdeploy-j5qw9  0/1     Pending             0  0s
promfiberdeploy-96qq9  0/1     ContainerCreating   0  0s
promfiberdeploy-76pvt  0/1     ContainerCreating   0  0s
promfiberdeploy-j5qw9  0/1     ContainerCreating   0  0s
promfiberdeploy-96qq9  1/1     Running             0  1s
promfiberdeploy-j5qw9  1/1     Running             0  1s
promfiberdeploy-76pvt  1/1     Running             0  1s
...
promfiberdeploy-j5qw9  1/1     Terminating         0  5m45s
promfiberdeploy-96qq9  1/1     Terminating         0  5m45s
promfiberdeploy-gt2h5  1/1     Terminating         0  5m30s
promfiberdeploy-76pvt  1/1     Terminating         0  5m45s

# all events includes scale up event
k get events -n pf1ns -w
21m    Normal ScalingReplicaSet deployment/promfiberdeploy Scaled up replica set promfiberdeploy-868697d555 to 1
9m20s  Normal ScalingReplicaSet deployment/promfiberdeploy Scaled up replica set promfiberdeploy-868697d555 to 4 from 1
9m5s   Normal ScalingReplicaSet deployment/promfiberdeploy Scaled up replica set promfiberdeploy-868697d555 to 5 from 4
3m35s  Normal ScalingReplicaSet deployment/promfiberdeploy Scaled down replica set promfiberdeploy-868697d555 to 1 from 5
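
While the load test runs, you can also query the exact expression the KEDA trigger evaluates, straight from the Prometheus HTTP API (using the NodePort URL from minikube service list; the returned value depends on the current load):

# same query as the ScaledObject trigger, via the Prometheus API
curl -s -G 'http://240.1.0.2:30423/api/v1/query' \
  --data-urlencode 'query=sum(irate(http_requests_total[1m]))'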

That's it, that's how you use KEDA and terraform to autoscale deployments. The key parts of the .tf file are:

  • terraform - needed to let terraform know which plugins are being used on terraform init
  • kubernetes and helm - needed to know which kubeconfig is being used and which cluster is being contacted
  • kubernetes_namespace_v1 - to create a namespace (eg. per tenant)
  • kubernetes_deployment_v1 - to define the pod template and which docker image is being used
  • kubernetes_service_v1 - to expose a port on the node (in this case only NodePort), to load balance between pods
  • kubernetes_ingress_v1 - should be used to route requests to the proper services, but since we only have 1 service and we use minikube (which has its own forwarding), it is not really used in our case
  • kubernetes_config_map_v1 - used to bind a config file (volume) for the prometheus deployment; this sets where to scrape the service. This is NOT the proper way to do it, the proper way is in the latest commit of that repository, using PodMonitor from prometheus-operator:
    • kubernetes_service_v1 - to expose the global prometheus (that monitors the whole kubernetes cluster, not per namespace)
    • kubernetes_service_account_v1 - creates a service account so the prometheus in the namespace can retrieve the pod list
    • kubernetes_cluster_role_v1 - role that allows listing pods
    • kubernetes_cluster_role_binding_v1 - binds the service account to the role above
    • kubernetes_manifest - creates the PodMonitor manifest, the rules generated for the prometheus in the namespace to match specific pods
    • kubernetes_manifest - creates the Prometheus manifest that deploys prometheus in a specific namespace
  • kubernetes_persistent_volume_v1 and kubernetes_persistent_volume_claim_v1 - used to bind a data directory (volume) to the prometheus deployment
  • kubernetes_stateful_set_v1 - to deploy prometheus; since it is not a stateless service, we have to bind a data volume to prevent data loss
  • kubernetes_service_v1 - to expose the prometheus port to the outside
  • helm_release - to deploy keda
  • kubernetes_manifest - to create a custom manifest, since ScaledObject is not natively supported by the kubernetes terraform provider; this configures which deployment can be autoscaled and on what trigger

If you need the source code, you can take a look at the terraform1 repo; the latest version uses PodMonitor.


2022-05-31

Getting started with SaltStack Configuration Management

SaltStack is one of the least popular automation and configuration management tools, similar to Ansible, Chef, and Puppet (Terraform and Pulumi are not CM but IaC tools, though if you are skilled enough they can be used as CM too). Unlike Ansible, it has an agent (the minion) that needs to be installed on the target servers. Unlike Chef and Puppet, it uses Python. Like Ansible, SaltStack can be used for infrastructure lifecycle management too. To install Salt, you need to run these commands:

sudo add-apt-repository ppa:saltstack/salt
sudo apt install salt-master salt-minion salt-cloud

# for minion:
sudo vim /etc/salt/minion
# change line "#master: salt" into:
# master: 127.0.0.1 # should be master's IP
# id: thisMachineId
sudo systemctl restart salt-minion


There's a simpler way to install and bootstrap the master and minion: using the bootstrap script:

curl -L https://bootstrap.saltstack.com -o install_salt.sh
sudo sh install_salt.sh -M # for master

sudo sh install_salt.sh    # for minion
# don't forget to change /etc/salt/minion and restart salt-minion


Then the minion will send its key to the master; to list all pending minions, run:

sudo salt-key -L

To accept/add a minion, you can use:

sudo salt-key --accept=FROM_LIST_ABOVE
sudo salt-key -A # accept all unaccepted, -R reject all

# test connection to all minion
sudo salt '*' test.ping

# run a command 
sudo salt '*' cmd.run 'uptime'

If you don't want to open a port, you can also use salt-ssh, so it works like Ansible:

pip install salt-ssh
# create /etc/salt/roster containing:
ID1:
  host: minionIp1
  user: root
  sudo: True
ID2_usually_hostname:
  host: minionIp2
  user: root
  sudo: True

To execute a command on all roster entries, you can use salt-ssh:

salt-ssh '*' cmd.run 'hostname'

In SaltStack, there are 5 things that you need to know:
  1. Master (the controller) 
  2. Minion (the servers/machines being controlled)
  3. States (current state of servers)
  4. Modules (same like Ansible modules)
    1. Grains (facts/properties of machines, to gather facts call: salt-call --grains)
    2. Execution (execute action on machines, salt-call moduleX.functionX, for example: cmd.run 'cat /etc/hosts | grep 127. ; uptime ; lsblk', or user.list_users, or grains.ls -- for example to list all grains available properties, or grains.get ipv4_interfaces, or grains.get ipv4_interfaces:docker0)
    3. States (idempotent multiplatform for CM)
    4. Renderers (module that transform any format to Python dictionary)
    5. Pillars (user configuration properties)
  5. salt-call (primary command)
To run locally just add --local. To run on every host we can use salt '*' modulename.functionname. The wildcard can be replaced with a compound filtering expression.
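
For example, the -C flag takes a compound expression mixing grains (G@) with the default minion-id glob (the grain values and glob below are just placeholders):

# target only Ubuntu 22.04 minions whose id starts with "web"
sudo salt -C 'G@os:Ubuntu and G@osrelease:22.04 and web*' test.ping
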
To start using salt in file mode, create a directory called salt (usually under /srv/), and inside it create a top.sls file (it's just YAML combined with Jinja) containing the list of hosts, filters, and state modules you want to call, for example:

base:
  '*': # every machine
    - requirements # state module to be called
    - statemodule0
dev1:
  'osrelease:*22.04*': # only machine with specific os version
     - match: grain
     - statemodule1
dev2:
  'os:MacOS': # only run on mac
     - match: grain
     - statemodule2/somesubmodule1
prod:
  'os:Pop and host:*.domain1': # only run on PopOS with tld domain1
     - match: grain
     - statemodule3
     - statemodule4
  'os:Pop': # this too will run if the machine match
     - match: grain
     - statemodule5
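
To verify which state modules the top file assigns to a particular minion, you can render the top file from the minion itself (state.show_top is a built-in function):

# show which state modules this minion would get from top.sls
sudo salt-call state.show_top
# or ask every minion from the master
sudo salt '*' state.show_top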

Then, for each of those items, either create a directory containing an init.sls, or create a file with the .sls extension. For example, requirements.sls:

essential-packages: # ID = describes what this state does
  pkg.installed: # module name, function name
    - pkgs:
      - bash
      - build-essential
      - git
      - tmux
      - byobu
      - zsh
      - curl
      - htop
      - python-software-properties
      - software-properties-common
      - apache2-utils

# the same state module (eg. file) can only appear once per ID,
# so each managed file below gets its own ID declaration
/tmp/a/b: # the ID doubles as the target path (name)
  file.managed:
    - makedirs: True
    - user: root
    - group: wheel
    - mode: 644
    - source: salt://files/b # will copy ./files/b to the machine

/tmp/a/c:
  file.managed:
    - contents: # this will create a file with these specific lines
      - line 1
      - line 2

append-to-c:
  file.append:
    - name: /tmp/a/c
    - text: 'some line' # will append that line to the file

myservice1:
  service.running:
    - watch: # will restart the service if these file states report changes
      - file: /tmp/a/b
      - file: /tmp/a/c

run-some-command:
  cmd.run:
    - name: /bin/someCmd1 param1; /bin/cmd2 --someflag2
    - onchanges:
      - file: /tmp/a/c # run the command above only if this file changed

/tmp/d:
  file.directory: # ensure the directory is created
    - user: root
    - dir_mode: 700

/tmp/e:
  archive.extracted: # extract a specific archive file
    - source: https://somedomain/somefile.tgz
    - skip_verify: True # remote sources need a source_hash, or skip verification
    - force: True
    - keep_source: True
    - clean: True

To apply run: salt-call state.apply requirements
Another example: we can create a template with Jinja and YAML combined, like this:

statemodule0:
  file.managed:
    - name: /tmp/myconf.cfg # will copy file based on jinja condition
    {% if '/usr/bin' in grains['pythonpath'] %}
    - source: salt://files/defaultpython.conf
    {% elif 'Pop' == grains['os'] %}
    - source: salt://files/popos.conf
    {% else %}
    - source: salt://files/unknown.conf
    {% endif %}
    - makedirs: True
  cmd.run:
    - name: echo
    - onchanges:
      - file: statemodule0 # referring to the file.managed state under the statemodule0 ID
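
To preview what this state would change without touching the machine, Salt's built-in test mode can be used:

# dry run: report what would change, apply nothing
sudo salt-call --local state.apply statemodule0 test=True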

To create a python state module, you can create a file containing something like this:

#!py
def run():
  config = {}
  a_var = 'test1' # we can also do a loop, everything is dict/array
  config['create_file_{}'.format(a_var)] = {
    'file.managed': [
      {'name': '/tmp/{}'.format(a_var)},
      {'makedirs': True},
      {'contents': [
        'line1',
        'line2'
        ]
      },
    ],
  }
  return config
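
Because the #!py renderer just returns a dictionary, you can inspect what it renders to before applying it (state.show_sls is a built-in function; statemodulepy below is a placeholder for wherever you saved this file):

# show the data structure produced by the python renderer
sudo salt-call --local state.show_sls statemodulepy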
 
To include another state module, you can specify it in statemodulename/init.sls, something like this:

include:
  - .statemodule2 # if this a folder, will run the init.sls inside
  - .statemodule3 # if this a file, will run statemodule3.sls

To run all states you can call salt-call state.highstate or salt-call state.apply without any parameter. It would execute the top.sls file and its includes, in order, recursively.
To create a scheduled state, you can create a file containing something like this:

id_for_this:
  schedule.present:
    - name: highstate
    - run_on_start: True
    - function: state.highstate
    - minutes: 60
    - maxrunning: 1
    - enabled: True
    - returner: rawfile_json
    - splay: 600 
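
After applying that state, you can confirm the job is registered in the minion's scheduler (schedule.list is a built-in function):

# list the scheduled jobs known to this minion
sudo salt-call schedule.list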


A full example can be something like this:

install nginx:
  pkg.installed:
    - name: nginx

/etc/nginx/nginx.conf: # used as the file's name
  file.managed:
    - source: salt://_files/nginx.j2
    - template: jinja
    - require:
      - pkg: install nginx

run nginx:
  service.running:
    - name: nginx
    - enable: True
    - watch:
      - file: /etc/nginx/nginx.conf

Next, to create a pillar config, just create a normal sls file, containing something like this:

user1:
  active: true
  sudo: true
  ssh_keys:
    - ssh-rsa censored user1@domain1
nginx:
  server_name: 'foo.domain1'

To reference this in other salt files, you can use Jinja, something like this:

{% set sn = salt['pillar.get']('nginx:server_name') -%}
server {
  listen 443 ssl;
  server_name {{ sn }};
...
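
Pillar data is compiled on the master per minion, so after editing pillar files (and assigning them in the pillar top file) refresh them and read a value back to confirm (saltutil.refresh_pillar and pillar.get are built-in functions):

# push the new pillar data to minions and read a value back
sudo salt '*' saltutil.refresh_pillar
sudo salt '*' pillar.get nginx:server_name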

That's it for now; if you want to learn more, the next step is to create your own execution module, or explore the other topics in the Salt documentation.