20 Apr 2019

Deployment Strategies with Kubernetes and Istio

In this post I am going to discuss various deployment strategies and how they can be implemented with K8s and Istio. All of these strategies rely on two things: the ability of K8s to run multiple versions of a microservice simultaneously, and the fact that consumers can access the microservice only through a single entry point. At that entry point we can control which version of the microservice a consumer is routed to.

The sample application for this post is a simple Spring Boot application wrapped into a Docker image. There are two images, superapp:old and superapp:new, representing the old and the new versions of the application respectively:

docker run -d --name old -p 9001:8080 eugeneflexagon/superapp:old
docker run -d --name new -p 9002:8080 eugeneflexagon/superapp:new


curl http://localhost:9001/version
{"id":1,"content":"old"}

curl http://localhost:9002/version
{"id":1,"content":"new"}


Let's assume the old version of the application is deployed to a K8s cluster running on Oracle Kubernetes Engine with the following manifest:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: superapp
spec:
  replicas: 3
  template:
    metadata:
       labels:
         app: superapp
    spec:
      containers:
        - name: superapp
          image: eugeneflexagon/superapp:old
          ports:
            - containerPort: 8080
So there are three replicas of a pod running the old version of the application. There is also a service routing the traffic to these pods:
apiVersion: v1
kind: Service
metadata:
  name: superapp
spec:
  selector:
    app: superapp   
  ports:
    - port: 8080
      targetPort: 8080      

Rolling Update
This deployment strategy gradually replaces pods of the old version with pods of the new version, a few at a time.


This is the default strategy and it is handled by the K8s cluster itself, so we just need to update the superapp deployment with a reference to the new image:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: superapp
spec:
  replicas: 3
  template:
    metadata:
       labels:
         app: superapp
    spec:
      containers:
        - name: superapp
          image: eugeneflexagon/superapp:new
          ports:
            - containerPort: 8080
However, we can fine-tune the rolling update algorithm by providing parameters for this deployment strategy in the manifest file:
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
       maxSurge: 30%
       maxUnavailable: 30%   
  template:
  ...

The maxSurge parameter defines the maximum number of pods that can be created over the desired number of pods. It can be either a percentage (rounded up) or an absolute number. The default value is 25%.
The maxUnavailable parameter defines the maximum number of pods that can be unavailable during the update process. It can be either a percentage (rounded down) or an absolute number. The default value is 25%.
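If percentages feel too coarse, absolute numbers work as well. For example, a common zero-downtime setting (a sketch using the same deployment) creates at most one extra pod at a time and never takes an existing pod down before its replacement is ready:

```yaml
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # at most one pod above the desired count
      maxUnavailable: 0  # never drop below the desired count
  template:
  ...
```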


Recreate
This deployment strategy kills all old pods and then creates the new ones.

spec:
  replicas: 3
  strategy:
    type: Recreate
  template:
  ...

Very simple.

Blue/Green
This strategy defines the old version of the application as the green one and the new version as the blue one. Users always have access only to the green version.


apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: superapp-01
spec:
  template:
    metadata:
       labels: 
         app: superapp
         version: "01"
...



apiVersion: v1
kind: Service
metadata:
  name: superapp
spec:
  selector:
    app: superapp 
    version: "01"
...

The service routes the traffic only to pods with label version: "01".

We deploy the blue version to the K8s cluster and make it available only to QA engineers or to a test automation tool (via a separate service or direct port-forwarding).



apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: superapp-02
spec:
  template:
    metadata:
       labels:
         app: superapp
         version: "02"
...
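The separate service mentioned above for exposing the blue version to testers could be a minimal sketch like the following (the name superapp-test is my own choice); it selects only pods labeled version: "02":

```yaml
apiVersion: v1
kind: Service
metadata:
  name: superapp-test
spec:
  selector:
    app: superapp
    version: "02"
  ports:
    - port: 8080
      targetPort: 8080
```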


Once the new version is tested we switch the service to it and scale down the old version:



apiVersion: v1
kind: Service
metadata:
  name: superapp
spec:
  selector:
    app: superapp 
    version: "02"
...


kubectl scale deployment superapp-01 --replicas=0

Having done that, all users work with the new version.

Note that there has been no Istio involved so far; everything is handled by the K8s cluster out of the box. Let's move on to the next strategy.


Canary
I love this deployment strategy as it lets users test the new version of the application without even knowing it. The idea is that we deploy the new version alongside the old one and route only 10% of the traffic to it.


If it works well for a while, we can shift the balance to 70/30, then 50/50, and eventually 0/100.
Even though this strategy can be implemented with K8s resources alone by playing with the numbers of old and new pods, it is way more convenient to implement with Istio.
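For comparison, the K8s-only approximation is just a matter of replica counts: since the service balances across all matching pods, 9 old replicas and 1 new replica give a rough 90/10 split (a sketch; the ratio is approximate and cannot be tuned independently of capacity):

```yaml
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: superapp-01   # old version, ~90% of traffic
spec:
  replicas: 9
...
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: superapp-02   # new version, ~10% of traffic
spec:
  replicas: 1
...
```

With Istio, in contrast, the split is exact and independent of the number of replicas.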
So the old and the new applications are defined as the following deployments:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: superapp-01
spec:
  template:
    metadata:
       labels: 
         app: superapp
         version: "01"
...

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: superapp-02
spec:
  template:
    metadata:
       labels:
         app: superapp
         version: "02"
...

The service routes the traffic to both of them:
apiVersion: v1
kind: Service
metadata:
  name: superapp
spec:
  selector:
    app: superapp 
...

On top of that we are going to use the following Istio resources: a VirtualService and a DestinationRule.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: superapp
spec:
  host: superapp
  subsets:
  - name: green
    labels:
      version: "01"
  - name: blue
    labels:
      version: "02"
---     
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: superapp
spec:
  hosts:
    - superapp   
  http:
  - match:
    - uri:
        prefix: /version
    route:
    - destination:
        port:
          number: 8080
        host: superapp
        subset: green
      weight: 90
    - destination:
        port:
          number: 8080
        host: superapp    
        subset: blue  
      weight: 10
The VirtualService routes all traffic coming to the superapp host to the green and blue subsets according to the provided weights (90/10).
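Shifting the balance later (70/30, 50/50, and so on) is then just a matter of editing the weights and re-applying the VirtualService. For example, the 50/50 step would change only the two weight values:

```yaml
    route:
    - destination:
        port:
          number: 8080
        host: superapp
        subset: green
      weight: 50
    - destination:
        port:
          number: 8080
        host: superapp
        subset: blue
      weight: 50
```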

A/B Testing
With this strategy we can precisely control which users, from which devices, departments, etc., are routed to the new version of the application.

For example, here we are going to inspect the request headers: if the custom header end-user equals xammer, the request will be routed to the new version of the application; all other requests will be routed to the old one:
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: superapp
spec:
  gateways:
    - superapp
  hosts:
    - superapp   
  http:
  - match:
    - headers:
        end-user:
          exact: xammer                 
    route:
    - destination:
        port:
          number: 8080
        host: superapp
        subset: blue
  - route:
    - destination:
        port:
          number: 8080
        host: superapp
        subset: green
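Note that this VirtualService references a gateway named superapp, which is not shown here. If the application is exposed through the Istio ingress gateway, a minimal Gateway definition might look like this (a sketch; the port and hosts values are assumptions):

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: superapp
spec:
  selector:
    istio: ingressgateway   # use Istio's default ingress gateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"
```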

All examples and manifest files for this post are available on GitHub, so you can play with various strategies and sophisticated routing rules on your own. You just need a K8s cluster (e.g. Minikube on your laptop) with Istio preinstalled. Happy deploying!

That's it!

13 Apr 2019

Load Testing of a Microservice. Kubernetes way.

Let's assume there is a microservice represented by a composition of containers running on a K8s cluster somewhere in a cloud, e.g. Oracle Kubernetes Engine (OKE). At some point we want to quickly stress test a specific microservice component, or the entire microservice, to see how it behaves under load: how it handles many sequential requests coming from many parallel clients. The good news is that we already have a tool for that, up and running: the Kubernetes cluster itself.

We're going to use a Kubernetes Job for this testing, described in the following manifest file:
apiVersion: batch/v1
kind: Job
metadata:
   name: job-load
spec:
   parallelism: 50   
   template:
     spec:
       containers:
         - name: loader
           image: eugeneflexagon/aplpine-with-curl:1.0.0
           command: ["time", "curl", "http://my_service:8080/my_path?[1-100]"]     
       restartPolicy: OnFailure   
This job is going to spin up 50 pods running in parallel, each sending 100 sequential requests to my_service on port 8080 at path my_path (curl expands the [1-100] URL range into 100 requests). Having created and started the job by invoking

kubectl apply -f loadjob.yaml

we can observe all 50 pods created by the job using

kubectl get pods -l job-name=job-load
NAME             READY     STATUS      RESTARTS   AGE
job-load-4n262   1/2       Completed   1          12m
job-load-dsqtc   1/2       Completed   1          12m
job-load-khdn4   1/2       Completed   1          12m
job-load-kptww   1/2       Completed   1          12m
job-load-wf9pd   1/2       Completed   1          12m
...

If we look at the logs of any of these pods

kubectl logs job-load-4n262

We'll see something like the following:
[1/100]: http://my_service.my_namespace:8080/my_path?1 --> <stdout>
{"id":456,"content":"Hello world!"}

[2/100]: http://my_service.my_namespace:8080/my_path?2 --> <stdout>
{"id":457,"content":"Hello world!"}

[3/100]: http://my_service.my_namespace:8080/my_path?3 --> <stdout>
{"id":458,"content":"Hello world!"}

....

real    0m 10.04s
user    0m 0.00s
sys     0m 0.04s
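So each pod finished its 100 requests in about 10 seconds, which means the 50 parallel pods together pushed roughly 500 requests per second at the service. If a variant of this job should give up after repeated failures or be capped in total runtime, batch/v1 Jobs support a couple of extra fields; a sketch (the values are arbitrary):

```yaml
spec:
   parallelism: 50
   backoffLimit: 4              # retry failed pods at most 4 times
   activeDeadlineSeconds: 600   # terminate the whole job after 10 minutes
   template:
   ...
```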

That's it!