DEVOXX UK. Kubernetes in production: Blue / Green deployment, autoscaling, and deployment automation. Part 2

Kubernetes is a great tool for running Docker containers in a clustered production environment. However, there are tasks that Kubernetes does not solve on its own. When deploying frequently to production, we need fully automated Blue/Green deployment to avoid downtime, and that in turn requires balancing external HTTP requests and SSL offloading. This calls for integration with a load balancer such as ha-proxy. Another task is semi-automatic scaling of the Kubernetes cluster itself when running in the cloud, for example partially scaling the cluster down at night.

Although Kubernetes does not provide these features out of the box, it does provide an API that can be used to solve such problems. Tools for automated Blue/Green deployment and scaling of a Kubernetes cluster were developed as part of the open-source Cloud RTI project.

This video transcript describes how to configure Kubernetes together with other open-source components to get a production-ready environment that takes code from a git commit all the way to production without downtime.



DEVOXX UK. Kubernetes in production: Blue / Green deployment, autoscaling, and deployment automation. Part 1

So, once you have made your applications accessible from the outside world, you can start setting up full automation, that is, bring it to the point where you can run git commit and be sure that this commit ends up in production. Naturally, we do not want any downtime while carrying out these deployment steps. Any automation in Kubernetes starts with the API.



Kubernetes is not a tool you use productively "out of the box". Of course, you can work that way, using kubectl and so on, but the API is still the most interesting and useful thing about this platform. Through the API you can reach almost everything you might want to do in Kubernetes; kubectl itself also uses the REST API.

This is REST, so you can use any language and tooling to work with this API, but client libraries will make your life much easier. My team wrote two such libraries: one for Java/OSGi and one for Go. The second one is not used as often, but in any case these useful things are at your disposal as part of an open-source project. There are many such libraries for different languages, so you can pick the one that suits you best.
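To give a rough idea of how plain the API is, here is a minimal Go sketch that lists pods with nothing but the standard library. The API server address and token are placeholders for illustration, and a real client would verify the cluster CA instead of skipping TLS checks:

    package main

    import (
    	"crypto/tls"
    	"fmt"
    	"io"
    	"net/http"
    )

    func main() {
    	// Hypothetical API server address and service-account token.
    	apiServer := "https://kubernetes.example.com:6443"
    	token := "REPLACE_WITH_TOKEN"

    	// For a quick illustration we skip certificate verification;
    	// a real client would configure the cluster CA instead.
    	client := &http.Client{Transport: &http.Transport{
    		TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
    	}}

    	req, _ := http.NewRequest("GET", apiServer+"/api/v1/namespaces/default/pods", nil)
    	req.Header.Set("Authorization", "Bearer "+token)

    	resp, err := client.Do(req)
    	if err != nil {
    		panic(err)
    	}
    	defer resp.Body.Close()

    	body, _ := io.ReadAll(resp.Body)
    	fmt.Println(string(body)) // the same JSON list of pods that kubectl get pods shows
    }
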



So, before starting to automate deployments, you need to make sure the process itself involves no downtime. For example, our team deploys to production in the middle of the day, when people are using their applications the most, so it is very important to avoid any interruptions. To avoid downtime, two approaches are used: Blue/Green deployment or a rolling update. In the latter case, if you have 5 replicas of the application running, they are updated one after another. This works well, but it is not suitable if different versions of the application cannot run side by side during deployment: you may end up with an updated user interface while the backend still runs the old version, and the application stops working. Programming for such conditions is therefore quite difficult.

This is one of the reasons we prefer to use blue / green deployment to automate the deployment of our applications. With this method, you must make sure that at a certain point in time only one version of the application is active.

The Blue/Green deployment mechanism works as follows. Traffic for our applications comes in through ha-proxy, which directs it to the running application replicas of the current version.

When a new deployment is started, we use Deployer, which is given the new components, and it rolls out the new version. Deploying a new version of an application means that a new set of replicas is spun up, and those replicas of the new version are started in separate, new pods. However, ha-proxy knows nothing about them yet and does not send any load their way.

Therefore, first of all, the new version has to be health checked to make sure the replicas are ready to serve the load.



All deployment components must support some form of health check. This can be a very simple HTTP check, where a call returning a 200 status code means success, or a deeper check that verifies the replicas' connections to the database and other services, the stability of connections in the dynamic environment, and whether everything starts and works correctly. That process can get quite involved.
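As an illustration, here is a minimal sketch of such a deeper /health endpoint in Go. The Postgres driver, connection string and JSON shape are assumptions made purely for the example:

    package main

    import (
    	"database/sql"
    	"encoding/json"
    	"net/http"

    	_ "github.com/lib/pq" // hypothetical choice of database driver for the example
    )

    type health struct {
    	Healthy  bool   `json:"healthy"`
    	Database string `json:"database"`
    }

    func main() {
    	// Assumed connection string, for illustration only.
    	db, err := sql.Open("postgres", "postgres://user:pass@db:5432/app?sslmode=disable")
    	if err != nil {
    		panic(err)
    	}

    	http.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
    		h := health{Healthy: true, Database: "ok"}

    		// Deeper check: verify the connection to the database actually works.
    		if err := db.Ping(); err != nil {
    			h.Healthy = false
    			h.Database = err.Error()
    			w.WriteHeader(http.StatusServiceUnavailable) // anything but 200 counts as "not ready"
    		}
    		json.NewEncoder(w).Encode(h)
    	})

    	http.ListenAndServe(":8080", nil)
    }
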



After the system has verified that all updated replicas are operational, Deployer updates the configuration; confd picks up the change and reconfigures ha-proxy accordingly.



Only after that is traffic directed to the pods with replicas of the new version, while the old ones disappear.



This mechanism is not a Kubernetes feature. The Blue/Green deployment concept has been around for quite some time, and it has always relied on a load balancer: first you direct all traffic to the old version of the application, and after the upgrade you switch it completely to the new version. This principle is used well beyond Kubernetes.

Now let me introduce a new deployment component, Deployer, which performs the health checks, reconfigures the proxy, and so on. It is not something exposed to the outside world; it lives inside Kubernetes. I will show how you can build your own Deployer using open-source tools.

So, the first thing Deployer does is create a replication controller (RC) through the Kubernetes API. The RC in turn creates the pods and services for the new deployment, that is, a completely new set of replicas for our application. Once the RC confirms that the replicas have started, Deployer checks their health: it issues GET /health requests, runs the corresponding verification components, and verifies all the elements that keep the cluster working.
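A rough sketch of that first step with the Go client library (client-go) might look like this. The names, image and kubeconfig path are placeholders; the real Deployer is considerably more involved:

    package main

    import (
    	"context"

    	corev1 "k8s.io/api/core/v1"
    	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    	"k8s.io/client-go/kubernetes"
    	"k8s.io/client-go/tools/clientcmd"
    )

    func int32Ptr(i int32) *int32 { return &i }

    func main() {
    	// Build a client from the local kubeconfig (path is an assumption for the example).
    	config, err := clientcmd.BuildConfigFromFlags("", "/home/user/.kube/config")
    	if err != nil {
    		panic(err)
    	}
    	clientset, err := kubernetes.NewForConfig(config)
    	if err != nil {
    		panic(err)
    	}

    	labels := map[string]string{"app": "demo", "version": "2"}

    	rc := &corev1.ReplicationController{
    		ObjectMeta: metav1.ObjectMeta{Name: "demo-v2"},
    		Spec: corev1.ReplicationControllerSpec{
    			Replicas: int32Ptr(2),
    			Selector: labels,
    			Template: &corev1.PodTemplateSpec{
    				ObjectMeta: metav1.ObjectMeta{Labels: labels},
    				Spec: corev1.PodSpec{
    					Containers: []corev1.Container{{
    						Name:  "demo",
    						Image: "example/demo:2", // placeholder image
    						Ports: []corev1.ContainerPort{{ContainerPort: 8080}},
    					}},
    				},
    			},
    		},
    	}

    	// Create the replication controller; Kubernetes then brings up the new pods,
    	// which the Deployer can poll via GET /health before switching traffic over.
    	_, err = clientset.CoreV1().
    		ReplicationControllers("default").
    		Create(context.TODO(), rc, metav1.CreateOptions{})
    	if err != nil {
    		panic(err)
    	}
    }
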



After all the pods have reported that they are "healthy", Deployer writes a new configuration entry into etcd, the distributed store used inside Kubernetes, which here also holds the load balancer configuration. We write the data to etcd, and a small tool called confd watches etcd for new data.

If it detects a change to the current configuration, it generates a new settings file and hands it to ha-proxy. ha-proxy then reloads without dropping any connections and directs the load to the new services that serve the new version of our applications.
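A hedged sketch of that handoff from the Deployer's side, using the etcd v3 Go client. The key layout under /haproxy is an assumption; the actual keys depend entirely on how your confd templates are written:

    package main

    import (
    	"context"
    	"time"

    	clientv3 "go.etcd.io/etcd/client/v3"
    )

    func main() {
    	// Connect to the etcd cluster that confd is watching.
    	cli, err := clientv3.New(clientv3.Config{
    		Endpoints:   []string{"http://etcd:2379"}, // placeholder endpoint
    		DialTimeout: 5 * time.Second,
    	})
    	if err != nil {
    		panic(err)
    	}
    	defer cli.Close()

    	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    	defer cancel()

    	// Point the frontend at the service backing the freshly health-checked version.
    	// confd notices the change, regenerates haproxy.cfg and reloads ha-proxy.
    	_, err = cli.Put(ctx, "/haproxy/frontends/demo/backend", "demo-v2")
    	if err != nil {
    		panic(err)
    	}
    }
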



As you can see, despite the number of components, there is nothing complicated here. You just need to pay closer attention to the API and etcd. Let me tell you about the open-source deployer we use ourselves: the Amdatu Kubernetes Deployer.



This is a Kubernetes deployment orchestration tool with the following features:

  • Blue/Green deployment;
  • setting up an external load balancer;
  • deployment descriptor management;
  • management of the actual deployments;
  • health checks during deployment;
  • injection of environment variables into pods.

Built on top of the Kubernetes API, this Deployer provides a REST API for managing descriptors and deployments, as well as a WebSocket API for streaming logs during deployment.

It puts the load balancer configuration data into etcd, so you are not limited to the ha-proxy support it ships with out of the box: it is easy to use a configuration file for your own balancer. Amdatu Deployer is written in Go, just like Kubernetes itself, and is licensed under the Apache license.

To use this version of the deployer, I use the following deployment descriptor, which specifies the parameters I need.



One of the important parameters in this code is the "useHealthCheck" flag, which indicates that a health check is required during the deployment process. This option can be turned off when the deployment uses third-party containers that do not need to be verified. The descriptor also specifies the number of replicas and the frontend URL that ha-proxy needs. At the end is the pod specification (podspec), which tells Kubernetes about the port configuration, the image, and so on. It is a fairly simple descriptor in JSON format.
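To give an idea of its shape, here is a hypothetical descriptor along those lines. The field names are illustrative only, not the exact Amdatu schema:

    {
      "namespace": "default",
      "appName": "demo",
      "newVersion": "2",
      "replicas": 2,
      "useHealthCheck": true,
      "frontend": "demo.example.com",
      "podspec": {
        "containers": [
          {
            "name": "demo",
            "image": "example/demo:2",
            "ports": [{ "containerPort": 8080 }]
          }
        ]
      }
    }
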

Another tool that is part of the Amdatu open-source project is Deploymentctl. It provides a UI for configuring deployments, stores the deployment history, and offers webhooks for callbacks from third-party users and developers. You do not have to use the UI, since Amdatu Deployer itself exposes a REST API, but the interface can make deploying much easier without touching any API directly. Deploymentctl is written in OSGi/Vertx with Angular 2.

Now I will demonstrate all of the above on screen using a pre-made recording, so you do not have to wait. We will deploy a simple Go application. Don't worry if you haven't worked with Go before: it is a very simple application, so you should be able to follow everything.



Here we create an HTTP server that responds only on /health, so this application does nothing but serve the health check. If the check passes, it returns the JSON structure shown below, containing the version of the application that the deployer will deploy, the message you see at the top of the file, and a boolean indicating whether our application is healthy or not.

I cheated a little with that last field, because I placed a fixed boolean value at the top of the file; later it will help me deploy even an "unhealthy" application. We will get to that in a moment.
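Reconstructed from the description, the demo application looks roughly like this. The exact variable and field names are guesses, but the idea is the same:

    package main

    import (
    	"encoding/json"
    	"net/http"
    )

    // The values the talk mentions: the message at the top of the file
    // and a hard-coded boolean that lets us deploy an "unhealthy" version on purpose.
    var (
    	version = "1"
    	message = "Hello, Kubernetes!"
    	healthy = true
    )

    type status struct {
    	Version string `json:"version"`
    	Message string `json:"message"`
    	Healthy bool   `json:"healthy"`
    }

    func main() {
    	// The server answers only on /health; that is all the demo app does.
    	http.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
    		if !healthy {
    			w.WriteHeader(http.StatusInternalServerError)
    		}
    		json.NewEncoder(w).Encode(status{Version: version, Message: message, Healthy: healthy})
    	})

    	http.ListenAndServe(":8080", nil)
    }
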

So let's get started. First, we check for running pods with the ~ kubectl get pods command, and the lack of a response from the frontend URL confirms that no deployments are currently in progress.



Next, on screen you see the Deploymentctl interface I mentioned, where the deployment parameters are set: namespace, application name, deployment version, number of replicas, frontend URL, container name, image, resource limits, the port for the health check, and so on. Resource limits are very important, as they let you make the most of the available hardware. You can also see the deployment log here.



If you repeat the ~ kubectl get pods command now, you can see that the system "freezes" for 20 seconds, during which ha-proxy is reconfigured. After that, the pod starts, and our replica can be seen in the deployment log.



I cut the 20-second wait out of the video, and now you can see on screen that the first version of the application is deployed. All of this was done using only the UI.



Now let's try the second version. To do this, I change the application's message from "Hello, Kubernetes!" to "Hello, Deployer!"; the system builds this image and pushes it to the Docker registry, after which we simply click the "Deploy" button in the Deploymentctl window again. The deployment log starts automatically, just as it did when deploying the first version of the application.



The ~ kubectl get pods command shows that two versions of the application are currently running, but the frontend shows that we are still on version 1.



The load balancer waits for the health check to pass and then redirects traffic to the new version. After 20 seconds, we switch to curl and see that version 2 of the application is now deployed, and the first one has been removed.



That was the deployment of a "healthy" application. Now let's see what happens if, for the new version of the application, I change the value of the healthy parameter from true to false, that is, I try to deploy an unhealthy application that fails the health check. This can happen if some configuration errors were made at the development stage and the application went to production in that state.

As you can see, the deployment goes through all the steps described above, and ~ kubectl get pods shows that both pods are running. But unlike the previous deployment, the log shows a timeout: because the health check failed, the new version of the application cannot be deployed. As a result, you can see that the system went back to using the old version of the application, and the new version was simply deleted.



The good thing about this is that even if a huge number of simultaneous requests are coming into the application, users will not notice any downtime during the deployment procedure. If you test this application with the Gatling framework, sending it as many requests as possible, not a single one of those requests will be dropped. This means our users will not even notice the version being updated in real time: if the deployment fails, work continues on the old version; if it succeeds, users switch to the new one.

There is only one thing that can still cause a failure: the health check passes, but the application crashes as soon as it receives real load, that is, the collapse happens only after the deployment is complete. In that case you have to roll back to the old version manually. So, we have looked at how to use Kubernetes with these open-source tools. The deployment process becomes much simpler if you embed them in your build/deploy pipelines. You can start a deployment from the user interface, or fully automate the process by triggering it, for example, on a commit to master.



Our build server will create a Docker image and push it to Docker Hub or whatever registry you use. Docker Hub supports webhooks, so we can trigger a remote deployment through Deployer, as shown above. In this way you can fully automate the deployment of the application all the way to production.
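As a sketch, a tiny Go service could receive the Docker Hub webhook and call the Deployer's REST API. The /deployments endpoint, the deployer address and the payload here are assumptions for illustration, not the documented Amdatu API:

    package main

    import (
    	"bytes"
    	"net/http"
    )

    func main() {
    	// Docker Hub POSTs a JSON payload to this endpoint after every image push.
    	http.HandleFunc("/dockerhub-webhook", func(w http.ResponseWriter, r *http.Request) {
    		// Hypothetical descriptor for the new version; a real pipeline would
    		// extract the pushed tag from the webhook body and fill it in here.
    		descriptor := []byte(`{"appName":"demo","newVersion":"2","useHealthCheck":true}`)

    		resp, err := http.Post("http://deployer:8080/deployments", "application/json",
    			bytes.NewReader(descriptor))
    		if err != nil {
    			http.Error(w, err.Error(), http.StatusBadGateway)
    			return
    		}
    		resp.Body.Close()
    		w.WriteHeader(http.StatusAccepted)
    	})

    	http.ListenAndServe(":9000", nil)
    }
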

Let's move on to the next topic: scaling the Kubernetes cluster. Note that kubectl has a scale command; with it you can easily increase the number of replicas in our cluster. However, in practice we usually also want to increase the number of nodes, not just pods.



During working hours you may need to scale up, and at night, to reduce the cost of the Amazon services, scale down the number of running instances. Scaling only the number of pods is not enough, because even if one of the nodes sits idle, you still have to pay Amazon for it. That is, along with scaling the pods, you also need to scale the number of machines in use.

This can be tricky because, regardless of whether we use Amazon or another cloud provider, Kubernetes knows nothing about the number of machines in use. It lacks a tool for scaling the system at the node level.



So we will have to take care of both the nodes and the pods. We can easily scale up by launching new nodes, using the AWS API and a Scaling group to configure the number of Kubernetes worker nodes. You can also use cloud-init or a similar script to register new nodes in the Kubernetes cluster.
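For instance, bumping the desired size of the worker Auto Scaling group could look like this with the AWS SDK for Go; the region and group name are placeholders:

    package main

    import (
    	"github.com/aws/aws-sdk-go/aws"
    	"github.com/aws/aws-sdk-go/aws/session"
    	"github.com/aws/aws-sdk-go/service/autoscaling"
    )

    func main() {
    	sess := session.Must(session.NewSession(&aws.Config{Region: aws.String("eu-west-1")}))
    	svc := autoscaling.New(sess)

    	// Ask the Auto Scaling group of Kubernetes worker nodes for one more machine.
    	// cloud-init on the new instance registers it with the cluster on boot.
    	_, err := svc.SetDesiredCapacity(&autoscaling.SetDesiredCapacityInput{
    		AutoScalingGroupName: aws.String("k8s-workers"), // placeholder group name
    		DesiredCapacity:      aws.Int64(4),
    	})
    	if err != nil {
    		panic(err)
    	}
    }
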

A new machine starts in the Scaling group, initializes itself as a node, registers with the master, and starts working. After that, you can increase the number of replicas to make use of the new nodes.

Scaling down requires more effort, since you need to make sure that switching off "unnecessary" machines will not kill applications that are already running. To avoid that scenario, you first mark the nodes as "unschedulable": the default scheduler will ignore these nodes when placing new pods (DaemonSet pods are an exception), so nothing new will be started there, but nothing already running is removed either. The next step is to drain the node, that is, move its running pods to another machine or to other nodes that have enough capacity. Once you have verified that there are no more containers on these nodes, you can remove them from Kubernetes; after that, as far as Kubernetes is concerned, they simply no longer exist. Finally, you use the AWS API to shut down the now unneeded nodes, i.e. machines.
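Here is a sketch of the Kubernetes side of scaling down with the Go client: cordoning a node and, once it has been drained, removing it. The node name and kubeconfig path are placeholders; terminating the underlying machine is then a separate AWS API call like the one above:

    package main

    import (
    	"context"

    	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    	"k8s.io/client-go/kubernetes"
    	"k8s.io/client-go/tools/clientcmd"
    )

    func main() {
    	config, err := clientcmd.BuildConfigFromFlags("", "/home/user/.kube/config")
    	if err != nil {
    		panic(err)
    	}
    	clientset, err := kubernetes.NewForConfig(config)
    	if err != nil {
    		panic(err)
    	}

    	ctx := context.TODO()
    	nodeName := "worker-3" // placeholder node name

    	// Step 1: cordon the node so the scheduler stops placing new pods on it.
    	node, err := clientset.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
    	if err != nil {
    		panic(err)
    	}
    	node.Spec.Unschedulable = true
    	if _, err := clientset.CoreV1().Nodes().Update(ctx, node, metav1.UpdateOptions{}); err != nil {
    		panic(err)
    	}

    	// Step 2 (not shown): drain the node, i.e. evict its pods so they are
    	// rescheduled on nodes that still have capacity (kubectl drain does this).

    	// Step 3: once the node is empty, remove it from the cluster entirely.
    	if err := clientset.CoreV1().Nodes().Delete(ctx, nodeName, metav1.DeleteOptions{}); err != nil {
    		panic(err)
    	}
    	// Finally, shut the machine down through the AWS API, for example by
    	// lowering the Auto Scaling group's desired capacity as in the previous sketch.
    }
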
Alternatively, you can use Amdatu Scalerd, another open-source scaling tool that works on top of the AWS API. It provides a CLI for adding or removing nodes in a cluster. An interesting feature is the ability to configure a scheduler using the following JSON file.



The configuration shown halves the capacity of the cluster at night. You specify the desired number of available replicas and the desired capacity of the Amazon cluster, and the scheduler will automatically reduce the number of nodes at night and bring them back up in the morning, saving on the cost of nodes in a cloud service like Amazon. This feature is not built into Kubernetes, but with Scalerd you can scale the platform however you like.
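The original file is only shown on screen; a hypothetical configuration in that spirit might look like this. The field names are invented for illustration and are not Scalerd's actual schema:

    {
      "schedules": [
        { "cron": "0 23 * * *", "desiredNodes": 2, "replicasPerDeployment": 1 },
        { "cron": "0 7 * * *",  "desiredNodes": 4, "replicasPerDeployment": 2 }
      ]
    }
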

Let me also address something many people ask me: "That is all well and good, but what about my database, which is usually static? How can I run something like that in a dynamic environment like Kubernetes?" In my opinion, you should not; you should not try to run a data store inside Kubernetes. Technically it is possible, and there are guides on the internet on the subject, but it will seriously complicate your life.

Yes, the concept of persistent storage exists in Kubernetes, and you can try to run data stores such as Mongo or MySQL there, but it is a rather labor-intensive task. This is because data stores do not cope well with a dynamic environment: most databases require significant tuning, including manual cluster configuration, and do not take kindly to autoscaling and similar things.
So do not complicate your life by trying to run a data store in Kubernetes. Run it the traditional way, using familiar services, and simply let Kubernetes use it.



To close the topic, I want to introduce the Cloud RTI platform based on Kubernetes, which my team is working on. It provides centralized logging, application and cluster monitoring, and many other useful features. It uses various open-source tools, such as Grafana, for the monitoring dashboards.





A question was raised: why use the ha-proxy load balancer with Kubernetes at all? Good question, because there are currently two levels of load balancing. Kubernetes services still live on virtual IP addresses, and you cannot use those as external host ports, because if Amazon reboots its cloud host, the address will change. That is why we place ha-proxy in front of the services: to create a more static structure so that traffic can interact with Kubernetes seamlessly.

Another good question: how do you handle database schema changes during a Blue/Green deployment? The thing is, regardless of Kubernetes, changing a database schema is a complex task. You need to make sure the old and the new schema are compatible, after which you can update the database and then update the applications themselves. You can hot-swap the database and then upgrade the applications. I know people who spun up a completely new database cluster with the new schema; that is an option if you have a schemaless database like Mongo, but either way it is not an easy task. If there are no more questions, thank you for your attention!

