Zero Downtime Deployments in Kubernetes
If you don't already know Kubernetes, it's a container orchestration platform originally designed by Google. I'll assume you already have a decent understanding of containers and Kubernetes, but if not, I recommend learning about it here. If you want to get a cluster up and running quickly, check out Minikube.
Kubernetes has out-of-the-box support for rolling deployments via Deployments. A Deployment uses Replica Sets (which are essentially Replication Controllers with support for set-based selectors) to orchestrate pod deployment. For example, when a Deployment is updated (a change to the pod specification), a new Replica Set is created and, depending on your strategy, the new and old Replica Sets are scaled up and down until the rollout is finished.
Deployments offer an easy, effective way to do rolling updates, but they are not a full solution if you want zero downtime (not a single failed request) during the rolling update.
To avoid any availability disruption during a deployment, there are a few things we'll need to address:
- Connection Draining
- HTTP Keep-Alive
We don't want a container to be killed while in-flight requests are still being processed. To avoid this, the container needs to:
- Indicate that we don't want to receive any more new requests
- Be given time to finish requests that are currently being processed
Kubernetes allows us to add a readinessProbe to the container. If this probe fails, requests to the service will not be routed to the container. This will take care of #1 from above.
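A minimal readiness probe on the container might look like the following sketch (the `/healthz` path, port, and timing values are assumptions; adjust them to your application):

```yaml
# Container spec fragment: Kubernetes stops routing service traffic
# to this pod as soon as the probe starts failing.
readinessProbe:
  httpGet:
    path: /healthz       # hypothetical health-check endpoint
    port: 80
  periodSeconds: 2       # probe every 2 seconds
  failureThreshold: 1    # a single failure marks the pod unready
```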
For #2, we need to give the pending requests time to finish. Kubernetes sends a SIGTERM signal to the container's process to indicate that it should shut itself down gracefully, prior to being sent a SIGKILL. The container is given a grace period, terminationGracePeriodSeconds, to shut down. If it doesn't exit in time, the process is sent SIGKILL and forced to exit. The default grace period as of this writing is 30 seconds.
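The grace period is set on the pod spec; here 60 seconds is just an illustrative value:

```yaml
# Pod spec fragment: give the process up to 60 seconds after SIGTERM
# before Kubernetes sends SIGKILL (the default is 30 seconds).
spec:
  terminationGracePeriodSeconds: 60
  containers:
  - name: myapp
    image: nginx
```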
We can hook into this and, instead of handling SIGTERM in the application, add a lifecycle hook. This lets us execute an arbitrary command, for example, to attempt to shut down the container gracefully.
For example, the following fails the readiness probe by taking down the health check right before Kubernetes sends the SIGTERM signal.
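A sketch of such a preStop hook is below. The health-check file path and the sleep duration are assumptions; the idea is to make the readiness probe fail, wait for the service endpoints to update, and then let in-flight requests drain. Kubernetes runs the preStop hook to completion before sending SIGTERM.

```yaml
# Container spec fragment
lifecycle:
  preStop:
    exec:
      # Remove the file served by the readiness probe so the probe
      # fails, wait for Kubernetes to stop routing new requests,
      # then let in-flight requests drain before SIGTERM arrives.
      command: ["/bin/sh", "-c", "rm -f /usr/share/nginx/html/healthz && sleep 20"]
```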
The goal of the connection draining above is to allow in-flight requests to finish, but if a client uses HTTP Keep-Alive, a TCP connection will be kept open and reused for multiple requests. Draining is usually best-effort: a reasonable amount of time (30–300 seconds) is given to allow requests to finish, and the container then exits. Any keep-alive connections still open when the container exits are closed forcefully, causing clients to experience an error.
To solve this, we will add a proxy tier that terminates the HTTP connections (keeping them alive with clients) and proxies requests to our actual backend service (the application containers). The proxy will use Connection: close with the backends, among other things (such as redispatching requests), to make request processing more robust.
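A sketch of the relevant HAProxy settings (the frontend/backend names and addresses are assumptions): `option http-server-close` keeps client connections alive while closing the server-side connection after each request, and `option redispatch` retries a request against another backend if a connection attempt fails.

```
# haproxy.cfg fragment (illustrative)
defaults
  mode http
  option http-server-close   # keep-alive to clients, Connection: close to backends
  option redispatch          # retry on another backend if the connection fails
  retries 3
  timeout connect 5s
  timeout client  60s
  timeout server  60s

frontend fe_myapp
  bind *:80
  default_backend be_myapp

backend be_myapp
  server myapp myapp:80 check  # the myapp Kubernetes service
```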
NOTE: Kubernetes recently added support for terminating HTTP(s) connections on AWS ELBs in 1.3.0 which may let us avoid using the HAProxy layer.
Previously there was only support for TCP load balancing which meant the persistent HTTP Keep-Alive connection to a container would be torn down resulting in client errors during a rolling update.
To follow along, check out the zero-downtime-tutorial repository on GitHub; it has everything needed to test this out on your own.
This example contains a simple nginx container that has two versions -- one that says RED on the homepage and the other that says BLUE. We'll switch between these two to test our deployments. In addition, we'll be using an HAProxy container to terminate the HTTP connections on the proxy tier.
Create the services
There are two services that we'll be creating:
- myapp - The application
- myapp-proxy - Proxy for the application
The myapp-proxy service will take requests and send them to our backend which is provided by the myapp service.
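The service manifests might look roughly like this (the service names match the tutorial; the ports, selectors, and NodePort type are assumptions):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp           # matches the application pods
  ports:
  - port: 80
---
apiVersion: v1
kind: Service
metadata:
  name: myapp-proxy
spec:
  type: NodePort         # exposes the proxy outside the cluster (e.g. on Minikube)
  selector:
    app: myapp-proxy     # matches the HAProxy pods
  ports:
  - port: 80
```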
Now that the services are created we'll need to create our deployments for the application and proxy.
Create the deployments
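The application deployment ties together the pieces discussed above: a readiness probe, a preStop hook, and a generous grace period. The image tag, health-check path, and timings below are hypothetical placeholders; use the manifests from the tutorial repository for the real values.

```yaml
apiVersion: extensions/v1beta1   # Deployments API group circa Kubernetes 1.3
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: myapp
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: myapp
        image: myapp:red           # hypothetical tag for the RED version
        ports:
        - containerPort: 80
        readinessProbe:
          httpGet:
            path: /healthz         # hypothetical health-check path
            port: 80
          periodSeconds: 2
        lifecycle:
          preStop:
            exec:
              # Fail the readiness probe, then drain in-flight requests
              command: ["/bin/sh", "-c", "rm -f /usr/share/nginx/html/healthz && sleep 20"]
```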
After a few minutes the deployments should be running. If you are using Minikube, type the following to see the application. Your browser should open and you will see RED displayed on the screen.
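With Minikube, this opens the proxy service in your browser:

```shell
minikube service myapp-proxy
```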
Get the URL for the myapp-proxy service as we'll need it for doing a test -- mine is http://192.168.99.100:30301/. We'll start using ApacheBench to send traffic to our service and then we'll update our deployment with the BLUE image. Note that we're using the -k flag to test the HTTP Keep-Alive scenario also.
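The ApacheBench invocation looks like this (the URL comes from your own myapp-proxy service; the request and concurrency counts are arbitrary):

```shell
# -k enables HTTP Keep-Alive; -c 10 holds 10 concurrent connections open
ab -k -c 10 -n 100000 http://192.168.99.100:30301/
```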
Once the above is running, quickly run the following command to kick off a rolling update:
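One way to trigger the rollout is kubectl set image (the image tag here is a hypothetical placeholder for the BLUE version from the tutorial):

```shell
# Change the deployment's image to kick off a rolling update
kubectl set image deployment/myapp myapp=myapp:blue
```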
Once you run this, you can refresh a few times in your browser; you'll see the page swap between RED and BLUE and finally settle on our new version, BLUE.
Looking back at ApacheBench you should see zero failed requests. Success!