Monday, February 4, 2019

VM, Lambda, Kubernetes & Terraform Best Practice

I work with these popular cloud technologies.
  • VMs, virtual machines like EC2 or GCE
  • Docker
  • Kubernetes
  • Terraform
  • Lambda / serverless functions
This post contains a short introduction to these technologies and my best practices for which cloud technology to use in different situations.


Virtualization Technologies


Here is a quick history and a brief summary of the differences.

A Highly Abbreviated Virtualization History

  • 2006 Amazon released EC2, a cloud VM you could spin up quickly on demand.
  • 2013 Docker. Describes everything a container needs in a small file, which is used to build a lightweight image.
  • 2014 Google open sourced Kubernetes, a system to run Docker containers together.
  • 2015 Serverless functions / lambdas. Code that is independent of any VM you manage.
  • 2018 Firecracker. A microVM with a 125ms start time, used for AWS Lambda and Fargate.

VM vs Containers vs Lambdas


Main difference
  • A VM has a full guest operating system that runs on a hypervisor.
  • Docker / Kubernetes containers run as image layers on top of a shared Linux kernel instead of each bringing a full OS.
  • A Lambda / serverless function runs in a minimal VM with good sandbox isolation.
The trend has been from heavyweight VMs to super lightweight VMs.

Recently AWS lambdas started running in a microVM called Firecracker that can spin up in around 125ms with only 5MB memory overhead.


Best Practices for Virtualization


When should you use full VMs, Docker, Kubernetes or lambdas?

When Should You Use Serverless / Lambdas

There are many names for the same concept: AWS Lambdas, Azure Functions and Cloud Functions on GCP.

Good use cases for serverless functions
  • RESTful call with no state.
  • RESTful call that only interacts with a database.
  • Database maintenance tasks.
  • Logging operations.
  • On Azure and GCP they are used to serve up machine learning models once they are trained.
Lambdas / serverless functions don't need to have a VM running and they scale from no use to massive use. You pay only for the invocations you use, which makes them very cheap and flexible.

Serverless functions have been marketed as the future of cloud computing and are clearly going to play a big role.

When Should You Use VMs or Kubernetes


Good use cases for VM or Kubernetes
  • Your program has to load a lot of data on startup.
  • Web application with a lot of functionality that is naturally grouped together.
  • Your program has to do a long sequence of operations.
You could use lambdas for a long sequence of operations. You would just push messages along from one lambda to the next. This is similar to the actor model in Erlang or Akka. I find that this gives you little control over the overall flow and it makes error handling hard.
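
To make the message-passing style concrete, here is a minimal local sketch in Python. It does not use any cloud SDK. The handler names (`parse_order`, `price_order`), the `next` field and the in-memory queue are all hypothetical stand-ins for real functions and a real message queue.

```python
from collections import deque


def parse_order(message):
    # Step 1: validate the input, then name the step that should run next.
    order = {"item": message["item"], "qty": int(message["qty"])}
    return {"next": "price_order", "order": order}


def price_order(message):
    # Step 2: last step in the chain, so there is no next step.
    order = message["order"]
    return {"next": None, "total": order["qty"] * 10}


# Hypothetical registry mapping step names to handler functions.
HANDLERS = {"parse_order": parse_order, "price_order": price_order}


def run_chain(first_step, message):
    """Push a message from handler to handler until the chain ends."""
    queue = deque([(first_step, message)])
    results, dead_letters = [], []
    while queue:
        step, msg = queue.popleft()
        try:
            out = HANDLERS[step](msg)
        except Exception as exc:
            # No caller is waiting for the result, so the failure can only
            # be parked somewhere for later inspection.
            dead_letters.append((step, msg, repr(exc)))
            continue
        next_step = out.pop("next")
        if next_step is None:
            results.append(out)
        else:
            queue.append((next_step, out))
    return results, dead_letters
```

Calling `run_chain("parse_order", {"item": "book", "qty": "3"})` ends with one result, while a message with a non-numeric `qty` ends up in the dead-letter list. No single place in the code sees the whole sequence, which is the loss of control described above.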

When Should You Use Kubernetes


Good use cases for Kubernetes
  • If you are running a lot of daily tasks from some scheduling system, say Airflow or Luigi, it is faster to start them in Kubernetes than to spin up a new full VM instance for each.
  • You find a Docker image with a program that does what you need.
  • If you have several programs that need to run together, one program might need to be installed on Debian, another on Ubuntu and one on CentOS. Kubernetes handles this very well. You can actually deploy all 3 containers to the same Kubernetes pod, where they share a disk volume.

When Should You Use a Full VM

There is overhead in setting up Kubernetes. You also need to have a Kubernetes master node running, which costs money. So sometimes the simplest solution is to use a full VM.

Should You Run Docker Inside a VM?

The advantage of Docker is that you package up the Docker image and you can test it locally running in the same way as it will run on the VM.

The disadvantages are that you have the extra steps of creating the Dockerfile, then building the Docker image and pushing it to Docker Hub or some other registry. You have to install Docker on your VM. There can also be some performance hit from the extra layer of virtualization.

I use Docker on my laptop and on Kubernetes, but I usually do not use Docker in a full VM.


Terraform


Terraform is a new tool for infrastructure as code, released by HashiCorp in 2014. It uses a small functional-style language focused on configuration.

In your Terraform program you define the state you want to put your cloud system in. You run these commands from the command line in the directory where you have your program:

terraform init
terraform plan
terraform apply

This will start a VM or create your infrastructure for you, and Terraform stores the state of your system in what is called a Terraform state file. This state file can be stored locally or shared in a cloud bucket.

When you want to make changes to your cloud infrastructure, you change your Terraform program and run these again:

terraform plan
terraform apply

Terraform is declarative. It will compare the state of your system with the state you want it to be in and find out what changes it needs to make.
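
The comparison step can be illustrated with a toy sketch in Python. This is not how Terraform is implemented. The function name `plan` and the dictionary layout (resource name mapped to its settings) are assumptions made up for the illustration.

```python
def plan(current, desired):
    """Compare recorded state with desired state and list the changes."""
    # Resources that are wanted but do not exist yet.
    to_create = {
        name: cfg for name, cfg in desired.items() if name not in current
    }
    # Resources that exist but whose settings differ from what is wanted.
    to_update = {
        name: cfg
        for name, cfg in desired.items()
        if name in current and current[name] != cfg
    }
    # Resources that exist but are no longer described in the program.
    to_destroy = sorted(name for name in current if name not in desired)
    return {"create": to_create, "update": to_update, "destroy": to_destroy}
```

When the current and desired dictionaries are equal, all three parts of the result are empty, which corresponds to a plan with nothing to change.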

I have used Terraform a lot with AWS to spin up EC2 instances and EMR clusters, but also to create IAM roles, policies, VPNs and security groups.

The documentation is good, but there is a steep learning curve for Terraform. I found a class, "Learn DevOps: Infrastructure Automation With Terraform", that helped me.

Terraform Modules

Terraform has a concept called a module, which enables code reuse. It is an advanced topic, but I find it absolutely essential for writing maintainable code, especially if you have multiple environments, say dev, staging and prod.

Terraform Version Problem

A problem that I experienced several times is that one team member accidentally upgrades Terraform to the newest version and applies a change with it. The next time somebody with an older version runs an update script, they get this message:

Terraform doesn't allow running any operations against a state
that was written by a future Terraform version. The state is
reporting it is written by Terraform '0.11.8'.

The good news is that the Terraform state file is written in JSON and is somewhat robust. So you can download the state file and change the version number back to the old version, and there is a good chance that it will work. Still, this is not the kind of error message that you want to see when you are doing a prod release.
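
The manual edit could be scripted roughly as follows. This is a sketch, assuming the state file keeps the version in a top-level `terraform_version` field. The function name `set_state_version` is hypothetical, and the script keeps a backup copy because it rewrites the file in place.

```python
import json
import shutil
from pathlib import Path


def set_state_version(state_path, old_version):
    """Rewrite the recorded Terraform version in a downloaded state file."""
    path = Path(state_path)
    # Keep an untouched copy next to the original before changing anything.
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    state = json.loads(path.read_text())
    # Assumption: the version is stored under this top-level key.
    previous = state.get("terraform_version")
    state["terraform_version"] = old_version
    path.write_text(json.dumps(state, indent=4))
    return previous
```

For example, `set_state_version("terraform.tfstate", "0.11.7")` would return `"0.11.8"` for the state in the message above, where `0.11.7` is only a placeholder for whatever older version the team is on. Try this on a copy first, since a newer version may have changed more than the version field.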

Issues with Terraform

Terraform is a nice declarative framework, but the Terraform state file is stored either locally or in a cloud bucket.
  • A local state file makes it hard for a team to collaborate. Each team member gets a different state file.
  • Cloud storage allows you to collaborate, but you are still dealing with shared mutable state that is susceptible to the version problem mentioned above.
I used Terraform to create a lambda function with IAM roles, policies and code. When I tried to update the lambda to a newer version, Terraform did not detect the changed program files, so I had to destroy everything and recreate it.
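
One possible explanation, which is an assumption and not something verified for the setup described here, is that the path to the code package stayed the same, so nothing in the configuration looked different. A common workaround is to track a hash of the package contents. The `aws_lambda_function` resource has a `source_code_hash` argument for this purpose. The sketch below shows the idea in Python with a base64-encoded SHA-256 digest. The function name `content_fingerprint` is hypothetical.

```python
import base64
import hashlib


def content_fingerprint(data: bytes) -> str:
    """Return a base64-encoded SHA-256 digest of the package contents."""
    digest = hashlib.sha256(data).digest()
    return base64.b64encode(digest).decode("ascii")
```

Two packages with the same file name but different bytes get different fingerprints, so a tool that records the fingerprint can see that the code changed even though the path did not.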

Using Terraform is often safer than making manual changes in a web console, but I would hesitate to update a database using Terraform.

There is an enterprise version of Terraform that might alleviate some of these problems, but I have only used the open source version.


Kubernetes


Kubernetes is a container orchestration framework. It was open sourced by Google in 2014 and it works very well on GCP, Google Cloud Platform. Many cloud providers have Kubernetes offerings, e.g. AWS, Azure and DigitalOcean.

Kubernetes uses a declarative cloud definition. In a YAML file you define how many instances of a web server you want to run. If a web server crashes, Kubernetes will start a new one without intervention.
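
The self-healing behavior can be pictured as a loop that keeps comparing the desired number of instances with what is actually running. The following Python sketch simulates a single pass of such a loop. It is not Kubernetes code, and the `reconcile` function, the pod dictionaries and the `web-N` naming are assumptions for illustration only.

```python
import itertools


def reconcile(desired_replicas, pods, name_source):
    """Return the pod list after one pass that restores the desired count."""
    # Keep only the pods that are still healthy.
    running = [pod for pod in pods if pod["status"] == "Running"]
    # Start replacements until the desired count is reached.
    while len(running) < desired_replicas:
        name = f"web-{next(name_source)}"
        running.append({"name": name, "status": "Running"})
    # Scale down if more pods are running than were asked for.
    return running[:desired_replicas]
```

With three desired replicas and a pod list where `web-1` has crashed, `reconcile(3, pods, itertools.count(3))` drops the crashed pod and adds a fresh `web-3`, so the count is back at three without anyone stepping in.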

Kubernetes was one of the most actively developed open source projects in 2018. It feels mature.
The state is part of the Kubernetes system not a file living locally or in an S3 bucket.

Issues with Kubernetes

It is quite complicated to set Kubernetes up in a private cloud. You need dedicated DevOps staff to do this. A lot of things can and do go wrong. I have many memories of the DNS server going missing and the block storage / hard disks disappearing after programs had been running for hours.


Terraform or Kubernetes


When should you use Terraform and when should you use Kubernetes?

They are both declarative tools that you can use to start programs and define things like security groups in your cloud environment.

Terraform is a good option if you want to define your infrastructure and spin up VMs, EMR clusters etc. It is not AWS specific but works very well with AWS.

Kubernetes is a good option if you choose to use containers and you are working on a cloud that has good Kubernetes support. AWS has a competing technology, Fargate, and AWS integration with Kubernetes is less mature.
