In this article, I’m going to explain what Kubernetes is, and we will cover the following topics.
- Official definition: What is Kubernetes?
- What problems does Kubernetes solve? Or why is there a need for a container orchestration tool?
- What features do container orchestration tools offer?
- Basic architecture: Master-Slave nodes, Kubernetes processes
- Basic concepts: Pods, Containers, Services. What is the role of each?
- Example Configuration File
What is Kubernetes?

So let’s jump into the definition: what is Kubernetes? Kubernetes is an open-source container orchestration framework, originally developed by Google. At its foundation, it manages containers, whether Docker containers or containers from some other technology. This basically means that Kubernetes helps you manage applications that are made up of hundreds or maybe even thousands of containers, and it helps you manage them in different environments: physical machines, virtual machines, cloud environments, or even hybrid deployment environments.
What problems does Kubernetes solve?
Why is there a need for a container orchestration tool?

So what problems does Kubernetes solve, and what are the actual tasks of a container orchestration tool? To go through this chronologically: the rise of microservices caused increased usage of container technologies, because containers offer the perfect host for small, independent applications like microservices. The microservice trend resulted in applications that are now comprised of hundreds, or sometimes even thousands, of containers.
Managing those loads of containers across multiple environments using scripts and self-made tools can be really complex, and sometimes even impossible. That specific scenario is what created the need for container orchestration technologies.
What features do container orchestration tools offer?

So what orchestration tools like Kubernetes actually do is guarantee the following features. The first is high availability; in simple words, high availability means that the application has no downtime, so it’s always accessible to users. The second is scalability, which means the application has high performance: it loads fast, and users get very fast response times from the application. And the third is disaster recovery, which basically means that if the infrastructure has problems, like data being lost, servers exploding, or something bad happening at the data center, the infrastructure has to have some kind of mechanism to pick up the data and restore it to the latest state, so that the application doesn’t actually lose any data.
The containerized application can then run from the latest state after recovery. All of these are functionalities that container orchestration technologies like Kubernetes offer.
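As a rough illustration of how these guarantees show up in configuration, here is a minimal sketch of a Deployment (the names my-app and my-image, the /healthz path, and port 8080 are placeholders, not something from this article): the replica count gives you availability and scaling, and the liveness probe lets Kubernetes restart a container that has stopped responding.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3                 # three identical pods; if one dies, two keep serving
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-image     # placeholder image name
          livenessProbe:      # restart the container if this check starts failing
            httpGet:
              path: /healthz
              port: 8080
```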
Basic architecture: Master-Slave nodes, Kubernetes processes

So what does the basic Kubernetes architecture actually look like? A Kubernetes cluster is made up of at least one master node, and connected to it you have a couple of worker nodes. Each node has a kubelet process running on it; kubelet is a Kubernetes process that makes it possible for the nodes in the cluster to talk to each other and to execute tasks on those nodes, like running application processes.
Each worker node has Docker containers of different applications deployed on it. So depending on how the workload is distributed, you would have a different number of Docker containers running on each worker node. Worker nodes are where the actual work is happening; this is where your applications are running.
So the question is: what is running on the master node? The master node actually runs several Kubernetes processes that are absolutely necessary to run and manage the cluster properly.
One of these processes is the API server, which is also a container. The API server is the entry point to the Kubernetes cluster, so it is the process that the different Kubernetes clients talk to: a UI if you’re using the Kubernetes dashboard, an API if you’re using scripts and automation tools, and the command-line tool. All of these talk to the API server.
Master-Slave nodes
Another process running on the master node is the controller manager, which basically keeps an overview of what’s happening in the cluster: whether something needs to be repaired, or maybe a container died and needs to be restarted, etc.
Another one is the scheduler, which is responsible for scheduling containers on different nodes based on the workload and the available server resources on each node. It’s an intelligent process that decides on which worker node the next container should be scheduled, based on the resources available on those worker nodes and the resources that container needs.
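To give the scheduler something concrete to base its decision on, each container can declare resource requests and limits. A minimal sketch, with purely illustrative values:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: my-app
      image: my-image        # placeholder image name
      resources:
        requests:            # the scheduler only picks a node with this much free capacity
          cpu: "250m"
          memory: "128Mi"
        limits:              # the container cannot exceed this at runtime
          cpu: "500m"
          memory: "256Mi"
```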
Another very important component of the whole cluster is etcd, a key-value store which holds the current state of the Kubernetes cluster at any time. It has all the configuration data and all the status data of each node and each container inside that node. The backup and restore that we mentioned previously is actually made from etcd snapshots, because you can recover the whole cluster state using an etcd snapshot.
And last but not least, a very important component of Kubernetes that enables the worker nodes and master nodes to talk to each other is the virtual network, which spans all the nodes that are part of the cluster. In simple words, the virtual network turns all the nodes inside the cluster into one powerful machine that has the sum of the resources of the individual nodes.
Kubernetes processes
One thing to note here is that worker nodes, because they carry most of the load and run the applications, are usually much bigger and have more resources, since they will be running hundreds of containers.
The master node, by contrast, runs just a handful of master processes like the ones described above, so it doesn’t need that many resources. However, as you can imagine, the master node is much more important than the individual worker nodes, because if, for example, you lose master node access, you will not be able to access the cluster anymore.
That means you absolutely have to have a backup of your master at any time. So in production environments you would usually have at least two masters inside your Kubernetes cluster, and in many cases you’re going to have multiple masters, so that if one master node goes down, the cluster continues to function smoothly because you have other masters available.
Basic concepts: Pods, Containers, Services. What is the role of each?

So now let’s look at some basic Kubernetes concepts like pods and containers. In Kubernetes, a pod is the smallest unit that you as a Kubernetes user will configure and interact with. A pod is basically a wrapper around a container. On each worker node you’re going to have multiple pods, and inside a pod you can actually have multiple containers. Usually, though, you would have one pod per application; the only time you would need more than one container inside a pod is when you have a main application that needs some helper containers.
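The "main application plus helper container" case just described could be sketched like this (both container names and images are hypothetical placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: my-app                     # the main application container
      image: my-image
    - name: log-forwarder              # hypothetical helper container running alongside it
      image: my-log-forwarder-image
```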
Pods, Containers, Services. What is the role of each?
So a database, for example, would be one pod, a message broker would be another pod, a server again another pod, and a Node.js application, or a Java application for example, would be its own pod.
And as we mentioned previously, there is a virtual network that spans the Kubernetes cluster. What it does is assign each pod its own IP address, so each pod is its own self-contained server with its own IP address, and the way pods can communicate with each other is via those internal IP addresses.
Note that we don’t actually configure or create containers inside a Kubernetes cluster; we only work with pods, which are an abstraction layer over containers. A pod is the Kubernetes component that manages the containers running inside it without our intervention. So for example, if a container stops or dies inside a pod, it will be automatically restarted inside the pod. However, pods are ephemeral components, which means that pods can also die quite frequently, and when a pod dies, a new one gets created.
Here is where the notion of a service comes into play. What happens is that whenever a pod gets restarted, or a new pod is created, it gets a new IP address.
What is the role of each
So for example, if you have your application talking to a database pod using the pod’s IP address, and that pod restarts, it gets a new IP address. Obviously, it would be very inconvenient to adjust that IP address all the time. Because of that, another Kubernetes component called a “service” is used, which is basically an alternative, or a substitute, for those IP addresses.
Instead of using those dynamic IP addresses, a service sits in front of each pod, and the pods talk to each other through the services. So now if a pod behind a service dies and gets recreated, the service stays in place, because their life cycles are not tied to each other. A service has two main functionalities: it provides a permanent IP address, which you can use for communication between pods, and at the same time it is a load balancer.
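A minimal sketch of such a service (the service name, label, and ports are placeholders): it gives every pod carrying the label a single stable address and load-balances across them.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app-service
spec:
  selector:
    app: my-app          # forwards traffic to all pods with this label
  ports:
    - port: 80           # the stable port clients connect to
      targetPort: 8080   # the container port behind the service
```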
Example Configuration File

So now that we have seen the basic concepts of Kubernetes, how do we actually create components like pods and services to configure the Kubernetes cluster? All configuration in a Kubernetes cluster goes through the master node, via the process called the API server, which we mentioned briefly earlier.
Kubernetes clients, which could be a UI (the Kubernetes dashboard, for example), an API (a script or a curl command), or a command-line tool like kubectl, all talk to the API server and send their configuration requests to it. The API server is the main entry point, or really the only entry point, into the cluster.
YAML format or JSON format
These requests have to be in either YAML format or JSON format. Here is how an example configuration in YAML format actually looks. With it, we are sending a request to Kubernetes to configure a component called a deployment.

A deployment is basically a template, or blueprint, for creating pods. In this specific configuration example, we tell Kubernetes to create two replica pods for us, called my-app, with each pod replica having a container based on my-image running inside it.
In addition to that, we configure the environment variables and the port configuration of the container inside the pod. And as you can see, configuration requests in Kubernetes are declarative in form.
So we declare our desired outcome, and Kubernetes tries to meet those requirements. For example, since we declared that we want two replica pods of the my-app deployment to be running in the cluster, if one of those pods dies, the controller manager will see that the actual and desired states are now different: the actual state is one pod, but our desired state is two. So it makes sure that the desired state is recovered automatically by restarting the second replica of that pod. And this is how Kubernetes configuration works with all of its components, be it pods, services, deployments, or what have you.
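The deployment just described could look roughly like the following sketch; the image name, environment variable, and port are placeholders for whatever your application actually uses.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2                      # desired state: two pod replicas
  selector:
    matchLabels:
      app: my-app
  template:                        # blueprint for each pod replica
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-image          # placeholder image name
          ports:
            - containerPort: 8080  # placeholder port
          env:
            - name: APP_ENV        # hypothetical environment variable
              value: "production"
```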
If you want to learn more please visit our blog page.