Monitoring A Kubernetes cluster


Part 1



In Part 1 of this two-part blog, we look at some of the popular methods of monitoring a Kubernetes cluster. In our previous tutorials and blogs, we deployed our applications (WordPress, Jenkins, and our customized container).

Kubernetes monitoring helps cluster administrators manage complex, containerized applications at scale. By monitoring Kubernetes clusters, you can manage your entire container infrastructure, identify issues quickly and proactively, and track uptime, utilization of cluster resources (such as memory, CPU, and storage), and the interaction between cluster components.

Monitoring allows cluster administrators and users to identify issues such as insufficient resources, failures, pods that are unable to start, or nodes that cannot join the cluster. Many organizations use specialized cloud-native monitoring tools to gain full visibility over cluster activity.



What are the key metrics in monitoring Kubernetes?



Kubernetes Metrics Server aggregates data collected from the kubelet on each node and exposes it through the Metrics API, where it can be consumed by a number of visualization tools. Some key metrics to consider tracking include:

    • Cluster state metrics, including the health and availability of pods.
    • Node status, including readiness, memory, disk or processor overload, and network availability.
    • Pod availability, since unavailable pods can indicate configuration issues or poorly designed readiness probes.
    • Memory utilization at the pod and node level.
    • Disk utilization including lack of space for file system and index nodes.
    • CPU utilization in relation to the amount of CPU resource allocated to the pod.
    • API request latency, measured in milliseconds; the lower the number, the better.
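As a rough sketch of how such metrics can feed simple checks, the snippet below parses sample `kubectl top nodes` output (the node names and values here are hypothetical) and flags nodes whose CPU utilization exceeds a threshold:

```shell
# Sample `kubectl top nodes` output; node names and values are hypothetical.
sample_output='master    250m   12%   1024Mi   27%
worker1   900m   45%   2048Mi   54%
worker2   300m   15%   1500Mi   40%'

# Flag any node whose CPU utilization (column 3) exceeds the threshold.
threshold=40
overloaded=$(echo "$sample_output" | awk -v t="$threshold" '{ gsub(/%/, "", $3); if ($3 + 0 > t) print $1 }')

echo "Nodes over ${threshold}% CPU: ${overloaded}"
```

In a live cluster, the same check would run against the real output of `kubectl top nodes`, which requires Metrics Server to be deployed (we do this in Step 2 below).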



Kubernetes Monitoring Tools



Kubernetes offers many benefits but adds complexity as well. For example, its ability to distribute containerized applications across multiple data centres—and even different cloud providers—requires a comprehensive monitoring solution to collect and aggregate metrics across many different sources.

Continuous monitoring of system and application health is essential, and many free and commercial solutions provide real-time monitoring of Kubernetes clusters and the applications they host. Here are several open-source tools for Kubernetes monitoring:


Prometheus, a popular monitoring and alerting tool for Kubernetes and Docker, provides detailed, actionable metrics and analysis. Developed at SoundCloud and donated to the CNCF community, Prometheus is designed specifically to monitor applications and microservices running in containers at scale. Prometheus is not a dashboard, however, and is often used in conjunction with Grafana to visualize data.
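For context, a minimal Prometheus scrape configuration for discovering cluster nodes might look like the sketch below (the job name is illustrative, and a full Kubernetes setup would also need RBAC and TLS settings):

```yaml
# Minimal prometheus.yml sketch; the job name is illustrative.
global:
  scrape_interval: 15s        # how often Prometheus scrapes its targets
scrape_configs:
  - job_name: 'kubernetes-nodes'
    kubernetes_sd_configs:
      - role: node            # discover every node in the cluster
```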


Grafana, an open-source platform for analytics and metric visualization, includes four dashboards: Cluster, Node, Pod/Container and Deployment. Kubernetes admins often install Grafana and leverage the Prometheus data source to create information-rich dashboards.


Jaeger is a tracing system used to troubleshoot and monitor transactions in complex distributed systems. It addresses software issues that arise in distributed context propagation, distributed transactions monitoring, latency optimization and more.  



Kubernetes Dashboard, a web UI add-on for Kubernetes clusters, allows you to monitor the health status of workloads. 

We will deploy the two most popular monitoring tools: Prometheus and Grafana.


There are many open-source time-series databases available today, including Graphite, InfluxDB, and Cassandra, but none is as popular among Kubernetes users as Prometheus. Prometheus, a CNCF (Cloud Native Computing Foundation) project, has become the de facto open-source standard for monitoring Kubernetes.



Grafana is an open-source visualization tool used to monitor infrastructure in real time. The visualization of data through graphs helps in log analysis and troubleshooting real-time infra issues.

Grafana supports custom dashboards, and many excellent ready-made dashboards are available for Kubernetes monitoring.

Prometheus ships with its own dashboard, but its capabilities are limited. External visualization tools like Grafana use the Prometheus database to enable sophisticated queries, debugging, and reporting that can be tailored for dev, test, and production teams.

Usually, when you install Prometheus in your cluster, a Grafana pod is brought up by default. If your Prometheus installation did not include Grafana, you can manually add Grafana to your monitoring namespace.

Prometheus and Grafana work very well together and are the most preferred industry standards for monitoring a Kubernetes cluster, the underlying hardware resources and the pods that are deployed on the Kubernetes cluster.

You can import the JSON of these ready-made dashboards, or import them by dashboard ID in the Grafana UI. For Kubernetes cluster monitoring, very detailed dashboards are available through which we can monitor cluster, namespace, and pod-level details in Grafana.
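As a sketch, ready-made dashboard JSON can be downloaded from grafana.com by ID for manual import through the Grafana UI (315 is one commonly cited community dashboard ID for Kubernetes cluster monitoring via Prometheus; substitute the ID of whichever dashboard you choose):

```shell
# Hypothetical example: 315 is a commonly cited community dashboard ID.
DASHBOARD_ID=315
URL="https://grafana.com/api/dashboards/${DASHBOARD_ID}/revisions/latest/download"

# Fetch the dashboard JSON for import via the Grafana UI (requires network access):
# curl -sL "$URL" -o "dashboard-${DASHBOARD_ID}.json"
echo "$URL"
```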

We will deploy Prometheus and Grafana, along with additional metrics-collection components (Alert-manager, Metrics server, and kube-metrics), to collect as much information as possible from the Kubernetes cluster.

In all our deployments, we will use an already mounted EFS (Elastic File System). Details of deploying an EFS are in this blog.  However, we will not create a new mount point for monitoring but use the same mount points that we created for mounting WordPress and Jenkins. We will just create new directories for config files and data files. We will use Persistent Volumes for our monitoring. 
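A PersistentVolume backed by the existing EFS mount could be sketched roughly as below (the EFS DNS name, path, and capacity are placeholders; substitute the values from your own EFS setup):

```yaml
# Sketch only: the server address, path, and storage size are placeholders.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: monitoring-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany          # EFS/NFS supports many concurrent readers and writers
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: fs-0123456789abcdef0.efs.us-east-1.amazonaws.com
    path: /monitoring
```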

Important: To monitor the Kubernetes cluster, ensure that you have some pods running on it. Also make sure you are using both Master and Worker nodes, so that you can monitor the hardware resources of each.



Step 1: Create the necessary directories on the Master and Worker node.



On the Master Node, log in as the “jenkins” user:

    • $ cd /home/jenkins

We will first create a Namespace “monitoring”. Most of our monitoring deployments will be using this Namespace. In addition to “monitoring” Namespace, we will also use “kube-system” Namespace for collecting metrics. Namespace “kube-system” is created by default when the Kubernetes cluster is created. 

    • $ mkdir monitoring

    • $ cd monitoring

    • $ vi monitoring-namespace.yaml

apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
    • $ kubectl create -f monitoring-namespace.yaml

This creates the “monitoring” Namespace.

Next, we will download all the required files for monitoring the cluster. We will create all the sub-directories inside the “monitoring” directory we created earlier.

    • $ git clone .

This will clone all the directories to the current directory “monitoring”.



Step 2: Deploy the metrics



In this step, we will deploy all the metrics so that they can be used by Prometheus. 

    • $ cd monitoring

Since we have already cloned the GitHub repository where all the monitoring files are stored, we should now have all the directories and “yaml” files. Verify by issuing the command below.

    • $ ls -ltr

drwxrwxr-x. 2 jenkins jenkins 6144 Apr 17 07:00 alert-manager
drwxrwxr-x. 2 jenkins jenkins 6144 Apr 16 10:23 grafana
drwxrwxr-x. 2 jenkins jenkins 6144 Jan  7 14:04 kube-metrics
drwxrwxr-x. 2 jenkins jenkins 6144 Jan 10 08:57 metrics-server
drwxrwxr-x. 2 jenkins jenkins 6144 Apr 17 06:00 prometheus

Now let’s deploy the alert-manager

    • $ kubectl create -f alert-manager/

This creates the alert-manager pod, services and deployment. Alert-manager gets deployed in the “monitoring” Namespace.

Note: We have exposed the alert-manager on port 31000.
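For reference, the Service exposing Alertmanager on NodePort 31000 would be shaped roughly like the sketch below (the name, labels, and selector here are illustrative; the actual manifests in alert-manager/ define the real ones, and 9093 is Alertmanager's default listening port):

```yaml
# Sketch only: name and selector are illustrative placeholders.
apiVersion: v1
kind: Service
metadata:
  name: alertmanager
  namespace: monitoring
spec:
  type: NodePort
  selector:
    app: alertmanager
  ports:
    - port: 9093             # Alertmanager's default port
      targetPort: 9093
      nodePort: 31000        # externally reachable on every node
```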

To access the alert-manager through a web browser, use the below syntax:
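A sketch of the resulting URL, assuming a placeholder node IP (use the public IP of any master or worker node; 31000 is the NodePort noted above):

```shell
NODE_IP=203.0.113.10   # placeholder: substitute your node's public IP
ALERT_PORT=31000       # the NodePort on which alert-manager is exposed
URL="http://${NODE_IP}:${ALERT_PORT}"
echo "$URL"
```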


Below is a screenshot of the alert-manager in a browser.

Next, let’s deploy the “kube-metrics”

    • $ kubectl create -f kube-metrics/

This creates the “kube-metrics” pod, services and deployment. “kube-metrics” gets deployed in the “kube-system” Namespace.

Next, let’s deploy the “metrics-server”

    • $ kubectl create -f metrics-server/

This creates the “metrics-server” pod, services and deployment. “metrics-server” gets deployed in the “kube-system” Namespace.

This completes the deployment of the necessary metrics components that will be used by Prometheus to monitor the Kubernetes cluster.



Part 2 -> Deploying Prometheus and Grafana