Monitoring
What kinds of things need to be monitored?
Cluster components
- API Server (kube-apiserver): Monitor request rates, error rates, and latency
- Scheduler (kube-scheduler): Monitor scheduling latency and errors
- Controller Manager (kube-controller-manager): Monitor the status of various controllers
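One way to check on these components is through the API server's health endpoints, or by listing the control-plane pods directly (the label selector below assumes a kubeadm-style cluster, where control-plane static pods carry the `tier=control-plane` label):

```shell
# List control-plane component pods and their status
kubectl get pods -n kube-system -l tier=control-plane

# Aggregated readiness of the API server and its dependencies
kubectl get --raw='/readyz?verbose'

# Aggregated liveness checks
kubectl get --raw='/livez?verbose'
```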
Nodes
- The number of nodes in the cluster
- Performance metrics:
- CPU and Memory Usage: Monitor resource usage to ensure nodes are not over or underutilized
- Disk Usage: Monitor disk space and I/O operations
- Network: Monitor network bandwidth and errors
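A quick sketch of how to inspect these node-level signals with kubectl (`<node-name>` is a placeholder):

```shell
# List nodes with their status and roles
kubectl get nodes

# Show a node's conditions (MemoryPressure, DiskPressure, PIDPressure, etc.)
# and the CPU/memory requested by the pods scheduled on it
kubectl describe node <node-name>
```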
Pods
- The number of pods in the cluster
- Performance metrics:
- Resource Usage: Monitor CPU, memory, and disk usage of pods and containers.
- Pod Status: Monitor the status of pods (Running, Pending, Failed, etc.).
- Container Logs: Monitor logs for errors and warnings.
- Health Checks: Monitor the results of liveness and readiness probes.
- Application Metrics: Monitor custom application metrics (e.g., request rates, error rates).
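The pod-level checks above map to a few basic kubectl commands (pod names are placeholders):

```shell
# Pod status (Running, Pending, Failed, ...) across all namespaces
kubectl get pods --all-namespaces

# Recent container logs, to scan for errors and warnings
kubectl logs <pod-name> --tail=50

# Events, including liveness/readiness probe failures,
# appear in the pod description
kubectl describe pod <pod-name>
```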
Metrics Server
ℹ️
You can use other monitoring solutions such as Prometheus, Datadog, Dynatrace, etc. instead of Metrics Server.
Metrics Server collects resource metrics such as CPU and memory usage from the kubelet on each node, aggregates this resource usage data, and exposes it through the Kubernetes API server (kube-apiserver) for autoscaling purposes such as the Horizontal Pod Autoscaler (HPA).
- HPA will automatically adjust the number of pods in a deployment based on the metrics collected by the Metrics Server.
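As a sketch of how HPA builds on these metrics, you can create an autoscaler imperatively (the deployment name `web` is a placeholder):

```shell
# Keep the 'web' deployment between 2 and 10 replicas,
# targeting 50% average CPU utilization
kubectl autoscale deployment web --cpu-percent=50 --min=2 --max=10

# Watch the HPA react to the metrics reported by the Metrics Server
kubectl get hpa web --watch
```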
How are metrics collected from the pods?
- The kubelet contains a component called cAdvisor (Container Advisor) that collects the resource usage data of the pods and exposes it through the kubelet API. The Metrics Server then scrapes this resource usage data, in the form of metrics, from the kubelet API on each node.
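You can inspect the same aggregated data that `kubectl top` uses by querying the Metrics API directly (this assumes Metrics Server is installed and serving the `metrics.k8s.io` API):

```shell
# Raw node metrics from the Metrics API (served by Metrics Server)
kubectl get --raw '/apis/metrics.k8s.io/v1beta1/nodes'

# Raw pod metrics for the default namespace
kubectl get --raw '/apis/metrics.k8s.io/v1beta1/namespaces/default/pods'
```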
# View performance metrics of the cluster/pod
# This command requires Metrics Server to be correctly configured and working on the server.
kubectl top node
kubectl top pod