How Does Prometheus Monitor Kubernetes?
Prometheus monitors Kubernetes by scraping metrics endpoints from Pods, nodes, and cluster components. It uses Kubernetes service discovery to automatically find targets. The kube-prometheus-stack (Prometheus Operator) is the standard deployment method, providing pre-built dashboards and alerting rules.
Detailed Answer
How Prometheus Works
Prometheus is a pull-based monitoring system. It periodically scrapes HTTP endpoints (typically /metrics) on targets, parses the exposed metrics, stores them in a time-series database, and evaluates alerting rules.
The four main metric types:
- Counter: Monotonically increasing value (e.g., total HTTP requests)
- Gauge: Value that can go up or down (e.g., current memory usage)
- Histogram: Distribution of values in buckets (e.g., request latency)
- Summary: Similar to histogram but calculates quantiles client-side
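On the wire, these types appear in the Prometheus text exposition format that a /metrics endpoint returns. A hypothetical excerpt (metric names are illustrative, not from any specific exporter):

```text
# HELP http_requests_total Total HTTP requests served.
# TYPE http_requests_total counter
http_requests_total{method="GET",code="200"} 1027

# HELP process_memory_bytes Current memory usage.
# TYPE process_memory_bytes gauge
process_memory_bytes 5.3e+07

# HELP http_request_duration_seconds Request latency.
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.1"} 950
http_request_duration_seconds_bucket{le="0.5"} 1020
http_request_duration_seconds_bucket{le="+Inf"} 1027
http_request_duration_seconds_sum 84.2
http_request_duration_seconds_count 1027
```

Note that a histogram is exposed as cumulative `_bucket` series plus `_sum` and `_count`, which is what PromQL functions like histogram_quantile() operate on.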
Deploying Prometheus with kube-prometheus-stack
The recommended way to deploy Prometheus on Kubernetes is through the kube-prometheus-stack Helm chart, which includes Prometheus, Grafana, Alertmanager, and node-exporter:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install monitoring prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace \
  --set prometheus.prometheusSpec.retention=30d \
  --set prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage=50Gi \
  --set grafana.adminPassword=securePassword
This deploys:
- Prometheus - Metrics collection and storage
- Alertmanager - Alert routing and notification
- Grafana - Dashboards and visualization
- node-exporter - Host-level metrics (CPU, memory, disk)
- kube-state-metrics - Kubernetes object state metrics
# Verify the deployment
kubectl get pods -n monitoring
kubectl get svc -n monitoring
Kubernetes Metrics Sources
| Source | Metrics | Endpoint |
|---|---|---|
| kube-apiserver | API request latency, counts | /metrics on :6443 |
| kubelet | Container CPU, memory, network | /metrics on :10250 |
| cAdvisor (in kubelet) | Container resource usage | /metrics/cadvisor on :10250 |
| kube-state-metrics | Object state (Pod phase, replicas) | /metrics on :8080 |
| node-exporter | Node CPU, memory, disk, network | /metrics on :9100 |
| CoreDNS | DNS query latency, cache stats | /metrics on :9153 |
| etcd | Cluster health, disk I/O | /metrics on :2379 |
ServiceMonitor CRD
The Prometheus Operator uses ServiceMonitor CRDs to define scrape targets declaratively:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app-metrics
  namespace: monitoring
  labels:
    release: monitoring  # Must match the Prometheus Operator's serviceMonitorSelector
spec:
  namespaceSelector:
    matchNames:
      - production
  selector:
    matchLabels:
      app: my-app
  endpoints:
    - port: metrics
      interval: 30s
      path: /metrics
This tells Prometheus to scrape the endpoints behind every Service labeled app: my-app in the production namespace, on the port named metrics, every 30 seconds.
Application Instrumentation
Expose custom metrics from your application:
# Application Deployment with metrics port
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: my-app:v2
          ports:
            - name: http
              containerPort: 8080
            - name: metrics
              containerPort: 9090
---
apiVersion: v1
kind: Service
metadata:
  name: my-app
  namespace: production
  labels:
    app: my-app
spec:
  selector:
    app: my-app
  ports:
    - name: http
      port: 80
      targetPort: 8080
    - name: metrics
      port: 9090
      targetPort: 9090
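Inside the container, the application must actually serve the Prometheus text format on the metrics port. Real applications would use an official Prometheus client library; the sketch below uses only the Python standard library to make the mechanics visible (the metric name and port are illustrative):

```python
# Minimal /metrics endpoint using only the Python standard library.
# A real application would use a Prometheus client library, which also
# handles registries, label sets, and histogram buckets for you.
from http.server import BaseHTTPRequestHandler, HTTPServer

REQUEST_COUNT = 0  # would be a Counter object in a real client library


def render_metrics() -> str:
    """Render current metric values in the Prometheus text exposition format."""
    return (
        "# HELP http_requests_total Total HTTP requests handled.\n"
        "# TYPE http_requests_total counter\n"
        f"http_requests_total {REQUEST_COUNT}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        global REQUEST_COUNT
        REQUEST_COUNT += 1
        if self.path == "/metrics":
            body = render_metrics().encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()


def main():
    # Port 9090 matches the containerPort named "metrics" in the Deployment above.
    HTTPServer(("", 9090), MetricsHandler).serve_forever()
```

With this in place, the ServiceMonitor from the previous section would discover the Service and Prometheus would begin scraping http_requests_total automatically.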
Key PromQL Queries for Kubernetes
# CPU usage per Pod
rate(container_cpu_usage_seconds_total{namespace="production"}[5m])
# Memory usage per Pod
container_memory_working_set_bytes{namespace="production"}
# Pod restart count
kube_pod_container_status_restarts_total{namespace="production"}
# Pods not ready that were created more than 5 minutes ago
kube_pod_status_ready{condition="false"} == 1
and on(pod) (time() - kube_pod_created > 300)
# Node CPU utilization percentage
100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
# Persistent Volume usage percentage
kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes * 100
# API server request rate
rate(apiserver_request_total[5m])
# API server error rate
rate(apiserver_request_total{code=~"5.."}[5m]) / rate(apiserver_request_total[5m]) * 100
Alerting Rules
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: kubernetes-alerts
  namespace: monitoring
  labels:
    release: monitoring
spec:
  groups:
    - name: kubernetes-pod-alerts
      rules:
        - alert: PodCrashLooping
          expr: rate(kube_pod_container_status_restarts_total[15m]) * 60 * 15 > 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Pod {{ $labels.namespace }}/{{ $labels.pod }} is crash looping"
            description: "Pod has restarted {{ $value }} times in the last 15 minutes."
        - alert: PodNotReady
          expr: kube_pod_status_phase{phase=~"Pending|Unknown"} > 0
          for: 15m
          labels:
            severity: critical
          annotations:
            summary: "Pod {{ $labels.namespace }}/{{ $labels.pod }} has been in a non-running state for more than 15 minutes"
        - alert: HighMemoryUsage
          expr: |
            container_memory_working_set_bytes{container!=""}
              / on(namespace, pod, container) kube_pod_container_resource_limits{resource="memory"}
            > 0.9
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Container {{ $labels.container }} memory usage is above 90% of its limit"
Accessing Dashboards
# Port-forward to Grafana
kubectl port-forward -n monitoring svc/monitoring-grafana 3000:80
# Port-forward to Prometheus UI
kubectl port-forward -n monitoring svc/monitoring-kube-prometheus-prometheus 9090:9090
# Port-forward to Alertmanager
kubectl port-forward -n monitoring svc/monitoring-kube-prometheus-alertmanager 9093:9093
Production Considerations
For production clusters, ensure Prometheus has sufficient storage and retention configured. Use remote write (Thanos, Cortex, or Mimir) for long-term storage and multi-cluster aggregation. Set resource requests and limits on Prometheus Pods to prevent OOM kills. Use recording rules to pre-compute expensive queries that power dashboards.
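A recording rule is defined in the same PrometheusRule CRD used for alerts. A hypothetical sketch (the rule name follows the level:metric:operations convention; names and namespace are assumptions matching the install above):

```yaml
# Pre-computes per-namespace CPU usage so dashboards query the cheap
# recorded series instead of re-evaluating the raw rate() every refresh.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: recording-rules
  namespace: monitoring
  labels:
    release: monitoring
spec:
  groups:
    - name: cpu-recording
      interval: 1m
      rules:
        - record: namespace:container_cpu_usage_seconds:rate5m
          expr: sum by(namespace) (rate(container_cpu_usage_seconds_total[5m]))
```

Dashboards then chart namespace:container_cpu_usage_seconds:rate5m directly, which stays fast even as the number of containers grows.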
Why Interviewers Ask This
Interviewers ask this because monitoring is fundamental to operating Kubernetes in production, and Prometheus is the de facto standard for Kubernetes observability.
Key Takeaways
- Prometheus uses a pull-based model, scraping /metrics endpoints on targets
- Kubernetes service discovery automates target configuration
- kube-prometheus-stack provides a production-ready monitoring setup