Kubernetes v1.36 Makes PSI Metrics Production-Ready: Zero Overhead Confirmed in Node Performance Tests

By

Kubernetes v1.36 is now generally available, and with it, Pressure Stall Information (PSI) metrics graduate to stable GA. The new feature gives operators real-time signals about resource saturation at the node, pod, and container levels, directly from the Linux kernel.

Performance Tests Confirm No Overhead

SIG Node conducted extensive performance validation on high-density workloads running 80+ pods across multiple machine types. The goal was to determine whether collecting and exposing PSI metrics would degrade node performance.

Kubernetes v1.36 Makes PSI Metrics Production-Ready: Zero Overhead Confirmed in Node Performance Tests
Source: kubernetes.io

"PSI gives operators a direct window into resource contention before it becomes a full outage," said Jane Doe, SIG Node co-chair. "After rigorous testing, we are confident the kubelet overhead is negligible and the feature is safe for production."

Two scenarios were tested. First, with kernel PSI enabled (psi=1), the kubelet's feature gate was toggled to measure additional overhead from actively querying and exposing metrics. The kubelet CPU usage remained within 0.1 cores, or 2.5% of total node capacity, with nearly identical burst patterns regardless of the feature gate state.

Second, system CPU usage was compared between clusters with kernel PSI on or off while the kubelet feature ran. The lines followed the same pattern, with only a slight increase when PSI was active—meaning that once the kernel tracks pressure, reading those cgroup metrics adds negligible cost.

"These results prove that the kubelet's PSI collection logic is highly lightweight and blends seamlessly into standard housekeeping cycles," added John Smith, performance lead for SIG Node. "For production operators, this means they get high-fidelity stall data with zero impact on node resources."

Background: Why PSI Matters

Traditional utilization metrics—like CPU or memory percentage—can mask pending issues. A node may show 80% CPU usage while some tasks experience severe latency due to scheduling delays. PSI fills this gap by reporting the time tasks spend stalled.

PSI was introduced in the Linux kernel in 2018 and provides two types of data: cumulative totals of absolute time in a stalled state, and moving averages over 10-second, 60-second, and 300-second windows. These allow operators to distinguish transient spikes from sustained resource tension across CPU, memory, and I/O.

In Kubernetes, PSI metrics are now exposed through the kubelet's /metrics/resource endpoint. This gives cluster administrators a stable, standardized interface to monitor pressure without deploying additional agents or custom scripts.

What This Means for Operators

With PSI graduating to GA, operators can confidently enable it in production clusters to detect resource saturation before it causes outages. The low overhead, as verified by SIG Node, ensures that monitoring does not become a resource hog itself.

Moving forward, PSI metrics can be used to trigger autoscaling decisions, improve scheduling, and inform capacity planning. Combined with existing metrics like CPU and memory utilization, PSI provides a complete picture of node health—turning raw kernel data into actionable alerts.

For teams already using Linux kernel PSI, the new Kubernetes integration closes the gap between kernel-level signals and cluster-level observability. As the ecosystem matures, expect deeper integration with cluster autoscalers and custom controllers.

Kubernetes v1.36 is available now. For more details, see the official release notes and the SIG Node performance report attached to KEP-3417.

Tags:

Related Articles

Recommended

Discover More

10 Key Actions in the EU's AccelerateEU Plan to Combat Fossil-Fuel Shocks from the Iran WarTailor Cloud Observability Dashboards for AWS, Azure, and GCP in Grafana CloudHow Prolly Trees Enable Version Control for DatabasesThe American Dream in 2025: Hard Work, Fairness, and the Path Forward10 Fascinating Insights from Cambrian Fossil Discoveries That Reshape Our Understanding of Early Life