Kubernetes v1.36 Unveils Beta for In-Place Pod-Level Resource Scaling
In Kubernetes v1.36, a major milestone arrives for pod resource management: the In-Place Pod-Level Resources Vertical Scaling feature has graduated to Beta and is now enabled by default. This capability lets you dynamically adjust the aggregate resource budget of a running pod, often without restarting any containers. It builds on earlier advances (Pod-Level Resources Beta in v1.34 and In-Place Pod Vertical Scaling GA in v1.35) and simplifies operations for complex pods, especially those whose sidecars share a collective pool of CPU and memory. Below, we explore how it works, why it matters, and how to use it.
What is in-place pod-level resource vertical scaling?
In-place pod-level resource vertical scaling lets you update the .spec.resources field of a running pod to change its aggregate resource budget: the total CPU and memory available to all containers combined. Unlike traditional scaling, which often requires a pod restart or per-container adjustments, this feature enables on-the-fly modifications. The change is applied via the resize subresource, and the kubelet attempts to update cgroup limits dynamically through the Container Runtime Interface (CRI). If a container's resizePolicy allows non-disruptive updates (NotRequired), no restart is needed. This reduces downtime and operational overhead, especially for workloads like sidecar patterns where containers share a common resource pool.
Why does pod-level resizing simplify management for complex pods?
Pods with multiple containers (e.g., main app plus sidecars) often need a coordinated resource plan. Previously, you had to manually calculate and set individual container limits to ensure the total didn't exceed node capacity. With pod-level resources, you define an aggregate boundary once; containers without individual limits automatically inherit from this pool. When demand spikes, you can expand the total budget in one step—no recalculating per-container values. This is especially useful for scenarios where sidecars are lightweight but collectively need more headroom. The v1.36 Beta makes this adjustment possible while the pod is running, so you respond to load changes without service disruption.
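As an illustrative sketch (the pod name and images are placeholders of ours, not from the release), a pod that sets only an aggregate budget might look like this; neither container declares its own limits, so both draw from the shared pool:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-budget-demo   # hypothetical name for illustration
spec:
  resources:                 # pod-level aggregate budget
    limits:
      cpu: "2"
      memory: 4Gi
  containers:
  - name: main-app
    image: registry.example.com/main-app:latest   # placeholder image
    # no per-container limits: inherits from the pod-level pool
  - name: sidecar
    image: registry.example.com/sidecar:latest    # placeholder image
    # also unbounded individually; constrained only by the pod budget
```

Because only the pod-level limits bound the containers, growing the pool later is a single edit rather than a recalculation across every container.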
How does resource inheritance and resizePolicy work?
When you resize the pod-level budget, the kubelet treats it as a resize event for every container that inherits its limits from the pod level. Each container has a resizePolicy that tells the kubelet whether a restart is required to apply the change. If the policy is NotRequired, the kubelet updates the container's cgroup limits live via the CRI, with no restart. If it is RestartContainer, the container is restarted to safely apply the new boundaries. Currently, resizePolicy is not supported at the pod level; the kubelet always defers to individual container settings. This granularity lets you keep critical containers running while restarting only those that need a full reset.
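A minimal sketch of per-container resizePolicy, using the field shape from container-level in-place resizing (a resourceName paired with a restartPolicy); the image is a placeholder:

```yaml
containers:
- name: main-app
  image: registry.example.com/main-app:latest   # placeholder image
  resizePolicy:
  - resourceName: cpu
    restartPolicy: NotRequired       # apply CPU changes live via the CRI
  - resourceName: memory
    restartPolicy: RestartContainer  # restart the container to apply new memory limits
```

With this policy, a pod-level CPU resize flows through without disruption, while a memory resize triggers a restart of this one container only.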
Can you show an example of scaling a shared resource pool?
Consider a pod named shared-pool-app with a pod-level limit of 2 CPUs and 4Gi of memory, and two containers (main-app and sidecar) that do not set individual limits. Both inherit from the pod budget, and their CPU resizePolicy is NotRequired. To double CPU capacity to 4 CPUs, you apply a patch using the resize subresource:
```shell
kubectl patch pod shared-pool-app --subresource resize --patch '{"spec":{"resources":{"limits":{"cpu":"4"}}}}'
```
Since both containers allow non-disruptive updates, the kubelet adjusts their cgroup limits dynamically, with no restarts. The containers can immediately use up to the new aggregate limit. This example demonstrates how simple it is to scale shared pools for sidecar or microservice patterns.
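Putting the pieces together, a sketch of what the shared-pool-app manifest might look like before the resize (images are placeholders of ours):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-pool-app
spec:
  resources:            # aggregate budget shared by both containers
    limits:
      cpu: "2"          # doubled to "4" by the resize patch
      memory: 4Gi
  containers:
  - name: main-app
    image: registry.example.com/main-app:latest   # placeholder image
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired   # live CPU resize, no restart
  - name: sidecar
    image: registry.example.com/sidecar:latest    # placeholder image
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired
```

Note that neither container sets resources of its own; that is what makes the single pod-level patch sufficient.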
What node-level feasibility and safety checks does the kubelet perform?
Applying a resize patch is only the first step. The kubelet runs a sequence of checks to ensure node stability before applying changes. It verifies that the new aggregate resource request does not exceed the node's allocatable resources, accounting for the other pods running on the node. It also validates that the pod's Quality of Service (QoS) class remains consistent (e.g., a Guaranteed pod must keep requests equal to limits). The kubelet then updates the cgroup hierarchy in order, adjusting memory limits first and CPU second, to avoid transient issues. If a container's resizePolicy requires a restart, the kubelet coordinates that safely. These steps prevent resource overcommit and maintain node health during live resizing.
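You can watch these checks play out in the pod's status. The fragment below is an illustrative sketch, assuming the resize condition types (such as PodResizePending) that were introduced alongside container-level in-place resizing; the reason and message values are ours:

```yaml
status:
  conditions:
  - type: PodResizePending
    status: "True"
    reason: Deferred   # hypothetical example: the node cannot fit the new budget right now
    message: "insufficient allocatable cpu on node; resize will be retried"
```

Once the kubelet can accommodate the request, the pending condition clears and the updated limits appear in the pod's status.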
What's next for pod-level vertical scaling?
With the graduation to Beta in v1.36 and default enablement, the community is gathering more feedback from real-world usage. Future enhancements may include support for resizePolicy at the pod level, more detailed metrics for in-place resize events, and integration with autoscalers like the Vertical Pod Autoscaler (VPA). The goal is to make dynamic resource adjustment fully automated and seamless. For now, users can take advantage of the feature to reduce manual per-container tuning and achieve faster response to workload changes. Check the official Kubernetes documentation for the latest recommendations on using in-place pod-level scaling in production.