bug(k8s): Container resource limits and requests cannot be configured separately #1065

@cyejing

Problem

In server/opensandbox_server/services/k8s/provider_common.py, the _build_main_container function always sets Kubernetes requests equal to limits:

# Problematic code (before fix)
resources = V1ResourceRequirements(
    limits=translated_limits,
    requests=translated_limits,  # always mirrors limits — no way to set independently
)

This means users have no way to configure requests and limits separately when using the Kubernetes runtime.

Impact

In Kubernetes, requests and limits serve distinct purposes:

  • requests: The guaranteed resource amount used by the scheduler to place the Pod on a node.
  • limits: The hard cap enforced at runtime via cgroups.

Forcing requests == limits has the following consequences:

  1. QoS class is always Guaranteed (when both cpu and memory limits are set): users cannot create Burstable sandboxes, which need a request below its limit for at least one resource, so resource utilization is less flexible.
  2. Scheduling is overly conservative: a node must have free allocatable capacity equal to the full limits value before a sandbox can be scheduled, even if a lower requests value would be sufficient.
  3. No burst headroom: containers cannot be scheduled at a lower requests value and then temporarily burst up to limits, which is the standard Kubernetes pattern for workloads with variable resource consumption.
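
To make the first consequence concrete, here is a minimal sketch of how the QoS class follows from the requests/limits pair for a single-container Pod. The function name `qos_class` is hypothetical and not part of OpenSandbox, and the comparison is done on the raw strings as a simplification (Kubernetes compares parsed quantities, so "500m" and "0.5" would be equal there).

```python
from typing import Mapping, Optional


def qos_class(
    requests: Optional[Mapping[str, str]],
    limits: Optional[Mapping[str, str]],
) -> str:
    """Approximate the QoS class of a single-container Pod (illustrative only)."""
    requests = dict(requests or {})
    limits = dict(limits or {})
    if not requests and not limits:
        return "BestEffort"
    # A missing request defaults to the limit for that resource.
    effective_requests = {**limits, **requests}
    # Guaranteed needs both cpu and memory limits, each equal to its request.
    guaranteed = all(
        res in limits and effective_requests.get(res) == limits[res]
        for res in ("cpu", "memory")
    )
    return "Guaranteed" if guaranteed else "Burstable"


mirrored = {"cpu": "2", "memory": "2Gi"}
print(qos_class(mirrored, mirrored))
print(qos_class({"cpu": "0.5", "memory": "512Mi"}, mirrored))
```

With the current behavior, only the first call shape is reachable from the create-sandbox API.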

Root Cause

The resourceLimits field in the create-sandbox API only accepts cpu and memory keys. There is no way to pass separate request values, so the implementation unconditionally reuses translated_limits for both limits and requests.

Proposed Fix

Add two optional fields — cpuRequest and memoryRequest — to the resourceLimits parameter. When provided, use them for the requests side of V1ResourceRequirements; otherwise fall back to the existing behavior (requests == limits) for full backward compatibility.

# Separate the request overrides from the limit keys so that only limits
# are passed through the existing translation helper.
request_keys = ("cpuRequest", "memoryRequest")
raw_limits = resource_limits or {}
cpu_request = raw_limits.get("cpuRequest")
memory_request = raw_limits.get("memoryRequest")
limits_only = {k: v for k, v in raw_limits.items() if k not in request_keys}
translated_limits = _translate_resource_limits_for_k8s(limits_only)

resources = None
if translated_limits:
    # Start from a copy of the limits so that any resource without an explicit
    # request keeps the existing requests == limits behavior.
    translated_requests = dict(translated_limits)
    # Request values are used as given (not translated), so they are assumed
    # to already be valid Kubernetes quantities.
    if cpu_request:
        translated_requests["cpu"] = cpu_request
    if memory_request:
        translated_requests["memory"] = memory_request
    resources = V1ResourceRequirements(
        limits=translated_limits,
        requests=translated_requests,
    )
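
One thing the snippet above does not check is that a request stays at or below its limit; Kubernetes rejects a container spec where a request exceeds the corresponding limit. A server-side pre-check could look roughly like the sketch below. Both `parse_quantity` and `requests_exceeding_limits` are hypothetical helpers written for illustration, and the parser is a simplified take on Kubernetes quantity syntax, not a full implementation.

```python
from decimal import Decimal
from typing import List, Mapping

# Simplified suffix tables for Kubernetes-style quantities (illustrative).
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40, "Pi": 2**50, "Ei": 2**60}
_DECIMAL = {
    "n": Decimal("1e-9"), "u": Decimal("1e-6"), "m": Decimal("1e-3"),
    "k": Decimal("1e3"), "M": Decimal("1e6"), "G": Decimal("1e9"),
    "T": Decimal("1e12"), "P": Decimal("1e15"), "E": Decimal("1e18"),
}


def parse_quantity(value: str) -> Decimal:
    """Convert a quantity such as '500m' or '512Mi' to a plain number."""
    text = value.strip()
    if text[-2:] in _BINARY:
        return Decimal(text[:-2]) * _BINARY[text[-2:]]
    if text[-1:] in _DECIMAL:
        return Decimal(text[:-1]) * _DECIMAL[text[-1:]]
    return Decimal(text)


def requests_exceeding_limits(
    requests: Mapping[str, str], limits: Mapping[str, str]
) -> List[str]:
    """Return the resource names whose request is larger than its limit."""
    return sorted(
        name
        for name, req in requests.items()
        if name in limits and parse_quantity(req) > parse_quantity(limits[name])
    )
```

Returning the offending resource names, instead of raising immediately, would let the API layer report all invalid fields in one error response.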

Example usage after fix

{
  "resourceLimits": {
    "cpu": "2",
    "memory": "2Gi",
    "cpuRequest": "0.5",
    "memoryRequest": "512Mi"
  }
}

This would produce a Burstable pod that is scheduled with 0.5 CPU / 512Mi but can burst up to 2 CPU / 2Gi.
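
The mapping from this payload to the two Kubernetes dicts can be sketched with plain dictionaries, without the Kubernetes client. The helper name `split_limits_and_requests` is hypothetical, and the sketch skips `_translate_resource_limits_for_k8s` because its behavior is not shown in this issue; it only illustrates the key splitting and the fallback.

```python
from typing import Dict, Mapping, Optional, Tuple

# API field name -> Kubernetes resource name (field names as proposed above).
REQUEST_KEYS = {"cpuRequest": "cpu", "memoryRequest": "memory"}


def split_limits_and_requests(
    resource_limits: Optional[Mapping[str, str]],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split a resourceLimits payload into (limits, requests)."""
    raw = dict(resource_limits or {})
    limits = {k: v for k, v in raw.items() if k not in REQUEST_KEYS}
    if not limits:
        # Mirrors the proposed fix: without limits, no resources are set.
        return {}, {}
    # Default every request to its limit, then apply explicit overrides.
    requests = dict(limits)
    for api_key, k8s_key in REQUEST_KEYS.items():
        if raw.get(api_key):
            requests[k8s_key] = raw[api_key]
    return limits, requests


payload = {"cpu": "2", "memory": "2Gi", "cpuRequest": "0.5", "memoryRequest": "512Mi"}
limits, requests = split_limits_and_requests(payload)
print(limits)
print(requests)
```

A payload without the two new keys yields identical limits and requests, which is the backward-compatible path.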

Affected Component

  • server/opensandbox_server/services/k8s/provider_common.py — Kubernetes runtime only; Docker runtime is unaffected.

Additional Notes

  • The fix is fully backward compatible: existing callers that do not pass cpuRequest/memoryRequest continue to get requests == limits behavior.
  • The _translate_resource_limits_for_k8s function should only process limit keys; including request-specific keys in it could cause unintended key translation.
