With flow control, EPP's CPU usage reaches 100%

When I was using the experimental flow control feature, based on my observation of the pod metrics, I noticed that the EPP's CPU usage would quickly reach nearly 100% after receiving the first request, approaching its limit. This situation will continue even if EPP doesn't receive any requests.

<img width="840" height="240" alt="Image" src="https://github.com/user-attachments/assets/c49d12b8-417f-4b02-a443-51c2b2d12b54" />

At the same time, I occasionally observe that the EPP restarts due to health check failures, not sure if it is affected by this phenomenon.

By inspecting the code, I found that it might be due to the ShardProcessor continuously invoking `runtime.Gosched()`. I can understand the design intent, but it seems that calling `runtime.Gosched()` too frequently could lead to high CPU usage.

https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/4752d66e372944df44d509d678a6b972fcb002f1/pkg/epp/flowcontrol/controller/internal/processor.go#L202

I attempted to make the following changes by introducing a ticker with a small time interval to avoid continuous calls. 

```go
func (sp *ShardProcessor) Run(ctx context.Context) {
	sp.logger.V(logutil.DEFAULT).Info("Shard processor run loop starting.")
	defer sp.logger.V(logutil.DEFAULT).Info("Shard processor run loop stopped.")

	sp.wg.Add(1)
	go sp.runCleanupSweep(ctx)

	// Create a ticker for periodic dispatch attempts to avoid tight loops
	dispatchTicker := sp.clock.NewTicker(time.Millisecond) // we may adjust interval as needed
	defer dispatchTicker.Stop()

	// ...
	for {
		select {
		case <-ctx.Done():
			sp.shutdown()
			sp.wg.Wait()
			return
		case item, ok := <-sp.enqueueChan:
			if !ok { // Should not happen in practice, but is a clean shutdown signal.
				sp.shutdown()
				sp.wg.Wait()
				return
			}
			// This is a safeguard against logic errors in the distributor.
			if item == nil {
				sp.logger.Error(nil, "Logic error: nil item received on shard processor enqueue channel, ignoring.")
				continue
			}
			sp.enqueue(item)
			sp.dispatchCycle(ctx) // Process immediately when an item arrives
		case <-dispatchTicker.C():
			sp.dispatchCycle(ctx) // Periodically attempt to dispatch from queues
		}
	}
}
``` 

This can reduce the CPU usage of EPP to about 2.5% when it is idle.

<img width="749" height="230" alt="Image" src="https://github.com/user-attachments/assets/b4539dc3-37f9-401c-8e04-38d05b6ea04d" />

But this introduces a behavior change: ShardProcessor will not continuously attempt to perform dispatch. However, I think it might be fine, because the saturation detector is unlikely to change its detection result within a very short period.

Some opinions from the maintainer are needed: 
* Should this phenomenon be considered a problem? (since I have not yet tested its impact on EPP performance)
* Does the aforementioned modification still align with the design intent of Flow control?

cc @LukeAVanDrie 

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

With flow control, EPP's CPU usage reaches 100% #1847

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

With flow control, EPP's CPU usage reaches 100% #1847

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions