Conversation

@VaishnaviHire (Collaborator) commented Dec 5, 2025

Fixes #164
Exposes a workers field in the LlamaStackDistribution CR.

Example CR:

spec:
  server:
    containerSpec:
      env:
        - name: OLLAMA_INFERENCE_MODEL
          value: 'llama3.2:1b'
        - name: OLLAMA_URL
          value: 'http://ollama-server-service.ollama-dist.svc.cluster.local:11434'
      name: llama-stack
    distribution:
      name: starter
    workers: 2
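
For context, a minimal sketch of how the new field might be surfaced in the operator's Go API types. This is not the PR's actual code: the struct name and the placeholder types are assumptions; only the Workers doc comment and the *int32 shape come from the discussion below.

package v1alpha1 // hypothetical package name for the CRD types

// Placeholder types so the sketch compiles; the real operator defines these.
type (
	DistributionType struct {
		Name string `json:"name"`
	}
	ContainerSpec struct {
		Name string `json:"name"`
	}
)

// ServerSpec sketches the server section of the CR above.
type ServerSpec struct {
	Distribution  DistributionType `json:"distribution"`
	ContainerSpec ContainerSpec    `json:"containerSpec,omitempty"`
	// Workers configures the number of uvicorn worker processes to run.
	// See https://fastapi.tiangolo.com/deployment/server-workers/.
	// +optional
	Workers *int32 `json:"workers,omitempty"`
}

With workers: 2 in the CR above, the operator passes --workers 2 to uvicorn, per the entrypoint change later in this review.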

@VaishnaviHire (Collaborator Author) commented:

@mergify rebase

mergify bot commented Dec 8, 2025

rebase

✅ Branch has been successfully rebased

@leseb (Collaborator) left a comment:

I think this is the premise of adding more of run.yaml into the CRD, and I think we should think about how we want to lay things out.

Should we go with:

DistributionServer *DistributionServerSpec

And

type DistributionServerSpec struct {
  Workers  *int32
}

Something like this? I think we need a way to encapsulate workers into another section.

Thoughts?

EDIT: OK, discussed offline:

  • we will revisit this design when we introduce config properties in the CRD; that will be a new CRD version
  • we just need to make sure the workers value matches the resources/requests

Distribution DistributionType `json:"distribution"`
ContainerSpec ContainerSpec `json:"containerSpec,omitempty"`
PodOverrides *PodOverrides `json:"podOverrides,omitempty"` // Optional pod-level overrides
// Workers configures the number of uvicorn worker processes to run.
Collaborator commented:

Should we link https://fastapi.tiangolo.com/deployment/server-workers/ for more docs, to better explain the usage?

@VaishnaviHire (Collaborator Author) replied:

Done

@VaishnaviHire force-pushed the add_workers branch 3 times, most recently from 5f7df7e to dafd7d7 on December 9, 2025 at 17:31
@VaishnaviHire (Collaborator Author) commented:

> EDIT: OK, discussed offline:
>
>   • we will revisit this design when we introduce config properties in the CRD; that will be a new CRD version
>   • we just need to make sure the workers value matches the resources/requests

Updated

@leseb (Collaborator) left a comment:

One nit; thanks for the follow-up.

  // resolveContainerResources ensures the container always has CPU and memory
  // requests defined so that HPAs using utilization metrics can function.
- func resolveContainerResources(spec llamav1alpha1.ContainerSpec) corev1.ResourceRequirements {
+ func resolveContainerResources(spec llamav1alpha1.ContainerSpec, workers int32, workersSet bool) corev1.ResourceRequirements {
Collaborator commented:

I think we need to log somewhere that we are setting the resources based on the value of workers.

@VaishnaviHire (Collaborator Author) replied:

Included it in the API docs and added a log.
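
For illustration, a minimal sketch of what the resulting helper could look like, assuming default per-worker requests are multiplied by the worker count and a log line records the derivation. The base quantities, package name, and the simplified signature (taking corev1.ResourceRequirements instead of the operator's ContainerSpec) are assumptions, not the PR's actual implementation:

package controllers // hypothetical package name

import (
	"log"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// Illustrative per-worker baselines; not the operator's real defaults.
const (
	baseCPUMilli    = 500               // 500m CPU per worker
	baseMemoryBytes = 512 * 1024 * 1024 // 512Mi per worker
)

// resolveContainerResources (sketch): keep explicit user requests untouched,
// otherwise derive CPU/memory requests from the worker count so that HPAs
// using utilization metrics can function.
func resolveContainerResources(user corev1.ResourceRequirements, workers int32, workersSet bool) corev1.ResourceRequirements {
	if user.Requests != nil {
		return user // the user set requests explicitly; respect them
	}
	n := int64(1)
	if workersSet && workers > 0 {
		n = int64(workers)
	}
	log.Printf("setting default container resources based on workers=%d", n)
	return corev1.ResourceRequirements{
		Requests: corev1.ResourceList{
			corev1.ResourceCPU:    *resource.NewMilliQuantity(baseCPUMilli*n, resource.DecimalSI),
			corev1.ResourceMemory: *resource.NewQuantity(baseMemoryBytes*n, resource.BinarySI),
		},
	}
}

The real signature, per the diff above, takes llamav1alpha1.ContainerSpec; the multiplication and the log line here only illustrate the behavior discussed in this thread.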

@VaishnaviHire (Collaborator Author) commented:

@mergify rebase

mergify bot commented Dec 10, 2025

rebase

✅ Branch has been successfully rebased

@leseb (Collaborator) left a comment:

final nit 🙏🏻

- 2) llama stack run /etc/llama-stack/run.yaml ;;
- *) echo "Invalid version code: $VERSION_CODE, using new CLI"; llama stack run /etc/llama-stack/run.yaml ;;
+ 2) exec uvicorn llama_stack.core.server.server:create_app --host 0.0.0.0 --port "$PORT" --workers "$WORKERS" --factory ;;
+ *) exec uvicorn llama_stack.core.server.server:create_app --host 0.0.0.0 --port "$PORT" --workers "$WORKERS" --factory ;;
Collaborator commented:

Can we add the log back in here?

@VaishnaviHire (Collaborator Author) replied:

Np, done

@VaishnaviHire (Collaborator Author) commented:

@mergify rebase

mergify bot commented Dec 11, 2025

rebase

✅ Branch has been successfully rebased

Development

Successfully merging this pull request may close these issues:

Add LLS worker capability