-
Notifications
You must be signed in to change notification settings - Fork 46
feat: Add support for uvicorn workers for llama-stack #211
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
5c03fe0 to
af4d5f7
Compare
|
@mergify rebase |
✅ Branch has been successfully rebased |
af4d5f7 to
b2c11a7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the premise of adding more run.yaml into the CRD and i think we should think about how we want to lay out things.
Should we go with:
DistributionServer *DistributionServerSpecAnd
type DistributionServerSpec struct {
Workers *int32
}
Something like this, i think we need to a way to encapsulate worker into another section?
Thoughts?
EDIT: ok discussed offline:
- we will revisit this design better when we introduce config properties in the CRD, this will be a new CRD version
- we just need to make sure the
workersmatch the resource/request
| Distribution DistributionType `json:"distribution"` | ||
| ContainerSpec ContainerSpec `json:"containerSpec,omitempty"` | ||
| PodOverrides *PodOverrides `json:"podOverrides,omitempty"` // Optional pod-level overrides | ||
| // Workers configures the number of uvicorn worker processes to run. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we link https://fastapi.tiangolo.com/deployment/server-workers/ for more doc, to be better understand the usage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
5f7df7e to
dafd7d7
Compare
Updated |
leseb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one nit thanks for the follow up
| // resolveContainerResources ensures the container always has CPU and memory | ||
| // requests defined so that HPAs using utilization metrics can function. | ||
| func resolveContainerResources(spec llamav1alpha1.ContainerSpec) corev1.ResourceRequirements { | ||
| func resolveContainerResources(spec llamav1alpha1.ContainerSpec, workers int32, workersSet bool) corev1.ResourceRequirements { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to log somewhere that we are setting the resources based out of the value of the workers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Included it in api docs as well as added a log.
dafd7d7 to
dd380e4
Compare
|
@mergify rebase |
✅ Branch has been successfully rebased |
dd380e4 to
853140b
Compare
leseb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
final nit 🙏🏻
controllers/resource_helper.go
Outdated
| 2) llama stack run /etc/llama-stack/run.yaml ;; | ||
| *) echo "Invalid version code: $VERSION_CODE, using new CLI"; llama stack run /etc/llama-stack/run.yaml ;; | ||
| 2) exec uvicorn llama_stack.core.server.server:create_app --host 0.0.0.0 --port "$PORT" --workers "$WORKERS" --factory ;; | ||
| *) exec uvicorn llama_stack.core.server.server:create_app --host 0.0.0.0 --port "$PORT" --workers "$WORKERS" --factory ;; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add the log back in here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Np, done
853140b to
18523da
Compare
|
@mergify rebase |
Signed-off-by: Vaishnavi Hire <[email protected]>
✅ Branch has been successfully rebased |
18523da to
6b78ff5
Compare
Fixes #164
Expose
workerfield inLlamaStackDistributionCRExample CR