JointInferenceService controller enhancement #445
Conversation
There are CI issues that remain to be resolved; see https://github.com/kubeedge/sedna/actions/runs/11249926500/job/31277758880?pr=445
Force-pushed from 318f6fc to 3342955
/lgtm
// Check the changes in specific fields and perform corresponding operations.
if !reflect.DeepEqual(oldService.Spec.CloudWorker, newService.Spec.CloudWorker) {
	// If the cloud inference service changes, perform the corresponding update operation.
	return c.updateCloudWorker(newService)
We can't return here directly; otherwise the subsequent code (fetching the cloud service address and updating the edge worker) will never execute. Modify the updateInferenceServices function as follows:
func (c *Controller) updateInferenceServices(old, cur interface{}) error {
oldService := old.(*sednav1.JointInferenceService)
newService := cur.(*sednav1.JointInferenceService)
// Check if the cloud worker configuration has changed
if !reflect.DeepEqual(oldService.Spec.CloudWorker, newService.Spec.CloudWorker) {
// Update cloud worker and log any errors
if err := c.updateCloudWorker(newService); err != nil {
klog.Errorf("Failed to update cloud worker for service %s/%s: %v", newService.Namespace, newService.Name, err)
}
}
// Retrieve the address of the cloud inference service
var bigModelHost string
svc, err := c.kubeClient.CoreV1().Services(oldService.Namespace).Get(context.Background(),
strings.ToLower(oldService.Name+"-"+jointInferenceForCloud), metav1.GetOptions{})
if err != nil {
if errors.IsNotFound(err) {
// If the service does not exist, create a new one and retrieve its address
klog.Info("Cloud service not found, creating new service...")
bigModelHost, err = runtime.CreateEdgeMeshService(c.kubeClient, oldService, jointInferenceForCloud, BigModelPort)
if err != nil {
klog.Errorf("Failed to create EdgeMesh service for service %s/%s: %v", oldService.Namespace, oldService.Name, err)
}
} else {
klog.Errorf("Failed to get cloud service %s/%s: %v", oldService.Namespace, oldService.Name, err)
}
} else {
bigModelHost = fmt.Sprintf("%s.%s", svc.Name, svc.Namespace)
}
// Check if the edge worker configuration has changed
if !reflect.DeepEqual(oldService.Spec.EdgeWorker, newService.Spec.EdgeWorker) {
// Update edge worker and log any errors
if err := c.updateEdgeWorker(newService, bigModelHost); err != nil {
klog.Errorf("Failed to update edge worker for service %s/%s: %v", newService.Namespace, newService.Name, err)
}
}
return nil
}
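The function above hinges on reflect.DeepEqual to detect which section of the spec changed, so only the affected worker is updated. A minimal, self-contained sketch of that change-detection pattern is below; WorkerSpec and ServiceSpec are simplified stand-ins for the real sednav1.JointInferenceServiceSpec types.

```go
package main

import (
	"fmt"
	"reflect"
)

// Hypothetical stand-ins for the CloudWorker/EdgeWorker spec sections;
// the real types live in sednav1.JointInferenceServiceSpec.
type WorkerSpec struct {
	Image    string
	Replicas int
}

type ServiceSpec struct {
	CloudWorker WorkerSpec
	EdgeWorker  WorkerSpec
}

// changedSections reports which worker sections differ between two specs,
// mirroring the DeepEqual checks in updateInferenceServices.
func changedSections(old, cur ServiceSpec) []string {
	var changed []string
	if !reflect.DeepEqual(old.CloudWorker, cur.CloudWorker) {
		changed = append(changed, "cloud")
	}
	if !reflect.DeepEqual(old.EdgeWorker, cur.EdgeWorker) {
		changed = append(changed, "edge")
	}
	return changed
}

func main() {
	old := ServiceSpec{CloudWorker: WorkerSpec{Image: "big:v1", Replicas: 1}}
	cur := old
	cur.CloudWorker.Image = "big:v2" // only the cloud section changes
	fmt.Println(changedSections(old, cur)) // prints [cloud]
}
```

Because DeepEqual compares the whole sub-struct, any field the user edits in that section (image, resources, env) triggers exactly one update path, and an unchanged section is left alone.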
modified.
Force-pushed from 26befb9 to 55b7d88
sednav1 "github.com/kubeedge/sedna/pkg/apis/sedna/v1alpha1"
fakeseednaclientset "github.com/kubeedge/sedna/pkg/client/clientset/versioned/fake"
"github.com/kubeedge/sedna/pkg/globalmanager/config"
"github.com/kubeedge/sedna/pkg/globalmanager/runtime"
appsv1 "k8s.io/api/apps/v1"
v1 "k8s.io/api/core/v1"
k8serrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/labels"
"k8s.io/apimachinery/pkg/watch"
kubernetesfake "k8s.io/client-go/kubernetes/fake"
"k8s.io/client-go/kubernetes/scheme"
v1core "k8s.io/client-go/kubernetes/typed/core/v1"
corelisters "k8s.io/client-go/listers/apps/v1"
corelistersv1 "k8s.io/client-go/listers/core/v1"
"k8s.io/client-go/tools/record"
"k8s.io/client-go/util/workqueue"
Grouping of imports.
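A conventional goimports-style arrangement of the block under review, with third-party k8s.io packages in one group and project-local kubeedge/sedna packages in another (the exact grouping is a project convention, not enforced by the compiler), would look like:

```go
import (
	appsv1 "k8s.io/api/apps/v1"
	v1 "k8s.io/api/core/v1"
	k8serrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/labels"
	"k8s.io/apimachinery/pkg/watch"
	kubernetesfake "k8s.io/client-go/kubernetes/fake"
	"k8s.io/client-go/kubernetes/scheme"
	v1core "k8s.io/client-go/kubernetes/typed/core/v1"
	corelisters "k8s.io/client-go/listers/apps/v1"
	corelistersv1 "k8s.io/client-go/listers/core/v1"
	"k8s.io/client-go/tools/record"
	"k8s.io/client-go/util/workqueue"

	sednav1 "github.com/kubeedge/sedna/pkg/apis/sedna/v1alpha1"
	fakeseednaclientset "github.com/kubeedge/sedna/pkg/client/clientset/versioned/fake"
	"github.com/kubeedge/sedna/pkg/globalmanager/config"
	"github.com/kubeedge/sedna/pkg/globalmanager/runtime"
)
```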
modified.
Force-pushed from 55b7d88 to e02a997
	"BIG_MODEL_BIND_PORT": strconv.Itoa(int(bigModelPort)),
}
workerParam.WorkerType = workerType
workerParam.HostNetwork = false // The cloud does not need HostNetwork.
Leave the hostNetwork configuration to the user; there is no need to force a value here in the controller.
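The suggested change can be sketched as follows: instead of the controller hard-coding workerParam.HostNetwork, it copies the value from the pod template the user wrote in the CR. The PodTemplate and WorkerParam types here are simplified stand-ins for corev1.PodSpec and runtime.WorkerParam, not the real API.

```go
package main

import "fmt"

// Simplified stand-in for the corev1.PodSpec inside the worker template.
type PodTemplate struct {
	HostNetwork bool // set by the user in the CR's template.spec
}

// Simplified stand-in for runtime.WorkerParam.
type WorkerParam struct {
	WorkerType  string
	HostNetwork bool
}

// newWorkerParam honors the user-supplied template value rather than
// forcing hostNetwork on (edge) or off (cloud) in the controller.
func newWorkerParam(workerType string, tmpl PodTemplate) WorkerParam {
	return WorkerParam{
		WorkerType:  workerType,
		HostNetwork: tmpl.HostNetwork, // user decides; no forced value
	}
}

func main() {
	edge := newWorkerParam("edge", PodTemplate{HostNetwork: true})
	cloud := newWorkerParam("cloud", PodTemplate{})
	fmt.Println(edge.HostNetwork, cloud.HostNetwork) // prints true false
}
```

With this approach the Go zero value (false) is the default, so existing specs that never mention hostNetwork keep their current behavior, and users who need it opt in via YAML.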
	"LC_SERVER": c.cfg.LC.Server,
}
workerParam.WorkerType = workerType
workerParam.HostNetwork = true // Edge nodes need HostNetwork.
ditto
After these pull requests are merged, the current joint-inference examples need to add the hostNetwork configuration in the YAML, similar to the following:
CLOUD_NODE="cloud-node-name"
EDGE_NODE="edge-node-name"
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: JointInferenceService
metadata:
name: helmet-detection-inference-example
namespace: default
spec:
edgeWorker:
model:
name: "helmet-detection-inference-little-model"
hardExampleMining:
name: "IBT"
parameters:
- key: "threshold_img"
value: "0.9"
- key: "threshold_box"
value: "0.9"
template:
spec:
nodeName: $EDGE_NODE
dnsPolicy: ClusterFirstWithHostNet
hostNetwork: true
containers:
- image: kubeedge/sedna-example-joint-inference-helmet-detection-little:v0.3.0
imagePullPolicy: IfNotPresent
name: little-model
env: # user defined environments
- name: input_shape
value: "416,736"
- name: "video_url"
value: "rtsp://localhost/video"
- name: "all_examples_inference_output"
value: "/data/output"
- name: "hard_example_cloud_inference_output"
value: "/data/hard_example_cloud_inference_output"
- name: "hard_example_edge_inference_output"
value: "/data/hard_example_edge_inference_output"
resources: # user defined resources
requests:
memory: 64M
cpu: 100m
limits:
memory: 2Gi
volumeMounts:
- name: outputdir
mountPath: /data/
volumes: # user defined volumes
- name: outputdir
hostPath:
# user must create the directory in host
path: /joint_inference/output
type: Directory
cloudWorker:
model:
name: "helmet-detection-inference-big-model"
template:
spec:
nodeName: $CLOUD_NODE
dnsPolicy: ClusterFirstWithHostNet
containers:
- image: kubeedge/sedna-example-joint-inference-helmet-detection-big:v0.3.0
name: big-model
imagePullPolicy: IfNotPresent
env: # user defined environments
- name: "input_shape"
value: "544,544"
resources: # user defined resources
requests:
memory: 2Gi
EOF
/lgtm
…e a Deployment resource instead of a Pod resource, and let the Deployment controller manage the number of Pods. Automatically update the Deployment when the JointInferenceService CRD is modified. Add a test file to ensure the correctness of the joint inference controller. Signed-off-by: SherlockShemol <[email protected]>
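The core idea of the commit, declaring a desired replica count on a Deployment and letting the Deployment controller converge the Pods, can be sketched with simplified types (the real controller builds an appsv1.Deployment via client-go; the Deployment struct and reconcile helper here are illustrative stand-ins):

```go
package main

import "fmt"

// Simplified stand-in for appsv1.Deployment: the joint-inference
// controller only declares the desired replica count; the Deployment
// controller is responsible for converging the actual Pods.
type Deployment struct {
	Name     string
	Replicas int
}

// reconcile returns how many pods must be created (positive) or
// deleted (negative) to match the deployment's desired replica count.
func reconcile(d Deployment, runningPods int) int {
	return d.Replicas - runningPods
}

func main() {
	d := Deployment{Name: "helmet-detection-edge", Replicas: 3}
	fmt.Println(reconcile(d, 1)) // prints 2: two pods to create
	fmt.Println(reconcile(d, 5)) // prints -2: two pods to delete
}
```

Managing Pods through a Deployment also means a crashed worker Pod is recreated automatically, which direct Pod creation by the CRD controller would not provide.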
Force-pushed from e02a997 to bfa3a65
/lgtm
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: jaypume. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
What type of PR is this?
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #