When the endpoint is first created, deployment works fine. However, if a model is already deployed on the endpoint, deploying a new version of the model throws the following error:
google.api_core.exceptions.FailedPrecondition: 400 There might be another deployed model or a failed deployed model that hasn't been cleaned up under the same private endpoint, please try again later or create another endpoint.
I also tried the gcloud ai command line:
gcloud ai endpoints deploy-model $ENDPOINT_NAME --project=$PROJECT_ID --region=us-central1 --model=$MODEL_NAME --display-name=$DISPLAY_NAME
The docstring of the PrivateEndpoint.deploy method states the following for the traffic_percentage argument:
traffic_percentage (int):
Optional. Desired traffic to newly deployed model.
Defaults to 0 if there are pre-existing deployed models.
Defaults to 100 if there are no pre-existing deployed models.
Defaults to 100 for PSA based private endpoint.
Negative values should not be provided. Traffic of previously
deployed models at the endpoint will be scaled down to
accommodate new deployed model's traffic.
Should not be provided if traffic_split is provided.
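For reference, this is my reading of the defaults described in the docstring, written as a minimal sketch. The function name and its two boolean parameters are hypothetical and are not part of the SDK; the precedence of the PSA rule over the other two is my assumption.

```python
def default_traffic_percentage(has_existing_models: bool, is_psa: bool) -> int:
    """Hypothetical helper: my reading of the docstring defaults, not SDK code."""
    # Assumption: the PSA rule takes precedence over the other two defaults.
    if is_psa:
        return 100
    # Otherwise the default depends on whether a model is already deployed.
    return 0 if has_existing_models else 100


# Under this reading, a PSA-based private endpoint with an existing model
# would still route all traffic to the newly deployed model by default.
psa_default = default_traffic_percentage(has_existing_models=True, is_psa=True)
```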
So it appears that replacing a model with another model should be supported. Setting traffic_percentage to 0, or omitting the argument entirely, produces the same error as before. I also tried setting traffic_percentage to None, and got:
if traffic_percentage > 100:
^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: '>' not supported between instances of 'NoneType' and 'int'
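The TypeError can be reproduced without the SDK. The sketch below uses a hypothetical stand-in function that performs only the comparison shown in the traceback; the real SDK validation may do more, and the ValueError message here is invented for illustration.

```python
def check_traffic_percentage(traffic_percentage):
    # Hypothetical stand-in for the SDK check shown in the traceback.
    if traffic_percentage > 100:
        # Illustrative message only; not copied from the SDK.
        raise ValueError("traffic_percentage cannot be greater than 100")
    return traffic_percentage


try:
    check_traffic_percentage(None)
except TypeError as exc:
    # Python 3 cannot order None against an int, so the comparison fails
    # before any range check is applied.
    error_message = str(exc)
```

So passing None explicitly is not equivalent to leaving the argument out: the comparison runs on whatever value is passed.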
Expected behavior: The PrivateEndpoint should be able to deploy a new model version that takes over the traffic of older model versions.
Other considerations (possibly out of scope for this ticket): I need to access a MemoryStore Redis instance on GCP, and so far only deploying both the endpoint and Redis with PSA has worked.
Context: I am deploying models to a PrivateEndpoint that attaches to a VPC network, using a deployment script or, alternatively, endpoint.deploy.