Description
I'm using the Python SDK to query the model, like:
response = w.serving_endpoints.query(
    name='databricks-meta-llama-3-3-70b-instruct',
    messages=[
        ChatMessage(role=ChatMessageRole.SYSTEM, content=systemPrompt),
        ChatMessage(role=ChatMessageRole.USER, content=userContent),
    ],
    temperature=0.0,
)
I'm facing a timeout error: "TimeoutError: Timed out after 0:05:00"
I could not find how to change the timeout value.
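The duration in the message looks like the hours:minutes:seconds form that Python uses when printing a `timedelta`, which would make the limit five minutes (300 seconds). A minimal sketch of that reading, assuming the message really uses that format; `parse_duration` is a hypothetical helper written for illustration, not part of the SDK:

```python
from datetime import timedelta


def parse_duration(text: str) -> timedelta:
    """Parse an 'H:MM:SS' duration string into a timedelta (hypothetical helper)."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


# The error text reported above; the duration is the last space-separated token.
message = "Timed out after 0:05:00"
limit = parse_duration(message.rsplit(" ", 1)[-1])

print(limit.total_seconds())  # 300.0
```

Any replacement timeout would therefore need to be larger than 300 seconds to make a difference for this call.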
Expected behavior
Clear documentation about the timeout, and a way to change its default value.
Is it a regression?
I don't know, but I found this related PR.
Other Information
Looking at the source code, I found the following workaround:
from databricks.sdk import WorkspaceClient, client
from databricks.sdk.service.serving import ChatMessage, ChatMessageRole

# Raise the HTTP timeout to one hour (the value is in seconds).
w = WorkspaceClient(config=client.Config(
    http_timeout_seconds=60*60
))

response = w.serving_endpoints.query(
    name='external-gpt-4o-mini',
    messages=[
        ChatMessage(role=ChatMessageRole.SYSTEM, content=systemPrompt),
        ChatMessage(role=ChatMessageRole.USER, content=userContent),
    ],
    temperature=0.0,
)
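To avoid hard-coding the one-hour value, the timeout could be read from the environment and passed to `http_timeout_seconds`. This is only a sketch: the function `timeout_from_env` and the variable name `QUERY_TIMEOUT_SECONDS` are hypothetical choices made for illustration and are not defined by the SDK.

```python
import os


def timeout_from_env(var: str = "QUERY_TIMEOUT_SECONDS", default: int = 60 * 60) -> int:
    """Return a positive timeout in seconds from the environment, else the default.

    Both the variable name and this helper are hypothetical, not SDK features.
    """
    raw = os.environ.get(var)
    if raw is None or raw.strip() == "":
        return default
    seconds = int(raw)  # raises ValueError if the value is not an integer
    if seconds <= 0:
        raise ValueError(f"{var} must be a positive number of seconds, got {raw!r}")
    return seconds


# Intended use with the workaround above (needs databricks-sdk, so left commented):
# w = WorkspaceClient(config=client.Config(http_timeout_seconds=timeout_from_env()))
```

With the variable unset, the helper falls back to the same 3600 seconds used in the workaround; setting it to, say, `120` would shorten the limit without a code change.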