
[ISSUE] Timeout at query serving model #860

Open
@JoseNogueiraFH

Description

I'm using the Python SDK to query the model, like:

response = w.serving_endpoints.query(
    name='databricks-meta-llama-3-3-70b-instruct',
    messages=[
        ChatMessage(role=ChatMessageRole.SYSTEM, content=systemPrompt),
        ChatMessage(role=ChatMessageRole.USER, content=userContent)
    ],
    temperature=0.0,
)

The call fails with a timeout error: "TimeoutError: Timed out after 0:05:00"

I could not find a way to change the timeout value.

Expected behavior
Clear documentation on the timeout, and a way to change its default value.

Is it a regression?
I don't know, but I found this related PR.

Other Information

Looking at the source code, I found the following workaround:

from databricks.sdk import WorkspaceClient, client
from databricks.sdk.service.serving import ChatMessage, ChatMessageRole

w = WorkspaceClient(config=client.Config(
    http_timeout_seconds=60*60
))
response = w.serving_endpoints.query(
    name='external-gpt-4o-mini',
    messages=[
        ChatMessage(role=ChatMessageRole.SYSTEM, content=systemPrompt),
        ChatMessage(role=ChatMessageRole.USER, content=userContent)
    ],
    temperature=0.0,
)
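
If the timeout needs to differ between environments, the workaround above could be wrapped so the value is read from configuration instead of being hard-coded. This is only a sketch: the helper names `timeout_from_env` and `build_client` and the environment variable `QUERY_HTTP_TIMEOUT_SECONDS` are hypothetical and not part of the SDK. The only SDK call used is the same `client.Config(http_timeout_seconds=...)` shown in the workaround.

```python
import os


def timeout_from_env(var_name="QUERY_HTTP_TIMEOUT_SECONDS", default=60 * 60):
    """Return the HTTP timeout in seconds.

    Reads the (hypothetical) environment variable `var_name` and falls back
    to `default` when it is unset or empty.
    """
    raw = os.environ.get(var_name, "").strip()
    if not raw:
        return default
    value = int(raw)  # raises ValueError on non-numeric input
    if value <= 0:
        raise ValueError(f"{var_name} must be a positive number of seconds")
    return value


def build_client(timeout_seconds):
    """Create a WorkspaceClient the same way as the workaround above."""
    # Imported lazily so timeout_from_env can be used without the SDK installed.
    from databricks.sdk import WorkspaceClient, client

    return WorkspaceClient(
        config=client.Config(http_timeout_seconds=timeout_seconds)
    )
```

Usage would then be `w = build_client(timeout_from_env())`, followed by the same `w.serving_endpoints.query(...)` call as before.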

Metadata

Labels: Triaged, documentation
