[PERFORMANCE] AI Triage endpoint performs blocking LangGraph/Gemini inference inside async FastAPI route #1928

Description

@VaishnaviL2005

🔍 Have You Searched Existing Issues?

  • I have searched the existing issues to avoid duplicates

📉 Describe the Performance Issue

The AI triage endpoint (POST /triage/chat) is implemented as an async FastAPI route but executes a synchronous LangGraph workflow through run_triage_flow() -> triage_app.invoke().

The LangGraph workflow performs multiple blocking Gemini API calls using synchronous .invoke() operations. As a result, the event loop thread serving the request is blocked for the duration of the LLM inference, which can take several seconds, and no other coroutine scheduled on that worker can run in the meantime.

Under concurrent load, this reduces responsiveness and concurrency for other requests handled by the same FastAPI worker. Long-running LLM inference should be offloaded to a worker thread rather than executing directly inside the event loop.
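A minimal sketch of the offloading idea, using only the standard library so it runs without the project installed. The `run_triage_flow` below is a hypothetical stand-in that sleeps instead of calling LangGraph/Gemini, and the two handler names are illustrative, not the project's actual route functions. It compares calling the synchronous function directly inside a coroutine with handing it to a worker thread via `asyncio.to_thread`.

```python
import asyncio
import time


def run_triage_flow(message: str) -> dict:
    """Hypothetical stand-in for the synchronous LangGraph workflow."""
    time.sleep(0.2)  # simulates the blocking Gemini calls
    return {"reply": f"triaged: {message}"}


async def triage_chat_blocking(message: str) -> dict:
    # Pattern described in this issue: a sync call made directly inside
    # a coroutine, which holds the event loop until it returns.
    return run_triage_flow(message)


async def triage_chat_offloaded(message: str) -> dict:
    # Proposed pattern: run the sync call in a worker thread so the
    # event loop stays free to serve other requests.
    return await asyncio.to_thread(run_triage_flow, message)


async def elapsed_for(handler, n: int = 3) -> float:
    """Run n concurrent 'requests' and return the wall-clock time taken."""
    start = time.perf_counter()
    await asyncio.gather(*(handler(f"msg-{i}") for i in range(n)))
    return time.perf_counter() - start


def compare() -> tuple[float, float]:
    blocking = asyncio.run(elapsed_for(triage_chat_blocking))
    offloaded = asyncio.run(elapsed_for(triage_chat_offloaded))
    return blocking, offloaded


if __name__ == "__main__":
    blocking, offloaded = compare()
    print(f"blocking: {blocking:.2f}s, offloaded: {offloaded:.2f}s")
```

With the direct call, the three simulated requests run one after another; with `asyncio.to_thread` they overlap. In the real route, other options that may fit are `run_in_threadpool` from `fastapi.concurrency`, declaring the route with plain `def` so FastAPI runs it in its threadpool, or switching the graph call to LangGraph's async `ainvoke` if every node supports it. Which of these suits this codebase is an open question for the fix.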

🧪 Environment Details

OS: Windows 11

🔁 Steps to Reproduce

  • Start the ML backend service.
  • Send a request to POST /triage/chat that triggers a full triage workflow.
  • While the triage request is processing, send additional requests to endpoints served by the same FastAPI worker.
  • Observe increased latency and reduced responsiveness while the LangGraph workflow is executing.
  • Inspect the route implementation and note that the async endpoint directly invokes the synchronous run_triage_flow() function, which ultimately performs multiple blocking Gemini API calls through LangGraph's .invoke() methods.
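The latency observation in the steps above can also be checked in isolation with an event loop "heartbeat" probe. This is a sketch under stated assumptions: `blocking_workflow` is a hypothetical placeholder for `run_triage_flow()`, and the 0.2 s sleep stands in for LLM latency. The probe records the largest gap between consecutive ticks of a coroutine that should wake every 10 ms.

```python
import asyncio
import time


def blocking_workflow(seconds: float) -> None:
    """Hypothetical placeholder for the synchronous triage workflow."""
    time.sleep(seconds)


async def heartbeat(gaps: list, stop: asyncio.Event, interval: float = 0.01) -> None:
    # Records how long each tick actually took; a large gap means the
    # event loop could not schedule this coroutine on time.
    last = time.perf_counter()
    while not stop.is_set():
        await asyncio.sleep(interval)
        now = time.perf_counter()
        gaps.append(now - last)
        last = now


async def max_gap(offload: bool, seconds: float = 0.2) -> float:
    gaps: list = []
    stop = asyncio.Event()
    probe = asyncio.create_task(heartbeat(gaps, stop))
    await asyncio.sleep(0.05)  # let the heartbeat tick a few times first
    if offload:
        await asyncio.to_thread(blocking_workflow, seconds)
    else:
        blocking_workflow(seconds)  # blocks the loop, as in the current route
    await asyncio.sleep(0.05)
    stop.set()
    await probe
    return max(gaps)


if __name__ == "__main__":
    print("max gap, direct call:", asyncio.run(max_gap(offload=False)))
    print("max gap, offloaded:  ", asyncio.run(max_gap(offload=True)))
```

A maximum gap close to the workflow duration for the direct call, and close to the tick interval for the offloaded call, is the same symptom the reproduction steps describe for other requests on the worker.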

📋 Logs / Screenshots (Optional)

(Two images attached.)

🙌 Contributor Checklist

  • I agree to follow this project's Code of Conduct
  • I want to work on this issue
  • I am a GSSOC'26 contributor

Metadata

Labels

  • gssoc:approved (Approved for gssoc)
  • type:performance (Performance optimization or latency improvements)

Projects

Status
🎉 Merged