Commit 8dad481

Authored Mar 7, 2025
Improve Litellm proxy related error handling (#491)
* Adding termcolor to requirements.
* Adding new litellm model for chat.
* Fixing import path for add dummy data script.
* Adding openai/chat litellm config.
* Add utils to generate random int32.
* Adding chat management functionalities.
* Adding prompts for ChatHistory.
* Removed zero-shot cot and updated system message passing.
* Updated pre-commit to include types-requests.
* Added chat fallback model.
* Added REDIS timeout variable to config.
* Updated prompts for ChatHistory.
* Passing in paraphrase argument so that paraphrasing can be skipped for chat histories.
* Update chat management utilities.
* Updated prompts for ChatHistory.
* Added response generation with RAG and chat history.
* Updated decorators and llm query generation to include chat history. Added /chat endpoint. Temporarily commented out /search endpoint for quick testing.
* Updated chat manager functions. Updated parameter name for _ask_llm_async from json to _json.
* Updated prompts for spacing.
* Updated prompt and chat management.
* Added utils for generating random int32.
* Refactored chat and search endpoints.
* Consolidated chat and search endpoints.
* CCs.
* Removed termcolor package.
* Adding types-requests to requirements-dev.txt for github workflow.
* Passing along session ID for QueryResponse.
* Logic shift to query refined template.
* CCs.
* Removing paraphrase argument.
* Removing paraphrase argument.
* CCs.
* No need to return session ID.
* Added tests for chat.
* Fixing os env issue with github workflow.
* Fixing os env issue with github workflow.
* Fixing os env issue with github workflow.
* Fixing os env issue with github workflow.
* Fixing os env issue with github workflow.
* Test.
* Reverting tests.
* CCs.
* Checking mocked tests for github actions.
* Updated tests.
* Updated tests and fixed issue with truncation.
* CCs.
* CCs.
* CCs.
* CCs.
* Removed WorkspaceRetrieve pydantic model.
* CCs.
* CCs.
* CCs.
* CCs.
* Updated contents and tags packages for workspaces
* Updated question_answer package for workspaces. Modified parts of data_api and llm_call packages. Finished up lagging function calls that were missing workspaces in previous commits.
* Fixed function signatures.
* Finished data_api package.
* Finished admin package.
* Updated urgency_detection and urgency_rules packages.
* CCs.
* Updated user_tools package. CCs.
* CCs to utils and tags packages.
* Updated question_answer and contents packages.
* CCs to urgency_detection and urgency_rules packages.
* Updated data_api package.
* CCs to llm_call/dashboard.py.
* CCs to llm_call/llm_prompts.py.
* Updated add_dummy_data_to_db to use workspace_id.
* Updated add_new_data_to_db to use workspace_id.
* Removed unused import.
* Separated workspace logic into its own package with its own routers, utils, and schemas. Updated auth dependencies and routers to resolve circular import issues.
* Linting.
* Changing default workspace to be Workspace_{user.username}.
* Added delete workspace and get workspace by user ID endpoints.
* Updated table names. Added default_workspace column. Updated auth to pull default workspace. Added login-workspace endpoint. Updating tests...
* CCs.
* Updated workspace endpoints and schemas. Included better checks for quotas.
* Checking for unique workspace name when updating workspace. Added ability to remove users from workspaces.
* Added user removal functionality.
* CCs to remaining modules. Fixed circular import issue and removed user_tools package---consolidated with users package now. Additional updates to users routers.
* CCs.
* Updated tests/rails package.
* CCs. Going through and updating tests/api/conftest.py.
* Updated test_admin.py.
* Fixed alembic migration naming issue. Verified alembic tests pass.
* Verified test_archive_content.py.
* Verified test_chat.py
* Verified test_data_api.py.
* Verified test_import_content.py.
* Verified test_import_content.py test_manage_content.py test_manage_tags.py
* Verified test_manage_ud_rules.py.
* Finished verifying existing tests except for dashboard tests. Added migration for on cascade deletion.
* Finished verifying existing tests with pytest-randomly. Fixed lagging issues.
* Added ability for any user to create a workspace.
* commit message
* commit message
* Adding BDD tests.
* Updating workspace BDD tests.
* Updating workspace BDD tests.
* Merging in frontend changes only for multi-turn conv.
* Updating with multi-turn conv frontend PR and pylint fixes.
* Adding linting make command.
* Merged with topic modeling PR.
* Folding in hotfixes to admin_app.
* CCs.
* CCs.
* CCs.
* CCs.
* Folding in hotfixes to admin_app.
* Updated dashboard package for workspace.
* Verified remaining tests.
* Add workspace bar
* Updated github workflow for tests. Updated test_urgency_detect.py to include proper teardown. Updated dashboard filtering logic to point to UrgencyResponseDB instead of ResponseFeedbackDB. CCs.
* Updated optional_components for linting and updated httpx dependency in order to pass github workflow.
* Testing reverting back to using type.
* Testing reverting back to using isinstance.
* CCs.
* CCs.
* CCs.
* CCs.
* Added accidentally deleted pytest fixture.
* Moved archive content test to its own workspace.
* login endpoints now return workspace_name in AuthenticatedDetails. login-workspace now has dependency injection on get_current_user so that access token is required. workspace default quotas changed to env defaults. UserRetrieve now returns list of dicts instead of two separate lists. retrieve_all_users is now retrieve_all_users_in_current_workspace. added get_current_workspace endpoint.
* Create new workspace component
* Returning WorkspaceRetrieve after creating workspaces instead of WorkspaceCreate so that workspace_id is available.
* Edit workspace button
* Moved login-workspace endpoint to workspace/routers.py and changed to switch-workspace endpoint due to authentication requirement. Disabled updating workspace quotas on backend.
* Update edit users
* Added is_default_workspace in return object when adding existing users to a workspace.
* Switch to different workspaces feature
* Folding in changes for frontend and backend from update dashboard page 2 PR.
* Modularizing BDD test.
* CCs.
* Added user resetting passwords BDD tests.
* Changed endpoint from /workspace/current to /workspace/current-workspace.
* Added retrieving user information BDD tests.
* Added removing user BDD tests.
* Fixed error in removing users from workspaces BDD tests.
* Added updating user information BDD tests. Other CCs.
* Added creating workspaces BDD tests.
* Updating tests to pass in GHA.
* Adding user role to access token and authentication. Removed is_admin attribute.
* Added users endpoint to check if a username exists.
* Added user routers to differentiate between creating new users and adding existing users.
* Added adding users BDD tests. Put endpoint for checking if username exists back in. Separated out logic for creating new users vs. adding existing users to workspaces.
* Added updating workspaces BDD tests.
* Added type check.
* Added retrieving workspaces BDD tests. Updated pyproject.toml and requirements-dev for coverage.
* Router name change to add-existing-user-to-workspace.
* Updated tests.
* Add new changes
* Add new changes
* Updated user head endpoint to check if username exists to return a status code instead of a boolean.
* CCs.
* Add user to workspace
* Removed access token requirement when resetting user password. Updated tests.
* Default workspace implementation
* Merging frontend changes from main.
* Added official docs for multi-turn chat and workspaces. Removed HACK FIX comments.
* Add reset user logic
* Remove quotas from form
* clean up
* Fix read only issue
* Remove section from integration page for read only users
* Few bug fixes
* Handle long workspace names
* Changing default workspace name to {username}'s Workspace.
* Fix user role bug
* Consolidated workspace migration files to a single migration file that also takes care of data migration. Updated tests.
* Fixing github workflow for tests.
* Separated single workspace migration file into 3 stages for production.
* Final fixes
* Final final changes
* Fixing migration script errors.
* Fixing migration script errors.
* Removed typo in front of Make command.
* Fixing migration script errors.
* Fixing migration script errors.
* Updating with main.
* Fix recovery_code being null issue
* Improve error handling for LLM endpoint
* Catch exception when adding/editing rules

---------

Co-authored-by: tonyzhao6 <>
1 parent bb12a74 commit 8dad481

7 files changed: +197 -120 lines changed


admin_app/src/app/content/components/ChatSideBar.tsx

+1 -4
@@ -100,13 +100,10 @@ const ChatSideBar = ({
         : getResponse(question);
     responsePromise
       .then((response) => {
-        const errorMessage = response.error
-          ? response.error.error_message
-          : "LLM Response failed.";
         const responseMessage = {
           dateTime: new Date().toISOString(),
           type: "response",
-          content: response.status == 200 ? response.llm_response : errorMessage,
+          content: response.llm_response,
           json: response,
         } as ResponseMessage;

admin_app/src/app/login/page.tsx

+1 -1
@@ -21,6 +21,7 @@ import * as React from "react";
 import { useEffect } from "react";
 import { appColors, sizes } from "@/utils";
 import {
+  checkIfUsernameExists,
   getRegisterOption,
   registerUser,
   resetPassword,
@@ -135,7 +136,6 @@ const Login = () => {
   const handleCloseConfirmationModal = () => {
     setShowConfirmationModal(false);
   };
-
   return isLoading ? (
     <Grid>
       {" "}

core_backend/app/contents/routers.py

+26 -13
@@ -18,7 +18,7 @@
 from ..tags.schemas import TagCreate, TagRetrieve
 from ..users.models import UserDB, user_has_required_role_in_workspace
 from ..users.schemas import UserRoles
-from ..utils import setup_logger
+from ..utils import EmbeddingCallException, setup_logger
 from ..workspaces.utils import (
     get_content_quota_by_workspace_id,
     get_workspace_by_workspace_name,
@@ -103,6 +103,7 @@ async def create_content(
     HTTPException
         If the user does not have the required role to create content in the workspace.
         If the content tags are invalid or the user would exceed their content quota.
+        If the embedding of the content fails.
     """

     workspace_db = await get_workspace_by_workspace_name(
@@ -147,12 +148,18 @@ async def create_content(
         ) from e

     # 4.
-    content_db = await save_content_to_db(
-        asession=asession,
-        content=content,
-        exclude_archived=False,  # Don't exclude for newly saved content!
-        workspace_id=workspace_id,
-    )
+    try:
+        content_db = await save_content_to_db(
+            asession=asession,
+            content=content,
+            exclude_archived=False,  # Don't exclude for newly saved content!
+            workspace_id=workspace_id,
+        )
+    except EmbeddingCallException as e:
+        raise HTTPException(
+            status_code=status.HTTP_502_BAD_GATEWAY,
+            detail="Error embedding content. Please check embedding service.",
+        ) from e
     return _convert_record_to_schema(record=content_db)


@@ -237,12 +244,18 @@ async def edit_content(

     content.content_tags = content_tags
     content.is_archived = old_content.is_archived
-    updated_content = await update_content_in_db(
-        asession=asession,
-        content=content,
-        content_id=content_id,
-        workspace_id=workspace_id,
-    )
+    try:
+        updated_content = await update_content_in_db(
+            asession=asession,
+            content=content,
+            content_id=content_id,
+            workspace_id=workspace_id,
+        )
+    except EmbeddingCallException as e:
+        raise HTTPException(
+            status_code=status.HTTP_502_BAD_GATEWAY,
+            detail="Error embedding content. Please check embedding service.",
+        ) from e

     return _convert_record_to_schema(record=updated_content)
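
Both hunks above apply the same pattern: an embedding failure is caught at the router and re-raised as a 502 with `from e`, so the original error stays attached as `__cause__`. A minimal stdlib-only sketch of that translation follows. `GatewayError` and `save_content` are hypothetical stand-ins for FastAPI's `HTTPException` and `save_content_to_db`, and are not names from this commit.

```python
from http import HTTPStatus


class EmbeddingCallException(Exception):
    """Stand-in for the exception this commit adds in core_backend/app/utils.py."""


class GatewayError(Exception):
    """Hypothetical stand-in for fastapi.HTTPException (assumption)."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def save_content(text: str) -> str:
    """Hypothetical persistence step that fails when the embedding call fails."""
    if not text:
        raise EmbeddingCallException("Error during embedding call: empty input")
    return f"saved:{text}"


def create_content(text: str) -> str:
    """Translate an embedding failure into a 502 while keeping the cause."""
    try:
        return save_content(text)
    except EmbeddingCallException as e:
        raise GatewayError(
            status_code=HTTPStatus.BAD_GATEWAY,
            detail="Error embedding content. Please check embedding service.",
        ) from e
```

Because of `from e`, a log handler or debugger that inspects `__cause__` can still see the underlying embedding error even though the client only receives the generic 502 detail.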

core_backend/app/llm_call/utils.py

+38 -19
@@ -71,33 +71,52 @@ async def _ask_llm_async(
     if not messages:
         assert isinstance(user_message, str) and isinstance(system_message, str)
         messages = [
-            {
-                "content": system_message,
-                "role": "system",
-            },
-            {
-                "content": user_message,
-                "role": "user",
-            },
+            {"content": system_message, "role": "system"},
+            {"content": user_message, "role": "user"},
         ]
+
     llm_generation_params = llm_generation_params or {
         "max_tokens": 1024,
         "temperature": 0,
     }

     logger.info(f"LLM input: 'model': {litellm_model}, 'endpoint': {litellm_endpoint}")

-    llm_response_raw = await acompletion(
-        model=litellm_model,
-        messages=messages,
-        api_base=litellm_endpoint,
-        api_key=LITELLM_API_KEY,
-        metadata=metadata,
-        **extra_kwargs,
-        **llm_generation_params,
-    )
-    logger.info(f"LLM output: {llm_response_raw.choices[0].message.content}")
-    return llm_response_raw.choices[0].message.content
+    try:
+        llm_response_raw = await acompletion(
+            model=litellm_model,
+            messages=messages,
+            api_base=litellm_endpoint,
+            api_key=LITELLM_API_KEY,
+            metadata=metadata,
+            **extra_kwargs,
+            **llm_generation_params,
+        )
+    except Exception as err:
+        logger.error("Error calling the LLM", exc_info=True)
+        raise LLMCallException(f"Error during LLM call: {err}") from err
+
+    # Optionally check if the returned response contains an error field
+    if hasattr(llm_response_raw, "error") and llm_response_raw.error:
+        error_msg = getattr(llm_response_raw, "error", "Unknown error")
+        logger.error(f"LLM call returned an error: {error_msg}")
+        raise LLMCallException(f"LLM call returned an error: {error_msg}")
+
+    # Ensure that the response has valid content
+    try:
+        content = llm_response_raw.choices[0].message.content
+    except (AttributeError, IndexError) as e:
+        logger.error("LLM response structure is not as expected", exc_info=True)
+        raise LLMCallException("LLM response structure is not as expected") from e
+
+    logger.info(f"LLM output: {content}")
+    return content
+
+
+class LLMCallException(Exception):
+    """Custom exception for LLM call errors."""
+
+    pass


 def _truncate_chat_history(
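
To see which response shapes the new post-call checks reject, the sketch below replays them against fake response objects. It is an illustration under assumptions, not the project's code: `extract_content` and `fake_response` are hypothetical helpers, and `SimpleNamespace` only imitates the attribute layout of a chat completion response.

```python
from types import SimpleNamespace


class LLMCallException(Exception):
    """Stand-in for the exception added in this diff."""


def extract_content(llm_response_raw: object) -> str:
    """Replay the post-call checks from `_ask_llm_async` on a response object."""
    # A truthy `error` attribute is treated as a failed call.
    error = getattr(llm_response_raw, "error", None)
    if error:
        raise LLMCallException(f"LLM call returned an error: {error}")

    # A missing `choices`, an empty list, or a missing `message` all end up here.
    try:
        return llm_response_raw.choices[0].message.content
    except (AttributeError, IndexError) as e:
        raise LLMCallException("LLM response structure is not as expected") from e


def fake_response(content: str) -> SimpleNamespace:
    """Build a hypothetical object shaped like a chat completion response."""
    message = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])
```

With these helpers, `extract_content(fake_response("hi"))` returns the content, while an object with `choices=[]`, no `choices` attribute at all, or a non-empty `error` attribute raises `LLMCallException`.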

core_backend/app/question_answer/routers.py

+82 -59
@@ -39,6 +39,7 @@
     generate_tts__after,
 )
 from ..llm_call.utils import (
+    LLMCallException,
     append_message_content_to_chat_history,
     get_chat_response,
     init_chat_history,
@@ -131,21 +132,32 @@ async def chat(
     QueryResponse | JSONResponse
         The query response object or an appropriate JSON response.
     """
+    try:
+        # 1.
+        user_query = await init_user_query_and_chat_histories(
+            redis_client=request.app.state.redis,
+            reset_chat_history=reset_chat_history,
+            user_query=user_query,
+        )

-    # 1.
-    user_query = await init_user_query_and_chat_histories(
-        redis_client=request.app.state.redis,
-        reset_chat_history=reset_chat_history,
-        user_query=user_query,
-    )
+        # 2

-    # 2.
-    return await search(
-        user_query=user_query,
-        request=request,
-        asession=asession,
-        workspace_db=workspace_db,
-    )
+        response = await search(
+            user_query=user_query,
+            request=request,
+            asession=asession,
+            workspace_db=workspace_db,
+        )
+        return response
+    except LLMCallException:
+        return JSONResponse(
+            status_code=status.HTTP_502_BAD_GATEWAY,
+            content={
+                "error_message": (
+                    "LLM call returned an error: Please check LLM configuration"
+                )
+            },
+        )


 @router.post(
@@ -186,63 +198,74 @@ async def search(
     QueryResponse | JSONResponse
         The query response object or an appropriate JSON response.
     """
+    try:
+        workspace_id = workspace_db.workspace_id
+        user_query_db, user_query_refined_template, response_template = (
+            await get_user_query_and_response(
+                asession=asession,
+                generate_tts=False,
+                user_query=user_query,
+                workspace_id=workspace_id,
+            )
+        )
+        assert isinstance(user_query_db, QueryDB)

-    workspace_id = workspace_db.workspace_id
-    user_query_db, user_query_refined_template, response_template = (
-        await get_user_query_and_response(
+        response = await get_search_response(
             asession=asession,
-            generate_tts=False,
-            user_query=user_query,
+            exclude_archived=True,
+            n_similar=int(N_TOP_CONTENT),
+            n_to_crossencoder=int(N_TOP_CONTENT_TO_CROSSENCODER),
+            query_refined=user_query_refined_template,
+            request=request,
+            response=response_template,
             workspace_id=workspace_id,
         )
-    )
-    assert isinstance(user_query_db, QueryDB)

-    response = await get_search_response(
-        asession=asession,
-        exclude_archived=True,
-        n_similar=int(N_TOP_CONTENT),
-        n_to_crossencoder=int(N_TOP_CONTENT_TO_CROSSENCODER),
-        query_refined=user_query_refined_template,
-        request=request,
-        response=response_template,
-        workspace_id=workspace_id,
-    )
+        if user_query.generate_llm_response:
+            response = await get_generation_response(
+                query_refined=user_query_refined_template, response=response
+            )

-    if user_query.generate_llm_response:
-        response = await get_generation_response(
-            query_refined=user_query_refined_template, response=response
+        await save_query_response_to_db(
+            asession=asession,
+            response=response,
+            user_query_db=user_query_db,
+            workspace_id=workspace_id,
+        )
+        await increment_query_count(
+            asession=asession,
+            contents=response.search_results,
+            workspace_id=workspace_id,
+        )
+        await save_content_for_query_to_db(
+            asession=asession,
+            contents=response.search_results,
+            query_id=response.query_id,
+            session_id=user_query.session_id,
+            workspace_id=workspace_id,
         )

-    await save_query_response_to_db(
-        asession=asession,
-        response=response,
-        user_query_db=user_query_db,
-        workspace_id=workspace_id,
-    )
-    await increment_query_count(
-        asession=asession, contents=response.search_results, workspace_id=workspace_id
-    )
-    await save_content_for_query_to_db(
-        asession=asession,
-        contents=response.search_results,
-        query_id=response.query_id,
-        session_id=user_query.session_id,
-        workspace_id=workspace_id,
-    )
+        if isinstance(response, QueryResponseError):
+            return JSONResponse(
+                status_code=status.HTTP_400_BAD_REQUEST, content=response.model_dump()
+            )
+
+        if isinstance(response, QueryResponse):
+            return response

-    if isinstance(response, QueryResponseError):
         return JSONResponse(
-            status_code=status.HTTP_400_BAD_REQUEST, content=response.model_dump()
+            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+            content={"error_message": "Internal server error"},
+        )
+    except LLMCallException:
+        return JSONResponse(
+            status_code=status.HTTP_502_BAD_GATEWAY,
+            content={
+                "error_message": (
+                    "LLM call returned an error: Please check LLM configuration"
+                )
+            },
         )
-
-    if isinstance(response, QueryResponse):
-        return response
-
-    return JSONResponse(
-        status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
-        content={"message": "Internal server error"},
-    )


 @router.post(
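
After this change the 500 and 502 branches both return a JSON body keyed by `error_message` (the 500 branch previously used `message`). As a rough illustration of what that means for a caller, the sketch below maps a status code and decoded body to display text. `describe_result` is a hypothetical helper, and the shape of the 400 `QueryResponseError` payload is not modelled because the diff does not show it.

```python
from typing import Any


def describe_result(status_code: int, body: dict[str, Any]) -> str:
    """Return display text for a /search or /chat reply (hypothetical helper)."""
    if status_code == 200:
        # `llm_response` is the field the admin app reads in ChatSideBar.tsx.
        return body.get("llm_response") or ""
    if status_code in (500, 502):
        # Both error branches in this diff use the `error_message` key.
        return body.get("error_message", "Unknown error")
    # Other statuses, such as the 400 QueryResponseError payload, are not
    # modelled here; fall back to a generic message.
    return f"Request failed with status {status_code}"
```

Using one key for both server-side error branches lets a client handle them with a single lookup instead of checking `message` and `error_message` separately.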

core_backend/app/urgency_rules/routers.py

+24 -13
@@ -10,7 +10,7 @@
 from ..database import get_async_session
 from ..users.models import UserDB, user_has_required_role_in_workspace
 from ..users.schemas import UserRoles
-from ..utils import setup_logger
+from ..utils import EmbeddingCallException, setup_logger
 from ..workspaces.utils import get_workspace_by_workspace_name
 from .models import (
     UrgencyRuleDB,
@@ -78,12 +78,18 @@ async def create_urgency_rule(
             detail="User does not have the required role to create urgency rules in "
             "the workspace.",
         )
+    try:

-    urgency_rule_db = await save_urgency_rule_to_db(
-        asession=asession,
-        urgency_rule=urgency_rule,
-        workspace_id=workspace_db.workspace_id,
-    )
+        urgency_rule_db = await save_urgency_rule_to_db(
+            asession=asession,
+            urgency_rule=urgency_rule,
+            workspace_id=workspace_db.workspace_id,
+        )
+    except EmbeddingCallException as e:
+        raise HTTPException(
+            status_code=status.HTTP_502_BAD_GATEWAY,
+            detail="Error embedding rule. Please check embedding service.",
+        ) from e
     return _convert_record_to_schema(urgency_rule_db=urgency_rule_db)


@@ -255,13 +261,18 @@ async def update_urgency_rule(
             status_code=status.HTTP_404_NOT_FOUND,
             detail=f"Urgency Rule ID `{urgency_rule_id}` not found",
         )
-
-    urgency_rule_db = await update_urgency_rule_in_db(
-        asession=asession,
-        urgency_rule=urgency_rule,
-        urgency_rule_id=urgency_rule_id,
-        workspace_id=workspace_id,
-    )
+    try:
+        urgency_rule_db = await update_urgency_rule_in_db(
+            asession=asession,
+            urgency_rule=urgency_rule,
+            urgency_rule_id=urgency_rule_id,
+            workspace_id=workspace_id,
+        )
+    except EmbeddingCallException as e:
+        raise HTTPException(
+            status_code=status.HTTP_502_BAD_GATEWAY,
+            detail="Error embedding rule. Please check embedding service.",
+        ) from e
     return _convert_record_to_schema(urgency_rule_db=urgency_rule_db)


core_backend/app/utils.py

+25 -11
@@ -120,6 +120,12 @@ def create_langfuse_metadata(
     return metadata


+class EmbeddingCallException(Exception):
+    """Custom exception for embedding call errors."""
+
+    pass
+
+
 async def embedding(
     *, metadata: Optional[dict] = None, text_to_embed: str
 ) -> list[float]:
@@ -138,17 +144,25 @@ async def embedding(
         The embedding for the given text.
     """

-    metadata = metadata or {}
-
-    content_embedding = await aembedding(
-        api_base=LITELLM_ENDPOINT,
-        api_key=LITELLM_API_KEY,
-        input=text_to_embed,
-        metadata=metadata,
-        model=LITELLM_MODEL_EMBEDDING,
-    )
-
-    return content_embedding.data[0]["embedding"]
+    try:
+        content_embedding = await aembedding(
+            api_base=LITELLM_ENDPOINT,
+            api_key=LITELLM_API_KEY,
+            input=text_to_embed,
+            metadata=metadata,
+            model=LITELLM_MODEL_EMBEDDING,
+        )
+    except Exception as err:
+        raise EmbeddingCallException(f"Error during embedding call: {err}") from err
+
+    # Validate the response structure
+    try:
+        embedding_value = content_embedding.data[0]["embedding"]
+    except (AttributeError, IndexError, KeyError) as err:
+        raise EmbeddingCallException(
+            "Embedding response structure is not as expected"
+        ) from err
+    return embedding_value


 def encode_api_limit(*, api_limit: int | None) -> int | str:
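
The structure check in `embedding` wraps three exception types. The sketch below reproduces just that check on fake response objects to show which malformed shapes are converted to `EmbeddingCallException` and which are not. `extract_embedding` is a hypothetical helper, and `SimpleNamespace` is only a stand-in for the real embedding response object.

```python
from types import SimpleNamespace


class EmbeddingCallException(Exception):
    """Stand-in for the exception added in this diff."""


def extract_embedding(content_embedding: object) -> list[float]:
    """Replay the response-structure check from `embedding`."""
    try:
        # AttributeError: no `data`; IndexError: empty `data`;
        # KeyError: first item has no "embedding" key.
        return content_embedding.data[0]["embedding"]
    except (AttributeError, IndexError, KeyError) as err:
        raise EmbeddingCallException(
            "Embedding response structure is not as expected"
        ) from err


# Hypothetical well-formed response for illustration.
ok = SimpleNamespace(data=[{"embedding": [0.1, 0.2]}])
vector = extract_embedding(ok)
```

One shape this check does not cover, at least in this sketch, is `data=None`: indexing `None` raises `TypeError`, which would propagate unwrapped and so bypass the routers' `except EmbeddingCallException` handlers. Whether the real client can return that shape is not something the diff establishes.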
