Skip to content

Conversation

a-klos
Copy link
Member

@a-klos a-klos commented Sep 10, 2025

This pull request focuses on improving error logging, updating docstrings for clarity, and adding module-level documentation across the admin and extractor API libraries. The most significant changes include replacing logger error statements with logger.exception for better tracebacks, simplifying and clarifying docstrings, and adding module/class documentation headers.

Error Logging Improvements

  • Replaced logger.error calls with logger.exception in exception handlers throughout the codebase to automatically include stack traces, and removed manual traceback.format_exc() usage. This affects files such as default_file_uploader.py, default_source_uploader.py, default_document_reference_retriever.py, s3_service.py, langchain_summarizer.py, and general_file_extractor.py. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]

Docstring and Documentation Updates

  • Updated method docstrings for clarity and consistency, especially for upload methods, and simplified class initialization docstrings. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11]
  • Added module-level docstrings to several files for improved documentation and readability, such as default_file_uploader.py, default_source_uploader.py, file_extractor.py, and source_extractor.py. [1] [2] [3] [4]

Code Quality and Style

  • Improved logging format consistency by switching from string interpolation to logger formatting where applicable.
  • Updated the extractor_type property in InformationExtractor to include a descriptive docstring.

Configuration and Linting

  • Updated pyproject.toml to add missing docstring linting ignores for test files and API base files, improving code style enforcement.

Cleanup

  • Removed unnecessary imports of traceback from files where it is no longer used due to logging improvements. [1] [2] [3] [4] [5]

Let me know if you have questions about any specific change or want to discuss best practices for error handling and documentation!

@huhn511 huhn511 requested a review from Copilot September 23, 2025 15:24
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request focuses on improving error logging practices and enhancing documentation consistency across the codebase. The changes primarily involve replacing logger.error calls with logger.exception for better stack trace visibility and updating docstrings for clarity.

  • Standardized error logging by replacing logger.error with logger.exception in exception handlers
  • Updated docstrings to use imperative mood consistently (e.g., "Upload" instead of "Uploads")
  • Added module-level documentation headers and improved code documentation

Reviewed Changes

Copilot reviewed 30 out of 30 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
container.py Added module and class docstrings for dependency container
chat_endpoint.py Added comprehensive module and method documentation
chat_model_provider_test.py Simplified test file header documentation
llm_factory.py Added module-level docstring
langfuse_manager.py Improved logging format and replaced error with exception logging
qdrant_database.py Updated error logging to use logger.exception
chat_graph.py Enhanced error logging in retrieve node
langfuse_ragas_evaluator.py Standardized exception logging across evaluation methods
default_information_pieces_remover.py Improved error logging for metadata parsing and vector db operations
pdf_extractor_test.py Updated logging format and removed extra whitespace
sitemap_extractor_utils.py Added module docstring and improved function documentation
extractor_types.py Added module-level documentation
s3_service.py Enhanced error logging for file deletion operations
sitemap_extractor.py Added comprehensive docstring for extractor_type property
pdf_extractor.py Improved logging format and exception handling throughout
confluence_extractor.py Added detailed docstring for extractor_type property
general_file_extractor.py Simplified error logging and removed traceback import
information_extractor.py Added detailed abstract method documentation
source_extractor.py Added module-level docstring
file_extractor.py Added module-level docstring
pyproject.toml Updated linting configuration for test and API base files
langchain_summarizer.py Enhanced logging format and removed traceback dependency
admin s3_service.py Improved file deletion logging and removed traceback import
default_source_uploader.py Added module docs and improved error logging consistency
default_file_uploader.py Enhanced documentation and standardized exception logging
default_document_reference_retriever.py Simplified error logging for document retrieval
admin_api.py Updated docstrings to use imperative mood
uploader_base.py Simplified initialization docstring
source_uploader.py Updated method docstring to imperative mood
file_uploader.py Added class docstring and updated method documentation

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines +239 to 242
logger.exception("Error while searching for documents in vector database")
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
detail="Error while searching for documents in vector database: %s" % e,
Copy link
Preview

Copilot AI Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable e is referenced in the error detail message but is no longer captured in the exception handler. Either capture the exception as e or remove the variable reference from the detail message.

Copilot uses AI. Check for mistakes.

Comment on lines +52 to 55
logger.exception("Error while parsing metadata")
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
detail="Error while parsing metadata: %s" % e,
Copy link
Preview

Copilot AI Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable e is referenced in the error detail message but is no longer captured in the exception handler. Either capture the exception as e or remove the variable reference from the detail message.

Copilot uses AI. Check for mistakes.

logger.exception("Failed to link item to generation")
retries += 1
if retries > self.MAX_RETRIES:
raise e
Copy link
Preview

Copilot AI Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable e is referenced in the raise statement but is no longer captured in the exception handler. Either capture the exception as e or use raise without the variable to re-raise the current exception.

Suggested change
raise e
raise

Copilot uses AI. Check for mistakes.

Copy link
Collaborator

@huhn511 huhn511 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants