Conversation

edlee123 (Contributor) commented Jun 24, 2025

Description

Allows ChatQnA to be used with hundreds of OpenAI-compatible endpoints (e.g. OpenRouter.ai, Hugging Face, Denvr) and improves the developer experience, making it quick to try OPEA even in low-resource environments.

Key Changes Made:

  • Created ChatQnA/docker_compose/intel/cpu/xeon/README_endpoint_openai.md: instructions to spin up the example.
  • Created ChatQnA/docker_compose/intel/cpu/xeon/compose_endpoint_openai.yaml: replaces vLLM with an OpenAI-compatible endpoint.
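The new compose file works by pointing ChatQnA at a remote OpenAI-compatible API instead of running vLLM locally. A minimal sketch of what such a service definition can look like — the service and variable names here are illustrative, not necessarily the ones used in compose_endpoint_openai.yaml:

```yaml
services:
  # The local vLLM serving container is dropped; the backend talks
  # directly to a remote OpenAI-compatible provider instead.
  chatqna-backend-server:
    environment:
      # Base URL of any OpenAI-compatible provider, e.g. OpenRouter.ai
      OPENAI_API_BASE: https://openrouter.ai/api/v1
      # Provider API key, supplied from the host environment
      OPENAI_API_KEY: ${OPENAI_API_KEY}
      # Model identifier understood by the chosen provider
      MODEL_ID: anthropic/claude-3.7-sonnet
```

Because only environment variables change, the same compose file can target any of the providers listed under Tests below by swapping the base URL and model ID.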

Also:

  • Fixed the align_generator function to properly detect and skip streamed chunks whose content is null, as emitted by OpenAI-compatible endpoints. Previously the raw null JSON was rendered in the UI.
  • Added better error handling and debug logging to ease troubleshooting of endpoint issues.
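The null-chunk fix can be sketched as follows. OpenAI-compatible streams emit role-only and finish chunks whose delta.content is null, and those must be filtered out before the text reaches the UI. This is an illustrative reconstruction, not the exact code in chatqna.py — the function name and shapes are assumptions:

```python
import json

def align_generator_sketch(sse_lines):
    """Yield only the text of chunks whose delta content is non-null.

    `sse_lines` is an iterable of raw SSE data lines from an
    OpenAI-compatible /chat/completions stream.
    """
    for line in sse_lines:
        payload = line.removeprefix("data: ").strip()
        if not payload or payload == "[DONE]":
            continue
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            # Skip malformed chunks instead of surfacing raw text to the UI.
            continue
        choices = chunk.get("choices") or []
        content = choices[0].get("delta", {}).get("content") if choices else None
        if content is None:
            # Role-only or finish chunks carry "content": null — drop them.
            continue
        yield content
```

Skipping (rather than erroring on) null-content chunks matters because providers differ in how many such bookkeeping chunks they send per response.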

Issues

N/A

Type of change

List the type of change like below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)
  • Others (enhancement, documentation, validation, etc.)

Dependencies

N/A

Tests

  • OpenRouter.ai: anthropic/claude-3.7-sonnet
  • Denvr: meta-llama/Llama-3.1-70B-Instruct
  • Hugging Face Inference Endpoint: microsoft/phi-4
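All three providers above expose the same /chat/completions contract, so a single request builder can be reused to smoke-test each endpoint. The helper below is an illustrative sketch (names and defaults are assumptions, not code from this PR); it returns the pieces of the request so they can be inspected or sent with any HTTP client:

```python
import json

def build_chat_request(base_url, api_key, model, prompt):
    """Build an OpenAI-compatible /chat/completions request.

    Returns (url, headers, body) for use with any HTTP client.
    """
    url = base_url.rstrip("/") + "/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # ChatQnA consumes the streamed SSE response.
        "stream": True,
    })
    return url, headers, body
```

Swapping only `base_url` and `model` is what lets the same ChatQnA deployment run against OpenRouter.ai, Denvr, or a Hugging Face Inference Endpoint.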

edlee123 and others added 30 commits June 24, 2025 18:08
…w null json. Also improved exception handling and logging

Signed-off-by: Ed Lee <[email protected]>
Integrate MultimodalQnA set_env to ut scripts.
Add README.md for UT scripts.

Signed-off-by: ZePan110 <[email protected]>
Signed-off-by: Ed Lee <[email protected]>
…nt (opea-project#1996)

Signed-off-by: Mustafa <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Ed Lee <[email protected]>
Signed-off-by: Yi Yao <[email protected]>
Co-authored-by: Copilot <[email protected]>
Signed-off-by: Ed Lee <[email protected]>
Signed-off-by: ZePan110 <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Ed Lee <[email protected]>
…archQnA and Translation (opea-project#2038)

update secrets token name for ProductivitySuite, RerankFinetuning, SearchQnA and Translation
Fix shellcheck issue

Signed-off-by: ZePan110 <[email protected]>
Signed-off-by: Ed Lee <[email protected]>
…rkflowExecAgent (opea-project#2039)

update secrets token name for InstructionTuning, MultimodalQnA and WorkflowExecAgent
Fix shellcheck issue

Signed-off-by: ZePan110 <[email protected]>
Signed-off-by: Ed Lee <[email protected]>
@Copilot Copilot AI review requested due to automatic review settings June 24, 2025 23:34

github-actions bot commented Jun 24, 2025

Dependency Review

✅ No vulnerabilities or license issues found.

Scanned Files

None

Copilot AI left a comment


Pull Request Overview

This pull request introduces an OpenAI-compatible endpoint for ChatQnA, updates the deployment documentation, and includes improvements in error handling and logging.

  • Added new Docker Compose file (compose_endpoint_openai.yaml) to support OpenAI-like endpoints.
  • Updated README files for clearer deployment instructions and configuration details.
  • Fixed the align_generator function in chatqna.py to better handle and filter null content chunks.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

  • CodeGen/docker_compose/intel/cpu/xeon/README.md — Updated docker compose command and environment variable documentation; notes a markdown table formatting issue.
  • ChatQnA/docker_compose/intel/cpu/xeon/compose_endpoint_openai.yaml — Added new compose file for OpenAI-compatible endpoint integration.
  • ChatQnA/docker_compose/intel/cpu/xeon/README_endpoint_openai.md — New documentation with detailed instructions for deploying ChatQnA using the new endpoint.
  • ChatQnA/chatqna.py — Improved logging and error handling in input/output alignment and generator functions.
Comments suppressed due to low confidence (1)

CodeGen/docker_compose/intel/cpu/xeon/README.md:111

  • The table row for LLM_ENDPOINT appears to be broken into two columns due to an unintended pipe character. Please merge the content into a single cell to ensure the URL displays correctly.
| `LLM_ENDPOINT`                          | Internal URL for the LLM serving endpoint (used by `codegen-llm-server`). Configured in `compose.yaml`.             | `http://codegen-vllm                           | tgi-server:9000/v1/chat/completions` |

@edlee123 edlee123 requested a review from letonghan July 2, 2025 05:09
@edlee123
Copy link
Contributor Author

edlee123 commented Jul 2, 2025

Hi @yao531441 @letonghan if either of you can, I'm looking for one more reviewer please :)

@joshuayao joshuayao added this to the v1.4 milestone Aug 14, 2025
@joshuayao joshuayao added this to OPEA Aug 14, 2025
@joshuayao joshuayao moved this to In progress in OPEA Aug 14, 2025
@edlee123 edlee123 changed the title ChatQnA Example with OpenAI-Compatible Endpoint Bump: ChatQnA Example with OpenAI-Compatible Endpoint Aug 26, 2025