feat: add vLLM backend integration #201

Merged

LarFii merged 2 commits into HKUDS:main from sotastack:feature/vllm-integration on Feb 20, 2026
Conversation

@teamauresta (Contributor)

  • Add examples/vllm_integration_example.py with full working example
  • Add docs/vllm_integration.md with setup guide and performance tips
  • Update env.example with vLLM configuration section

vLLM provides an OpenAI-compatible API with continuous batching, PagedAttention, and tensor parallelism for production RAG workloads.
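Because vLLM exposes an OpenAI-compatible API, a client only needs a base URL, an API key, and a model name. The sketch below is illustrative, not code from this PR; the environment variable names and defaults are hypothetical stand-ins for whatever the updated env.example actually defines.

```python
import os

# Hypothetical helper: resolve vLLM endpoint settings from the environment.
# Variable names (VLLM_API_BASE, VLLM_API_KEY, VLLM_MODEL) are illustrative,
# not necessarily those added to env.example in this PR.
def vllm_openai_config(env=None):
    env = os.environ if env is None else env
    return {
        "base_url": env.get("VLLM_API_BASE", "http://localhost:8000/v1"),
        # vLLM does not validate keys unless the server is started with --api-key
        "api_key": env.get("VLLM_API_KEY", "EMPTY"),
        "model": env.get("VLLM_MODEL", "Qwen/Qwen2.5-7B-Instruct"),
    }

cfg = vllm_openai_config(env={})
print(cfg["base_url"])  # http://localhost:8000/v1
```

Any OpenAI-style client can then be pointed at `base_url` unchanged, which is what makes vLLM a drop-in backend for RAG pipelines that already speak the OpenAI protocol.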

Description

[Briefly describe the changes made in this pull request.]

Related Issues

[Reference any related issues or tasks addressed by this pull request.]

Changes Made

[List the specific changes made in this pull request.]

Checklist

  • Changes tested locally
  • Code reviewed
  • Documentation updated (if necessary)
  • Unit tests added (if applicable)

Additional Notes

[Add any additional notes or context for the reviewer(s).]

@LarFii (Collaborator) commented Feb 18, 2026

@cursor review

@LarFii (Collaborator) commented Feb 18, 2026

@codex review

@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 47ce1635db

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

```python
def embedding_func_factory(self):
    """Create a completely serializable embedding function."""
    return EmbeddingFunc(
        embedding_dim=1024,  # bge-m3 default dimension
```


P2: Make embedding dimension configurable for chosen model

The embedding wrapper hardcodes embedding_dim=1024 even though the model is read from environment variables, so switching to any non-1024 embedding model (including the guide's own OpenAI embedding option) will initialize LightRAG with the wrong vector size and can fail during insert/query with dimension mismatch errors. Read the dimension from EMBEDDING_DIM (or model metadata) instead of fixing it to 1024.

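The suggested fix can be sketched as reading the dimension from the environment with a validated fallback. The variable name `EMBEDDING_DIM` comes from the review comment; the default of 1024 matches the bge-m3 dimension noted in the diff, and the helper's name is illustrative.

```python
import os

# Sketch of the suggested fix: resolve the embedding dimension from the
# EMBEDDING_DIM environment variable instead of hardcoding 1024.
def resolve_embedding_dim(env=None, default=1024):
    env = os.environ if env is None else env
    raw = env.get("EMBEDDING_DIM")
    if raw is None:
        return default  # fall back to the bge-m3 dimension
    dim = int(raw)
    if dim <= 0:
        raise ValueError(f"EMBEDDING_DIM must be a positive integer, got {raw!r}")
    return dim

print(resolve_embedding_dim(env={"EMBEDDING_DIM": "1536"}))  # 1536
```

The factory in the diff would then pass `embedding_dim=resolve_embedding_dim()` to `EmbeddingFunc`, so switching to a non-1024 model only requires changing one environment variable.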

```markdown
## Prerequisites

1. **NVIDIA GPU(s)** with CUDA support (compute capability ≥ 7.0)
2. **Python 3.9+**
```


P2: Correct Python prerequisite to 3.10+

This guide says Python 3.9+ is supported, but the package metadata requires Python >=3.10 (pyproject.toml), so users on 3.9 following this setup will fail at installation before they can run the integration. Updating this prerequisite avoids a broken onboarding path.

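Since the package metadata declares `requires-python = ">=3.10"`, the example script could also fail fast at runtime rather than letting a 3.9 user hit an opaque install or import error. A minimal sketch, with an illustrative function name:

```python
import sys

# Sketch of a fail-fast version guard matching requires-python = ">=3.10".
def check_python(minimum=(3, 10), current=None):
    current = tuple(sys.version_info[:2]) if current is None else current
    if current < minimum:
        raise RuntimeError(
            "This integration requires Python %d.%d+, found %d.%d"
            % (minimum + current)
        )
    return True

check_python(current=(3, 11))  # passes silently
```

Calling `check_python()` at the top of the example surfaces the mismatch with a clear message instead of a setup-time failure.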

@LarFii (Collaborator) commented Feb 20, 2026

@codex review

@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f4384b9b38


```markdown
## Prerequisites

1. **NVIDIA GPU(s)** with CUDA support (compute capability ≥ 7.0)
2. **Python 3.9+**
```


P2: Raise documented Python minimum to 3.10

The new guide states Python 3.9+, but this repository declares requires-python = ">=3.10" in pyproject.toml, so users following this doc on Python 3.9 will fail during installation before they can run the vLLM example. Please align the prerequisite here with the actual package requirement to avoid a broken setup path.


@LarFii LarFii merged commit 20164f6 into HKUDS:main Feb 20, 2026
1 check passed
@teamauresta teamauresta deleted the feature/vllm-integration branch February 23, 2026 03:36

2 participants