hlahkar commented Jan 2, 2026

No description provided.

Signed-off-by: Himangshu Lahkar <hlahkar@habana.ai>

Copilot AI (Contributor) left a comment

Pull request overview

This PR adds support for the GPT OSS model type. It covers model-specific expert routing, bias support in MoE layers, and attention sinks in the attention backends and ops.

  • Adds GPT OSS-specific expert routing and softmax handling in the MoE forward pass
  • Implements bias support throughout the MoE pipeline
  • Introduces attention sink functionality across attention backends and operations
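
The routing and bias bullets above can be illustrated with a small sketch. It assumes that "GPT OSS-specific routing" means selecting the top-k experts by router logit first and then applying softmax over only the selected logits, and that each expert carries an additive bias. The PR's diff is not shown here, so this ordering, the function names, and the data layout are all hypothetical. It is plain Python for readability, not the HPU kernels the PR touches.

```python
import math

def route_topk_then_softmax(router_logits, top_k):
    """Pick the top_k experts by logit, then softmax over only those logits.

    Assumption: this ordering is a guess at the model-specific routing;
    it is not taken from the PR's code.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:top_k]
    peak = max(router_logits[i] for i in chosen)  # subtract max for stability
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return chosen, [e / total for e in exps]

def moe_forward(x, router_logits, expert_weights, expert_biases, top_k):
    """Routing-weighted sum of (W_e @ x + b_e) over the selected experts.

    expert_weights[e] is a list of rows; expert_biases[e] is one bias per row.
    Both are hypothetical stand-ins for the real expert parameters.
    """
    chosen, weights = route_topk_then_softmax(router_logits, top_k)
    out = [0.0] * len(expert_biases[0])
    for e, w in zip(chosen, weights):
        W, b = expert_weights[e], expert_biases[e]
        for row in range(len(b)):
            y = sum(W[row][col] * x[col] for col in range(len(x))) + b[row]
            out[row] += w * y
    return out
```

Because the softmax runs after selection, the routing weights of the chosen experts always sum to 1, whatever the logits of the unselected experts are.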

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| vllm_gaudi/v1/worker/hpu_model_runner.py | Increases sliding window block size calculation by 1 |
| vllm_gaudi/ops/hpu_fused_moe.py | Adds GPT OSS model type detection, bias handling in MoE operations, and model-specific expert routing |
| vllm_gaudi/extension/utils.py | Adds sinks parameter support to forward pass |
| vllm_gaudi/extension/ops.py | Implements sink attention mechanism in pipelined and naive attention functions, adds bias support to MoE operations |
| vllm_gaudi/attention/backends/hpu_attn.py | Adds sinks parameter and dtype consistency checks in attention implementation |
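
For readers unfamiliar with the sinks parameter mentioned in the summary, the sketch below shows one common formulation of an attention sink: an extra per-head logit that takes part in the softmax denominator but has no value vector, so the weights over real tokens can sum to less than 1. Whether `vllm_gaudi/extension/ops.py` implements exactly this is an assumption, and the function names here are hypothetical. The example handles a single query and a single head in plain Python.

```python
import math

def softmax_with_sink(scores, sink_logit):
    """Softmax over the scores plus one sink logit; the sink's share is dropped.

    Assumption: the sink only absorbs probability mass and contributes no
    value vector, so the returned weights may sum to less than 1.
    """
    peak = max(max(scores), sink_logit)  # subtract max for stability
    exps = [math.exp(s - peak) for s in scores]
    denom = sum(exps) + math.exp(sink_logit - peak)
    return [e / denom for e in exps]

def attend(query, keys, values, sink_logit):
    """Scaled dot-product attention for one query and one head, with a sink."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax_with_sink(scores, sink_logit)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
```

With a very negative sink logit the sink term vanishes and the function reduces to ordinary softmax attention, which is a convenient sanity check when comparing against a backend without sinks.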

hlahkar mentioned this pull request Jan 2, 2026
Signed-off-by: Himangshu Lahkar <hlahkar@habana.ai>

github-actions commented

🚧 CI Blocked

The main CI workflow was not started for the following reason:

Your branch is behind the base branch. Please merge or rebase to get the latest changes.

wpyszka (Collaborator) commented Jan 19, 2026

/run-gaudi-tests

wpyszka and others added 3 commits January 19, 2026 13:29
Signed-off-by: Himangshu Lahkar <49579433+hlahkar@users.noreply.github.com>
Signed-off-by: Himangshu Lahkar <hlahkar@habana.ai>

hlahkar (Author) commented Jan 22, 2026

/run-gaudi-tests

sys-hab-pt-service (Collaborator) commented

Only codeowners and testowners can request to run Gaudi tests. Contact list: kzawora-intel, xuechendi, adobrzyn, mgawarkiewicz-intel, afierka-intel, michalkuligowski, iboiko-habana, kamil-kaczor, ksmusz, PatrykWo, kamil-kaczor, kfojcik-intel, ksmusz, wuxun-zhang, xuechendi, attafosu, ulivne, Kacper-Pietkun, iboiko-habana, jkaniecki, jbyczkow, wpyszka

hlahkar and others added 3 commits January 23, 2026 06:07
Signed-off-by: Himangshu Lahkar <hlahkar@habana.ai>

github-actions commented

🚧 CI Blocked

The main CI workflow was not started for the following reason:

Your branch is behind the base branch. Please merge or rebase to get the latest changes.

wpyszka (Collaborator) commented Jan 23, 2026

/run-gaudi-tests

iboiko-habana and others added 2 commits January 23, 2026 15:42
Signed-off-by: Himangshu Lahkar <hlahkar@habana.ai>
4 participants