
Tsingmicro add txda backend to vllm-fl plugin.#52

Open
tsingmicro-public-e wants to merge 6 commits into flagos-ai:main from tsingmicro-public-e:main

Conversation

@tsingmicro-public-e

PR Category

PR Types

PR Description

@CLAassistant

CLAassistant commented Feb 16, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ tsingmicro-public-e
❌ ruanchunyang


ruanchunyang seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

Contributor

Copilot AI left a comment


Pull request overview

This PR introduces an initial “txda” vendor backend into the vllm-plugin-FL dispatch system and enables static-graph mode for the txda device type.

Changes:

  • Extend static-graph support gating to include txda.
  • Add a new TxdaBackend backend class with operator entrypoints (silu_and_mul, rms_norm, rotary_embedding, attention_backend).
  • Add the txda vendor package initializer.
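To make the change list concrete, here is a minimal, hypothetical sketch of the shape a vendor backend like TxdaBackend could take. The operator names (silu_and_mul, rms_norm, rotary_embedding, attention_backend) and the vendor/priority idea come from this PR; the `OpRegistry` and `BackendPriority` classes below are stand-ins, not the real vllm-fl dispatch API.

```python
# Hypothetical sketch only; the real vllm-fl registry and priority
# types may differ.  Operator names are taken from this PR.
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Callable, Dict, Tuple


class BackendPriority(IntEnum):
    """Higher values win when several backends provide the same op."""
    DEFAULT = 0
    VENDOR = 10


@dataclass
class OpRegistry:
    """Maps operator names to (priority, implementation) pairs."""
    _ops: Dict[str, Tuple[BackendPriority, Callable]] = field(default_factory=dict)

    def register(self, name: str, fn: Callable, priority: BackendPriority) -> None:
        # Keep the highest-priority implementation seen so far.
        current = self._ops.get(name)
        if current is None or priority >= current[0]:
            self._ops[name] = (priority, fn)

    def resolve(self, name: str) -> Callable:
        return self._ops[name][1]


class TxdaBackend:
    """Vendor backend exposing the entrypoints named in this PR."""
    vendor = "txda"

    def silu_and_mul(self, x):
        raise NotImplementedError("dispatch to the txda kernel here")

    def rms_norm(self, x, weight, eps):
        raise NotImplementedError

    def rotary_embedding(self, positions, q, k):
        raise NotImplementedError

    def attention_backend(self):
        raise NotImplementedError

    def register_builtins(self, registry: OpRegistry) -> None:
        for name in ("silu_and_mul", "rms_norm", "rotary_embedding"):
            registry.register(name, getattr(self, name), BackendPriority.VENDOR)


registry = OpRegistry()
TxdaBackend().register_builtins(registry)
print(sorted(registry._ops))  # the three registered operator names
```

With this layout, a vendor package only has to subclass the backend and call `register_builtins` once at import time; dispatch then resolves each op by name and priority.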

Reviewed changes

Copilot reviewed 1 out of 3 changed files in this pull request and generated no comments.

vllm_fl/platform.py: Adds txda to the static-graph enablement allowlist.
vllm_fl/dispatch/backends/vendor/txda/txda.py: Introduces the Txda backend class and operator/attention backend routing.
vllm_fl/dispatch/backends/vendor/txda/__init__.py: Exposes the new backend from the txda vendor package.


# fn=_bind_is_available(backend.rotary_embedding, is_avail),
# vendor="txda",
# priority=BackendPriority.VENDOR,
# ),

Why are we dropping the other implementations?


@tengqm tengqm left a comment


Please label this PR with [WIP] if it is still a work in progress,
in case it gets merged accidentally.


from .txda import TxdaBackend

__all__ = ["TxadBackend"]

TxdaBackend typo

# Copyright (c) 2026 BAAI. All rights reserved.

"""
METAX backend operator registrations.

Suggested change
METAX backend operator registrations.
TsingMicro backend operator registrations.

"""
METAX backend operator registrations.

This module registers all VENDOR (METAX) implementations.

Suggested change
This module registers all VENDOR (METAX) implementations.
This module registers all VENDOR (TsingMicro) implementations.


def register_builtins(registry) -> None:
"""
Register all METAX (VENDOR) operator implementations.

Suggested change
Register all METAX (VENDOR) operator implementations.
Register all TsingMicro (VENDOR) operator implementations.


import torch

# from vllm_fl.dispatch.backends.flaggems import FlagGemsBackend

Remove this line.

# Check for torch_npu (Txda PyTorch extension)
import torch_txda

# Check if NPU device is available

Suggested change
# Check if NPU device is available
# Check if TsingMicro txda device is available

if torch.txda.is_available() and torch.txda.device_count() > 0:
TxdaBackend._available = True
else:
TxdaBackend._available = False

It is weird that we change class properties in an instance method.
You may want to annotate this method with @classmethod.
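A minimal sketch of the reviewer's suggestion, assuming the availability probe boils down to an import-and-device check: mutating the class attribute through `cls` in a `@classmethod` keeps the cached flag on the class rather than shadowing it per-instance. The `importlib` probe below stands in for the real `import torch_txda` / `torch.txda.device_count()` check.

```python
# Sketch only: @classmethod caches availability on the class itself.
# importlib.util.find_spec stands in for the real torch_txda import
# and device-count check from this PR.
import importlib.util
from typing import Optional


class TxdaBackend:
    _available: Optional[bool] = None

    @classmethod
    def is_available(cls) -> bool:
        if cls._available is None:
            # Real code would also require torch.txda.device_count() > 0.
            cls._available = importlib.util.find_spec("torch_txda") is not None
        return cls._available


print(TxdaBackend.is_available())  # False unless torch_txda is installed
```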

flagcx_path = os.getenv('FLAGCX_PATH')
library_path=os.path.join(flagcx_path, "build/lib/libflagcx.so")
#library_path=os.path.join(flagcx_path, "libflagcx.so") # rcy fix
library_path= "/usr/local/kuiper/lib/libflagcx.so"

We are using a hard-coded path rather than the environment variable?
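One way to address this, keeping the paths that appear in the diff: prefer FLAGCX_PATH when it is set and points at a real library, and only then fall back to the hard-coded location. This is a sketch, not the PR's code; the fallback path is the one hard-coded in the diff.

```python
# Sketch: honour FLAGCX_PATH first, fall back to the hard-coded
# location from the PR only when the env var is unset or stale.
import os


def find_flagcx_library(default: str = "/usr/local/kuiper/lib/libflagcx.so") -> str:
    flagcx_path = os.getenv("FLAGCX_PATH")
    if flagcx_path:
        candidate = os.path.join(flagcx_path, "build/lib/libflagcx.so")
        if os.path.exists(candidate):
            return candidate
    return default


print(find_flagcx_library())
```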

from vllm_fl.utils import use_flaggems_op

if use_flaggems_op("fused_recurrent_gated_delta_rule_fwd"):
if True:

Why this change?
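The reviewer's concern, sketched: replacing the guard with `if True:` removes the switch that lets configuration decide whether the FlagGems op is used. `use_flaggems_op` is real (imported from vllm_fl.utils in the diff), but the implementation and the `FLAGGEMS_DISABLED_OPS` variable below are hypothetical stand-ins to show what the guard buys you.

```python
# Sketch only: use_flaggems_op here is a stand-in for
# vllm_fl.utils.use_flaggems_op, and FLAGGEMS_DISABLED_OPS is a
# hypothetical switch, not a real vllm-fl setting.
import os


def use_flaggems_op(name: str) -> bool:
    disabled = os.getenv("FLAGGEMS_DISABLED_OPS", "").split(",")
    return name not in disabled


# With the guard, the FlagGems path can still be turned off externally;
# `if True:` hard-wires it on.
if use_flaggems_op("fused_recurrent_gated_delta_rule_fwd"):
    chosen = "flaggems"
else:
    chosen = "fallback"
print(chosen)
```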

ray_device_key: str = "GPU"
ray_device_key: str = "flagos"
dist_backend: str = "flagcx" if "FLAGCX_PATH" in os.environ else "nccl"
device_control_env_var: str = "TXDA_VISIBLE_DEVICES"

This is not a platform specific class ...
This property should not live here?


from .txda import TxdaBackend

__all__ = ["TxadBackend"]

Check failure (Code scanning / CodeQL): Explicit export is not defined. The name 'TxadBackend' is exported by __all__ but is not defined.
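This class of typo is easy to catch at runtime too: every name in `__all__` must actually exist in the module namespace. The sketch below reproduces the misspelled export from the diff in a throwaway module and checks it; the module name is made up for the demo.

```python
# Sketch: reproduce the PR's __all__ typo in a scratch module and
# verify that every exported name is actually defined.
import types

mod = types.ModuleType("txda_pkg_demo")  # hypothetical module name
exec(
    "class TxdaBackend: pass\n"
    "__all__ = ['TxadBackend']  # typo reproduced from the PR\n",
    mod.__dict__,
)

missing = [name for name in mod.__all__ if not hasattr(mod, name)]
print(missing)  # the misspelled export shows up here
```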

from __future__ import annotations

from typing import Optional, Union

Check notice (Code scanning / CodeQL): Unused import. Import of 'Union' is not used.
flagcx_stream)
self.flagcx.adaptor_stream_free(flagcx_stream)
if change_type:
in_tensor = in_tensor.to(torch.bfloat16)

Check notice (Code scanning / CodeQL): Unused local variable. Variable in_tensor is not used.
Comment on lines 261 to 262

Check notice (Code scanning / CodeQL): Commented-out code. This comment appears to contain commented-out code.
fused_recurrent_gated_delta_rule_fwd,
)

from flag_gems.fused.FLA.utils import input_guard

Check notice (Code scanning / CodeQL): Unused import. Import of 'input_guard' is not used.
Comment on lines +387 to +396
# if self.init_snapshot.free_memory < self.requested_memory:
# GiB = lambda b: round(b / GiB_bytes, 2)
# raise ValueError(
# f"Free memory on device "
# f"({GiB(self.init_snapshot.free_memory)}/"
# f"{GiB(self.init_snapshot.total_memory)} GiB) on startup "
# f"is less than desired GPU memory utilization "
# f"({self.cache_config.gpu_memory_utilization}, "
# f"{GiB(self.requested_memory)} GiB). Decrease GPU memory "
# f"utilization or reduce GPU memory used by other processes."

Check notice (Code scanning / CodeQL): Commented-out code. This comment appears to contain commented-out code.
torch_device_fn = device_info.torch_device_fn
vendor_name = device_info.vendor_name
ray_device_key: str = "GPU"
ray_device_key: str = "flagos"
Collaborator


why modify here?


def is_cuda_alike(self) -> bool:
"""Stateless version of [torch.cuda.is_available][]."""
if self.vendor_name == "iluvatar":
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why remove here?

