Tsingmicro add txda backend to vllm-fl plugin #52

tsingmicro-public-e wants to merge 6 commits into flagos-ai:main from
Conversation
ruanchunyang seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. If you have signed the CLA already but the status is still pending, let us recheck it.
Pull request overview
This PR introduces an initial “txda” vendor backend into the vllm-plugin-FL dispatch system and enables static-graph mode for the txda device type.
Changes:
- Extend static-graph support gating to include `txda`.
- Add a new `TxdaBackend` backend class with operator entrypoints (`silu_and_mul`, `rms_norm`, `rotary_embedding`, `attention_backend`).
- Add the `txda` vendor package initializer.
Reviewed changes
Copilot reviewed 1 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| `vllm_fl/platform.py` | Adds txda to the static-graph enablement allowlist. |
| `vllm_fl/dispatch/backends/vendor/txda/txda.py` | Introduces the Txda backend class and operator/attention backend routing. |
| `vllm_fl/dispatch/backends/vendor/txda/__init__.py` | Exposes the new backend from the txda vendor package. |
```python
# fn=_bind_is_available(backend.rotary_embedding, is_avail),
# vendor="txda",
# priority=BackendPriority.VENDOR,
# ),
```
Why are we dropping the other implementations?
tengqm left a comment:
Please label this PR with [WIP] if it is still a work in progress,
in case it gets merged accidentally.
```python
from .txda import TxdaBackend

__all__ = ["TxadBackend"]
```
```python
# Copyright (c) 2026 BAAI. All rights reserved.

"""
METAX backend operator registrations.
"""
```

Suggested change:

```diff
-METAX backend operator registrations.
+TsingMicro backend operator registrations.
```

Suggested change:

```diff
-This module registers all VENDOR (METAX) implementations.
+This module registers all VENDOR (TsingMicro) implementations.
```

Suggested change:

```diff
 def register_builtins(registry) -> None:
     """
-    Register all METAX (VENDOR) operator implementations.
+    Register all TsingMicro (VENDOR) operator implementations.
```
```python
import torch

# from vllm_fl.dispatch.backends.flaggems import FlagGemsBackend
```
```python
# Check for torch_npu (Txda PyTorch extension)
import torch_txda

# Check if NPU device is available
if torch.txda.is_available() and torch.txda.device_count() > 0:
    TxdaBackend._available = True
else:
    TxdaBackend._available = False
```

Suggested change:

```diff
-# Check if NPU device is available
+# Check if TsingMicro txda device is available
```
It is weird that we change class properties in an instance method.
You may want to annotate this method with @classmethod.
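A minimal sketch of the reviewer's suggestion: making the availability probe a `@classmethod` states explicitly that the class-level `_available` flag is being set, rather than shadowing it with an instance attribute. The `_probe_device` helper here is a hypothetical stand-in for the real `torch.txda.is_available()` / `device_count()` check, which only exists in the vendor environment.

```python
class TxdaBackend:
    # Class-level availability flag shared by all instances.
    _available: bool = False

    @staticmethod
    def _probe_device() -> bool:
        # Hypothetical stand-in for torch.txda.is_available()
        # and torch.txda.device_count() > 0.
        return True

    @classmethod
    def detect_availability(cls) -> None:
        # cls._available updates the class attribute directly, so the
        # result is visible without constructing an instance.
        cls._available = cls._probe_device()


TxdaBackend.detect_availability()
print(TxdaBackend._available)
```

With the `@classmethod` form, callers can run detection once at import time and every instance observes the same flag.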
```python
flagcx_path = os.getenv('FLAGCX_PATH')
library_path = os.path.join(flagcx_path, "build/lib/libflagcx.so")
# library_path = os.path.join(flagcx_path, "libflagcx.so")  # rcy fix
library_path = "/usr/local/kuiper/lib/libflagcx.so"
```
Why are we using a hard-coded path rather than the environment variable?
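A hedged sketch of what the reviewer is asking for: resolve the library from `FLAGCX_PATH` when it is set, and only fall back to a packaged default otherwise, instead of unconditionally overwriting the path. The fallback location is taken from the diff above; the `build/lib` layout under `FLAGCX_PATH` is an assumption.

```python
import os

# Fallback taken from the hard-coded path in the diff; treated here as a
# default, not an unconditional override.
_DEFAULT_LIBFLAGCX = "/usr/local/kuiper/lib/libflagcx.so"


def resolve_flagcx_library() -> str:
    """Prefer FLAGCX_PATH when set; otherwise use the packaged default."""
    flagcx_path = os.getenv("FLAGCX_PATH")
    if flagcx_path:
        candidate = os.path.join(flagcx_path, "build/lib/libflagcx.so")
        if os.path.isfile(candidate):
            return candidate
    return _DEFAULT_LIBFLAGCX


os.environ.pop("FLAGCX_PATH", None)  # demo: env var unset -> fallback
print(resolve_flagcx_library())
```

This also avoids the `TypeError` the current code would raise in `os.path.join` when `FLAGCX_PATH` is unset and `os.getenv` returns `None`.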
```python
from vllm_fl.utils import use_flaggems_op
```

```diff
-if use_flaggems_op("fused_recurrent_gated_delta_rule_fwd"):
+if True:
```
```diff
-ray_device_key: str = "GPU"
+ray_device_key: str = "flagos"
 dist_backend: str = "flagcx" if "FLAGCX_PATH" in os.environ else "nccl"
 device_control_env_var: str = "TXDA_VISIBLE_DEVICES"
```
This is not a platform-specific class, so this property should not live here?
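One way to address this, sketched with hypothetical class names (the actual vllm_fl class hierarchy may differ): keep vendor-neutral defaults on the shared platform class and override `device_control_env_var` in a txda-specific subclass, so the vendor detail lives with the vendor.

```python
class FlagosPlatform:
    # Vendor-neutral defaults live on the shared base class.
    ray_device_key: str = "flagos"
    device_control_env_var: str = "CUDA_VISIBLE_DEVICES"


class TxdaPlatform(FlagosPlatform):
    # The vendor-specific env var is an override on the vendor platform,
    # not a field on the shared class.
    device_control_env_var: str = "TXDA_VISIBLE_DEVICES"


print(TxdaPlatform.device_control_env_var)
print(FlagosPlatform.device_control_env_var)
```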
```python
from .txda import TxdaBackend

__all__ = ["TxadBackend"]
```

Check failure — Code scanning / CodeQL: Explicit export is not defined (Error, library)
```python
from __future__ import annotations

from typing import Optional, Union
```

Check notice — Code scanning / CodeQL: Unused import (Note, library)
```python
    flagcx_stream)
self.flagcx.adaptor_stream_free(flagcx_stream)
if change_type:
    in_tensor = in_tensor.to(torch.bfloat16)
```

Check notice — Code scanning / CodeQL: Unused local variable (Note)
`vllm_fl/models/qwen3_5.py` (Outdated)

Check notice — Code scanning / CodeQL: Commented-out code (Note)
```python
    fused_recurrent_gated_delta_rule_fwd,
)

from flag_gems.fused.FLA.utils import input_guard
```

Check notice — Code scanning / CodeQL: Unused import (Note)
```python
# if self.init_snapshot.free_memory < self.requested_memory:
#     GiB = lambda b: round(b / GiB_bytes, 2)
#     raise ValueError(
#         f"Free memory on device "
#         f"({GiB(self.init_snapshot.free_memory)}/"
#         f"{GiB(self.init_snapshot.total_memory)} GiB) on startup "
#         f"is less than desired GPU memory utilization "
#         f"({self.cache_config.gpu_memory_utilization}, "
#         f"{GiB(self.requested_memory)} GiB). Decrease GPU memory "
#         f"utilization or reduce GPU memory used by other processes."
```

Check notice — Code scanning / CodeQL: Commented-out code (Note)
```python
torch_device_fn = device_info.torch_device_fn
vendor_name = device_info.vendor_name
```

```diff
-ray_device_key: str = "GPU"
+ray_device_key: str = "flagos"
```

```python
def is_cuda_alike(self) -> bool:
    """Stateless version of [torch.cuda.is_available][]."""
    if self.vendor_name == "iluvatar":
```