[Cambricon] Optimize 8 ops with larger BLOCK_SIZE kernels #2218

Open
Ankaluoer wants to merge 2 commits into flagos-ai:master from Ankaluoer:cambricon-opt

Conversation

@Ankaluoer Ankaluoer commented Apr 2, 2026

PR Category

Operator

Type of Change

Performance Optimization

Description

| Operator | Avg Speedup (Before) | Avg Speedup (After) | Improvement |
|---|---|---|---|
| abs_ | 0.735 | 0.967 | +31.6% |
| neg_ | 0.685 | 0.953 | +39.1% |
| ceil_ | 0.797 | 0.872 | +9.4% |
| relu_ | 0.755 | 0.919 | +21.7% |
| threshold (new) | 0.520 | 0.889 | +71.2% |
| dropout | 0.562 | 0.589 | +4.9% |
| logical_and_ | 0.595 | 0.867 | +45.9% |
| logical_or_ | 0.595 | 0.820 | +37.7% |
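
For readers checking the Improvement column, the sketch below recomputes it from the rounded averages in the table, assuming the column is the relative change `after / before - 1`. That formula is an assumption, not something the PR states, and `relative_improvement` is a hypothetical helper name. Recomputed values land within a few tenths of a percentage point of the listed ones; the small gaps are probably because the listed figures were derived from unrounded averages.

```python
# Averages copied from the table above: (before, after, listed improvement in %).
rows = {
    "abs_": (0.735, 0.967, 31.6),
    "neg_": (0.685, 0.953, 39.1),
    "ceil_": (0.797, 0.872, 9.4),
    "relu_": (0.755, 0.919, 21.7),
    "threshold": (0.520, 0.889, 71.2),
    "dropout": (0.562, 0.589, 4.9),
    "logical_and_": (0.595, 0.867, 45.9),
    "logical_or_": (0.595, 0.820, 37.7),
}


def relative_improvement(before: float, after: float) -> float:
    """Percentage change of the average speedup relative to the 'before' value.

    Assumed formula; the PR does not state how the column was computed.
    """
    return (after / before - 1.0) * 100.0


# Largest gap between the recomputed and the listed improvement, in percentage points.
max_gap = max(
    abs(relative_improvement(before, after) - listed)
    for before, after, listed in rows.values()
)

for op, (before, after, listed) in rows.items():
    computed = relative_improvement(before, after)
    print(f"{op:13s} computed {computed:+6.1f}%   listed {listed:+6.1f}%")
print(f"largest gap: {max_gap:.2f} percentage points")
```

Note that every "after" average in the table is still below 1.0.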

Issue

Progress

  • Change is properly reviewed (1 reviewer required, 2 recommended).
  • Change responds to an issue.
  • Change is fully covered by a UT.

Performance

All benchmarks were run on an MLU590-M9DE with the standard FlagGems benchmark suite.

@huangyiqun huangyiqun self-assigned this Apr 3, 2026