[Cambricon] Optimize 8 ops with larger BLOCK_SIZE kernels#2218

Merged
huangyiqun merged 4 commits into flagos-ai:master from Ankaluoer:cambricon-opt on Apr 10, 2026

@Ankaluoer Ankaluoer commented Apr 2, 2026

PR Category

Operator

Type of Change

Performance Optimization

Description

| Operator | Avg Speedup (Before) | Avg Speedup (After) | Improvement |
|---|---|---|---|
| abs_ | 0.735 | 0.967 | +31.6% |
| neg_ | 0.685 | 0.953 | +39.1% |
| ceil_ | 0.797 | 0.872 | +9.4% |
| relu_ | 0.755 | 0.919 | +21.7% |
| threshold (new) | 0.520 | 0.889 | +71.2% |
| dropout | 0.562 | 0.589 | +4.9% |
| logical_and_ | 0.595 | 0.867 | +45.9% |
| logical_or_ | 0.595 | 0.820 | +37.7% |
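
For readers who want to check the Improvement column, the sketch below recomputes it as the relative change between the two average speedups. It assumes the column is defined as (after - before) / before; the PR does not state the formula, and the function name `relative_improvement` is made up for this example. Because the table lists rounded averages, the recomputed percentages may differ from the reported ones by a few tenths of a point.

```python
# Rounded averages copied from the table above: operator -> (before, after).
SPEEDUPS = {
    "abs_": (0.735, 0.967),
    "neg_": (0.685, 0.953),
    "ceil_": (0.797, 0.872),
    "relu_": (0.755, 0.919),
    "threshold": (0.520, 0.889),
    "dropout": (0.562, 0.589),
    "logical_and_": (0.595, 0.867),
    "logical_or_": (0.595, 0.820),
}


def relative_improvement(before, after):
    """Percentage change of the average speedup relative to the old value.

    Assumed definition; the PR does not spell out how the column was computed.
    """
    return (after - before) / before * 100.0


if __name__ == "__main__":
    for op, (before, after) in SPEEDUPS.items():
        print(f"{op:<14}{relative_improvement(before, after):+.1f}%")
```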

Issue

Progress

  • Change is properly reviewed (1 reviewer required, 2 recommended).
  • Change responds to an issue.
  • Change is fully covered by a UT.

Performance

All benchmarks were run on an MLU590-M9DE with the FlagGems standard benchmark suite.
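
The PR title attributes the gains to larger BLOCK_SIZE kernels, but the page does not show the kernels themselves. As a rough illustration only, the pure-Python sketch below mimics how a 1-D elementwise kernel tiles its input: the launch grid has ceil(n / BLOCK_SIZE) program instances, so a larger block means fewer instances, each doing more work, while the result stays the same. The helper names (`cdiv`, `blockwise_abs`) and the block sizes 1024 and 8192 are hypothetical and are not taken from the PR.

```python
def cdiv(a, b):
    """Ceiling division, the usual way to size a 1-D launch grid."""
    return (a + b - 1) // b


def blockwise_abs(values, block_size):
    """Apply abs() one block at a time, imitating a tiled elementwise kernel."""
    n = len(values)
    out = [0] * n
    for pid in range(cdiv(n, block_size)):   # one iteration per program instance
        start = pid * block_size
        stop = min(start + block_size, n)     # stands in for the tail mask
        for i in range(start, stop):
            out[i] = abs(values[i])
    return out


if __name__ == "__main__":
    n_elements = 1 << 20
    for block_size in (1024, 8192):           # hypothetical sizes, not from the PR
        print(f"BLOCK_SIZE={block_size}: {cdiv(n_elements, block_size)} program instances")
```

Whether fewer, larger blocks actually run faster depends on the hardware and the operator, which is why the PR reports measured speedups rather than a fixed rule.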

@huangyiqun huangyiqun self-assigned this Apr 3, 2026
huangyiqun previously approved these changes Apr 7, 2026

CLAassistant commented Apr 7, 2026

CLA assistant check
All committers have signed the CLA.

@huangyiqun huangyiqun left a comment

Please sign the CLA.

@tengqm tengqm left a comment

There is a nit in the source. Please check.

Also, please don't change the file mode of .pre-commit-config.yaml.

@github-actions github-actions bot removed the tests label Apr 10, 2026
@huangyiqun huangyiqun merged commit 5ddad5f into flagos-ai:master Apr 10, 2026
18 checks passed

4 participants