Fix accuracy of max-pooling backpropagation for bfloat16 data #2386
Open
asimonov1 wants to merge 6 commits into main from asimonov/maxpool_bwd_bf16_accuracy
Commits (6):
101f13c benchdnn: pool: use zero threshold for max-pooling errors (asimonov1)
4062d62 cpu: x64: pooling: use fp32 accumulator for bf16 max-pooling backprop (asimonov1)
aafc363 cpu: x64: pooling: fix ur_bc calculation in back propagation (asimonov1)
7d0f33b cpu: x64: pooling: init scratchpad refactoring (asimonov1)
ed1ee67 cpu: x64: pooling: support acc mode in max pooling backprop for bf16 (asimonov1)
3abcafb cpu: x64: pooling: refactor to use io injector (asimonov1)
Please add support for acc_mode (a separate commit is totally fine), which would allow preserving the former behavior (accumulation in bf16) and avoid issues like the one recently reported for softmax.
Yes, this is in progress.
I added partial support for acc_mode. The jit_uni_pooling implementation of max-pooling backpropagation for bf16 data in the axb and aBx16b/aBx8b formats falls back to the old implementation (without the f32 accumulator) when the 'relaxed' or 'any' accumulation mode is specified. The available accumulation modes are 'strict', 'relaxed', 'any', 'f32', 's32', and 'f16'. The 's32' and 'f16' modes are simply ignored, while 'f32' and 'strict' enable the new implementation (the f32 accumulator is not required when the strides are larger than the corresponding kernel sizes).
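For reference, a minimal sketch of the user-facing side under the standard oneDNN 3.x C++ API (the shapes, layout, and problem below are illustrative, not taken from this PR): requesting 'relaxed' (or 'any') accumulation keeps the former bf16 accumulation path, while the default 'strict' or an explicit 'f32' selects the new f32-accumulator implementation.

```cpp
// Sketch only: a minimal bf16 max-pooling backward setup showing how the
// accumulation-mode attribute is passed. Shapes and layout are made up.
#include "oneapi/dnnl/dnnl.hpp"

using namespace dnnl;

int main() {
    engine eng(engine::kind::cpu, 0);

    // Illustrative problem: 1x16x8x8 -> 1x16x4x4, 2x2 max-pooling, stride 2,
    // bf16 data in the nhwc (axb) layout.
    memory::dims src_dims = {1, 16, 8, 8}, dst_dims = {1, 16, 4, 4};
    memory::dims kernel = {2, 2}, strides = {2, 2}, pad = {0, 0}, dilation = {0, 0};

    auto src_md = memory::desc(src_dims, memory::data_type::bf16, memory::format_tag::nhwc);
    auto dst_md = memory::desc(dst_dims, memory::data_type::bf16, memory::format_tag::nhwc);

    // The accumulation mode is a primitive attribute.
    primitive_attr attr;
    attr.set_accumulation_mode(accumulation_mode::relaxed); // or ::strict / ::f32

    // The forward primitive descriptor serves as a hint for the backward one.
    auto fwd_pd = pooling_forward::primitive_desc(eng, prop_kind::forward_training,
            algorithm::pooling_max, src_md, dst_md, strides, kernel, dilation, pad, pad);

    auto bwd_pd = pooling_backward::primitive_desc(eng, algorithm::pooling_max,
            src_md, dst_md, strides, kernel, dilation, pad, pad, fwd_pd, attr);

    auto pool_bwd = pooling_backward(bwd_pd);
    // ... bind diff_dst, workspace, and diff_src memories and execute on a stream as usual.
    return 0;
}
```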
benchdnn is updated to use a zero error threshold for max-pooling with the strict or f32 accumulation modes.
If this approach is OK, then the docs should be updated. It looks like the docs are not correct/complete (https://oneapi-src.github.io/oneDNN/dev_guide_pooling.html). E.g., for the abx format, our jit_uni_pooling implementation converts inputs/outputs to/from f32 arrays, so its accumulation mode is effectively always 'strict'. The 'relaxed' mode is not always faster than 'strict', but it uses less memory. The f64 data type can be used only on GPUs (?).
I also noticed that the GPU version seems to be out of scope (?) of MFDNN-11050 and MFDNN-11396; I did not test it.