floor mode for div tensors finished #8987

Merged
merged 24 commits into from
Jan 17, 2023

doombeaker
Contributor

@doombeaker doombeaker commented Aug 23, 2022

Add a rounding_mode keyword argument to the flow.div interface.

Origin of the issue: Oneflow-Inc/vision#242

  • div(tensor x, tensor y, rounding_mode)
  • div(scalar x, tensor y, rounding_mode)
  • div(tensor x, scalar y, rounding_mode)
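
To make the rounding behaviors concrete, here is a minimal pure-Python sketch of the per-element semantics. It assumes the new keyword follows the three modes that torch.div documents (None, "trunc", "floor"). The helper name div_reference is hypothetical and belongs to neither oneflow nor torch, and it works on plain Python numbers rather than tensors.

```python
import math


def div_reference(x, y, rounding_mode=None):
    """Hypothetical per-element reference for div with a rounding mode.

    Uses float division internally, so it is only meant for small
    illustrative values, not for large integers.
    """
    quotient = x / y
    if rounding_mode is None:
        return quotient               # true division
    if rounding_mode == "trunc":
        return math.trunc(quotient)   # round toward zero
    if rounding_mode == "floor":
        return math.floor(quotient)   # round toward negative infinity
    raise ValueError(f"unsupported rounding_mode: {rounding_mode!r}")


print(div_reference(-7, 2))                         # true division
print(div_reference(-7, 2, rounding_mode="trunc"))  # toward zero
print(div_reference(-7, 2, rounding_mode="floor"))  # toward -infinity
```

The two rounded modes only differ when the exact quotient is negative and not an integer, which is why negative inputs are the interesting test cases.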

@doombeaker doombeaker marked this pull request as ready for review October 24, 2022 09:55
Comment on lines +85 to +87
"Tensor (Tensor input, Tensor other, *, String rounding_mode=None) => DivMode",
"Tensor (Tensor input, Scalar other, *, String rounding_mode=None) => ScalarDivMode",
"Tensor (Scalar input, Tensor other, *, String rounding_mode=None) => ScalarDivMode",
Contributor Author

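Because functional::Div, functional::ScalarDiv, and similar functions are used in many places inside oneflow, the existing interfaces are left unchanged and new interfaces are added alongside them.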
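
The idea of adding new entry points next to the existing ones can be sketched in Python as follows. This is only an illustration: div_mode and scalar_div_mode are hypothetical stand-ins named after the DivMode and ScalarDivMode targets in the signatures above, plain lists stand in for tensors, and none of this is oneflow's actual functional layer.

```python
import math
from numbers import Number


def _apply_rounding(value, rounding_mode):
    # Hypothetical helper: round one quotient according to the mode.
    if rounding_mode is None:
        return value
    if rounding_mode == "trunc":
        return math.trunc(value)
    if rounding_mode == "floor":
        return math.floor(value)
    raise ValueError(f"unsupported rounding_mode: {rounding_mode!r}")


def div_mode(x, y, rounding_mode):
    # Stand-in for the (Tensor, Tensor) overload; lists play the role of tensors.
    return [_apply_rounding(a / b, rounding_mode) for a, b in zip(x, y)]


def scalar_div_mode(x, y, rounding_mode):
    # Stand-in for the (Tensor, Scalar) and (Scalar, Tensor) overloads.
    if isinstance(x, Number):
        return [_apply_rounding(x / b, rounding_mode) for b in y]
    return [_apply_rounding(a / y, rounding_mode) for a in x]


def div(input, other, *, rounding_mode=None):
    # Dispatch on argument kinds, mirroring the three signatures.
    input_is_scalar = isinstance(input, Number)
    other_is_scalar = isinstance(other, Number)
    if input_is_scalar and other_is_scalar:
        raise TypeError("at least one argument must be a tensor stand-in (list)")
    if input_is_scalar or other_is_scalar:
        return scalar_div_mode(input, other, rounding_mode)
    return div_mode(input, other, rounding_mode)


print(div([-7.0, 7.0], [2.0, 2.0], rounding_mode="floor"))
print(div([-7, 7], 2, rounding_mode="trunc"))
print(div(1, [2, 4]))
```

Callers that never pass rounding_mode are untouched by this kind of change, since the older functions keep their signatures.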

dim1 = random(low=1, high=4).to(int)
x = random_tensor(ndim=2, dim0=dim0, dim1=dim1).to(device)
y = random_tensor(ndim=2, dim0=dim0, dim1=dim1).to(device)
z = torch.div(x, y, rounding_mode="floor")
Contributor

torch.div(torch.tensor(-1), torch.tensor(2), rounding_mode="floor")
tensor(-1)

oneflow.floor_divide(oneflow.tensor(-1), oneflow.tensor(2))
tensor(0, dtype=oneflow.int64)

oneflow.floor_divide returns a wrong result when the inputs are integers with opposite signs.
This test case only covers float; an integer test could be added.
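
The outputs above are consistent with integer division that truncates toward zero instead of flooring, although the thread does not confirm the root cause. As a sketch under that assumption, the snippet below contrasts the two roundings in plain Python and shows the usual sign-based correction. trunc_div and floor_div are hypothetical helper names, not oneflow functions.

```python
def trunc_div(a, b):
    """Integer division rounding toward zero, as C and C++ do for ints."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def floor_div(a, b):
    """Floor division built on top of truncating division."""
    q = trunc_div(a, b)
    r = a - q * b  # remainder left over by the truncating quotient
    # A nonzero remainder whose sign differs from the divisor's means
    # truncation rounded up, so step down by one to reach the floor.
    if r != 0 and (r < 0) != (b < 0):
        q -= 1
    return q


print(trunc_div(-1, 2))  # truncation gives 0, matching the oneflow output above
print(floor_div(-1, 2))  # flooring gives -1, matching the torch output above
```

The correction only fires when the operands have opposite signs and the division is inexact, which matches the failure condition described in the comment.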

Contributor

Once test_div.py is complete, test_trunc_divide.py can be retired.

Contributor Author

This problem does exist; it will be fixed in a separate PR.

device = random_device()
x1 = random(low=1, high=5).to(float)
x2 = random_tensor(2, 2, 3).to(device)
y = torch.div(x1, x2, rounding_mode="floor")
Contributor

An integer test could also be added for scalar division.
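
One possible shape for such an integer test, sketched in plain Python without oneflow or the project's test utilities: draw small integers of both signs and compare the function under test with Python's // operator, which floors. The names find_floor_div_mismatch and truncating_div are hypothetical.

```python
import random


def find_floor_div_mismatch(div_fn, trials=1000, seed=0):
    """Return the first (scalar, divisor) pair where div_fn disagrees
    with Python's flooring `//`, or None if every trial agrees."""
    rng = random.Random(seed)
    nonzero = [v for v in range(-5, 6) if v != 0]
    for _ in range(trials):
        scalar = rng.randint(-5, 5)    # integer scalar, either sign
        divisor = rng.choice(nonzero)  # integer element, never zero
        if div_fn(scalar, divisor) != scalar // divisor:
            return (scalar, divisor)
    return None


def truncating_div(a, b):
    # Deliberately wrong for floor mode: rounds toward zero.
    return int(a / b)


print(find_floor_div_mismatch(lambda a, b: a // b))  # no mismatch expected
print(find_floor_div_mismatch(truncating_div))       # a mixed-sign pair
```

A float-only test would not catch the truncating variant reliably, which is the motivation for sampling integers with mixed signs.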

Contributor

torch.div(torch.tensor(1).cpu(), torch.randn(2).cuda(), rounding_mode="floor")
torch supports this behavior. Does this PR support it as well? If so, this test could be added.

Contributor Author

This change will be handled in a separate PR.

@github-actions
Contributor

github-actions bot commented Jan 9, 2023

Speed stats:

@github-actions
Contributor

github-actions bot commented Jan 9, 2023

Static analysis with clang failed. PR label automerge has been removed

@github-actions github-actions bot removed the automerge label Jan 9, 2023
@github-actions
Contributor

github-actions bot commented Jan 9, 2023

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 139.5ms (= 13945.3ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 159.9ms (= 15992.5ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.15 (= 159.9ms / 139.5ms)

OneFlow resnet50 time: 84.7ms (= 8472.8ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 110.4ms (= 11036.6ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.30 (= 110.4ms / 84.7ms)

OneFlow resnet50 time: 57.5ms (= 11497.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.6ms (= 15726.0ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.37 (= 78.6ms / 57.5ms)

OneFlow resnet50 time: 45.4ms (= 9079.6ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 65.2ms (= 13031.3ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.44 (= 65.2ms / 45.4ms)

OneFlow resnet50 time: 40.6ms (= 8116.7ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 69.2ms (= 13836.0ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.70 (= 69.2ms / 40.6ms)

@github-actions
Contributor

github-actions bot commented Jan 9, 2023

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/8987/

@github-actions
Contributor

github-actions bot commented Jan 9, 2023

CI failed when running job: cuda-misc. PR label automerge has been removed

@doombeaker doombeaker requested review from oneflow-ci-bot and removed request for oneflow-ci-bot January 16, 2023 02:36
@github-actions
Contributor

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 140.1ms (= 14007.4ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 160.7ms (= 16065.7ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.15 (= 160.7ms / 140.1ms)

OneFlow resnet50 time: 84.8ms (= 8477.8ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 102.8ms (= 10278.0ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.21 (= 102.8ms / 84.8ms)

OneFlow resnet50 time: 57.7ms (= 11536.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 87.2ms (= 17447.4ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.51 (= 87.2ms / 57.7ms)

OneFlow resnet50 time: 44.1ms (= 8827.0ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 70.9ms (= 14180.5ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.61 (= 70.9ms / 44.1ms)

OneFlow resnet50 time: 42.1ms (= 8425.7ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 64.3ms (= 12851.9ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.53 (= 64.3ms / 42.1ms)

@github-actions
Contributor

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/8987/

@doombeaker doombeaker requested review from oneflow-ci-bot and removed request for oneflow-ci-bot January 16, 2023 06:18
@github-actions
Contributor

CI failed when running job: cpu-misc. PR label automerge has been removed

@github-actions
Contributor

Speed stats:

@doombeaker doombeaker requested review from oneflow-ci-bot and removed request for oneflow-ci-bot January 17, 2023 03:25
@github-actions
Contributor

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 139.9ms (= 13985.7ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 161.7ms (= 16168.5ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.16 (= 161.7ms / 139.9ms)

OneFlow resnet50 time: 87.3ms (= 8732.7ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 102.2ms (= 10217.2ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.17 (= 102.2ms / 87.3ms)

OneFlow resnet50 time: 59.7ms (= 11944.7ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 76.3ms (= 15256.2ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.28 (= 76.3ms / 59.7ms)

OneFlow resnet50 time: 48.0ms (= 9605.6ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 67.3ms (= 13451.8ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.40 (= 67.3ms / 48.0ms)

OneFlow resnet50 time: 44.3ms (= 8855.2ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 57.7ms (= 11537.8ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.30 (= 57.7ms / 44.3ms)

@github-actions
Contributor

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/8987/

@mergify mergify bot merged commit 6c723ab into master Jan 17, 2023
@mergify mergify bot deleted the fix_div_print_inalgined branch January 17, 2023 07:09
3 participants