Made blur faster by using Gaussian decomposition property. #2315
Conversation
…rflow#1767) * Use the mark for the gpu tests. * Skip if gpu and tf.float16.
* added test for normalize_data_format in keras_utils * change after review
* add sample_weight support to FScore metrics, generate test data with https://colab.research.google.com/drive/1ymB5iOj9YCeBQ-g-8eMGpQhiHr7QUBM4
* Remove `sequential_update` from AverageWrapper: in TF2.0, sequential_update is redundant. This allows the removal of `tf.control_dependencies` from average_wrapper and its subclasses: moving_average and stochastic_weight_averaging.
* Revert "Remove `sequential_update` from AverageWrapper": this reverts commit 7cf4201.
* Remove `tf.control_dependencies` from AverageWrapper: add deprecation warning for `sequential_update`.
* Set type of sequential_update to Optional[bool]: `sequential_update` is no longer part of the optimizer's configuration. Loading an older configuration throws the DeprecationWarning.
* black format
* README fixes and consistency improvements
* Added echo state network (ESN) recurrent cell
* enable half and double for resampler * register GPU kernels * specialize half and double kernels
* remove internal modules * clean up * refactor * remove six * remove losses * refactor get_config
…ll (tensorflow#1861) Some RNN cells such as GRUCell and SimpleRNNCell do not return the same state structure in get_initial_state and in the call method.
* Incorporate low-rank techniques into DCN. It supports a low-rank kernel W = U·V, where U ∈ R^{last_dim × projection_dim} and V ∈ R^{projection_dim × last_dim}. Introduces a flag diag_scale that increases the diagonal of kernel W by diag_scale.
tensorflow#1872) * Fix AttentionWrapper type annotation for multiple attention mechanisms * Revise attention_layer_size type
* Make cutout compatible with keras layer
* Support unknown rank image
* Fix sparse_image_warp unknown batch size * More tests
We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google. ℹ️ Googlers: Go here for more info.
You are owners of some files modified in this pull request.
@googlebot I fixed it.
Check #1450 and, more generally, tensorflow/tensorflow#39050.
I guess running a single 2D convolution instead of two 1D convolutions takes less time on GPU due to parallel computation, so earlier I left it as a 2D convolution only.
"I guess running 2D in Convolution in-stead of 2 1D convolutions takes lesser time in GPU due to parallel computations. So, earlier I left it as 2D convolutions only." The main problem is that the 2D convolution is quadratic in the kernel size. Earlier, I used a Gaussian blur with a sigma of 3, for which a proper kernel is 17x17 (around 3 sigma). The difference is doing 1 convolution with 298 values or two convolutions with 17 values. Parallelism will not help much here. |
Please check both reference tickets that I've mentioned in the previous comment.
Did I close this? I was having trouble with git... Anyway, according to some local testing on GPU, for small kernels (< 3) the separable version is about 20% slower than the 2D version. Note that blurring generally requires somewhat larger kernels (you want a kernel of 7 for a sigma of 1). For larger kernels (> 15), the separable version is about twice as fast. On CPU, the differences are negligible for small kernels, while for a kernel of size 10 the separable version is almost 5x faster. On TPUs I don't know what will happen. So let me know whether this is worthwhile, or whether you would prefer to wait until the depthwise convolution gets updated to support separable kernels.
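For anyone who wants to reproduce those timings, a rough micro-benchmark along these lines should work (a sketch under assumed shapes and a flat 17-tap kernel; this is not the script used for the numbers above, and results will vary by device):

import time
import numpy as np
import tensorflow as tf

img = tf.random.normal([8, 256, 256, 1])
k = 17
g = np.ones(k, np.float32) / k  # placeholder taps; a Gaussian costs the same

full = tf.constant(np.outer(g, g).reshape(k, k, 1, 1))  # k*k weights
row = tf.constant(g.reshape(1, k, 1, 1))                # k weights
col = tf.constant(g.reshape(k, 1, 1, 1))                # k weights

@tf.function
def conv_full(x):
    return tf.nn.conv2d(x, full, strides=1, padding="SAME")

@tf.function
def conv_sep(x):
    y = tf.nn.conv2d(x, row, strides=1, padding="SAME")
    return tf.nn.conv2d(y, col, strides=1, padding="SAME")

for name, fn in (("full_2d", conv_full), ("separable", conv_sep)):
    fn(img)  # warm-up / trace once
    start = time.perf_counter()
    for _ in range(50):
        out = fn(img)
    _ = out.numpy()  # force execution before stopping the clock
    print(name, time.perf_counter() - start)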
Description
Image blurring is now linear in the kernel size instead of quadratic, by exploiting the separability of the Gaussian kernel: the 2D Gaussian convolution decomposes into two 1D convolutions.
For large kernels this results in especially large speed-ups.
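Concretely, the property used here is the standard separability of the Gaussian: the 2D kernel factors as G_sigma(x, y) = g_sigma(x) * g_sigma(y), so convolving an image with G_sigma is equivalent to convolving its rows with g_sigma and then its columns with g_sigma. For a kernel of width k, this costs 2k multiply-adds per pixel instead of k^2 (e.g. 34 instead of 289 for k = 17).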
Type of change
More efficient implementation of gaussian_filter2d (a minimal sketch of the separable approach is shown below).
(Note that mean_filter2d could also be implemented with two 1D filters.)
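A minimal sketch of the separable approach, assuming an NHWC float input. This is illustrative only: the function name, default parameters, and zero ("SAME") padding here are assumptions rather than the actual gaussian_filter2d code, which also handles other padding modes and dtypes.

import numpy as np
import tensorflow as tf

def separable_gaussian_blur(image, sigma=3.0, radius=8):
    # image: float tensor of shape [batch, height, width, channels].
    x = np.arange(-radius, radius + 1, dtype=np.float32)
    g = np.exp(-0.5 * (x / sigma) ** 2)
    g /= g.sum()  # normalize so the taps sum to 1
    channels = image.shape[-1]
    # Depthwise filters have shape [height, width, in_channels, channel_multiplier];
    # tile the same 1D taps across every channel.
    g_row = tf.constant(np.tile(g.reshape(1, -1, 1, 1), [1, 1, channels, 1]))
    g_col = tf.constant(np.tile(g.reshape(-1, 1, 1, 1), [1, 1, channels, 1]))
    blurred = tf.nn.depthwise_conv2d(image, g_row, strides=[1, 1, 1, 1], padding="SAME")
    return tf.nn.depthwise_conv2d(blurred, g_col, strides=[1, 1, 1, 1], padding="SAME")

out = separable_gaussian_blur(tf.random.normal([1, 64, 64, 3]))
print(out.shape)  # (1, 64, 64, 3)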
How Has This Been Tested?
Using standard testing; the original (unmodified) tests still pass:
python3 -m pytest tensorflow_addons/image/tests/filters_test.py
======================================= test session starts ========================================
platform linux -- Python 3.8.6, pytest-5.4.3, py-1.10.0, pluggy-0.13.1
rootdir: ~/home/jrru/git_repositories/addons_mine, inifile: pytest.ini
plugins: extra-durations-0.1.3, xdist-1.34.0, forked-1.3.0, typeguard-2.10.0
collected 116 items
tensorflow_addons/image/tests/filters_test.py .............................................. [ 39%]
...................................................................... [100%]
==================================== sum of all tests durations ====================================
11.08s
======================================= 116 passed in 11.19s =======================================