Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid allocation of internal array to get _distance code #97

Merged
merged 1 commit into from
Nov 8, 2020

Conversation

dinosaure
Copy link
Member

I was thinking that OCaml was able to allocate t only one time but it seems not true and I think it really slow-down the deflation when, at each call of _distance, we do a major-allocation. This behavior was discovered by @Engil. This issue appear when we enlarge the minor heap and when malloc hardly try to allocate this big array.

In our case, it's a static array, so this PR should fix this slow-down.

@dinosaure
Copy link
Member Author

So with #98 and lzbench (decompress branch), I ran a benchmark to see the impact of this modification (to avoid regression) and:

dinosaure@turbine:~/dev/lzbench$ ./lzbench -ezlib,4/decompress,3 news 
lzbench 1.8 (64-bit Linux)   Assembled by P.Skibinski
Compressor name         Compress. Decompress. Compr. size  Ratio Filename
memcpy                  14079 MB/s 14869 MB/s      377109 100.00 news
zlib 1.2.11 -4             32 MB/s   200 MB/s      149282  39.59 news
decompress dev -3        1.91 MB/s    74 MB/s      292951  77.68 news
done... (cIters=1 dIters=1 cTime=1.0 dTime=2.0 chunkSize=1706MB cSpeed=0MB)

dinosaure@turbine:~/dev/lzbench$ ./lzbench -ezlib,4/decompress,3 news 
lzbench 1.8 (64-bit Linux)   Assembled by P.Skibinski
Compressor name         Compress. Decompress. Compr. size  Ratio Filename
memcpy                  15176 MB/s 15324 MB/s      377109 100.00 news
zlib 1.2.11 -4             32 MB/s   201 MB/s      149282  39.59 news
decompress dev -3          20 MB/s    74 MB/s      292951  77.68 news
done... (cIters=1 dIters=1 cTime=1.0 dTime=2.0 chunkSize=1706MB cSpeed=0MB)

In other words, 1.91 MB/s to 20 MB/s! Good catch @Engil! I will merge it.

@dinosaure dinosaure merged commit 802fd3d into master Nov 8, 2020
dinosaure added a commit to dinosaure/opam-repository that referenced this pull request Mar 3, 2021
CHANGES:

- Add a little executable to benchmark inflation into the distribution (@dinosaure, mirage/decompress#93)
- Add instructions for running benchmark (@gs0510, mirage/decompress#94)
- Clarify the description (@XVilka, mirage/decompress#96)
- Improve the benchmark and outputs (@dinosaure, @gs0510, mirage/decompress#95)
- Avoid allocation of distance table (@Engil, @dinosaure, mirage/decompress#97)
- Swapping from arithmetic to logical bitshifts on `d.hold` (@clecat, mirage/decompress#99)
- Make the use of all `Higher.compress` arguments (@vect0r-vicall, mirage/decompress#103)
- Apply ocamlformat.0.16.0 (@dinosaure, mirage/decompress#105, mirage/decompress#107)
- Improve Lz77 algorithms (@dinosaure, mirage/decompress#108)
  **breaking changes** the deflation expects a new window: `De.Lz77.make_window`
  instead of `De.make_window` (which is twice larger to improve the compression
  algorithm)

  Depending on the level and your corpus, we did not observe performance
  regression on deflation (and mirage/decompress#97 improves a lot performances). An higher level
  is slower (but the compression ratio is better). We advise, by default, to use
  the level 6.

  Note that the user is able to make its own compression algorithm according to
  his corpus. An example of such implementation is available on the new
  `decompress.lz` libraries which fills a queue and compress the input.

  **breaking changes** decompress expects a level between 0 and 9 (inclusive)
  (instead of 0 and 3).
- Add tests about level compression (@dinosaure, mirage/decompress#109)
- Add level on GZip layer (@dinosaure, mirage/decompress#110)
- Provide a `ctypes` reverse binding (@dinosaure, mirage/decompress#98)
- Provide a binary `decompress.pipe` which can compress/uncompress with
  deflate, zlib or gzip format.
dinosaure added a commit to dinosaure/opam-repository that referenced this pull request Mar 3, 2021
CHANGES:

- Add a little executable to benchmark inflation into the distribution (@dinosaure, mirage/decompress#93)
- Add instructions for running benchmark (@gs0510, mirage/decompress#94)
- Clarify the description (@XVilka, mirage/decompress#96)
- Improve the benchmark and outputs (@dinosaure, @gs0510, mirage/decompress#95)
- Avoid allocation of distance table (@Engil, @dinosaure, mirage/decompress#97)
- Swapping from arithmetic to logical bitshifts on `d.hold` (@clecat, mirage/decompress#99)
- Make the use of all `Higher.compress` arguments (@vect0r-vicall, mirage/decompress#103)
- Apply ocamlformat.0.16.0 (@dinosaure, mirage/decompress#105, mirage/decompress#107)
- Improve Lz77 algorithms (@dinosaure, mirage/decompress#108)
  **breaking changes** the deflation expects a new window: `De.Lz77.make_window`
  instead of `De.make_window` (which is twice larger to improve the compression
  algorithm)

  Depending on the level and your corpus, we did not observe performance
  regression on deflation (and mirage/decompress#97 improves a lot performances). An higher level
  is slower (but the compression ratio is better). We advise, by default, to use
  the level 6.

  Note that the user is able to make its own compression algorithm according to
  his corpus. An example of such implementation is available on the new
  `decompress.lz` libraries which fills a queue and compress the input.

  **breaking changes** decompress expects a level between 0 and 9 (inclusive)
  (instead of 0 and 3).
- Add tests about level compression (@dinosaure, mirage/decompress#109)
- Add level on GZip layer (@dinosaure, mirage/decompress#110)
- Provide a `ctypes` reverse binding (@dinosaure, mirage/decompress#98)
- Provide a binary `decompress.pipe` which can compress/uncompress with
  deflate, zlib or gzip format.
dinosaure added a commit to dinosaure/opam-repository that referenced this pull request Mar 3, 2021
CHANGES:

- Add a little executable to benchmark inflation into the distribution (@dinosaure, mirage/decompress#93)
- Add instructions for running benchmark (@gs0510, mirage/decompress#94)
- Clarify the description (@XVilka, mirage/decompress#96)
- Improve the benchmark and outputs (@dinosaure, @gs0510, mirage/decompress#95)
- Avoid allocation of distance table (@Engil, @dinosaure, mirage/decompress#97)
- Swapping from arithmetic to logical bitshifts on `d.hold` (@clecat, mirage/decompress#99)
- Make the use of all `Higher.compress` arguments (@vect0r-vicall, mirage/decompress#103)
- Apply ocamlformat.0.16.0 (@dinosaure, mirage/decompress#105, mirage/decompress#107)
- Improve Lz77 algorithms (@dinosaure, mirage/decompress#108)
  **breaking changes** the deflation expects a new window: `De.Lz77.make_window`
  instead of `De.make_window` (which is twice larger to improve the compression
  algorithm)

  Depending on the level and your corpus, we did not observe performance
  regression on deflation (and mirage/decompress#97 improves a lot performances). An higher level
  is slower (but the compression ratio is better). We advise, by default, to use
  the level 6.

  Note that the user is able to make its own compression algorithm according to
  his corpus. An example of such implementation is available on the new
  `decompress.lz` libraries which fills a queue and compress the input.

  **breaking changes** decompress expects a level between 0 and 9 (inclusive)
  (instead of 0 and 3).
- Add tests about level compression (@dinosaure, mirage/decompress#109)
- Add level on GZip layer (@dinosaure, mirage/decompress#110)
- Provide a `ctypes` reverse binding (@dinosaure, mirage/decompress#98)
- Provide a binary `decompress.pipe` which can compress/uncompress with
  deflate, zlib or gzip format.
dinosaure added a commit to dinosaure/opam-repository that referenced this pull request Mar 3, 2021
CHANGES:

- Add a little executable to benchmark inflation into the distribution (@dinosaure, mirage/decompress#93)
- Add instructions for running benchmark (@gs0510, mirage/decompress#94)
- Clarify the description (@XVilka, mirage/decompress#96)
- Improve the benchmark and outputs (@dinosaure, @gs0510, mirage/decompress#95)
- Avoid allocation of distance table (@Engil, @dinosaure, mirage/decompress#97)
- Swapping from arithmetic to logical bitshifts on `d.hold` (@clecat, mirage/decompress#99)
- Make the use of all `Higher.compress` arguments (@vect0r-vicall, mirage/decompress#103)
- Apply ocamlformat.0.16.0 (@dinosaure, mirage/decompress#105, mirage/decompress#107)
- Improve Lz77 algorithms (@dinosaure, mirage/decompress#108)
  **breaking changes** the deflation expects a new window: `De.Lz77.make_window`
  instead of `De.make_window` (which is twice larger to improve the compression
  algorithm)

  Depending on the level and your corpus, we did not observe performance
  regression on deflation (and mirage/decompress#97 improves a lot performances). An higher level
  is slower (but the compression ratio is better). We advise, by default, to use
  the level 6.

  Note that the user is able to make its own compression algorithm according to
  his corpus. An example of such implementation is available on the new
  `decompress.lz` libraries which fills a queue and compress the input.

  **breaking changes** decompress expects a level between 0 and 9 (inclusive)
  (instead of 0 and 3).
- Add tests about level compression (@dinosaure, mirage/decompress#109)
- Add level on GZip layer (@dinosaure, mirage/decompress#110)
- Provide a `ctypes` reverse binding (@dinosaure, mirage/decompress#98)
- Provide a binary `decompress.pipe` which can compress/uncompress with
  deflate, zlib or gzip format.
@dinosaure dinosaure deleted the avoid-distance-allocation branch April 11, 2022 14:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant