Skip to content

A compressed bitmap format optimized for small, sparse sets of u32s with zero-copy querying.

License

Apache-2.0, MIT licenses found

Licenses found

Apache-2.0
LICENSE-APACHE
MIT
LICENSE-MIT
Notifications You must be signed in to change notification settings

orbitinghail/splinter-rs

Repository files navigation

Splinter

docs.rs   Build Status   crates.io

Splinter is a compressed bitmap format similar to Roaring, optimized specifically for small, sparse sets of 32-bit unsigned integers (u32).

Key Features:

  • Tree-based Encoding: Splinter encodes u32 values into a 256-way tree structure by decomposing integers into big-endian component bytes. Nodes throughout the tree (including the root) are optimized into four different storage classes: tree, vec, bitmap, run.

  • Zero-copy Access: Designed for efficient querying without deserialization, the SplinterRef type allows direct, zero-copy reads from any type implementing Deref<Target = [u8]>.

Comparison to Roaring

The following table tests Splinter and Roaring with/without LZ4 compression against many different data distributions. The size column represents number of bytes used when serialized. View the actual test code to understand the precise meaning of each distribution.

All tests optimize the bitmap before serialization. This ensures that Splinter and Roaring are able to maximize compression via the use of run-length encoding for cases where it is helpful.

Roaring tests use roaring-rs.

The Baseline "bitmap" is simply an array of u32 integers.

The ok column compares each subsequent row to the Splinter row for that test. - and + signs refer to the relative size of the row compared to the Splinter per the relative column. For example a row containing ++++ is many times larger than the corresponding Splinter.

test                           bitmap         size   expected   relative         ok
empty                          Splinter         13         13       1.00         ok
                               Roaring           8          8       0.62         --
                               Splinter LZ4     14         14       1.08         ok
                               Roaring LZ4       9          9       0.64          -
                               Baseline          0          0       0.00       ----
1 element                      Splinter         21         21       1.00         ok
                               Roaring          18         18       0.86          -
                               Splinter LZ4     23         23       1.10         ok
                               Roaring LZ4      20         20       0.87          -
                               Baseline          4          4       0.19       ----
1 dense block                  Splinter         25         25       1.00         ok
                               Roaring          15         15       0.60         --
                               Splinter LZ4     27         27       1.08         ok
                               Roaring LZ4      17         17       0.63          -
                               Baseline       1024       1024      40.96       ++++
1 half full block              Splinter         63         63       1.00         ok
                               Roaring         255        255       4.05       ++++
                               Splinter LZ4     64         64       1.02         ok
                               Roaring LZ4     257        257       4.02       ++++
                               Baseline        512        512       8.13       ++++
1 sparse block                 Splinter         48         48       1.00         ok
                               Roaring          48         48       1.00         ok
                               Splinter LZ4     49         49       1.02         ok
                               Roaring LZ4      50         50       1.02         ok
                               Baseline         64         64       1.33          +
8 half full blocks             Splinter        315        315       1.00         ok
                               Roaring        2003       2003       6.36       ++++
                               Splinter LZ4    318        318       1.01         ok
                               Roaring LZ4    2012       2012       6.33       ++++
                               Baseline       4096       4096      13.00       ++++
8 sparse blocks                Splinter         60         60       1.00         ok
                               Roaring          48         48       0.80          -
                               Splinter LZ4     62         62       1.03         ok
                               Roaring LZ4      50         50       0.81          -
                               Baseline         64         64       1.07         ok
64 half full blocks            Splinter       2442       2442       1.00         ok
                               Roaring       16452      16452       6.74       ++++
                               Splinter LZ4   2336       2336       0.96         ok
                               Roaring LZ4   16503      16503       7.06       ++++
                               Baseline      32768      32768      13.42       ++++
64 sparse blocks               Splinter        410        410       1.00         ok
                               Roaring         392        392       0.96         ok
                               Splinter LZ4    379        379       0.92         ok
                               Roaring LZ4     395        395       1.04         ok
                               Baseline        512        512       1.25          +
256 half full blocks           Splinter       9450       9450       1.00         ok
                               Roaring       65580      65580       6.94       ++++
                               Splinter LZ4   9015       9015       0.95         ok
                               Roaring LZ4   65835      65835       7.30       ++++
                               Baseline     131072     131072      13.87       ++++
256 sparse blocks              Splinter       1290       1290       1.00         ok
                               Roaring        1288       1288       1.00         ok
                               Splinter LZ4   1250       1250       0.97         ok
                               Roaring LZ4    1294       1294       1.04         ok
                               Baseline       2048       2048       1.59          +
512 half full blocks           Splinter      18886      18886       1.00         ok
                               Roaring      130810     130810       6.93       ++++
                               Splinter LZ4  17974      17974       0.95         ok
                               Roaring LZ4  131248     131248       7.30       ++++
                               Baseline     262144     262144      13.88       ++++
512 sparse blocks              Splinter       2566       2566       1.00         ok
                               Roaring        2568       2568       1.00         ok
                               Splinter LZ4   2416       2416       0.94         ok
                               Roaring LZ4    2580       2580       1.07         ok
                               Baseline       4096       4096       1.60          +
fully dense                    Splinter         80         80       1.00         ok
                               Roaring          63         63       0.79          -
                               Splinter LZ4     82         82       1.02         ok
                               Roaring LZ4      65         65       0.79          -
                               Baseline      16384      16384     204.80       ++++
128/block; dense               Splinter       1179       1179       1.00         ok
                               Roaring        8208       8208       6.96       ++++
                               Splinter LZ4   1185       1185       1.01         ok
                               Roaring LZ4    8242       8242       6.96       ++++
                               Baseline      16384      16384      13.90       ++++
32/block; dense                Splinter       4539       4539       1.00         ok
                               Roaring        8208       8208       1.81         ++
                               Splinter LZ4   4302       4302       0.95         ok
                               Roaring LZ4    8242       8242       1.92         ++
                               Baseline      16384      16384       3.61        +++
16/block; dense                Splinter       5147       5147       1.00         ok
                               Roaring        8208       8208       1.59          +
                               Splinter LZ4   5145       5145       1.00         ok
                               Roaring LZ4    8242       8242       1.60         ++
                               Baseline      16384      16384       3.18        +++
128/block; sparse mid          Splinter       1365       1365       1.00         ok
                               Roaring        8282       8282       6.07       ++++
                               Splinter LZ4   1372       1372       1.01         ok
                               Roaring LZ4    8311       8311       6.06       ++++
                               Baseline      16384      16384      12.00       ++++
128/block; sparse high         Splinter       1582       1582       1.00         ok
                               Roaring        8224       8224       5.20       ++++
                               Splinter LZ4   1539       1539       0.97         ok
                               Roaring LZ4    8258       8258       5.37       ++++
                               Baseline      16384      16384      10.36       ++++
1/block; sparse mid            Splinter       9749       9749       1.00         ok
                               Roaring       10248      10248       1.05         ok
                               Splinter LZ4   9750       9750       1.00         ok
                               Roaring LZ4   10290      10290       1.06         ok
                               Baseline      16384      16384       1.68         ++
1/block; sparse high           Splinter      14350      14350       1.00         ok
                               Roaring       40968      40968       2.85        +++
                               Splinter LZ4  14297      14297       1.00         ok
                               Roaring LZ4   41084      41084       2.87        +++
                               Baseline      16384      16384       1.14          +
1/block; spread low            Splinter       8325       8325       1.00         ok
                               Roaring        8328       8328       1.00         ok
                               Splinter LZ4    637        637       0.08       ----
                               Roaring LZ4     689        689       1.08         ok
                               Baseline      16384      16384       1.97         ++
dense throughout               Splinter       4113       4113       1.00         ok
                               Roaring        2700       2700       0.66          -
                               Splinter LZ4   3643       3643       0.89          -
                               Roaring LZ4     608        608       0.17       ----
                               Baseline      16384      16384       3.98        +++
dense low                      Splinter        529        529       1.00         ok
                               Roaring         267        267       0.50         --
                               Splinter LZ4    529        529       1.00         ok
                               Roaring LZ4     269        269       0.51         --
                               Baseline      16384      16384      30.97       ++++
dense mid/low                  Splinter       4113       4113       1.00         ok
                               Roaring        2376       2376       0.58         --
                               Splinter LZ4   4077       4077       0.99         ok
                               Roaring LZ4     348        348       0.09       ----
                               Baseline      16384      16384       3.98        +++
random/32                      Splinter        145        145       1.00         ok
                               Roaring         328        328       2.26         ++
                               Splinter LZ4    147        147       1.01         ok
                               Roaring LZ4     331        331       2.25         ++
                               Baseline        128        128       0.88          -
random/256                     Splinter       1041       1041       1.00         ok
                               Roaring        2544       2544       2.44         ++
                               Splinter LZ4   1047       1047       1.01         ok
                               Roaring LZ4    2553       2553       2.44         ++
                               Baseline       1024       1024       0.98         ok
random/1024                    Splinter       4113       4113       1.00         ok
                               Roaring       10168      10168       2.47         ++
                               Splinter LZ4   4131       4131       1.00         ok
                               Roaring LZ4   10208      10208       2.47         ++
                               Baseline       4096       4096       1.00         ok
random/4096                    Splinter      14350      14350       1.00         ok
                               Roaring       40056      40056       2.79        +++
                               Splinter LZ4  14359      14359       1.00         ok
                               Roaring LZ4   40208      40208       2.80        +++
                               Baseline      16384      16384       1.14          +
random/16384                   Splinter      51214      51214       1.00         ok
                               Roaring      148656     148656       2.90        +++
                               Splinter LZ4  51416      51416       1.00         ok
                               Roaring LZ4  149229     149229       2.90        +++
                               Baseline      65536      65536       1.28          +
random/65536                   Splinter     198670     198670       1.00         ok
                               Roaring      461288     461288       2.32         ++
                               Splinter LZ4 199451     199451       1.00         ok
                               Roaring LZ4  463095     463095       2.32         ++
                               Baseline     262144     262144       1.32          +
random/32/65536                Splinter         92         92       1.00         ok
                               Roaring          80         80       0.87          -
                               Splinter LZ4     92         92       1.00         ok
                               Roaring LZ4      81         81       0.88          -
                               Baseline        128        128       1.39          +
random/256/65536               Splinter        540        540       1.00         ok
                               Roaring         528        528       0.98         ok
                               Splinter LZ4    544        544       1.01         ok
                               Roaring LZ4     530        530       0.97         ok
                               Baseline       1024       1024       1.90         ++
random/1024/65536              Splinter       2071       2071       1.00         ok
                               Roaring        2064       2064       1.00         ok
                               Splinter LZ4   2081       2081       1.00         ok
                               Roaring LZ4    2072       2072       1.00         ok
                               Baseline       4096       4096       1.98         ++
random/4096/65536              Splinter       5147       5147       1.00         ok
                               Roaring        8208       8208       1.59          +
                               Splinter LZ4   5169       5169       1.00         ok
                               Roaring LZ4    8241       8241       1.59          +
                               Baseline      16384      16384       3.18        +++
random/65536/65536             Splinter         25         25       1.00         ok
                               Roaring          15         15       0.60         --
                               Splinter LZ4     23         23       0.92         ok
                               Roaring LZ4      17         17       0.74          -
                               Baseline     262144     262144   10485.76       ++++
random/8/1024                  Splinter         49         49       1.00         ok
                               Roaring          32         32       0.65          -
                               Splinter LZ4     51         51       1.04         ok
                               Roaring LZ4      33         33       0.65          -
                               Baseline         32         32       0.65          -
random/16/1024                 Splinter         60         60       1.00         ok
                               Roaring          48         48       0.80          -
                               Splinter LZ4     61         61       1.02         ok
                               Roaring LZ4      49         49       0.80          -
                               Baseline         64         64       1.07         ok
random/32/1024                 Splinter         79         79       1.00         ok
                               Roaring          80         80       1.01         ok
                               Splinter LZ4     77         77       0.97         ok
                               Roaring LZ4      81         81       1.05         ok
                               Baseline        128        128       1.62         ++
random/64/1024                 Splinter        111        111       1.00         ok
                               Roaring         144        144       1.30          +
                               Splinter LZ4    110        110       0.99         ok
                               Roaring LZ4     145        145       1.32          +
                               Baseline        256        256       2.31         ++
random/128/1024                Splinter        168        168       1.00         ok
                               Roaring         272        272       1.62         ++
                               Splinter LZ4    167        167       0.99         ok
                               Roaring LZ4     273        273       1.63         ++
                               Baseline        512        512       3.05        +++
average compression ratio (splinter_lz4 / splinter): 0.97

License

Licensed under either of

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you shall be dual licensed as above, without any additional terms or conditions.

About

A compressed bitmap format optimized for small, sparse sets of u32s with zero-copy querying.

Topics

Resources

License

Apache-2.0, MIT licenses found

Licenses found

Apache-2.0
LICENSE-APACHE
MIT
LICENSE-MIT

Stars

Watchers

Forks

Contributors 3

  •  
  •  
  •  

Languages