Add pack and unpack of a more compact serialization format (#68)
* Store the original size argument in struct binary_fuse{8,16}_s

... in preparation of a more compact serialization format: All other
parameters except for the Seed are derived from the size parameter.

The drawback is that this format is sensitive to changes of
binary_fuse8_allocate().

Due to alignment, this does not need any more space on 64-bit systems.
(There were five 32-bit values in between two 64-bit values.)
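
The alignment argument can be checked with a small sketch. The two structs below are hypothetical stand-ins, not the library's definitions: `Derived[5]` merely represents the five derived 32-bit values mentioned above, and the helper name `size_field_is_free` is made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the layout before this commit:
   a 64-bit seed, five derived 32-bit values, then a pointer. */
struct old_layout {
    uint64_t Seed;
    uint32_t Derived[5];
    uint8_t *Fingerprints;
};

/* Same layout with the additional 32-bit Size field. */
struct new_layout {
    uint64_t Seed;
    uint32_t Size;
    uint32_t Derived[5];
    uint8_t *Fingerprints;
};

/* True when adding Size did not grow the struct, i.e. when the new
   field only occupies bytes that were padding before. */
static bool size_field_is_free(void) {
    return sizeof(struct new_layout) == sizeof(struct old_layout);
}
```

On a typical 64-bit ABI the pointer must be 8-byte aligned, so the old layout carries 4 bytes of padding after the five 32-bit values and `size_field_is_free()` is expected to return true. On 32-bit targets the result may differ.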

Yet formally, this is a breaking change of the in-core format, which
should not be used to store information across versions. See follow up
commits for new compact serialization formats.

* Add {xor,binary_fuse}{8,16}_{pack,unpack} serialization formats.

Rationale:

As mentioned in the previous commit, for binary_fuse filters, we do not
need to save values derived from the size, saving 5 x sizeof(uint32_t).

For both filter implementations, we add a bitmap to indicate non-zero
fingerprint values. This adds 1/{8,16} of the fingerprint array size,
but saves one or two bytes for each zero fingerprint.
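
The trade-off can be made concrete with a little arithmetic. The helpers below are a sketch with made-up names, and they count only the fingerprint part of the format (no seed or size header). Under those assumptions the packed form breaks even when 1/8 of the 8-bit fingerprints, or 1/16 of the 16-bit fingerprints, are zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes for the fingerprint part of the packed format as described above:
   one bitmap bit per slot plus the non-zero fingerprints only. */
static size_t packed_payload_bytes(size_t slots, size_t zero_slots,
                                   size_t word_bytes) {
    size_t bitmap = (slots + 7) / 8; /* round up to whole bytes */
    return bitmap + (slots - zero_slots) * word_bytes;
}

/* Bytes for the same fingerprints stored verbatim. */
static size_t plain_payload_bytes(size_t slots, size_t word_bytes) {
    return slots * word_bytes;
}
```

For 1024 slots of 16-bit fingerprints, the bitmap costs 128 bytes, so the packed payload only becomes smaller once more than 64 slots are zero.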

The net result is a packed format which cannot be compressed further by
zlib for the bundled unit tests.

Note that this format is incompatible with the existing _serialize()
format and, in the case of binary_fuse, sensitive to changes of the
derived parameters in _allocate.

Interface:

We add _pack_bytes() to match _serialization_bytes(). _pack() and
_unpack() match _serialize() and _deserialize().

The existing _{de,}serialize() interfaces take a buffer pointer only and
thus implicitly assume that the buffer will be of sufficient size. For
the new functions, we add a size_t parameter indicating the size of the
buffer and check its bounds in the implementation.

_pack returns the used size or zero for "does not fit", so when called
with a buffer of arbitrary size, the used space or error condition can
be determined without an additional call to _pack_bytes(), avoiding
duplicate work.
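
The interface contract can be illustrated with a self-contained toy. `toy_pack16` and `toy_unpack16` are hypothetical and are not the library functions: they omit the seed and size header and operate on a plain array, but they follow the same convention of a bounds-checked buffer, a bitmap of non-zero words, and a return value of 0 for "does not fit".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy packer for n 16-bit words (not the library API).
   Layout: bitmap with one bit per word, then the non-zero words.
   Returns the number of bytes used, or 0 if the buffer is too small. */
static size_t toy_pack16(const uint16_t *words, size_t n,
                         char *buffer, size_t space) {
    uint8_t *out = (uint8_t *)buffer;
    size_t bitmap_bytes = (n + 7) / 8;
    if (bitmap_bytes > space)
        return 0;
    memset(out, 0, bitmap_bytes);
    size_t used = bitmap_bytes;
    for (size_t i = 0; i < n; i++) {
        if (words[i] == 0)
            continue; /* zero words are represented by a cleared bit only */
        if (space - used < sizeof words[i])
            return 0;
        out[i / 8] |= (uint8_t)(1U << (i % 8));
        memcpy(out + used, &words[i], sizeof words[i]);
        used += sizeof words[i];
    }
    return used;
}

/* Inverse of toy_pack16; n must match the value used when packing.
   Returns false if the buffer is shorter than the data it describes. */
static bool toy_unpack16(uint16_t *words, size_t n,
                         const char *buffer, size_t len) {
    const uint8_t *in = (const uint8_t *)buffer;
    size_t bitmap_bytes = (n + 7) / 8;
    if (bitmap_bytes > len)
        return false; /* check before reading any bitmap byte */
    size_t pos = bitmap_bytes;
    for (size_t i = 0; i < n; i++) {
        words[i] = 0;
        if ((in[i / 8] & (1U << (i % 8))) == 0)
            continue;
        if (len - pos < sizeof words[i])
            return false;
        memcpy(&words[i], in + pos, sizeof words[i]);
        pos += sizeof words[i];
    }
    return true;
}
```

A caller can hand `toy_pack16` any buffer it happens to have and inspect the return value, without first computing the exact size in a separate pass.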

Implementation:

We add some XOR_bitf_* macros to address words and individual bits of
bitfields.

The XOR_ser and XOR_deser macros have the otherwise repeated code for
bounds checking and the actual serialization.

Because the implementations for the 8 and 16 bit words are equal except
for the data type, we add macros and create the actual functions by
expanding the macros with the possible data types.
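
As a minimal sketch of this technique (the macro and function names here are hypothetical and unrelated to the header), a single macro body can be expanded once per data type using token pasting:

```c
#include <stddef.h>
#include <stdint.h>

/* Template: defines count_nonzero_<suffix>() for the given element type. */
#define DEFINE_COUNT_NONZERO(suffix, type)                                  \
    static inline size_t count_nonzero_##suffix(const type *a, size_t n) { \
        size_t count = 0;                                                   \
        for (size_t i = 0; i < n; i++)                                      \
            count += (a[i] != 0);                                           \
        return count;                                                       \
    }

/* Instantiate the template for 8-bit and 16-bit words. */
DEFINE_COUNT_NONZERO(u8, uint8_t)
DEFINE_COUNT_NONZERO(u16, uint16_t)

/* The template is not needed after instantiation. */
#undef DEFINE_COUNT_NONZERO
```

Undefining the template macro after use keeps it out of the namespace of any file that includes the header, which matches the `#undef` lines at the end of this commit's header change.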

Alternatives considered:

Compared to _{de,}serialize(), the new functions need to copy individual
fingerprint words rather than the whole array at once, which is less
efficient. Therefore, an implementation using Duff's Device with
branchless code was attempted but dismissed because avoiding
out-of-bounds access would require an over-allocated buffer.

* Adjust unit tests to new _{un,}pack() interface

To exercise the new code without too much of a change to the existing
unit test, we change the signature of _{un,}serialize_gen() to take an
additional (const) size_t argument, which we ignore for
_{un,}serialize().

We add to the reported metrics absolute and relative size information
for the "in-core" and "wire" format, the latter referring jointly to
_{un,}serialize() and _{un,}pack().

* Document the new _{un,}pack() interface

* tuning the wording and adding a spaceusage benchmark

* changing the wording.

* rewording.

* explicit casts

---------

Co-authored-by: Daniel Lemire <[email protected]>
nigoroll and lemire authored Jan 22, 2025
1 parent 5539876 commit d3bb4e9
Showing 7 changed files with 498 additions and 18 deletions.
44 changes: 41 additions & 3 deletions README.md
@@ -54,7 +54,20 @@ about 0.0015%. The type is `binary_fuse16_t` and you may use it with
functions such as `binary_fuse16_allocate`, `binary_fuse16_populate`,
`binary_fuse8_contain` and `binary_fuse8_free`.
You may serialize the data as follows:
For serialization, there is a choice between an unpacked and a packed format.
The unpacked format is roughly the same size as the in-core data, but uses the most
efficient memory copy operations.
The packed format avoids storing zero fingerprints and relies on a bitset to locate them, so it
should be expected to be somewhat slower. The packed format might be smaller or larger.
It might be beneficial when using 16-bit binary fuse filters for users who need to save
every byte and who do not care about the computational overhead.
When in doubt, prefer the regular (unpacked) format.
The two formats use slightly different APIs.
You may serialize and deserialize in unpacked format as follows:
```C
size_t buffer_size = binary_fuse16_serialization_bytes(&filter);
@@ -65,9 +78,34 @@ You may serialize the data as follows:
free(buffer);
```

The serialization does not handle endianness: it is expected that you will serialize
and deserialize on little-endian systems. (Big-endian systems are vanishingly rare.)
This should be the default.

To serialize and deserialize in packed format, use the `_pack_bytes()`,
`_pack()` and `_unpack()` functions. The latter two have an additional `size_t`
argument for the buffer length. `_pack()` can be used with a buffer of arbitrary
size: it returns the used space if the serialization fits into the buffer, or 0
otherwise. Note that the packed format will be slower and may not save space,
although it is likely smaller on disk when using the 16-bit binary fuse filters.

For example:

```C
size_t buffer_size = binary_fuse16_pack_bytes(&filter);
char *buffer = (char*)malloc(buffer_size);
if (binary_fuse16_pack(&filter, buffer, buffer_size) != buffer_size) {
    printf("pack failed\n");
    free(buffer);
    return;
}
binary_fuse16_free(&filter);
if (! binary_fuse16_unpack(&filter, buffer, buffer_size)) {
    printf("unpack failed\n");
}
free(buffer);
```
Neither serialization format handles endianness changes: it is expected that you
serialize and deserialize with the same byte order.
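
Since the formats carry no byte-order information, an application that may exchange filters between different machines could record the byte order itself. The following is only a sketch of one possible approach. The helper names and the one-byte tag are hypothetical and not part of the library.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Returns true on a little-endian host by inspecting the first byte
   of a known 32-bit value. */
static bool host_is_little_endian(void) {
    const uint32_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Hypothetical one-byte tag that an application could store ahead of
   the serialized filter: 'L' for little endian, 'B' for big endian. */
static uint8_t byte_order_tag(void) {
    return host_is_little_endian() ? (uint8_t)'L' : (uint8_t)'B';
}

/* Check a stored tag before handing the remaining bytes to
   _deserialize() or _unpack(). */
static bool byte_order_matches(uint8_t stored_tag) {
    return stored_tag == byte_order_tag();
}
```

A reader that finds a mismatching tag can refuse the data rather than load a filter that would return wrong answers.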
## C++ wrapper
3 changes: 3 additions & 0 deletions benchmarks/CMakeLists.txt
@@ -1,2 +1,5 @@
add_executable(bench bench.c)
target_link_libraries(bench PUBLIC xor_singleheader)

add_executable(spaceusage spaceusage.c)
target_link_libraries(spaceusage PUBLIC xor_singleheader)
119 changes: 119 additions & 0 deletions benchmarks/spaceusage.c
@@ -0,0 +1,119 @@
#include "binaryfusefilter.h"
#include "xorfilter.h"
#include <stdlib.h>
#include <iso646.h>

typedef struct {
size_t standard;
size_t pack;
} sizes;

sizes fuse16(size_t n) {
binary_fuse16_t filter = {0};
if (! binary_fuse16_allocate(n, &filter)) {
printf("allocation failed\n");
return (sizes) {0, 0};
}
uint64_t* big_set = malloc(n * sizeof(uint64_t));
for(size_t i = 0; i < n; i++) {
big_set[i] = i;
}
bool is_ok = binary_fuse16_populate(big_set, n, &filter);
if(! is_ok ) {
printf("populating failed\n");
}
free(big_set);
sizes s = {
.standard = binary_fuse16_serialization_bytes(&filter),
.pack = binary_fuse16_pack_bytes(&filter)
};
binary_fuse16_free(&filter);
return s;
}

sizes fuse8(size_t n) {
binary_fuse8_t filter = {0};
if (! binary_fuse8_allocate(n, &filter)) {
printf("allocation failed\n");
return (sizes) {0, 0};
}
uint64_t* big_set = malloc(n * sizeof(uint64_t));
for(size_t i = 0; i < n; i++) {
big_set[i] = i;
}
bool is_ok = binary_fuse8_populate(big_set, n, &filter);
if(! is_ok ) {
printf("populating failed\n");
}
free(big_set);
sizes s = {
.standard = binary_fuse8_serialization_bytes(&filter),
.pack = binary_fuse8_pack_bytes(&filter)
};
binary_fuse8_free(&filter);
return s;
}

sizes xor16(size_t n) {
xor16_t filter = {0};
if (! xor16_allocate(n, &filter)) {
printf("allocation failed\n");
return (sizes) {0, 0};
}
uint64_t* big_set = malloc(n * sizeof(uint64_t));
for(size_t i = 0; i < n; i++) {
big_set[i] = i;
}
bool is_ok = xor16_populate(big_set, n, &filter);
if(! is_ok ) {
printf("populating failed\n");
}
free(big_set);
sizes s = {
.standard = xor16_serialization_bytes(&filter),
.pack = xor16_pack_bytes(&filter)
};
xor16_free(&filter);
return s;
}

sizes xor8(size_t n) {
xor8_t filter = {0};
if (! xor8_allocate(n, &filter)) {
printf("allocation failed\n");
return (sizes) {0, 0};
}
uint64_t* big_set = malloc(n * sizeof(uint64_t));
for(size_t i = 0; i < n; i++) {
big_set[i] = i;
}
bool is_ok = xor8_populate(big_set, n, &filter);
if(! is_ok ) {
printf("populating failed\n");
}
free(big_set);
sizes s = {
.standard = xor8_serialization_bytes(&filter),
.pack = xor8_pack_bytes(&filter)
};
xor8_free(&filter);

return s;
}

int main() {
for (size_t n = 10; n <= 10000000; n *= 2) {
printf("%-10zu ", n); // Align number to 10 characters wide
sizes f16 = fuse16(n);
sizes f8 = fuse8(n);
sizes x16 = xor16(n);
sizes x8 = xor8(n);

printf("fuse16: %5.2f %5.2f ", (double)f16.standard * 8.0 / n, (double)f16.pack * 8.0 / n);
printf("fuse8: %5.2f %5.2f ", (double)f8.standard * 8.0 / n, (double)f8.pack * 8.0 / n);
printf("xor16: %5.2f %5.2f ", (double)x16.standard * 8.0 / n, (double)x16.pack * 8.0 / n);
printf("xor8: %5.2f %5.2f ", (double)x8.standard * 8.0 / n, (double)x8.pack * 8.0 / n);
printf("\n");
}
return EXIT_SUCCESS;
}
113 changes: 113 additions & 0 deletions include/binaryfusefilter.h
@@ -67,6 +67,7 @@ static inline uint64_t binary_fuse_rng_splitmix64(uint64_t *seed) {

typedef struct binary_fuse8_s {
uint64_t Seed;
uint32_t Size;
uint32_t SegmentLength;
uint32_t SegmentLengthMask;
uint32_t SegmentCount;
@@ -222,6 +223,7 @@ static inline double binary_fuse_calculate_size_factor(uint32_t arity,
static inline bool binary_fuse8_allocate(uint32_t size,
binary_fuse8_t *filter) {
uint32_t arity = 3;
filter->Size = size;
filter->SegmentLength = size == 0 ? 4 : binary_fuse_calculate_segment_length(arity, size);
if (filter->SegmentLength > 262144) {
filter->SegmentLength = 262144;
@@ -258,6 +260,7 @@ static inline void binary_fuse8_free(binary_fuse8_t *filter) {
free(filter->Fingerprints);
filter->Fingerprints = NULL;
filter->Seed = 0;
filter->Size = 0;
filter->SegmentLength = 0;
filter->SegmentLengthMask = 0;
filter->SegmentCount = 0;
@@ -459,6 +462,7 @@ static inline bool binary_fuse8_populate(uint64_t *keys, uint32_t size,

typedef struct binary_fuse16_s {
uint64_t Seed;
uint32_t Size;
uint32_t SegmentLength;
uint32_t SegmentLengthMask;
uint32_t SegmentCount;
@@ -512,6 +516,7 @@ static inline bool binary_fuse16_contain(uint64_t key,
static inline bool binary_fuse16_allocate(uint32_t size,
binary_fuse16_t *filter) {
uint32_t arity = 3;
filter->Size = size;
filter->SegmentLength = size == 0 ? 4 : binary_fuse_calculate_segment_length(arity, size);
if (filter->SegmentLength > 262144) {
filter->SegmentLength = 262144;
@@ -548,6 +553,7 @@ static inline void binary_fuse16_free(binary_fuse16_t *filter) {
free(filter->Fingerprints);
filter->Fingerprints = NULL;
filter->Seed = 0;
filter->Size = 0;
filter->SegmentLength = 0;
filter->SegmentLengthMask = 0;
filter->SegmentCount = 0;
@@ -858,4 +864,111 @@ static inline bool binary_fuse8_deserialize(binary_fuse8_t * filter, const char
return true;
}

// minimal bitfield implementation
#define XOR_bitf_w (sizeof(uint8_t) * 8)
#define XOR_bitf_sz(bits) (((bits) + XOR_bitf_w - 1) / XOR_bitf_w)
#define XOR_bitf_word(bit) ((bit) / XOR_bitf_w)
#define XOR_bitf_bit(bit) ((1U << ((bit) % XOR_bitf_w)) % 256)

#define XOR_ser(buf, lim, src) do { \
if ((buf) + sizeof src > (lim)) \
return (0); \
memcpy(buf, &src, sizeof src); \
buf += sizeof src; \
} while (0)

#define XOR_deser(dst, buf, lim) do { \
if ((buf) + sizeof dst > (lim)) \
return (false); \
memcpy(&dst, buf, sizeof dst); \
buf += sizeof dst; \
} while (0)

// return required space for binary_fuse{8,16}_pack()
#define XOR_bytesf(fuse) \
static inline size_t binary_ ## fuse ## _pack_bytes(const binary_ ## fuse ## _t *filter) \
{ \
size_t sz = 0; \
sz += sizeof filter->Seed; \
sz += sizeof filter->Size; \
sz += XOR_bitf_sz(filter->ArrayLength); \
for (size_t i = 0; i < filter->ArrayLength; i++) { \
if (filter->Fingerprints[i] == 0) \
continue; \
sz += sizeof filter->Fingerprints[i]; \
} \
return (sz); \
}

// serialize as packed format, return size used or 0 for insufficient space
#define XOR_packf(fuse) \
static inline size_t binary_ ## fuse ## _pack(const binary_ ## fuse ## _t *filter, char *buffer, size_t space) { \
uint8_t *s = (uint8_t *)(void *)buffer; \
uint8_t *buf = s, *e = buf + space; \
\
XOR_ser(buf, e, filter->Seed); \
XOR_ser(buf, e, filter->Size); \
size_t bsz = XOR_bitf_sz(filter->ArrayLength); \
if (buf + bsz > e) \
return (0); \
uint8_t *bitf = buf; \
memset(bitf, 0, bsz); \
buf += bsz; \
\
for (size_t i = 0; i < filter->ArrayLength; i++) { \
if (filter->Fingerprints[i] == 0) \
continue; \
bitf[XOR_bitf_word(i)] |= XOR_bitf_bit(i); \
XOR_ser(buf, e, filter->Fingerprints[i]); \
} \
return ((size_t)(buf - s)); \
}

#define XOR_unpackf(fuse) \
static inline bool binary_ ## fuse ## _unpack(binary_ ## fuse ## _t *filter, const char *buffer, size_t len) \
{ \
const uint8_t *s = (const uint8_t *)(const void *)buffer; \
const uint8_t *buf = s, *e = buf + len; \
bool r; \
\
uint64_t Seed; \
uint32_t Size; \
\
memset(filter, 0, sizeof *filter); \
XOR_deser(Seed, buf, e); \
XOR_deser(Size, buf, e); \
r = binary_ ## fuse ## _allocate(Size, filter); \
if (! r) \
return (r); \
filter->Seed = Seed; \
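const uint8_t *bitf = buf; \
size_t bsz = XOR_bitf_sz(filter->ArrayLength); \
/* the whole bitfield must lie within the buffer before it is read */ \
if (bsz > (size_t)(e - buf)) \
return (false); \
buf += bsz; \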
for (size_t i = 0; i < filter->ArrayLength; i++) { \
if ((bitf[XOR_bitf_word(i)] & XOR_bitf_bit(i)) == 0) \
continue; \
XOR_deser(filter->Fingerprints[i], buf, e); \
} \
return (true); \
}

#define XOR_packers(fuse) \
XOR_bytesf(fuse) \
XOR_packf(fuse) \
XOR_unpackf(fuse) \

XOR_packers(fuse8)
XOR_packers(fuse16)

#undef XOR_packers
#undef XOR_bytesf
#undef XOR_packf
#undef XOR_unpackf

#undef XOR_bitf_w
#undef XOR_bitf_sz
#undef XOR_bitf_word
#undef XOR_bitf_bit
#undef XOR_ser
#undef XOR_deser

#endif