(improvement) cython - cache deserializer instances in find_deserializer and m…#739

Closed
mykaul wants to merge 1 commit into scylladb:master from mykaul:perf/cache-deserializer-lookup

mykaul commented Mar 13, 2026

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass test.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.
  • I have adjusted the documentation in ./docs/source/.
  • I added appropriate Fixes: annotations to PR description.

…ake_deserializers

Cache find_deserializer() and make_deserializers() results in Cython cdef
dict caches keyed on cqltype objects to avoid repeated class lookups and
Deserializer object creation on every result set.

Using the cqltype objects themselves (not their id()) as cache keys holds
strong references to them, so a key cannot be garbage-collected and have its
id() reused by a different parameterized type.

## Motivation

On every result set, make_deserializers(coltypes) is called from
row_parser.pyx:37, which in turn calls find_deserializer() for each column
type. These functions perform class name lookups and issubclass() chains,
then create fresh Deserializer objects -- all redundant work when the same
column types appear repeatedly (which is always the case for prepared
statements).
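
The lookup-and-cache pattern described above can be sketched in pure Python. This is only an illustration under stated assumptions, not the driver's Cython code: `Deserializer`, `Int32Type`, `UTF8Type`, and `_slow_lookup` are hypothetical stand-ins, plain module-level dicts stand in for the `cdef dict` caches, and keying the list-level cache on a tuple of column types is an assumption about how such a cache could be built.

```python
class Deserializer:
    """Hypothetical stand-in for a Cython Deserializer object."""
    instances_created = 0

    def __init__(self, cqltype):
        Deserializer.instances_created += 1
        self.cqltype = cqltype


class Int32Type:
    """Hypothetical column type."""


class UTF8Type:
    """Hypothetical column type."""


# Plain dicts standing in for the C-level caches.
_deserializer_cache = {}    # cqltype -> Deserializer
_deserializers_cache = {}   # tuple of cqltypes -> tuple of Deserializers


def _slow_lookup(cqltype):
    # Stands in for the class-name lookup and issubclass() chain.
    return Deserializer(cqltype)


def find_deserializer(cqltype):
    """Return a cached Deserializer, creating it on the first request."""
    try:
        return _deserializer_cache[cqltype]
    except KeyError:
        deserializer = _slow_lookup(cqltype)
        _deserializer_cache[cqltype] = deserializer
        return deserializer


def make_deserializers(coltypes):
    """Return a cached tuple of Deserializers for a sequence of column types."""
    key = tuple(coltypes)
    cached = _deserializers_cache.get(key)
    if cached is None:
        cached = tuple(find_deserializer(t) for t in key)
        _deserializers_cache[key] = cached
    return cached


# Two result sets with the same column types share one cached result.
coltypes = [Int32Type, UTF8Type, Int32Type]
first = make_deserializers(coltypes)
second = make_deserializers(coltypes)
```

In this sketch the second call does no lookups and creates no objects: it returns the same tuple as the first call, and only two `Deserializer` instances exist for the three columns because `Int32Type` appears twice.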

## Benchmark results

Benchmarks compare the original code (Before) against the new cached
implementation (After).

find_deserializer (single type lookup):
| Variant | Min | Mean | Median | Ops/sec |
|---|---|---|---|---|
| Before (original) | 266.0 ns | 305.0 ns | 292.0 ns | 3.3 Mops/s |
| After (with cache) | 44.0 ns | 49.0 ns | 47.8 ns | 20.4 Mops/s |

make_deserializers (5 types):
| Variant | Min | Mean | Median | Ops/sec |
|---|---|---|---|---|
| Before (original) | 1,976 ns | 2,438 ns | 2,435 ns | 410 Kops/s |
| After (with cache) | 74.9 ns | 83.5 ns | 81.7 ns | 12,000 Kops/s |

make_deserializers (10 types):
| Variant | Min | Mean | Median | Ops/sec |
|---|---|---|---|---|
| Before (original) | 3,553 ns | 3,812 ns | 3,761 ns | 262 Kops/s |
| After (with cache) | 89.7 ns | 105.1 ns | 97.6 ns | 9,511 Kops/s |
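
For a quick read of the tables above, the speedup factors can be derived from the Mean columns. The snippet below only does that arithmetic on the reported numbers; it does not rerun any benchmark.

```python
# Mean latencies in nanoseconds (before, after), copied from the tables above.
means_ns = {
    "find_deserializer (single type)": (305.0, 49.0),
    "make_deserializers (5 types)": (2438.0, 83.5),
    "make_deserializers (10 types)": (3812.0, 105.1),
}

# Speedup factor = mean latency before / mean latency after.
speedups = {name: before / after for name, (before, after) in means_ns.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x faster")
```

By this measure the single lookup is roughly 6x faster, and the 5-type and 10-type cases are roughly 29x and 36x faster.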

## Design notes

- Caches are cdef dict (C-level, not accessible from Python) for minimal
  overhead
- Cache keys are the cqltype objects themselves, not id(cqltype) -- this holds
  strong references, so keys cannot be garbage-collected and have their id()
  reused
- For prepared statements (the hot path), cache hit rate is effectively 100%
- Cache is naturally bounded by the number of distinct cqltype objects in use
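
The point about strong references can be illustrated with a small pure-Python experiment. It is a sketch under assumptions: `ParamType` is a hypothetical stand-in for a dynamically created parameterized type, and a local dict stands in for the cache. A cache keyed on the object keeps it alive, while a cache keyed on `id()` does not; once the object is freed, CPython may hand the same `id()` to a different object, so an id-keyed entry could return a stale deserializer.

```python
import gc
import weakref


class ParamType:
    """Hypothetical stand-in for a dynamically created parameterized cqltype."""

    def __init__(self, name):
        self.name = name


def survives_in_cache(key_func):
    """Cache one ParamType under key_func(obj), drop the caller's reference,
    and report whether the object is still alive afterwards."""
    cache = {}
    obj = ParamType("list<int>")
    probe = weakref.ref(obj)           # observes obj without keeping it alive
    cache[key_func(obj)] = "deserializer for list<int>"
    del obj                            # the cache is now the only possible owner
    gc.collect()
    return probe() is not None


kept_by_object_key = survives_in_cache(lambda o: o)   # key is the object itself
kept_by_id_key = survives_in_cache(id)                # key is just an integer
```

With the object as key, the dict owns a reference and the type stays alive for as long as the entry exists. With `id()` as key, the dict holds only an integer, the object is freed, and its address becomes available for reuse.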

## Tests

All existing unit tests pass (108 passed, 1 skipped).

mykaul closed this Mar 13, 2026