
Commit 4534904

Refactor (#18)
* core refactor and update benchmark
* update benchmark ci
* fix bench
* update params
* update bench
* update bench
* fix bench
* remove too slow bench
* add with theine bench
* update
* optimize bench
* update
* update bench
* simplify benchmark
* update bench
* update
* update readme
* add trace script
* update trace script
* update readme
* update readme
* fix python 3.7 test
* reduce benchmark requests count
1 parent a4fa703 commit 4534904

20 files changed: +1142 −733 lines

Diff for: .github/workflows/benchmark.yml (+5 −11)

```diff
@@ -5,38 +5,32 @@ on:
   push:
     branches:
       - master
+      - refactor/reduce_overhead

 jobs:
   benchmark-1:
     uses: ./.github/workflows/benchmark_template.yml
     with:
-      case: "test_read_write_async"
+      case: "test_read_only"
     secrets: inherit

   benchmark-2:
     needs: benchmark-1
     uses: ./.github/workflows/benchmark_template.yml
     with:
-      case: "test_read_write_with_local_async"
+      case: "test_write_only"
     secrets: inherit

   benchmark-3:
     needs: benchmark-2
     uses: ./.github/workflows/benchmark_template.yml
     with:
-      case: "test_read_only_async"
+      case: "test_zipf"
     secrets: inherit

   benchmark-4:
     needs: benchmark-3
     uses: ./.github/workflows/benchmark_template.yml
     with:
-      case: "test_read_only_with_local_async"
-    secrets: inherit
-
-  benchmark-5:
-    needs: benchmark-4
-    uses: ./.github/workflows/benchmark_template.yml
-    with:
-      case: "test_read_write_batch_async"
+      case: "test_read_only_batch"
     secrets: inherit
```

Diff for: .github/workflows/benchmark_template.yml (+1 −0)

```diff
@@ -85,6 +85,7 @@ jobs:
         run: "poetry run pytest benchmarks/benchmark_test.py::${{ inputs.case }} --benchmark-only --benchmark-json output.json"
       - name: "Publish Benchmark Result"
         uses: benchmark-action/github-action-benchmark@v1
+        if: ${{ github.ref == 'refs/heads/master' }}
         with:
           name: 'Cacheme Benchmark: ${{ inputs.case }}'
           tool: 'pytest'
```

Diff for: Makefile (+9 −1)

```diff
@@ -2,10 +2,18 @@
 test:
 	poetry run pytest --benchmark-skip

+.PHONY: testx
+testx:
+	poetry run pytest --benchmark-skip -x
+
 .PHONY: benchmark
 benchmark:
 	poetry run pytest --benchmark-only

 .PHONY: lint
 lint:
-	mypy --ignore-missing-imports .
+	poetry run mypy --check-untyped-defs --ignore-missing-imports .
+
+.PHONY: trace
+trace:
+	poetry run python -m benchmarks.trace
```

Diff for: README.md (+104 −14)
@@ -1,18 +1,17 @@
11
# Cacheme
22

3-
Asyncio cache framework with multiple cache storages. [中文文档](README_ZH.md)
3+
Asyncio cache framework with multiple cache storages.
44

5-
- **Better cache management:** Cache configuration with node, you can apply different strategies on different nodes.
5+
- **Organize cache better:** Cache configuration with node, you can apply different strategies on different nodes.
66
- **Multiple cache storages:** in-memory/redis/mongodb/postgres..., also support chain storages.
77
- **Multiple serializers:** Pickle/Json/Msgpack serializers.
88
- **Type annotated:** All cacheme API are type annotated with generics.
9-
- **High hit ratio in-memory cache:** TinyLFU written in Rust with little memory overhead.
10-
- **Thundering herd protection:** Simultaneously requests to same key are blocked by asyncio Event and only load from source once.
9+
- **Thundering herd protection:** Simultaneously requests to same key are blocked by asyncio Event and only load from source once. See Benchemark section.
1110
- **Cache stats API:** Stats of each node and colected automatically.
11+
- **Performance:** See Benchemark section.
1212

1313
Related projects:
1414
- High performance in-memory cache: https://github.com/Yiling-J/theine
15-
- Benchmark(auto updated): https://github.com/Yiling-J/cacheme-benchmark
1615

1716
## Table of Contents
1817

```diff
@@ -32,7 +31,11 @@ Related projects:
   + [Sqlite Storage](#sqlite-storage)
   + [PostgreSQL Storage](#postgresql-storage)
   + [MySQL Storage](#mysql-storage)
+- [How Thundering Herd Protection Works](#how-thundering-herd-protection-works)
 - [Benchmarks](#benchmarks)
+  + [continuous benchmark](#continuous-benchmark)
+  + [200k concurrent requests](#200k-concurrent-requests)
+  + [20k concurrent batch requests](#20k-concurrent-batch-requests)

 ## Requirements
 Python 3.7+
```
````diff
@@ -52,7 +55,7 @@ pip install cacheme[asyncpg]
 ```

 ## Add Node
-Node is the core part of cache. Each node has its own key function, load function and storage options. Stats of each node are collected independently. You can place all node definations into one package/module, so everyone knows exactly what is cached now and how they are cached. All cacheme API are based on node.
+Node is the core part of the cache. Each node has its own key function, load function and storage options. Stats of each node are collected independently. You can place all node definitions into one package/module, so everyone knows exactly what is cached and how it is cached. All cacheme APIs are based on nodes.

 Each node contains:
 - Key attributes and a `key` method, which are used to generate the cache key. Here the `UserInfoNode` is a dataclass, so the `__init__` method is generated automatically.
````
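For orientation (the diff truncates the surrounding README here), a minimal node definition in the style this section describes; the field, key format, and `Meta` options below are illustrative assumptions, not content from this commit:

```python
from dataclasses import dataclass
from typing import Dict

import cacheme

@dataclass
class UserInfoNode(cacheme.Node):
    user_id: int  # key attribute; the dataclass generates __init__

    def key(self) -> str:
        # key attributes are combined into the cache key
        return f"user:{self.user_id}:info"

    async def load(self) -> Dict:
        # called once per key on a cache miss (stand-in for a real query)
        return {"id": self.user_id, "name": f"user-{self.user_id}"}

    class Meta(cacheme.Node.Meta):
        version = "v1"
        caches = [cacheme.Cache(storage="my-storage", ttl=None)]
```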
````diff
@@ -249,20 +252,21 @@ BloomFilter is cleared automatically when requests count == size.
 ## Cache Storage

 #### Local Storage
-Local storage uses dictionary to store data. A policy is used to evict keys when cache is full.
+Local storage uses the state-of-the-art library **Theine** to store data. If your use case is simple, also consider using [Theine](https://github.com/Yiling-J/theine) directly, which will give the best performance.
+
 ```python
 # lru policy
 Storage(url="local://lru", size=10000)

-# tinylfu policy
+# w-tinylfu policy
 Storage(url="local://tlfu", size=10000)

 ```
 Parameters:

 - `url`: `local://{policy}`. 2 policies are currently supported:
   - `lru`
-  - `tlfu`: TinyLfu policy, see https://arxiv.org/pdf/1512.00727.pdf
+  - `tlfu`: W-TinyLfu policy

 - `size`: size of the storage. The policy is used to evict keys when the cache is full.
````
```diff
@@ -324,11 +328,97 @@ Parameters:
 - `table`: cache table name.
 - `pool_size`: connection pool size, default 50.

+## How Thundering Herd Protection Works
+
+If you are familiar with Go's [singleflight](https://pkg.go.dev/golang.org/x/sync/singleflight), you already have an idea of how Cacheme works. Cacheme groups concurrent requests to the same resource (node) into a singleflight with an asyncio Event, which will **load from the remote cache OR the data source only once**. That's why, in the Benchmarks section below, you will find that Cacheme even reduces the total Redis GET command count under high concurrency.
+
```
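To make that concrete, here is a minimal sketch of the singleflight pattern the added section describes, built on `asyncio.Event`; the class and method names are illustrative, not Cacheme internals:

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

class SingleFlight:
    """Collapse concurrent loads of the same key into one load.

    Sketch only: error handling and result eviction are omitted.
    """

    def __init__(self) -> None:
        self._events: Dict[str, asyncio.Event] = {}
        self._results: Dict[str, Any] = {}

    async def do(self, key: str, load: Callable[[], Awaitable[Any]]) -> Any:
        if key in self._events:
            # another coroutine is already loading this key: wait for it
            await self._events[key].wait()
            return self._results[key]
        event = asyncio.Event()
        self._events[key] = event
        try:
            # only this coroutine hits the remote cache / data source
            self._results[key] = await load()
            return self._results[key]
        finally:
            del self._events[key]
            event.set()  # wake every waiter
```

With, say, 100 concurrent `do(key, load)` calls for one key, `load` runs once and the other 99 callers just wait on the Event, which is why the Redis GET counts in the tables below can drop well under the request count.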
```diff
 ## Benchmarks
-- Local Storage Hit Ratios(hit_count/request_count)
-![hit ratios](benchmarks/hit_ratio.png)
-[source code](benchmarks/tlfu_hit.py)

-- Throughput Benchmark of different storages
+### continuous benchmark
+https://github.com/Yiling-J/cacheme-benchmark
+
+### 200k concurrent requests
+
+aiocache: https://github.com/aio-libs/aiocache
+
+cashews: https://github.com/Krukov/cashews
+
+source code:
+
+How this benchmark runs (a harness sketch follows the Result list below):
+
+1. Initialize Cacheme/Aiocache/Cashews with a Redis backend, using a Redis blocking pool with pool size 100.
+2. Decorate a function that accepts a number and sleeps 0.1s with Aiocache/Cashews/Cacheme. The function also records how many times it is called.
+3. Register a Redis response callback, so we know how many times the GET command is called.
+4. Create 200k coroutines using a zipf generator (around 50k-60k unique numbers) and put them in an async queue.
+5. Run the queued coroutines with N concurrent workers.
+6. Collect results.
+
+Result:
+- Time: how long it takes to finish the bench.
+- Redis GET: how many times the Redis GET command is called; use this to evaluate pressure on the remote cache server.
+- Load Hits: how many times the load function (which sleeps 0.1s) is called; use this to evaluate pressure on the load source (database or something else).
```
#### 1k concurrency
364+
365+
| | Time | Redis GET | Load Hits |
366+
|------------|-------|------------|-----------|
367+
| Cacheme | 30 s | 166454 | 55579 |
368+
| Aiocache | 46 s | 200000 | 56367 |
369+
| Aiocache-2 | 63 s | 256492 | 55417 |
370+
| Cashews | 51 s | 200000 | 56920 |
371+
| cashews-2 | 134 s | 200000 | 55450 |
372+
373+
374+
#### 10k concurrency
375+
376+
| | Time | Redis GET | Load Hits |
377+
|------------|-------|-----------|-----------|
378+
| Cacheme | 32 s | 123704 | 56736 |
379+
| Aiocache | 67 s | 200000 | 62568 |
380+
| Aiocache-2 | 113 s | 263195 | 55507 |
381+
| Cashews | 68 s | 200000 | 66036 |
382+
| cashews-2 | 175 s | 200000 | 55709 |
383+
384+
385+
#### 100k concurrency
386+
387+
| | Time | Redis GET | Load Hits |
388+
|------------|-------|-----------|-----------|
389+
| Cacheme | 30 s | 60990 | 56782 |
390+
| Aiocache | 80 s | 200000 | 125085 |
391+
| Aiocache-2 | 178 s | 326417 | 65598 |
392+
| Cashews | 88 s | 200000 | 87894 |
393+
| cashews-2 | 236 s | 200000 | 55647 |
394+
395+
### 20k concurrent batch requests
396+
397+
source code:
398+
399+
How this benchmark run:
400+
401+
1. Initialize Cacheme with Redis backend, use Redis blocking pool and set pool size to 100.
402+
2. Decorate Cacheme with a function which accept a number and sleep 0.1s. This function also record how many times it is called.
403+
3. Register Redis response callback, so we can know how many times MGET command are called.
404+
4. Create 20k `get_all` coroutines use a zipf generator and put them in async queue(around 50k-60k unique numbers). Each `get_all` request will get 20 unique numbers in batch. So totally 400k numbers.
405+
5. Run coroutines in queue with N concurrent workers.
406+
6. Collect results.
407+
408+
Result:
409+
- Time: How long it takes to finish bench.
410+
- Redis MGET: How many times Redis MGET command are called, use this to evaluate pressure to remote cache server.
411+
- Load Hits: How many times the load function(which sleep 0.1s) are called, use this to evaluate pressure to load source(database or something else).
412+
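A sketch of a single batch request from step 4 above, with `get_all_fn` as a hypothetical stand-in for the decorated batch loader:

```python
import numpy as np

async def one_batch(get_all_fn) -> None:
    # step 4: draw until we have 20 unique zipf-distributed keys
    keys: set = set()
    while len(keys) < 20:
        keys.add(int(np.random.zipf(a=1.2)))
    # misses in the batch are fetched with a single Redis MGET,
    # and concurrent batches still share in-flight loads per key
    await get_all_fn(keys)
```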
```diff
+
+#### 1k concurrency
+
+|         | Time | Redis MGET | Load Hits |
+|---------|------|------------|-----------|
+| Cacheme | 12 s | 9996       | 55902     |
+
+#### 10k concurrency

-See [benchmark]( https://github.com/Yiling-J/cacheme-benchmark)
+|         | Time | Redis MGET | Load Hits |
+|---------|------|------------|-----------|
+| Cacheme | 11 s | 9908       | 42894     |
```