provider: add a buffered KeyChanFunc. #870

Merged
merged 4 commits into main on Mar 7, 2025

Conversation

hsanjuan
Contributor

@hsanjuan hsanjuan commented Mar 5, 2025

This provider helper allows buffering all the results from a KeyChanFunc in memory.

The purpose is to fix issues with slow re-providing. In the case of ipfs/kubo#10596, the slow re-providing process causes starvation of any operations trying to read/write to the pinset.

With the new buffered KeyChanFunc, we can read everything we need to announce into memory first, and free any locks as soon as possible.
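For illustration, here is a minimal sketch of the buffering idea in Go. The names below are hypothetical and may not match the exact helper added by this PR; only the KeyChanFunc signature mirrors the provider package.

```go
package example

import (
	"context"

	"github.com/ipfs/go-cid"
)

// KeyChanFunc yields the CIDs to announce, mirroring the signature used by
// the boxo provider package.
type KeyChanFunc func(ctx context.Context) (<-chan cid.Cid, error)

// bufferedKeyChanFunc is an illustrative wrapper: it drains the wrapped
// KeyChanFunc into an in-memory slice first, so whatever locks the
// underlying source holds (e.g. the pinset lock) are released as soon as
// enumeration finishes, and then replays the CIDs from the buffer at the
// consumer's pace.
func bufferedKeyChanFunc(f KeyChanFunc) KeyChanFunc {
	return func(ctx context.Context) (<-chan cid.Cid, error) {
		src, err := f(ctx)
		if err != nil {
			return nil, err
		}

		// Read everything into memory up front (~50 bytes per CID).
		var buf []cid.Cid
		for c := range src {
			buf = append(buf, c)
		}

		// Replay from the buffer; the source is no longer involved.
		out := make(chan cid.Cid)
		go func() {
			defer close(out)
			for _, c := range buf {
				select {
				case out <- c:
				case <-ctx.Done():
					return
				}
			}
		}()
		return out, nil
	}
}
```

This simplified version drains the source synchronously before returning the channel; an asynchronous drain would let announcing start immediately, but either way the source (and its locks) is only needed for the duration of the drain.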

Given the compact size of CIDs (<50 bytes), I don't think any more complex approach (batch reading with lock/unlock handling) is justified. People with pinsets of 20 million items who want to announce everything can well spare an extra GB of RAM (20 million × ~50 bytes ≈ 1 GB). For smaller repos the impact becomes negligible.

The test targets precisely this use case and ensures we don't starve operations while interacting with dspinner.

@hsanjuan hsanjuan self-assigned this Mar 5, 2025
@hsanjuan hsanjuan requested a review from a team as a code owner March 5, 2025 12:17
Contributor

@guillaumemichel guillaumemichel left a comment


The reasoning and the code make sense. The Windows CI doesn't seem happy with the tests, though.

@hsanjuan
Contributor Author

hsanjuan commented Mar 5, 2025

Yeah @gammazero, is this related to deque's "power of two" grow allocations?

@gammazero
Contributor

gammazero commented Mar 5, 2025

is this related to deque's "power of two" grow allocations?

I do not think so. It does not look like your test creates such a huge number of pins that it should cause an OOM. Besides, growing by powers of two is what a slice does anyway. I will test this locally and see what is eating up all the memory.

The deque is very memory efficient when inflow and outflow are somewhat balanced, i.e. when size changes do not incur many queue resizes. However, if the use pattern is to enqueue a large (relative to the base capacity) amount of data and then read all of it, a better deque implementation is this: https://github.com/edwingeng/deque. Which one is more efficient really depends on the use pattern.
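As a hedged illustration of the usage pattern in question (the function name is made up, and the generic PushBack/PopFront/Len API of github.com/gammazero/deque is assumed; check the module version you actually depend on):

```go
package example

import "github.com/gammazero/deque"

// bulkFillThenDrain shows the "enqueue everything, then read everything"
// pattern: during the fill phase the deque only grows, so it must hold the
// entire data set in memory before the first element is consumed.
func bulkFillThenDrain(items []int) {
	var q deque.Deque[int] // zero value assumed ready to use
	for _, it := range items {
		q.PushBack(it) // inflow only
	}
	for q.Len() > 0 {
		_ = q.PopFront() // outflow only, after everything was buffered
	}
}
```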

@hsanjuan hsanjuan requested a review from gammazero March 6, 2025 13:53
@hsanjuan
Contributor Author

hsanjuan commented Mar 6, 2025

The test failure was due to recursion when ipld-prime traverses a 4092-deep graph.

Contributor

@gammazero gammazero left a comment


See comment.

@hsanjuan hsanjuan merged commit 93ea580 into main Mar 7, 2025
13 checks passed
@hsanjuan hsanjuan deleted the feat/buffered-provider-KeyChanFunc branch March 7, 2025 07:36