
Conversation

@juliangruber (Member) commented Oct 1, 2025:

As in #315, TS errors that I don't know how to fix are making CI fail.

Closes #306

@juliangruber juliangruber marked this pull request as ready for review October 2, 2025 13:16
@pyropy (Contributor) left a comment:

Changes look good. I'd only like to discuss the cache design before giving it an approval.

findInBadBits(env, pieceCid),
])
const indexCacheKey = `${payerWalletAddress}/${pieceCid}`
let [dataSetId, serviceUrl] =
@pyropy (Contributor) commented on the code above:

Is it worth exploring the possibility of caching multiple data sets that share the same indexCacheKey? Users may have the same piece stored across multiple CDN-enabled data sets. If we only cache information for one data set, users could face retrieval failures when the cached data set's egress limit is reached, even though the same piece exists in other data sets.

Apart from that, it would also be nice to cache other info like egress usage and remaining egress quota (maybe not in this pull request).

@juliangruber (Member, Author) replied:

This is a great point. I missed it for so long! I will think about it. Immediate thoughts:

  • store an array of possible pieces as the value
  • store multiple kv pairs, and perform a list() (slower)
  • rotate the cache value after retrieval (pick a different possible piece); see the sketch below
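
A minimal sketch of the first and third ideas combined, assuming the KV value is a JSON array of candidates. The `{ dataSetId, serviceUrl }` value shape and the `pickDataSet` helper are illustrative, not part of this PR:

```js
// Sketch: cache several candidate data sets under one key, rotate on read.
// `env.INDEX_CACHE_KV` is from this PR; the value shape is an assumption.
async function pickDataSet(env, payerWalletAddress, pieceCid) {
  const key = `${payerWalletAddress}/${pieceCid}`
  // Cached value: an array of { dataSetId, serviceUrl } candidates.
  const candidates = await env.INDEX_CACHE_KV.get(key, 'json')
  if (!candidates || candidates.length === 0) return null
  const [current, ...rest] = candidates
  // Rotate so the next retrieval tries a different data set, spreading
  // egress across all data sets that hold this piece.
  await env.INDEX_CACHE_KV.put(key, JSON.stringify([...rest, current]))
  return current
}
```

Note the get-then-put rotation is not atomic: concurrent retrievals can overwrite each other's rotation. That is harmless for load spreading, but it matters for counters (see the egress-quota thread below).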

@bajtos (Contributor) left a comment:

Great start!

@juliangruber juliangruber requested a review from bajtos October 14, 2025 12:08
@bajtos (Contributor) left a comment:

Please re-request a review after you implement the change we agreed on yesterday, where the bad-bits worker will use the KV store only, no D1 database.

@juliangruber juliangruber requested a review from bajtos October 15, 2025 14:40
@juliangruber juliangruber requested review from bajtos and pyropy October 22, 2025 07:52
@juliangruber juliangruber marked this pull request as ready for review October 22, 2025 07:52
Base automatically changed from update/move-bad-bits-to-kv to main October 22, 2025 14:01
@bajtos (Contributor) left a comment:

I love how much simpler this pull request became after we removed the changes related to bad-bits 👏🏻

Comment on lines +100 to +102

results.map(async ({ payerAddress, pieceCID }) => {
  await env.INDEX_CACHE_KV.delete(`${payerAddress}/${pieceCID}`)
}),
@bajtos (Contributor) commented Oct 22, 2025:

Can this run into the limit of KV calls we can make per worker invocation? (I vaguely remember the number 1000.)

I think it's not likely for a long time, so we don't need to worry about that too much right now.

But it would be nice to have some visibility, so that we know early when we have a user approaching 1000 pieces stored. For example, we can have a Grafana chart with an alert where we show the value returned by a SQL query like the following one:

SELECT MAX(piece_count)
FROM (
  SELECT COUNT(*) AS piece_count
  FROM pieces
  INNER JOIN data_sets ON pieces.data_set_id = data_sets.id
  GROUP BY payer_address
)

I propose to open a follow-up tech-debt issue.

The question is whether we need this for the GA launch, and I don't think so.

Thoughts?

@juliangruber (Member, Author) replied:

Oh right, it can happen when there are at least 1000 pieces in a data set, for example. I don't see this case as unlikely.

I see two options going forward:

  • use queues
  • use the REST API, which has higher batch limits (sketched below)

I will evaluate both tomorrow.
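
For reference, the REST-API option would look roughly like the sketch below. The `CF_ACCOUNT_ID`, `CF_API_TOKEN`, and `KV_NAMESPACE_ID` names are hypothetical secrets/vars, and the helper is not part of this PR; the Cloudflare KV bulk endpoint itself accepts up to 10,000 keys per request:

```js
// Sketch of the REST-API option: one bulk call instead of N .delete() calls.
// CF_ACCOUNT_ID, CF_API_TOKEN, and KV_NAMESPACE_ID are assumed secrets/vars.
async function bulkDeleteIndexCacheKeys(env, keys) {
  const url = `https://api.cloudflare.com/client/v4/accounts/${env.CF_ACCOUNT_ID}/storage/kv/namespaces/${env.KV_NAMESPACE_ID}/bulk`
  // The bulk endpoint accepts up to 10,000 keys per request.
  for (let i = 0; i < keys.length; i += 10000) {
    const res = await fetch(url, {
      method: 'DELETE',
      headers: {
        Authorization: `Bearer ${env.CF_API_TOKEN}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(keys.slice(i, i + 10000)),
    })
    if (!res.ok) throw new Error(`KV bulk delete failed: ${res.status}`)
  }
}
```

Because this goes through fetch() rather than the KV binding, it counts against the worker's subrequest limit instead of the per-invocation KV operation limit.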

@bajtos (Contributor) replied:

Yes, on second thought, I also concluded that the limit of 1000 pieces per data set is too low, and we need to explore other options.

Considering the complexities, maybe we should put this performance optimisation on hold until the GA launch. WDYT?

@juliangruber (Member, Author) replied:

Sounds good, let's reevaluate.

@bajtos (Contributor) commented Oct 22, 2025:

@juliangruber please get @pyropy's approval before landing this change.

His comment about a potential design issue seems relevant to me.

#323 (comment)

])

const indexCacheKey = `${payerWalletAddress}/${pieceCid}`
const [indexCacheValue, isBadBit] = await Promise.all([
@pyropy (Contributor) commented on the code above:

We're also going to need to store the egress quota inside the KV store, as we're not going to query the database unless indexCacheValue is null or undefined.

How are we supposed to update these values, given that a KV store update is not an atomic operation?
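
To make the concern concrete, the read path under discussion is a read-through cache along these lines (a sketch only; `env.DB` and the SQL schema are illustrative assumptions, not the actual code):

```js
// Read-through sketch: serve from KV, fall back to D1, then backfill KV.
// `env.DB` (a D1 binding) and the table/column names are assumptions.
async function lookupIndex(env, payerWalletAddress, pieceCid) {
  const indexCacheKey = `${payerWalletAddress}/${pieceCid}`
  const cached = await env.INDEX_CACHE_KV.get(indexCacheKey, 'json')
  if (cached) return cached // fast path: the database is never queried
  const row = await env.DB.prepare(
    'SELECT data_set_id, service_url FROM pieces WHERE payer_address = ? AND piece_cid = ?'
  ).bind(payerWalletAddress, pieceCid).first()
  if (!row) return null
  await env.INDEX_CACHE_KV.put(indexCacheKey, JSON.stringify(row))
  return row
}
```

The atomicity question bites on the write side: KV is last-writer-wins and eventually consistent, so a read-modify-write of a quota counter can silently drop updates. One way around it is to keep mutable counters in D1 (or a Durable Object) and cache only the immutable lookup data.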

@juliangruber juliangruber marked this pull request as draft October 27, 2025 15:20
@juliangruber (Member, Author) commented:
Converting back to draft, as we're deprioritizing this in favor of ipfs/egress/x402 work.



Merging this pull request may close the issue: Use KV for index lookup